@alex-jw-brooks
Contributor

@alex-jw-brooks alex-jw-brooks commented Jul 28, 2025

This PR builds on top of #20 to try to make the tests more reusable.

A summary of the changes relative to the above branch:
- Splits the common shapes test into more understandable helpers, which are then reused by the cache test in the follow-up PR
- Renames some identifiers to better align with conventions

if isinstance(common_batch_sizes, str):
    common_batch_sizes = [int(bs) for bs in common_batch_sizes.split(",")]
if isinstance(COMMON_BATCH_SIZES, str):
    COMMON_BATCH_SIZES = [int(bs) for bs in COMMON_BATCH_SIZES.split(",")]
Collaborator

Not merged yet, but FYI the get_env_to_int_list method from Gaurav's PR would be helpful here
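For illustration, a helper along these lines could replace the repeated isinstance / split pattern above. This is only a sketch: the thread names get_env_to_int_list but does not show its signature, so the parameters, the fallback behaviour, and the EXAMPLE_BATCH_SIZES variable name are all assumptions.

```python
import os


def get_env_to_int_list(env_var_name, default):
    """Hypothetical sketch: read a comma-separated env var as a list of ints."""
    raw = os.environ.get(env_var_name)
    if raw is None or not raw.strip():
        # Unset or blank: fall back to a copy of the caller's default.
        return list(default)
    # int() tolerates surrounding whitespace; empty items are skipped.
    return [int(item) for item in raw.split(",") if item.strip()]


# EXAMPLE_BATCH_SIZES is an illustrative name, not one used by the tests.
os.environ["EXAMPLE_BATCH_SIZES"] = "1,8"
COMMON_BATCH_SIZES = get_env_to_int_list("EXAMPLE_BATCH_SIZES", [1, 2, 4, 8])
```

Centralizing the parsing this way would keep each module-level constant to a single line and make the default explicit at the call site.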

@alex-jw-brooks alex-jw-brooks changed the title from "Add Cache Test / Refactor Decoder Tests" to "Refactor Decoder Tests" Jul 30, 2025
@alex-jw-brooks alex-jw-brooks marked this pull request as ready for review July 30, 2025 12:25
@alex-jw-brooks
Contributor Author

I split this PR in two to hopefully make it easier to review. This PR is now just the refactor to make things more reusable; the cache test is added in #97, in commit ad3073c

model,
micro_model_path,
validation_zero_info,
):
Contributor

Now that we are restructuring and splitting up validation levels 0 and 1, it might be a good opportunity to describe here what each validation level does. If not in this PR, we could do it in a follow-up PR

Contributor Author

Hey @JRosenkranz, I rebased this PR and added a short description for level 0 / 1 for now. Happy to continue cleanup / add better docstrings in follow-up PRs as well 😄

@JRosenkranz
Contributor

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1,8 SEQUENCE_LENGTH=64,2048 USE_TINY_MODEL=1

@JRosenkranz
Contributor

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1,8 SEQUENCE_LENGTH=64,2048 USE_TINY_MODEL=0
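A rough sketch of how comma-separated values such as BATCH_SIZE=1,8 and SEQUENCE_LENGTH=64,2048 might expand into test shapes. It assumes every batch size is paired with every sequence length; the thread does not confirm how the bot or the tests combine them, and the variable names are illustrative.

```python
import itertools

# Assumed inputs, mirroring BATCH_SIZE=1,8 and SEQUENCE_LENGTH=64,2048.
batch_sizes = [int(v) for v in "1,8".split(",")]
seq_lengths = [int(v) for v in "64,2048".split(",")]

# One (batch_size, seq_length) pair per combination.
shapes = list(itertools.product(batch_sizes, seq_lengths))
```

Under that assumption, two batch sizes and two sequence lengths give four shapes per model.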

@alex-jw-brooks alex-jw-brooks force-pushed the test_cache_refactor branch 2 times, most recently from b6e36d4 to d2551b9 on August 12, 2025 13:56
)
skip_assertions = os.environ.get("FMS_TEST_SHAPES_SKIP_ASSERTIONS", {})
validation_info_dir = os.environ.get(
SKIP_ASSERTIONS = os.environ.get("FMS_TEST_SHAPES_SKIP_ASSERTIONS", {})
Collaborator

Can we pull some of this env var setup outside of this script to use with the other pytests please?

Contributor Author

Definitely! I'll do it in a different PR to try to keep things as isolated as possible here if that's ok 🙂

Collaborator

Totally ok and makes a lot of sense, thank you!

model_path, batch_size, seq_length, max_new_tokens, persistent_model
##### Common utils
# metric calculator based on the cross-entropy and mean diff for each decode step
def _metric_calculator(r: torch.Tensor, t: torch.Tensor):
Contributor

I believe we use this in a few places, not necessarily for this PR but we might want to move this out into a utility

Contributor Author

Yup, looks like it! I will open some other cleanup PRs for stuff like this / clean up some of the env var stuff @tharapalanivel had asked for since this one is already a lot to look at
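As a reference point for such a shared utility, here is a dependency-free sketch of a per-step metric built from cross-entropy and mean difference. It is not the body of _metric_calculator, which the thread does not show: it works on plain lists of logits rather than torch tensors, and the choice of which input acts as the target distribution is an assumption.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_total = math.log(sum(math.exp(x - peak) for x in logits))
    return [x - peak - log_total for x in logits]


def metric_calculator(reference_logits, test_logits):
    """Return (cross_entropy, mean_abs_diff) for one decode step."""
    ref_log_probs = log_softmax(reference_logits)
    test_log_probs = log_softmax(test_logits)
    ref_probs = [math.exp(lp) for lp in ref_log_probs]
    test_probs = [math.exp(lp) for lp in test_log_probs]
    # Assumption: the reference distribution is the target of the cross-entropy.
    cross_entropy = -sum(p * lq for p, lq in zip(ref_probs, test_log_probs))
    # Mean absolute difference between the two probability vectors.
    mean_diff = sum(abs(p - q) for p, q in zip(ref_probs, test_probs)) / len(ref_probs)
    return cross_entropy, mean_diff


ce, diff = metric_calculator([0.0, 0.0], [0.0, 0.0])
```

When both inputs are identical, the mean difference is zero and the cross-entropy reduces to the entropy of the distribution, which makes for a convenient sanity check.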

warmup_model(
model, input_ids, max_new_tokens, compile_dynamic_sendnn, **extra_kwargs
)
def _get_aiu_model(model_path, gptq_kwargs, persistent_model_inst):
Contributor

I think I prefer the persistent model calling this in the current version with get_or_create. Is there a specific reason we moved this?

Contributor Author

The main reason was the cache test: the branch I based it on was not using the persistent model fixture, and I wanted to avoid changing the tests too much while cleaning them up, since I also wasn't very familiar with what they were testing. I agree, though, and put it back to just use get_or_create; I will use that in the cache test as well!
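To make the get_or_create idea concrete, here is a minimal sketch of the pattern. The class, its arguments, and the factory callback are hypothetical stand-ins; the real persistent model fixture is not shown in this thread and may cache on different keys.

```python
class PersistentModel:
    """Hypothetical stand-in for the persistent model fixture."""

    def __init__(self):
        self._model = None
        self._key = None

    def get_or_create(self, key, factory):
        # Rebuild only when nothing is cached or the requested key changed.
        if self._model is None or self._key != key:
            self._model = factory()
            self._key = key
        return self._model


build_calls = []


def build_model():
    # Placeholder for an expensive load/compile step.
    build_calls.append("built")
    return object()


persistent = PersistentModel()
first = persistent.get_or_create("model-a", build_model)
second = persistent.get_or_create("model-a", build_model)
```

Here the second call returns the cached object without invoking the factory again, which is the property that lets several tests share one compiled model.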

return cpu_validation_info

# Don't save iter 0 for AIU only
skip_save = device == "aiu" and token_iter == 0
Contributor

I believe we are supposed to save every iteration here

Contributor Author

Sounds good, removed it!

@alex-jw-brooks alex-jw-brooks force-pushed the test_cache_refactor branch 2 times, most recently from 2e42e7c to 1e369b2 on September 11, 2025 12:10
Contributor

@JRosenkranz JRosenkranz left a comment

lgtm

@JRosenkranz
Contributor

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1 SEQUENCE_LENGTH=2048 USE_TINY_MODEL=1 NUM_AIU=4

Collaborator

@tharapalanivel tharapalanivel left a comment

Will need another rebase and lint fixes but lgtm once the bot tests also pass, thanks @alex-jw-brooks!

avery-blanchard and others added 6 commits October 3, 2025 12:22
Signed-off-by: Avery Blanchard <[email protected]>
Signed-off-by: Alex-Brooks <[email protected]>
commit ed571f728a351f8dd92737be5554c3dc46f71a30
Author: Alex-Brooks <[email protected]>
Date:   Tue Jul 29 09:20:06 2025 -0600

    Remove cache tests

commit 2848f7b2785b91c60c536b8993c3193c40c381ea
Author: Alex-Brooks <[email protected]>
Date:   Mon Jul 28 08:07:01 2025 -0600

    Add leading underscores, revert model name

commit c30b7b70a0f6e464d3212fd9bed4f9ea33f9de93
Author: Alex-Brooks <[email protected]>
Date:   Mon Jul 28 07:15:08 2025 -0600

    Explictly clear cache paths

commit 42aaf666d7f8ffb2fb611df7ad2d06b48e480dd7
Author: Alex-Brooks <[email protected]>
Date:   Mon Jul 28 07:14:23 2025 -0600

    Set the cache dir in conftest

commit b978e7225f02bf1d9a5f7b919ca6cbe2ee8d641a
Author: Alex-Brooks <[email protected]>
Date:   Mon Jul 28 06:15:10 2025 -0600

    run formatting

commit 8d64df08333991927c45f9a982ddaf95f39c94cf
Author: Alex-Brooks <[email protected]>
Date:   Fri Jul 25 11:18:13 2025 -0600

    refactor cache miss into fixture

commit 0b524b8c818495cb646add2adfc27a2884ac8de5
Author: Alex-Brooks <[email protected]>
Date:   Fri Jul 25 07:11:09 2025 -0600

    Consolidate cache test with common

commit d8a36d405a101e101ab9ede3b8d12fa3026cd01f
Author: Alex-Brooks <[email protected]>
Date:   Fri Jul 25 06:41:13 2025 -0600

    Run cache test first

commit 2efb797fb21587e9136b314c44ec56c658636826
Author: Alex-Brooks <[email protected]>
Date:   Fri Jul 25 05:48:25 2025 -0600

    Finish splitting out common shape test helpers

commit 4ae73dea18848005f86d1c9bcdf29f153711330f
Author: Alex-Brooks <[email protected]>
Date:   Fri Jul 25 05:28:31 2025 -0600

    refactor most of common shape test

commit 083afdc3a468649ec4b0bbadc921d40b47e37498
Author: Alex-Brooks <[email protected]>
Date:   Thu Jul 24 14:08:20 2025 -0600

    Move torch sendnn cache dir to common

commit e9b576381a738c59f91d5fc904ceaa2a0e410864
Author: Alex-Brooks <[email protected]>
Date:   Thu Jul 24 14:02:06 2025 -0600

    Use caps for constants, common post proc

Signed-off-by: Alex-Brooks <[email protected]>
Signed-off-by: Alex-Brooks <[email protected]>
@alex-jw-brooks
Contributor Author

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1 SEQUENCE_LENGTH=2048 USE_TINY_MODEL=1 NUM_AIU=4

@Abhishek-TAMU
Contributor

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1 SEQUENCE_LENGTH=2048 USE_TINY_MODEL=1 NUM_AIU=4 AIU_TESTS_GIT_COMMIT=fix_pr_bot

Signed-off-by: Alex-Brooks <[email protected]>
@Abhishek-TAMU
Contributor

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1 SEQUENCE_LENGTH=2048 USE_TINY_MODEL=1 NUM_AIU=4

Signed-off-by: Alex-Brooks <[email protected]>
Signed-off-by: Alex-Brooks <[email protected]>
@alex-jw-brooks
Contributor Author

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1 SEQUENCE_LENGTH=2048 USE_TINY_MODEL=1 NUM_AIU=4


# NOTE: we should configure the cachedir before importing torchsendnn's
# graph cache to prevent it from being initialized in the wrong place.
os.environ["TORCH_SENDNN_CACHE_DIR"] = os.path.join(os.getcwd(), ".cache")
Contributor

Should this be a setdefault? That way it is only set if the user has not already specified it.

Contributor Author

Good point! Changed it
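The difference between plain assignment and setdefault can be shown with a short sketch. The /tmp/user-cache path is an illustrative value, and the exact line that landed in the PR is not shown in this thread.

```python
import os

# Simulate a user who exported the variable before running the tests.
os.environ["TORCH_SENDNN_CACHE_DIR"] = "/tmp/user-cache"

# setdefault leaves an existing value alone...
os.environ.setdefault("TORCH_SENDNN_CACHE_DIR", os.path.join(os.getcwd(), ".cache"))
user_value = os.environ["TORCH_SENDNN_CACHE_DIR"]

# ...and only fills in the default when the variable is missing.
del os.environ["TORCH_SENDNN_CACHE_DIR"]
os.environ.setdefault("TORCH_SENDNN_CACHE_DIR", os.path.join(os.getcwd(), ".cache"))
default_value = os.environ["TORCH_SENDNN_CACHE_DIR"]
```

With direct assignment, the first case would have silently overwritten the user's cache directory.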

@JRosenkranz
Contributor

bot:test
TEST_FILE=test_decoders.py MODEL_ID=ibm-granite/granite-3.3-8b-instruct BATCH_SIZE=1 SEQUENCE_LENGTH=2048 USE_TINY_MODEL=1 NUM_AIU=4

@JRosenkranz JRosenkranz merged commit 281ff22 into foundation-model-stack:main Oct 9, 2025
3 checks passed

5 participants