Conversation

@shanjiaz
Collaborator

@shanjiaz shanjiaz commented Sep 29, 2025

Changes:

  • Added support for optional arguments eagle_aux_hidden_state_layer_ids and inference_type.
  • Added more robust logic for target_vocab_size. We default to the "t2d" length; if that is not available, we load the verifier model's config file and recursively search the dict for vocab_size. (The search is needed for nested dicts, e.g. target_config_dict["text_config"]["vocab_size"].)
  • Removed tests for adding verifier embeddings as it's handled on the vllm side now.
  • Removed forward pass tests since forward function is defined on the vllm side.
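
The target_vocab_size fallback described in the list above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: the function names `find_vocab_size` and `resolve_target_vocab_size` are hypothetical, and it assumes the verifier config has already been loaded into a plain Python dict and that "t2d" is either missing (None) or a sized sequence.

```python
from typing import Any, Optional, Sized


def find_vocab_size(config: Any) -> Optional[int]:
    """Depth-first search of a nested dict for the first integer `vocab_size`.

    Hypothetical helper: returns None when no such key exists at any depth.
    """
    if not isinstance(config, dict):
        return None
    value = config.get("vocab_size")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return value
    # Not found at this level: descend into nested dicts such as "text_config".
    for child in config.values():
        found = find_vocab_size(child)
        if found is not None:
            return found
    return None


def resolve_target_vocab_size(t2d: Optional[Sized], verifier_config: dict) -> int:
    """Prefer the length of "t2d"; otherwise fall back to the verifier config."""
    if t2d is not None:
        return len(t2d)
    found = find_vocab_size(verifier_config)
    if found is None:
        raise ValueError("vocab_size not found in verifier config")
    return found


# Illustrative values only; these are not real model configs.
nested_config = {"model_type": "example", "text_config": {"vocab_size": 1000}}
print(resolve_target_vocab_size(None, nested_config))
```

A flat config with a top-level vocab_size is covered by the same search, since the current level is checked before recursing.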

Command used:

speculators convert nvidia/Llama-4-Maverick-17B-128E-Eagle3 \
  --algorithm eagle3 \
  --verifier RedHatAI/Llama-4-Maverick-17B-128E-Instruct-quantized.w4a16 \
  --output-path Llama4-Maverick-Eagle3-Speculators \
  --validate-device cuda:0 \
  --algorithm-kwargs '{"eagle_aux_hidden_state_layer_ids": [1,23,44], "inference_type": "text"}'

Converted checkpoint:

shanjiaz/Llama4-Maverick-Eagle3-Speculators-converted

@github-actions

github-actions bot commented Sep 29, 2025

📦 Build Artifacts Available
The build artifacts (`.whl` and `.tar.gz`) have been successfully generated and are available for download: https://github.com/vllm-project/speculators/actions/runs/18381300407/artifacts/4227990755.
They will be retained for up to 30 days.
Commit: 1f84913

@shanjiaz shanjiaz marked this pull request as ready for review October 1, 2025 01:28
Signed-off-by: shanjiaz <[email protected]>
@shanjiaz shanjiaz requested a review from rahul-tuli October 3, 2025 17:01
Collaborator

@rahul-tuli rahul-tuli left a comment

The PR looks good, but since we are now removing the forward pass through the model, does it still make sense to keep the --validate / --validate-device arguments?

@shanjiaz shanjiaz requested a review from rahul-tuli October 6, 2025 15:12
fynnsu previously approved these changes Oct 6, 2025
Collaborator

@fynnsu fynnsu left a comment

Added a couple of comments.

rahul-tuli previously approved these changes Oct 7, 2025
Collaborator

@rahul-tuli rahul-tuli left a comment

A few questions/nits that can be addressed in a follow-up. Good work on this; LGTM once we raise the NotImplementedError for forward passes.

rahul-tuli previously approved these changes Oct 7, 2025
Collaborator

@rahul-tuli rahul-tuli left a comment

LGTM!

dsikka previously requested changes Oct 7, 2025
Collaborator

@dsikka dsikka left a comment

Do we have test cases for multiple decoder layers?

Signed-off-by: shanjiaz <[email protected]>
@shanjiaz shanjiaz requested review from dsikka and rahul-tuli October 8, 2025 16:03
fynnsu previously approved these changes Oct 8, 2025
Collaborator

@fynnsu fynnsu left a comment

One question below which might require a fix.

Signed-off-by: shanjiaz <[email protected]>
@rahul-tuli
Collaborator

LGTM pending quality!

Signed-off-by: shanjiaz <[email protected]>
@shanjiaz shanjiaz requested a review from fynnsu October 9, 2025 16:26
Collaborator

@rahul-tuli rahul-tuli left a comment

LGTM!

@shanjiaz shanjiaz dismissed dsikka’s stale review October 9, 2025 19:44

Added tests and review has been addressed.

@shanjiaz shanjiaz merged commit 8af566f into main Oct 9, 2025
12 checks passed
@shanjiaz shanjiaz deleted the hz-update-config branch October 9, 2025 19:44

5 participants