Conversation

@amirkl94 (Contributor) commented on Oct 21, 2025:

Purpose

When running cutlass FusedMoE FP8, the scaling factors that are passed are None. This PR passes the correct scaling factors and enables the relevant test.
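As a toy illustration of why a None scale is a problem (this is not vLLM code; the function and variable names are assumptions for the sketch): FP8-quantized values are only meaningful together with their scaling factor, so a kernel that receives None cannot dequantize correctly.

```python
import torch

def dequant_fp8(x_q: torch.Tensor, scale: torch.Tensor | None) -> torch.Tensor:
    # Without its scaling factor, an FP8 tensor cannot be dequantized correctly.
    assert scale is not None, "FP8 scaling factor must not be None"
    return x_q.to(torch.float32) * scale

x = torch.randn(4, 8)
scale = x.abs().max() / 448.0              # 448 is the max magnitude of FP8 E4M3
x_q = (x / scale).to(torch.float8_e4m3fn)  # quantize
x_back = dequant_fp8(x_q, scale)           # round-trips close to x
print((x - x_back).abs().max())
```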

The first review thread is anchored on this diff context:

```python
            apply_router_weight_on_input=apply_router_weight_on_input,
        )
    elif self.flashinfer_moe_backend == FlashinferMoeBackend.CUTLASS:
        assert not renormalize
```
A contributor commented:

  1. Why did you remove the assert that renormalize is not True?
  2. Maybe worth asserting that activation is either "silu" or "relu2"? (See the sketch below.)
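A minimal sketch of the check suggested in item 2; `_SUPPORTED_ACTIVATIONS` and `check_activation` are hypothetical names for illustration, not identifiers from this PR:

```python
_SUPPORTED_ACTIVATIONS = ("silu", "relu2")  # hypothetical constant, per the review comment

def check_activation(activation: str) -> None:
    # Fail fast if the backend is handed an activation it cannot run.
    assert activation in _SUPPORTED_ACTIVATIONS, (
        f"Unsupported activation {activation!r}; expected one of {_SUPPORTED_ACTIVATIONS}"
    )

check_activation("silu")   # passes
check_activation("relu2")  # passes
```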

The second review thread is anchored on this signature and assert:

```python
    expert_tokens_meta: mk.ExpertTokensMetadata | None,
    apply_router_weight_on_input: bool | None,
):
    assert activation == "silu", (
```
A contributor commented:

Maybe worth asserting that activation is one of the keys in activation_str_to_value_map? (See the sketch below.)
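A sketch of that variant, keying the assert off the map itself so the error message stays in sync with what the code supports; the map contents and the `resolve_activation` helper are placeholders, not vLLM's actual values:

```python
# Placeholder contents; the real activation_str_to_value_map is defined in vLLM.
activation_str_to_value_map = {"silu": 0, "relu2": 1}

def resolve_activation(activation: str) -> int:
    # Checking membership in the map keeps the assert and the dispatch in lockstep.
    assert activation in activation_str_to_value_map, (
        f"Unsupported activation {activation!r}; "
        f"expected one of {sorted(activation_str_to_value_map)}"
    )
    return activation_str_to_value_map[activation]
```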


Signed-off-by: Amir Klein <[email protected]>