@daniil-lyakhov daniil-lyakhov commented Nov 19, 2025

Changes

AWQ: the shape of the multiply scale is expanded to match the rank of the activation. Mamba models contain an AWQ pattern whose activation has 3 dimensions, and inserting the default 2D scale leads to an error during inference.
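
The shape expansion described above can be sketched roughly as follows. This is an illustration in plain NumPy, not the NNCF implementation: the helper name `expand_scale_to_rank` and the example shapes are assumptions made for this sketch. Note that NumPy broadcasting would accept the 2D scale as-is; the sketch only shows how the scale's rank can be aligned with the activation's rank, which is what the inserted multiply node needs.

```python
import numpy as np


def expand_scale_to_rank(scale: np.ndarray, activation_rank: int) -> np.ndarray:
    """Prepend singleton axes so that `scale` has `activation_rank` dims.

    Hypothetical helper for illustration only; not part of NNCF.
    """
    missing = activation_rank - scale.ndim
    if missing < 0:
        raise ValueError("scale has more dimensions than the activation")
    return scale.reshape((1,) * missing + scale.shape)


# Assumed example shapes: a 3D activation (batch, sequence, hidden)
# and a default 2D per-channel scale of shape (1, hidden).
activation = np.ones((2, 5, 8), dtype=np.float32)
scale_2d = np.linspace(0.5, 2.0, 8, dtype=np.float32).reshape(1, 8)

scale_3d = expand_scale_to_rank(scale_2d, activation.ndim)
scaled = activation * scale_3d
```

Here `scale_3d` has shape `(1, 1, 8)`, so it has the same number of dimensions as the activation, and the multiplication keeps the activation shape `(2, 5, 8)`.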

Reason for changes

To support the AWQ algorithm for Mamba models.

Related tickets

173277

Tests

tests/cross_fw/test_templates/template_test_weights_compression.py::test_awq_scale_reference is updated to test the non-mergeable AWQ branch, covering both that branch and the new reshape implementation.

@github-actions github-actions bot added NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF ONNX Pull requests that updates NNCF ONNX labels Nov 19, 2025
@daniil-lyakhov daniil-lyakhov force-pushed the dl/awq_batch_dim branch 3 times, most recently from 6e829b5 to 3bb01f7 Compare November 20, 2025 16:03
@daniil-lyakhov daniil-lyakhov marked this pull request as ready for review November 20, 2025 16:40
@daniil-lyakhov daniil-lyakhov requested a review from a team as a code owner November 20, 2025 16:40
merge_weight = self._backend_entity.get_weight(merge_node, port_id, model, graph)
merge_weight = (merge_weight * a_scale).astype(weight_dtype)
self._backend_entity.set_weight(merge_node, port_id, model, graph, merge_weight)
a_scale = fns.transpose(a_scale)

@daniil-lyakhov daniil-lyakhov Nov 20, 2025

Since a_scale is always squeezed before it is used, there is no need to transpose it here.
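
A small sketch of why the transpose is redundant, using plain NumPy as a stand-in for the `fns` tensor functions (an assumption; the shapes are also made up for illustration). It relies on the scale having a single non-singleton axis, as a per-channel scale does.

```python
import numpy as np

# Assumed per-channel scale with one non-singleton axis.
a_scale = np.arange(1, 5, dtype=np.float32).reshape(1, 4)
a_scale_t = np.transpose(a_scale)  # shape (4, 1)

# Squeezing drops the singleton axis in both cases,
# so the results are the same 1D array.
squeezed = np.squeeze(a_scale)
squeezed_t = np.squeeze(a_scale_t)
```

Both `squeezed` and `squeezed_t` have shape `(4,)` and hold the same values, so transposing before a squeeze has no effect in this case.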
