[OpenVINO] Optimized weight compression for FP4 mode #3737
Conversation
  :param precomputed_scale: Optional precomputed scale.
- :return: Returns quantized (for MXFP8_E4M3, FP4 and FP8_E4M3 normalized)
-     weight tensor and corresponding scale tensor.
+ :return: Returns quantized weight tensor and corresponding scale tensor.
MXFP8_E4M3 and FP8_E4M3 are actually not supported by optimized compression, so there is no need to mention them here.
- MXFP4_QUANTILES = np.array(
+ F4E2M1_QUANTILES = np.array(
MXFP4 is a compression format with an f4e2m1 weight, an f8e8m0 scale, and a group size of 32. This grid is defined according to the f4e2m1 data type alone, irrespective of the other parameters, so it was renamed.
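For reference, f4e2m1 (1 sign bit, 2 exponent bits, 1 mantissa bit) represents only a small fixed set of values, which is why the grid depends on the data type alone. The sketch below is illustrative, not NNCF's actual implementation; the names `F4E2M1_GRID` and `quantize_to_grid` are hypothetical:

```python
import numpy as np

# Distinct values representable by f4e2m1 (sign + 2 exponent + 1 mantissa bit).
# The grid depends only on the data type, not on the f8e8m0 scale or the
# group size of 32 that the MXFP4 format adds on top.
F4E2M1_GRID = np.array(
    [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5, 0.0,
      0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
)


def quantize_to_grid(x: np.ndarray, grid: np.ndarray) -> np.ndarray:
    """Round every element of x to the nearest grid value."""
    idx = np.abs(x[..., None] - grid).argmin(axis=-1)
    return grid[idx]
```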
daniil-lyakhov left a comment
Great contribution! A couple of minor comments.
Changes
Added optimized weight compression through OpenVINO models for the FP4 compression mode. Results should be similar to MXFP4 (#3550).
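A minimal numpy sketch of what group-wise FP4 compression computes: a per-group absmax scale, followed by rounding the normalized weight to the f4e2m1 grid. All names and details here are illustrative assumptions, not NNCF's actual API or implementation:

```python
import numpy as np

# Distinct values representable by the f4e2m1 data type (assumed grid).
F4E2M1_GRID = np.array(
    [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5, 0.0,
      0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
)


def fp4_compress(weight: np.ndarray, group_size: int = 32):
    """Toy group-wise FP4 compression (hypothetical helper).

    Computes a per-group absmax scale that maps the largest magnitude in
    each group to 6.0 (the largest f4e2m1 value), then rounds the
    normalized weights to the nearest grid value.
    """
    w = weight.reshape(-1, group_size)
    scale = np.abs(w).max(axis=1, keepdims=True) / 6.0
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero
    normalized = w / scale
    idx = np.abs(normalized[..., None] - F4E2M1_GRID).argmin(axis=-1)
    return F4E2M1_GRID[idx], scale
```

Dequantizing as `quantized * scale` should then approximate the original weight within roughly half the largest grid gap times the scale.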
Reason for changes
Improving UX: FP4 weight compression now uses the faster OpenVINO-model-based path.
Tests
Extended tests/openvino/optimized_functions/test_compression_functions.py