Conversation

@nikita-savelyevv
Collaborator

Changes

Added optimized weight compression through OpenVINO models for the FP4 compression mode (see the usage sketch below). Results should be similar to MXFP4 (#3550).

Reason for changes

Improving UX.

Tests

Extended tests/openvino/optimized_functions/test_compression_functions.py
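
For context, a minimal usage sketch of how the new mode could be exercised through `nncf.compress_weights`. The enum member name `CompressWeightsMode.FP4` and the chosen `group_size` are assumptions based on the PR description, not a confirmed API contract:

```python
# Hedged usage sketch: compressing an OpenVINO model's weights with NNCF.
# `CompressWeightsMode.FP4` and `group_size=32` are assumptions taken from the
# PR description (FP4 behaving similarly to MXFP4), not confirmed API.
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # hypothetical model path

compressed_model = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.FP4,  # assumed enum name for the new mode
    group_size=32,                      # same group size MXFP4 uses
)
ov.save_model(compressed_model, "model_fp4.xml", compress_to_fp16=False)
```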

@nikita-savelyevv requested a review from a team as a code owner on November 14, 2025, 17:02
The github-actions bot added the NNCF OpenVINO label on Nov 14, 2025
  :param precomputed_scale: Optional precomputed scale.
- :return: Returns quantized (for MXFP8_E4M3, FP4 and FP8_E4M3 normalized)
-     weight tensor and corresponding scale tensor.
+ :return: Returns quantized weight tensor and corresponding scale tensor.
@nikita-savelyevv (Collaborator, Author) commented:

MXFP8_E4M3 and FP8_E4M3 are actually not supported by optimized compression, so there is no need to mention them here.
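
For readers less familiar with this code path, here is a schematic sketch (not the actual NNCF implementation) of what "returns quantized weight tensor and corresponding scale tensor" means for a group-wise FP4-style scheme; the function name, group handling, and grid constant are illustrative only:

```python
import numpy as np

# Illustrative f4e2m1 magnitudes; the real grid lives in NNCF itself.
F4E2M1_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_weight_fp4(weight: np.ndarray, group_size: int = 32):
    """Toy group-wise quantizer: returns (quantized weight, per-group scale)."""
    # Assumes weight.size is divisible by group_size, for simplicity.
    groups = weight.reshape(-1, group_size)
    # One scale per group, so the largest magnitude maps onto the grid maximum.
    scale = np.abs(groups).max(axis=1, keepdims=True) / F4E2M1_MAGNITUDES.max()
    scale = np.where(scale == 0.0, 1.0, scale)
    normalized = groups / scale
    # Snap every normalized value to the nearest representable f4e2m1 value.
    signs = np.sign(normalized)
    distances = np.abs(np.abs(normalized)[..., None] - F4E2M1_MAGNITUDES)
    quantized = signs * F4E2M1_MAGNITUDES[distances.argmin(axis=-1)]
    return quantized.reshape(weight.shape), scale
```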

  )
- MXFP4_QUANTILES = np.array(
+ F4E2M1_QUANTILES = np.array(
@nikita-savelyevv (Collaborator, Author) commented:

MXFP4 is a compression format with f4e2m1 weights, an f8e8m0 scale, and a group size of 32, whereas this grid is defined by the f4e2m1 data type alone, irrespective of the other parameters. Hence the rename.
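
To make the distinction concrete, a sketch of the values such an f4e2m1 grid covers, i.e. the 16 values representable by the 4-bit E2M1 format; the exact constant in NNCF may be ordered or stored differently:

```python
import numpy as np

# Sketch of the f4e2m1 value set: 1 sign bit, 2 exponent bits, 1 mantissa bit.
# These 16 values depend only on the data type, not on the scale format
# (f8e8m0 for MXFP4) or the group size, which is the point of the rename.
F4E2M1_QUANTILES = np.array(
    [-6.0, -4.0, -3.0, -2.0, -1.5, -1.0, -0.5, -0.0,
      0.0,  0.5,  1.0,  1.5,  2.0,  3.0,  4.0,  6.0]
)
```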

@daniil-lyakhov (Collaborator) left a comment:

Great contribution! A couple of minor comments.

@ljaljushkin merged commit e9e7cd0 into openvinotoolkit:develop on Nov 19, 2025
20 checks passed

Labels

NNCF OpenVINO (Pull requests that update NNCF OpenVINO)

3 participants