@nikita-savelyevv (Collaborator) commented Jun 18, 2025

Changes

Added optimized weight compression to the MXFP4 data type for the OpenVINO backend.
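For readers unfamiliar with the format, the following is a minimal sketch of MXFP4 quantization as described in the OCP Microscaling (MX) specification: weights are split into blocks of 32, each block shares one power-of-two scale (E8M0), and each element is stored as a 4-bit E2M1 value. This is plain illustrative Python, not the optimized OpenVINO-based implementation added in this PR; the function names `quantize_block` and `dequantize_block` are hypothetical, and the tie-breaking rule is a simplification.

```python
import math

# Magnitudes representable in FP4 E2M1 (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
E2M1_MAX_EXP = 2  # largest E2M1 value is 6.0 = 1.5 * 2**2


def quantize_block(block):
    """Quantize one block of floats; return (shared_exp, e2m1_values)."""
    max_abs = max(abs(v) for v in block)
    if max_abs == 0.0:
        return 0, [0.0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1,
    # so floor(log2(max_abs)) == e - 1.
    _, e = math.frexp(max_abs)
    shared_exp = (e - 1) - E2M1_MAX_EXP
    shared_exp = max(-127, min(127, shared_exp))  # range of the 8-bit E8M0 scale
    scale = 2.0 ** shared_exp
    quantized = []
    for v in block:
        mag = min(abs(v) / scale, E2M1_VALUES[-1])  # saturate at 6.0
        # Nearest grid point; ties fall to the smaller magnitude in this sketch.
        nearest = min(E2M1_VALUES, key=lambda q: abs(q - mag))
        quantized.append(math.copysign(nearest, v))
    return shared_exp, quantized


def dequantize_block(shared_exp, quantized):
    """Reconstruct approximate floats from the shared exponent and E2M1 values."""
    return [q * 2.0 ** shared_exp for q in quantized]


block = [1.0, -3.0, 0.3, 6.0] + [0.0] * 28  # MX blocks hold 32 elements
shared_exp, q = quantize_block(block)
restored = dequantize_block(shared_exp, q)
```

With one 8-bit scale per 32 four-bit elements, the format costs 4.25 bits per weight on average.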

| Model | Memory Before (MiB) | Memory After (MiB) | Time Before (sec) | Time After (sec) |
|---|---|---|---|---|
| llama-3.2-1b bf16 | 2778.55 | 2548.37 (-8.29%) | 66.33 | 24.95 (-62.40%) |
| llama-3.2-1b fp16 | 3434.61 | 2963.03 (-13.73%) | 62.12 | 24.85 (-59.98%) |
| llama-3.2-1b fp32 | 2041.79 | 1576.43 (-22.81%) | 62.72 | 25.85 (-58.77%) |
| phi4-mini bf16 | 6384.81 | 5725.66 (-10.33%) | 197.36 | 66.22 (-66.44%) |
| phi4-mini fp16 | 8863.53 | 8375.75 (-5.51%) | 195.85 | 66.93 (-65.82%) |
| phi4-mini fp32 | 4406.82 | 3897.91 (-11.54%) | 195.25 | 68.83 (-64.76%) |
| llama-3.1-8b bf16 | 7297.72 | 6096.06 (-16.46%) | 413.86 | 135.25 (-67.32%) |
| llama-3.1-8b fp16 | 7946.93 | 8311.89 (+4.58%) | 413.64 | 136.05 (-67.11%) |
| llama-3.1-8b fp32 | 7310.48 | 5043.89 (-30.03%) | 411.45 | 140.66 (-65.81%) |
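The parenthesized percentages are signed relative changes against the "Before" column. A small sketch of that arithmetic, using hypothetical helper names, is shown here; the published percentages were presumably computed from unrounded measurements, so recomputing from the rounded table values may differ in the last digit.

```python
def relative_change(before, after):
    """Signed percentage change from `before` to `after` (negative means a reduction)."""
    return (after - before) / before * 100.0


def speedup(before, after):
    """How many times faster the `after` run is than the `before` run."""
    return before / after


# Compression time for llama-3.1-8b bf16, taken from the table above.
time_before, time_after = 413.86, 135.25
print(f"{relative_change(time_before, time_after):+.2f}%")
print(f"{speedup(time_before, time_after):.2f}x")
```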

Reason for changes

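Improving user experience: in the measurements above, compression to MXFP4 is roughly 2.4x to 3.1x faster, and memory use drops in every configuration except llama-3.1-8b fp16.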

Related tickets

164717

Tests

Extended optimized compression tests.

@nikita-savelyevv changed the title from "Optimized openvino compression to f4e2m1 data type" to "Optimized openvino weights compression to f4e2m1 data type" Jun 18, 2025
@github-actions bot added the "NNCF OpenVINO" and "NNCF PTQ" labels Jun 18, 2025
@github-actions bot removed the "NNCF PTQ" label Jul 29, 2025
@nikita-savelyevv changed the title from "Optimized openvino weights compression to f4e2m1 data type" to "[OV] Optimized compression to f4e2m1 data type" Jul 29, 2025
@nikita-savelyevv changed the title from "[OV] Optimized compression to f4e2m1 data type" to "[OV] Optimized compression to MXFP4 data type" Oct 8, 2025
@nikita-savelyevv marked this pull request as ready for review October 9, 2025 08:45
@nikita-savelyevv requested a review from a team as a code owner October 9, 2025 08:45
@ljaljushkin merged commit 2925bad into openvinotoolkit:develop Oct 16, 2025
20 checks passed
ljaljushkin pushed a commit that referenced this pull request Nov 19, 2025
### Changes

Added optimized weight compression through OpenVINO models for FP4
compression mode. Results should be similar to MXFP4 (#3550).

### Reason for changes

Improving UX.

### Tests

Extended
`tests/openvino/optimized_functions/test_compression_functions.py`
Labels

Code Freeze, NNCF OpenVINO
