Conversation

@nikita-savelyevv (Collaborator)

Changes

Reason for changes

UX improvement. For phi3-mini-4k-instruct in FP8_E4M3 mode, I get about a 17x time reduction: 191 sec -> 11 sec.

Related tickets

Tests

@github-actions bot added the NNCF OpenVINO label on Nov 20, 2025
_f16_to_f8e4m3_bits_vec = np.vectorize(_f16_to_f8e4m3_bits_scalar, otypes=[np.uint8])


def fp32_to_fp8e4m3_values(x: np.ndarray) -> np.ndarray:
@nikita-savelyevv (Collaborator, Author):
The reference conversion implementation unfortunately looks a bit ugly. It also can't be done within our nncf.Tensor framework yet, because _f16_to_f8e4m3_bits_scalar has to be vectorized with NumPy.

The reason behind all this is that the fp32 -> f8e4m3 conversion is done through fp16, i.e. fp32 -> fp16 -> f8e4m3. It is implemented this way on the OpenVINO (and oneDNN) side because it is more hardware-friendly. Please see the following links; an illustrative sketch follows the list below:

  1. OpenVINO implementation: https://github.com/openvinotoolkit/openvino/blame/master/src/core/src/type/float8_e4m3.cpp
  2. PR that added the 2-step conversion: "Implement 2-step conversion from fp32 to fp8" (openvino#28501)
