[float8] Support power of 2 scales with PerRow scales for inference #2182

Open
@danielvegamyhre

Summary

  • float8 training with rowwise scales uses power-of-2 scales by default, to reduce quantization error.
  • float8 inference with Float8DynamicActivationFloat8WeightConfig using PerRow scaling does not support power-of-2 scales.
  • Users have reported that they want to use power-of-2 scales for inference after training with them.
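
For readers unfamiliar with the technique, here is a minimal sketch of what "power of 2 scales" means for rowwise scaling. It is written in plain Python rather than against the torchao API. The helper name `rowwise_scales`, its parameters, and the sample data are hypothetical and chosen for illustration only. The sketch assumes the float8_e4m3fn format, whose largest finite value is 448, and assumes the scale is rounded down to the nearest power of 2 so that scaled values stay in range.

```python
import math

E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn


def rowwise_scales(matrix, round_to_power_of_2=True, eps=1e-12):
    """Return one scale per row so that row * scale fits in the float8 range.

    Hypothetical helper for illustration; not part of torchao.
    """
    scales = []
    for row in matrix:
        # Per-row absolute maximum, clamped to avoid division by zero.
        amax = max(max(abs(x) for x in row), eps)
        scale = E4M3_MAX / amax
        if round_to_power_of_2:
            # Round down to a power of 2 (assumption): multiplying by a power
            # of 2 only changes the exponent, so scaling adds no mantissa
            # rounding error, and rounding down keeps values <= E4M3_MAX.
            scale = 2.0 ** math.floor(math.log2(scale))
        scales.append(scale)
    return scales


weights = [[0.5, -3.0, 1.25], [100.0, 7.0, -20.0]]
scales = rowwise_scales(weights)
unrounded = rowwise_scales(weights, round_to_power_of_2=False)
```

With these sample rows, the unrounded scales are roughly 149.33 and 4.48, while the rounded ones are 128 and 4. The request in this issue amounts to applying the same kind of rounding on the inference path when PerRow granularity is used.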

cc @drisspg @vkuzo
