[float8] Support power of 2 scales with PerRow scaling for inference

## Summary
- float8 training with rowwise scales uses power of 2 scales by default, to reduce quantization error.
- float8 inference with `Float8DynamicActivationFloat8WeightConfig` using `PerRow` scaling does not support power of 2 scales.
- Users have reported that they want to use power of 2 scales for inference after training with them.
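
For context, here is a minimal pure-Python sketch of what rounding a rowwise scale to a power of 2 could look like. It is an illustration only, not the torchao implementation: the helper name `rowwise_scales`, the `round_to_power_of_2` flag, and the choice to round down are assumptions, and it assumes the e4m3 format with a maximum finite value of 448.

```python
import math

# Largest finite value representable in float8 e4m3fn.
E4M3_MAX = 448.0


def rowwise_scales(matrix, round_to_power_of_2=False):
    """Return one scale per row so that row * scale fits in the e4m3 range.

    Hypothetical helper for illustration; not part of any library API.
    """
    scales = []
    for row in matrix:
        # Per-row absolute maximum, clamped to avoid dividing by zero.
        amax = max(max(abs(v) for v in row), 1e-12)
        scale = E4M3_MAX / amax
        if round_to_power_of_2:
            # Round down to the nearest power of 2 so the scaled row
            # still fits within [-E4M3_MAX, E4M3_MAX].
            scale = 2.0 ** math.floor(math.log2(scale))
        scales.append(scale)
    return scales


weights = [[1.0, -3.0], [0.5, 100.0]]
plain = rowwise_scales(weights)
pow2 = rowwise_scales(weights, round_to_power_of_2=True)
```

Multiplying or dividing by a power of 2 only shifts the floating-point exponent (barring overflow or underflow), so applying and removing the scale adds no mantissa rounding of its own. The cost is that rounding down can leave up to one binade of the float8 range unused.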

cc @drisspg @vkuzo