[Headers][X86] Convert bf16 to f32 conversions to generic constexpr implementations #154911

@RKSimon

Now that we have the __bf16 type, the BF16 conversion intrinsics can use standard conversions instead of bit twiddling, which interferes with FP optimisations. For example:

static __inline__ float __DEFAULT_FN_ATTRS _mm_cvtsbh_ss(__bf16 __A) {
  return __builtin_ia32_cvtsbf162ss_32(__A); // remove __builtin_ia32_cvtsbf162ss_32
}

-->

static __inline__ float __DEFAULT_FN_ATTRS_CONSTEXPR _mm_cvtsbh_ss(__bf16 __A) {
  return (float)(__A);
}

and

static __inline__ __m512 __DEFAULT_FN_ATTRS512 _mm512_cvtpbh_ps(__m256bh __A) {
  return _mm512_castsi512_ps((__m512i)_mm512_slli_epi32(
      (__m512i)_mm512_cvtepi16_epi32((__m256i)__A), 16));
}

-->

static __inline__ __m512 __DEFAULT_FN_ATTRS512_CONSTEXPR _mm512_cvtpbh_ps(__m256bh __A) {
  return (__m512)__builtin_convertvector(__A, __v16sf);
}
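
For reference, the bit twiddling being replaced relies on bf16 being the upper 16 bits of an IEEE-754 binary32 value: the old vector code widens each 16-bit lane to 32 bits and shifts it left by 16. A minimal portable sketch of that per-lane operation follows. The helper name `bf16_bits_to_float` is hypothetical and not part of any Clang header, and it takes the raw bit pattern as a `uint16_t` so that it builds without `__bf16` support.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the Clang headers): widen a raw bf16 bit
   pattern to float by placing its 16 bits in the upper half of a 32-bit
   word, which is what the shift-by-16 implementations do per lane. */
static float bf16_bits_to_float(uint16_t bits) {
  uint32_t wide = (uint32_t)bits << 16; /* low 16 mantissa bits become zero */
  float out;
  memcpy(&out, &wide, sizeof out);      /* bit copy without aliasing issues */
  return out;
}
```

Because every bf16 value is exactly representable as a float, a plain `(float)` cast or `__builtin_convertvector` expresses the same widening while leaving the compiler free to fold and optimise it.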

All the mask/maskz cvtpbh variants in avx512bf16intrin.h and avx512vlbf16intrin.h can be updated similarly.
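
As a rough illustration of what the mask/maskz variants need to preserve, here is a portable scalar model of the 16-lane case. It assumes the usual AVX-512 convention: a mask variant keeps the corresponding lane of a source vector where the mask bit is clear, and a maskz variant zeroes it. The names `widen_bf16`, `model_mask_cvtpbh_ps` and `model_maskz_cvtpbh_ps` are hypothetical and only model the semantics; they are not the header implementations.

```c
#include <stdint.h>
#include <string.h>

#define LANES 16 /* the 512-bit cvtpbh conversion produces 16 float lanes */

/* Hypothetical helper: raw bf16 bits -> float (bits go in the upper half). */
static float widen_bf16(uint16_t bits) {
  uint32_t wide = (uint32_t)bits << 16;
  float out;
  memcpy(&out, &wide, sizeof out);
  return out;
}

/* Model of a mask variant: lanes with a clear mask bit are taken from src. */
static void model_mask_cvtpbh_ps(float dst[LANES], const float src[LANES],
                                 uint16_t mask, const uint16_t a[LANES]) {
  for (int i = 0; i < LANES; ++i)
    dst[i] = ((mask >> i) & 1) ? widen_bf16(a[i]) : src[i];
}

/* Model of a maskz variant: lanes with a clear mask bit are set to zero. */
static void model_maskz_cvtpbh_ps(float dst[LANES], uint16_t mask,
                                  const uint16_t a[LANES]) {
  for (int i = 0; i < LANES; ++i)
    dst[i] = ((mask >> i) & 1) ? widen_bf16(a[i]) : 0.0f;
}
```

In the headers the per-lane selection would presumably wrap the converted vector rather than loop over lanes; the model only pins down the expected results.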

Labels: backend:X86, clang:frontend, clang:headers, constexpr, good first issue