math_brute_force: fix `fdim` to use device's rounding when converting result back to half. #2223

cycheng · 2025-01-10T09:45:29Z

In the half-precision `fdim` test, the original code always used CL_HALF_RTE to convert the float reference result back to half. On hardware whose half rounding mode is RTZ, the reference then differs from the device result. Some examples:

  fdim(0x365f, 0xdc63) = fdim( 0.398193f,  -280.75f)     =   281.148193f (RTE=0x5c65, RTZ=0x5c64)
  fdim(0xa4a3, 0xf0e9) = fdim(-0.018112f, -10056.0f)     = 10055.981445f (RTE=0x70e9, RTZ=0x70e8)
  fdim(0x1904, 0x9ab7) = fdim( 0.002449f,    -0.003279f) =     0.005728f (RTE=0x1dde, RTZ=0x1ddd)

Fixed this by using the hardware's default rounding mode when converting the result back to half.
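To make the mismatch concrete, here is a minimal sketch of a float-to-half conversion under both rounding modes. It is not the CTS code and not the `cl_half.h` implementation: `HalfRounding` and `float_to_half_bits` are hypothetical names, and the function assumes a positive, finite input in the normal half range.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for cl_half_rounding_mode (CL_HALF_RTE / CL_HALF_RTZ).
enum class HalfRounding { RTE, RTZ };

// Convert a positive, finite float in the normal half range [2^-14, 65504]
// to its half bit pattern. Subnormals, overflow, NaN and negative inputs are
// deliberately out of scope for this sketch.
inline uint16_t float_to_half_bits(float f, HalfRounding mode)
{
    assert(f >= 6.103515625e-05f && f <= 65504.0f);

    uint32_t u;
    std::memcpy(&u, &f, sizeof(u));
    const uint32_t exponent = (u >> 23) & 0xffu; // biased float exponent
    const uint32_t mantissa = u & 0x7fffffu;     // 23-bit float mantissa

    // Rebias the exponent (127 -> 15) and keep the top 10 mantissa bits.
    // This alone is the RTZ result: the low 13 bits are simply dropped.
    uint32_t h = ((exponent - 127u + 15u) << 10) | (mantissa >> 13);

    if (mode == HalfRounding::RTE) {
        const uint32_t dropped = mantissa & 0x1fffu;
        const bool above_half = dropped > 0x1000u;
        const bool tie_to_even = (dropped == 0x1000u) && (h & 1u);
        // A carry out of the mantissa correctly bumps the exponent field.
        if (above_half || tie_to_even) ++h;
    }
    return static_cast<uint16_t>(h);
}
```

For the first example above, `float_to_half_bits(281.148193f, HalfRounding::RTE)` gives 0x5c65 while `HalfRounding::RTZ` gives 0x5c64, which is the one-bit difference the test was tripping over.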

test_conformance/math_brute_force/binary_half.cpp

cycheng · 2025-01-14T10:21:52Z

Thank you @svenvh. In the current change, we only set halfRoundingMode to CL_HALF_RTZ when both isFDim and gIsInRTZMode are true.

I can try to apply the change to all binary f16 operations and redo the CTS tests on my side.
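As a rough illustration of that condition (not the actual patch), the selection could look like the sketch below. `HalfRounding` and `pick_half_rounding` are hypothetical names; only `isFDim` and the RTZ-mode flag come from the discussion.

```cpp
// Hypothetical stand-in for cl_half_rounding_mode (CL_HALF_RTE / CL_HALF_RTZ).
enum class HalfRounding { RTE, RTZ };

// Use RTZ for the reference conversion only when the function under test is
// fdim AND the device runs in RTZ mode; every other case keeps RTE.
inline HalfRounding pick_half_rounding(bool isFDim, bool isInRTZMode)
{
    return (isFDim && isInRTZMode) ? HalfRounding::RTZ : HalfRounding::RTE;
}
```

Restricting the switch to `fdim` keeps the reference conversion for all other binary half functions unchanged.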

cycheng · 2025-01-14T10:22:16Z

@svenvh
CC @AhmedAmraniAkdi , @paulfradgley

After changing halfRoundingMode to CL_HALF_RTZ for all binary operations, I saw 4 test failures on our RTZ device:

hypot
- Case 1:
  inputs = 0xfbe7, 0xfabb = -64736, -55136
  double r = std::hypot(-64736, -55136) = 85033.6875
  cl_half_from_float(r, CL_HALF_RTZ) = 0x7bff = 65504
  cl_half_from_float(r, CL_HALF_RTE) = 0x7c00 = inf
  ulp = -304.651367
- Case 2:
  inputs = 0xfbfe, 0x7387 = -65472, 15416
  double r = std::hypot(-65472, 15416) = 67262.4375
  cl_half_from_float(r, CL_HALF_RTZ) = 0x7bff = 65504
  cl_half_from_float(r, CL_HALF_RTE) = 0x7c00 = inf
  ulp = -26.975624
ldexp
- Case 1:
  inputs = 0x77d5, 162
  fesetround(FE_TOWARDZERO)
  double r = std::ldexp(32080, 162) = 3.402823e+38
  cl_half_from_float(r, CL_HALF_RTZ) = 0x7bff = 65504
  cl_half_from_float(r, CL_HALF_RTE) = 0x7c00 = inf
  ulp = -2004.999877
pow
- Case 1:
  inputs = 0x3750, 0xcfc2
  double r = std::pow(0.45703125, -31.03125) = 3.567007e+10
  cl_half_from_float(r, CL_HALF_RTZ) = 0x7bff = 65504
  cl_half_from_float(r, CL_HALF_RTE) = 0x7c00 = inf
  ulp = -1063.048706
powr
- ..
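All of the cases above share one pattern: the reference result exceeds the largest finite half (65504), so RTZ saturates to 0x7bff while RTE overflows to infinity. The sketch below isolates just that overflow step. `HalfRounding` and `overflow_to_half_bits` are hypothetical names, and the function assumes a positive, finite input of at least 65504.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for cl_half_rounding_mode (CL_HALF_RTE / CL_HALF_RTZ).
enum class HalfRounding { RTE, RTZ };

// Half bit pattern for a positive, finite reference value r >= 65504.
inline uint16_t overflow_to_half_bits(double r, HalfRounding mode)
{
    assert(r >= 65504.0);
    const uint16_t kHalfMax = 0x7bff; // 65504, largest finite half
    const uint16_t kHalfInf = 0x7c00; // +infinity

    // RTZ never rounds away from zero, so it can never reach infinity.
    if (mode == HalfRounding::RTZ) return kHalfMax;

    // RTE: 65520 is the midpoint between 65504 and the next step, 65536,
    // which is not representable. The tie goes to the even mantissa
    // (the 65536 side), so anything at or above 65520 becomes infinity.
    return (r >= 65520.0) ? kHalfInf : kHalfMax;
}
```

With `r = 85033.6875` from the first hypot case, RTZ yields 0x7bff and RTE yields 0x7c00, so the two modes produce different reference values for the same input.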

bashbaug

LGTM

bashbaug · 2025-01-28T20:32:57Z

This has three approvals from three different companies, so no reason to wait to merge it.

Commit 1d3c814: math_brute_force: fix fdim to use device's rounding when converting result back to half.

svenvh reviewed Jan 10, 2025

svenvh approved these changes Jan 14, 2025

svenvh added the focused review label Jan 14, 2025

bashbaug requested a review from lakshmih January 14, 2025 17:03

lakshmih approved these changes Jan 27, 2025

bashbaug approved these changes Jan 28, 2025

bashbaug merged commit 5749818 into KhronosGroup:main Jan 28, 2025
9 checks passed
