Port the CORE-MATH version of `cbrt` #475

tgross35 · 2025-01-24T12:05:02Z

Replace our current implementation with one that is correctly rounded.

Source: https://gitlab.inria.fr/core-math/core-math/-/blob/81d447bb1c46592291bec3476bc24fa2c2688c67/src/binary64/cbrt/cbrt.c

ci: allow-regressions

tgross35 · 2025-01-24T22:01:46Z

It looks like instruction counts increase by about 5x (+519%, roughly 6.2x the baseline) for targets without hardware FMA:

icount::icount_bench_cbrt_group::icount_bench_cbrt logspace:setup_cbrt()
Performance has regressed: Instructions (177374 > 28650) regressed by +519.106% (>+5.00000)
  Baselines:                      softfloat|softfloat
  Instructions:                      177374|28650                (+519.106%) [+6.19106x]
  L1 Hits:                           209370|31728                (+559.890%) [+6.59890x]
  L2 Hits:                                1|1                    (No change)
  RAM Hits:                              44|10                   (+340.000%) [+4.40000x]
  Total read+write:                  209415|31739                (+559.803%) [+6.59803x]
  Estimated Cycles:                  210915|32083                (+557.404%) [+6.57404x]

I think this is fine as long as the hardfloat version is reasonable. I marked allow-regressions for the softfloat version.

tgross35 · 2025-01-24T22:33:12Z

Hm, even with hardware FMA it is roughly 2.5x slower by instruction count. This is probably still tolerable, and future optimization might be possible.

icount::icount_bench_cbrt_group::icount_bench_cbrt logspace:setup_cbrt()
Performance has regressed: Instructions (72584 > 28650) regressed by +153.347% (>+5.00000)
  Baselines:                      hardfloat|hardfloat
  Instructions:                       72584|28650                (+153.347%) [+2.53347x]
  L1 Hits:                            95112|31730                (+199.754%) [+2.99754x]
  L2 Hits:                                3|0                    (+++inf+++) [+++inf+++]
  RAM Hits:                              29|9                    (+222.222%) [+3.22222x]
  Total read+write:                   95144|31739                (+199.770%) [+2.99770x]
  Estimated Cycles:                   96142|32045                (+200.022%) [+3.00022x]
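
For readers comparing the two reports, here is a minimal sketch of how the percentage and the bracketed factor relate, assuming the factor is simply the new count divided by the baseline. The function name `regression` is hypothetical and is not part of the benchmark tooling; the instruction counts are copied from the output above.

```rust
/// Return (percent increase, new/old ratio) for a pair of counts.
/// Hypothetical helper for illustration only.
fn regression(new: u64, old: u64) -> (f64, f64) {
    let factor = new as f64 / old as f64;
    let percent = (factor - 1.0) * 100.0;
    (percent, factor)
}

fn main() {
    // Instruction counts taken from the two benchmark reports above.
    let runs = [
        ("softfloat", 177_374u64, 28_650u64),
        ("hardfloat", 72_584, 28_650),
    ];
    for (label, new, old) in runs {
        let (percent, factor) = regression(new, old);
        println!("{label}: +{percent:.3}% ({factor:.2}x the baseline)");
    }
}
```

Read this way, a regression of +519% corresponds to about 6.2x the baseline instruction count, and +153% to about 2.5x.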

beetrees · 2025-01-25T04:31:51Z

src/math/cbrt.rs

+fn fmaf64(x: f64, y: f64, z: f64) -> f64 {
+    #[cfg(intrinsics_enabled)]
+    {
+        return unsafe { core::intrinsics::fmaf64(x, y, z) };
+    }
+
+    #[cfg(not(intrinsics_enabled))]
+    {
+        return super::fma(x, y, z);
+    }
+}


Would this be better as a method on support::Float, similar to abs and copysign? That way the implementation could be shared between this and other (future) users.

Yeah, that would be preferable. I just did this as a temporary workaround until f16 and f128 also have an implementation, to keep the impl_float macro a bit simpler.

(I am hoping it will be possible to make this generic by putting the magic numbers in a helper trait and recalculating the polynomials for f128. But I'll get this cleaned up to merge before starting on that).
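
As a rough illustration of the suggestion above, a fused multiply-add could be exposed as a trait method so generic code can share it. This is only a sketch under stated assumptions: the trait name `FloatFma` and the `horner` function are hypothetical, and `mul_add` from the standard library stands in for the crate's intrinsic or software `fma` paths shown in the diff.

```rust
/// Hypothetical stand-in for a float support trait with an `fma` method.
trait FloatFma: Copy {
    /// Compute `self * y + z` with a single rounding.
    fn fma(self, y: Self, z: Self) -> Self;
}

impl FloatFma for f64 {
    fn fma(self, y: Self, z: Self) -> Self {
        // Assumption: std's `mul_add` replaces the cfg-gated intrinsic
        // and the software fallback used in the actual patch.
        self.mul_add(y, z)
    }
}

impl FloatFma for f32 {
    fn fma(self, y: Self, z: Self) -> Self {
        self.mul_add(y, z)
    }
}

/// Evaluate c[0] + c[1]*x + c[2]*x^2 with Horner's scheme, generic over
/// any type that provides `fma`.
fn horner<F: FloatFma>(x: F, c: [F; 3]) -> F {
    c[2].fma(x, c[1]).fma(x, c[0])
}

fn main() {
    // 1 + 2*2 + 3*2^2 = 17
    println!("{}", horner(2.0f64, [1.0, 2.0, 3.0]));
    println!("{}", horner(2.0f32, [1.0, 2.0, 3.0]));
}
```

With the method on a trait, a polynomial evaluation like `horner` is written once and reused for every float width that implements it.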

beetrees · 2025-01-25T05:31:24Z

src/math/fenv.rs

+    Nearest = 0,
+    Downward = 1,
+    Upward = 2,
+    ToZero = 3,


Suggested change

-    Nearest = 0,
-    Downward = 1,
-    Upward = 2,
-    ToZero = 3,
+    Nearest = FE_TONEAREST as isize,
+    Downward = FE_DOWNWARD as isize,
+    Upward = FE_UPWARD as isize,
+    ToZero = FE_TOWARDZERO as isize,

To keep the constants specified in one place (could also do it the other way round, with const FE_TONEAREST: i32 = Rounding::Nearest as i32 etc., if you prefer).

Thanks, that is a good idea.

I don't really know what we should or shouldn't be doing to handle rounding modes; there is a moderate amount of untested code in this repo to handle them. I opened https://github.com/rust-lang/libm/issues/480 if you have any suggestions.
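
A self-contained sketch of the enum with discriminants tied to the `FE_*` constants, plus a `match` on it, might look like the following. The constant values are assumed to equal the literals in the diff above (0 through 3), and `from_raw` and `is_directed` are hypothetical helpers that are not taken from the patch.

```rust
// Assumed values, chosen to match the discriminants in the original diff.
pub const FE_TONEAREST: i32 = 0;
pub const FE_DOWNWARD: i32 = 1;
pub const FE_UPWARD: i32 = 2;
pub const FE_TOWARDZERO: i32 = 3;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Rounding {
    // Each discriminant is defined in one place: the FE_* constant.
    Nearest = FE_TONEAREST as isize,
    Downward = FE_DOWNWARD as isize,
    Upward = FE_UPWARD as isize,
    ToZero = FE_TOWARDZERO as isize,
}

impl Rounding {
    /// Hypothetical helper: convert a raw mode value into the enum.
    pub fn from_raw(raw: i32) -> Option<Self> {
        match raw {
            FE_TONEAREST => Some(Self::Nearest),
            FE_DOWNWARD => Some(Self::Downward),
            FE_UPWARD => Some(Self::Upward),
            FE_TOWARDZERO => Some(Self::ToZero),
            _ => None,
        }
    }

    /// Hypothetical helper: true for any mode other than round-to-nearest.
    pub fn is_directed(self) -> bool {
        !matches!(self, Self::Nearest)
    }
}

fn main() {
    let rm = Rounding::from_raw(FE_UPWARD).unwrap();
    println!("{rm:?}, directed: {}", rm.is_directed());
}
```

Callers can then `match` on named variants instead of comparing against bare integers, and the numeric values cannot drift apart from the constants.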

beetrees

LGTM

beetrees · 2025-01-25T05:38:58Z

src/math/cbrt.rs

+    let rm = Rounding::get();
+
+    /* rm=0 for rounding to nearest, and other values for directed roundings */
+    let hx: u64 = x.to_bits();


nit: this comment should be next to the `let rm` above, not the `let hx` below. Also, the comment may need updating now that `rm` is an enum, not an integer.

Good catch, I will update this. Thank you for reviewing!

Before merging I still want to include the .wc tests from core-math. Or maybe download/submodule those similar to musl since each is ~100k entries.

I haven't gotten around to this unfortunately. There is now https://github.com/rust-lang/libm/blob/e66ec88df8325fbe151939c4dc0a9f7c25759fdf/crates/libm-test/src/gen/case_list.rs; it should be reasonably easy to parse .wc files there from a core-math submodule.

That being said, our tests are thorough enough that I think I can just merge this now and add that later.


tgross35 force-pushed the core-cbrt branch 2 times, most recently from 8d3422e to f453c3e Compare January 25, 2025 00:42

tgross35 changed the title ~~core-math cbrt~~ Port the CORE-MATH version of cbrt Jan 25, 2025

tgross35 marked this pull request as ready for review January 25, 2025 00:43

beetrees reviewed Jan 25, 2025


beetrees approved these changes Jan 25, 2025


tgross35 force-pushed the core-cbrt branch 2 times, most recently from 98f76af to c0b4f0a Compare February 7, 2025 22:47

tgross35 added 3 commits February 7, 2025 23:04

Add an enum representation of rounding mode

68bfe1d

We only round using nearest, but some incoming code has more handling of rounding modes that would be nice to `match` on. Rather than checking integer values, add an enum representation.

Port the CORE-MATH version of cbrt

b19cde9

Replace our current implementation with one that is correctly rounded. Source: https://gitlab.inria.fr/core-math/core-math/-/blob/81d447bb1c46592291bec3476bc24fa2c2688c67/src/binary64/cbrt/cbrt.c

Decrease the allowed error for cbrt

f069b54

With the correctly rounded implementation, we can reduce the ULP requirement for `cbrt` to zero. There is still an override required for `i586` because of the imprecise FMA.
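
For context on what a zero-ULP requirement means, here is a minimal sketch of measuring the distance in units in the last place between two `f64` values by comparing bit patterns. The helper names `ordered` and `ulp_diff` are hypothetical, and this is not necessarily how the crate's test harness computes the error.

```rust
/// Map an f64 onto a signed integer line on which adjacent floats differ
/// by exactly 1 and +0.0 and -0.0 coincide. Hypothetical helper.
fn ordered(x: f64) -> i64 {
    let bits = x.to_bits() as i64;
    // Negative floats have the sign bit set; flip them so that larger
    // magnitudes map to smaller integers.
    if bits < 0 { i64::MIN - bits } else { bits }
}

/// Distance between `a` and `b` in ULPs, or `None` if either is NaN.
fn ulp_diff(a: f64, b: f64) -> Option<u64> {
    if a.is_nan() || b.is_nan() {
        return None;
    }
    Some(ordered(a).abs_diff(ordered(b)))
}

fn main() {
    // The float immediately above 1.0 is one ULP away from it.
    let next = f64::from_bits(1.0f64.to_bits() + 1);
    println!("{:?}", ulp_diff(1.0, next));

    // Compare the platform's cbrt against the exact cube root of 27.
    println!("{:?}", ulp_diff(27.0f64.cbrt(), 3.0));
}
```

An allowed error of zero means this distance must be 0 against the reference result, that is, the output must match bit for bit, apart from the sign of zero in this particular sketch.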

tgross35 force-pushed the core-cbrt branch from c0b4f0a to f069b54 Compare February 7, 2025 23:04

tgross35 merged commit 670f8a8 into rust-lang:master Feb 7, 2025
35 checks passed

tgross35 deleted the core-cbrt branch February 7, 2025 23:56

tgross35 added a commit that referenced this pull request Apr 18, 2025

Merge pull request #475 from tgross35/core-cbrt

3440603

Port the CORE-MATH version of `cbrt`
