Reimplement the generic fmod #536

quaternic · 2025-04-16T00:41:59Z

Full reimplementation for fmod, that should be somewhat cleaner. This was showing a decent perf gain locally, but the major improvement will come later by implementing the reduction helper with something smarter, which does involve some tradeoffs and alternatives to consider.

quaternic · 2025-04-16T00:48:38Z

error: consider choosing a more descriptive name
  --> src/math/generic/fmod.rs:21:9
   |
21 |     let _1 = F::Int::ONE;
   |         ^^

Thanks, clippy, very helpful

tgross35 · 2025-04-16T01:07:38Z

 icount::icount_bench_fmod_group::icount_bench_fmod logspace:setup_fmod()
  Baselines:                      hardfloat|hardfloat
  Instructions:                     1027764|1103643              (-6.87532%) [-1.07383x]
  L1 Hits:                          1033134|1105843              (-6.57498%) [-1.07038x]
  L2 Hits:                                3|1                    (+200.000%) [+3.00000x]
  RAM Hits:                              13|11                   (+18.1818%) [+1.18182x]
  Total read+write:                 1033150|1105855              (-6.57455%) [-1.07037x]
  Estimated Cycles:                 1033604|1106233              (-6.56543%) [-1.07027x]
icount::icount_bench_fmodf128_group::icount_bench_fmodf128 logspace:setup_fmodf128()
  Baselines:                      hardfloat|hardfloat
  Instructions:                    30704086|31329937             (-1.99761%) [-1.02038x]
  L1 Hits:                         30720037|31364055             (-2.05336%) [-1.02096x]
  L2 Hits:                                6|11                   (-45.4545%) [-1.83333x]
  RAM Hits:                              24|36                   (-33.3333%) [-1.50000x]
  Total read+write:                30720067|31364102             (-2.05341%) [-1.02096x]
  Estimated Cycles:                30720907|31365370             (-2.05470%) [-1.02098x]
icount::icount_bench_fmodf16_group::icount_bench_fmodf16 logspace:setup_fmodf16()
  Baselines:                      hardfloat|hardfloat
  Instructions:                       78528|84682                (-7.26719%) [-1.07837x]
  L1 Hits:                            92228|101322               (-8.97535%) [-1.09860x]
  L2 Hits:                                3|1                    (+200.000%) [+3.00000x]
  RAM Hits:                              19|17                   (+11.7647%) [+1.11765x]
  Total read+write:                   92250|101340               (-8.96980%) [-1.09854x]
  Estimated Cycles:                   92908|101922               (-8.84402%) [-1.09702x]
icount::icount_bench_fmodf_group::icount_bench_fmodf logspace:setup_fmodf()
  Baselines:                      hardfloat|hardfloat
  Instructions:                      156024|187367               (-16.7281%) [-1.20089x]
  L1 Hits:                           158776|189568               (-16.2432%) [-1.19393x]
  L2 Hits:                                1|1                    (No change)
  RAM Hits:                              11|10                   (+10.0000%) [+1.10000x]
  Total read+write:                  158788|189579               (-16.2418%) [-1.19391x]
  Estimated Cycles:                  159166|189923               (-16.1945%) [-1.19324x]

Awesome! The cleanup is great too. I'll take a deeper look soon.

Cc @sjrd since you tracked this implementation.

tgross35

The implementation looks great, thanks for working on this. Left some review but it only touches superficial things, and this needs a rebase since unfortunately I gave you a conflict from #535 (edit: and also #533 now, sorry about that).

tgross35 · 2025-04-17T22:44:52Z

src/math/generic/fmod.rs

+/// Given the bits of a positive float, clamp the exponent field to [0,1]
+fn collapse_exponent<F: Float>(bits: F::Int) -> F::Int {
+    let sig = bits & F::SIG_MASK;
+    if sig == bits { sig } else { sig | F::IMPLICIT_BIT }


Suggested change

if sig == bits { sig } else { sig | F::IMPLICIT_BIT }

// Set the implicit bit the input was not subnormal

if sig == bits { sig } else { sig | F::IMPLICIT_BIT }

tgross35 · 2025-04-17T22:49:51Z

src/math/generic/fmod.rs

+    if num.is_zero() {
+        F::from_bits(sx)
    } else {


Suggested change

if num.is_zero() {

F::from_bits(sx)

} else {

if num.is_zero() {

// Return zero with the sign of x

return F::from_bits(sx)

}

May a well early return to get rid of the else branch block.

tgross35 · 2025-04-17T22:59:19Z

src/math/generic/fmod.rs

-/* SPDX-License-Identifier: MIT */
-/* origin: musl src/math/fmod.c. Ported to generic Rust algorithm in 2025, TG. */


If this is your own work then could you add /* SPDX-License-Identifier: MIT OR Apache-2.0 */, or leave MIT if it's still pretty derivative? I'm trying to get a bit better track of origins since licensing is pretty mixed.

Also feel free to add something like /* Authored in 2025 by quaternic */. We need to carry around copyrights from Sun in the 90s, so IMO we can afford to give some credit to anyone who authors an original implementation :)

tgross35 · 2025-04-17T23:00:15Z

src/math/generic/fmod.rs

+        let ilog = num.ilog2();
+        let shift = (ey + ilog).min(F::SIG_BITS) - ilog;
+        let scale = (ey + ilog).saturating_sub(F::SIG_BITS);

-    ix |= sx;
-
-    F::from_bits(ix)
+        let normalized = num << shift;
+        let scaled = normalized + (F::Int::cast_from(scale) << F::SIG_BITS);
+        F::from_bits(sx | scaled)


Would you mind adding a comment about the algorithm here?

tgross35 · 2025-04-17T23:03:09Z

src/math/generic/fmod.rs

+    if sig == bits { sig } else { sig | F::IMPLICIT_BIT }
+}
+
+/// Computes (x << e) % y


Suggested change

/// Computes (x << e) % y

/// Computes `(x << e) % y` exactly without using bigint math

Or something to indicate why this isn't exactly (x << e) % y

tgross35 · 2025-04-17T23:04:19Z

src/lib.rs

 #![allow(clippy::many_single_char_names)]
+#![allow(clippy::just_underscores_and_digits)]


Sorting nit

Suggested change

#![allow(clippy::many_single_char_names)]

#![allow(clippy::just_underscores_and_digits)]

#![allow(clippy::just_underscores_and_digits)]

#![allow(clippy::many_single_char_names)]

tgross35 · 2025-04-21T06:22:38Z

This repo has been merged into https://github.com/rust-lang/compiler-builtins and is getting archived, so I am going to close this PR; please do reopen it there! I think this should apply cleanly with git format-patch HEAD -1 then patch < /path/to/0001-whatever.patch within compiler-builtins/libm.

Reimplement the generic fmod

47029ed

quaternic force-pushed the fmod-opt branch from ca2306a to 47029ed Compare April 16, 2025 00:52

tgross35 mentioned this pull request Apr 16, 2025

fmod: Correct the normalization of subnormals #535

Merged

tgross35 reviewed Apr 17, 2025

View reviewed changes

tgross35 closed this Apr 21, 2025

quaternic mentioned this pull request Apr 22, 2025

Reimplement the generic fmod rust-lang/compiler-builtins#880

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Reimplement the generic fmod #536

Reimplement the generic fmod #536

Uh oh!

quaternic commented Apr 16, 2025

Uh oh!

quaternic commented Apr 16, 2025

Uh oh!

tgross35 commented Apr 16, 2025 •

edited

Loading

Uh oh!

tgross35 left a comment •

edited

Loading

Uh oh!

tgross35 Apr 17, 2025

Uh oh!

tgross35 Apr 17, 2025

Uh oh!

tgross35 Apr 17, 2025

Uh oh!

tgross35 Apr 17, 2025

Uh oh!

tgross35 Apr 17, 2025

Uh oh!

tgross35 Apr 17, 2025

Uh oh!

tgross35 commented Apr 21, 2025

Uh oh!

Uh oh!

	if sig == bits { sig } else { sig \| F::IMPLICIT_BIT }
	// Set the implicit bit the input was not subnormal
	if sig == bits { sig } else { sig \| F::IMPLICIT_BIT }

		/* SPDX-License-Identifier: MIT */
		/* origin: musl src/math/fmod.c. Ported to generic Rust algorithm in 2025, TG. */

	/// Computes (x << e) % y
	/// Computes `(x << e) % y` exactly without using bigint math

		#![allow(clippy::many_single_char_names)]
		#![allow(clippy::just_underscores_and_digits)]

Reimplement the generic fmod #536

Reimplement the generic fmod #536

Uh oh!

Conversation

quaternic commented Apr 16, 2025

Uh oh!

quaternic commented Apr 16, 2025

Uh oh!

tgross35 commented Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tgross35 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

tgross35 Apr 17, 2025

Choose a reason for hiding this comment

Uh oh!

tgross35 Apr 17, 2025

Choose a reason for hiding this comment

Uh oh!

tgross35 Apr 17, 2025

Choose a reason for hiding this comment

Uh oh!

tgross35 Apr 17, 2025

Choose a reason for hiding this comment

Uh oh!

tgross35 Apr 17, 2025

Choose a reason for hiding this comment

Uh oh!

tgross35 Apr 17, 2025

Choose a reason for hiding this comment

Uh oh!

tgross35 commented Apr 21, 2025

Uh oh!

Uh oh!

tgross35 commented Apr 16, 2025 •

edited

Loading

tgross35 left a comment •

edited

Loading