Skip to content

fmod: Correct the normalization of subnormals #535

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 16, 2025

Conversation

tgross35
Copy link
Contributor

@tgross35 tgross35 commented Apr 15, 2025

Discussed at 1, there was an off-by-one mistake when converting from
the loop routine to using leading_zeros for normalization.

Currently, using EXP_BITS has the effect that ix after the branch
has its MSB one bit to the left of the implicit bit's position,
whereas a shift by EXP_BITS + 1 ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from 1
appear to still have correct results.

Since the result of using EXP_BITS + 1 is more consistent with what is
done for normals, apply this here.

@tgross35
Copy link
Contributor Author

tgross35 commented Apr 15, 2025

@sjrd I added the test cases mentioned at #469 (comment), thank you for providing those. It doesn't seem like I am able to reproduce the failure without the +1. Is it possible there is a subtle difference in the port? Spot checking, the results for this crate, musl, and mpfr all seem to line up.

$ cargo run -p util -- eval musl fmodf 2.1 0x1.8b1p-137
8.085e-42 0x0.002d14p-126
$ cargo run -p util -- eval mpfr fmodf 2.1 0x1.8b1p-137
8.085e-42 0x0.002d14p-126
$ cargo run -p util -- eval libm fmodf 2.1 0x1.8b1p-137
8.085e-42 0x0.002d14p-126

I also checked the test cases you added at scala-js/scala-js#5147, and there don't seem to be any problems.

@tgross35
Copy link
Contributor Author

Cc @quaternic who has done a lot of experimentation with fmod

@quaternic
Copy link
Contributor

Right, the translation to a generic implementation introduced an unintentional change that subnormals are "normalized" to a form that is (2 * N) * 2^ (E - 1) instead of the intended (N * 2^E), where N.ilog2() == SIG_BITS. The overall value is still the same, so probably the code that follows just happens to work correctly for all those cases. That likely hinges on various subtle details and it's not surprising that the Scala-port wouldn't have done everything exactly the same.

@sjrd
Copy link

sjrd commented Apr 15, 2025

One thing we did differently in the Scala port is that we mask off the sign bit early:
https://github.com/scala-js/scala-js/pull/5149/files#diff-cb80a0a9c84409f42505687656241e73f7aa71519db22f7cfc6be9f6b0751c75R83-R88
We do this because we don't have (native) unsigned comparisons. With the sign bit masked off, we can use signed comparisons with the same outcome.

Other than that, I followed the Rust implementation pretty closely. Perhaps the saturating_sub towards the end makes an actual difference? (I used wrapping_sub here as well, since that's all we have natively in Scala, again.)

@tgross35
Copy link
Contributor Author

tgross35 commented Apr 16, 2025

Interestingly a switch to wrapping_sub rather than saturating_sub does make some things blow up and adding a + 1 does fix them, but that doesn't seem to have an effect on the test cases in question.

failures:

---- musl_quickspace_fmod stdout ----
QuickSpaced Musl fmod arg 1/2: 29 iterations (841 total)
QuickSpaced Musl fmod arg 2/2: 29 iterations (841 total)

thread 'musl_quickspace_fmod' panicked at ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/bit.rs:950:1:
attempt to shift left with overflow
panic with the following input: (-1.7993604494355926e264, -3e-323)

---- musl_quickspace_fmodf stdout ----
QuickSpaced Musl fmodf arg 1/2: 29 iterations (841 total)
QuickSpaced Musl fmodf arg 2/2: 29 iterations (841 total)

thread 'musl_quickspace_fmodf' panicked at ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/bit.rs:950:1:
attempt to shift left with overflow
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
panic with the following input: (-1.1589949e33, -1.7e-44)

---- musl_edge_case_fmod stdout ----
EdgeCases Musl fmod arg 1/2: 130 edge cases
EdgeCases Musl fmod arg 2/2: 130 edge cases

thread 'musl_edge_case_fmod' panicked at ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/bit.rs:950:1:
attempt to shift left with overflow
panic with the following input: (2.5e-323, 1.5e-323)

---- musl_edge_case_fmodf stdout ----
EdgeCases Musl fmodf arg 1/2: 130 edge cases
EdgeCases Musl fmodf arg 2/2: 130 edge cases

thread 'musl_edge_case_fmodf' panicked at ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/bit.rs:950:1:
attempt to shift left with overflow
panic with the following input: (7e-45, 4e-45)

@tgross35 tgross35 force-pushed the fmod-fixes branch 2 times, most recently from 332729b to cd46e6b Compare April 16, 2025 19:16
From discussion at [1] our loop count calculation is incorrect, causing
an issue with subnormal numbers. Add test cases for known failures.

[1]: rust-lang#469 (comment)
Discussed at [1], there was an off-by-one mistake when converting from
the loop routine to using `leading_zeros` for normalization.

Currently, using `EXP_BITS` has the effect that `ix` after the branch
has its MSB _one bit to the left_ of the implicit bit's position,
whereas a shift by `EXP_BITS + 1` ensures that the MSB is exactly at the
implicit bit's position, matching what is done for normals (where the
implicit bit is set to be explicit). This doesn't seem to have any
effect in our implementation since the failing test cases from [1]
appear to still have correct results.

Since the result of using `EXP_BITS + 1` is more consistent with what is
done for normals, apply this here.

[1]: rust-lang#469 (comment)
@tgross35 tgross35 changed the title fmod: add regression tests for subnormal issue fmod: Correct the normalization of subnormals Apr 16, 2025
@tgross35 tgross35 marked this pull request as ready for review April 16, 2025 19:17
@tgross35
Copy link
Contributor Author

It's possible there is a difference if signed arithmetic is used rather than unsigned at some point since the net effect is to shift one bit further left than expected, which would be a sign flip any of the subtract+shift ops wind up placing that in the MSB. I don't think it's worth digging into too much though; I'll apply the change here since it's more correct, but this will go away soon with @quaternic's rewrite anyway #536.

@tgross35 tgross35 enabled auto-merge (rebase) April 16, 2025 19:24
@tgross35 tgross35 merged commit 56bb84c into rust-lang:master Apr 16, 2025
35 checks passed
@tgross35 tgross35 deleted the fmod-fixes branch April 17, 2025 23:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants