Switch from fast_double_parser to fast_float #108

sirosen · 2025-05-19T05:47:36Z

First off, thanks for maintaining this library!
I'm trying to pitch in a bit to help and make sure things stay healthy.
It's my first time working with Cython, so I'll be a little slow to make changes.

This PR swaps the fast_double_parser for fast_float, per the deprecation warning cited in #83.
It all works cleanly except for a small number of tests regarding scientific notation.

I could use some input on how to push this over the finish line.
None of the fast_float behaviors matches what the JSON5 tests expect, but the one labeled `json_or_infnan` seems like the best fit for the spec in spirit.

Right now, I'm seeing `pyjson5.loads("1e2.3")` return 100, which is definitely wrong.
I haven't yet figured out where the issue is, but I would guess (from the above) that the input is somehow being truncated to `1e2`.
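The suspected truncation can be illustrated without fast_float. The sketch below uses `std::strtod` from the C++ standard library as a stand-in, because it also parses the longest valid numeric prefix and reports where it stopped. The names `PrefixParse` and `parse_prefix` are hypothetical and do not appear in the PR, and this says nothing about how pyjson5 calls fast_float internally.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper type for illustration only.
struct PrefixParse {
    double value;          // value of the numeric prefix that was parsed
    std::size_t consumed;  // number of characters the parser accepted
    bool whole_input;      // true if nothing was left over
};

// Parse the longest numeric prefix of `text` with std::strtod
// (a stand-in for fast_float, NOT the code used in the PR).
PrefixParse parse_prefix(const char* text) {
    char* end = nullptr;
    const double value = std::strtod(text, &end);
    const std::size_t consumed = static_cast<std::size_t>(end - text);
    return {value, consumed, consumed == std::strlen(text)};
}
```

With the input `"1e2.3"`, only the prefix `1e2` is consumed, so a caller that ignores the leftover `.3` sees a "successful" parse of 100.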


Kijewski · 2025-05-19T06:29:03Z

Thank you very much for working on that! I'll try to have a look in the next few days.

(I wonder why the CI did not run for your PR, I'll have to investigate that.)

`fast_float` accepts exponents with a fractional part (for example `1e2.3`), which the JSON5 spec does not allow. Constraining `fast_float` to a format that rejects these exponents also rejects other inputs that should be supported. To handle this, add an explicit guard: `has_invalid_exponent` is a string scan which returns true if it finds a suffix on the string which looks like an invalid exponent.

sirosen · 2025-05-19T22:50:45Z

I was able to confirm that the string "1e2.3", when passed to fast_float, parses "successfully" as 100.0.
I couldn't find a way to get it working using clever fast_float usage, but once I satisfied myself that it wasn't possible, I added a manual little function to guard against weird invalid exponents (any non-digit char after the sign).
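A guard of this kind might look roughly like the sketch below. It is not the function from the PR: the name `has_invalid_exponent_sketch` is hypothetical, the input is assumed to be a decimal literal (no hex digits, `Infinity`, or `NaN`), and treating an empty exponent such as `1e+` as invalid is an assumption made here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sketch, not the PR's implementation.
// Assumes `text` is a decimal literal, so any 'e'/'E' marks an exponent.
bool has_invalid_exponent_sketch(const std::string& text) {
    // Locate the exponent marker, if there is one.
    const std::size_t marker = text.find_first_of("eE");
    if (marker == std::string::npos) {
        return false;  // no exponent at all, nothing to reject
    }

    std::size_t i = marker + 1;

    // Skip one optional sign after the marker.
    if (i < text.size() && (text[i] == '+' || text[i] == '-')) {
        ++i;
    }

    // Assumption: an exponent with no digits is treated as invalid.
    if (i == text.size()) {
        return true;
    }

    // Every remaining character must be a decimal digit.
    for (; i < text.size(); ++i) {
        if (!std::isdigit(static_cast<unsigned char>(text[i]))) {
            return true;
        }
    }
    return false;
}
```

Running the check before handing the string to the float parser would reject `1e2.3` while still accepting `1e2` and `1.5E-10`.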

I think it's now in good shape -- all tests pass -- and I hopefully have some more contributions to follow!

sirosen added 2 commits May 18, 2025 23:52

Convert from fast_double_parser to fast_float

a71f687

The only oddity in here is handling around `std::errc` because `0` is not part of the named enum values. As a workaround, treat the enum type as an int, and explicitly cast it when comparing it against `0`.
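On the C++ side, the comparison described in this commit can be sketched as follows. The commit itself concerns the Cython declaration, so this only shows the underlying idea; `parse_succeeded` is a hypothetical name, and the sketch relies on the fact that a value-initialized `std::errc` (integer value `0`) signals success even though no enumerator has that value.

```cpp
#include <system_error>

// Hypothetical helper: success is the value 0, which has no named
// enumerator in std::errc, so compare through an explicit integer cast.
bool parse_succeeded(std::errc ec) {
    return static_cast<int>(ec) == 0;
}
```

For example, `parse_succeeded(std::errc())` is true, while `parse_succeeded(std::errc::invalid_argument)` is false.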

Update to set float parse format to json_or_infnan

fe9236d

This more accurately matches the supported types for the JSON5 spec.

sirosen mentioned this pull request May 19, 2025

Add cythonize annotations to improve DX #109

Open
