FEAT: Implementing `same_value` casting rule in quaddtype #246


Merged

SwayamInSync merged 47 commits into numpy:main from SwayamInSync:same-value

Jan 14, 2026

Member

SwayamInSync commented Dec 26, 2025

closes numpy/numpy-quaddtype#26

As per the title

SwayamInSync added 30 commits

December 8, 2025 14:41


          updating TARGET_VERSION and numpy_to_quad resove desc

fe04b3b


          Merge branch 'main' into same-value

cdfe02c


          Merge branch 'main' into same-value

9e3afc3


          Merge branch 'main' into same-value

605aa59


          quad2quad

9b4d83d


          fix heisenbugs

95a253a


          refactor aligned/unaligned into templates

277ee7b


          resolve desc + quad2numpy loop fix

5cb4e4a


          adding same_value int tests


          handling nan in same_value

534a64e


          fix tests

f696848


          again union hesinbug?

22327f8


          just match with valueerror

6fa020d


          use memcpy

0babf9d


          use memcmp

a5cf124


          switch back to no union

90b824d


          addded float tests

ff69b8e


          use double's tiny in ld

30f6b95


          adding quad->str same_vale

5c5d791


          improve error msg

dde4a84


          make all from_quad uses const pointer to union

8862c23


          fixed string same_value

e374a36


          use quad2sleefquad

c458d53


          remove non-native order tets for respective systems

9d10144


          powerpc has ld as quad

c17c3d0


          memory barrier

ae9986b


          will cont tomorrow from here

308f136


          quad2quad same_value

00acaca


          nolong need pyucs path

e68e6be


          remove unused apis

3f0cf90

juntyr approved these changes

Contributor

juntyr left a comment

LGTM with minor nits addressed


          improve signbit assert

adf0bc4

SwayamInSync mentioned this pull request

Add basic documentation site #227

Merged

ngoldbaum reviewed

Member

ngoldbaum left a comment

@mattip any chance you have time and interest to look this over?

Member Author

SwayamInSync commented Dec 29, 2025

Details for easing the review process

Only casts.cpp and test_quaddtype need a close review. I added comments wherever I made changes to explain the reasoning for each step. The remaining files were modified for the reasons listed below.

Small refactoring

The quad_value union with __float128 causes a lot of ABI and compiler-optimization issues in C++, so I refactored the methods to take pointers to the union instead of copying it.

Inter-backend operation was broken in previous builds. I'm not sure why anyone would want to go in that direction, but for the sake of completeness I fixed it here as well, in order to implement same_value casting between QuadPrecision values with different backends.

Replaced the separate aligned/unaligned loops with templated versions.

This is still valid; everything else is just tweaks and refactoring to make things work in a stable manner.
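
The move from separate aligned and unaligned loops to a single templated loop could look roughly like the sketch below. It is only an illustration under stated assumptions: `value_t`, `load`, `store`, and `negate_loop` are hypothetical names that do not come from the PR, and `double` stands in for the `quad_value` union.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the PR's value type (assumption: the real code
// operates on a quad_value union, which is not reproduced here).
using value_t = double;

// Load one element. The aligned instantiation dereferences directly; the
// unaligned one goes through memcpy, which is valid for any address.
template <bool Aligned>
inline value_t load(const char *ptr)
{
    if (Aligned) {
        return *reinterpret_cast<const value_t *>(ptr);
    }
    value_t v;
    std::memcpy(&v, ptr, sizeof(v));
    return v;
}

// Store one element, mirroring load().
template <bool Aligned>
inline void store(char *ptr, const value_t &v)
{
    if (Aligned) {
        *reinterpret_cast<value_t *>(ptr) = v;
    }
    else {
        std::memcpy(ptr, &v, sizeof(v));
    }
}

// One strided loop body shared by both instantiations, so the aligned and
// unaligned variants cannot drift apart.
template <bool Aligned>
void negate_loop(const char *in, char *out, std::size_t n,
                 std::ptrdiff_t in_stride, std::ptrdiff_t out_stride)
{
    assert(n == 0 || (in != nullptr && out != nullptr));
    for (std::size_t i = 0; i < n; ++i) {
        store<Aligned>(out, -load<Aligned>(in));
        in += in_stride;
        out += out_stride;
    }
}
```

A caller would pick `negate_loop<true>` or `negate_loop<false>` once, based on what it knows about the buffers, instead of maintaining two hand-written copies of the loop.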

Member

mattip commented Dec 29, 2025

I can take a look after Jan 11.

Member

ngoldbaum commented Dec 29, 2025

@SwayamInSync do you mind waiting a couple weeks on this? I'd like to have Matti look because he added same_value casting and also has a try at this in #161.

In the meantime: one thing that would help is to make the PR more atomic. There's a fair amount of renaming and other stuff like that you could do in another PR then rebase this once the other PR is merged to reduce the diff size.

In general I'd much rather review several smaller more focused PRs rather than one big PR.

Contributor

juntyr commented Dec 30, 2025

I was hoping for a faster timeline to the release (since numpy-quaddtype is used in my research) but life is life and I'll adapt my own timelines if need be

Member Author

SwayamInSync commented Dec 30, 2025

> @SwayamInSync do you mind waiting a couple weeks on this? I'd like to have Matti look because he added same_value casting and also has a try at this in #161.
>
> In the meantime: one thing that would help is to make the PR more atomic. There's a fair amount of renaming and other stuff like that you could do in another PR then rebase this once the other PR is merged to reduce the diff size.
>
> In general I'd much rather review several smaller more focused PRs rather than one big PR.

Right, I also didn't want to make it this big, but the design choice of using a union with SIMD types in C++ caused a lot of ABI breaks (__float128 isn't standard, so on x86-64 machines careless copies produce heisenbugs).
I can do some independent refactor PRs, and since we are allowing a time window, I'll also get some work done on the docs 😄

SwayamInSync mentioned this pull request

Refactor: General refactors + quad2quad inter-backend fixes #247

Merged


          fixing conflicts

22ec619

Member Author

SwayamInSync commented Jan 6, 2026

Cool, so the diff is smaller now; it is again mostly casts.cpp and the tests.

SwayamInSync added 2 commits

January 10, 2026 14:11


          Merge branch 'main' into same-value

2b3016c


          fixing conflicts

cc70134

Member Author

SwayamInSync commented Jan 12, 2026

Hi @mattip, gentle ping to get this on your radar.

Contributor

juntyr commented Jan 13, 2026

I'm also hoping that we can merge this PR and publish v1.0 soon

SwayamInSync mentioned this pull request

Restructuring quaddtype to use src layout #253

Merged

ngoldbaum reviewed

Member

ngoldbaum left a comment

I did a pass over the C code and spotted some issues. I'd appreciate it if you could look elsewhere for similar patterns to the main issue I spotted: setting an error in an error path when an error is already set

quaddtype/numpy_quaddtype/src/casts.cpp

    
    }
    // Helper function for quad-to-quad same_value check (inter-backend)
    // NOTE: the inter-backend uses `double` as intermediate,

Member

ngoldbaum Jan 6, 2026

Does SLEEF not have any utilities for working with 80 bit floats? It’d be nice to not have this restriction. But also not critical.

Member Author

SwayamInSync Jan 14, 2026

SLEEF does not have any explicit cross-platform utilities for 80-bit floats.
That's why the only values that cast exactly from "longdouble" -> "quad" are those that fit in a "double" (because of this intermediate).

If someone still wants an exact cast, then ld -> string -> quad is the route.
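
The restriction described here can be illustrated with a small round-trip test. This is a minimal sketch, not the PR's code: `fits_in_double` is a hypothetical helper, and treating NaN as matching NaN is an assumption made for the example.

```cpp
#include <cmath>

// Hypothetical helper (not from the PR): returns true when `x` survives the
// long double -> double -> long double round trip unchanged, i.e. when a
// `double` intermediate loses no information.
inline bool fits_in_double(long double x)
{
    if (std::isnan(x)) {
        return true;  // assumption: any NaN is considered "the same value"
    }
    const double narrowed = static_cast<double>(x);
    return static_cast<long double>(narrowed) == x;
}
```

On platforms where `long double` carries more precision than `double` (for example the 80-bit x87 format), a value such as `1.0L + 2^-60` would fail this check, which is the kind of input a same_value cast through a `double` intermediate has to reject.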

quaddtype/numpy_quaddtype/src/casts.cpp Outdated

    
    // Perform same_value check if requested
    if (same_value_casting) {
        if (quad_to_string_same_value_check(&in_val, str_buf, str_size, backend) < 0) {

Member

ngoldbaum Jan 13, 2026

Is it possible for the same value not to be preserved here? I don't think it is - StringDType shouldn't truncate data. That means this check isn't necessary.

Member Author

SwayamInSync Jan 14, 2026

Oh yes, right, my bad.
We shouldn't need a same_value check here.

quaddtype/numpy_quaddtype/src/casts.cpp

    
    }
    static inline int
    quad_to_string_same_value_check(const quad_value *in_val, const char *str_buf, npy_intp str_len,

Member

ngoldbaum Jan 13, 2026

Maybe you could avoid allocating a temporary buffer to store a string if you knew how many digits are necessary to represent the quad precisely. Is that exposed by SLEEF?

I ask because this whole process - allocating and freeing a temporary and parsing the truncated string, seems very slow and probably adds a lot of overhead to the cast operation for the fixed-width strings.

Member Author

SwayamInSync Jan 14, 2026

There's a nuance here, so let me explain: 36 for exact decimal digits and 42 for scientific notation.
These are not exposed by SLEEF; we keep the max string length at 50 (42 + some buffer).

Now, when casting a QuadPrecision value to a fixed-width string (like U10), dragon4 generates the full representation, but it gets truncated to only 10 characters. Then, in the same_value check, we need to re-create the null-terminated buffer.

Oh, good point: that buffer can live on the local stack, so we don't need the heap allocation here.
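
The truncate-then-reparse idea, with the temporary kept on the stack, might be sketched as follows. This is an illustration only: `double`, `snprintf`, and `strtod` stand in for the quad type and its dragon4/SLEEF-based formatting and parsing, and `survives_fixed_width` and `kMaxStrLen` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Bound on the formatted length used in this sketch (the thread mentions a
// 50-character maximum for quad strings).
constexpr std::size_t kMaxStrLen = 50;

// Hypothetical check with double standing in for quad: format `value`, keep
// at most `width` characters as a fixed-width string field would, parse the
// kept prefix back, and compare it with the input.
// NaN is reported as not surviving here (NaN != NaN); real code would need
// to treat NaN explicitly.
inline bool survives_fixed_width(double value, std::size_t width)
{
    char full[kMaxStrLen];
    // %.17g emits enough digits to round-trip any finite double.
    const int len = std::snprintf(full, sizeof(full), "%.17g", value);
    if (len < 0 || static_cast<std::size_t>(len) >= sizeof(full)) {
        return false;
    }

    // Rebuild a null-terminated copy of the truncated field on the stack;
    // no heap allocation is needed because the length is bounded.
    char kept[kMaxStrLen];
    const std::size_t full_len = static_cast<std::size_t>(len);
    const std::size_t n = full_len < width ? full_len : width;
    std::memcpy(kept, full, n);
    kept[n] = '\0';

    char *end = nullptr;
    const double parsed = std::strtod(kept, &end);
    // The whole kept prefix must parse, and it must parse to the same value.
    return n > 0 && end == kept + n && parsed == value;
}
```

With this sketch, `1.5` fits a 10-character field, while `1.0 / 3.0` truncated to 5 characters parses back as `0.333` and is rejected.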

Member Author

SwayamInSync Jan 14, 2026

1 character for sign
1 character for leading digit
1 character for decimal point
36 significant digits
1 character for e
1 character for exponent sign
4 characters for exponent (max is 4932)
1 null terminator

So it should be a total of 46 characters (50 is still safe).
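
The arithmetic above can be written down as compile-time constants so that the buffer size is checked by the compiler. The constant names below are illustrative and are not taken from the PR.

```cpp
#include <cstddef>

// Character budget for a quad value in scientific notation, following the
// breakdown listed above. All names are hypothetical.
constexpr std::size_t kSign = 1;
constexpr std::size_t kLeadingDigit = 1;
constexpr std::size_t kDecimalPoint = 1;
constexpr std::size_t kSignificantDigits = 36;
constexpr std::size_t kExponentMarker = 1;  // 'e'
constexpr std::size_t kExponentSign = 1;
constexpr std::size_t kExponentDigits = 4;  // exponents up to 4932
constexpr std::size_t kNullTerminator = 1;

constexpr std::size_t kRequired =
        kSign + kLeadingDigit + kDecimalPoint + kSignificantDigits +
        kExponentMarker + kExponentSign + kExponentDigits + kNullTerminator;

// Buffer size mentioned in the thread.
constexpr std::size_t kBufferSize = 50;

static_assert(kRequired == 46, "budget should add up to 46 characters");
static_assert(kRequired <= kBufferSize, "50-byte buffer must be large enough");
```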

quaddtype/numpy_quaddtype/src/casts.cpp Outdated

    
                     val_str);
    }
    else {
        PyErr_SetString(PyExc_ValueError,

Member

ngoldbaum Jan 13, 2026

This is wrong: you shouldn't set a new error here if quad_to_string_adaptive_cstr already set an error. I'd just bubble up the error instead of trying to give more context here.
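
The "bubble up" pattern can be modelled without the Python headers. The sketch below is not the CPython API: `g_error`, `set_error`, `format_value`, and `cast_value` are hypothetical stand-ins that only mimic the convention of a single pending error that the caller must not overwrite.

```cpp
#include <string>

// Minimal stand-in for an interpreter's "current error" slot. This is NOT
// the CPython API; an empty string means no error is pending.
std::string g_error;

void set_error(const std::string &msg) { g_error = msg; }
bool error_occurred() { return !g_error.empty(); }
void clear_error() { g_error.clear(); }

// Hypothetical helper that fails and records the reason itself.
int format_value(int value)
{
    if (value < 0) {
        set_error("format_value: negative input");
        return -1;
    }
    return 0;
}

// Caller: on failure, return immediately and leave the helper's error in
// place instead of replacing it with a second, less specific message.
int cast_value(int value)
{
    if (format_value(value) < 0) {
        return -1;  // bubble up; no set_error() call here
    }
    return 0;
}
```

After `cast_value(-1)` the pending message is still the one recorded by `format_value`, so the most specific diagnosis reaches the user.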

Member Author

SwayamInSync Jan 14, 2026

Fixed

quaddtype/numpy_quaddtype/src/casts.cpp Outdated

    
                     val_str);
    }
    else {
        PyErr_SetString(PyExc_ValueError,

Member

ngoldbaum Jan 13, 2026

same issue here - an error is already set, setting a new one without resetting the error indicator is wrong.

Member Author

SwayamInSync Jan 14, 2026

Fixed

quaddtype/numpy_quaddtype/src/casts.cpp Outdated

    
                     val_str);
    }
    else {
        PyErr_SetString(PyExc_ValueError,

Member

ngoldbaum Jan 13, 2026

same error

Member Author

SwayamInSync Jan 14, 2026

Fixed

mattip mentioned this pull request

implement same_value casting for numpy <-> quadtype #161

Closed

mattip approved these changes

Member

mattip left a comment

LGTM, pending the already-mentioned corrections. Tests should be catching edge cases.

SwayamInSync added 4 commits

January 14, 2026 09:07


          dont override error + no same_value for varlen strdtype

f6adeda


          no heap alloc

c6ec9bc


          fixed overriding error cases

b22a7b0


          update comment

a388f3e

Member Author

SwayamInSync commented Jan 14, 2026

@ngoldbaum these new changes should address all the review comments. I found 2 more cases of error overriding and patched them here (inside scalar.c and umath.cpp).

ngoldbaum approved these changes

Member

ngoldbaum left a comment

Thanks for your patience here!

Member Author

SwayamInSync commented Jan 14, 2026

Great! Merging this in.
Thanks everyone 🙌

SwayamInSync merged commit c182ddb into numpy:main

11 checks passed

Contributor

juntyr commented Jan 15, 2026

Thanks! How long do you think it will now take to release v1.0?

SwayamInSync mentioned this pull request

QuadDType Development History: PRs from numpy-user-dtypes numpy/numpy-quaddtype#43

Open

Labels

numpy_quaddtype