gh-127937: convert decimal module to use import API for ints (PEP 757) #127925

skirpichev · 2024-12-13T16:17:34Z

Benchmark	ref	patch
int(Decimal(1<<7))	648 ns	474 ns: 1.37x faster
int(Decimal(1<<38))	740 ns	501 ns: 1.48x faster
int(Decimal(1<<300))	2.06 us	2.02 us: 1.02x faster
int(Decimal(1<<3000))	115 us	115 us: 1.00x faster
Geometric mean	(ref)	1.20x faster
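
The summary row can be reproduced from the rounded timings in the table. A minimal sketch, assuming the summary is the geometric mean of the per-benchmark ref/patch ratios; the timings are copied from the rows above and the variable names are only illustrative:

```python
from statistics import geometric_mean

# (ref, patch) timings in nanoseconds, copied from the table above.
timings_ns = {
    "int(Decimal(1<<7))": (648, 474),
    "int(Decimal(1<<38))": (740, 501),
    "int(Decimal(1<<300))": (2060, 2020),
    "int(Decimal(1<<3000))": (115_000, 115_000),
}

# Speedup of the patched build relative to the reference build.
speedups = {name: ref / patch for name, (ref, patch) in timings_ns.items()}
overall = geometric_mean(list(speedups.values()))

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
print(f"geometric mean: {overall:.2f}x")
```

With these rounded inputs the result is about 1.20x, in line with the table.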

>>> sys.int_info[:2]
(30, 4)

# bench_Decimal-to-int.py

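import pyperf
from decimal import Decimal

# Expressions are kept as strings so they can double as benchmark names.
values = ['1<<7', '1<<38', '1<<300', '1<<3000']

runner = pyperf.Runner()
for v in values:
    # Build the Decimal once, outside the timed call; only int(d) is measured.
    d = Decimal(eval(v))
    bn = f'int(Decimal({v}))'
    runner.bench_func(bn, int, d)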
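
The four benchmark values cover different sizes of CPython's internal int representation. With `sys.int_info[:2] == (30, 4)`, an int is stored as an array of 30-bit digits held in 4-byte slots, and that is the kind of array an import API such as PyLongWriter expects to be filled in. A rough pure-Python model, assuming least-significant-digit-first order; the helper names `to_digits` and `from_digits` are made up for illustration and are not part of any API:

```python
BITS_PER_DIGIT = 30  # sys.int_info.bits_per_digit as reported above


def to_digits(value: int, bits: int = BITS_PER_DIGIT) -> list[int]:
    """Split a non-negative int into base-2**bits digits, least significant first."""
    if value < 0:
        raise ValueError("the sign is handled separately from the digits")
    mask = (1 << bits) - 1
    digits = []
    while True:
        digits.append(value & mask)
        value >>= bits
        if value == 0:
            return digits


def from_digits(digits: list[int], bits: int = BITS_PER_DIGIT) -> int:
    """Inverse of to_digits()."""
    result = 0
    for digit in reversed(digits):
        result = (result << bits) | digit
    return result


for expr in ("1<<7", "1<<38", "1<<300", "1<<3000"):
    value = eval(expr)
    digits = to_digits(value)
    assert from_digits(digits) == value  # round trip
    print(f"{expr}: {len(digits)} digit(s)")
```

Under this model the four cases need 1, 2, 11 and 101 digits respectively.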

Issue: Remove private _PyLong_FromDigits() function #127937

picnixz · 2024-12-13T17:31:54Z

hide _PyLong_FromDigits()? it's not used outside of the longobject.c anymore

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

news

Not needed I think, unless you want to indicate the performance gain (it's always nice to know that something is faster). I did report the improvements of fnmatch.translate, so I think you can report those improvements as well.

Modules/_decimal/_decimal.c

Co-authored-by: Bénédikt Tran <[email protected]>

skirpichev · 2024-12-14T00:47:15Z

Modules/_decimal/_decimal.c

+    n = (mpd_sizeinbase(x, 2) + bpd - 1) / bpd;
+    PyLongWriter *writer = PyLongWriter_Create(mpd_isnegative(x), n,
+                                               (void**)&ob_digit);
+    /* mpd_sizeinbase can overestimate size by 1 digit, set it to zero. */


BTW, this looks like a bug in mpdecimal. Cf. GNU GMP, where the mpz_sizeinbase docs say: "If base is a power of 2, the result is always exact".

skirpichev · 2024-12-14T01:05:31Z

Let's not hide this. Maybe someone is using it (it was removed then restored IIRC).

I've updated the PR description with my research. So far, I've found just one use case.

At the least, I think we should deprecate this (not a soft deprecation). It apparently doesn't affect many projects, and there is now a public alternative. @picnixz, what do you think?

picnixz · 2024-12-14T01:30:40Z

At least, I think we should deprecate (not soft) this

I would be fine with deprecating it, saying which alternative to use, so that we can simply remove it in some later versions. I think Victor was the one who removed and restored it so we should ask him as well.

picnixz · 2024-12-14T01:31:31Z

should dec_from_long() be modified here? (To use the PyLong_Export API.) I would prefer to do this in a separate PR.

If you prefer doing it in a follow-up PR because you fear it would be too hard to review, then that's better. If the change is minimal, we can do it in this one (I didn't check what code would need to change).

skirpichev · 2024-12-14T02:03:21Z

If the change is minimal, we can do it this one

You can estimate them by looking at the gmpy2 PR (referenced in the PEP): aleaxit/gmpy#495. In principle, I don't think this will complicate the review too much. On the other hand, the changes look logically independent. I would rather include the deprecation here.

picnixz · 2024-12-14T02:11:14Z

Let's change dec_from_long in another PR since the changes are independent (sorry it's 3 AM here and I don't have much energy).
For deprecating _PyLong_FromDigits, maybe it's better to make a separate PR so that we have a dedicated NEWS entry and re-use the issue that actually removed the private API (and not the issue that reverted the removal). WDYT? (we would also be able to change PyLong_Copy accordingly)

Modules/_decimal/_decimal.c

Misc/NEWS.d/next/C_API/2024-12-14-03-40-15.gh-issue-127925.FF7aov.rst

* cleanup: forgotten PyLongWriter_Discard, pylong variable
* clarify news

vstinner

LGTM.

mpd_qexport_*() functions used here with assumption, that no resizing
occur, i.e. len was obtained by a call to mpd_sizeinbase.

IMO it's a reasonable trade-off and an acceptable risk.

Modules/_decimal/_decimal.c

serhiy-storchaka · 2025-01-07T14:42:18Z

It is not guaranteed, and there is no way to enforce that resize does not occur in mpd_qexport_*() functions.

How do we estimate the risk? If Python has undefined behavior in one in a billion cases, is that an acceptable risk?

Co-authored-by: Victor Stinner <[email protected]>

skirpichev · 2025-01-08T02:48:00Z

It is not guaranteed, and there is no way to enforce that resize does not occur in mpd_qexport_*() functions.

@serhiy-storchaka, we have confirmation from the library author that this expectation is correct, unless libm is broken. I guess this is not the only place where we depend on the quality of system libraries.

Or do you believe that mpd_sizeinbase() can underestimate the size with a correct log10? If so, it's a bug. Let's just fix it. Here is the function (IIRC it's the same in the latest upstream version):

cpython/Modules/_decimal/libmpdec/mpdecimal.c

Lines 8084 to 8113 in 65ae3d5

    
size_t
mpd_sizeinbase(const mpd_t *a, uint32_t base)
{
    double x;
    size_t digits;
    double upper_bound;

    assert(mpd_isinteger(a));
    assert(base >= 2);

    if (mpd_iszero(a)) {
        return 1;
    }

    digits = a->digits+a->exp;

#ifdef CONFIG_64
    /* ceil(2711437152599294 / log10(2)) + 4 == 2**53 */
    if (digits > 2711437152599294ULL) {
        return SIZE_MAX;
    }

    upper_bound = (double)((1ULL<<53)-1);
#else
    upper_bound = (double)(SIZE_MAX-1);
#endif

    x = (double)digits / log10(base);
    return (x > upper_bound) ? SIZE_MAX : (size_t)x + 1;
}
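
To see how this estimate behaves, here is a rough Python transcription for plain non-negative ints. It is a sketch rather than the libmpdec code: it takes the decimal digit count from `len(str(value))`, skips the CONFIG_64 overflow guards, and the names `sizeinbase_estimate` and `limbs_needed` are invented for illustration. It compares the base-2 estimate with the exact `int.bit_length()`, then converts both to 30-bit digit counts with the same ceiling division the patch uses for n.

```python
import math

BITS_PER_DIGIT = 30  # bpd in the patch, per sys.int_info


def sizeinbase_estimate(value: int, base: int) -> int:
    """Estimated number of base-`base` digits, mirroring mpd_sizeinbase()."""
    if value == 0:
        return 1
    decimal_digits = len(str(value))
    return int(decimal_digits / math.log10(base)) + 1


def limbs_needed(bits: int, bpd: int = BITS_PER_DIGIT) -> int:
    """Ceiling division, as in n = (size + bpd - 1) / bpd."""
    return (bits + bpd - 1) // bpd


for value in (1 << 7, 1 << 38, 10**9, 1 << 300):
    estimated_bits = sizeinbase_estimate(value, 2)
    exact_bits = value.bit_length()
    print(
        f"{len(str(value))} decimal digits: "
        f"estimated {estimated_bits} bits -> {limbs_needed(estimated_bits)} digit(s), "
        f"exact {exact_bits} bits -> {limbs_needed(exact_bits)} digit(s)"
    )
```

For 10**9 the estimate is 34 bits (two 30-bit digits) while the exact size is 30 bits (one digit), which is the kind of one-digit overestimate the comment in the patch refers to.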

Edit:
In fact, we need a much simpler case, as the base is a power of 2. So, we want ndigits * log2(10)/shift. This should be a correct bound:

(size_t)(3.321928094887363*((ndigits + shift - 1)/shift))

For shift=30 and ndigits ~ 1<<53 (the upper_bound for the typical case), it will overestimate the size by just 1 digit.
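
The quantity being bounded, ndigits * log2(10) / shift, can be checked against exact values in Python. A minimal sketch, assuming the division is done in floating point and the result is rounded up; this is one reading of the target, not the literal C expression above, and `limb_bound` and `exact_limbs` are illustrative names:

```python
import math

SHIFT = 30  # bits per digit


def limb_bound(ndigits: int, shift: int = SHIFT) -> int:
    """ceil(ndigits * log2(10) / shift), evaluated in floating point."""
    return math.ceil(ndigits * math.log2(10) / shift)


def exact_limbs(ndigits: int, shift: int = SHIFT) -> int:
    """Digits of `shift` bits needed for the largest ndigits-digit decimal."""
    largest = 10**ndigits - 1
    return (largest.bit_length() + shift - 1) // shift


for ndigits in (1, 9, 10, 30, 300):
    print(ndigits, limb_bound(ndigits), exact_limbs(ndigits))
```

Rounding up matters in this sketch: truncating instead, as a C cast to size_t does, would give 3 for ndigits = 30, one less than the 4 digits needed for 10**30 - 1.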

vstinner · 2025-01-08T10:06:18Z

@picnixz: Would you mind reviewing the latest PR version? It changed a lot since last month.

picnixz

A few final comments on English wording and some variables. Otherwise, LGTM. Sorry Victor, the ping slipped under my radar.

Modules/_decimal/_decimal.c

Co-authored-by: Bénédikt Tran <[email protected]>

vstinner · 2025-01-13T21:19:37Z

@serhiy-storchaka: Are you ok with this change? Or are you worried about the mpd_sizeinbase() issue?

skirpichev · 2025-01-24T02:24:01Z

I would prefer not to do this, but to make some progress we could introduce a temporary buffer. @vstinner?

skirpichev · 2025-01-24T03:58:56Z

Ok, with a buffer (latest change) I got something like this:

Benchmark	ref	patch
int(Decimal(1<<7))	637 ns	480 ns: 1.33x faster
int(Decimal(1<<38))	735 ns	514 ns: 1.43x faster
int(Decimal(1<<300))	2.06 us	2.12 us: 1.03x slower
Geometric mean	(ref)	1.16x faster

Benchmark hidden because not significant (1): int(Decimal(1<<3000))

I'm not sure whether the third case is a real speed regression or just noise. But I think it is acceptable, as this clears Serhiy's concerns, which blocked the PR before. Hence, I've pushed this.

@vstinner
@picnixz

serhiy-storchaka

LGTM. Later we can optimize it more. For very large values the total time is dominated by non-linear conversion time, so additional allocation and copying does not matter.

Modules/_decimal/_decimal.c

picnixz · 2025-01-24T10:47:47Z

LGTM (I've reviewed the two new commits and it's fine).

vstinner

LGTM

vstinner · 2025-01-24T11:14:06Z

Merged, thanks @skirpichev for the tedious change!

skirpichev · 2025-01-24T11:20:46Z

Thanks for the reviews. I'll open an issue to track a possible improvement (no temporary buffer).

pythongh-102471: convert decimal module to use PyLongWriter API (PEP …

80f1a04

…757)

bedevere-app bot mentioned this pull request Dec 13, 2024

The C-API for Python to C integer conversion is, to be frank, a mess. #102471

Open

skirpichev requested review from vstinner and picnixz December 13, 2024 16:42

picnixz reviewed Dec 13, 2024

skirpichev and others added 2 commits December 14, 2024 03:40

+ news

c13b7d2

Apply suggestions from code review

589f926

Co-authored-by: Bénédikt Tran <[email protected]>

skirpichev marked this pull request as ready for review December 14, 2024 01:05

bedevere-app bot added the awaiting review label Dec 14, 2024

skirpichev marked this pull request as draft December 14, 2024 05:07

bedevere-app bot removed the awaiting review label Dec 14, 2024

skirpichev added 2 commits December 14, 2024 08:42

Merge branch 'master' into long_export-decimal

f27adef

+ adapt dec_from_long() to use PEP 757

6669b89

skirpichev changed the title ~~gh-102471: convert decimal module to use PyLongWriter API (PEP 757)~~ gh-102471: convert decimal module to use import/export API for ints (PEP 757) Dec 14, 2024

skirpichev requested a review from picnixz December 14, 2024 06:53

skirpichev marked this pull request as ready for review December 14, 2024 07:10

bedevere-app bot added the awaiting review label Dec 14, 2024

skirpichev mentioned this pull request Dec 14, 2024

gh-128863: deprecate _PyLong_FromDigits() function #127939

Merged

Merge branch 'master' into long_export-decimal

05ec274

vstinner reviewed Dec 16, 2024

skirpichev added 2 commits December 16, 2024 10:56

Don't use PyLong_GetNativeLayout()

6e46bc1

Address review:

7f0061f

* cleanup: forgotten PyLongWriter_Discard, pylong variable
* clarify news

vstinner approved these changes Jan 7, 2025

bedevere-app bot added awaiting merge and removed awaiting review labels Jan 7, 2025

Update Modules/_decimal/_decimal.c

b8bf49f

Co-authored-by: Victor Stinner <[email protected]>

picnixz approved these changes Jan 11, 2025

Update Modules/_decimal/_decimal.c

a8189e6

Co-authored-by: Bénédikt Tran <[email protected]>

skirpichev requested a review from serhiy-storchaka January 23, 2025 04:46

skirpichev added 2 commits January 24, 2025 05:31

Merge branch 'master' into long_export-decimal

3854262

use temporary buffer of digits, tmp_digits

336e881

skirpichev requested a review from vstinner January 24, 2025 03:59

serhiy-storchaka approved these changes Jan 24, 2025

vstinner reviewed Jan 24, 2025

address review: comment

e658f2b

skirpichev requested a review from vstinner January 24, 2025 10:44

vstinner approved these changes Jan 24, 2025

vstinner enabled auto-merge (squash) January 24, 2025 10:59

vstinner merged commit 3d8fc8b into python:main Jan 24, 2025
41 checks passed

bedevere-app bot removed the awaiting merge label Jan 24, 2025

skirpichev deleted the long_export-decimal branch January 24, 2025 11:10

skirpichev mentioned this pull request Jan 25, 2025

Avoid temporary allocation in dec_as_long() #129275

Closed

skirpichev mentioned this pull request May 13, 2025

gh-129275: avoid temporary buffer in dec_as_long() #129630

Closed
