Skip to content

Add Lah numbers and clean up combinatorics section #39379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: develop
Choose a base branch
from

Conversation

kwankyu
Copy link
Collaborator

@kwankyu kwankyu commented Jan 25, 2025

According to the Wikipedia article https://en.wikipedia.org/wiki/Lah_number, a formula that computes the Lah number $L(n,k)$, also called the Stirling numbers of the third kind, is

$$\binom{n-1}{k-1}\frac{n!}{k!}$$

but the most efficient form of the formula seems to be

$$\binom{n}{k}\frac{k}{n}\binom{n}{k}(n-k)!$$

contributed by @user202729.

Long time ago, #17161 proposed yet another formula that gives $L(n,k)$ for all integers $n,k$. But that formula is extremely slow when implemented in Sage. Like existing functions for Stirling numbers, we only support the main case where $n$ and $k$ are nonnegative.

After this PR, we may close #17161, #17159, #17123 since we do not intend to support the negative $k$ case for combinatorial functions, which is I think only of minor interest and in controversy.

I made some cosmetic changes to the file src/sage/combinat/combinat.py, mostly for user friendliness.

While at it, I ended up making extensive changes to index pages of the combinatorics section, for tidiness. In particular,

  • removed some of "forever" TODOs.
  • removed excessive cross references.
  • uniformized capitalization.

and also fixed #17421.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

Copy link

github-actions bot commented Jan 25, 2025

Documentation preview for this PR (built with commit 217b186; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@user202729
Copy link
Contributor

While we're at it, maybe binomial(n, k)*(n-k).factorial() would be faster than n.factorial()//k.factorial()?

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jan 26, 2025

Not really, at least on my machine:

sage: def lah_numbers4(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.factorial() // k.factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
....: def lah_numbers5(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = binomial(n, k)*(n-k).factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
sage: timeit('T1 = matrix([[lah_numbers4(n,k) for n in range(100)] for k in rang
....: e(100)])')
25 loops, best of 3: 9.74 ms per loop
sage: timeit('T2 = matrix([[lah_numbers5(n,k) for n in range(100)] for k in rang
....: e(100)])')
5 loops, best of 3: 116 ms per loop
sage: T1 = matrix([[lah_numbers4(n,k) for k in range(10)] for n in range(10)])
sage: T2 = matrix([[lah_numbers5(n,k) for k in range(10)] for n in range(10)])
sage: T1 == T2
True

@user202729
Copy link
Contributor

Apparently there's massive overhead in binomial(n, k) (symbolic version) over n.binomial(k) for small input. I only tested for large input.

Try this one

    a = n.binomial(k)
    return a * k // n * a * (n-k).factorial() 

(note that n.binomial(k) * k / n == (n-1).binomial(k-1) so the division is exact)

In my experiment, it has minimal overhead for small n and k (as in your matrix) and a few times faster for e.g. lah_numbers(10^6, 10^5) .

It looks impossible to have a version that calls binomial()/factorial() only twice and avoid division though. Best division-free version I got is (n-1).binomial(k-1) * n.binomial(k) * (n-k).factorial() but it takes 3 calls (slower).

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jan 26, 2025

Voilà!

sage: def lah_numbers4(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.factorial() // k.factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
sage: def lah_numbers6(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.binomial(k)
....:     return a * k // n * a * (n-k).factorial()
....: 
sage: timeit('T1 = matrix([[lah_numbers4(n,k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 9.36 ms per loop
sage: timeit('T1 = matrix([[lah_numbers6(n,k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 11.1 ms per loop
sage: timeit('T1 = matrix([[lah_numbers4(10000+n,10000+k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 1.91 s per loop
sage: timeit('T2 = matrix([[lah_numbers6(10000+n,10000+k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 20.7 ms per loop
sage: timeit('T1 = matrix([[lah_numbers4(10000+n,k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 10.7 s per loop
sage: timeit('T2 = matrix([[lah_numbers6(10000+n,k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 2.19 s per loop

Nice!

@kwankyu kwankyu force-pushed the p/add-lah-numbers branch 2 times, most recently from 260df5f to 42ef17b Compare January 26, 2025 22:23
@kwankyu kwankyu marked this pull request as ready for review January 26, 2025 23:41
@kwankyu kwankyu changed the title Add Lah numbers Add Lah numbers and clean up combinatorics section Jan 26, 2025
@kwankyu kwankyu force-pushed the p/add-lah-numbers branch 2 times, most recently from b5d05e2 to 44c3dc1 Compare January 31, 2025 13:54
@user202729 user202729 mentioned this pull request Apr 5, 2025
5 tasks
vbraun pushed a commit to vbraun/sage that referenced this pull request Apr 10, 2025
sagemathgh-39891: Speed up binomial
    
As first observed in sagemath#39379, when
`n` and `k` are small sage integers, `binomial(n, k)` are much slower
than `n.binomial(k)`.

This pull request fixes it. Now `%timeit binomial(20, 10)` is only twice
slower than `%timeit 20.binomial(10)` — previously it was **30** times
slower.

There's also a minor patch where some code is skipped when all arguments
are instances of `Element`. It is certainly a more common case for the
input to be `Element` than otherwise, so we optimize for it.

Another minor patch involves storing imported modules as C global
variables instead of re-import every time.

I think using not-overriding `_method_arguments` as an indicator of
"don't call the method"
(then silently catch `TypeError` and throw it away) is rather bad
design,
but this is not in scope here.

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.
    
URL: sagemath#39891
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this pull request Apr 13, 2025
sagemathgh-39891: Speed up binomial
    
As first observed in sagemath#39379, when
`n` and `k` are small sage integers, `binomial(n, k)` are much slower
than `n.binomial(k)`.

This pull request fixes it. Now `%timeit binomial(20, 10)` is only twice
slower than `%timeit 20.binomial(10)` — previously it was **30** times
slower.

There's also a minor patch where some code is skipped when all arguments
are instances of `Element`. It is certainly a more common case for the
input to be `Element` than otherwise, so we optimize for it.

Another minor patch involves storing imported modules as C global
variables instead of re-import every time.

I think using not-overriding `_method_arguments` as an indicator of
"don't call the method"
(then silently catch `TypeError` and throw it away) is rather bad
design,
but this is not in scope here.

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.
    
URL: sagemath#39891
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this pull request Apr 18, 2025
sagemathgh-39891: Speed up binomial
    
As first observed in sagemath#39379, when
`n` and `k` are small sage integers, `binomial(n, k)` are much slower
than `n.binomial(k)`.

This pull request fixes it. Now `%timeit binomial(20, 10)` is only twice
slower than `%timeit 20.binomial(10)` — previously it was **30** times
slower.

There's also a minor patch where some code is skipped when all arguments
are instances of `Element`. It is certainly a more common case for the
input to be `Element` than otherwise, so we optimize for it.

Another minor patch involves storing imported modules as C global
variables instead of re-import every time.

I think using not-overriding `_method_arguments` as an indicator of
"don't call the method"
(then silently catch `TypeError` and throw it away) is rather bad
design,
but this is not in scope here.

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [ ] I have updated the documentation and checked the documentation
preview.
    
URL: sagemath#39891
Reported by: user202729
Reviewer(s): Travis Scrimshaw
@kwankyu kwankyu force-pushed the p/add-lah-numbers branch from 3b0e04a to 5c38935 Compare June 9, 2025 00:43
@kwankyu kwankyu force-pushed the p/add-lah-numbers branch from 5c38935 to 1ee62d8 Compare June 9, 2025 04:32
@kwankyu
Copy link
Collaborator Author

kwankyu commented Jun 11, 2025

@fchapoton As you currently seem to be in the "combinat" section, would you review this PR? This PR touches many files in the directory src/sage/combinat to improve the documentation, but most of the changes are quite small.

The new code for Lah numbers was, I think, already reviewed by @user202729.

@fchapoton fchapoton self-assigned this Jun 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic generation of the module list in the Sphinx documentation for sage.combinat Implement the unsigned Lah numbers
3 participants