Fix lbeta for large arguments #1612

martinmodrak · 2020-01-14T09:36:21Z

Summary

Reimplementing lbeta to be stable when one argument is large and other small.

It is a neat trick I found in the R codebase - you first compute an approximation to lbeta via Stirling approximation to lgamma which allows you to analytically manipulate the expression and avoid catastrophic cancellation for the "bulk" of the value. Then you compute the difference between lgamma and the Stirling approximation (the function lgamma_stirling_diff) and add this to the result to restore precision.

The code in R for the difference between lgamma and Stirling was opaque and used some tricks I didn't really get, but it turns out just using 3 additional terms of the Stirling series as found on Wikipedia is enough for all the tests I could come up with :-)

Possible issues:

Not sure if lgamma_stirling_diff is a good name for the function.
The R code that inspired this implementation (https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/nmath/lbeta.c) is GPLv2. While I reimplemented the code from scratch, some portions of lbeta.hpp are quite similar to the R code as there are only so many ways you can rearrange the Stirling approximation to get stable results. I believe this should not be a copyright issue, but I am not a lawyer.

Tests

A test against values and derivatives precomputed in Mathematica
A test against several identities
A test for extreme values, infinity
A test for stability of value around the cutoff for using the Stirling approximation

Side Effects

Hopefully none. I have not done any performance comparison of the old vs. new implementation. I can imagine the new is faster (can avoid calls to lgamma) or slower (needs quite a bunch of arithmetic operations).

I did not touch the way gradients are computed (as those look mostly OK, although for very small arguments I needed to increase the test tolerance to 1e-7).

Checklist

Math issue lbeta is unstable when the arguments differ a lot #1611
Institute of Microbiology of the Czech Academy of Sciences

By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

the basic tests are passing
- unit tests pass (to run, use: ./runTests.py test/unit)
- header checks pass, (make test-headers)
- docs build, (make doxygen)
- code passes the built in C++ standards checks (make cpplint)
the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested

…gs/RELEASE_500/final)

martinmodrak · 2020-01-14T09:51:05Z

In #1579 @bob-carpenter inquired about Mathematica code. I will address it here, as this PR needs to be merged before #1579.

Also, have you checked that the IP is OK for us to include Mathematica code in comments and use the output of their approximations? I can't see why it wouldn't be, but it'd be good to check before we go too far down the road of using a proprietary software package.

I am not a lawyer, but the code is IMHO easy: I've written the code, I (or my employer) have the IP - this has nothing to do with what you do with the code after it's written.

For ouptut of software, the case is less clear cut, but AFAIK the important part is who did a "creative act" in creating the outputs. Referring to for example this source (but others agree): If the user provides non-trivial inputs to a software, they can usually claim copyright to its outputs (the same way Microsoft cannot claim copyright over stuff written in Word / Excel or Adobe cannot claim copyright to images created in Illustrator). In some cases, just running a software does not give you copyright over the results - for example for software with no inputs designed to produce the outputs. While there is an interesting little explored field over stuff like AI-created artwork, I think we are safely on the good side of the law - my input to Mathematica was creative and non-trivial.

mcol

Thank you for finding the problem and coming up with a proper solution. I've read through this and left you a bunch of nitpicks and a few style improvements. I don't feel confident enough to properly review this, so request a review from somebody more experienced. Hopefully, if you address my comments, they'll have an easier time reviewing the actual, important parts.

stan/math/prim/fun/lbeta.hpp

test/unit/math/prim/fun/lbeta_test.cpp

test/unit/math/prim/fun/lgamma_stirling_diff_test.cpp

test/unit/math/rev/fun/lbeta_test.cpp

stan/math/prim/fun/lgamma_stirling_diff.hpp

mcol · 2020-01-14T14:15:20Z

test/unit/math/rev/fun/lbeta_test.cpp

+constexpr double NaN = std::numeric_limits<double>::quiet_NaN();
+// Test values generated in Mathematice, reproducible notebook at
+// https://www.wolframcloud.com/obj/martin.modrak/Published/lbeta.nb
+// Mathematica Code reproduced below for convenience:


I don't have a strong opinion, but perhaps this should be recorded in the issue rather than committed here.

I didn't realize putting it into an issue is an option, that's a good point! Still I slightly prefer having it in code - it can evolve with the code and so it might make sense to version it along the code. And the code might be around longer than GitHub...

martinmodrak · 2020-01-14T14:58:08Z

Thanks @mcol for the suggestions, I've implemented all of them and learned a bit more about Stan's style.

…gs/RELEASE_500/final)

bob-carpenter · 2020-01-14T19:10:54Z

On Jan 14, 2020, at 4:51 AM, Martin Modrák ***@***.***> wrote: In #1579 @bob-carpenter inquired about Mathematica code. I will address it here, as this PR needs to be merged before #1579. Also, have you checked that the IP is OK for us to include Mathematica code in comments and use the output of their approximations? I can't see why it wouldn't be, but it'd be good to check before we go too far down the road of using a proprietary software package. I am not a lawyer, but the code is IMHO easy: I've written the code, I (or my employer) have the IP -

So far so good. You own the IP to what you write.

this has nothing to do with what you do with the code after it's written.

It does, as this is where the legal restrictions come in. License to distribute does not follow from owning copyright any more than owning a patent on a technology gives you the right to create it. Patents and copyright are about stopping other people from doing things with your work. For example, you can write code that's considered a derivative work of other code and then your distribution and licensing is bound by the original code's licensing terms. Or you can write code that's covered by a patent you don't own the rights to, so it would be illegal to distribute. Anyway, I think we should be OK with including code snippets given that Mathematica's a programming language.

For ouptut of software, the case is less clear cut, but AFAIK the important part is who did a "creative act" in creating the outputs. Referring to for example this source (but others agree): If the user provides non-trivial inputs to a software, they can usually claim copyright to its outputs (the same way Microsoft cannot claim copyright over stuff written in Word / Excel or Adobe cannot claim copyright to images created in Illustrator).

That makes sense. Thanks for looking into this.

…irling_diff

…gs/RELEASE_500/final)

…xp1~20180509124008.99 (branches/release_50)

mcol

A few more tiny changes, then you should tag somebody for review.

mcol · 2020-01-17T08:50:38Z

stan/math/prim/fun/lgamma_stirling.hpp

+ */
+template <typename T>
+return_type_t<T> lgamma_stirling(const T x) {
+  return 0.5 * LOG_TWO_PI + (x - 0.5) * log(x) - x;


Here I meant that HALF_LOG_TWO_PI is in constants.hpp and can be used already.

stan/math/prim/fun/lgamma_stirling_diff.hpp

mcol · 2020-01-17T09:02:34Z

test/unit/math/prim/fun/lgamma_stirling_diff_test.cpp

+  EXPECT_NEAR(diff_before, diff_after, tol) << "diff around useful cutoff";
+}
+
+namespace lgamma_stirling_diff_test_internal {


There's no need to namespace this in a test file.

I've heard there is a plan to run all unit tests in a single translation unit to save on compilation time (https://discourse.mc-stan.org/t/requiring-global-unique-names-for-unit-tests-in-stan-math/11270). Since I do have TestValue and testValues in other tests, this seemed like a reasonable precaution.

stan/math/prim/fun/lgamma_stirling.hpp

martinmodrak · 2020-01-17T10:21:36Z

Thanks @mcol for the suggestions, I've implemented them, except for the test namespace (see the comment).

I would tag @bbbales2 for review (but this does not need to make it into the upcoming release, so no rush).

…gs/RELEASE_500/final)

…stable/2017-11-14)

stan-buildbot · 2020-01-17T19:33:21Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	4.91	4.95	0.99	-0.72% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.98	-1.61% slower
eight_schools/eight_schools.stan	0.09	0.09	1.04	4.24% faster
gp_regr/gp_regr.stan	0.22	0.24	0.95	-5.78% slower
irt_2pl/irt_2pl.stan	6.08	6.12	0.99	-0.67% slower
performance.compilation	87.82	87.27	1.01	0.63% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	7.39	7.32	1.01	1.07% faster
pkpd/one_comp_mm_elim_abs.stan	22.22	20.29	1.1	8.69% faster
sir/sir.stan	104.41	104.95	0.99	-0.52% slower
gp_regr/gen_gp_data.stan	0.04	0.05	0.97	-2.93% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	3.0	2.96	1.01	1.28% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.31	0.31	1.0	-0.22% slower
arK/arK.stan	1.74	1.74	1.0	0.04% faster
arma/arma.stan	0.78	0.8	0.97	-3.3% slower
garch/garch.stan	0.59	0.59	1.0	-0.08% slower
Mean result: 1.00113690293

Jenkins Console Log
Blue Ocean
Commit hash: 17f3680

Machine information

ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

…a-large-arguments

stan-buildbot · 2020-01-19T00:55:45Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	4.89	4.99	0.98	-2.09% slower
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.98	-1.9% slower
eight_schools/eight_schools.stan	0.09	0.09	0.98	-2.23% slower
gp_regr/gp_regr.stan	0.22	0.22	0.99	-0.53% slower
irt_2pl/irt_2pl.stan	6.08	6.09	1.0	-0.1% slower
performance.compilation	88.02	86.93	1.01	1.24% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	7.31	7.31	1.0	-0.02% slower
pkpd/one_comp_mm_elim_abs.stan	21.07	20.66	1.02	1.93% faster
sir/sir.stan	96.44	93.63	1.03	2.92% faster
gp_regr/gen_gp_data.stan	0.05	0.05	0.98	-2.24% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.97	2.96	1.0	0.4% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.31	0.31	1.0	0.04% faster
arK/arK.stan	1.75	1.75	1.0	-0.11% slower
arma/arma.stan	0.81	0.81	1.0	-0.38% slower
garch/garch.stan	0.63	0.63	1.01	0.53% faster
Mean result: 0.998517831787

Jenkins Console Log
Blue Ocean
Commit hash: 084ddfb

Machine information

ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

…s' into bugfix/1611-lbeta-large-arguments

…a-large-arguments

…gs/RELEASE_500/final)

martinmodrak · 2020-01-27T10:35:59Z

Pinging @bbbales2 - do you have time for review or should I ask someone else? Thx!

stan-buildbot · 2020-01-27T16:16:18Z

Name	Old Result	New Result	Ratio	Performance change( 1 - new / old )
gp_pois_regr/gp_pois_regr.stan	4.88	4.84	1.01	0.91% faster
low_dim_corr_gauss/low_dim_corr_gauss.stan	0.02	0.02	0.99	-0.62% slower
eight_schools/eight_schools.stan	0.09	0.09	0.97	-3.11% slower
gp_regr/gp_regr.stan	0.23	0.22	1.0	0.09% faster
irt_2pl/irt_2pl.stan	6.17	6.08	1.01	1.45% faster
performance.compilation	88.83	86.86	1.02	2.22% faster
low_dim_gauss_mix_collapse/low_dim_gauss_mix_collapse.stan	7.31	7.33	1.0	-0.24% slower
pkpd/one_comp_mm_elim_abs.stan	21.74	20.37	1.07	6.3% faster
sir/sir.stan	98.47	98.31	1.0	0.17% faster
gp_regr/gen_gp_data.stan	0.04	0.04	0.99	-0.67% slower
low_dim_gauss_mix/low_dim_gauss_mix.stan	2.97	2.97	1.0	0.16% faster
pkpd/sim_one_comp_mm_elim_abs.stan	0.31	0.32	0.97	-2.62% slower
arK/arK.stan	1.78	1.76	1.01	0.89% faster
arma/arma.stan	0.8	0.8	1.0	-0.08% slower
garch/garch.stan	0.58	0.58	0.99	-0.67% slower
Mean result: 1.00324490478

Jenkins Console Log
Blue Ocean
Commit hash: 71b983e

Machine information

ProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010

CPU:
Intel(R) Xeon(R) CPU E5-1680 v2 @ 3.00GHz

G++:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

Clang:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.6.0
Thread model: posix

bbbales2

Looks good!

bbbales2 · 2020-01-28T18:56:27Z

Also, I asked Bob about:

While I reimplemented the code from scratch, some portions of lbeta.hpp are quite similar to the R code as there are only so many ways you can rearrange the Stirling approximation to get stable results.

He said algorithms can't be copyrighted, so we're good.

bob-carpenter · 2020-01-28T20:08:38Z

To clarify, algorithms themselves are like recipes for food---they can't be copyrighted. The confusing thing is that the expression of the algorithm in code, like the expression of a recipe in text, can be copyrighted. The grey area is whether a reimplementation in another language infringes on the copyright. The feedback we got from IP lawyers working for NumFOCUS is that we should be OK if we rewrite the algorithm in another language. Patents are another story. Algorithms can be patented, but it's much harder to do now than it was 10 or 20 years ago. The person holding the patent would be able to sue someone using the patent without their permission.

…

On Jan 28, 2020, at 1:56 PM, Ben Bales ***@***.***> wrote: Also, I asked Bob about: While I reimplemented the code from scratch, some portions of lbeta.hpp are quite similar to the R code as there are only so many ways you can rearrange the Stirling approximation to get stable results. He said algorithms can't be copyrighted, so we're good. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

martinmodrak and others added 4 commits January 14, 2020 08:37

Failing test

4dce6ad

Fixes stan-dev#1611

2fedf57

Test against derivatives, identitites at around cutoff for Stirling

b0394a4

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

c89f18c

…gs/RELEASE_500/final)

martinmodrak added 4 commits January 14, 2020 10:57

Fixing lint errors

5ae5f93

Lint errors

10bbde3

Revert expect_near_rel.hpp

e67e7a2

Line end

b74c0bc

This was referenced Jan 14, 2020

lbeta is unstable when the arguments differ a lot #1611

Closed

Improved numerical stability of binomial_coefficient_log #1614

Merged

Fixing negative binomial phi cutoff #1497

Merged

mcol reviewed Jan 14, 2020

View reviewed changes

Format and comment improvements as suggested in review.

6b87d37

stan-buildbot and others added 2 commits January 14, 2020 15:00

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

4157942

…gs/RELEASE_500/final)

Fixed headers

c74b152

martinmodrak and others added 7 commits January 16, 2020 13:41

Removed problematic constexpr

abe6d58

Tighten test accuracy, use more terms of Stirling series in lgamma_st…

0a93fdc

…irling_diff

Merge commit '45dce152f1fdb6fc079218d76b611d2664bf305e' into HEAD

6778e62

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

571e727

…gs/RELEASE_500/final)

Ceased using expect_near_rel

ab16ea5

[Jenkins] auto-formatting by clang-format version 5.0.2-svn328729-1~e…

06cf322

…xp1~20180509124008.99 (branches/release_50)

Slightly relaxed test to pass on Linux

1ced285

mcol reviewed Jan 17, 2020

View reviewed changes

Suggestions from @mcol

1c96115

stan-buildbot and others added 2 commits January 17, 2020 10:22

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

bcf5d48

…gs/RELEASE_500/final)

Further tweak to test tolerance

95aa753

stan-buildbot and others added 2 commits January 17, 2020 08:38

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…

a0ae6c3

…stable/2017-11-14)

Tweak test tolerance for Linux

17f3680

Merge remote-tracking branch 'stan-dev/develop' into bugfix/1611-lbet…

084ddfb

…a-large-arguments

martinmodrak and others added 4 commits January 27, 2020 11:29

Slightly better Mathematica code to generate tests

9ce2093

Merge remote-tracking branch 'origin/bugfix/1611-lbeta-large-argument…

e731457

…s' into bugfix/1611-lbeta-large-arguments

Merge remote-tracking branch 'stan-dev/develop' into bugfix/1611-lbet…

c1cea9c

…a-large-arguments

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

71b983e

…gs/RELEASE_500/final)

bbbales2 approved these changes Jan 28, 2020

View reviewed changes

bbbales2 merged commit a46427f into stan-dev:develop Jan 28, 2020

martinmodrak deleted the bugfix/1611-lbeta-large-arguments branch January 29, 2020 08:24

martinmodrak mentioned this pull request Feb 4, 2020

neg_binomial_2_log_lpmf is not stable for large phi #1495

Closed

martinmodrak mentioned this pull request Feb 12, 2020

Scalar lgamma_stirling and lbeta truncate log input to integer #1704

Closed

Uh oh!

Fix lbeta for large arguments #1612

Fix lbeta for large arguments #1612

Uh oh!

Conversation

martinmodrak commented Jan 14, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Tests

Side Effects

Checklist

Uh oh!

martinmodrak commented Jan 14, 2020

Uh oh!

mcol left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mcol Jan 14, 2020

Choose a reason for hiding this comment

Uh oh!

martinmodrak Jan 14, 2020

Choose a reason for hiding this comment

Uh oh!

martinmodrak commented Jan 14, 2020

Uh oh!

bob-carpenter commented Jan 14, 2020 via email

Uh oh!

mcol left a comment

Choose a reason for hiding this comment

Uh oh!

mcol Jan 17, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mcol Jan 17, 2020

Choose a reason for hiding this comment

Uh oh!

martinmodrak Jan 17, 2020

Choose a reason for hiding this comment

Uh oh!

Uh oh!

martinmodrak commented Jan 17, 2020

Uh oh!

stan-buildbot commented Jan 17, 2020

Uh oh!

stan-buildbot commented Jan 19, 2020

Uh oh!

martinmodrak commented Jan 27, 2020

Uh oh!

stan-buildbot commented Jan 27, 2020

Uh oh!

bbbales2 left a comment

Choose a reason for hiding this comment

Uh oh!

bbbales2 commented Jan 28, 2020

Uh oh!

bob-carpenter commented Jan 28, 2020 via email

Uh oh!

Uh oh!

martinmodrak commented Jan 14, 2020 •

edited

Loading