-
-
Notifications
You must be signed in to change notification settings - Fork 193
Fix lbeta for large arguments #1612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix lbeta for large arguments #1612
Conversation
In #1579 @bob-carpenter inquired about Mathematica code. I will address it here, as this PR needs to be merged before #1579.
I am not a lawyer, but the code is IMHO easy: I've written the code, I (or my employer) have the IP - this has nothing to do with what you do with the code after it's written. For ouptut of software, the case is less clear cut, but AFAIK the important part is who did a "creative act" in creating the outputs. Referring to for example this source (but others agree): If the user provides non-trivial inputs to a software, they can usually claim copyright to its outputs (the same way Microsoft cannot claim copyright over stuff written in Word / Excel or Adobe cannot claim copyright to images created in Illustrator). In some cases, just running a software does not give you copyright over the results - for example for software with no inputs designed to produce the outputs. While there is an interesting little explored field over stuff like AI-created artwork, I think we are safely on the good side of the law - my input to Mathematica was creative and non-trivial. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for finding the problem and coming up with a proper solution. I've read through this and left you a bunch of nitpicks and a few style improvements. I don't feel confident enough to properly review this, so request a review from somebody more experienced. Hopefully, if you address my comments, they'll have an easier time reviewing the actual, important parts.
constexpr double NaN = std::numeric_limits<double>::quiet_NaN(); | ||
// Test values generated in Mathematice, reproducible notebook at | ||
// https://www.wolframcloud.com/obj/martin.modrak/Published/lbeta.nb | ||
// Mathematica Code reproduced below for convenience: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion, but perhaps this should be recorded in the issue rather than committed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize putting it into an issue is an option, that's a good point! Still I slightly prefer having it in code - it can evolve with the code and so it might make sense to version it along the code. And the code might be around longer than GitHub...
Thanks @mcol for the suggestions, I've implemented all of them and learned a bit more about Stan's style. |
…gs/RELEASE_500/final)
On Jan 14, 2020, at 4:51 AM, Martin Modrák ***@***.***> wrote:
In #1579 @bob-carpenter inquired about Mathematica code. I will address it here, as this PR needs to be merged before #1579.
Also, have you checked that the IP is OK for us to include Mathematica code in comments and use the output of their approximations? I can't see why it wouldn't be, but it'd be good to check before we go too far down the road of using a proprietary software package.
I am not a lawyer, but the code is IMHO easy: I've written the code, I (or my employer) have the IP -
So far so good. You own the IP to what you write.
this has nothing to do with what you do with the code after it's written.
It does, as this is where the legal restrictions come in.
License to distribute does not follow from owning copyright any more than owning a patent on a technology gives you the right to create it. Patents and copyright are about stopping other people from doing things with your work. For example, you can write code that's considered a derivative work of other code and then your distribution and licensing is bound by the original code's licensing terms. Or you can write code that's covered by a patent you don't own the rights to, so it would be illegal to distribute.
Anyway, I think we should be OK with including code snippets given that Mathematica's a programming language.
For ouptut of software, the case is less clear cut, but AFAIK the important part is who did a "creative act" in creating the outputs. Referring to for example this source (but others agree): If the user provides non-trivial inputs to a software, they can usually claim copyright to its outputs (the same way Microsoft cannot claim copyright over stuff written in Word / Excel or Adobe cannot claim copyright to images created in Illustrator).
That makes sense. Thanks for looking into this.
|
…gs/RELEASE_500/final)
…xp1~20180509124008.99 (branches/release_50)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more tiny changes, then you should tag somebody for review.
*/ | ||
template <typename T> | ||
return_type_t<T> lgamma_stirling(const T x) { | ||
return 0.5 * LOG_TWO_PI + (x - 0.5) * log(x) - x; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I meant that HALF_LOG_TWO_PI
is in constants.hpp and can be used already.
EXPECT_NEAR(diff_before, diff_after, tol) << "diff around useful cutoff"; | ||
} | ||
|
||
namespace lgamma_stirling_diff_test_internal { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to namespace this in a test file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've heard there is a plan to run all unit tests in a single translation unit to save on compilation time (https://discourse.mc-stan.org/t/requiring-global-unique-names-for-unit-tests-in-stan-math/11270). Since I do have TestValue
and testValues
in other tests, this seemed like a reasonable precaution.
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
…a-large-arguments
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
…s' into bugfix/1611-lbeta-large-arguments
…a-large-arguments
…gs/RELEASE_500/final)
Pinging @bbbales2 - do you have time for review or should I ask someone else? Thx! |
Jenkins Console Log Machine informationProductName: Mac OS X ProductVersion: 10.11.6 BuildVersion: 15G22010CPU: G++: Clang: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Also, I asked Bob about:
He said algorithms can't be copyrighted, so we're good. |
To clarify, algorithms themselves are like recipes for food---they can't be copyrighted.
The confusing thing is that the expression of the algorithm in code, like the expression of a recipe in text, can be copyrighted.
The grey area is whether a reimplementation in another language infringes on the copyright. The feedback we got from IP lawyers working for NumFOCUS is that we should be OK if we rewrite the algorithm in another language.
Patents are another story. Algorithms can be patented, but it's much harder to do now than it was 10 or 20 years ago. The person holding the patent would be able to sue someone using the patent without their permission.
… On Jan 28, 2020, at 1:56 PM, Ben Bales ***@***.***> wrote:
Also, I asked Bob about:
While I reimplemented the code from scratch, some portions of lbeta.hpp are quite similar to the R code as there are only so many ways you can rearrange the Stirling approximation to get stable results.
He said algorithms can't be copyrighted, so we're good.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Summary
Reimplementing
lbeta
to be stable when one argument is large and other small.It is a neat trick I found in the R codebase - you first compute an approximation to
lbeta
via Stirling approximation tolgamma
which allows you to analytically manipulate the expression and avoid catastrophic cancellation for the "bulk" of the value. Then you compute the difference betweenlgamma
and the Stirling approximation (the functionlgamma_stirling_diff
) and add this to the result to restore precision.The code in R for the difference between
lgamma
and Stirling was opaque and used some tricks I didn't really get, but it turns out just using 3 additional terms of the Stirling series as found on Wikipedia is enough for all the tests I could come up with :-)Possible issues:
lgamma_stirling_diff
is a good name for the function.lbeta.hpp
are quite similar to the R code as there are only so many ways you can rearrange the Stirling approximation to get stable results. I believe this should not be a copyright issue, but I am not a lawyer.Tests
Side Effects
Hopefully none. I have not done any performance comparison of the old vs. new implementation. I can imagine the new is faster (can avoid calls to
lgamma
) or slower (needs quite a bunch of arithmetic operations).I did not touch the way gradients are computed (as those look mostly OK, although for very small arguments I needed to increase the test tolerance to
1e-7
).Checklist
Math issue lbeta is unstable when the arguments differ a lot #1611
Institute of Microbiology of the Czech Academy of Sciences
By submitting this pull request, the copyright holder is agreeing to the license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
the basic tests are passing
./runTests.py test/unit
)make test-headers
)make doxygen
)make cpplint
)the code is written in idiomatic C++ and changes are documented in the doxygen
the new changes are tested