-
-
Notifications
You must be signed in to change notification settings - Fork 28
Throttle garbage collector frequency in sig_occurred #227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The current implementation breaks the tests (which makes sense, if There are other options e.g. measure how much time has taken by garbage collector runs, only fire if it exceeds 0.1 cumulative (which wouldn't break tests/existing behavior as far as it can, but still ignore garbage collector runs if things get too slow). Thoughts? |
@tornaria @dimpase Can I get a review, thanks. As I mentioned in the linked issue, the proper fix for this is to do it "upstream" at the calling site — don't operate on internal mpz owned by an object collectible by Python GC, instead create a mpz first, call functions to write to it, then copy it to Python object later. But there are a lot of locations in Sage source code that calls a function writing to a mpz. There's also the option of modifying GMP — set pointer member to null first, deallocate later — but I don't know how practical this is. |
sagemathgh-40556: Avoid mutating value field of constructed Integer ## Problem Run the following code ``` a = [Integer(i) for i in range(100)] try: b = pow(2^100000, 3^100000, 5^100000) finally: del a ``` it will take a while. Press Ctrl-C before it finishes. Then **it will hang** for a while before printing `KeyboardInterrupt`. Meanwhile, the similar-looking code ``` a = [int(i) for i in range(100)] try: b = pow(2^100000, 3^100000, 5^100000) finally: del a ``` does not suffer from the problem. ## Reason There are several places in SageMath code that does something like the following ``` cdef Integer x = <Integer>PY_NEW(Integer) sig_on() mpz_powm(x.value, ...) # mutate x.value, slow sig_off() return x ``` If the user press ctrl-C, the destructor for `x` will be called, but because `mpz_powm` is interrupted, `mpz_powm` may be in an **inconsistent state**, so the destructor might do the wrong thing. ## The band-aid fix The band-aid is described in sagemath#24986 : a function `sig_occurred()` is implemented in cysignals that checks whether a signal thrown by `sig_on()` is currently being processed, and the destructor of `Integer` checks: ``` if sig_occurred(): leak this Integer object just in case else: put this Integer object into a common object pool ``` ## The issue There are two problems with this. * This leaks integers **indiscriminately**. In particular, in the example above, **all** the 100 `Integer` objects in `a` are leaked, even though only one is being mutated, being a local variable in the powering method. At any point in the code, we know exactly which object is currently being mutated, as such this indiscriminate leak is unnecessary. * `sig_occurred()` is **slow**: in the example above, it runs the **garbage collector** each time it's called. So, the reason for the hang is that after the user presses Ctrl-C, Python needs to run the garbage collector 100 times before reporting `KeyboardInterrupt`. ## The actual fix I look at the example reported in sagemath#24986 — `coerce_actions.pyx`. The problematic doctest is ``` E = EllipticCurve(GF(5), [4,0]) P = E([2,1,1]) from sage.doctest.util import ensure_interruptible_after with ensure_interruptible_after(0.001): 2^(10^8) * P ``` by some analysis, I determine that the responsible function is `_pow_long`. As such, this pull request makes sure to not call the destructor when the user interrupts. After this pull request, we can remove the `sig_occurred()` workaround introduced in sagemath#24986 . ## Other reported issues * sagemath#39224 * sagemath/cysignals#215 (comment) * sagemath/cysignals#227 ### 📝 Checklist <!-- Put an `x` in all the boxes that apply. --> - [x] The title is concise and informative. - [x] The description explains in detail what this PR is about. - [x] I have linked a relevant issue or discussion. - [ ] I have created tests covering the changes. - [ ] I have updated the documentation and checked the documentation preview. ### ⌛ Dependencies <!-- List all open PRs that this PR logically depends on. For example, --> <!-- - sagemath#12345: short description why this is a dependency --> <!-- - sagemath#34567: ... --> URL: sagemath#40556 Reported by: user202729 Reviewer(s): Michael Orlitzky, Travis Scrimshaw, user202729
Would fix #215 .
(Actually it's only a workaround until sagemath/sage#24986 is fixed the correct way i.e. catch the exception on Sage side and avoid destructing the relevant objects. Still, I don't think there's any case where running garbage collector repeatedly rapidly is desirable.)