Debug build assertion failure with native threads attempting to acquire GIL on termination #131012
Comments
So I think the GIL-hammering thread is adding itself to the list of threads after the call to [the test is not really simulating a production workload I have, it's just hammering the GIL to make sure that Python + Rust is stable in that case]. |
@arielb1 A minimal example reproducing the issue (in python or C) would be helpful. And would it be possible for you to test on the current main branch? |
How are you hammering the GIL? If it's via |
Yea, basically calling PyGILState_Ensure in a loop. The Rust code has an exception handler that turns the pthread_exit into a hang, but the tstate issue is separate AFAICT.
There wouldn't be any hang or exit (that would happen at somewhere like a |
It’s not crashing, it calls pthread_exit which my handler converts to a hang. It’s the main thread that’s crashing.
It's probably because Bottom line is that you can't call |
This sounds like a fixable race even with the current API - have finalization prevent threads from inserting themselves into the tstate list before emptying the tstate list.
Yeah, and it will probably be fixed with the new API. But |
But existing extensions will keep using it for a long time. I personally believe it should try to be safe, and hanging should be safe here. If it's not safe, then every Python program using native code in a daemon thread has a chance of crashing on exit.
So I think (but have no confirmation) that at this code section (lines 1593 to 1605 in 98fa4a4), it needs to detect that the interpreter is shutting down, release runtime_lock, and do something that triggers a hang/pthread_exit. Of course, this won't help existing Python versions.
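The idea in the comments above (check for shutdown under the lock, release the lock, then hang or exit) can be sketched with a toy model. This is not CPython's code: `toy_runtime`, `toy_register_thread`, and `toy_finalize` are hypothetical names, and the mutex only plays the role of `runtime_lock`. The point it tries to illustrate is that if the "finalizing" flag is set in the same critical section that empties the thread list, no thread can insert itself after the list has been emptied.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the runtime's thread-state list (hypothetical, not CPython's). */
typedef struct {
    pthread_mutex_t lock;  /* plays the role of runtime_lock */
    bool finalizing;       /* set once shutdown has begun */
    size_t n_threads;      /* number of registered thread states */
} toy_runtime;

void toy_runtime_init(toy_runtime *rt)
{
    pthread_mutex_init(&rt->lock, NULL);
    rt->finalizing = false;
    rt->n_threads = 0;
}

/* Thread side: try to register a thread state.
 * Returns false if shutdown has started. The lock is always released
 * before returning, so the caller can hang or exit without holding it. */
bool toy_register_thread(toy_runtime *rt)
{
    bool ok;
    pthread_mutex_lock(&rt->lock);
    ok = !rt->finalizing;
    if (ok) {
        rt->n_threads++;
    }
    pthread_mutex_unlock(&rt->lock);
    return ok;
}

/* Finalizer side: close registration and empty the list in one
 * critical section. Returns how many thread states were removed. */
size_t toy_finalize(toy_runtime *rt)
{
    size_t zapped;
    pthread_mutex_lock(&rt->lock);
    rt->finalizing = true;
    zapped = rt->n_threads;
    rt->n_threads = 0;
    pthread_mutex_unlock(&rt->lock);
    return zapped;
}
```

A thread that gets `false` back would then park itself or exit, which is the step the real runtime would have to perform after dropping the lock.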
You should take a look at #124622 for prior discussion (and problems to solve) here. |
I mean, API-wise, we should have an

    PyGILState_STATE PyGILState_Ensure() {
        PyGILState_EXTENDED_STATE state = PyGILState_EnsureOrFail();
        if (state == PyGILState_FAILED) {
            hang_or_exit_thread();
        }
        return (PyGILState_STATE)state;
    }

I don't think anyone disagrees about this? There's the problem you talked about in the other issue: where should Python stop allowing thread states to be created?
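To make the layering in that proposal concrete, here is a self-contained toy version that compiles without Python. Everything in it is an assumption for illustration: `toy_interp`, `toy_ensure_or_fail`, `toy_ensure`, and the `TOY_*` values are made-up names, and the "hang" is replaced by a counter so the sketch can return and be tested. A real `hang_or_exit_thread()` would not return.

```c
#include <stdbool.h>

/* Hypothetical extended state: the two usual outcomes plus a failure value. */
typedef enum { TOY_LOCKED, TOY_UNLOCKED, TOY_FAILED } toy_extended_state;

/* Toy interpreter state seen from a single thread (not CPython's). */
typedef struct {
    bool finalizing;  /* shutdown has begun */
    bool gil_held;    /* this thread already holds the toy "GIL" */
    int hang_calls;   /* how often the hang placeholder was invoked */
} toy_interp;

/* Fallible primitive: reports failure instead of hanging. */
toy_extended_state toy_ensure_or_fail(toy_interp *interp)
{
    if (interp->finalizing) {
        return TOY_FAILED;
    }
    if (interp->gil_held) {
        return TOY_LOCKED;   /* already held before this call */
    }
    interp->gil_held = true;
    return TOY_UNLOCKED;     /* was not held before this call */
}

/* Placeholder for hang_or_exit_thread(); it only records the call. */
void toy_hang_or_exit_thread(toy_interp *interp)
{
    interp->hang_calls++;
}

/* Legacy-style wrapper: same primitive, failure turned into a hang. */
toy_extended_state toy_ensure(toy_interp *interp)
{
    toy_extended_state state = toy_ensure_or_fail(interp);
    if (state == TOY_FAILED) {
        toy_hang_or_exit_thread(interp);
    }
    return state;
}
```

The design choice this tries to show is that callers able to handle failure (for example, a binding layer that wants to unwind cleanly) would call the fallible function directly, while existing callers keep the hanging behavior through the wrapper.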
Yes, that's essentially what the PR does. |
So the crash issue is #124619 |
Looks like it. I'm going to close this for now. Feel free to send your code (in C, I'm no good with Rust) and I'll reopen it if I determine that this isn't |
Bug description:
I'm writing PyO3/pyo3#4874, which tries to avoid Rust crashing on Python interpreter termination when there are native threads attempting to acquire the GIL.
To test it, I created a test that constantly hammers the GIL on a daemon thread, and on debug builds, I get this assertion failure fairly reliably on Python 3.13:
It looks like zapthreads is attempting to zap a native thread that does not currently hold the GIL. Would it help if I reproduced this in a C example?
CPython versions tested on:
3.13
Operating systems tested on:
Linux