-
Notifications
You must be signed in to change notification settings - Fork 3.2k
faster shut down #24891
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
faster shut down #24891
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can commit the suggested changes from lintrunner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can commit the suggested changes from lintrunner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can commit the suggested changes from lintrunner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can commit the suggested changes from lintrunner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can commit the suggested changes from lintrunner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can commit the suggested changes from lintrunner.
… into snnn/shutdown
This PR is to reduce the chance of having crashes when a process is shutting down. The main idea is: if we know the process is shutting down(or if we know the ORT DLL won't be reloaded), we do not need to run C++ object destructors. The approach is recommended by Windows SDK's official document and many Windows developers. For example, 18 years ago Raymond Chen wrote a blog The old-fashioned theory on how processes exit. This is ORT's current behavior. Raymond Chen also wrote a blog what a better approach is
In our case, when onnxruntime is built as a python package, the DLL(onnxruntime_pybind11_state.pyd ) will never be manually unloaded. Same on Linux. Python does not unload the DLLs on exit. Therefore, we do not need to worry about the potential memory leaks caused by any global variables. Therefore we do not need to call OrtEnv's destructors, and we do not need to unload any EP dlls.
In most cases, people do not unload DLLs on Windows. And, on Linux it is even more complicated because GCC needs to maintain a unique table to avoid odr violations, and this feature makes most C++ shared library cannot be unloaded.
So, this change detects if the os is Windows and if the process is shutdown when calling destructors. If yes, the destructor will do nothing.
After this change on Windows in most cases OrtEnv will not be destroyed. The only exception is: if someone manually load the DLL and manually unload the DLL, and also do not have a global threadpool. Then I think the user is an advanced user and they should know that they need to destroy all inference session objects and the OrtEnv singleton object before unloading the DLL. Besides, if they have enabled global thread pool, the DLL won't be unloaded if they haven't shutdown the thread pool and delete the OrtEnv object. And, even if the user has manually loaded/unloaded the DLL, there would still be memory leaks(that are not related to this change). It's hard to get 100% clean.