CommandQueueMT: Reduce contention + Fix race conditions #112506
I did some basic testing of this fix using the Sponza scene (https://github.com/Calinou/godot-sponza). I confirmed that it fixes the performance problem: loading a very complex glTF in a background thread now takes the same time regardless of whether the Separate or Safe thread model is used. I also compared it with loading under the Safe thread model without the fix, and the results were comparable, so I didn't see a degradation due to the memcpy.
@brycehutchings Thanks a lot for testing and reviewing this.
Rebased.
I noticed the issue with the lock being held while the RenderingServer is flushing the queue (and stalling everything that's queuing stuff for the next frame) while looking at Perfetto traces on Samsung Galaxy XR. This PR seems to solve that problem entirely! Here's a trace from
The (NOTE: this won't show up in traces of this project on Meta Quest 3, because Meta's runtime uses
And here is a trace from this PR:
Notice that the process stuff happens super early now and overlaps the rendering, and it's only the (NOTE: this trace is actually worse for input latency, but I think that's really a problem with
dsnopek left a comment:
The code looks good to me. After I noticed the issue in Perfetto, but before I found this PR, I was thinking about trying to implement this same change (i.e. copying the queue in _flush()). However, I'm not all that familiar with this code, so I'm glad that @RandomShaper did it :-)
I can say that the result seems correct from my testing!
Thanks!


See #112452 for an explanation of why this would be beneficial.
TL;DR: Keeping the command queue's lock held while the rendering server runs commands in separate thread mode is not needed, because a single thread deals with that server. The lock is therefore only needed for the thread safety of the command queue itself, which means it can be released while commands are run, allowing other threads to add commands without waiting.
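
To illustrate the idea, here is a minimal, self-contained sketch in standard C++. It is not the actual `CommandQueueMT` code: `SketchCommandQueue`, `push`, `flush` and `run_demo` are hypothetical names, and the sketch swaps a `std::vector` out under the lock instead of copying raw command storage. It only shows the general pattern of taking the pending commands while holding the lock and then running them with the lock released.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for a multi-producer, single-consumer command queue.
// This is NOT Godot's CommandQueueMT; names and layout are illustrative only.
class SketchCommandQueue {
public:
    using Command = std::function<void()>;

    // May be called from any thread. The lock protects only the container.
    void push(Command cmd) {
        std::lock_guard<std::mutex> guard(mutex_);
        pending_.push_back(std::move(cmd));
    }

    // Assumed to be called from the single consumer thread only.
    // Returns the number of commands that were executed.
    std::size_t flush() {
        std::size_t executed = 0;
        for (;;) {
            std::vector<Command> batch;
            {
                // Hold the lock just long enough to take the pending commands.
                std::lock_guard<std::mutex> guard(mutex_);
                if (pending_.empty()) {
                    break;
                }
                batch.swap(pending_);
            }
            // The lock is released here, so other threads (or the commands
            // themselves) can push without waiting for the batch to finish.
            for (Command &cmd : batch) {
                cmd();
                ++executed;
            }
        }
        return executed;
    }

private:
    std::mutex mutex_;
    std::vector<Command> pending_;
};

// Demo: one command queues another command while the flush is running.
// If the non-recursive mutex were held across execution, this push would
// deadlock; with the lock released it is picked up by the next loop pass.
int run_demo() {
    SketchCommandQueue queue;
    int counter = 0;
    queue.push([&] { ++counter; });
    queue.push([&] {
        ++counter;
        queue.push([&] { ++counter; });
    });
    std::size_t executed = queue.flush();
    assert(executed == 3);
    return counter;
}
```

Calling `run_demo()` pushes two commands, one of which enqueues a third during the flush, and all three run within a single `flush()` call. The trade-off is the extra step of moving commands out of the shared container before running them, which is the cost the memcpy remark in the testing comment refers to for the real implementation.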
Not tested at all due to lack of time...
UPDATE: I've marked this PR as cherry-pickable into some releases, but probably the only commit that should be cherry-picked is the first one (CommandQueueMT: Fix race conditions), which fixes a clear bug. The rest of the PR is a performance improvement, and as such it's better scheduled only for the current dev branch.