Skip to content

Conversation

@RandomShaper
Copy link
Member

@RandomShaper RandomShaper commented Nov 7, 2025

See #112452 for an explanation of why this would be beneficial.

TL;DR Keeping the lock held over the command queue of the rendering server in separate thread mode is not needed because there's a single thread dealing with such server, Therefore, the lock is only needed for thread safety of the command queue itself, which means the lock can be released while commands are run, allowing other threads to add commands without waiting.

Not tested at all due to lack of time...

UPDATE: I've marked this PR as cherry-pickable into some releases, but probably the only commit that should be cherry-picked is the first one (CommandQueueMT: Fix race conditions), which fixes a clear bug. The rest of the PR is a performance improvement, and as such it's better scheduled only for the current dev branch.

@RandomShaper RandomShaper added this to the 4.6 milestone Nov 7, 2025
@RandomShaper RandomShaper requested review from a team as code owners November 7, 2025 11:41
@RandomShaper RandomShaper added topic:core topic:rendering needs testing performance cherrypick:4.4 Considered for cherry-picking into a future 4.4.x release cherrypick:4.5 Considered for cherry-picking into a future 4.5.x release labels Nov 7, 2025
@RandomShaper RandomShaper force-pushed the less_locky_cmd_queue branch 2 times, most recently from 352150c to bf84697 Compare November 7, 2025 13:36
@RandomShaper RandomShaper changed the title CommandQueueMT: Release the lock during commands for single-flusher usages CommandQueueMT: Reduce contention + Fix race conditions Nov 11, 2025
@brycehutchings
Copy link
Contributor

I did some basic testing of this fix using the Sponza scene (https://github.com/Calinou/godot-sponza). I confirmed it fixes the performance problem so that the performance loading a very complex glTF in a background thread takes the same time regardless of if using the Separate or Safe thread model. I also compared it to the performance without the fix for loading with Safe thread model and it was comparable, so I didn't see a degradation due to the memcpy.

@RandomShaper
Copy link
Member Author

@brycehutchings Thanks a lot for testing and reviewing this.

@RandomShaper
Copy link
Member Author

Rebased.

@dsnopek
Copy link
Contributor

dsnopek commented Nov 21, 2025

I noticed the issue with the lock being held while the RenderingServer is flushing the queue (and stalling everything that's queuing stuff for the next frame) while looking at Perfetto traces on Samsung Galaxy XR.

This PR seems to solve that problem entirely!

Here's a trace from master running my fork of the GDQuest TPS demo:

Selection_396

The set_render_display_info is the first thing on this frame that is attempting to queue a Callable for the rendering server, and it ends up stalled there until the render thread finishes rendering the previous frame. You can see all the process stuff waiting until the end.

(NOTE: this won't show up in traces of this project on Meta Quest 3, because Meta's runtime uses xrWaitFrame() to delay running process until close to the end of the frame anyway to improve input latency. However, it's still possible to trigger the issue there too, just not in this project and not as drastically.)

And here is a trace from this PR:

Selection_397

Notice that the process stuff happens super early now and overlaps the rendering, and it's only the RenderingServer::sync() that stalls until the rendering server is done, which is exactly what we would expect.

(NOTE: this trace is actually worse for input latency, but I think that's really a problem with xrWaitFrame() on Samsung Galaxy XR. Inadvertent lock contention on Godot's RenderingServer shouldn't be used as the solution for that :-))

Copy link
Contributor

@dsnopek dsnopek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good to me - after I noticed the issue in Perfetto, but before I found this PR, I was thinking about trying to implement this same change (ie copying the queue in _flush()). However, I'm not all that familiar with this code, so I'm glad that @RandomShaper did it :-)

I can say that the result seems correct from my testing!

Copy link
Member

@akien-mga akien-mga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on Bryce and David's review and testing, and Pedro's familiarity with this code, I think we can safely1 merge this.

Footnotes

  1. TIWAGOS, this code is touchy and regression prone. Keep it in mind when reviewing potential regression reports during the 4.6 dev/beta phase.

@Repiteo Repiteo merged commit 0e182ee into godotengine:master Nov 21, 2025
20 checks passed
@Repiteo
Copy link
Contributor

Repiteo commented Nov 21, 2025

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug cherrypick:4.4 Considered for cherry-picking into a future 4.4.x release cherrypick:4.5 Considered for cherry-picking into a future 4.5.x release enhancement performance topic:core topic:rendering

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants