Use mesh bounds center for transparent/transmissive sorting #22041
Transparent and transmissive phases previously used the instance translation from GlobalTransform as the sort position. This breaks down when mesh geometry is authored in "world-like" coordinates and the instance transform is identity or near-identity (common in building/CAD-style content). In such cases multiple transparent instances end up with the same translation and produce incorrect draw order. This change introduces sorting based on the world-space center of the mesh bounds instead of the raw translation. The local bounds center is stored per mesh/instance and transformed by the instance’s world transform when building sort keys. This adds a small amount of per-mesh/instance data but produces much more correct transparent and transmissive rendering in real-world scenes.
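The sort position described above can be sketched as follows. This is a minimal, self-contained illustration and not the PR's code: Vec3, Affine, Aabb and sort_position are hypothetical stand-ins for Bevy's math and render types, chosen so the example runs without any dependencies.

```rust
// Hypothetical stand-ins for Bevy's math types, kept minimal on purpose.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec3 {
    x: f32,
    y: f32,
    z: f32,
}

impl Vec3 {
    const fn new(x: f32, y: f32, z: f32) -> Self {
        Vec3 { x, y, z }
    }
}

/// Affine transform: three basis columns (rotation/scale) plus a translation.
#[derive(Clone, Copy, Debug)]
struct Affine {
    cols: [Vec3; 3],
    translation: Vec3,
}

impl Affine {
    const IDENTITY: Affine = Affine {
        cols: [
            Vec3::new(1.0, 0.0, 0.0),
            Vec3::new(0.0, 1.0, 0.0),
            Vec3::new(0.0, 0.0, 1.0),
        ],
        translation: Vec3::new(0.0, 0.0, 0.0),
    };

    /// Applies the linear part and then the translation to a point.
    fn transform_point(&self, p: Vec3) -> Vec3 {
        let [c0, c1, c2] = self.cols;
        Vec3::new(
            c0.x * p.x + c1.x * p.y + c2.x * p.z + self.translation.x,
            c0.y * p.x + c1.y * p.y + c2.y * p.z + self.translation.y,
            c0.z * p.x + c1.z * p.y + c2.z * p.z + self.translation.z,
        )
    }
}

/// Local-space axis-aligned bounding box given by its min and max corners.
#[derive(Clone, Copy, Debug)]
struct Aabb {
    min: Vec3,
    max: Vec3,
}

impl Aabb {
    /// Midpoint of the box; this is the value that would be cached per instance.
    fn center(&self) -> Vec3 {
        Vec3::new(
            (self.min.x + self.max.x) * 0.5,
            (self.min.y + self.max.y) * 0.5,
            (self.min.z + self.max.z) * 0.5,
        )
    }
}

/// World-space sort position: the cached local center moved by the world transform.
fn sort_position(world_from_local: &Affine, local_center: Vec3) -> Vec3 {
    world_from_local.transform_point(local_center)
}

fn main() {
    // Two CAD-style meshes: geometry authored in world coordinates, identity transforms.
    let pane_a = Aabb { min: Vec3::new(0.0, 0.0, -2.0), max: Vec3::new(1.0, 1.0, -1.0) };
    let pane_b = Aabb { min: Vec3::new(0.0, 0.0, -11.0), max: Vec3::new(1.0, 1.0, -10.0) };
    let transform = Affine::IDENTITY;

    // Translation-based sorting sees the same position for both instances...
    println!("translation (both): {:?}", transform.translation);
    // ...while the bounds centers are distinct and reflect where the geometry is.
    println!("center a: {:?}", sort_position(&transform, pane_a.center()));
    println!("center b: {:?}", sort_position(&transform, pane_b.center()));
}
```

With identity transforms both instances share the same translation, so only the bounds centers give the sorter anything to distinguish them by.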
I haven't reviewed the code yet, and I'm not opposed to the idea, but I would say that for an app that cares a lot about correct transparency, the solution should be some form of order-independent transparency. We have support for it in Bevy. There's still some work to be done on it, but it can definitely be used for CAD apps since it's already being used in production CAD apps.
Okay, I looked at the code and everything seems to make sense to me. The only thing I would like to see is some kind of benchmark that shows that it isn't introducing a big performance regression. And if possible it would be nice to have numbers comparing with and without the tie breaker.
Thanks for looking at the code and for the suggestion! What part of the change would you like to see benchmarked? From my side there are two main areas:
Even with that overhead it is still roughly 2-3x faster than
I held off on implementing the packed key because it requires a proper f32-to-lex-u32 conversion utility (similar to
I am also not sure how large the sorting cost is in the overall blending pipeline. For example, saving ~1 ms on sorting 100k instances might be outweighed significantly by the cost of issuing 100k draw calls.
If this approach makes sense, I can:
Please let me know which option you would prefer, and I will update the PR accordingly.
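For readers unfamiliar with the term, an "f32-to-lex-u32" conversion usually means an order-preserving bit mapping from floats to unsigned integers, so that a depth and a tie-breaker can be packed into a single integer key. The sketch below shows one common form of that trick. It is an assumption about what the comment refers to, and f32_to_lex_u32 and packed_sort_key are hypothetical names, not existing Bevy utilities.

```rust
/// Maps an f32 to a u32 whose unsigned order matches the float's order
/// (the same order as f32::total_cmp, so -0.0 sorts just below +0.0).
fn f32_to_lex_u32(value: f32) -> u32 {
    let bits = value.to_bits();
    if bits & 0x8000_0000 != 0 {
        // Negative: flip every bit so that more negative values become smaller.
        !bits
    } else {
        // Non-negative: set the top bit so these sort above all negatives.
        bits | 0x8000_0000
    }
}

/// Packs depth (high 32 bits) and a tie-breaker index (low 32 bits) into one key.
/// Comparing keys as u64 compares depth first, then the index.
fn packed_sort_key(depth: f32, entity_index: u32) -> u64 {
    ((f32_to_lex_u32(depth) as u64) << 32) | entity_index as u64
}

fn main() {
    let mut keys = vec![
        packed_sort_key(2.5, 7),
        packed_sort_key(-1.0, 3),
        packed_sort_key(2.5, 1),
        packed_sort_key(0.0, 9),
    ];
    // A plain integer sort now orders by depth, then by index.
    keys.sort_unstable();
    assert_eq!(keys[0], packed_sort_key(-1.0, 3));
    assert_eq!(keys[1], packed_sort_key(0.0, 9));
    assert_eq!(keys[2], packed_sort_key(2.5, 1));
    assert_eq!(keys[3], packed_sort_key(2.5, 7));
    println!("packed keys sort by depth, then index");
}
```

A packed key of this kind lets the sort compare one integer per item instead of a float followed by an index.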
A bit of both. I mostly want to make sure this PR isn't a regression. I doubt that using the AABB center would have a high impact, but it's still a pretty hot path, so I would prefer to have at least some numbers to confirm it. I'd also like to see how much of an impact the tie breaker makes. I assume it will be fairly small relative to everything else and is worth it for the stability gain, but I always prefer having real numbers instead of assuming. As for how to test, you don't need to add new benches. Just try to run a few complex scenes with a lot of transparent meshes and compare the frametimes using Tracy. For example, spawn a 50x50x50 grid of transparent cubes and check whether there is any performance impact. Oh, and don't bother with packing unless you confirm that the impact of sorting with the tie breaker is high enough that it matters. We can always do it later if necessary, but I prefer having a baseline that's easier to understand.
I should specify, I would even be happy with just a Tracy comparison of a scene with a lot of meshes on main vs this PR. Comparing with vs without the tie breaker would be nice but not necessary at all.
Thanks for the guidance! I tried it quickly without the tie breaker and already see about a 10% regression. I suspect this comes from the baseline using a no-op sort, so I'll need to set up the same test using instanced meshes where the distances are non-zero.
Objective
Currently, transparent and transmissive render phases in Bevy sort instances using the translation from GlobalTransform. This works only if the mesh origin is a good proxy for the geometry position. In many real-world cases (especially CAD/architecture-like content), the mesh data is authored in "world-like" coordinates and the instance Transform is identity. In such setups, sorting by translation produces incorrect draw order for transparent/transmissive objects.
I propose switching the sorting key from GlobalTransform.translation to the world-space center of the mesh bounds for each instance.
Solution
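Instead of using GlobalTransform.translation as the sort position for transparent/transmissive phases, use the world-space center of the mesh bounds:
The local bounds center is stored per instance (in RenderMeshInstanceShared as center: Vec3, derived from the mesh Aabb) and transformed by the instance's world transform when building sort keys.
This way: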
The main trade-offs:
additional per-instance data in RenderMeshInstanceShared (typically +12 or +16 bytes depending on alignment),
Alternative approach and its drawbacks
In theory, this could be fixed by baking meshes so that:
Transform is adjusted to move it back into place.
However, this has several drawbacks:
In practice, this is not a scalable or convenient solution.
Secondary issue: unstable ordering when depth is equal
There is another related problem with the current sorting: when two transparent/transmissive instances end up with the same view-space depth (for example, their centers project onto the same depth plane), the resulting draw order becomes unstable. This leads to visible flickering, because the internal order of RenderEntity items is not guaranteed to be stable between frames.
In practice this happens quite easily, especially when multiple transparent instances share the same or very similar sort depth, and their relative order in the extracted render list can change frame to frame.
To address this, I suggest extending the sort key with a deterministic tie-breaker, for example the entity's main index. Conceptually, the sort key would become:
This ensures that instances with the same depth keep a consistent draw order across frames, removing flickering while preserving the intended depth-based sorting behavior.
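The tie-breaker idea can be sketched as a two-level comparison: depth first, then a stable per-entity index. This is an illustration under assumptions and not the PR's code. TransparentItem, entity_index and sort_by_depth_then_entity are hypothetical names, and the ascending depth order is chosen only for the example.

```rust
/// Hypothetical phase item: a view-space sort depth plus a stable entity index.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TransparentItem {
    depth: f32,
    entity_index: u32,
}

/// Sorts by depth and falls back to the entity index when depths are equal,
/// so the result does not depend on the order the items arrived in.
fn sort_by_depth_then_entity(items: &mut [TransparentItem]) {
    items.sort_unstable_by(|a, b| {
        a.depth
            .total_cmp(&b.depth)
            .then_with(|| a.entity_index.cmp(&b.entity_index))
    });
}

fn main() {
    // The same three items, extracted in two different orders on two "frames".
    let mut frame_1 = vec![
        TransparentItem { depth: 5.0, entity_index: 12 },
        TransparentItem { depth: 5.0, entity_index: 4 },
        TransparentItem { depth: 2.0, entity_index: 30 },
    ];
    let mut frame_2 = vec![frame_1[1], frame_1[2], frame_1[0]];

    sort_by_depth_then_entity(&mut frame_1);
    sort_by_depth_then_entity(&mut frame_2);

    // Equal-depth items end up in the same relative order on both frames.
    assert_eq!(frame_1, frame_2);
    println!("{:?}", frame_1);
}
```

Because the comparison never returns "equal" for two distinct entities, even an unstable sort produces the same order every frame.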
Testing
cargo run -p ci -- test
cargo run -p ci -- doc
cargo run -p ci -- compile
Run this "example"
Showcase
In my tests with building models (windows, glass, etc.), switching from translation-based sorting to bounds-center-based sorting noticeably improves the visual result. Transparent surfaces that were previously fighting or blending incorrectly now render in a much more expected order.
Current:
https://youtu.be/WjDjPAoKK6w
Sort by aabb center:
https://youtu.be/-Sl4GOXp_vQ
Sort by aabb center + tie breaker:
https://youtu.be/0aQhkSKxECo