[d3d12/vk] Implement out of memory detection #7472

Open: wants to merge 7 commits into base: trunk

@teoxoy teoxoy commented Apr 3, 2025

Connections
Resolves the D3D12 and Vulkan parts of #7460.

Description
Implements OOM detection. See #7460 for details.

Testing
I added a new crate, `oom-test`, for this.

Screenshot 2025-03-28 161825

Squash or Rebase?
Rebase

@teoxoy teoxoy requested a review from a team as a code owner April 3, 2025 16:52
@@ -2435,6 +2519,10 @@ impl crate::Device for super::Device {
&self,
desc: &wgt::QuerySetDescriptor<crate::Label>,
) -> Result<super::QuerySet, crate::DeviceError> {
// Assume each query is 256 bytes.
// On an AMD W6800 with driver version 32.0.12030.9, occlusion queries are 256 bytes each.
self.check_for_oom(true, desc.count as u64 * 256)?;
Member Author

I'm not 100% sure about this; there doesn't seem to be any indication in the Vulkan API that query sets reside in a host-visible heap (even though that is the case on the AMD card I tested).

Member Author

I was actually surprised they are in the host visible heap rather than the device local one (as they are on D3D12).
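
For illustration, the size estimate from the diff and the heap question raised in this thread can be sketched as follows. Everything other than the 256-byte figure is hypothetical: `HeapKind`, `Backend`, and both functions are made-up names, and the heap mapping only reflects what was observed on one AMD card, not anything the Vulkan spec guarantees.

```rust
/// Hypothetical: which memory heap an allocation is charged against.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum HeapKind {
    HostVisible,
    DeviceLocal,
}

/// Hypothetical: the two backends discussed in this thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Backend {
    Vulkan,
    D3d12,
}

/// From the diff: occlusion queries measured 256 bytes on one AMD W6800 driver.
/// Other hardware, drivers, or query types may differ.
pub const ASSUMED_BYTES_PER_QUERY: u64 = 256;

/// Estimated size of a query set in bytes.
/// Widening to u64 before multiplying means a u32 count cannot overflow.
pub fn estimated_query_set_size(count: u32) -> u64 {
    u64::from(count) * ASSUMED_BYTES_PER_QUERY
}

/// Assumption based only on the observations in this thread:
/// host-visible on the tested Vulkan driver, device-local on D3D12.
pub fn assumed_query_set_heap(backend: Backend) -> HeapKind {
    match backend {
        Backend::Vulkan => HeapKind::HostVisible,
        Backend::D3d12 => HeapKind::DeviceLocal,
    }
}
```

If the Vulkan assumption turns out to be wrong on other drivers, only `assumed_query_set_heap` would need to change; the size estimate is independent of it.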

teoxoy added 7 commits April 8, 2025 15:43
…and acceleration structures

The D3D12 API doesn't guarantee that it returns `E_OUTOFMEMORY` under high memory pressure; the driver and kernel will happily start swapping objects from VRAM to RAM and then from RAM to disk, slowing the system to a crawl if allocations continue in a loop.
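
Because the allocation call itself may never fail, one way to detect memory pressure is to compare projected usage against a reported budget before allocating. A minimal sketch, assuming the backend can report per-heap `usage` and `budget` numbers (DXGI's `QueryVideoMemoryInfo` and Vulkan's `VK_EXT_memory_budget` expose values of this kind); the type names, function name, and percentage threshold are hypothetical, not this PR's actual API:

```rust
/// Hypothetical snapshot of one memory heap, in bytes.
pub struct MemoryInfo {
    pub usage: u64,
    pub budget: u64,
}

/// Hypothetical error type for a failed budget check.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfMemory;

/// Returns Err if `usage + requested` would exceed `threshold_percent`
/// of the heap's budget. The threshold is clamped to 100.
pub fn check_budget(
    info: &MemoryInfo,
    requested: u64,
    threshold_percent: u8,
) -> Result<(), OutOfMemory> {
    let percent = u128::from(threshold_percent.min(100));
    // Compute in u128 so `budget * percent` cannot overflow;
    // the result is at most `budget`, so it fits back into u64.
    let limit = (u128::from(info.budget) * percent / 100) as u64;
    // Saturate instead of wrapping if the request is absurdly large.
    let projected = info.usage.saturating_add(requested);
    if projected > limit {
        Err(OutOfMemory)
    } else {
        Ok(())
    }
}
```

Rejecting the allocation before it reaches the driver avoids ever entering the swapping regime, at the cost of occasionally refusing a request that would have succeeded.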
…re, query set and acceleration structure creation
This removes the possibility of deadlocks: `release_gpu_resources` tries to lock resources (trackers, snatchable_lock, pending_writes, life_tracker) that might already be locked, and `handle_hal_error` is called in lots of places.

Removing the call only delays destruction since `release_gpu_resources` is still called in `maintain`.
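
The pattern described here, recording the failure in the error path and deferring cleanup to a point where no locks are held, can be sketched roughly like this. The `Device` type, its fields, and its method names are simplified stand-ins, not wgpu-core's real types:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical, heavily simplified device.
pub struct Device {
    valid: AtomicBool,
    /// Stand-in for trackers, pending writes, and similar locked state.
    resources: Mutex<Vec<String>>,
}

impl Device {
    pub fn new(resources: Vec<String>) -> Self {
        Device {
            valid: AtomicBool::new(true),
            resources: Mutex::new(resources),
        }
    }

    /// Error path: may run while the caller already holds `resources`,
    /// so it only flips a flag and never takes a lock.
    pub fn handle_error(&self) {
        self.valid.store(false, Ordering::Release);
    }

    /// Called from one well-defined place with no locks held.
    /// Releases resources if the device was marked invalid and
    /// returns how many were released.
    pub fn maintain(&self) -> usize {
        if self.valid.load(Ordering::Acquire) {
            return 0;
        }
        let mut resources = self.resources.lock().unwrap();
        let released = resources.len();
        resources.clear();
        released
    }
}
```

Since `std::sync::Mutex` is not re-entrant, locking `resources` inside `handle_error` could deadlock whenever the caller already holds it; deferring the cleanup to `maintain` sidesteps that at the cost of releasing resources slightly later.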
This is to preserve the current behavior as tested by the `SAMPLER_CREATION_FAILURE` test.

This is not spec-compliant, but it's unclear what we should do instead; I opened gpuweb/gpuweb#5142 to figure that out.
@cwfitzgerald cwfitzgerald self-assigned this Apr 8, 2025