Support headless testing on CI #136

Closed
repi opened this issue Oct 25, 2020 · 10 comments
Labels
a: test (Issues around testing rust-gpu.), t: help wanted (Extra attention is needed)

Comments

@repi
Contributor

repi commented Oct 25, 2020

@bjorn3 mentioned in #134 (comment) that Mesa's LLVMpipe could be an option for headless CI testing.

Did some quick investigation: LLVMpipe has traditionally supported OpenGL but has recently merged in early Vulkan support ("Vallium"). It likely isn't that mature yet, but it would be really interesting to try it out and see if this can be an additional path for us for doing software-based testing of our graphics shaders, esp. targeting Linux CI.

Would be great if someone can simply try out just running our current Vulkan example program & shader with latest LLVMpipe and see if it works, or what issues there are with it, and report back here.

@repi added the "t: help wanted" and "a: test" labels on Oct 25, 2020
@nipunG314
Contributor

(This was discussed on Discord. Putting it here as a matter of record)

Another option would be SwiftShader: https://github.com/google/swiftshader. It is a conformant implementation of the Vulkan spec (at least on 64-bit Linux), as described here: https://www.khronos.org/conformance/adopters/conformant-products#submission_403

I'll try to compile the latest commit and run the current examples against it.

@nipunG314
Contributor

nipunG314 commented Oct 29, 2020

SwiftShader doesn't seem to be an ideal candidate for running headless CI tests right now.

The following capabilities are currently unsupported:

  1. VulkanMemoryModel (5345)
  2. VariablePointers (4442)
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:388 WARNING: UNSUPPORTED: Unsupported capability 5345
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:388 WARNING: UNSUPPORTED: Unsupported capability 4442
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:720 WARNING: UNSUPPORTED: SPIR-V Extension: SPV_KHR_vulkan_memory_model
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:467 WARNING: UNREACHABLE: FunctionParameter should have already been lowered.
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:725 WARNING: UNSUPPORTED: ReturnValue
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:467 WARNING: UNREACHABLE: FunctionParameter should have already been lowered.
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:467 WARNING: UNREACHABLE: FunctionParameter should have already been lowered.
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:467 WARNING: UNREACHABLE: FunctionParameter should have already been lowered.
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:725 WARNING: UNSUPPORTED: ReturnValue
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:467 WARNING: UNREACHABLE: FunctionParameter should have already been lowered.
/home/nipun/swiftshader/src/Pipeline/SpirvShader.cpp:467 WARNING: UNREACHABLE: FunctionParameter should have already been lowered.
Segmentation fault (core dumped)

Both the ash example and the wgpu example generate these warnings. The ash example crashes the moment it is launched, but the wgpu example runs despite the warnings.
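
For anyone comparing logs like the one above between revisions, here is a minimal Python sketch that tallies the "Unsupported capability N" warnings. It assumes the message format shown in the log; the function name and the name table are hypothetical, and the table only covers the two capabilities named in this comment.

```python
import re
from collections import Counter

# Only the two capability numbers named in this thread; any other number
# is reported as-is rather than guessed at.
KNOWN_CAPABILITIES = {5345: "VulkanMemoryModel", 4442: "VariablePointers"}

# Matches the warning text as it appears in the log above.
_CAP_RE = re.compile(r"WARNING: UNSUPPORTED: Unsupported capability (\d+)")


def unsupported_capabilities(log_text):
    """Count 'Unsupported capability N' warnings, keyed by capability name."""
    counts = Counter()
    for line in log_text.splitlines():
        match = _CAP_RE.search(line)
        if match:
            cap = int(match.group(1))
            counts[KNOWN_CAPABILITIES.get(cap, str(cap))] += 1
    return dict(counts)
```

Feeding it the captured stderr of a run gives a per-capability count, which makes it easier to see whether a newer SwiftShader revision changes the set of warnings.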

I'll try to see if llvmpipe has similar problems.

EDIT: If someone is trying to compile SwiftShader, I would recommend not using the cmake --build . --parallel command that is mentioned in the build instructions. It froze my system and led to a crash. My system is over 5 years old, so it may not be an issue on more modern computers. A similar experience has been documented here: https://issuetracker.google.com/issues/170565506

The cmake .. command generates a Makefile and make -j6 is sufficient for the build.
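
A small Python sketch of the same idea, for scripting the build: cap the number of make jobs instead of using every core. The function name and the default cap of 6 (taken from the make -j6 above) are assumptions, not part of SwiftShader's build instructions.

```python
import os


def make_command(max_jobs=6, cpu_count=None):
    """Return a `make -jN` command with N capped at max_jobs.

    cpu_count can be passed explicitly (e.g. for testing); otherwise the
    number of CPUs reported by the OS is used.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    jobs = max(1, min(cpus, max_jobs))
    return ["make", f"-j{jobs}"]
```

The resulting list can be passed to subprocess.run from the build directory after cmake .. has generated the Makefile. Lowering max_jobs trades build time for lower peak memory use.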

@eddyb
Contributor

eddyb commented Nov 23, 2020

Using VK_ICD_FILENAMES=/nix/store/n2kzvhfcsvrxyh11ma5frpdll4gn3dns-swiftshader-2020-11-20/share/vulkan/icd.d/vk_swiftshader_icd.json cargo run --bin example-runner-wgpu (specifically at google/swiftshader@6d61205) I get:

    Finished dev [unoptimized + debuginfo] target(s) in 0.19s
     Running `target/debug/example-runner-wgpu`
SPIR-V ERROR: 0:0 PushConstant id '8' is missing Block decoration.
From Vulkan spec, section 14.5.1:
Such variables must be identified with a Block decoration
  %shared__ShaderConstants = OpTypeStruct %uint %uint %float

SPIR-V WARNING: 0:0 PushConstant id '8' is missing Block decoration.
From Vulkan spec, section 14.5.1:
Such variables must be identified with a Block decoration
  %shared__ShaderConstants = OpTypeStruct %uint %uint %float

Segmentation fault (core dumped)

(EDIT: looks like the warnings are from #247, so perhaps not related to the crash?)

I was getting the same exact output with the packaged swiftshader-2020-06-17 (google/swiftshader@763957e), which is why I tried switching it to the latest revision, but that clearly didn't fix anything. I'm also not seeing any difference between the wgpu and ash runners.

@nipunG314 Do you happen to know what exact SwiftShader and Rust-GPU revisions you used to get those warnings?
Maybe the changes to the examples (or the SPIR-V backend, or any of the post-processing etc.) made it worse (or better but still hitting warnings + crashes).


> It froze my system and led to a crash. My system is over 5 years old, so it may not be an issue on more modern computers.

I noticed it's compiling LLVM, and I guess something goes wrong with --parallel (sigh it's probably nesting makes incorrectly), but also you may need a bunch of RAM to link LLVM, if not compile it. -j4 is a decent stopgap, but ideally this kind of thing would be built on a dedicated build server, not a regular dev machine.

I can get away with having Nix offload the build to the multi-user EPYC server I use for hacking on rustc, but if Rust-GPU starts using SwiftShader for testing, we should probably have some sort of CI build (unless Google has prebuilds somewhere I'm missing?), for the cases where distro builds are missing / outdated (and for non-Linux).

@eddyb
Contributor

eddyb commented Nov 24, 2020

With a debug build (cmake -DCMAKE_BUILD_TYPE=Debug -DSWIFTSHADER_WARNINGS_AS_ERRORS=0), I get:

/build/SwiftShader-6d61205/src/Vulkan/VkShaderModule.cpp:34 ABORT: ASSERT(spirvTools.Validate(getCode()))

But I'm not sure how related it is, and this happens before those warnings, and there's no additional output presented.


After building a proper unoptimized debug build, and combining that with -DCMAKE_CXX_FLAGS=-DNDEBUG to remove the above ASSERT, I can finally see the cause of the crash: this load-bearing assertion doesn't fire (because ASSERT is a noop in release builds), and apparently the iterator over instructions assumes insns isn't empty.

The same assert is also in the caller (but also doesn't fire, for the same reason), and I'm guessing running the SPIRV-Tools Optimizer failed silently.

@eddyb
Contributor

eddyb commented Nov 24, 2020

As suspected, SPIRV-Tools' Optimizer::Run returns false on failure, but SwiftShader ignores the return value.
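
To illustrate the failure mode (this is a Python analogy, not SwiftShader's actual C++): Python's assert statements are stripped under python -O, much like ASSERT is a no-op in a release build, so a check that later code depends on has to be an explicit branch. All names below are hypothetical.

```python
class OptimizerError(RuntimeError):
    """Raised when the (hypothetical) optimizer pass reports failure."""


def optimize_or_raise(run_optimizer, code):
    """Call run_optimizer(code) -> (ok, optimized) and stop on failure.

    Ignoring `ok` here would hand a possibly empty result to later stages.
    """
    ok, optimized = run_optimizer(code)
    if not ok:
        raise OptimizerError("optimizer reported failure; output is not usable")
    return optimized


def first_instruction(insns):
    """Return the first instruction, refusing an empty stream."""
    # A bare `assert insns` would disappear under `python -O`, so the
    # emptiness check is an explicit branch instead.
    if not insns:
        raise ValueError("instruction stream is empty")
    return insns[0]
```

With both checks in place, a failed optimizer run surfaces as an error at the call site instead of as a crash further down in the instruction iterator.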

However, it's unclear why validation failed

Oh, I missed that one of the "warnings" is actually ERROR. So getting SwiftShader to work is blocked on either solving #247, or patching it to disable validation (and hope that SPIRV-Tools can still work).

@eddyb
Contributor

eddyb commented Nov 24, 2020

Decided to try patching it to disable validation, and it works! (the extra true is the skip_validation parameter)

(screenshot of the patch)

Both wgpu and ash work just as well, with the same set of warnings:

/build/SwiftShader-6d61205/src/Pipeline/SpirvShader.cpp:388 WARNING: UNSUPPORTED: Unsupported capability 5345
/build/SwiftShader-6d61205/src/Pipeline/SpirvShader.cpp:388 WARNING: UNSUPPORTED: Unsupported capability 4442
/build/SwiftShader-6d61205/src/Pipeline/SpirvShader.cpp:721 WARNING: UNSUPPORTED: SPIR-V Extension: SPV_KHR_vulkan_memory_model
/build/SwiftShader-6d61205/src/Pipeline/SpirvShader.cpp:388 WARNING: UNSUPPORTED: Unsupported capability 5345
/build/SwiftShader-6d61205/src/Pipeline/SpirvShader.cpp:388 WARNING: UNSUPPORTED: Unsupported capability 4442
/build/SwiftShader-6d61205/src/Pipeline/SpirvShader.cpp:721 WARNING: UNSUPPORTED: SPIR-V Extension: SPV_KHR_vulkan_memory_model

So we can probably start using unmodified SwiftShader once #247 is addressed.

@eddyb
Contributor

eddyb commented Nov 24, 2020

And the usual on a headless server:

Xvfb :0&
x11vnc -display :0 -localhost&
DISPLAY=:0 VK_ICD_FILENAMES=.../share/vulkan/icd.d/vk_swiftshader_icd.json cargo run --bin example-runner-wgpu

seems to be working as well!
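
For scripting this, a rough Python sketch of the same steps. It assumes Xvfb and cargo are on PATH and that the caller supplies the path to the SwiftShader ICD JSON; the helper names are hypothetical, and x11vnc is left out since it is only needed for watching the output.

```python
import os
import subprocess
from contextlib import contextmanager


def headless_env(display, icd_json, base_env=None):
    """Return a copy of the environment with DISPLAY and VK_ICD_FILENAMES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    env["VK_ICD_FILENAMES"] = icd_json
    return env


@contextmanager
def xvfb(display=":0"):
    """Start Xvfb on `display` and stop it when the block exits."""
    proc = subprocess.Popen(["Xvfb", display])
    try:
        yield proc
    finally:
        proc.terminate()
        proc.wait()


def run_example(icd_json, display=":0"):
    """Run the wgpu example runner against the given ICD under Xvfb."""
    # No readiness check for Xvfb is done here; a real script may need to
    # wait for the display to come up before launching the runner.
    with xvfb(display):
        result = subprocess.run(
            ["cargo", "run", "--bin", "example-runner-wgpu"],
            env=headless_env(display, icd_json),
        )
        return result.returncode
```

Since the example runners open a window, run_example will presumably block until the runner exits, so for CI use it would still need offscreen rendering or a timeout.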


@Jasper-Bekkers
Contributor

Filed two tickets on the SwiftShader issue tracker, one for VariablePointers support and one for VK_KHR_vulkan_memory_model support (https://issuetracker.google.com/issues/176819537 and https://issuetracker.google.com/issues/176819536).

@XAMPPRocky changed the title from "Try using Mesa LLVMpipe for no-GPU testing" to "Support headless testing on CI" on Jan 7, 2021
@repi
Contributor Author

repi commented Feb 9, 2021

Looks like at least the VK_KHR_vulkan_memory_model issue has been fleshed out further and some type of support is in, but on the VK_KHR_vulkan_memory_model issue the devs had some more questions.

Would be so nice to be able to get in some initial headless CI testing of Rust GPU in this repo!

And if it can work here for our example/test shaders, I wonder what would be needed for our more advanced Ark Rust shaders.

@eddyb
Contributor

eddyb commented Feb 11, 2021

One thing that I've noticed more recently is Mesa's lavapipe (see this Phoronix article, or the Mesa 20.3 changelog). It was always present while I was trying to get a (mixed-vendor) multi-GPU setup working, and I suspect it might be easier to get access to in CI: we just need offscreen render tests and a distro with recent Mesa packages, IIUC.

Though testing both SwiftShader and lavapipe wouldn't be a bad idea, we could require that at least one of them passes, in case their limitations start showing (or we trigger bugs in them etc.).
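
A minimal Python sketch of that "at least one passes" rule, assuming each software driver's test run has already been reduced to a pass/fail flag. The function name and the driver labels are hypothetical.

```python
def ci_verdict(results):
    """Decide the overall outcome from per-driver results.

    results maps a driver label (e.g. "swiftshader", "lavapipe") to True if
    the render tests passed on that driver. Returns (overall_ok, passed),
    where passed is the sorted list of drivers that passed.
    """
    if not results:
        raise ValueError("no driver results given")
    passed = sorted(name for name, ok in results.items() if ok)
    return bool(passed), passed
```

For example, ci_verdict({"swiftshader": False, "lavapipe": True}) reports success with lavapipe as the passing driver, and the failing driver can still be logged so that regressions in it don't go unnoticed.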


Speaking of the multi-GPU setup, I didn't get to play a lot with it, but I did get it to the point where I have an Intel iGPU, and two dGPUs (both of them small low-profile cards, mostly so that I don't waste a lot of power nor need an expensive/loud PSU):

  • Nvidia Kepler (Quadro K600)
  • AMD GCN1 / Oland (Radeon R5 240)
(screenshot from `vulkaninfo`, showing all 3 GPUs)

The machine has no persistent storage and boots off the network (from my main workstation), and I could easily set something up to run a test via SSH with no monitors attached - but I need to change the runners to support offscreen rendering first.
If you want to follow along, I've kept my configuration in https://github.com/LykenSol/GPU-TestBox.

@repi closed this as not planned on Nov 21, 2022