GPU: Improvements for sorting / thrust external allocator #14103
@mconcas: To use the Thrust external allocator, see https://github.com/davidrohr/AliceO2/blob/c9e82dd6c8452852ea2e2eadfff5c4ac9887c901/GPU/Common/GPUCommonAlgorithmThrust.h#L96. The way it works is:
c9e82dd to 9ceb34e
Performance problem fixed.
Error while checking build/O2/fullCI_slc9 for c9e82dd at 2025-03-24 18:55.
Error while checking build/O2/fullCI_slc9 for 9ceb34e at 2025-03-24 20:27.
9ceb34e to 471fdbf
…s not working any more
…on device from host
…ze the last kernel
471fdbf to 1627d1a
Unfortunately, this PR causes a major performance regression both on my NVIDIA GPU and on EPNs. I don't understand why yet. Just want to check it in the CI, and show @mconcas how to use the external allocator.
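For readers unfamiliar with the idea, here is a rough, host-only sketch of what an "external allocator" for temporary storage can look like. It is not the code from this PR. Thrust's custom temporary allocation mechanism expects an allocator object with a `value_type` of `char` plus `allocate` and `deallocate` members. The sketch below gives a type of that shape, which hands out chunks from a buffer owned by the caller instead of allocating memory itself. The name `ExternalBufferAllocator`, the 256-byte rounding, and the `reset()` helper are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical illustration, not taken from the PR: a bump allocator that
// serves temporary-storage requests from a buffer owned by the caller.
struct ExternalBufferAllocator {
  using value_type = char; // requests are expressed in bytes

  char* base;           // start of the externally owned buffer
  std::size_t capacity; // total number of bytes in that buffer
  std::size_t used = 0; // bytes handed out so far

  ExternalBufferAllocator(char* buffer, std::size_t bytes)
    : base(buffer), capacity(bytes)
  {
    assert(base != nullptr);
  }

  // Return the next free chunk; throw if the external buffer is exhausted.
  char* allocate(std::ptrdiff_t numBytes)
  {
    if (numBytes < 0) {
      throw std::bad_alloc();
    }
    // Round up to a multiple of 256 bytes (assumed alignment granularity).
    const std::size_t aligned =
      (static_cast<std::size_t>(numBytes) + 255) & ~static_cast<std::size_t>(255);
    if (aligned > capacity - used) {
      throw std::bad_alloc();
    }
    char* ptr = base + used;
    used += aligned;
    return ptr;
  }

  // Individual chunks are never freed; the owner recycles the whole buffer.
  void deallocate(char*, std::size_t) {}

  // Make the full buffer available again, e.g. after the sort has finished.
  void reset() { used = 0; }
};
```

In a real GPU setting the buffer would live in device memory, and an object of this shape would be handed to Thrust through the execution policy, for example `thrust::cuda::par(alloc)`. That is the pattern from Thrust's custom temporary allocation example, and it may differ from what `GPUCommonAlgorithmThrust.h` actually does.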