Be able to share compiled OpenCL binary caches among identical GPUs #187
base: master
…uring a multi-device rendering
Can somebody give me a hint about the Jenkins build report?
Baikal/Utils/cl_program.cpp
Outdated
@@ -262,6 +281,9 @@ CLWProgram CLProgram::GetCLWProgram(const std::string &opts)
// Save binaries
result.GetBinaries(0, binary);
SaveBinaries(cached_program_path, binary);

// Block other workers until binary cache generated
cache_lock.unlock();
This unlock is excessive since the cache_lock goes out of scope right after the call.
Removed
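For readers following the thread, here is a minimal sketch of why the explicit unlock was redundant. It assumes `cache_lock` is a `std::unique_lock` (the diff does not show its declaration), and `CountingMutex` and both function names are hypothetical, introduced only to make the lock and unlock calls observable.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical lockable that only counts calls, so the effect is observable.
struct CountingMutex
{
    int lock_calls = 0;
    int unlock_calls = 0;
    void lock() { ++lock_calls; }
    void unlock() { ++unlock_calls; }
};

// Variant from the outdated diff: explicit unlock right before scope exit.
void SaveWithExplicitUnlock(CountingMutex &m)
{
    std::unique_lock<CountingMutex> cache_lock(m);
    // ... GetBinaries / SaveBinaries would run here ...
    cache_lock.unlock();
    assert(!cache_lock.owns_lock());
}   // the destructor sees the lock is no longer owned and does nothing

// Variant after the review: rely on the destructor alone.
void SaveWithScopeExit(CountingMutex &m)
{
    std::unique_lock<CountingMutex> cache_lock(m);
    // ... GetBinaries / SaveBinaries would run here ...
}   // the destructor releases the lock here
```

Both variants lock once and unlock once, so dropping the explicit call changes nothing as long as no other work follows it in the same scope.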
Baikal/Utils/cl_program.h
Outdated
@@ -95,5 +96,7 @@ namespace Baikal
CLWContext m_context;
std::set<std::string> m_included_headers; ///< Set of included headers

static std::unordered_map<std::string, std::shared_ptr<std::mutex>> s_binary_cache_names;
It's not a part of the class interface, so it might be moved into the .cpp file.
Moved into cl_program.cpp
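A rough sketch of what such a move can look like, assuming the map ends up in an unnamed namespace of the .cpp file. The guard mutex and the `AcquireCacheMutex` helper are hypothetical and are not taken from the PR.

```cpp
// cl_program.cpp (sketch): nothing about the map has to appear in cl_program.h.
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

namespace
{
    // Internal linkage: visible only inside this translation unit.
    std::unordered_map<std::string, std::shared_ptr<std::mutex>> s_binary_cache_names;

    // Hypothetical guard, since several workers may touch the map at once.
    std::mutex s_binary_cache_names_guard;
}

// Hypothetical helper: returns the mutex that belongs to one cache file name,
// creating it on first use.
std::shared_ptr<std::mutex> AcquireCacheMutex(const std::string &cache_name)
{
    std::lock_guard<std::mutex> guard(s_binary_cache_names_guard);
    std::shared_ptr<std::mutex> &slot = s_binary_cache_names[cache_name];
    if (!slot)
    {
        slot = std::make_shared<std::mutex>();
    }
    assert(slot != nullptr);
    return slot;
}
```

Keeping the map out of the header also means that files including cl_program.h do not need `<mutex>` or `<unordered_map>` for this detail.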
Hello, according to the build logs, the test 'AovTest.Aov_ObjectId' failed. You should be able to reproduce this issue on Ubuntu/Windows machines. It looks like a permanent failure on all of them. Try running the Baikal unit tests.
Thanks, I will look into it.
The |
Yes, it seems the only way to solve the issue is to update the unit test reference image. Also, static is no longer required once m_default_material is placed inside the scene object.
This change solves the following issue:
Each worker thread uses its own instance of default_material in ClwSceneController, and each instance has its own SceneObject id.
A different SceneObject id means a different inputmap id, which leads to a different inputmap.cl, so identical GPUs end up with different binary caches.
Changing default_material from a member variable to a static variable solves this issue.
However, if no cache exists yet, all the GPUs will try to compile and generate the very same binary cache at the same time.
This may cause a huge spike in memory consumption and a lot of wasted work.
Added a lock mechanism that allows only one worker to compile and generate the binary cache, while the others simply wait for it.
Workers with a different GPU/CPU model are not affected and can still generate their own version of the kernel binaries.
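The lock mechanism described above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: every name is hypothetical, an in-memory map stands in for the cache files on disk, a string stands in for the compiled binary, and the cache name is assumed to already encode the device model.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

std::mutex g_registry_guard;   // protects g_cache_locks
std::unordered_map<std::string, std::shared_ptr<std::mutex>> g_cache_locks;
std::mutex g_disk_guard;       // protects g_fake_disk
std::unordered_map<std::string, std::string> g_fake_disk;
std::atomic<int> g_compile_count{0};

// One mutex per cache name, created on first request.
std::shared_ptr<std::mutex> LockFor(const std::string &cache_name)
{
    std::lock_guard<std::mutex> guard(g_registry_guard);
    std::shared_ptr<std::mutex> &slot = g_cache_locks[cache_name];
    if (!slot) slot = std::make_shared<std::mutex>();
    return slot;
}

bool LoadBinary(const std::string &cache_name, std::string &binary)
{
    std::lock_guard<std::mutex> guard(g_disk_guard);
    auto it = g_fake_disk.find(cache_name);
    if (it == g_fake_disk.end()) return false;
    binary = it->second;
    return true;
}

void SaveBinary(const std::string &cache_name, const std::string &binary)
{
    std::lock_guard<std::mutex> guard(g_disk_guard);
    g_fake_disk[cache_name] = binary;
}

std::string GetProgram(const std::string &cache_name)
{
    // Workers that share a cache name queue up here; others pass through.
    std::shared_ptr<std::mutex> cache_mutex = LockFor(cache_name);
    std::unique_lock<std::mutex> cache_lock(*cache_mutex);

    std::string binary;
    if (LoadBinary(cache_name, binary)) return binary;  // cache hit

    ++g_compile_count;                     // stands in for the real compile
    binary = "binary-for-" + cache_name;
    SaveBinary(cache_name, binary);
    return binary;
}

// Starts one worker per entry and returns how many compiles actually ran.
int RunWorkers(const std::vector<std::string> &cache_names)
{
    {
        std::lock_guard<std::mutex> guard(g_disk_guard);
        g_fake_disk.clear();
    }
    g_compile_count = 0;

    std::vector<std::thread> workers;
    for (const std::string &name : cache_names)
    {
        workers.emplace_back([name] {
            std::string binary = GetProgram(name);
            assert(!binary.empty());
        });
    }
    for (std::thread &worker : workers) worker.join();
    return g_compile_count.load();
}
```

With three workers sharing one cache name and a fourth using a different one, `RunWorkers` reports two compiles: one per distinct name, with the remaining workers loading the saved binary after waiting.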