Compile bug: Vulkan can not work on Android (cross-compilation from Linux) - Aborted without explanation #11327
Comments
It's hard to be sure with the multithreaded compiles, but it seems like it's having trouble with the matmul_id shaders. Can you try enabling the Vulkan Validation Layers? Also, can you try disabling mul_mat_id_l/m/s in ggml_vk_get_device? |
I set -DGGML_VULKAN_VALIDATE=ON, disabled mul_mat_id_l/m/s in ggml_vk_get_device in ggml-vulkan.cpp, and compiled without multithreading. It still aborts as before. The point where it aborts (the last ggml_vk_create_pipeline) is different every time. Once in a while it also just hangs instead of aborting... Quirky... |
Qualcomm GPUs are known not to work with the Vulkan backend yet; there are a number of issues about it. |
I see... I thought it had been fixed by now, as my impression was that some people had been able to find their way around it. Is there an ongoing discussion about it that I can join? I can contribute as well. Or are GitHub Issues and Discussions here all we've got? |
I wonder why AI on mobile, such as using the Qualcomm Adreno GPU, has not become a focus for more people. Running it on the edge seems to be the next big step in AI, no? |
I believe we mentioned this before. |
There's the OpenCL backend, but also @slp is looking into implementing/optimizing Vulkan for embedded GPUs. GitHub issues and discussions are all we've got, basically. |
I did try your team's implementation before. It also had an issue. I opted to focus on Vulkan, as it seems to have better future support for different kinds of GPUs, mainly Mali and Adreno... |
I can contact him to ask how far along the development is, and whether there is already a TODO list for the topic. |
FWIW, I'm getting something ready and I plan to open a PR for discussion next week. Hardest part is ensuring the code doesn't get too convoluted, but I think it can be done. |
Is there a branch I can already take a look at? That sounds great, and let me know if there is something I can do apart from testing it on my device 😄 |
@samkoesnadi Which model gave you that error with OpenCL? |
I tried with two models with the following details:
This gave me the following error
And this one only gave me
After further investigation, I noticed this in the debug log:
I wonder why it reports 0 MiB free when around 3 GiB of RAM is still available.
Here is the complete log:
|
It might be better to have a separate issue to track the OpenCL backend failure. For the Vulkan backend, please try #11406. It'll only compile the shaders that are actually needed and might dodge some problems. |
I will try that PR out, many thanks! |
This works as a fix for LLMs! There is only one small error, which occurs when all layers are offloaded to the GPU: Command prompt:
If I set -ngl to 28, it works like a charm, with eval being faster than the CPU counterpart (2.25 vs 3.09 tokens/s). However, if -ngl is too low, such as 1, the speed is actually worse than on the CPU, at around 1.2 tokens/s. I assume this is expected. Using almost 100% of the GPU has one downside: the phone becomes laggy. |
The VLM itself is trickier to try at the moment, since the visual projector defaults to the CPU. I will look into it later. |
Hmm, I'm not able to reproduce this error on my system. I had a bug that caused this symptom in my first commit, but I had fixed it before asking you to test. |
#11406 is merged now. Are you still seeing the out_of_range exception at TOT? I tried for a while today and still haven't been able to reproduce that. |
I tried, and it is still there. Caches were already cleared and all. I am trying to debug it now. |
Thanks. In an earlier version there was a pointer getting nulled out in load_shaders that was freeing a previously compiled pipeline. Maybe it's something like that, but I couldn't find anything by inspection. |
I used lldb to debug it; log snippet below. Perhaps the first index of each pair in the device->pipeline_descriptor_set_requirements list has to match device->pipelines, which is currently not reflected in the code? The fact that the issue happens on my system and not yours might also be related to how the code behaves on my system, which does not support many shader features. This is as far as I can understand for now. That it works with partial layer offload but not with all layers offloaded is something I cannot yet make sense of.
|
Can you enable GGML_VULKAN_DEBUG and share the log? |
Here you go... I attached the full log in a txt file. Below is the snippet of the last log before it gets aborted:
|
Thanks. I see 18 unique pipelines being requested, all 18 call into ggml_vk_create_pipeline_func, and all but three of them made it to ggml_pipeline_allocate_descriptor_sets. Please also add a VK_LOG_DEBUG right before the crash to print out the name of the crashing pipeline. And please also add a log around this line to see if the pipeline is added to the map:
I still don't understand where things are going wrong. Maybe the compile fails and somehow doesn't end up in the map? Or maybe pipeline_descriptor_set_requirements is messed up somehow. |
I have a pretty good guess as to what's happening. I think the pipeline creation fails, probably for the mul_mat_vec_q6_k_f32_f32_1 pipeline (the other two possibilities were mul_f32 and rms_norm_f32), vulkan.hpp turns the failure into an exception, std::future silently swallows the exception and the pipeline ends up not being in device->pipelines leading to the out of range exception. The pipeline creation failure is likely a driver/compiler bug. I'll add some code to catch the exception and make the failure more obvious. If you want you could try experimenting with the shader to see if you can get it to successfully compile. It's a shame it's q6_k that's broken, I think that's pretty common as a final layer in many networks. |
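The failure mode guessed at above can be reproduced in isolation. The sketch below is an assumption-laden illustration, not the actual backend code: `Pipeline`, `PipelineMap` and `compile_all` are hypothetical, a thrown `std::runtime_error` stands in for the exception vulkan.hpp would raise, and mul_mat_vec_q6_k_f32_f32_1 is picked as the failing pipeline only because it is the likeliest candidate named above. When a `std::future` is only waited on, the exception stored in it is never rethrown, so the failed pipeline is silently missing from the map and a later `at()` throws out_of_range.

```cpp
#include <future>
#include <map>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real types live in ggml-vulkan.cpp.
struct Pipeline { std::string name; };
using PipelineMap = std::map<std::string, std::shared_ptr<Pipeline>>;

// Creates each requested pipeline in its own task. A request flagged `true`
// throws, standing in for a driver/compiler failure. With check_results ==
// false the futures are only waited on, so the stored exception is never
// rethrown. Returns the names of the pipelines whose failure was observed.
std::vector<std::string> compile_all(PipelineMap &pipelines, bool check_results) {
    const std::vector<std::pair<std::string, bool>> requests = {
        {"mul_f32", false},
        {"rms_norm_f32", false},
        {"mul_mat_vec_q6_k_f32_f32_1", true},  // assumed to be the failing one
    };
    std::mutex map_mutex;  // guards concurrent insertion into the map
    std::vector<std::pair<std::string, std::future<void>>> jobs;
    for (const auto &req : requests) {
        const std::string name = req.first;
        const bool fail = req.second;
        jobs.emplace_back(name, std::async(std::launch::async,
            [&pipelines, &map_mutex, name, fail] {
                if (fail) throw std::runtime_error("pipeline creation failed: " + name);
                std::lock_guard<std::mutex> lock(map_mutex);
                pipelines[name] = std::make_shared<Pipeline>(Pipeline{name});
            }));
    }
    std::vector<std::string> failed;
    for (auto &job : jobs) {
        if (!check_results) { job.second.wait(); continue; }  // exception stays hidden
        try { job.second.get(); }                             // get() rethrows it
        catch (const std::exception &) { failed.push_back(job.first); }
    }
    return failed;
}
```

With `check_results` set to false, the map ends up with two entries and `pipelines.at("mul_mat_vec_q6_k_f32_f32_1")` throws `std::out_of_range`. With it set to true, the failing name is reported directly, which is the kind of "more obvious" failure described above.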
Oh yeah, that's probably it. We've had issues with Qualcomm's shader compiler before. It crashed on (non-threadgroup-uniform) branches on loads/stores to/from global/shared memory, if I remember correctly. It started working after removing the branches, but probably came back after some optimization work on the mmv shaders. |
The fact that it comes from Qualcomm's shader compiler just made it a more difficult fix, I guess. I currently have some other todos, but will get to it after... |
@samkoesnadi when you get a chance, please try #11436 and verify it prints a useful message like:
|
Termux: Developer options -> enable "Disable child process restrictions"
|
Git commit
2139667
Operating systems
Linux, Other? (Please let us know in description)
GGML backends
Vulkan
Problem description & steps to reproduce
I have followed all the instructions and all existing solutions for building with Vulkan on Android via cross-compilation. I just cannot seem to make it work. The CLI just aborts without explanation.
My phone is a Redmi Note 13 Pro 5G, with a Qualcomm CPU and Adreno GPU.
Operating system used to cross-compile: Linux. I also tried cross-compiling on Windows, with the exact same issue.
NDK versions 26 and 28 give the same result.
I have attached the log output below. Thank you in advance!
First Bad Commit
No response
Compile command
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=latest -DCMAKE_C_FLAGS=-march=armv8.4a+dotprod -DGGML_VULKAN=ON -DGGML_VULKAN_CHECK_RESULTS=OFF -DGGML_VULKAN_DEBUG=ON -DGGML_VULKAN_MEMORY_DEBUG=ON -DGGML_VULKAN_SHADER_DEBUG_INFO=ON -DGGML_VULKAN_PERF=OFF -DGGML_VULKAN_VALIDATE=OFF -DGGML_VULKAN_RUN_TESTS=OFF -DVK_USE_PLATFORM_ANDROID_KHR=ON -B build-android
cmake --build build-android --config Release -j8
cmake --install build-android --prefix install-android --config Release
adb push install-android /data/local/tmp/
Relevant log output