Request: ARM SME support (for Apple M4).. #4715
Comments
PRs welcome... do you have the hardware to test?
Not yet.. waiting for a Mac mini M4..
This comment was marked as outdated.
Try reading that again, it's about SME...
You can develop and test using the Fixed Virtual Platform (FVP):
Some implementation hints there: https://scalable.uni-jena.de/opt/sme/index.html
Hi,
I bought an M4 mini myself recently but have not gotten around to doing much with it yet.
According to the SC24 workshop "Hello SME", llvm/llvm-project#114987 and llvm/llvm-project#95478, the Apple M4 does not support SVE outside of streaming mode.
Furthermore, I recently ran some tests on the differences between SME1 and SME2. Achieving the best performance takes quite a different approach on each.
Yes, M4 only does streaming SVE, so you'd need at least some setup code to enter streaming mode and perhaps save some dual-use registers beforehand, or even work in a totally different set of registers than what the existing SVE code uses. Both #5011 and #5084 introduced an ARMV9SME target for differentiation; it would also be possible to select kernel implementations (either at the KERNEL file level or within individual implementations) based on HAVE_SME or a similar define. As #5011 is a WIP concerned only with GEMM and related functions, it does not work outside its narrow scope. The way forward, at least short-term, should be to split out M4 from the general "VORTEX" target into its own designation and enable the SME-based "small gemm" pathway for it. I hope to complete this very soon.
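To make the "select within an individual implementation" option concrete, here is a minimal sketch of compile-time dispatch on a HAVE_SME define. Only the HAVE_SME name comes from the comment above; the function names (`sgemm_small`, `sgemm_small_generic`, `sgemm_small_sme`) are hypothetical and are not OpenBLAS symbols, and the SME variant is a stand-in that simply forwards to the generic loop so that the sketch builds on any machine.

```c
#include <stddef.h>

/* Hypothetical reference kernel: C = A * B, all matrices row-major.
 * A is m x k, B is k x n, C is m x n. */
void sgemm_small_generic(size_t m, size_t n, size_t k,
                         const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t p = 0; p < k; p++)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = sum;
        }
    }
}

#if defined(HAVE_SME)
/* Placeholder only (assumption): a real SME kernel would enter streaming
 * mode and use ZA tiles here. It forwards to the generic loop so the
 * sketch stays runnable everywhere. */
void sgemm_small_sme(size_t m, size_t n, size_t k,
                     const float *a, const float *b, float *c)
{
    sgemm_small_generic(m, n, k, a, b, c);
}
#endif

/* Entry point: the kernel is chosen at compile time by the define. */
void sgemm_small(size_t m, size_t n, size_t k,
                 const float *a, const float *b, float *c)
{
#if defined(HAVE_SME)
    sgemm_small_sme(m, n, k, a, b, c);
#else
    sgemm_small_generic(m, n, k, a, b, c);
#endif
}

/* Reports which variant was compiled in, handy when debugging a build. */
const char *sgemm_small_kernel_name(void)
{
#if defined(HAVE_SME)
    return "sme";
#else
    return "generic";
#endif
}
```

Building with `-DHAVE_SME` switches the entry point to the SME variant; without it the generic loop is used. Selecting at the KERNEL file level would instead swap whole source files per target, which this sketch does not attempt to show.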
Hi, I would like to ask why I encountered the following error on M4 Pro: Is it possible that my compiler does not recognize the streaming flags? Compilation: Clang version:
Looks like the same problem mentioned above. Try using
Thank you for your kind reply. I disassembled it in lldb, and the illegal instruction turns out to be That means the streaming flags make the compiler emit SVE instructions that run outside streaming mode, where they are illegal. This is somewhat weird, because I can't manually enter streaming mode before main is called. So I removed the streaming flags from main and moved the SVE code into another function I also manually placed the invocation statement of After these changes, the code could finally run normally!
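For readers following along, the general shape of that workaround (main stays non-streaming, the vector code lives in its own function that enters streaming mode itself) might look like the sketch below. The poster's actual code is not shown in the thread, so everything here is an assumption: `dot_f32` is a hypothetical name, and the ACLE `__arm_locally_streaming` keyword is one way to confine streaming mode to a single function, not necessarily the one the poster used. The SME path is compiled only when the toolchain defines `__ARM_FEATURE_SME` and `__ARM_FEATURE_LOCALLY_STREAMING`; otherwise a scalar fallback is used. Whether the streaming path runs correctly on a given compiler version is exactly what this thread leaves open.

```c
#include <stddef.h>

#if defined(__ARM_FEATURE_SME) && defined(__ARM_FEATURE_LOCALLY_STREAMING)
#include <arm_sve.h>

/* Hypothetical helper. __arm_locally_streaming asks the compiler to enter
 * streaming mode on entry and leave it on return, so callers such as
 * main() can remain ordinary non-streaming code. */
__arm_locally_streaming
float dot_f32(const float *x, const float *y, size_t n)
{
    svfloat32_t acc = svdup_n_f32(0.0f);
    /* svcntw() is the number of 32-bit lanes per vector; inside this
     * function that is the streaming vector length. */
    for (size_t i = 0; i < n; i += svcntw()) {
        svbool_t pg = svwhilelt_b32_u64(i, n);   /* mask off the tail */
        svfloat32_t vx = svld1_f32(pg, x + i);
        svfloat32_t vy = svld1_f32(pg, y + i);
        acc = svmla_f32_m(pg, acc, vx, vy);      /* acc += vx * vy */
    }
    return svaddv_f32(svptrue_b32(), acc);       /* horizontal sum */
}
#else
/* Portable fallback so the sketch builds without SME support. */
float dot_f32(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
#endif
```

A caller would invoke `dot_f32(x, y, n)` from a plain `main` with no streaming attribute on `main` itself, which matches the arrangement described above.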
Congrats, I also tried to resolve similar issues. I encounter
The fast path for small matrix sgemm should be working on M4 with #5222 - but I'm now stuck on an illegal instruction error involving cntd/cntw myself, trying to get dot_kernel_sve working in streaming mode with the __arm_streaming attribute.
Is it a bug in the LLVM compiler?
No need for the unofficial Apple AMX instruction set on M4..
2 TFLOPS possible..