Does performance depend on the shapes of the matrices being multiplied? Why is [34,2048]*[2048,5632] much faster than [34,5632]*[5632,2048]? They have the same FLOPs. #48

Open
lijianxing123 opened this issue Sep 23, 2024 · 1 comment
Labels
question Further information is requested

Comments

@lijianxing123

No description provided.

@kaleid-liner
Collaborator

Can you provide more data? The former can be slightly faster due to precomputation. However, according to our profiling, the latencies of [32, 4096] x [4096, 11008] and [32, 11008] x [11008, 4096] are very close.
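For reference, a minimal sketch of the shape arithmetic behind the question, in plain Python. The helper name `matmul_stats` and its output fields are hypothetical and not part of this project. It only confirms that the two shapes have identical FLOPs and weight sizes while the activation size, output size, and reduction length differ. It does not measure latency or say which of those differences, if any, causes the gap.

```python
def matmul_stats(m, k, n):
    """Count FLOPs and operand sizes for an [m, k] x [k, n] matmul."""
    return {
        "flops": 2 * m * k * n,        # one multiply and one add per (m, k, n) triple
        "activation_elems": m * k,     # left-hand input
        "weight_elems": k * n,         # right-hand input
        "output_elems": m * n,         # result
        "reduction_len": k,            # length of each dot product
    }

# The two shapes from the issue title.
for shape in [(34, 2048, 5632), (34, 5632, 2048)]:
    print(shape, matmul_stats(*shape))
```

Both shapes report the same `flops` and `weight_elems`; only `activation_elems`, `output_elems`, and `reduction_len` change, so any latency difference would have to come from how the kernel handles those.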

@kaleid-liner kaleid-liner added the question Further information is requested label Sep 24, 2024