Commit 80d8931

[webgpu] Use subgroup for matmulnbits (#23224)
### Description

This PR uses subgroup operations to implement MatMulNBits when tile_m > 1 on Intel devices. With this PR, prefill time for a 500-token prompt with Phi-3 drops from 8.5s to 3.5s on Intel Meteor Lake.
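For readers unfamiliar with the idea, here is a rough CPU-side emulation in Python of how a subgroup can share loads of A across a tile of `tile_m` rows. It is only a sketch, not the shader from this PR. The function names, the low-nibble-first packing, the default zero point of 8, and the one-column-per-lane layout are all assumptions made for illustration.

```python
# Hypothetical emulation of a subgroup-based 4-bit matmul tile.
# None of these helpers come from the PR; they only illustrate the idea.

def unpack_nibbles(packed):
    """Split each byte into two 4-bit values (assumed order: low nibble first)."""
    out = []
    for byte in packed:
        out.append(byte & 0x0F)
        out.append(byte >> 4)
    return out


def dequantize(packed, scale, zero_point=8):
    """Map 4-bit codes to floats: (q - zero_point) * scale (assumed scheme)."""
    return [(q - zero_point) * scale for q in unpack_nibbles(packed)]


def subgroup_broadcast(lane_values, src_lane):
    """Emulate a subgroup broadcast: every lane reads src_lane's value."""
    return lane_values[src_lane]


def matmul_tile(a_tile, b_cols):
    """Compute a tile_m x sg_size output tile.

    a_tile: tile_m rows of A, each of length K.
    b_cols: one dequantized column of B (length K) per emulated lane.
    """
    sg_size = len(b_cols)
    tile_m = len(a_tile)
    k_len = len(a_tile[0])
    acc = [[0.0] * sg_size for _ in range(tile_m)]
    for k0 in range(0, k_len, sg_size):
        valid = min(sg_size, k_len - k0)
        for m in range(tile_m):
            # Each lane loads a single element of row m of A (zero padded).
            lane_a = [a_tile[m][k0 + lane] if lane < valid else 0.0
                      for lane in range(sg_size)]
            for src in range(valid):
                # The element is shared with all lanes instead of reloaded.
                a_val = subgroup_broadcast(lane_a, src)
                for lane in range(sg_size):
                    acc[m][lane] += a_val * b_cols[lane][k0 + src]
    return acc
```

In this emulation each element of A is loaded once per subgroup and reused by every lane, which is the kind of sharing that helps when several rows (`tile_m > 1`) are processed together, as in prefill.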
1 parent 73f5b0c commit 80d8931

4 files changed: +293 -127 lines changed
