Practice writing high-performance kernels:
- FlashAttention
- LayerNorm
- RMSNorm
- Split
- Cat
- Gemm
- Gemv
- SoftMax
- Gelu
- Silu
- Swiglu
- Add
- Mul
- Permute
- LlamaRotatePosition2D
- Reduce
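
As a possible starting point for the SoftMax and FlashAttention entries, here is a minimal reference sketch in plain Python. It is not a tuned kernel. It compares the usual two-pass, max-subtracted softmax with a single-pass "online" formulation that keeps a running maximum and a rescaled running sum. The function names `softmax_two_pass` and `softmax_online` are hypothetical, chosen only for illustration, and the sketch assumes a non-empty list of finite floats.

```python
import math


def softmax_two_pass(xs):
    """Reference softmax: one pass to find the max, then exponentiate and normalize.

    Hypothetical helper; assumes xs is a non-empty list of finite floats.
    """
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def softmax_online(xs):
    """Softmax whose max and normalizer are accumulated in a single pass.

    Hypothetical helper; assumes xs is a non-empty list of finite floats.
    """
    running_max = float("-inf")
    running_sum = 0.0
    for x in xs:
        new_max = max(running_max, x)
        # Rescale the sum accumulated so far to the new maximum,
        # then add the contribution of the current element.
        running_sum = running_sum * math.exp(running_max - new_max) + math.exp(x - new_max)
        running_max = new_max
    # Final pass only normalizes; max and sum are already known.
    return [math.exp(x - running_max) / running_sum for x in xs]
```

A reference like this can serve as a correctness check for a hand-written kernel: run both on the same input and compare the results within a small tolerance.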