
KernelPractice

Practice writing high-performance kernels.

CUDA

  • FlashAttention
  • LayerNorm
  • RMSNorm
  • Split
  • Cat
  • Gemm
  • Gemv
  • SoftMax
  • Gelu
  • Silu
  • Swiglu
  • Add
  • Mul
  • Permute
  • LlamaRotatePosition2D
  • Reduce

CPU

  • FlashAttention
  • LayerNorm
  • RMSNorm
  • Split
  • Cat
  • Gemm
  • Gemv
  • SoftMax
  • Gelu
  • Silu
  • Swiglu
  • Add
  • Mul
  • Permute
  • LlamaRotatePosition2D
  • Reduce

OpenCL

  • FlashAttention
  • LayerNorm
  • RMSNorm
  • Split
  • Cat
  • Gemm
  • Gemv
  • SoftMax
  • Gelu
  • Silu
  • Swiglu
  • Add
  • Mul
  • Permute
  • LlamaRotatePosition2D
  • Reduce
