
Mu7annad0/100GPU


100 Days of GPU Challenge

This repository is part of the 100 Days of GPU Challenge, a 100-day challenge to learn GPU programming.

| Day | Kernel | Description |
|-----|--------|-------------|
| 1 | Vector Addition | Implemented a basic element-wise addition kernel using CUDA to add two vectors. Read the first two chapters of the PMPP Book. |
| 2 | Matrix Addition | Implemented a basic matrix addition kernel using CUDA to add two matrices. |
| 3 | RGB to Grayscale Conversion | Implemented an RGB to grayscale conversion kernel using CUDA. Read the first two sections of Chapter 3 of the PMPP Book. |
| 4 | Blur an RGB Image | Implemented a blur kernel for RGB images using CUDA. Read section 3 from the PMPP Book, and also this blog. |
| 5 | Matrix Multiplication | Implemented a matrix multiplication kernel using CUDA. Finished Chapter 3 of the PMPP Book. |
| 6 | Matrix Transpose | Implemented a matrix transpose kernel using CUDA. Started reading Chapter 4 and gained an understanding of the architecture of modern CUDA-capable GPUs, including block scheduling, synchronization, and transparent scalability. |
| 7 | Softmax | Implemented the softmax function with CUDA. |
| 8 | ReLU | Implemented a ReLU kernel using CUDA. Finished Chapter 4. Gained an understanding of warp scheduling, latency tolerance, and control divergence. |
| 9 | Tiled Matrix Multiplication | Implemented a tiled matrix multiplication kernel using shared memory. |
| 10 | GeLU | Implemented a GeLU kernel using CUDA. Finished Chapter 5 and learned about the different types of CUDA memory and how tiling helps reduce memory traffic. |
| 11 | Conv1D | Implemented 1D convolution with shared memory. |
| 12 | Online Softmax | Implemented online softmax. |
| 13 | Softmax (Shared Memory) | Implemented softmax with shared memory using CUDA. |
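
For readers new to CUDA, the Day 1 entry (element-wise vector addition) can be sketched roughly as follows. This is a minimal illustration, not the code from this repository: the names `vecAddKernel` and `vecAdd`, the block size of 256, and the host wrapper are all assumptions, and error checking on the CUDA runtime calls is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread computes one output element.
__global__ void vecAddKernel(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may extend past the end of the data
        c[i] = a[i] + b[i];
    }
}

// Hypothetical host wrapper: copy inputs to the device, launch, copy the result back.
// Assumes a and b have the same length.
std::vector<float> vecAdd(const std::vector<float>& a, const std::vector<float>& b) {
    int n = static_cast<int>(a.size());
    std::vector<float> c(n);
    if (n == 0) return c;

    size_t bytes = n * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // ceiling division covers all n elements
    vecAddKernel<<<blocks, threads>>>(dA, dB, dC, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(c.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return c;
}
```

The bounds check `i < n` matters because the grid is rounded up to a whole number of blocks, so some threads in the last block usually have no element to process.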
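
The Day 9 and Day 10 entries mention tiling with shared memory to reduce memory traffic. The sketch below shows one common way to do this for square `n x n` row-major matrices. It is an illustration under assumptions, not this repository's kernel: `TILE = 16`, the names `tiledMatMulKernel` and `tiledMatMul`, and the square-matrix restriction are all choices made for the example, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int TILE = 16;  // assumed tile width; each block has TILE x TILE threads

__global__ void tiledMatMulKernel(const float* A, const float* B, float* C, int n) {
    __shared__ float tileA[TILE][TILE];
    __shared__ float tileB[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    // Walk across the shared dimension one tile at a time.
    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        // Each thread loads one element of each tile; out-of-range slots are zero-padded.
        tileA[threadIdx.y][threadIdx.x] = (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        tileB[threadIdx.y][threadIdx.x] = (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();  // wait until both tiles are fully loaded

        for (int k = 0; k < TILE; ++k) {
            acc += tileA[threadIdx.y][k] * tileB[k][threadIdx.x];
        }
        __syncthreads();  // wait before the tiles are overwritten in the next iteration
    }

    if (row < n && col < n) {
        C[row * n + col] = acc;
    }
}

// Hypothetical host wrapper for square n x n matrices stored in row-major order.
std::vector<float> tiledMatMul(const std::vector<float>& A, const std::vector<float>& B, int n) {
    std::vector<float> C(static_cast<size_t>(n) * n);
    if (n == 0) return C;

    size_t bytes = C.size() * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, A.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, B.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    tiledMatMulKernel<<<grid, block>>>(dA, dB, dC, n);

    cudaMemcpy(C.data(), dC, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return C;
}
```

In this layout each element of `A` and `B` is read from global memory about `n / TILE` times instead of `n` times, because a loaded tile is reused by every thread in the block. That reuse is the memory-traffic saving that tiling provides.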
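
The online softmax from Day 12 computes the row maximum and the normalizer in a single pass by rescaling the running sum whenever a larger maximum is found. The sketch below is deliberately simple and unoptimized: it assigns one thread per row, assumes finite inputs, and uses hypothetical names (`onlineSoftmaxKernel`, `onlineSoftmax`). It is not this repository's implementation, and error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// One thread per row (simple, not tuned for performance).
__global__ void onlineSoftmaxKernel(const float* in, float* out, int rows, int cols) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= rows) return;

    const float* x = in + row * cols;
    float* y = out + row * cols;

    float m = -INFINITY;  // running maximum
    float d = 0.0f;       // running sum of exp(x[j] - m)
    for (int j = 0; j < cols; ++j) {
        float mNew = fmaxf(m, x[j]);
        // Rescale the old sum to the new maximum, then add the current term.
        d = d * expf(m - mNew) + expf(x[j] - mNew);
        m = mNew;
    }

    // Second pass writes the normalized values.
    for (int j = 0; j < cols; ++j) {
        y[j] = expf(x[j] - m) / d;
    }
}

// Hypothetical host wrapper: row-wise softmax of a rows x cols row-major matrix.
std::vector<float> onlineSoftmax(const std::vector<float>& in, int rows, int cols) {
    std::vector<float> out(in.size());
    if (in.empty()) return out;

    size_t bytes = in.size() * sizeof(float);
    float *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, bytes);
    cudaMalloc(&dOut, bytes);
    cudaMemcpy(dIn, in.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 128;  // assumed block size
    int blocks = (rows + threads - 1) / threads;
    onlineSoftmaxKernel<<<blocks, threads>>>(dIn, dOut, rows, cols);

    cudaMemcpy(out.data(), dOut, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);
    return out;
}
```

Compared with the usual three-pass softmax (find the maximum, sum the exponentials, normalize), this version fuses the first two passes into one while keeping the same numerical stability, since every exponent is taken relative to the current maximum.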

About

100 Days of CUDA: Optimizing My Life, One Kernel at a Time. 🔄🔥
