This repository contains materials for the Efficient Deep Learning Systems course taught at the [Faculty of Computer Science](https://cs.hse.ru/en/) of [HSE University](https://www.hse.ru/en/) and [Yandex School of Data Analysis](https://academy.yandex.com/dataschool/).
__This branch corresponds to the ongoing 2025 course. If you want to see full materials of past years, see the ["Past versions"](#past-versions) section.__
# Syllabus
- [__Week 1:__](./week01_intro) __Introduction__
  - Lecture: Course overview and organizational details. Core concepts of the GPU architecture and CUDA API.
  - Seminar: CUDA operations in PyTorch. Introduction to benchmarking.

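The benchmarking theme of the seminar can be sketched in a few lines (the `benchmark` helper is ours, not from the course materials): CUDA kernels launch asynchronously, so GPU code must be timed with CUDA events or an explicit synchronize rather than wall-clock time around the launch.

```python
import time
import torch

def benchmark(fn, *args, warmup: int = 10, iters: int = 100) -> float:
    """Average runtime of fn(*args) in milliseconds."""
    for _ in range(warmup):  # warm-up excludes one-time costs (allocator, autotuning)
        fn(*args)
    if torch.cuda.is_available():
        # CUDA kernels are asynchronous: measure with events on the stream,
        # then synchronize before reading the elapsed time.
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(iters):
            fn(*args)
        end.record()
        torch.cuda.synchronize()
        return start.elapsed_time(end) / iters
    # CPU fallback: plain wall-clock timing is fine for synchronous ops
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - t0) * 1e3 / iters

if __name__ == "__main__":
    x = torch.randn(1024, 1024)
    print(f"matmul: {benchmark(torch.matmul, x, x):.3f} ms")
```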
- [__Week 2:__](./week02_management_and_testing) __Experiment tracking, model and data versioning, testing DL code in Python__
  - Lecture: Experiment management basics and pipeline versioning. Configuring Python applications. Intro to regular and property-based testing.
  - Seminar: Example DVC + Weights & Biases project walkthrough. Intro to testing with pytest.

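A minimal pytest example in the spirit of the seminar (the `normalize` function and the file name are made up for illustration): each test exercises one contract of the function, including its failure mode.

```python
# test_normalize.py -- run with `pytest test_normalize.py`
import numpy as np
import pytest

def normalize(x: np.ndarray) -> np.ndarray:
    """Shift and scale an array to zero mean and unit variance."""
    std = x.std()
    if std == 0:
        raise ValueError("constant input has no scale")
    return (x - x.mean()) / std

def test_normalize_statistics():
    out = normalize(np.array([1.0, 2.0, 3.0, 4.0]))
    assert abs(out.mean()) < 1e-9
    assert abs(out.std() - 1.0) < 1e-9

def test_normalize_rejects_constant_input():
    # Edge cases deserve their own test: a constant array must fail loudly.
    with pytest.raises(ValueError):
        normalize(np.zeros(8))
```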
- [__Week 3:__](./week03_fast_pipelines) __Training optimizations, profiling DL code__
  - Lecture: Mixed-precision training. Data storage and loading optimizations. Tools for profiling deep learning workloads.
  - Seminar: Automatic Mixed Precision in PyTorch. Dynamic padding for sequence data and JPEG decoding benchmarks. Basics of profiling with py-spy, PyTorch Profiler, PyTorch TensorBoard Profiler, nvprof and Nsight Systems.

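A minimal mixed-precision training step, roughly what `torch.autocast` plus a gradient scaler look like in use (the model and loss are placeholders; on machines without a GPU this sketch falls back to bfloat16 autocast on CPU, where no loss scaling is needed):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
# fp16 on GPU needs loss scaling; bf16 on CPU has fp32-like range, so the
# scaler can be disabled there.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

model = torch.nn.Linear(256, 256).to(device)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

for step in range(3):
    x = torch.randn(32, 256, device=device)
    opt.zero_grad(set_to_none=True)
    with torch.autocast(device_type=device, dtype=amp_dtype):
        loss = model(x).pow(2).mean()  # matmuls in low precision, reductions in fp32
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 gradient underflow
    scaler.step(opt)               # unscales gradients; skips the step on inf/nan
    scaler.update()
```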
- [__Week 4:__](./week04_distributed) __Basics of distributed ML__
  - Lecture: Introduction to distributed training. Process-based communication. Parameter Server architecture.

- [__Week 5:__](./week05_data_parallel) __Data-parallel training and All-Reduce__
  - Lecture: Data-parallel training of neural networks. All-Reduce and its efficient implementations.
  - Seminar: Introduction to PyTorch Distributed. Data-parallel training primitives.

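The communication pattern behind efficient All-Reduce can be illustrated with a pure-Python simulation of the ring algorithm (a toy stand-in for the real NCCL/Gloo collective, with a list of per-rank buffers instead of processes): a reduce-scatter phase followed by an all-gather phase, each taking N-1 steps for N ranks.

```python
from typing import List

def ring_all_reduce(data: List[List[float]]) -> List[List[float]]:
    """Simulate ring all-reduce: every rank ends up with the elementwise sum."""
    n = len(data)
    size = len(data[0])
    assert size % n == 0, "toy version: vector length must divide evenly into chunks"
    chunk = size // n
    bufs = [list(v) for v in data]

    def idx(c: int) -> range:
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1: reduce-scatter. At step s, rank r sends chunk (r - s) mod n to
    # its neighbor; after n-1 steps, rank r owns the full sum of chunk (r+1) mod n.
    for step in range(n - 1):
        snapshot = [list(b) for b in bufs]  # synchronous step: read pre-step state
        for r in range(n):
            c = (r - step) % n
            dst = (r + 1) % n
            for i in idx(c):
                bufs[dst][i] += snapshot[r][i]

    # Phase 2: all-gather. Circulate the reduced chunks around the ring so
    # every rank ends up holding all of them.
    for step in range(n - 1):
        snapshot = [list(b) for b in bufs]
        for r in range(n):
            c = (r + 1 - step) % n
            dst = (r + 1) % n
            for i in idx(c):
                bufs[dst][i] = snapshot[r][i]
    return bufs

if __name__ == "__main__":
    print(ring_all_reduce([[1.0, 2.0], [10.0, 20.0]]))  # [[11.0, 22.0], [11.0, 22.0]]
```

Each rank sends only `size / n` elements per step, which is why the ring algorithm's bandwidth cost stays near-optimal as the number of ranks grows.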
- [__Week 6:__](./week06_large_models) __Training large models__
  - Lecture: Model parallelism, gradient checkpointing, offloading, sharding.
  - Seminar: Gradient checkpointing and tensor parallelism in practice.

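For sequential models, gradient checkpointing in PyTorch is nearly a one-call change, sketched here on a toy MLP (layer sizes are arbitrary): activations inside each segment are dropped during the forward pass and recomputed during backward, trading roughly one extra forward pass for a large cut in activation memory.

```python
import torch
from torch.utils.checkpoint import checkpoint_sequential

# A deep sequential stack; with 4 segments, only segment-boundary activations
# stay alive through the forward pass -- the rest are recomputed in backward.
model = torch.nn.Sequential(*[torch.nn.Linear(256, 256) for _ in range(8)])
x = torch.randn(4, 256, requires_grad=True)

out = checkpoint_sequential(model, 4, x, use_reentrant=False)
out.sum().backward()  # recomputation happens here, segment by segment
```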
- [__Week 7:__](./week07_application_deployment) __Python web application deployment__
  - Lecture/Seminar: Building and deployment of production-ready web services. App & web servers, Docker, Prometheus, API via HTTP and gRPC.

27
-
-[__Week 8:__](./week08_inference_software)__LLM inference optimizations and software__
28
-
- Lecture: Inference speed metrics. KV caching, batch inference, continuous batching. FlashAttention with its modifications and PagedAttention. Overview of popular LLM serving frameworks.
29
-
- Seminar: Basics of the Triton language. Layer fusion in PyTorch and Triton. Implementation of KV caching, FlashAttention in practice.
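The core idea of KV caching fits in a few lines: a toy single-head attention without projections (the shapes and the `attend` helper are ours, for illustration only), where each decode step appends one key/value row to the cache instead of recomputing attention inputs for the whole prefix.

```python
import torch

def attend(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # q: (1, d); k, v: (t, d) -- scaled dot-product attention over the cached prefix
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

d = 64
k_cache = torch.empty(0, d)
v_cache = torch.empty(0, d)

for step in range(5):
    # Pretend these come from the model's projections of the newest token.
    q, k_new, v_new = torch.randn(1, d), torch.randn(1, d), torch.randn(1, d)
    # Append one row per step: decode step t does O(t) attention work instead
    # of re-running the whole prefix through the model.
    k_cache = torch.cat([k_cache, k_new])
    v_cache = torch.cat([v_cache, v_new])
    out = attend(q, k_cache, v_cache)
```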
- [__Week 9:__](./week09_compression) __Efficient model inference__
  - Lecture: Hardware utilization metrics for deep learning. Knowledge distillation, quantization, LLM.int8(), SmoothQuant, GPTQ. Efficient model architectures. Speculative decoding.
  - Seminar: Measuring Memory Bandwidth Utilization in practice. Data-free quantization, GPTQ, and SmoothQuant in PyTorch.

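Post-training quantization in its simplest built-in PyTorch form is dynamic int8 quantization of Linear layers, sketched below on an arbitrary placeholder model (the seminar's GPTQ and SmoothQuant go well beyond this): weights are stored in int8, activations are quantized on the fly per batch.

```python
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(128, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
).eval()

# Replaces every nn.Linear with a dynamically quantized version: int8 weights,
# activations quantized at runtime before each matmul.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
with torch.no_grad():
    err = (model(x) - qmodel(x)).abs().max().item()
print(f"max abs error after int8 quantization: {err:.4f}")
```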
- [__Week 10:__](./week10_invited) __MLOps, k8s, GitOps and other acronyms__ by [Gleb Vazhenin](https://github.com/punkerpunker), Bumble
# Week 1: Introduction
* Lecture: [link](./lecture.pdf)
* Seminar: [link](./seminar.ipynb)
## Further reading
* [CUDA MODE reading group Resource Stream](https://github.com/cuda-mode/resource-stream)
* [CUDA Programming Guide](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) and [CUDA C++ Best Practices Guide](https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html)