- Lecture and seminar materials for each week are in the ./week* folders; see README.md for materials and instructions.
- For any technical issues, ideas, bugs in course materials, or contribution ideas, add an issue.
- The current version of the course is held in autumn 2024 at the CS Faculty of HSE. For versions from previous years, see the Past Versions section.
- week01 Introduction to Course
  - Lecture: Introduction to Course
  - Seminar: Experiment tracking, Hydra, Git, VS Code
  - Self-Study: Introduction to PyTorch
- week02 Introduction to Digital Signal Processing
  - Lecture: Signals, Fourier Transform, spectrograms, MelScale, MFCC
  - Seminar: DSP in practice, spectrogram creation, IRF, frequency filtering
- week03 Speech Recognition I
  - Lecture: Metrics, Datasets, Connectionist Temporal Classification (CTC), Classic Models, Beam Search, Language Models
  - Seminar: Audio Augmentations, Beam Search
  - Q&A Session: Homework discussion, R&D coding tips
- week04 Speech Recognition II
  - Lecture: LAS, RNN-T, Language models for RNN-T and LAS
  - Seminar: Hybrid RNN-T and CTC model training and inference
- week05 Guest Lecture. Speech Recognition III and Audio SSL
  - Lecture: Self-Supervised Models for Audio, Audio LLMs
- week06 Source Separation I
  - Lecture: A review of general Source Separation and Denoising, Encoder-Decoder-Separator architectures, Demucs family, DCCRN, FullSubNet+, BandSplitRNN
  - Seminar: Metrics
- week07 Source Separation II
  - Lecture: Speech separation, Blind and Target Separation, Recurrent (TasNet, DPRNN, VoiceFilter) and CNN (ConvTasNet, SpEx+)
  - Seminar: WienerFilter, SincFilter and DEMUCS; streaming processing and performance metrics
- week08 Audio-Visual Deep Learning
  - Lecture: Audio-Visual Fusion, Source Separation, Speech Recognition, and Self-Supervised Models. Wav2Lip and SadTalker (talking face)
  - Q&A: Project and Slurm discussion
- Extra Seminar: Create Your Own Intelligent Voice Assistant
- HW_ASR: Training a speech recognition model
- Project_AVSS: Training an audio-visual speech separation model
See our project template.
Some of the weeks have English recordings. See the corresponding sub-directories.
Course materials and teaching (in different years) were delivered by: