Source-Aware Neural Speech Coding for Noisy Speech Compression

Yang, Haici, et al. "Source-Aware Neural Speech Coding for Noisy Speech Compression." ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021.

Paper: https://arxiv.org/abs/2008.12889
Webpage: https://minjekim.com/research-projects/sanac/

Prerequisites

Python 3.6.8
torch 1.6.0
torchaudio 0.6.0

Dataset

Speech: TIMIT (https://www.ldc.upenn.edu)
Noise: Duan noise dataset (http://www2.ece.rochester.edu/~zduan/is2012/examples.html)
- Duan, Zhiyao, Gautham J. Mysore, and Paris Smaragdis. "Speech enhancement by online non-negative spectrogram decomposition in nonstationary noise environments." In Thirteenth Annual Conference of the International Speech Communication Association. 2012.

Model training

Main hyperparameters and their default settings for model training:

Symbol = default	Description
filters = 100	Output channel size of encoder
d = 1	Dimension of the codec
m = 32	The number of codes in the code book
sr = True	Whether to use super-resolution-based downsampling
lr = 0.0001	Learning rate
br = 8	Bitrate (kbps)
scale = 1000	Scale to control the hardness of the softmax function.
label = time.strftime("%m%d_%H%M%S")	Model label
weight_mse = 30	Loss weight for the MSE (waveform) term
weight_mel = 0.5	Loss weight for the mel-spectrogram term
weight_qtz = 0.5	Loss weight for quantization
weight_etp_total = 0.1	Loss weight for the total entropy
weight_etp_ratio = 0.05	Loss weight for the entropy ratio between source and noise
ratio = 1.0	Ratio of assigned bitrate between source and noise
update_ratio = False	Whether to update the ratio during training
db = 0	Initial SDR of input data, 0 or 5
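
The role of m, d, and scale can be illustrated with a small sketch. It assumes that code assignment is a softmax over negative squared distances to the codebook entries, with scale multiplying the logits; this assumption is not confirmed by the table above. The function name soft_quantize and the evenly spaced codebook are hypothetical and are not taken from this repository.

```python
import math

def soft_quantize(x, codebook, scale=1000.0):
    """Softmax-weighted assignment of a scalar x to codebook entries (illustrative)."""
    # Assumption: logits are negative squared distances multiplied by `scale`.
    logits = [-scale * (x - c) ** 2 for c in codebook]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - peak) for l in logits]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Soft output: probability-weighted mean of the codes.
    soft_value = sum(p * c for p, c in zip(probs, codebook))
    # Hard output: index of the most probable (nearest) code.
    hard_index = max(range(len(codebook)), key=probs.__getitem__)
    return probs, soft_value, hard_index

# m = 32 scalar codes (d = 1); the even spacing over [-1, 1] is a placeholder.
codebook = [-1.0 + 2.0 * i / 31 for i in range(32)]
probs, soft_value, hard_index = soft_quantize(0.3, codebook, scale=1000.0)
```

Under this assumption, a larger scale pushes the probabilities toward a one-hot vector (a harder assignment), while a small scale spreads the probability mass over many codes.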

To train the proposed model, run python3 train_model.py.
To train the baseline model, run python3 train_base.py.
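
As a rough illustration of how the weight_* settings listed under Model training might interact, the sketch below assumes the individual loss terms are combined as a plain weighted sum. That combination rule is an assumption, and total_loss and DEFAULT_WEIGHTS are hypothetical names that do not come from the training scripts. The term values in the usage line are placeholders, not measured results.

```python
# Default weights copied from the hyperparameter list (weight_mse, weight_mel, ...).
DEFAULT_WEIGHTS = {
    "mse": 30.0,        # waveform MSE term
    "mel": 0.5,         # mel-spectrogram term
    "qtz": 0.5,         # quantization term
    "etp_total": 0.1,   # total entropy term
    "etp_ratio": 0.05,  # source/noise entropy ratio term
}

def total_loss(terms, weights=DEFAULT_WEIGHTS):
    """Combine per-term loss values as a weighted sum (assumed, not confirmed)."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError("missing loss terms: {}".format(sorted(missing)))
    return sum(weights[name] * terms[name] for name in weights)

# Placeholder term values, for illustration only.
example = total_loss({"mse": 1.0, "mel": 1.0, "qtz": 1.0,
                      "etp_total": 1.0, "etp_ratio": 1.0})
```

With these defaults the waveform MSE term dominates the sum unless the other terms are much larger in magnitude.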
