Vietnamese Automatic Speech Recognition

Description

This repo is for learning the basic workflow of speech processing by building an Automatic Speech Recognition (ASR) system. The workflow covers data preprocessing, model training, and evaluation.

Dataset

The ASR model is trained on the VIVOS Corpus. You can download the dataset from here.

Model

Here is a DeepSpeech2 model trained on VIVOS with a batch size of 8 for 200 epochs, which reached a word error rate (WER) of 0.4390 on the VIVOS test set. You can download the model from the link.

Instructions

Take a look at audio processing: audio_processing.ipynb
Train an ASR model with DeepSpeech2 and CTC loss: training.ipynb
Examine edit distance for computing word error rate: word_error_rate.ipynb
Use the trained model as the ASR component of a virtual assistant: virtual_assistant_demo.ipynb
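
The word error rate step above can be illustrated with a minimal sketch. This is not the code from the notebook; it assumes WER is defined as the word-level edit distance (substitutions, insertions, deletions) divided by the number of words in the reference, and that words are separated by whitespace. The function names `edit_distance` and `word_error_rate` are hypothetical and chosen only for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (lists of words)."""
    # prev[j] holds the distance between the first i-1 ref tokens
    # and the first j hyp tokens; start with the empty-reference row.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from i ref tokens to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("tôi đi học", "tôi đi làm"))
```

Under this definition, a WER of 0.4390 corresponds to roughly 44 word-level edits per 100 reference words. WER can exceed 1.0 when the hypothesis contains many inserted words.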
