LOL-Recongnition-by-Deep-Speaker

We use the League of Legends voice pack, which contains audio for 142 heroes. The project classifies these clips and identifies which hero an audio clip belongs to from its extracted audio features. It is based on Baidu's paper "Deep Speaker: an End-to-End Neural Speaker Embedding System".

The paper describing the underlying method: https://arxiv.org/pdf/1705.02304.pdf
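
As a rough illustration of the identification step, the sketch below compares an utterance embedding against one reference embedding per hero using cosine similarity, which is the similarity measure used in the Deep Speaker paper. The function names, the toy 3-dimensional vectors, and the idea of keeping a single reference embedding per hero are assumptions made for illustration; they are not taken from this repository's code.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_hero(query, references):
    """Return the hero whose reference embedding is most similar to `query`.

    `references` maps a hero name to one embedding (a hypothetical layout).
    """
    return max(references, key=lambda name: cosine_similarity(query, references[name]))


# Toy embeddings; real embeddings would come from the trained model.
references = {
    "Ahri": [0.9, 0.1, 0.0],
    "Garen": [0.0, 0.8, 0.6],
}
best = identify_hero([0.8, 0.2, 0.1], references)
```

With these toy vectors the query is closest to the "Ahri" reference. In practice the embeddings would have many more dimensions and a hero might be represented by several reference clips.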

Audio Samples

The audio data is packaged and stored on Baidu Cloud. Scan the QR code with WeChat to download it. The extraction code is “tt8w”.

Quick Start

Installing dependencies

Install Python 3.
Install the latest version of TensorFlow for your platform. For better performance, install with GPU support if it's available. This code works with TensorFlow 1.3 and later.
Install requirements:
```
pip install -r requirements.txt
```
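
To confirm that the installed TensorFlow meets the minimum version stated above, a small check along these lines may help. This is a minimal sketch: the helper names `parse_version`, `meets_minimum`, and `check_installed_tensorflow` are hypothetical and not part of this repository, and only the major and minor version numbers are compared.

```python
def parse_version(version):
    """Turn a version string such as '1.13.1' into a tuple of ints, here (1, 13)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version, minimum=(1, 3)):
    """True if `version` is at least `minimum`, compared as (major, minor)."""
    return parse_version(version) >= minimum


def check_installed_tensorflow():
    """Return a short message describing the installed TensorFlow, if any."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed"
    status = "OK" if meets_minimum(tf.__version__) else "older than 1.3"
    return "TensorFlow {}: {}".format(tf.__version__, status)
```

Calling `print(check_installed_tensorflow())` in the environment where the requirements were installed reports the result.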

Visual

python SpeakerRecog.pyw

Model

The model data is packaged and stored on Baidu Cloud. Scan the QR code with WeChat to download it. The extraction code is “6umw”.

After unpacking, the model directory should look like this:

model
	|- pre-model
	|- train-model
			|- best_checkpoint
			|- GRU
			|- ResidualCNN

Training

Download the speech dataset.

Unzip the dataset, then either set DATASET_DIR in constants.py to the absolute path of the unzipped data, or unzip the data to the path that DATASET_DIR in constants.py already points to. See “Audio Samples” for the download link.
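
Before preprocessing, it can be useful to verify that the dataset directory exists and actually contains audio. The sketch below is one way to do that; `count_audio_files` and `check_dataset` are hypothetical helpers, and the `.wav` extension is an assumption about the dataset that should be adjusted to match the real files.

```python
import os


def count_audio_files(dataset_dir, extensions=(".wav",)):
    """Count files under `dataset_dir` whose names end with one of `extensions`."""
    total = 0
    for _root, _dirs, files in os.walk(dataset_dir):
        total += sum(1 for name in files if name.lower().endswith(extensions))
    return total


def check_dataset(dataset_dir):
    """Return a short status message for the given dataset directory."""
    if not os.path.isdir(dataset_dir):
        return "missing: " + dataset_dir
    return "found {} audio files in {}".format(count_audio_files(dataset_dir), dataset_dir)
```

For example, `print(check_dataset(DATASET_DIR))` after `from constants import DATASET_DIR` reports whether the path is usable.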

Preprocess the data

python pre_process.py

Train a model

python train.py

Note: Pre-training before training is recommended to reduce training time.

Pre-train:

python pretraining.py
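
The commands in this section can be chained in a small driver script. The sketch below assumes the order is preprocessing, then optional pre-training, then training; that ordering is inferred from the text above rather than confirmed by the code, and `training_steps` and `run_steps` are hypothetical helper names.

```python
import subprocess
import sys


def training_steps(pretrain=True):
    """Return the script names to run, in order."""
    steps = ["pre_process.py"]
    if pretrain:
        steps.append("pretraining.py")
    steps.append("train.py")
    return steps


def run_steps(steps, dry_run=True):
    """Build one command per script; run them unless `dry_run` is set."""
    commands = [[sys.executable, step] for step in steps]
    if not dry_run:
        for command in commands:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(command, check=True)
    return commands
```

Running `run_steps(training_steps(), dry_run=False)` from the repository root would execute the three scripts with the current Python interpreter.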

Notes

We rename the data after converting it to npy format; see rename.py for the details. Following the same naming scheme will help you avoid unnecessary problems when importing new data. If you want to train on new data, you can also adjust the parameters in constants.py.

Summary

Because the training data is relatively clean, the training accuracy is already high even without the GRU.

We also found that the distance between the sound source and the microphone affects detection quality, and that a noisy environment reduces accuracy. The spectrograms of the audio in different scenarios are as follows:

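Original sound:

Far from the microphone, quiet environment:

Far from the microphone, noisy environment:

Close to the microphone, quiet environment:

Close to the microphone, noisy environment: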

The ROC curve of our model was particularly good.
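
For readers unfamiliar with how an ROC curve is built, the sketch below computes (false positive rate, true positive rate) points from similarity scores and ground-truth labels at a few thresholds. It is a generic illustration: `roc_points` is a hypothetical helper, and the scores are toy values, not results from this project.

```python
def roc_points(scores, labels, thresholds):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    A trial is accepted when its score is >= the threshold.
    `labels` holds 1 for genuine matches and 0 for non-matches.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Toy similarity scores, not results from this project.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 1, 0, 0]
points = roc_points(scores, labels, thresholds=[0.2, 0.5, 0.95])
```

A curve that passes close to the point (0, 1), as the middle threshold does in this toy example, indicates that genuine and non-matching trials are well separated.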
