Speech Emotion Recognition Project

General

  • A model that classifies the emotions expressed in speech
  • Features are extracted with a modified pyAudioAnalysis library
  • The extracted features are then preprocessed

Feature Extraction

The pyAudioAnalysis library was modified by adding functions that extract the original features from .wav files and present them as 3D arrays.

  • To use the modified code, overwrite the installed package with the files MidTermFeatures.py and audioTrainTest.py

Modifications to MidTermFeatures.py

  • directory_feature_extraction_no_avg

    Extracts features from every file in a directory without averaging over each file.

  • multiple_directory_feature_extraction_no_avg

    Extracts features from every file in multiple directories without averaging over each file.

  • directory_feature_extraction_no_avg_3D

    Extracts audio features from a directory and returns a 3D array of shape (batch, step, features).

  • multiple_directory_feature_extraction_no_avg_3D

    Multi-directory extraction returning a 3D array.
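The 3D layout these functions produce can be sketched in a few lines of NumPy. The `stack_to_3d` helper below is hypothetical (the project's actual implementation lives in the modified MidTermFeatures.py): each file yields a (steps, features) matrix, and the matrices are padded or truncated to a fixed step count before being stacked into one (batch, step, features) array.

```python
import numpy as np

def stack_to_3d(feature_matrices, n_steps):
    """Pad or truncate each (steps, features) matrix to n_steps rows,
    then stack into one (batch, n_steps, features) array.
    Hypothetical helper illustrating the 3D layout only."""
    n_features = feature_matrices[0].shape[1]
    batch = np.zeros((len(feature_matrices), n_steps, n_features))
    for i, mat in enumerate(feature_matrices):
        steps = min(mat.shape[0], n_steps)  # truncate long files
        batch[i, :steps, :] = mat[:steps, :]  # short files stay zero-padded
    return batch

# Two files with different numbers of mid-term windows:
mats = [np.ones((3, 4)), np.ones((5, 4))]
out = stack_to_3d(mats, n_steps=4)
print(out.shape)  # (2, 4, 4)
```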

Window selection

To determine the window size, window step and window count, the `read_audio_length` script reads the length of every audio file in the directories and visualizes the distribution as a histogram.
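The length check can be sketched with the standard-library `wave` module (this is a minimal sketch; the project's `read_audio_length` script may differ, and it additionally plots a histogram):

```python
import os
import tempfile
import wave

def audio_lengths(directory):
    """Return the duration in seconds of every .wav file in a directory."""
    lengths = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".wav"):
            with wave.open(os.path.join(directory, name), "rb") as w:
                lengths.append(w.getnframes() / w.getframerate())
    return lengths

# Demo: write a 1-second silent 16 kHz mono file and measure it.
tmp = tempfile.mkdtemp()
with wave.open(os.path.join(tmp, "a.wav"), "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(b"\x00\x00" * 16000)
print(audio_lengths(tmp))  # [1.0]
```

Feeding the resulting lengths to a histogram (e.g. `matplotlib.pyplot.hist`) then shows which window count covers most files without excessive padding.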

Other functions

Other functions defined and used in the project are listed in ultil.py.

Models

All models used in the project are listed in model_training.py. There are three attention-based models: residual attention, multiplicative attention and multi-head attention.
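As a rough illustration of one of the three mechanisms, multiplicative (Luong-style) attention scores a query against each timestep through a learned matrix W. The sketch below uses plain NumPy with random, untrained weights and illustrative shapes; it is not the project's model code.

```python
import numpy as np

def multiplicative_attention(query, keys, W):
    """Multiplicative attention: score_t = query @ W @ keys[t],
    weights = softmax(scores), context = weighted sum of keys.
    Shapes: query (d,), keys (T, d), W (d, d)."""
    scores = query @ W @ keys.T          # (T,) one score per timestep
    scores -= scores.max()               # numerical stability for softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ keys             # (d,) attended summary of the sequence
    return context, weights

rng = np.random.default_rng(0)
T, d = 5, 8                              # 5 timesteps, 8 features (illustrative)
keys = rng.normal(size=(T, d))
query = rng.normal(size=d)
W = rng.normal(size=(d, d))              # random stand-in for a learned matrix
context, weights = multiplicative_attention(query, keys, W)
print(weights.shape, context.shape)      # weights sum to 1 over the T timesteps
```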