mehdi-mirzapour/Emotional_Speech_Recognition

Emotional_Speech_Recognition

This repository will host the implementation of the paper "Emotional Speech Recognition with Pre-trained Deep Visual Models". The code has not been released yet.

Paper Reference

If you're interested in understanding the details of the approach, please refer to the paper:

📄 "Emotional Speech Recognition with Pre-trained Deep Visual Models" 🔗 Read on arXiv 📑 Download PDF

Abstract

This work explores a novel approach to emotional speech recognition (ESR) by leveraging pre-trained deep visual models. Instead of traditional speech processing methods, the technique involves:

  • Converting acoustic features into image representations.
  • Utilizing pre-trained deep learning models (such as VGG-16) designed for computer vision to classify emotions.
  • Demonstrating state-of-the-art results on the Berlin EMO-DB dataset.
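Until the official code is released, the first step in the list above can be illustrated with a minimal sketch. It is not the paper's implementation: the log-magnitude spectrogram, the frame and hop sizes, the nearest-neighbour resize, and the helper names `log_spectrogram` and `to_rgb_image` are all assumptions chosen for illustration. The 224x224, 3-channel output matches the input size VGG-16 is commonly used with.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    Assumption: a plain STFT stands in for whichever acoustic features
    the paper actually uses.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (frames, freq_bins)
    return np.log1p(magnitude).T


def to_rgb_image(features, size=224):
    """Turn a 2-D feature map into a (size, size, 3) uint8 image.

    Values are min-max scaled to [0, 255], resized with nearest-neighbour
    indexing, and copied into three identical channels.
    """
    lo, hi = features.min(), features.max()
    scaled = (features - lo) / (hi - lo + 1e-8)
    rows = np.arange(size) * features.shape[0] // size
    cols = np.arange(size) * features.shape[1] // size
    resized = scaled[rows][:, cols]
    gray = (resized * 255).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)


# Hypothetical input: one second of a synthetic 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

image = to_rgb_image(log_spectrogram(signal))
```

An array like `image` could then be passed to a pre-trained vision model for fine-tuning on emotion labels; which features, image layout, and training setup the authors actually use is described in the paper, not here.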

Stay tuned for the code implementation! 🚀
