This repository will host the implementation of the paper "Emotional Speech Recognition with Pre-trained Deep Visual Models". The code has not been released yet.
For details of the approach, please refer to the paper:
📄 "Emotional Speech Recognition with Pre-trained Deep Visual Models" 🔗 Read on arXiv 📑 Download PDF
This work explores emotional speech recognition (ESR) using pre-trained deep visual models. Instead of a conventional speech-processing classification pipeline, the technique involves:
- Converting acoustic features into image representations.
- Utilizing pre-trained deep learning models (such as VGG-16) designed for computer vision to classify emotions.
- Evaluating on the Berlin EMO-DB dataset, where the paper reports state-of-the-art results.
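
Until the official code is released, the first step in the list above (turning acoustic features into an image) can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of a log-magnitude spectrogram, the frame and hop sizes, the 224x224 target size (the usual VGG-16 input resolution), and the helper names `log_spectrogram` and `to_image` are all assumptions made for illustration. The paper may use different features and preprocessing.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160):
    """Return a log-magnitude spectrogram of shape (freq_bins, frames).

    frame_len and hop are illustrative values (25 ms / 10 ms at 16 kHz),
    not parameters taken from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (frames, freq_bins)
    return 20.0 * np.log10(magnitude.T + 1e-10)      # (freq_bins, frames), in dB


def to_image(features, size=224):
    """Scale a 2-D feature map to uint8 and resize it to (size, size, 3).

    Nearest-neighbour resizing and channel replication are assumed here
    so that the result matches the input shape expected by VGG-16.
    """
    lo, hi = features.min(), features.max()
    scaled = (features - lo) / (hi - lo + 1e-12)
    gray = (scaled * 255).astype(np.uint8)
    rows = np.linspace(0, gray.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, gray.shape[1] - 1, size).round().astype(int)
    resized = gray[np.ix_(rows, cols)]
    return np.stack([resized] * 3, axis=-1)  # replicate grey image to 3 channels


# Synthetic one-second 440 Hz tone standing in for a speech recording.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

image = to_image(log_spectrogram(signal))
print(image.shape, image.dtype)
```

The resulting array could then be passed to a pre-trained vision model such as VGG-16 for fine-tuning on emotion labels; that step is omitted here because it depends on the deep learning framework and training setup used in the paper.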
Stay tuned for the code implementation! 🚀