A curated list of papers and resources in Speech & Audio Generation. Feel free to contribute!

Haulyn5/Audio-Speech-Generation-Paper-List

Audio & Speech Generation Paper List

A curated list of papers and resources related to Speech & Audio Generation. This project is just getting started and still needs a lot of work, so feel free to contribute!

Paper

Text to Speech (TTS)

Voice Conversion (VC)

Audio Generation and Text to Audio (TTA)

Note that many audio generation models can also generate speech.

Singing Voice Synthesis (SVS)

Speech to Speech Translation (S2ST/STST)

Streaming & Simultaneous Translation

Speech Translation Dataset

Text to Music (TTM)

Large Language Model (LLM)

Software/Libraries

Speech Synthesis

  • BERT-VITS2: A TTS tool that performs well on Chinese speech synthesis.
  • Amphion: An open-source audio, music, and speech generation toolkit. The goal of Amphion is to offer a platform for studying the conversion of any input into audio. (TTS, SVS, VC, SVC, TTA, TTM) [Paper, Video (Chinese)]
  • SpeechBrain: A PyTorch-based speech toolkit.
  • ESPnet: An end-to-end speech processing toolkit.
