Cocktail-Party-Audio-Processing-Python

Google Colab Link to View: https://colab.research.google.com/drive/1UAxk0NQBS_uoItruo89zZkerRQxBGVtB

Sebastian Ouslis

Paper Name: The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks

Problem Formulation:

The cocktail party problem is the challenge of getting a computer to isolate a single source of interest within a complex acoustic scene. The name comes from the experience of being at a party where several conversations and other sounds are occurring at the same time and you want to listen to just one of them. Humans do this easily, but it is much harder for computers.

This paper proposes a new variant of the problem in which three audio sources are mixed together: music, speech, and sound effects (SFX, meaning ambient noise and natural sounds).

The paper explains how to build a dataset for this problem from the following datasets: LibriVox (speech), FSD50K (SFX), and FMA (music).

Proposed Solution:

Step 1: Grab the Datasets

Step 2: Convert datasets to the same file type (WAV)

Step 3: Normalize loudness based on audio type

Step 4: Resample all audio to the same sampling rate

Step 5 (Optional): Repeat short audio clips end to end so that they fill more of the mixture's duration

Step 6: Combine audio clips
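
Steps 3 to 6 can be sketched with NumPy alone. This is a minimal illustration, not the notebook's actual code: it assumes the clips are already decoded to mono float arrays (Step 2 needs an audio library and is left out), uses RMS level as a simple stand-in for perceptual loudness, and uses linear interpolation as a stand-in for a proper resampler with an anti-aliasing filter. The function names and the values in TARGET_DBFS are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical per-stem RMS targets in dBFS (illustrative, not taken from the paper).
TARGET_DBFS = {"speech": -20.0, "music": -26.0, "sfx": -23.0}


def normalize_rms(audio, target_dbfs):
    """Step 3: scale a mono signal so its RMS level equals target_dbfs."""
    rms = np.sqrt(np.mean(audio ** 2))
    if rms == 0:
        return audio.copy()  # silent clip: nothing to scale
    return audio * (10 ** (target_dbfs / 20) / rms)


def resample_linear(audio, sr_in, sr_out):
    """Step 4: resample by linear interpolation (no anti-aliasing; illustration only)."""
    if sr_in == sr_out:
        return audio.copy()
    n_out = int(round(len(audio) * sr_out / sr_in))
    t_in = np.arange(len(audio)) / sr_in
    t_out = np.arange(n_out) / sr_out
    return np.interp(t_out, t_in, audio)


def loop_to_length(audio, n_samples):
    """Step 5: repeat a clip end to end, then trim to exactly n_samples."""
    reps = int(np.ceil(n_samples / len(audio)))
    return np.tile(audio, reps)[:n_samples]


def mix_stems(stems, sr_out, duration_s):
    """Step 6: stems maps a stem name to (audio, sample_rate).

    Returns (mixture, processed_stems), where the processed stems sum to the mixture.
    """
    n = int(sr_out * duration_s)
    processed = {}
    for name, (audio, sr) in stems.items():
        x = normalize_rms(np.asarray(audio, dtype=np.float64), TARGET_DBFS[name])
        x = resample_linear(x, sr, sr_out)
        processed[name] = loop_to_length(x, n)
    mixture = sum(processed.values())
    peak = np.max(np.abs(mixture))
    if peak > 1.0:
        # Rescale the mixture and the stems together to avoid clipping.
        processed = {k: v / peak for k, v in processed.items()}
        mixture = mixture / peak
    return mixture, processed


# Usage with synthetic signals standing in for real clips.
rng = np.random.default_rng(0)
stems = {
    "speech": (rng.standard_normal(16000), 16000),  # 1 s at 16 kHz
    "music": (np.sin(2 * np.pi * 440 * np.arange(44100) / 44100), 44100),  # 1 s at 44.1 kHz
    "sfx": (rng.standard_normal(4000), 8000),  # 0.5 s at 8 kHz
}
mixture, processed = mix_stems(stems, sr_out=16000, duration_s=2.0)
```

Keeping the processed stems alongside the mixture matters for separation work, because each stem is the ground-truth target that a model is trained to recover from the mixture.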

Datasets (warning: they are very large):

FSD50K -- https://zenodo.org/record/4060432#.YTkaoN8pBPY

FMA-Medium Set -- https://github.com/mdeff/fma

LibriSpeech/LibriVox -- https://www.openslr.org/12

Read the HTML document (Sebastian_Ouslis_313_Project.html) or the Google Colab notebook to see more:

https://colab.research.google.com/drive/1UAxk0NQBS_uoItruo89zZkerRQxBGVtB
