Based on: SciPy, NumPy, scikit-learn.
The code in mfcc.py partly originates from scikits.talkbox.
Developed on Python 3.
The program currently uses MFCC (Mel Frequency Cepstral Coefficients), △MFCC and △△MFCC as the feature coefficients.
SVM is used as the classifier and is trained under the "one-against-one" approach.
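
As a rough illustration of the feature set described above, the sketch below stacks MFCC frames with their first and second differences over time. It is not the code in mfcc.py: the function name `add_deltas` is hypothetical, and a plain `np.gradient` along the frame axis is assumed here, whereas the project may compute △MFCC differently (for example with a regression window).

```python
import numpy as np

def add_deltas(mfcc):
    """Stack MFCC, delta and delta-delta along the coefficient axis.

    mfcc: array of shape (n_frames, n_coeffs), with at least 2 frames.
    Returns an array of shape (n_frames, 3 * n_coeffs).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    delta = np.gradient(mfcc, axis=0)    # frame-to-frame change of each coefficient
    delta2 = np.gradient(delta, axis=0)  # change of that change
    return np.hstack([mfcc, delta, delta2])
```

Each frame then carries three times as many values as the plain MFCC vector, and those stacked rows are what a classifier such as the SVM would be trained on.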
Follow these steps to run it:
- Convert all the training and testing audio files to 16-bit, 32-bit or floating-point .wav files. pydub may help you convert MP3 to WAV.
- Arrange the training audio files in this structure:
  + Store the audio files played by the same instrument in the same folder.
  + Name each folder after its instrument.
  + Put all the folders under the same path.
  + Make sure that path contains no audio files other than training audio.
- Put the testing audio files together in one folder. The structure should look like this:
training audios/
|
|-piano/
| |-*.wav
| |-*.wav
| |-...
|
|-guitar/
| |-*.wav
| |-*.wav
| |-...
|
|-violin/
| |-*.wav
| |-*.wav
| |-...
|
|-...
testing audios/
|-*.wav
|-*.wav
|-...
- Run generateMFCC.py. Follow the program's instructions and enter the path of the training audio files (in the above example, this is the path of the folder called "training audios"). You'll get MFCC, △MFCC and △△MFCC saved in instrument_name.npy files.
- Run trainmodel_SVM.py. You'll get the SVM model named model_svm and a file named names, which stores the names of the instruments.
- Run test.py and enter the path of the testing audio files. The detection results will be shown.
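
Before running the scripts, it may help to check that the training folder follows the layout shown above. The snippet below is only a sketch using the standard library; `collect_training_files` is a hypothetical helper and is not part of this project.

```python
from pathlib import Path

def collect_training_files(root):
    """Map each instrument folder name to its sorted list of .wav files."""
    dataset = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # stray files at the top level are ignored
        wavs = sorted(p for p in folder.iterdir() if p.suffix.lower() == ".wav")
        if wavs:
            dataset[folder.name] = wavs
    return dataset

if __name__ == "__main__":
    # "training audios" is the example folder name used in the tree above.
    root = Path("training audios")
    if root.is_dir():
        for instrument, files in collect_training_files(root).items():
            print(f"{instrument}: {len(files)} file(s)")
```

Folders that show up with an unexpected name or file count point to audio that should be moved or renamed before generating the features.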