Using pre-trained audio classification models available with this library
There are three models that have been pre-trained and provided in this project. They are as follows.
The first model is a pre-trained SVM classifier that classifies audio into 10 music genres - blues, classical, country, disco, hiphop, jazz, metal, pop, reggae, rock. This classifier was trained using MFCC, GFCC, spectral and chroma features. The baseline dataset used is GTZAN.
The following commands in Python can be used to classify your data.
from pyAudioProcessing.run_classification import classify_genre
# Classify a single file
results = classify_genre(file="/Users/xyz/Documents/audio.wav")
# Classify multiple files with known paths
results = classify_genre(
    file_names={
        "audios_1": ["/Users/xyz/Documents/audio.wav", "/Users/xyz/Desktop/sound.wav"],
        "audios_2": ["/Users/xyz/Downloads/sound_4.wav"]
    }
)
# Classify multiple files stored in the directory structure as specified in the readme
# folder -> sub-folder/s -> audio files
results = classify_genre(folder_path="/Users/xyz/Documents/audios")
The second model is a pre-trained SVM classifier that classifies audio into two possible classes - music and speech. This classifier was trained using MFCC, spectral and chroma features. The baseline dataset used was curated specifically for this purpose.
from pyAudioProcessing.run_classification import classify_ms
# Classify a single file
results = classify_ms(file="/Users/xyz/Documents/audio.wav")
# Classify multiple files with known paths
results = classify_ms(
    file_names={
        "audios_1": ["/Users/xyz/Documents/audio.wav", "/Users/xyz/Desktop/sound.wav"],
        "audios_2": ["/Users/xyz/Downloads/sound_4.wav"]
    }
)
# Classify multiple files stored in the directory structure as specified in the readme
# folder -> sub-folder/s -> audio files
results = classify_ms(folder_path="/Users/xyz/Documents/audios")
The third model is a pre-trained SVM classifier that classifies audio into three possible classes - music, speech and birds. This classifier was trained using MFCC, spectral and chroma features. The baseline dataset used was curated specifically for this purpose.
from pyAudioProcessing.run_classification import classify_msb
# Classify a single file
results = classify_msb(file="/Users/xyz/Documents/audio.wav")
# Classify multiple files with known paths
results = classify_msb(
    file_names={
        "audios_1": ["/Users/xyz/Documents/audio.wav", "/Users/xyz/Desktop/sound.wav"],
        "audios_2": ["/Users/xyz/Downloads/sound_4.wav"]
    }
)
# Classify multiple files stored in the directory structure as specified in the readme
# folder -> sub-folder/s -> audio files
results = classify_msb(folder_path="/Users/xyz/Documents/audios")
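The `file_names` argument used in the examples above maps a group name to a list of audio file paths. When the files already sit in a folder -> sub-folder/s -> audio files layout, such a mapping could be built with a small helper like the one below. This is only a sketch: `build_file_names` and its `extensions` parameter are hypothetical names that are not part of pyAudioProcessing, and the helper assumes that each immediate sub-folder should become one group.

```python
from pathlib import Path


def build_file_names(folder_path, extensions=(".wav",)):
    """Map each immediate sub-folder name to a sorted list of audio file paths.

    Hypothetical helper, not part of pyAudioProcessing. Files placed directly
    in folder_path, and sub-folders without matching files, are skipped.
    """
    file_names = {}
    for sub_folder in sorted(Path(folder_path).iterdir()):
        if not sub_folder.is_dir():
            continue
        # Keep only regular files whose extension matches (case-insensitive).
        files = sorted(
            str(path)
            for path in sub_folder.iterdir()
            if path.is_file() and path.suffix.lower() in extensions
        )
        if files:
            file_names[sub_folder.name] = files
    return file_names
```

The returned dictionary has the same shape as the `file_names` values shown above, so it could be passed on as `classify_msb(file_names=build_file_names("/Users/xyz/Documents/audios"))` when more control over which files are included is wanted than `folder_path` gives.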
Sample results (here from the music, speech and birds classifier) look like
{'audios_1': {'audio.wav': {'probabilities': [0.8899067858599712, 0.011922234412695229, 0.0981709797273336], 'classes': ['music', 'speech', 'birds']}, ...}
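To turn these results into a single label per file, the highest probability can be paired with its class name. The sketch below assumes that the `probabilities` and `classes` lists are aligned by index, as the sample output suggests. `top_predictions` is a hypothetical helper name, not a function provided by the library.

```python
def top_predictions(results):
    """Return {group: {file: (label, probability)}} from a results dictionary.

    Hypothetical helper; assumes 'probabilities' and 'classes' are
    index-aligned lists of equal length for every file.
    """
    summary = {}
    for group, files in results.items():
        summary[group] = {}
        for file_name, output in files.items():
            probabilities = output["probabilities"]
            classes = output["classes"]
            # Index of the largest probability.
            best = max(range(len(probabilities)), key=probabilities.__getitem__)
            summary[group][file_name] = (classes[best], probabilities[best])
    return summary


# Same values as the sample output shown above.
sample = {
    "audios_1": {
        "audio.wav": {
            "probabilities": [0.8899067858599712, 0.011922234412695229, 0.0981709797273336],
            "classes": ["music", "speech", "birds"],
        }
    }
}

print(top_predictions(sample))
# {'audios_1': {'audio.wav': ('music', 0.8899067858599712)}}
```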