GitHub - zimeng44/Foley-Gen: A generative machine learning model that generates novel foley sounds

Introduction:

Foley-Gen is a generative machine learning model that generates novel foley sounds in 7 categories: Dog Bark, Footstep, Gunshot, Typing on Keyboard, Moving Motor Vehicle, Rain, and Sneeze/Cough. Our training dataset consists of 5,496 sound clips from the UrbanSound8K, FSD50K, and BBC Sound Effects datasets.

Usage:

If you want to use our checkpoint:

Download the checkpoint here: https://drive.google.com/file/d/1hLbUi0veQ1D-yYGTxF-3rCfrVSIpzd6_/view?usp=sharing
Unzip the checkpoint and put the 'checkpoint' folder at the root level of this project.
Run python inference.py

(To change the number of sounds generated per category, run python inference.py --number_of_synthesized_sound_per_class <number>. The default is 1 per category.)
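The checkpoint steps above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names build_inference_command and run_inference are hypothetical, and it assumes inference.py accepts the flag with the value passed as a separate argument, as is usual for Python command-line parsers.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_inference_command(sounds_per_class: int = 1) -> List[str]:
    """Return the argument list for inference.py using the flag named in this README."""
    if sounds_per_class < 1:
        raise ValueError("sounds_per_class must be a positive integer")
    return [
        sys.executable,  # the Python interpreter currently in use
        "inference.py",
        "--number_of_synthesized_sound_per_class",
        str(sounds_per_class),
    ]


def run_inference(sounds_per_class: int = 1, project_root: str = ".") -> None:
    """Check that the 'checkpoint' folder is in place, then launch inference.py."""
    root = Path(project_root)
    if not (root / "checkpoint").is_dir():
        raise FileNotFoundError("Expected a 'checkpoint' folder at the project root")
    subprocess.run(build_inference_command(sounds_per_class), cwd=root, check=True)
```

Calling run_inference(sounds_per_class=5) from the project root would then request five sounds per category.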

If you want to train the models yourself:

Train VQ-VAE: python train_vqvae.py
Extract code/embedding from trained VQ-VAE: python extract_code.py
Train PixelSnail: python train_pixelsnail.py
Inference: python inference.py

(To change the number of sounds generated per category, run python inference.py --number_of_synthesized_sound_per_class <number>. The default is 1 per category.)
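The four training stages must run in the order listed. As a rough sketch, they could be chained from a single script that stops at the first failure. This assumes each script runs with its default arguments, as the commands above suggest; the run_pipeline helper and its runner parameter are hypothetical and exist only for illustration.

```python
import subprocess
import sys

# Stage order as listed in this README; each script is run with default arguments.
TRAINING_STAGES = [
    ("Train VQ-VAE", "train_vqvae.py"),
    ("Extract code/embedding from trained VQ-VAE", "extract_code.py"),
    ("Train PixelSnail", "train_pixelsnail.py"),
    ("Inference", "inference.py"),
]


def run_pipeline(stages=TRAINING_STAGES, runner=subprocess.run):
    """Run each stage in order; check=True makes a failing stage raise and halt the rest."""
    for description, script in stages:
        print(f"==> {description}: python {script}")
        runner([sys.executable, script], check=True)
```

Running run_pipeline() from the project root executes all four stages back to back. Since training can take a long time, running the stages by hand may still be preferable when inspecting intermediate results.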

The synthesized sound samples will be saved to ./synthesized
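To confirm that inference produced the expected number of files, the output folder can be inspected with a short script. This is only a sketch: the README does not state the file format or folder layout, so the "*.wav" pattern and the grouping by parent folder are assumptions, and count_outputs is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path


def count_outputs(output_dir: str = "./synthesized", pattern: str = "*.wav") -> Counter:
    """Count files matching `pattern` under output_dir, grouped by parent folder name."""
    root = Path(output_dir)
    # rglob searches output_dir and all of its subfolders.
    return Counter(path.parent.name for path in root.rglob(pattern))
```

If the outputs use a different extension, pass a different pattern, for example count_outputs(pattern="*.flac").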

Caution: Inference requires a GPU with more than 50 GB of memory; otherwise, a 'CUDA Out of Memory' error may occur.
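Before starting inference, it may be worth checking the available GPU memory. The sketch below queries nvidia-smi for each GPU's total memory and compares it with the 50 GB figure above. It assumes an NVIDIA driver with nvidia-smi on the PATH, and it interprets 50 GB as 50 GiB, which is a guess; the function names are hypothetical.

```python
import subprocess
from typing import List

# The README's 50 GB figure, interpreted here as 50 GiB expressed in MiB (an assumption).
REQUIRED_MIB = 50 * 1024


def parse_total_memory_mib(nvidia_smi_output: str) -> List[int]:
    """Parse one integer (MiB) per GPU from nvidia-smi's headerless, unitless CSV output."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def query_total_memory_mib() -> List[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_total_memory_mib(result.stdout)


def has_enough_memory(total_mib: List[int], required_mib: int = REQUIRED_MIB) -> bool:
    """Return True if at least one GPU reports more than the required memory."""
    return any(mib > required_mib for mib in total_mib)
```

For example, has_enough_memory(query_total_memory_mib()) returns False on a machine whose only GPU has 24 GB, which signals that the out-of-memory error is likely.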

Reference:

This project is based on the DCASE 2023 Task 7 baseline model: https://github.com/DCASE2023-Task7-Foley-Sound-Synthesis/dcase2023_task7_baseline . Other models used in this project include MERT, VQ-VAE, and PixelSnail.
