
Help running on MacOS M1? #13

Open
danielraffel opened this issue Jun 9, 2023 · 25 comments

Comments

@danielraffel commented Jun 9, 2023

Update: for most people, the steps in #13 (comment) should work.

Any chance of getting help and/or updated instructions for running audiocraft on macOS with an M1? At the very least, I think I need to know where to put the models I downloaded from Hugging Face, but based on the errors it's likely I have some other issues too. My steps and errors follow. Thanks for any tips!

I adapted the instructions here for macOS: https://github.com/facebookresearch/audiocraft#installation

First, I ran each line in my terminal...

conda create -n audiocraft
conda activate audiocraft
pip install 'torch>=2.0'
pip install -U audiocraft 
pip install ffmpeg
jupyter notebook

Second, I downloaded these two items from Hugging Face but wasn't sure where to put them: https://huggingface.co/facebook/musicgen-melody

  1. melody: 1.5B model, text to music and text+melody to music - 🤗 Hub
  2. large: 3.3B model, text to music only - 🤗 Hub

Third, when Jupyter opened in Safari I created a new notebook and ran this from here: https://github.com/facebookresearch/audiocraft#api

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('melody')
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']
wav = model.generate(descriptions)  # generates 3 samples.

melody, sr = torchaudio.load('./assets/bach.mp3')
# generates using the melody from the given audio and the provided descriptions.
wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness")

Fourth, I got these errors in Jupyter


AssertionError Traceback (most recent call last)
Cell In [2], line 5
2 from audiocraft.models import MusicGen
3 from audiocraft.data.audio import audio_write
----> 5 model = MusicGen.get_pretrained('melody')
6 model.set_generation_params(duration=8) # generate 8 seconds.
7 wav = model.generate_unconditional(4) # generates 4 unconditional audio samples

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/audiocraft/models/musicgen.py:88, in MusicGen.get_pretrained(name, device)
86 else:
87 ROOT = 'https://dl.fbaipublicfiles.com/audiocraft/musicgen/v0/'
---> 88 compression_model = load_compression_model(ROOT + 'b0dbef54-37d256b525.th', device=device)
89 names = {
90 'small': 'ba7a97ba-830fe5771e',
91 'medium': 'aa73ae27-fbc9f401db',
92 'large': '9b6e835c-1f0cf17b5e',
93 'melody': 'f79af192-61305ffc49',
94 }
95 sig = names[name]

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/audiocraft/models/loaders.py:45, in load_compression_model(file_or_url, device)
43 cfg = OmegaConf.create(pkg['xp.cfg'])
44 cfg.device = str(device)
---> 45 model = builders.get_compression_model(cfg)
46 model.load_state_dict(pkg['best_state'])
47 model.eval()

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/audiocraft/models/builders.py:82, in get_compression_model(cfg)
79 renormalize = renorm is not None
80 warnings.warn("You are using a deprecated EnCodec model. Please migrate to new renormalization.")
81 return EncodecModel(encoder, decoder, quantizer,
---> 82 frame_rate=frame_rate, renormalize=renormalize, **kwargs).to(cfg.device)
83 else:
84 raise KeyError(f'Unexpected compression model {cfg.compression_model}')

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:1145, in Module.to(self, *args, **kwargs)
1141 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
1142 non_blocking, memory_format=convert_to_format)
1143 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
-> 1145 return self._apply(convert)

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
795 def _apply(self, fn):
796 for module in self.children():
--> 797 module._apply(fn)
799 def compute_should_use_set_data(tensor, tensor_applied):
800 if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
801 # If the new tensor has compatible tensor type as the existing tensor,
802 # the current behavior is to change the tensor in-place using .data =,
(...)
807 # global flag to let the user control whether they want the future
808 # behavior of overwriting the existing tensor or not.

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
795 def _apply(self, fn):
796 for module in self.children():
--> 797 module._apply(fn)
799 def compute_should_use_set_data(tensor, tensor_applied):
800 if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
801 # If the new tensor has compatible tensor type as the existing tensor,
802 # the current behavior is to change the tensor in-place using .data =,
(...)
807 # global flag to let the user control whether they want the future
808 # behavior of overwriting the existing tensor or not.

[... skipping similar frames: Module._apply at line 797 (2 times)]

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:797, in Module._apply(self, fn)
795 def _apply(self, fn):
796 for module in self.children():
--> 797 module._apply(fn)
799 def compute_should_use_set_data(tensor, tensor_applied):
800 if torch._has_compatible_shallow_copy_type(tensor, tensor_applied):
801 # If the new tensor has compatible tensor type as the existing tensor,
802 # the current behavior is to change the tensor in-place using .data =,
(...)
807 # global flag to let the user control whether they want the future
808 # behavior of overwriting the existing tensor or not.

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:820, in Module._apply(self, fn)
816 # Tensors stored in modules are graph leaves, and we don't want to
817 # track autograd history of param_applied, so we have to use
818 # with torch.no_grad():
819 with torch.no_grad():
--> 820 param_applied = fn(param)
821 should_use_set_data = compute_should_use_set_data(param, param_applied)
822 if should_use_set_data:

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/nn/modules/module.py:1143, in Module.to.<locals>.convert(t)
1140 if convert_to_format is not None and t.dim() in (4, 5):
1141 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
1142 non_blocking, memory_format=convert_to_format)
-> 1143 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)

File /Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/cuda/__init__.py:239, in _lazy_init()
235 raise RuntimeError(
236 "Cannot re-initialize CUDA in forked subprocess. To use CUDA with "
237 "multiprocessing, you must use the 'spawn' start method")
238 if not hasattr(torch._C, '_cuda_getDeviceCount'):
--> 239 raise AssertionError("Torch not compiled with CUDA enabled")
240 if _cudart is None:
241 raise AssertionError(
242 "libcudart functions unavailable. It looks like you have a broken build?")

AssertionError: Torch not compiled with CUDA enabled

@danielraffel changed the title from "MacOS M1 Install instructions?" to "Help running on MacOS M1?" on Jun 9, 2023
@jet3004 commented Jun 9, 2023

Thanks for this post; following because I'm running into the same issues...

@james-see

Me 2

@ravsau commented Jun 10, 2023

Seems like the CUDA issue can be solved by running on the CPU instead of CUDA:
https://stackoverflow.com/a/75049743

I also suggest taking a look at the whisper.cpp project for a reference on how macOS Core ML support was added to a model: ggerganov/whisper.cpp#126
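
For context, the linked answer amounts to forcing CUDA-saved checkpoints onto the CPU at load time; a minimal sketch (the checkpoint path here is a placeholder, not audiocraft's actual file):

import torch

# Map tensors that were saved on a CUDA device onto the CPU while loading.
state = torch.load('checkpoint.th', map_location=torch.device('cpu'))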

@xaviviro

Try MusicGen.get_pretrained('melody', device='cpu'). It's slow, but it works. I've tried device='mps', but even with PYTORCH_ENABLE_MPS_FALLBACK=1 it fails because the FFT doesn't work on MPS. So we'll have to wait for newer versions of Torch for MPS support.
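
A sketch of picking the device dynamically, assuming torch >= 1.12 for the MPS backend (per the above, MusicGen still needs 'cpu' until its FFT ops work on MPS):

import torch
from audiocraft.models import MusicGen

# Prefer MPS when the backend reports it usable, otherwise fall back to CPU.
device = 'mps' if torch.backends.mps.is_available() else 'cpu'
model = MusicGen.get_pretrained('melody', device=device)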

@danielraffel (Author) commented Jun 10, 2023

Thanks @xaviviro, this does seem to be running:

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('melody',device='cpu')
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']
wav = model.generate(descriptions)  # generates 3 samples.

melody, sr = torchaudio.load('./assets/bach.mp3')
# generates using the melody from the given audio and the provided descriptions.
wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness")

@danielraffel (Author) commented Jun 10, 2023

@xaviviro Did you try MPS with a nightly PyTorch build? Maybe it's supported...

https://developer.apple.com/metal/pytorch/

Update: I guess it's still not supported: pytorch/pytorch#78044
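
For anyone checking their own install, torch exposes two relevant flags (available since torch 1.12):

import torch

print(torch.backends.mps.is_built())      # was this torch compiled with MPS support?
print(torch.backends.mps.is_available())  # is an MPS device actually usable right now?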

@danielraffel (Author) commented Jun 10, 2023

Might have spoken too soon... when I run this in Jupyter...

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('melody',device='cpu')
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']
wav = model.generate(descriptions)  # generates 3 samples.

melody, sr = torchaudio.load('./assets/bach.mp3')
# generates using the melody from the given audio and the provided descriptions.
wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness")

I get this warning...

/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm

I've tried running the following updates and still no luck... any ideas?

pip install ipywidgets
pip install -U jupyter

I also tried enabling the extensions (which seemed to work):

jupyter nbextension enable --py widgetsnbextension     
Enabling notebook extension jupyter-js-widgets/extension...
      - Validating: OK

@dataf3l commented Jun 11, 2023

I am facing the same issue

@james-see

It worked for me after changing to CPU. I will post the steps once I'm back at my computer.

@JonathanFly (Contributor) commented Jun 12, 2023

I bet the notebook is somehow using a different version. But actually, do you still need from .autonotebook import tqdm as notebook_tqdm? I think it's all automatic now; it just figures out which progress bar to use.

from tqdm import tqdm
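
For reference, the automatic selection lives in tqdm.auto, which uses the notebook widget when ipywidgets is available and falls back to a plain-text bar otherwise; a tiny sketch:

from tqdm.auto import tqdm  # auto-selects notebook widget vs. plain-text bar

for _ in tqdm(range(100)):
    pass  # placeholder work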

@james-see commented Jun 12, 2023

Here is what I got working:

  1. git clone the repo
  2. try to install requirements.txt via pip on 3.10.11 (my default); fails because some dependency can't be found
  3. use pyenv to switch to 3.9.16, install again, and it works fine
  4. try running app.py; fails because the readme is wrong: it needs to be python3 ./app.py (or drop down a directory and run app.py) unless you have your env set up for the Flask app properly, which you usually wouldn't know to do
  5. try running app.py again; fails with a torch error
  6. pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
  7. change the app.py code to use device='cpu' as others have mentioned above
  8. works fine; tested generating both a 10-second and a 30-second cut

@DJChief commented Jun 12, 2023

Can you make an idiot-proof guide for getting this to work on M1?
Really want to get this thing working!!!!

@james-see

can you make an idiot proof guide for getting this to work on m1?

really want to get this thing working!!!!

That sounds like a worthy way to pay it forward. I will create a very detailed Medium post and link to it. The problem is that the code changes, so tutorials will have to change too.

@dataf3l commented Jun 12, 2023

Hi @DJChief, all I did was take the example posted and ask ChatGPT to "change it so it works on CPU". It hallucinated some code, but it led to the right solution. All it took was reading the code a bit; reading the code helps a lot in understanding how to change the options, and which options are interesting to change.

So the main issue appears to be: if you have a fancy computer, you can use cuda; if you have a lame-ass computer like mine, you can use mps or cpu; and if you are running it on a potato, you have to use cpu.

I tried changing "cuda" to "mps" and it didn't work for some weird reason, so I was like "meh", I guess "cpu" it is.

On my M1 it takes about 5 minutes to generate 10 seconds of audio, but my M1 is old, so maybe your computer does it faster? I closed all the tabs to give the roughly 10GB of RAM room to do its thing; the "medium" model seems good enough.

Also apparently, the prompt really matters.

This code runs on my machine (TM):

import sys

import torch
import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Set the device to CPU
device = torch.device('cpu')

model = MusicGen.get_pretrained('medium', device='cpu') # <-- this line here dude

model.set_generation_params(duration=10)  # 30 is max?

descriptions = [sys.argv[1]]

with torch.no_grad():
    wav = model.generate(descriptions)  
    
for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness")


As for how to run it, maybe try these steps:

  1. Make a folder for the project, anywhere on your computer (always put stuff in folders; the computer seems to like that).

  2. Name the folder whatever you want, but don't put spaces in the folder name; that never leads to good things.

  3. Set aside some 10-20GB of free space. (I suggest deleting important files first, and then unimportant files later; for this, you can use GrandPerspective https://grandperspectiv.sourceforge.net/. If you have a lot of space, skip this step.) You can run this command to figure out how much free space you have:

df -h

Somewhere, in a soup of numbers, the free disk space is visible.

  4. Install Xcode https://apps.apple.com/us/app/xcode/id497799835?mt=12 (like, seriously, isn't it about time you wrote that app you were thinking about 10 years ago? Well, now this is the excuse). You will need another 10GB for that, so maybe delete even more files, or perhaps a couple of DAWs you aren't using too much.

  5. Install brew https://brew.sh/ (yes, that's even another thing, can you believe it!). Here is the deal: brew is required because every developer ever wants to use a package manager (it's the convenience, you see; package managers are like app stores for developers, and that's the reason we love them). Maybe another GB or two of stuff will be lost forever in some random section of your hard drive, no way to get it back, but like, whatever.

  6. Install python (using brew):

brew install python

That should create a file called /opt/homebrew/bin/python3. Treasure this file, and light a candle for Guido, for we are not worthy.

  7. Create a virtual environment for all the pip libraries (we do this so we don't mess up the other Python stuff, like the system Python):

python3 -m venv venv

These steps should be done only once, including the pip installs. However, every time you are going to run the thing, you need to: 1. cd into the folder, and 2. activate the virtual environment. This tells bash (the computer's command interpreter) that we are using this specific set of libraries. Activating the venv is done like this:

source venv/bin/activate

You may be tempted to skip the step where we make the virtual env and just mess up the main system Python. Unless you want your Python projects to start interfering with each other, I suggest you make the virtual env; it's like folders for your dependencies, one per project, nice and neat.

Recently Apple forced zsh down people's throats, because having compatibility is apparently unimportant, so if you see zsh or a % symbol in the terminal, just type bash.

  8. pip install a bunch of stuff:

pip install transformers torch
pip install einops
pip install accelerate
pip install torchaudio
pip install audiocraft

On my device, I also had to do this, which you may or may not have to do, depending on how far into the future you are reading this:

pip install torch==2.0.1 -f https://download.pytorch.org/whl/torch_stable.html

  9. Run this (where you put the example code):

python your-file.py "some song here"

  10. Then, when it runs, see if you can find a file called 0.wav in the same folder as the thing.

If found, rename it and run it again. If you want to generate multiple ideas from a file, you can try this script (but this is optional; feel free to just run the example and rename the 0.wav file over and over, to each their own):

import os
import subprocess
import datetime
import time
# read genres.txt line by line
lines = []
with open('genres.txt') as f:
    lines = f.readlines()
    lines = [line.strip() for line in lines]

# for each line in genres.txt
for line in lines:
    # print line
    print(line)

    t = time.time()
    p = subprocess.run(['python', 'mine.py', line])
    t2 = time.time()
    print(f'Time taken in ms: {(t2-t)*1000}')
    # print stdout and stderr
    print(p.stderr)
    print(p.stdout)

    # Get the current date and time
    now = datetime.datetime.now()
    date_and_time = now.strftime("%Y%m%d%H%M%S")

    # Rename 0.wav to $dateandtime.wav
    os.rename('0.wav', f'{date_and_time} {line}.wav')

Here I just made a txt file with the genres I wanted to try out, one genre per line, in a file called genres.txt; you can name it ideas.txt, I guess. Feel free to change the example. The key part is that instead of generating all the samples and THEN writing to disk (like the original example program from the project's repo), this program writes to disk on each run. That's because I don't trust the process to be perfect; I expect the program to fail from time to time, and I want to hear results as they come out instead of waiting for the end of the batch.
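
A hypothetical genres.txt, reusing the descriptions from earlier in the thread (one prompt per line):

happy rock
energetic EDM
sad jazz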

I suggest generating multiple samples; a single sample doesn't do it for me.

If any of the steps seem too "high level", feel free to research how to do step X by entering the step description into your AI of choice. I recommend ChatGPT; you can probably ask it to take this very text and expand on it or summarize it if you feel lost. I'm sure it won't mind.

Oh, btw, if you don't like using ChatGPT, what are you even doing in this part of town anyway?

Don't be afraid to work with incomplete and imperfect instructions, and with incomplete and imperfect code; let ChatGPT guide you along the way. Being willing to tinker with the code and change it will allow you to run other music-generation models, and, who knows, maybe even make some cool music in the process.

I guess my message here is: don't be afraid to try stuff. It's OK if things break; you can always reinstall things, recreate the virtual environment, etc.

There are several models out there; most of them require some fiddling.

have fun!

@dataf3l commented Jun 12, 2023

When I run pip freeze it looks like this; if something fails, perhaps a pip package is missing:

antlr4-python3-runtime==4.9.3
appdirs==1.4.4
audiocraft==0.0.1
audioread==3.0.0
av==10.0.0
blis==0.7.9
catalogue==2.0.8
certifi==2023.5.7
cffi==1.15.1
charset-normalizer==3.1.0
click==8.1.3
cloudpickle==2.2.1
colorlog==6.7.0
confection==0.0.4
cymem==2.0.7
Cython==0.29.35
decorator==5.1.1
demucs==4.0.0
diffq==0.2.4
docopt==0.6.2
dora-search==0.1.12
einops==0.6.1
filelock==3.12.1
flashy==0.0.2
fsspec==2023.6.0
huggingface-hub==0.15.1
hydra-colorlog==1.2.0
hydra-core==1.3.2
idna==3.4
Jinja2==3.1.2
joblib==1.2.0
julius==0.2.7
lameenc==1.4.2
langcodes==3.3.0
lazy_loader==0.2
librosa==0.10.0.post2
llvmlite==0.40.1rc1
MarkupSafe==2.1.3
mpmath==1.3.0
msgpack==1.0.5
murmurhash==1.0.9
mypy-extensions==1.0.0
networkx==3.1
num2words==0.5.12
numba==0.57.0
numpy==1.24.3
omegaconf==2.3.0
openunmix==1.2.1
packaging==23.1
pathy==0.10.1
pooch==1.6.0
preshed==3.0.8
pycparser==2.21
pydantic==1.10.9
pyre-extensions==0.0.29
PyYAML==6.0
regex==2023.6.3
requests==2.31.0
retrying==1.3.4
safetensors==0.3.1
scikit-learn==1.2.2
scipy==1.10.1
sentencepiece==0.1.99
six==1.16.0
smart-open==6.3.0
soundfile==0.12.1
soxr==0.3.5
spacy==3.5.2
spacy-legacy==3.0.12
spacy-loggers==1.0.4
srsly==2.4.6
submitit==1.4.5
sympy==1.12
thinc==8.1.10
threadpoolctl==3.1.0
tokenizers==0.13.3
torch==2.0.1
torchaudio==2.0.2
tqdm==4.65.0
transformers==4.30.1
treetable==0.2.5
typer==0.7.0
typing-inspect==0.9.0
typing_extensions==4.6.3
urllib3==2.0.3
wasabi==1.1.2
xformers==0.0.20

@redblobgames commented Jun 12, 2023

Thanks @dataf3l! I too tried mps but had no luck.

I didn't have to install as many packages:

python3 -m venv env
source env/bin/activate

pip install 'torch>=2.0'
pip install -U 'git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft'
python test.py

and then this test.py program worked (it automatically downloaded the models and generated wav files):

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('small', device='cpu') # note cpu not cuda or mps
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']

wav = model.generate(descriptions)  # generates 3 samples.

# haven't tried the melody stuff yet

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(str(f'{idx}'), one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)
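
Following up on the "melody stuff" comment in the code above, a hedged sketch of the melody-conditioned path on CPU, adapted from the README snippet quoted earlier in this thread (assumes ./assets/bach.mp3 from the audiocraft checkout):

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('melody', device='cpu')  # melody-capable model, on CPU
model.set_generation_params(duration=8)

descriptions = ['happy rock', 'energetic EDM', 'sad jazz']
melody, sr = torchaudio.load('./assets/bach.mp3')  # sample asset from the repo
# Condition all three descriptions on the same melody (batch dim expanded to 3).
wav = model.generate_with_chroma(descriptions, melody[None].expand(3, -1, -1), sr)

for idx, one_wav in enumerate(wav):
    audio_write(f'{idx}', one_wav.cpu(), model.sample_rate, strategy="loudness")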

@danielraffel (Author) commented Jun 12, 2023

Pre-requisites

  1. Download and install Miniforge3 (to use Conda) https://github.com/conda-forge/miniforge
  2. Install XcodeTools in your terminal with this command xcode-select --install (so that you can use git)

Note: I assumed audiocraft requires Python 3.9, so I installed that in the next steps and invoke it when running the program, but perhaps anything >=3.9 works.

Audiocraft install instructions (in your terminal)

conda create --name audiocraft python=3.9
conda activate audiocraft
pip install ffmpeg
pip install 'torch>=2.0' 
pip install -U git+https://git@github.com/facebookresearch/audiocraft#egg=audiocraft 

Copy this to a text file named test.py in your current working directory.
Note: if you don't know your current working directory, type pwd in the terminal.

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('small', device='cpu') # note cpu not cuda or mps
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']

wav = model.generate(descriptions)  # generates 3 samples.

# haven't tried the melody stuff yet

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(str(f'{idx}'), one_wav.cpu(), model.sample_rate, strategy="loudness", loudness_compressor=True)

Run the program in your terminal, invoking python3.9:

python3.9 test.py

The first time it runs, it will download the necessary files and then generate the following files in your working directory:

3.scd
2.wav
2.scd
1.wav
1.scd
0.wav

Now you can tweak the program as you wish (different prompts, longer audio, etc.).

Thanks for the tips @redblobgames

@kmashal commented Jun 12, 2023

I get AssertionError: Torch not compiled with CUDA enabled when I try to run python app.py on my Mac M1. Any workaround?

@ravsau commented Jun 12, 2023

I followed the instructions above from @redblobgames and was getting an error:

TypeError: audio_write() got an unexpected keyword argument 'loudness_compressor'

I then removed the loudness_compressor=True argument and was able to get audio output.

So this code worked for me

import torchaudio
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('small', device='cpu') # note cpu not cuda or mps
model.set_generation_params(duration=8)  # generate 8 seconds.
wav = model.generate_unconditional(4)    # generates 4 unconditional audio samples
descriptions = ['happy rock', 'energetic EDM', 'sad jazz']

wav = model.generate(descriptions)  # generates 3 samples.

for idx, one_wav in enumerate(wav):
    # Will save under {idx}.wav, with loudness normalization at -14 db LUFS.
    audio_write(str(f'{idx}'), one_wav.cpu(), model.sample_rate, strategy="loudness")

@kmashal commented Jun 12, 2023

I was able to get python app.py running on my M1 by modifying the load_model function in app.py to this:

def load_model(version):
    print("Loading model", version)
    return MusicGen.get_pretrained(version, device='cpu')

@danielraffel (Author) commented Jun 13, 2023

Below are install instructions that should work for folks who want to run the Gradio interface on their Mac (I can confirm it works with all 4 model variations).

Pre-requisites

  1. Download and install Miniforge3 (to use Conda) https://github.com/conda-forge/miniforge
  2. Install XcodeTools in your terminal with this command xcode-select --install (so that you can use git)

Audiocraft install instructions (in your terminal)

git clone https://github.com/facebookresearch/audiocraft.git 
conda create --name gen python=3.9
conda activate gen
pip install ffmpeg
pip install 'torch>=2.0' 
cd audiocraft
pip install -r requirements.txt

Download the ffmpeg and ffprobe binaries (https://evermeet.cx/ffmpeg/) and place them in /usr/local/bin.
Note: update the paths below to match where you downloaded ffmpeg and ffprobe.

sudo cp /Users/yourUsername/Downloads/ffmpeg /usr/local/bin
sudo cp /Users/yourUsername/Downloads/ffprobe /usr/local/bin
sudo chmod 755 /usr/local/bin/ffmpeg
sudo chmod 755 /usr/local/bin/ffprobe

Make sure /usr/local/bin is on your PATH; in your profile (vi ~/.zshrc) add the directory itself (PATH entries are directories, not individual binaries):

export PATH="/usr/local/bin:$PATH"

Optional: Confirm binaries are installed

ffmpeg -version
ffprobe -version

In the file audiocraft/app.py, update/replace the first function so it falls back to CPU:

def load_model(version):
    print("Loading model", version)
    return MusicGen.get_pretrained(version, device='cpu')

Finally, start the program, which will load the Gradio interface (it will print a URL to open in your browser):

python3.9 app.py
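
One caveat: the original load_model in app.py caches the model in a global MODEL so it isn't reloaded on every generation. A sketch that keeps that cache while still forcing CPU (simply combining the two versions shown in this thread):

MODEL = None  # module-level cache, as in the original app.py

def load_model(version='melody'):
    global MODEL
    print("Loading model", version)
    if MODEL is None or MODEL.name != version:
        MODEL = MusicGen.get_pretrained(version, device='cpu')  # CPU instead of the CUDA default
    return MODEL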

@jet3004 commented Jun 15, 2023

In the file audiocraft/app.py update/replace the first function so it falls back to CPU

def load_model(version):
    print("Loading model", version)
    return MusicGen.get_pretrained(version, device='cpu')

Thank you! This was very helpful. I didn't do this and it still loads and works, but I realize I need to because it's slooooow. I don't see this exactly in app.py, so I just want to ask quite literally: am I replacing this (the only thing I see)

def load_model(version='melody'):
    global MODEL
    print("Loading model", version)
    if MODEL is None or MODEL.name != version:
        MODEL = MusicGen.get_pretrained(version)

With this?

def load_model(version):
    print("Loading model", version)
    return MusicGen.get_pretrained(version, device='cpu')

Also: where are the (sizable) models stored after downloading? I don't see them in "Models", but I want to know in case I want to prune space. Thanks!

@kmashal commented Jun 15, 2023

Am I replacing this (the only thing I see) with this?

Yes!!

@jet3004 commented Jun 15, 2023

Yes!!

Thank you! OK, so now it says:

def load_model(version):
    print("Loading model", version)
    return MusicGen.get_pretrained(version, device='cpu')

But, and again I know there's no GPU here, I can't say it's any faster (M1 Max, 64GB of RAM); still quite slow... Being thick here, but should it be?

@redblobgames commented Jun 16, 2023

But, and again I know there's no GPU here, I can't say it's any faster (M1 Max, 64GB of RAM); still quite slow... Being thick here, but should it be?

Yes, it's slow for me (M2 Max, 32GB of RAM), taking around 15–20 min to generate the three 8-second sample files.

I think the models are saved in ~/.cache/huggingface.
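
To see how much disk those caches use (checking both common cache locations; the exact paths are a guess, per the above):

du -sh ~/.cache/huggingface ~/.cache/torch 2>/dev/null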
