Can't get Deepspeed to work on Ubuntu 22.04 #3531
Closed
AlpsAficionado started this conversation in General
Replies: 3 comments
-
Here is the output of 'ds_report' in my working conda environment in the VM.
-
Just the same error...
-
Hi @AlpsAficionado - this looks like something is set up incorrectly in your environment, gcc-wise. If you're still having trouble, could you open an issue so we can dig into the specifics?
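One way to gather the compiler details before opening an issue is a small script like the one below. This is only a sketch: the helper name `describe_compilers` and the choice of tools to probe (`gcc`, `g++`, `nvcc`) are assumptions for illustration, not something DeepSpeed prescribes. It reports which binary is found first on `PATH` inside the active conda environment and the first line of its `--version` output.

```python
import shutil
import subprocess


def describe_compilers(names=("gcc", "g++", "nvcc")):
    """Return {tool name: (path, first line of --version)} for tools on PATH.

    Tools that are not found map to None. The helper name and the default
    tool list are illustrative assumptions, not part of DeepSpeed.
    """
    report = {}
    for name in names:
        path = shutil.which(name)  # first match on PATH, or None
        if path is None:
            report[name] = None
            continue
        try:
            result = subprocess.run(
                [path, "--version"],
                capture_output=True,
                text=True,
                timeout=30,
                check=False,
            )
            lines = (result.stdout or result.stderr).strip().splitlines()
            first_line = lines[0] if lines else ""
        except (OSError, subprocess.TimeoutExpired):
            # The binary exists but could not be run; keep the path anyway.
            first_line = ""
        report[name] = (path, first_line)
    return report


if __name__ == "__main__":
    for tool, info in describe_compilers().items():
        if info is None:
            print(f"{tool}: not found on PATH")
        else:
            print(f"{tool}: {info[0]} ({info[1]})")
```

Running it once inside the conda environment and once outside may show whether conda's gcc is shadowing the system compiler.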
-
Hello esteemed DeepSpeed community.
I've spent several hours bashing my head against getting deepspeed to function properly on my system. I run oobabooga/text-generation-webui inside an Ubuntu 22.04 VM on my server (Nvidia Quadro RTX 8000 with 48GB VRAM; 128GB system RAM; the VM is given total control of the 8000 via PCI passthrough/IOMMU, and has 96GB of system RAM allocated to it).
After hours of frustration, I was unable to get deepspeed to function with oobabooga on Ubuntu 22.04.
I was able to get it to run on Ubuntu 23.04; however, 22.04 is the current long-term supported version (and it still took me significant manual intervention to get it going in 23.04).
A rundown of my issues with 22.04:
Error: ModuleNotFoundError: No module named 'deepspeed'
Solution: pip install deepspeed
Error: ModuleNotFoundError: No module named 'mpi4py'
Solution: sudo apt install libopenmpi-dev ; pip install mpi4py
Error: 'pip install mpi4py' won't work; it crashes like so:
Solution: env LD_LIBRARY_PATH=/usr/lib/x86_64-linux-gnu/ pip install --no-cache-dir mpi4py
Solution: conda install -c conda-forge gcc=12.1.0 (rebuilds/reinstalls a whole bunch of crap, see below):
Full error spew:
Solution: I don't know; this is where I am stuck. #1037 suggests that I just need to 'apt install libaio-dev', but I've done that and it doesn't help.
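For anyone comparing notes, a quick way to see which of the pieces mentioned above are actually visible from inside the conda environment is sketched below. It is only an illustration: the helper name `check_environment` is made up, and it merely asks Python whether the `deepspeed` and `mpi4py` modules can be located and whether the standard library's shared-library lookup can find a library named `aio` (the one `libaio-dev` relates to). It does not reproduce how DeepSpeed itself detects libaio when building its ops.

```python
import ctypes.util
import importlib.util


def check_environment(modules=("deepspeed", "mpi4py"), libraries=("aio",)):
    """Report which Python modules and shared libraries can be located.

    Returns a dict with 'module:<name>' -> bool and
    'library:<name>' -> resolved library name or None.
    The function name and key format are illustrative assumptions.
    """
    result = {}
    for mod in modules:
        try:
            found = importlib.util.find_spec(mod) is not None
        except (ImportError, ValueError):
            # find_spec can raise for oddly registered modules; treat as missing.
            found = False
        result[f"module:{mod}"] = found
    for lib in libraries:
        # find_library returns something like 'libaio.so.1', or None.
        result[f"library:{lib}"] = ctypes.util.find_library(lib)
    return result


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `library:aio` comes back as `None` even after `apt install libaio-dev`, that would suggest the environment is not searching the system library directories, which is the same kind of problem the `LD_LIBRARY_PATH` workaround for mpi4py points at.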
I'm still stuck and cannot for the life of me get the --deepspeed option working on 22.04. I'd truly appreciate any help. Thanks to all in advance!