Using --device mps results in RuntimeError: PyTorch is not linked with support for mps devices #2496

Closed
ngreve opened this issue Sep 29, 2023 · 1 comment

Comments


ngreve commented Sep 29, 2023

My System:

OS: ArchLinux
CPU: AMD Ryzen 7 3700X
GPU: AMD Radeon 6900XT
RAM: 16GB

The commands I've executed:

$ python3 -m venv vicuna_venv
$ source ./vicuna_venv/bin/activate
(vicuna_venv)$ pip3 install "fschat[model_worker,webui]"

(vicuna_venv)$ python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3 --device mps --load-8bit
/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")
You are using the default legacy behaviour of the <class 'transformers.models.llama.tokenization_llama.LlamaTokenizer'>. If you see this, DO NOT PANIC! This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thouroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
  0%|                                                                                                                                                                 | 0/2 [00:32<?, ?it/s]
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/fastchat/serve/cli.py", line 283, in <module>
    main(args)
  File "/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/fastchat/serve/cli.py", line 208, in main
    chat_loop(
  File "/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/fastchat/serve/inference.py", line 311, in chat_loop
    model, tokenizer = load_model(
                       ^^^^^^^^^^^
  File "/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/fastchat/model/model_adapter.py", line 236, in load_model
    model, tokenizer = adapter.load_compress_model(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/fastchat/model/model_adapter.py", line 82, in load_compress_model
    return load_compress_model(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/nico/prog/vicuna_venv/lib/python3.11/site-packages/fastchat/model/compression.py", line 187, in load_compress_model
    compressed_state_dict[name] = tmp_state_dict[name].to(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: PyTorch is not linked with support for mps devices

I've also tried deleting ~/.cache/pip beforehand, but I get the same error.

Starting with --device cpu (python3 -m fastchat.serve.cli --model-path lmsys/vicuna-7b-v1.3 --device cpu --load-8bit) works.
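
For context: the mps device is PyTorch's backend for Apple's Metal Performance Shaders and is only compiled into macOS builds, so Linux wheels reject any .to("mps") call with exactly this error. A minimal sketch (not from the issue, just standard PyTorch calls) for checking which backends a given install actually supports:

import torch

# MPS is Apple's Metal backend; is_built() is False in Linux wheels, which is
# what produces "PyTorch is not linked with support for mps devices".
print("MPS built:     ", torch.backends.mps.is_built())
print("MPS available: ", torch.backends.mps.is_available())

# A ROCm build of PyTorch exposes AMD GPUs through the cuda API instead,
# so this is the check that should pass on a Radeon 6900XT.
print("CUDA/ROCm available:", torch.cuda.is_available())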


ngreve commented Sep 29, 2023

I didn't see the "More Platforms and Quantization" section.
I got it running by following the instructions in #104 (comment): installing the PyTorch ROCm build and running sudo pacman -S rocm-opencl-runtime.
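
For reference, a quick sanity check (a sketch with standard PyTorch calls, not taken from #104) to confirm that the ROCm build is actually the one being used; ROCm wheels expose AMD GPUs through the cuda device API:

import torch

# torch.version.hip is a version string on ROCm builds and None otherwise.
print("HIP version:  ", torch.version.hip)
print("GPU available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device name:  ", torch.cuda.get_device_name(0))

With the ROCm build installed, the GPU is presumably selected with --device cuda rather than --device mps.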

ngreve closed this as completed Sep 29, 2023