
Fix bug in example causing config.py to crash on computers with no CUDA-enabled GPUs #55

Merged
merged 1 commit into NVIDIA:main on Jul 11, 2023

Conversation

serhatgktp
Contributor

@serhatgktp serhatgktp commented Jul 3, 2023

When running NeMo Guardrails with the example configuration found in examples/llm/hf_pipeline_dolly, the following error occurs if the host machine does not have any CUDA-enabled GPUs:

    /Users/username/anaconda3/lib/python3.10/site-packages/langchain/llms/huggingface_pipeline.py:106 in from_model_id

        103
        104         cuda_device_count = torch.cuda.device_count()
        105         if device < -1 or (device >= cuda_device_count):
      ❱ 106             raise ValueError(
        107                 f"Got device=={device}, "
        108                 f"device is required to be within [-1, {cuda_device_count})"
        109             )

    ValueError: Got device==0, device is required to be within [-1, 0)

This happens because the following lines in config.py attempt to select the "first" CUDA-enabled GPU on the host machine, when in reality there aren't any:

    # Using the first GPU
    device = 0

    llm = HuggingFacePipeline.from_model_id(
        model_id=repo_id, device=device, task="text-generation", model_kwargs=params
    )

However, the script runs fine if we simply don't specify which GPU the initializer should use, in which case it falls back to the default value (device=-1, i.e. CPU). Thus, we can modify the example slightly so that it runs without crashing regardless of whether the host machine has a CUDA device:

    device = 0 if torch.cuda.device_count() else -1

The caveat is that the example looks slightly more complex, but I think this is a better approach than unconditionally setting device = -1 without checking for CUDA devices, since the model runs significantly slower on CPU when a GPU is actually available.
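For reference, the relevant section of config.py would then read roughly as follows. The repo_id and params values below are placeholders standing in for the ones defined earlier in the example:

    import torch
    from langchain.llms import HuggingFacePipeline

    # Placeholder values; the actual example defines its own repo_id and params
    repo_id = "databricks/dolly-v2-3b"
    params = {"temperature": 0.1, "max_length": 1024}

    # Use the first GPU if one is available, otherwise fall back to CPU (-1)
    device = 0 if torch.cuda.device_count() else -1

    llm = HuggingFacePipeline.from_model_id(
        model_id=repo_id, device=device, task="text-generation", model_kwargs=params
    )

This keeps the fast path when CUDA is available while degrading gracefully to CPU on machines without it.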

@serhatgktp serhatgktp changed the title Fix bug causing config.py to crash on computers with no CUDA-enabled GPUs Fix bug in example causing config.py to crash on computers with no CUDA-enabled GPUs Jul 5, 2023
@drazvan
Collaborator

drazvan commented Jul 11, 2023

Thanks for this @serhatgktp! Can you make sure the commit is signed (same as #51)? Feel free to force-push and then I can merge.

@drazvan drazvan self-assigned this Jul 11, 2023
@serhatgktp
Contributor Author

Hi @drazvan, the commit looks to be already signed on my end. Could you please check and let me know if you'd still like me to overwrite the commit?

Thanks!

[screenshot: the commit showing a GPG signature on the author's machine]

@drazvan
Collaborator

drazvan commented Jul 11, 2023

Your previous commit also had the GPG signature and showed as "Verified".

[screenshot: the previous commit with its GPG signature]

Notice the "Verified" badge to the right of the commit:

[screenshot: commit list showing the "Verified" badge]

@serhatgktp
Contributor Author

Today I learned. I'll take care of this and drop a ping - thanks!

@serhatgktp serhatgktp reopened this Jul 11, 2023
@drazvan drazvan merged commit 237c067 into NVIDIA:main Jul 11, 2023
3 checks passed