gpu is not detected #479

Closed
ena2016 opened this issue May 26, 2022 · 2 comments
Labels
cuda docker An issue with Docker

Comments


ena2016 commented May 26, 2022

Hi all,

I believe this is a simple fix, but I am stuck on the "Run run_docker.py" step. The input and output are below.

I'm running Linux Mint. This worked until the recent NVIDIA driver update, after which I applied the workaround of adding the new keys to the Dockerfile.

I have an RTX 3090 GPU and have tried every GPU enumeration value from 0 to all for "--gpu_devices=XX".

Any help is appreciated.

Input:
python3 docker/run_docker.py --fasta_paths=/home/eric/alphafold/SIVmac239_Env.fasta --max_template_date=None --model_preset=multimer --data_dir=/media/eric/NewCupcake --docker_user=0 --output_dir=/home/eric/alphafold/tmp/alphafold --gpu_devices=all

Output:
`I0526 07:57:35.722631 139808385271616 run_docker.py:113] Mounting /home/eric/alphafold -> /mnt/fasta_path_0
I0526 07:57:35.722714 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/uniref90 -> /mnt/uniref90_database_path
I0526 07:57:35.722770 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/mgnify -> /mnt/mgnify_database_path
I0526 07:57:35.722814 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake -> /mnt/data_dir
I0526 07:57:35.722855 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/pdb_mmcif/mmcif_files -> /mnt/template_mmcif_dir
I0526 07:57:35.722901 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/pdb_mmcif -> /mnt/obsolete_pdbs_path
I0526 07:57:35.722949 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/uniprot -> /mnt/uniprot_database_path
I0526 07:57:35.722995 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/pdb_seqres -> /mnt/pdb_seqres_database_path
I0526 07:57:35.723045 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/uniclust30/uniclust30_2018_08 -> /mnt/uniclust30_database_path
I0526 07:57:35.723094 139808385271616 run_docker.py:113] Mounting /media/eric/NewCupcake/bfd -> /mnt/bfd_database_path
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 268, in _raise_for_status
    response.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http+docker://localhost/v1.41/containers/8fcc849fe7dc1991af9a8cec8707ee9b088c55c4fd52e02e849213de39a9cdac/start

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "docker/run_docker.py", line 264, in <module>
    app.run(main)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "docker/run_docker.py", line 234, in main
    container = client.containers.run(
  File "/usr/local/lib/python3.8/dist-packages/docker/models/containers.py", line 818, in run
    container.start()
  File "/usr/local/lib/python3.8/dist-packages/docker/models/containers.py", line 404, in start
    return self.client.api.start(self.id, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/docker/utils/decorators.py", line 19, in wrapped
    return f(self, resource_id, *args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/docker/api/container.py", line 1111, in start
    self._raise_for_status(res)
  File "/usr/local/lib/python3.8/dist-packages/docker/api/client.py", line 270, in _raise_for_status
    raise create_api_error_from_http_exception(e)
  File "/usr/local/lib/python3.8/dist-packages/docker/errors.py", line 31, in create_api_error_from_http_exception
    raise cls(e, response=response, explanation=explanation)
docker.errors.APIError: 500 Server Error for http+docker://localhost/v1.41/containers/8fcc849fe7dc1991af9a8cec8707ee9b088c55c4fd52e02e849213de39a9cdac/start: Internal Server Error ("could not select device driver "nvidia" with capabilities: [[gpu]]")`

Augustin-Zidek added the docker and cuda labels May 27, 2022
@tomwardio
Member

Hi @ena2016! Thanks for reporting. It looks like there's an issue with your NVIDIA Docker installation; maybe some of the suggestions in this NVIDIA issue will help?

Please do re-open if you have any issues with AFOS after resolving the above!
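
One way to check the Docker GPU setup independently of AlphaFold is to start a throwaway container with GPU access and run `nvidia-smi` in it. The sketch below is only illustrative: the helper names are hypothetical, and the image tag in the usage comment is an assumption to be replaced with any CUDA image you have available. If it fails with the same `could not select device driver "nvidia"` message, the problem most likely sits in the NVIDIA container setup on the host (for example, the NVIDIA Container Toolkit missing, or Docker not restarted after installing it) rather than in `run_docker.py`.

```python
import shutil
import subprocess
from typing import List


def gpu_smoke_test_command(image: str) -> List[str]:
    """Build a `docker run` command that requests all GPUs and runs nvidia-smi."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


def run_gpu_smoke_test(image: str) -> bool:
    """Return True if a container started with GPU access can run nvidia-smi."""
    if shutil.which("docker") is None:
        print("docker CLI not found on PATH")
        return False
    result = subprocess.run(
        gpu_smoke_test_command(image), capture_output=True, text=True
    )
    # Show whatever Docker reported so a driver error, if any, is visible.
    print(result.stdout or result.stderr)
    return result.returncode == 0


# Usage (assumption: replace the tag with a CUDA image that exists for you):
# run_gpu_smoke_test("nvidia/cuda:11.0-base")
```

Keeping the command builder separate from the runner makes it easy to print the exact command and paste it into a shell when reporting results.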

@mohsenumn

If the issue persists for you, try the following:

pip install --upgrade "jax[cuda11_pip]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
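
Before and after the upgrade, it can help to confirm whether JAX actually sees a GPU in the environment where it runs. A minimal sketch, assuming `jax` is importable there; the helper name is hypothetical. Note that this only applies once the container starts, while the traceback in the original report fails earlier, at the Docker level.

```python
from typing import List


def visible_gpu_devices() -> List[str]:
    """Return the GPU devices JAX can see, or an empty list if there are none."""
    try:
        import jax
    except ImportError:
        # jax is not installed in this environment.
        return []
    try:
        return [str(device) for device in jax.devices("gpu")]
    except RuntimeError:
        # JAX raises RuntimeError when no GPU backend is available.
        return []


print(visible_gpu_devices())
```

An empty list means JAX has no usable GPU backend in that environment and would fall back to the CPU.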
