CUDA12.6, gcc9.4, but "There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found" #272

Huilin-Li · 2024-11-03T06:57:29Z

Caution: Please only report your issue related to the installation on your local PC or macOS. If you can get the help message by colabfold_batch --help or run a test prediction successfully, your installation is successful. Requests or questions regarding ColabFold features should be directed to ColabFold repo's issues.

What is your installation issue?
I first ran colabfold_batch myfa0.fa myfa0_out --msa-only, which worked. However, running the same command without --msa-only then fails:

(/storage/shenhuaizhongLab/lihuilin/mycolabfold/localcolabfold/colabfold-conda) [lihuilin@ga40q08 apgfastas]$ colabfold_batch myfa0.fa myfa0_out
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1730616792.448210 2934203 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1730616792.452606 2934203 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2024-11-03 14:53:24,193 Running colabfold 1.5.5 (c21e1768d18e3608e6e6d99c97134317e7e41c75)

WARNING: You are welcome to use the default MSA server, however keep in mind that it's a
limited shared resource only capable of processing a few thousand MSAs per day. Please
submit jobs only from a single IP address. We reserve the right to limit access to the
server case-by-case when usage exceeds fair use. If you require more MSAs: You can
precompute all MSAs with `colabfold_search` or host your own API and pass it to `--host-url`

2024-11-03 14:53:24,471 Running on GPU
2024-11-03 14:53:26,236 Found 5 citations for tools or databases
2024-11-03 14:53:26,236 Query 1/30: aaalA (length 309)
2024-11-03 14:53:26,262 Loaded myfa0_out/aaalA.pickle
E1103 14:53:29.987652 2934203 cuda_dnn.cc:502] There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found
E1103 14:53:29.988284 2934203 cuda_dnn.cc:502] There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found
2024-11-03 14:53:30,205 Could not predict aaalA. Not Enough GPU memory? FAILED_PRECONDITION: DNN library initialization failed. Look at the errors above for more details.
2024-11-03 14:53:30,206 Query 2/30: aaavA (length 418)
2024-11-03 14:53:30,642 Loaded myfa0_out/aaavA.pickle
^Z
[2]+  Stopped                 colabfold_batch myfa0.fa myfa0_out


punit-jha123 · 2024-12-28T04:54:23Z

I am getting the same error! Did you manage to fix this?

My nvcc -V reports 12.1, and my gcc is 10.2.1-6 (Debian).

Huilin-Li · 2024-12-28T10:46:28Z

As I recall, the problem in my case was that I had not activated the CUDA environment on my HPC system when I installed localcolabfold.
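For anyone who wants to check the same thing before installing, here is a minimal sketch that reports what the current shell exposes about CUDA. It is not part of localcolabfold, and the helper name cuda_env_report is made up for illustration. It assumes that an activated CUDA environment puts nvcc on PATH and usually sets CUDA_HOME or adds a CUDA directory to LD_LIBRARY_PATH, which depends on how your cluster's modules are configured.

```python
import os
import shutil


def cuda_env_report(environ=None):
    """Summarize what the current shell exposes about CUDA.

    Hypothetical helper for illustration: it only inspects PATH and
    environment variables and does not touch the GPU or the driver.
    """
    env = os.environ if environ is None else environ
    ld_path = env.get("LD_LIBRARY_PATH", "")
    return {
        # Full path to nvcc if it is found on PATH, otherwise None.
        "nvcc": shutil.which("nvcc", path=env.get("PATH", "")),
        # Often set by CUDA modules; may legitimately be unset.
        "CUDA_HOME": env.get("CUDA_HOME"),
        # Library directories whose path mentions CUDA.
        "cuda_lib_dirs": [
            p for p in ld_path.split(os.pathsep) if "cuda" in p.lower()
        ],
    }


if __name__ == "__main__":
    for key, value in cuda_env_report().items():
        print(f"{key}: {value}")
```

If nvcc comes back as None and no CUDA library directories are listed, the CUDA environment is probably not active in that shell. That only suggests the cause I ran into; it does not prove it.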

Guillem-Roche · 2025-01-15T03:00:33Z

Hi, I'm working on an HPC cluster and I have the same problem. With a conda installation, I got to the point where this works:
colabfold_batch A0A023I7F4.fasta test_smallprot --msa-only
but without the --msa-only argument it gives the same error:

colabfold_batch A0A023I7F4.fasta test_smallprot
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1736909380.198550 2049982 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1736909380.203537 2049982 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-14 21:49:42,354 Running colabfold 1.5.5 (16536057f3041920fa7150439182c0affcc4c947)

WARNING: You are welcome to use the default MSA server, however keep in mind that it's a
limited shared resource only capable of processing a few thousand MSAs per day. Please
submit jobs only from a single IP address. We reserve the right to limit access to the
server case-by-case when usage exceeds fair use. If you require more MSAs: You can
precompute all MSAs with `colabfold_search` or host your own API and pass it to `--host-url`

2025-01-14 21:49:42,577 Running on GPU
2025-01-14 21:49:44,647 Found 5 citations for tools or databases
2025-01-14 21:49:44,647 Query 1/1: tr_A0A023I7F4_A0A023I7F4_HUMAN_Cytochrome_b_OS_Homo_sapiens_OX_9606_GN_CYTB_PE_3_SV_1 (length 380)
2025-01-14 21:49:44,660 Loaded test_smallprot/tr_A0A023I7F4_A0A023I7F4_HUMAN_Cytochrome_b_OS_Homo_sapiens_OX_9606_GN_CYTB_PE_3_SV_1.pickle
2025-01-14 21:49:48,196 Setting max_seq=512, max_extra_seq=5120
E0114 21:49:48.373238 2049982 cuda_dnn.cc:502] There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found
E0114 21:49:48.373853 2049982 cuda_dnn.cc:502] There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found
2025-01-14 21:49:48,382 Could not predict tr_A0A023I7F4_A0A023I7F4_HUMAN_Cytochrome_b_OS_Homo_sapiens_OX_9606_GN_CYTB_PE_3_SV_1. Not Enough GPU memory? FAILED_PRECONDITION: DNN library initialization failed. Look at the errors above for more details.
2025-01-14 21:49:48,382 Done

I tried loading a CUDA module first, but that didn't work. I also tried matching the cuDNN and CUDA versions to the TensorFlow version following this table, but I still get the same error.

Huilin-Li · 2025-01-15T10:53:33Z

Hi, I'm not sure about your HPC environment. Maybe you should make sure the environment has a GPU, CUDA, and a good internet connection when you install the tool.

AdamInTokyo · 2025-01-16T07:37:44Z

We encountered a similar issue: the first errors showed "Attempting to register factory for plugin cuDNN when one has already been registered", followed by errors with code 500 cudaErrorSymbolNotFound. After referencing this TensorFlow issues thread, we found that using TensorFlow version 2.16.1 with cuDNN version 8.9 worked for us with our 12.1 drivers and nvcc.
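To see which versions are actually installed in the environment that colabfold_batch runs from, a small sketch like the one below may help. The distribution names in PACKAGES are assumptions and should be adjusted to your setup. It only sees packages installed with Python metadata (for example via pip), so a cuDNN provided by conda or a system module will show as not installed.

```python
from importlib import metadata

# Distribution names are assumptions; adjust them to your environment.
PACKAGES = ("tensorflow", "jax", "jaxlib", "nvidia-cudnn-cu12")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No Python package metadata was found under this name.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the Python interpreter of the same conda environment that provides colabfold_batch, otherwise it reports on a different set of packages.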

Huilin-Li changed the title ~~Question:There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found~~ CUDA12.6, gcc9.4, but "There was an error before creating cudnn handle (500): cudaErrorSymbolNotFound : named symbol not found" Nov 5, 2024
