Having trouble getting your deep learning model to run on a GPU? Follow the instructions below.
These are step-by-step instructions for installing:
- CUDA 11.8
- cuDNN 8.6.0
- TensorFlow 2.12.*
- PyTorch 2.0
Note:
- You can skip TensorFlow or PyTorch if you don't use it.
- Ubuntu 16.04 or higher (64-bit)
- NVIDIA Graphics Card *
Note:
- I don't recommend trying to use the GPU on Windows; believe me, it's not worth the effort.
- TensorFlow officially supports only Ubuntu. However, the following instructions may also work on other Linux distros.
- * AMD cards don't have CUDA cores. CUDA is a proprietary framework created by NVIDIA and it works only on NVIDIA cards.
- Personally, I am using Zorin OS and it works fine.
- Python 3.8–3.11
- NVIDIA GPU drivers version 450.80.02 or higher.
- Miniconda (Recommended) *
Note:
- These instructions also cover installing the NVIDIA driver and Miniconda, in case you don't already have them.
- * Miniconda is the recommended approach for installing TensorFlow with GPU support. It creates a separate environment to avoid changing any software installed on your system. This is also the easiest way to install the required software, especially for the GPU setup.
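The Python range listed in the requirements can be checked from inside the interpreter you plan to use. This is only a minimal sketch, assuming the 3.8 to 3.11 range stated above; the function name `python_supported` is made up for this example.

```python
import sys

# Supported range taken from the requirements list above (Python 3.8 to 3.11).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 11)


def python_supported(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_PYTHON <= tuple(version[:2]) <= MAX_PYTHON


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    status = "supported" if python_supported() else "outside the 3.8 to 3.11 range"
    print(f"Python {major}.{minor} is {status}")
```

Run it with the same `python3` you intend to use inside the conda environment, since the system interpreter may be a different version.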
Check whether the driver is already installed by running the verification command at the end of this section.
If not, follow the steps below (two methods):
Method 1: Easy, but it sometimes fails (try method 2 if it doesn't work)
- Install any pending updates and all required packages
sudo apt update && sudo apt upgrade -y
sudo apt install build-essential libglvnd-dev pkg-config
- Install the NVIDIA driver
sudo apt install nvidia-driver-525
- The NVIDIA driver is now installed. Reboot your system
reboot
Method 2: Try this if method 1 doesn't work
- Go to NVIDIA Driver Downloads site: https://www.nvidia.com/download/index.aspx?lang=en-us
- Search for your GPU and download the driver. Remember to choose Linux 64-bit as the Operating System
- Blacklist nouveau
Note: nouveau does not work with CUDA and must be disabled
sudo bash -c "echo blacklist nouveau > /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
sudo bash -c "echo options nouveau modeset=0 >> /etc/modprobe.d/blacklist-nvidia-nouveau.conf"
- Remove old NVIDIA driver (optional)
Note: The desktop may temporarily run at a lower resolution after this step
sudo apt-get remove '^nvidia-*'
sudo apt autoremove
reboot
- Install any pending updates and all required packages
sudo apt update && sudo apt upgrade -y
sudo apt install build-essential libglvnd-dev pkg-config
- Install the driver:
Note: Your driver version may be newer than the one in these instructions; the following commands are only an example. Use Tab to autocomplete the file name.
- Stop the current display server
Note: For the smoothest installation
sudo telinit 3
- Enter terminal mode by pressing CTRL + ALT + F1, then log in with your username and password
- Navigate to the directory containing the driver
# Example
cd Downloads/
ls
# It must contain: NVIDIA-Linux-x86_64-5xx.x.x.run
- Give execution permission
# Example
sudo chmod +x NVIDIA-Linux-x86_64-5xx.x.x.run
- Run the installation
# Example
sudo ./NVIDIA-Linux-x86_64-5xx.x.x.run
- Follow the wizard, and search online if you are unsure about a prompt
Note: Usually you can just press Enter at every prompt
- The NVIDIA driver is now installed. Reboot your system
reboot
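Once the system is back up, it can be worth confirming that the nouveau blacklist from method 2 took effect. The following is a minimal sketch, assuming a Linux system that lists loaded kernel modules in `/proc/modules`; the function name `nouveau_loaded` is invented for this example.

```python
from pathlib import Path


def nouveau_loaded(modules_text):
    """Return True if a 'nouveau' entry appears in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first field of each line is the module name.
        if fields and fields[0] == "nouveau":
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        loaded = nouveau_loaded(proc_modules.read_text())
        print("nouveau is still loaded" if loaded else "nouveau is not loaded")
    else:
        print("/proc/modules not found; is this a Linux system?")
```

If nouveau is still reported as loaded, the blacklist file was probably not picked up, and the driver installer is likely to complain about it.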
Verification:
nvidia-smi
If you get output, the NVIDIA driver is installed. Go on to the next step.
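Beyond seeing that `nvidia-smi` prints something, you may want to compare the installed driver against the minimum version from the requirements (450.80.02). A rough sketch of that comparison, assuming `nvidia-smi` supports the `--query-gpu=driver_version` query; the helper names are hypothetical.

```python
import shutil
import subprocess

# Minimum driver version from the requirements list above.
MIN_DRIVER = "450.80.02"


def version_tuple(version):
    """Turn '525.105.17' into (525, 105, 17) for numeric comparison."""
    return tuple(int(part) for part in version.strip().split("."))


def driver_meets_minimum(version, minimum=MIN_DRIVER):
    """Return True if `version` is at least `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


def installed_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if result.returncode == 0 and lines else None


if __name__ == "__main__":
    version = installed_driver_version()
    if version is None:
        print("nvidia-smi not found or no driver reported")
    else:
        ok = driver_meets_minimum(version)
        print(f"Driver {version}: {'OK' if ok else 'older than ' + MIN_DRIVER}")
```

Comparing as integer tuples avoids the trap of string comparison, where "1000.0" would sort before "450.80.02".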
You can use the following commands to install Miniconda
Download:
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3-latest-Linux-x86_64.sh
Install:
- Run the installer:
sh Miniconda3-latest-Linux-x86_64.sh
- Press Enter to continue
- Press q to skip the License Agreement detail
- Type yes and press Enter
- Press Enter to confirm the installation location
- Reopen your terminal or:
source ~/.bashrc
- Disable automatic activation of the conda base environment
conda config --set auto_activate_base false
Verification:
conda -V
Note: Miniconda is a free minimal installer for conda. Conda is a package and environment manager that helps you install, update, and remove packages from the command line. You can use it to write your own packages and maintain different versions of them in separate environments.
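With automatic activation of base disabled, a new terminal starts with no conda environment active, so it helps to know which environment (if any) a script is running in. A small sketch, assuming conda sets the `CONDA_DEFAULT_ENV` variable on activation; `active_conda_env` is a name invented here.

```python
import os
import shutil


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None if none is active."""
    if environ is None:
        environ = os.environ
    return environ.get("CONDA_DEFAULT_ENV") or None


if __name__ == "__main__":
    if shutil.which("conda") is None:
        print("conda is not on PATH; reopen the terminal or source ~/.bashrc")
    env = active_conda_env()
    print(f"Active conda environment: {env if env else 'none'}")
```

The same check is handy later on, when the TensorFlow and PyTorch steps ask you to keep a specific environment activated.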
- The installation below is for CUDA Toolkit 11.8
- It automatically recognizes the distro and installs the appropriate version.
Download:
wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
Install:
- Run the command below. It will take a few minutes, so please be patient.
sudo sh cuda_11.8.0_520.61.05_linux.run --silent --toolkit
- Add CUDA to path:
echo 'export PATH=/usr/local/cuda-11.8/bin${PATH:+:${PATH}}' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda-11.8/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
- Reopen your terminal or:
source ~/.bashrc
Note: As with the driver, there are many other ways to install CUDA, but with this approach you can install and use multiple versions of CUDA by simply changing the CUDA version in the path (~/.bashrc).
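If `nvcc` is not found after reopening the terminal, the usual cause is that the two exports did not reach the current shell. The sketch below checks both variables, assuming the `/usr/local/cuda-11.8` install location used in the commands above; `path_contains` is a hypothetical helper.

```python
import os

# Directories added to ~/.bashrc in the step above.
CUDA_BIN = "/usr/local/cuda-11.8/bin"
CUDA_LIB = "/usr/local/cuda-11.8/lib64"


def path_contains(path_value, directory):
    """Return True if `directory` is one of the entries in a colon-separated path."""
    wanted = directory.rstrip("/")
    return any(entry.rstrip("/") == wanted for entry in path_value.split(":") if entry)


if __name__ == "__main__":
    checks = {
        "PATH": CUDA_BIN,
        "LD_LIBRARY_PATH": CUDA_LIB,
    }
    for variable, directory in checks.items():
        found = path_contains(os.environ.get(variable, ""), directory)
        print(f"{variable}: {directory} {'found' if found else 'MISSING'}")
```

When switching between CUDA versions, changing the two constants to the other version's directory gives the same check for that install.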
Verification:
nvcc --version
The output should report CUDA release 11.8.
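The release number can also be pulled out of the `nvcc --version` text programmatically, which is useful in setup scripts. This is a sketch only: it assumes the output contains a "release X.Y" fragment, and the function names are made up for illustration.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc --version output."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_cuda_release():
    """Run nvcc --version and return (major, minor), or None if nvcc is missing."""
    if shutil.which("nvcc") is None:
        return None
    result = subprocess.run(["nvcc", "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found on PATH")
    else:
        note = "(as expected)" if release == (11, 8) else "(expected 11.8)"
        print(f"CUDA toolkit release {release[0]}.{release[1]} {note}")
```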
The installation below is for cuDNN v8.6.0
- Go to this site: https://developer.nvidia.com/rdp/cudnn-archive
- You'll have to log in, answer a few questions then you will be redirected to download
- Select Download cuDNN v8.6.0 (October 3rd, 2022), for CUDA 11.x
- Select Local Installer for Linux x86_64 (Tar)
- Open a terminal and navigate to the directory containing the cuDNN tar file
- Extract the cuDNN package
tar -xvf cudnn-linux-x86_64-8.6.0.163_cuda11-archive.tar.xz
- Copy those files into the CUDA toolkit directory
sudo cp cudnn-*-archive/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cudnn-*-archive/lib/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
Note: You need a developer account to get cuDNN; there are no direct download links. Why? Ask NVIDIA.
Verification:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
The output should show CUDNN_MAJOR 8, CUDNN_MINOR 6, and CUDNN_PATCHLEVEL 0.
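The same header can be read from Python to get the cuDNN version as a tuple. A minimal sketch, assuming the header was copied to `/usr/local/cuda/include` as in the steps above and that it defines `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL`; `parse_cudnn_version` is a hypothetical name.

```python
import re
from pathlib import Path

# Header copied into the CUDA toolkit directory in the step above.
HEADER = Path("/usr/local/cuda/include/cudnn_version.h")


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn_version.h contents, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if HEADER.exists():
        print("cuDNN version:", parse_cudnn_version(HEADER.read_text()))
    else:
        print(f"{HEADER} not found")
```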
Please read the Requirements and the Preparation sections before continuing with the installation below.
-
Create a conda environment
- Create a new conda environment named tf with Python 3.9:
conda create --name tf python=3.9
- You can deactivate and activate it:
conda deactivate
conda activate tf
Note: Make sure it is activated for the rest of the installation.
-
GPU setup
- You can skip this section if you only run TensorFlow on the CPU.
- Make sure the NVIDIA GPU driver is installed. Use the following command to verify it:
nvidia-smi
- Then install CUDA and cuDNN with conda and pip.
conda install -c conda-forge cudatoolkit=11.8.0
pip install nvidia-cudnn-cu11==8.6.0.163
- Configure the system paths.
mkdir -p $CONDA_PREFIX/etc/conda/activate.d
echo 'CUDNN_PATH=$(dirname $(python -c "import nvidia.cudnn;print(nvidia.cudnn.__file__)"))' > $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CONDA_PREFIX/lib/:$CUDNN_PATH/lib' >> $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
source $CONDA_PREFIX/etc/conda/activate.d/env_vars.sh
-
Install TensorFlow
- TensorFlow requires a recent version of pip, so upgrade your pip installation to be sure you're running the latest version.
pip install --upgrade pip
- Then, install TensorFlow with pip.
pip install tensorflow==2.12.*
Note: Do not install TensorFlow with conda. It may not have the latest stable version. pip is recommended since TensorFlow is only officially released to PyPI.
- Verify the CPU setup:
If a tensor is returned, you've installed TensorFlow successfully.
python3 -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
- Verify the GPU setup:
If a list of GPU devices is returned, you've installed TensorFlow successfully.
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
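The two one-liners above can be folded into a single script that also copes with TensorFlow being absent from the current environment. This is a sketch rather than an official check; `tensorflow_gpu_report` is a name chosen for this example, and it relies only on `tf.config.list_physical_devices`, the same call used above.

```python
def tensorflow_gpu_report():
    """Return (tf_version, gpu_names), or None if TensorFlow is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return tf.__version__, [gpu.name for gpu in gpus]


if __name__ == "__main__":
    report = tensorflow_gpu_report()
    if report is None:
        print("TensorFlow is not installed in this environment")
    else:
        version, gpus = report
        print(f"TensorFlow {version}")
        if gpus:
            print(f"GPUs visible: {gpus}")
        else:
            print("No GPU visible; TensorFlow will run on the CPU")
```

An empty GPU list with a working driver often points at the library paths, so it is worth rechecking `LD_LIBRARY_PATH` in the activated environment before reinstalling anything.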
Please read the Requirements and the Preparation sections before continuing with the installation below.
Note: PyTorch comes with its own cuDNN, so you can skip the cuDNN installation if you use PyTorch only.
-
Create a conda environment
- Create a new conda environment named torch with Python 3.9:
conda create --name torch python=3.9
- You can deactivate and activate it:
conda deactivate
conda activate torch
Note: Make sure it is activated for the rest of the installation.
-
Install PyTorch
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# Check CUDA is available
python3 -c "import torch; print(torch.cuda.is_available())"
# CUDA device count
python3 -c "import torch; print(torch.cuda.device_count())"
# Current CUDA device
python3 -c "import torch; print(torch.cuda.current_device())"
# Get device 0 name
python3 -c "import torch; print(torch.cuda.get_device_name(0))"
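The four checks above can be gathered into one report, which is easier to paste into an issue. A minimal sketch using only the `torch.cuda` calls already shown plus `torch.version.cuda`; the function name `torch_gpu_report` is hypothetical, and the script falls back gracefully when PyTorch is not installed.

```python
def torch_gpu_report():
    """Return a dict describing PyTorch's view of CUDA, or None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    available = torch.cuda.is_available()
    count = torch.cuda.device_count() if available else 0
    return {
        "torch_version": torch.__version__,
        "cuda_available": available,
        # CUDA version PyTorch was built against, or None for CPU-only builds.
        "cuda_build": torch.version.cuda,
        "devices": [torch.cuda.get_device_name(i) for i in range(count)],
    }


if __name__ == "__main__":
    report = torch_gpu_report()
    if report is None:
        print("PyTorch is not installed in this environment")
    else:
        for key, value in report.items():
            print(f"{key}: {value}")
```

If `cuda_build` is None, a CPU-only build was installed, which usually means the `pytorch-cuda=11.8` part of the install command was not applied.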
This project is licensed under the MIT License. See LICENSE for more details.
- NVIDIA Driver: https://www.nvidia.com/download/index.aspx?lang=en-us
- CUDA Toolkit: https://developer.nvidia.com/cuda-toolkit-archive
- cuDNN: https://developer.nvidia.com/rdp/cudnn-archive
- TensorFlow: https://www.tensorflow.org/install/pip
- PyTorch: https://pytorch.org/get-started/locally/
Open an issue: New issue
Mail: pthung7102002@gmail.com