GPU tools
This page is frequently updated and tested against recent kernel and driver versions. The current instructions were updated and tested on May 29, 2020 for the following system configuration:
Ubuntu 20.04 (Server)
Linux 5.6.10-rt5
nvidia 440.82
cuda_10.2.89
magma-2.5.3
Steps and commands may vary for other configurations.
Need help? Post questions on the RTC config chat room.
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nvidia-nouveau.conf
echo "options nouveau modeset=0" | sudo tee -a /etc/modprobe.d/blacklist-nvidia-nouveau.conf
Confirm the content of the new modprobe config file:
cat /etc/modprobe.d/blacklist-nvidia-nouveau.conf
blacklist nouveau
options nouveau modeset=0
Regenerate initramfs:
sudo update-initramfs -u
Then reboot.
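After the reboot, you can confirm that nouveau is no longer loaded; the command below should print nothing:
lsmod | grep nouveau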
Download and extract the NVIDIA driver installer:
cd $HOME
mkdir -p soft && cd soft
wget http://us.download.nvidia.com/XFree86/Linux-x86_64/440.82/NVIDIA-Linux-x86_64-440.82.run
chmod +x NVIDIA-Linux-x86_64-440.82.run
./NVIDIA-Linux-x86_64-440.82.run -x
cd ${HOME}/soft/NVIDIA-Linux-x86_64-440.82/
sudo IGNORE_PREEMPT_RT_PRESENCE=1 ./nvidia-installer
If prompted (depends on version), answer as follows:
- "Register kernel module" -> NO
- "Unable to find 32bit install" -> OK
- "Install 32 bit compatibility" -> NO
- "An incomplete installation of libglvnd was found. All of the essential libglvnd libraries are present, but one or more optional components are missing. Do you want to install a full copy of libglvnd? This will overwrite any existing libglvnd libraries." -> Install and Overwrite
- "Would you like to run the nvidia-xconfig utility to automatically update your X configuration file so that the NVIDIA X driver will be used when you restart X? Any pre-existing X configuration file will be backed up." -> NO
Check which version to use on the NVIDIA CUDA website, and follow the instructions for the runfile (local) installer:
cd ${HOME}/soft
wget https://developer.download.nvidia.com/compute/cuda/10.2/Prod/local_installers/cuda_10.2.89_440.33.01_linux.run
chmod +x cuda_10.2.89_440.33.01_linux.run
Run the installer:
sudo ./cuda_10.2.89_440.33.01_linux.run
Note: CUDA may refuse to install; it usually requires a gcc version a few releases older than the system's. See the gcc instructions below (2.1.1).
- Accept EULA
- Unselect nvidia driver (already installed)
- Select install
Note: add these lines to your profile file:
export PATH=$PATH:/usr/local/cuda-10.2/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-10.2/lib64
Note: to uninstall the CUDA Toolkit, run the uninstall script in /usr/local/cuda-10.2/bin
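To confirm the toolkit is reachable, you can re-source the profile and query the compiler (assuming the lines above went into ~/.profile):
source ~/.profile
nvcc --version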
Use this procedure if the system's gcc is too recent. This is the case on Ubuntu 20.04 (gcc suite v9, while CUDA 10.2 compiles with gcc suite up to v8). Replace the 9s below with the system version and the 8s with the CUDA-required version.
SYSTEM=9
CUDA_WANTS=8
sudo apt install gcc-$CUDA_WANTS g++-$CUDA_WANTS gfortran-$CUDA_WANTS gfortran
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-$SYSTEM $SYSTEM
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-$SYSTEM $SYSTEM
sudo update-alternatives --install /usr/bin/gfortran gfortran /usr/bin/gfortran-$SYSTEM $SYSTEM
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-$CUDA_WANTS $CUDA_WANTS
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-$CUDA_WANTS $CUDA_WANTS
sudo update-alternatives --install /usr/bin/gfortran gfortran /usr/bin/gfortran-$CUDA_WANTS $CUDA_WANTS
Switch the system to the $CUDA_WANTS version:
sudo update-alternatives --set g++ /usr/bin/g++-$CUDA_WANTS
sudo update-alternatives --set gcc /usr/bin/gcc-$CUDA_WANTS
sudo update-alternatives --set gfortran /usr/bin/gfortran-$CUDA_WANTS
After installing CUDA, revert:
sudo update-alternatives --auto g++
sudo update-alternatives --auto gcc
sudo update-alternatives --auto gfortran
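To see which compiler is currently selected at any point, the alternatives state can be inspected, for example:
gcc --version | head -n 1
update-alternatives --display gcc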
To run a given CUDA sample test:
cd /usr/local/cuda/samples/X_XXX/TestName
sudo make
./TestName
We recommend running the following samples to check that everything is fine:
1_Utilities/deviceQuery
1_Utilities/bandwidthTest
1_Utilities/p2pBandwidthLatencyTest
0_Simple/matrixMulCUBLAS
0_Simple/simpleMultiGPU
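These checks can also be scripted in one pass; here is a minimal sketch, assuming the samples live under /usr/local/cuda/samples as in the previous step:
for t in 1_Utilities/deviceQuery 1_Utilities/bandwidthTest 1_Utilities/p2pBandwidthLatencyTest 0_Simple/matrixMulCUBLAS 0_Simple/simpleMultiGPU; do
    cd /usr/local/cuda/samples/$t
    sudo make              # build the sample in place
    ./$(basename $t)       # run it
done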
Output for 0_Simple/matrixMulCUBLAS:
[Matrix Multiply CUBLAS] - Starting...
GPU Device 0: "GeForce RTX 2080 Ti" with compute capability 7.5
GPU Device 0: "GeForce RTX 2080 Ti" with compute capability 7.5
MatrixA(640,480), MatrixB(480,320), MatrixC(640,320)
Computing result using CUBLAS...done.
Performance= 5042.82 GFlop/s, Time= 0.039 msec, Size= 196608000 Ops
Computing result using host CPU...done.
Comparing CUBLAS Matrix Multiply with CPU results: PASS
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
NOTE: To run graphics examples:
sudo apt-get install libglu1-mesa libxi-dev libxmu-dev libglu1-mesa-dev freeglut3 freeglut3-dev
Install the MAGMA prerequisites (OpenBLAS, LAPACK, gfortran):
sudo apt-get install libopenblas-base libopenblas-dev
sudo apt-get install liblapack-dev
sudo apt install gfortran
Select the compute capability for the GPU(s) installed on your system; see the table below for examples. The deviceQuery CUDA sample will report these values for the GPUs in the system.
| GPU | Compute capability | Architecture |
|---|---|---|
| RTX A6000 | 8.6 (requires CUDA 11+) | Ampere |
| RTX 3080 Ti | 8.6 | Ampere |
| RTX 2080 Ti | 7.5 | Turing |
| GTX 1080 Ti | 6.1 | Pascal |
| GTX 980 Ti | 5.2 | Maxwell |
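For example, the capability reported by deviceQuery can be extracted directly (assuming the sample was built as described above):
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
./deviceQuery | grep "CUDA Capability"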
Install magma:
cd ${HOME}/soft
wget icl.utk.edu/projectsfiles/magma/downloads/magma-2.5.3.tar.gz
gunzip magma-2.5.3.tar.gz
tar -xvf magma-2.5.3.tar
cd magma-2.5.3
mkdir build
cd build
cmake -DGPU_TARGET='sm_75,sm_61,sm_52' ..
make -j $(nproc)
sudo make install
Add to .bashrc or .profile:
export LD_LIBRARY_PATH=/usr/local/magma/lib:$LD_LIBRARY_PATH
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/magma/lib/pkgconfig/
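After opening a new shell (or sourcing the file), the install can be sanity-checked; pkg-config should report the MAGMA version and the libraries should be present under the prefix:
pkg-config --modversion magma   # should print 2.5.3
ls /usr/local/magma/lib         # libmagma.so and libmagma_sparse.so should be present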
Alternative MAGMA build (2.5.1, against an MKL provided by miniconda). Install the prerequisites:
sudo apt install gfortran
export CONDA_ROOT=$HOME/miniconda3
export PATH=$CONDA_ROOT/bin:$PATH
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $CONDA_ROOT
conda install -y numpy mkl-include
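A quick check that the MKL headers are in place (this assumes mkl-include puts them under $CONDA_ROOT/include):
ls $CONDA_ROOT/include/mkl.h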
MAGMA is available here: http://icl.cs.utk.edu/magma/software/index.html
Download the tarball, extract it, and go into the new directory:
wget http://icl.cs.utk.edu/projectsfiles/magma/downloads/magma-2.5.1.tar.gz -O - | tar xz
cd magma-2.5.1
Create your own make.inc based on one of the provided examples (here make.inc.mkl-gcc):
cp make.inc-examples/make.inc.mkl-gcc make.inc
sed -i -e 's:/intel64: -Wl,-rpath=$(CUDADIR)/lib64 -Wl,-rpath=$(MKLROOT)/lib:' make.inc
Compile just the shared targets (and run the tests if you want):
export MKLROOT=$CONDA_ROOT
export CUDA_ROOT=/usr/local/cuda
export NCPUS=8
GPU_TARGET=sm_XX MKLROOT=$MKLROOT CUDADIR=$CUDA_ROOT make -j $NCPUS shared sparse-shared
Where:
- sm_XX matches the GPU compute capability. For example, sm_60 for a Tesla P100
- NCPUS is the number of CPUs in your system
To install libraries and include files in a given prefix, run:
sudo GPU_TARGET=sm_XX MKLROOT=$MKLROOT CUDADIR=$CUDA_ROOT make install prefix=/usr/local/magma
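An optional check that the install landed under the chosen prefix:
ls /usr/local/magma/lib/libmagma.so /usr/local/magma/include/magma_v2.h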
Add to .bashrc:
export LD_LIBRARY_PATH=/usr/local/magma/lib:$LD_LIBRARY_PATH
export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/magma/lib/pkgconfig/
Check that including magma will not override the gnu11 C standard adopted by cacao:
pkg-config --cflags magma
should return:
-DNDEBUG -DADD_ -fopenmp -I/usr/local/magma/include -I/usr/local/cuda/include
If CFLAGS includes "-std=c99", edit the magma.pc file to remove it.
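One way to strip the flag in place, assuming magma.pc was installed under the prefix used above:
sudo sed -i 's/ -std=c99//' /usr/local/magma/lib/pkgconfig/magma.pc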
You may also need to tweak LIBS:
pkg-config --libs magma
should return:
-L/usr/local/magma/lib -L/usr/local/cuda/lib64 -lmagma_sparse -lmagma -fopenmp -lopenblas -lcublas -lcusparse -lcudart -lcudadevrt
Note: tweaks to magma.pc will most likely be required on older Ubuntu distributions (16.04). Installation on Ubuntu 18.04 and 19.04 will likely produce the desired magma.pc output.
Get the link to the latest installer bash script from https://www.anaconda.com/distribution:
cd ${HOME}/soft
wget https://repo.anaconda.com/archive/Anaconda3-2019.03-Linux-x86_64.sh
bash Anaconda3-2019.03-Linux-x86_64.sh
# Accept the default install directory, and say ok to running conda init
# Restart terminal
# By default, anaconda puts the environment name in the shell prompt. To disable this, do
conda config --set changeps1 False
Set up TensorFlow; this is now very easy thanks to conda.
# Make sure CUDA and NVIDIA drivers are already installed
# Make a new conda environment containing Tensorflow for GPU
conda create --name tf_gpu tensorflow-gpu
Activate the new environment:
conda activate tf_gpu
Note: to deactivate it and go back to the base environment, do
conda deactivate
While in the environment, install other useful packages, e.g.
conda install astropy matplotlib keras ipython
Test the installation. Within Python, do:
import tensorflow as tf
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
and check that it lists all the devices.
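If conda installed TensorFlow 2.1 or later, the tf.Session API above is not available; an equivalent device check from the shell (assuming the tf_gpu environment is active):
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"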