Basic installation instructions for local and artifact use mode.

Implements #1, supersedes #5.
JuliaGPU · Mar 10, 2020 · 86dc93d
1 parent 15c78af
commit 86dc93d

Showing 2 changed files with 91 additions and 36 deletions.
diff --git a/docs/src/installation/overview.md b/docs/src/installation/overview.md
@@ -1,58 +1,111 @@
 # [Overview](@id InstallationOverview)
 
 The Julia CUDA stack requires users to have a functional [NVIDIA
-driver](https://www.nvidia.com/Download/index.aspx) and matching [CUDA
-toolkit](https://developer.nvidia.com/cuda-downloads). For now, both of these components
-should be manually installed. If you are a Linux user, you should consider installing these
-dependencies using a package manager instead of downloading them from the NVIDIA homepage;
-refer to your distribution's documentation for more details.
+driver](https://www.nvidia.com/Download/index.aspx) and corresponding [CUDA
+toolkit](https://developer.nvidia.com/cuda-downloads). The former should be installed by you
+or your system administrator, while the latter can be automatically downloaded by Julia
+using the artifact subsystem.
 
-To make sure you have everything set-up, you can try executing some of the applications that
-the driver and toolkit provide. On Linux, you can verify driver availability by executing
-`nvidia-smi`, and you have installed CUDA successfully if you can execute `ptxas --version`.
 
 
-## CUDA discovery
+## Platform support
 
-Once you've installed the NVIDIA driver and CUDA toolkit, the Julia CUDA packages should
-automatically pick up your installation by means of the functionality in CUDAapi.jl. Some
-guidelines to make sure this works:
+All three major operating systems are supported: Linux, Windows and macOS. However, that
+support is subject to NVIDIA providing a CUDA toolkit for your system; consequently, macOS
+support might be deprecated soon.
 
-- CUDA driver: the driver library should be loadable with Libdl (e.g.,
-  `Libdl.dlopen("libcuda")`)
-- CUDA toolkit: the CUDA binaries should be on `PATH`
+Similarly, we support x86, ARM, PPC, ... as long as Julia supports the platform and there
+exists an NVIDIA driver and CUDA toolkit for it. The main development platform (and the
+only CI system), however, is x86_64 on Linux, so if you are using a more exotic
+combination there might be bugs.
 
-Alternatively, you can use the `CUDA_HOME` environment variable to point to an installation
-of the CUDA toolkit.
 
-To debug this, set `JULIA_DEBUG=CUDAapi` (or more generally `JULIA_DEBUG=all`) for details
-on which paths are probed. If you file an issue, always include this information.
 
+## NVIDIA driver
 
-### Multiple CUDA toolkits
+To use the Julia GPU stack, you need to install the NVIDIA driver for your system and GPU.
+You can find [detailed instructions](https://www.nvidia.com/Download/index.aspx) on the
+NVIDIA home page.
 
-Generally, multiple installed CUDA toolkits are no supported because this may lead to
-incompatible libraries being picked up. However, if you use the `CUDA_HOME` environment
-variable to point to an installation, all other discovery heuristics will be disabled. This
-should result in only that version of the CUDA toolkit being used, on the condition no other
-toolkit is present in the global environment (`PATH`, `LD_LIBRARY_PATH`).
+If you're using Linux you should always consider installing the driver through the package
+manager of your distribution. If that driver is out of date or does not support your GPU
+and you need to download a driver from the NVIDIA home page, similarly prefer a
+distribution-specific package (e.g., deb, rpm) instead of the generic runfile option.
 
+If you are using a shared system, ask your system administrator how to install or load
+the NVIDIA driver. Generally, you should be able to find and use the CUDA driver library,
+called `libcuda.so` on Linux, `libcuda.dylib` on macOS and `nvcuda64.dll` on Windows. You
+should also be able to execute the `nvidia-smi` command, which lists all available GPUs you
+have access to.
 
-## Version compatibility
+Finally, to be able to use all of the Julia GPU stack you need to have permission to profile
+GPU code. On Linux, that means loading the `nvidia` kernel module with the
+`NVreg_RestrictProfilingToAdminUsers=0` option configured (e.g., in `/etc/modprobe.d`).
+Refer to the [following
+document](https://developer.nvidia.com/nvidia-development-tools-solutions-ERR_NVGPUCTRPERM-permission-issue-performance-counters)
+for more information.
 
-You should always use a CUDA toolkit that is supported by your driver. That means that the
-toolkit version number should be lower or equal than the CUDA API version that is supported
-by your driver (only taking into account the major and minor part of the version number).
 
-Both these versions can be queried using the tools mentioned above, but you can also use the
-Julia packages:
 
-```julia
-julia> using CUDAdrv, CUDAnative
+## CUDA toolkit
 
-julia> CUDAdrv.version() # CUDA toolkit supported by the driver
-v"10.2.0"
+There are two different options to provide CUDA: either you [install it
+yourself](https://developer.nvidia.com/cuda-downloads) in a way that is discoverable by the
+Julia CUDA packages, or you let the packages download CUDA from artifacts. If you can use
+artifacts (i.e., you are using a supported platform and have no specific
+requirements), it is recommended to do so: The CUDA toolkit is tightly coupled to the NVIDIA
+driver, and compatibility is automatically taken into account when selecting an artifact to
+use.
 
-julia> CUDAnative.version() # CUDA toolkit installed
+
+### Artifacts
+
+Use of artifacts is the default option: Importing the Julia CUDA packages will automatically
+download CUDA upon first use of the API. You can inspect details about the process by
+enabling debug logging:
+
+```
+$ JULIA_DEBUG=CUDAnative julia
+
+julia> using CUDAnative
+
+julia> CUDAnative.version()
+┌ Debug: Trying to use artifacts...
+└ @ CUDAnative CUDAnative/src/bindeps.jl:52
+┌ Debug: Using CUDA 10.2.89 from an artifact at /home/tim/Julia/depot/artifacts/93956fcdec9ac5ea76289d25066f02c2f4ebe56e
+└ @ CUDAnative CUDAnative/src/bindeps.jl:108
 v"10.2.89"
 ```
+
+!!! note
+
+    Automatic installation of CUDA using artifacts is only supported by CUDAnative v3+ and CuArrays v2+.
+
+
+### Local installation
+
+If artifacts are unavailable for your platform, the Julia CUDA packages will look for a
+local CUDA installation using CUDAapi.jl:
+
+```
+julia> CUDAnative.version()
+┌ Debug: Trying to use artifacts...
+└ @ CUDAnative CUDAnative/src/bindeps.jl:52
+┌ Debug: Could not find a compatible artifact.
+└ @ CUDAnative CUDAnative/src/bindeps.jl:73
+
+┌ Debug: Trying to use local installation...
+└ @ CUDAnative CUDAnative/src/bindeps.jl:114
+┌ Debug: Found local CUDA 10.0.326 at /usr/local/cuda-10.0/targets/aarch64-linux, /usr/local/cuda-10.0
+└ @ CUDAnative CUDAnative/src/bindeps.jl:141
+v"10.0.326"
+```
+
+You might want to disallow use of artifacts, e.g., because an optimized CUDA installation is
+available for your system. You can do so by setting the environment variable
+`JULIA_CUDA_USE_BINARYBUILDER` to `false`.
+
+To troubleshoot discovery of a local CUDA installation, you can set `JULIA_DEBUG=CUDAapi`
+and see the various paths where CUDAapi.jl looks. By setting any of the `CUDA_HOME`,
+`CUDA_ROOT` or `CUDA_PATH` environment variables, you can guide the package to a specific
+directory.
diff --git a/docs/src/installation/troubleshooting.md b/docs/src/installation/troubleshooting.md
@@ -1,5 +1,6 @@
 # Troubleshooting
 
+
 ## CUDA toolkit does not contain `XXX`
 
 This means that you have an incomplete or missing CUDA toolkit, or that not all required
@@ -8,6 +9,7 @@ and fix your CUDA toolkit installation if it isn't. Else, if you installed CUDA
 nonstandard location, use the `CUDA_HOME` environment variable to direct Julia to that
 location.
 
+
 ## UNKNOWN_ERROR(999)
 
 If you encounter this error, there are several known issues that may be causing it: