
Update Thrust to CCCL #206

Open · wants to merge 3 commits into base: develop
Conversation

@bernhardmgruber commented on Jun 26, 2024

The Thrust library no longer exists as a standalone project; it has been absorbed into CCCL. I therefore propose updating the Thrust-based version of BabelStream to account for this. Specifically, this PR:

  • will download CCCL from GitHub via FetchContent if instructed to do so by the user. Otherwise, it tries to find CCCL via CMake's find_package. Newer CUDA toolkits ship CCCL, and you can point CMake at it with the already available SDK_DIR. If no version of CCCL is found, CMake will fall back to the legacy Thrust and CUB targets.
  • recent versions of Thrust no longer allow including <thrust/device_vector.h> in code that is not compiled by nvcc. I therefore pulled the corresponding data members into the source file using the PIMPL pattern.
  • recent versions of Thrust (since CUDA 11.3) allow passing a pack of iterators to make_zip_iterator, which simplifies the benchmark implementation. As a result, CUDA versions older than 11.3 can no longer compile the benchmark.
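To illustrate the PIMPL workaround in the second bullet: the header only forward-declares an Impl struct, so no Thrust header is needed outside the .cu file. This is a minimal sketch with hypothetical names, using std::vector as a stand-in for thrust::device_vector; it is not the actual BabelStream code:

```cpp
#include <memory>
#include <vector>

// --- header (compilable by a host-only compiler) ---
class ThrustStream {
public:
  explicit ThrustStream(int n);
  ~ThrustStream();            // must be defined where Impl is complete
  void copy();                // b = a
  double sumB() const;        // helper for this sketch only
private:
  struct Impl;                // holds the (device) vectors
  std::unique_ptr<Impl> impl;
};

// --- implementation file (the .cu file in the real code) ---
struct ThrustStream::Impl {
  std::vector<double> a, b;   // thrust::device_vector<double> in the real code
};

ThrustStream::ThrustStream(int n)
    : impl(new Impl{std::vector<double>(n, 1.0), std::vector<double>(n, 0.0)}) {}
ThrustStream::~ThrustStream() = default;

void ThrustStream::copy() { impl->b = impl->a; }

double ThrustStream::sumB() const {
  double s = 0.0;
  for (double x : impl->b) s += x;
  return s;
}
```

The key detail is that the destructor is declared in the header but defined in the implementation file, where Impl is a complete type, so std::unique_ptr can delete it.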

I also want to emphasize that using a recent version of CCCL is crucial, since CCCL continually improves the performance of Thrust algorithms. However, for comparability of benchmarks, it makes sense to default to the CCCL version shipped with the CUDA SDK rather than introduce another variable.
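The make_zip_iterator change in the last bullet looks roughly like this (an illustrative fragment, not the exact PR diff; a, b, and c are assumed to be device vectors):

```cpp
// CUDA < 11.3: iterators had to be wrapped in a thrust::tuple first.
auto old_begin = thrust::make_zip_iterator(
    thrust::make_tuple(a.begin(), b.begin(), c.begin()));

// CUDA >= 11.3: the iterators can be passed directly as a pack.
auto new_begin = thrust::make_zip_iterator(a.begin(), b.begin(), c.begin());
```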

I tried configuring and building thrust-stream on my Ubuntu 24.04 (default nvcc 12.5, default g++ 13.2) from a <git_clone>/build directory:

cmake .. -DMODEL=thrust -DCMAKE_CUDA_COMPILER=nvcc -DCUDA_ARCH=86 -DFETCH_CCCL=ON && make && ./thrust-stream
cmake .. -DMODEL=thrust -DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.5/bin/nvcc -DCUDA_ARCH=86 -DSDK_DIR=/usr/local/cuda-12.5/lib64/cmake && make && ./thrust-stream
cmake .. -DMODEL=thrust -DCMAKE_CUDA_COMPILER=/opt/nvidia/hpc_sdk/Linux_x86_64/24.5/cuda/12.4/bin/nvcc -DCUDA_ARCH=86 -DSDK_DIR=/opt/nvidia/hpc_sdk/Linux_x86_64/24.5/cuda/12.4/lib64/cmake && make && ./thrust-stream
cmake .. -DMODEL=thrust -DCMAKE_CUDA_COMPILER=/usr/local/cuda-11.4/bin/nvcc -DCUDA_ARCH=86 -DSDK_DIR=/usr/local/cuda-11.4/include -DCMAKE_CUDA_HOST_COMPILER=g++-10 && make && ./thrust-stream

So fetching CCCL seems to work, using CCCL from a CUDA toolkit works, using CCCL from an HPC SDK works, and using legacy Thrust and CUB from an old CUDA SDK works.

I did not test whether rocThrust works.

Comment on lines -78 to -80
thrust_create_target(Thrust${BACKEND}
HOST CPP
DEVICE ${BACKEND})

These lines switch between the different Thrust backends, and this feature is lost in the new version. By default, CCCL uses the CUDA backend for Thrust, but it looks like BabelStream may want to use TBB or OMP.

To keep this functionality, force-set the cache variable CCCL_THRUST_DEVICE_SYSTEM to the desired backend before any find_package/CPM calls.

Force-setting that variable tells the CCCL CMake package to use the desired device backend for the CCCL::Thrust target that gets linked into CCCL::CCCL. See this for details. This is a convenience for the common case where only one backend is used per configure. If multiple Thrust targets with different backends are needed in the same build, let me know; there's a different approach that can be used.

Also, paths based on CMAKE_SOURCE_DIR are fragile. If BabelStream is included in another project (via add_subdirectory, CPM, etc.), that variable points to the top-level project's source dir, not BabelStream's. Use the project-specific variable BabelStream_SOURCE_DIR instead.

The include call should wrap its argument in quotes. Quoting paths is best practice in CMake to support spaces, etc.

Finally, rather than manually checking find_package and falling back to CPMAddPackage, it looks like CPMFindPackage will do this for us:

The function CPMFindPackage will try to find a local dependency via CMake's find_package and fallback to CPMAddPackage, if the dependency is not found.

All together, you're looking at something like:

set(MIN_CCCL_VERSION 2.4.0)
# Tell CCCL's package to configure Thrust using the desired backend:
set(CCCL_THRUST_DEVICE_SYSTEM ${BACKEND} CACHE STRING "" FORCE)

# CPMFindPackage will:
# 1. Attempt to locate a local installation of CCCL.
#    Set CCCL_DIR to the directory containing `cccl-config.cmake` to force a
#    specific installation to be used.
# 2. If no local package is found, the requested version will be downloaded
#    from GitHub and configured using CPMAddPackage.
include("${BabelStream_SOURCE_DIR}/src/thrust/CPM.cmake")
CPMFindPackage(
  NAME CCCL
  GITHUB_REPOSITORY nvidia/cccl
  GIT_TAG v${MIN_CCCL_VERSION}
)
register_link_library(CCCL::CCCL)

@bernhardmgruber (Author)

Thank you for the review! I now set CCCL_THRUST_DEVICE_SYSTEM before finding and fetching CCCL.

As for CPM, I decided to revert to FetchContent to keep the CMake setup simpler.
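A FetchContent-based setup along those lines might look like this; a sketch under assumptions (the version tag and target name are illustrative), not the exact code in the PR:

```cmake
include(FetchContent)
FetchContent_Declare(
  CCCL
  GIT_REPOSITORY https://github.com/NVIDIA/cccl.git
  GIT_TAG        v2.4.0            # assumed minimum version, as in the earlier snippet
)
FetchContent_MakeAvailable(CCCL)

# CCCL's CMake package exports CCCL::CCCL, which bundles Thrust, CUB, and libcudacxx.
register_link_library(CCCL::CCCL)
```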

Let me know if anything is still missing, thx!

@tomdeakin tomdeakin changed the base branch from main to develop June 26, 2024 19:51
Recent changes in Thrust, for the CUDA backend, no longer allow including `<thrust/device_vector.h>` into a translation unit that is not compiled by a CUDA compiler. As a workaround, the vectors containing the data are moved from the header into the benchmark implementation file.
@bernhardmgruber force-pushed the thrust_fixes branch 2 times, most recently from 05afc2e to eafbf41 on June 26, 2024 20:18
@bernhardmgruber (Author)

@gonzalobg I would appreciate your input on this PR. Do these changes break any of your workflows?

@bernhardmgruber bernhardmgruber marked this pull request as ready for review July 1, 2024 16:07
@bernhardmgruber (Author)

Ping. @tomdeakin, this PR is really important to enable users to use CCCL instead of the legacy Thrust library.

@tomdeakin (Contributor)

We are planning a rethink of the CUDA and Thrust models generally with @gonzalobg. To manage expectations on timescales: we hope to release v6 in time for Supercomputing, but that depends on how much progress we make on the other changes for v6.

@bernhardmgruber (Author)

Thank you for the answer! I knew @gonzalobg was somehow involved, but I didn't know the plan. In any case, I hope v6 will support Thrust from CCCL, however that is done.
