Update latest CUDA version for build/test to 12.5 #73
Comments
@ajschmidt8 - Please take a look at the possibility of updating this in July. |
I’ve spent a bit of time investigating and discussing this topic with others (including @jrhemstad). I’m coming to the conclusion that we may not need any driver updates, because the Production Branch driver we currently use (R550) is supported for CUDA Forward Compatibility with 12.5+ according to this table: https://docs.nvidia.com/deploy/cuda-compatibility/#id3
My local tests on a machine with R535, the LTS driver, also indicate compatibility should be fine. The key here is for us to remain on only Production Branch or LTS Branch drivers!
I propose a change to this plan: we should try to use CUDA 12.5 to build and test for a couple of repos (rmm and cudf), and if it works we can update to 12.5 instead of 12.4. I will file PRs to the miniforge-cuda, ci-imgs, and shared-workflows repositories to enable this test.
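The reasoning in this comment can be sketched as a small decision rule. This is only an illustration: the function name, the branch labels, and the rule itself are assumptions that simplify NVIDIA's compatibility table, and it ignores any other requirements Forward Compatibility may have. The only numbers used are the ones quoted in this issue (R550 is a Production Branch driver that natively supports up to CUDA 12.4).

```python
# Sketch only: the branch labels and the rule below are simplifications of
# NVIDIA's compatibility table, not a replacement for it.
FORWARD_COMPAT_BRANCHES = {"production", "lts"}


def compat_mode(branch_kind, native_max_cuda, toolkit_cuda):
    """Classify how a driver would run a given CUDA toolkit version.

    branch_kind     -- hypothetical label: "production", "lts", or anything else
    native_max_cuda -- (major, minor) the driver supports without a compat package
    toolkit_cuda    -- (major, minor) of the CUDA toolkit in the container
    """
    if toolkit_cuda <= native_max_cuda:
        return "native"
    if branch_kind in FORWARD_COMPAT_BRANCHES:
        return "forward-compat"
    return "unsupported"


# R550: Production Branch, natively supports up to CUDA 12.4 (per this issue).
print(compat_mode("production", (12, 4), (12, 5)))
print(compat_mode("production", (12, 4), (12, 4)))
```

Under this rule, a CUDA 12.5 container on R550 falls into the forward-compatibility case instead of requiring a driver update. The NVIDIA table linked above remains the authoritative source for which branch and toolkit pairs are actually supported.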
It is worth noting that CUDA 12.5.1 packages (in various formats) are now out Also |
Looking at this one...
It appears the builds are already pulling in the latest distro packages for CUDA 12.5.1. For example, this job from yesterday shows the following
Note that these match the new versions in CUDA 12.5U1. So it looks like this is done already. Though it would be nice to update this
Edit: Also it looks like the
Given this, will go ahead and check these boxes
cc @KyleFromNVIDIA (as we discussed this offline) |
Contributes to rapidsai/build-planning#73
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #1056
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #647
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #608
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #1617
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #193
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #407
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #749
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #1359
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) URL: #234
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) - https://github.com/jakirkham URL: #247
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) - https://github.com/jakirkham URL: #16314
This PR updates the latest CUDA build/test version from 12.2.2 to 12.5.1. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - https://github.com/jakirkham Approvers: - James Lamb (https://github.com/jameslamb) - https://github.com/jakirkham URL: #1405
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) - https://github.com/jakirkham URL: #5970
After updating everything to CUDA 12.5.1, use `shared-workflows@branch-24.08` again. Contributes to rapidsai/build-planning#73 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - James Lamb (https://github.com/jameslamb) - https://github.com/jakirkham
Contributes to rapidsai/build-planning#73
Proposes the following:
* adding CUDA 12.5 images
* PR builds:
  * +1 test job covering `(cuda=12.5, arch=arm64)`
* branch builds:
  * +1 test job covering `(cuda=12.5, arch=x86_64)`
  * -1 test job covering `(cuda=12.2, arch=arm64)`
context: rapidsai/build-planning#73 (comment)
## Notes for Reviewers
Per offline discussion with @raydouglass, this would be accompanied by a deprecation notice in the RAPIDS 24.08 release stating that the CUDA 12.2 images will be removed in some future release (future release not yet determined).
Authors: - James Lamb (https://github.com/jameslamb) - https://github.com/jakirkham Approvers: - Ray Douglass (https://github.com/raydouglass) - Kyle Edwards (https://github.com/KyleFromNVIDIA) - https://github.com/jakirkham URL: #689
I'm proposing switching to CUDA 12.5 images + Python 3.11 in the docs at https://docs.rapids.ai/deployment |
For completeness, this is largely complete. All that remains is...
Both of which are in the checklist in the OP. These typically happen after the release is complete. |
Ok the remaining webpage PRs are up. Thanks Ray! 🙏 Have noted those above and xref'd them here |
Closing as completed 🥳 Thanks everyone for all of your hard work shipping CUDA 12.5 this release! 👏 |
For future reference: |
It's a little hard to know what to do with that repo given cuML seems to vendor it. We've also struggled with updating it in the past (for example: rapidsai/gputreeshap#42). Maybe we should have a separate discussion on how it should be managed. |
With the recent CCCL update (rapidsai/rapids-cmake#607), we should now be able to build RAPIDS with CUDA versions 12.5 and older.
We have CUDA driver R550 in CI now, which only supports up to CUDA 12.4, so that's the latest version we could adequately test. CUDA 12.5 needs driver R555, which does not yet have a production branch (PB) or long-term support (LTS) release.
edit: R550 is a Production Branch driver, and therefore supports CUDA Forward Compatibility with CUDA 12.5 containers. This means we are able to support CUDA 12.5 (the latest version at the time of writing).
I propose to update CI images, shared workflows, devcontainers, etc. to replace CUDA 12.2 with CUDA 12.5. We would retain CI testing for CUDA 12.0 as a lower bound of 12.x.
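As a rough illustration of this proposal, the sketch below rewrites a list of CI matrix entries so that CUDA 12.2 entries move to 12.5.1 while the 12.0 lower bound is left untouched. The dictionary keys (`arch`, `cuda`), the function name, and the sample matrix are hypothetical and do not reflect the actual shared-workflows schema. The target version 12.5.1 is taken from the PRs linked in this issue.

```python
def bump_cuda(entries, old_minor="12.2", new_version="12.5.1"):
    """Return a new list where entries on CUDA `old_minor`.x use `new_version`.

    `entries` is a list of dicts with a hypothetical "cuda" key holding a
    version string such as "12.2.2". Other entries are copied unchanged.
    """
    migrated = []
    for entry in entries:
        entry = dict(entry)  # copy so the caller's matrix is not mutated
        major_minor = ".".join(entry["cuda"].split(".")[:2])
        if major_minor == old_minor:
            entry["cuda"] = new_version
        migrated.append(entry)
    return migrated


# Hypothetical matrix for illustration only.
matrix = [
    {"arch": "x86_64", "cuda": "12.0.1"},  # lower bound of 12.x, retained
    {"arch": "arm64", "cuda": "12.2.2"},   # replaced by 12.5.1
]
new_matrix = bump_cuda(matrix)
print(new_matrix)
```

Matching on the major.minor prefix instead of the full version string means a different 12.2 patch release would also be picked up, which seems closer to the intent of "replace CUDA 12.2 with CUDA 12.5".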
This will also align with PyTorch's upcoming CUDA 12.4 support (there have been a series of PRs adding CUDA 12.4 support like pytorch/builder#1720).
edit: We will upgrade to the latest CUDA, 12.5, instead of 12.4. I will separately address the CUDA compatibility questions between RAPIDS and PyTorch by working on our docs and release selector (see also: https://github.com/rapidsai/build-infra/issues/55).
Tasks
We can start this work now (not blocked by 12.5.1 updates above):
* cuda-version matrix entry for 12.5
* .github/workflows/ to use shared-workflows branch
* matrix_filter entries using 12.2 to 12.5
* rapidsai/docker (add CUDA 12.5 images docker#689)
Once all repos are migrated, merge the shared-workflows PR and then revert to the current default shared-workflows branch.
Docs changes (wait until all repos are migrated):