Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CUDA 12.2 #328

Merged
merged 22 commits into from
Feb 10, 2024
Merged

Conversation

jameslamb
Copy link
Member

Description

  • switches to CUDA 12.2.2 for building conda packages and wheels
  • adds new tests running against CUDA 12.2.2

Notes for Reviewers

This is part of ongoing work to build and test packages against CUDA 12.2.2 across all of RAPIDS.

For more details see:

Planning a second round of PRs to revert these references back to a proper branch-24.{nn} release branch of shared-workflows once rapidsai/shared-workflows#166 is merged.

(created with rapids-reviser)

@jameslamb jameslamb marked this pull request as ready for review January 11, 2024 17:16
@jameslamb jameslamb requested a review from a team as a code owner January 11, 2024 17:16
@madsbk madsbk added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Jan 11, 2024
@jameslamb jameslamb marked this pull request as draft January 11, 2024 19:41
@jameslamb jameslamb changed the title add CUDA 12.2 support for conda packages and wheels WIP: add CUDA 12.2 support for conda packages and wheels Jan 11, 2024
@jameslamb jameslamb changed the title WIP: add CUDA 12.2 support for conda packages and wheels WIP: use CUDA 12.2 for building and testing wheels Jan 11, 2024
@jameslamb jameslamb changed the title WIP: use CUDA 12.2 for building and testing wheels WIP: add CUDA 12.2 support for conda packages and wheels Jan 11, 2024
@jameslamb jameslamb changed the title WIP: add CUDA 12.2 support for conda packages and wheels WIP: (DO NOT MERGE) add CUDA 12.2 support for conda packages and wheels Jan 11, 2024
@jameslamb jameslamb changed the title WIP: (DO NOT MERGE) add CUDA 12.2 support for conda packages and wheels (DO NOT MERGE) add CUDA 12.2 support for conda packages and wheels Jan 11, 2024
@jameslamb jameslamb marked this pull request as ready for review January 11, 2024 22:51
@jakirkham jakirkham added feature request New feature or request conda DO NOT MERGE and removed improvement Improves an existing functionality labels Jan 13, 2024
@jameslamb jameslamb changed the base branch from branch-24.02 to branch-24.04 January 22, 2024 15:37
@jameslamb jameslamb changed the title (DO NOT MERGE) add CUDA 12.2 support for conda packages and wheels Support CUDA 12.2 Jan 25, 2024
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing cufile being pulled in arm 12.0 testing. I think we want to disable that, right? Note that it's pulling the 12.3 version of cufile (1.8.1.2) from the nvidia channel since it can't get it from cf.

@bdice bdice removed the DO NOT MERGE label Feb 9, 2024
@bdice
Copy link
Contributor

bdice commented Feb 9, 2024

I'm seeing cufile being pulled in arm 12.0 testing. I think we want to disable that, right? Note that it's pulling the 12.3 version of cufile (1.8.1.2) from the nvidia channel since it can't get it from cf.

I think I fixed this in 9aeab68, we'll need to check CI closely on this repo.

@bdice
Copy link
Contributor

bdice commented Feb 9, 2024

The aarch64 / libcufile part needed a bit more work. I took a stronger approach in f45ac7b, which removes libcufile-dev from aarch64 completely. I don't think we have a very good way to support "only aarch64 and CUDA 12.2+" in our build/test workflows yet, because we only build once with CEC enabled and we would need two aarch64 builds to support CUDA 12.0/12.1 without libcufile and 12.2+ with libcufile. I think it's best to wait until a later time to handle this -- possibly a new CUDA major version, or dropping CUDA 12.0/12.1 support in RAPIDS.

@bdice bdice requested a review from vyasr February 9, 2024 22:03
Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to think through exactly what cases are OK here and which aren't. kvikio dlopens libcufile, so in principle we ought to be able to put the runtime protections in place to avoid trying to load libcufile even if it is installed into an environment where it is not supported (CUDA 12.0, aarch64). We don't have to solve that in this PR (we can merge 12.2 support without aarch64 libcufile support) but I do think we should be able to enable this in a follow-up without waiting to drop CUDA 12.0/12.1. We can just allow installation of libcufile but make sure the library isn't loaded at runtime.

If we're OK with that result, can we just open an issue to track that as future work?

Copy link
Contributor

@vyasr vyasr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry meant to approve based on my previous comment.

@bdice
Copy link
Contributor

bdice commented Feb 10, 2024

Great. You're right, that sounds plausible. I hadn't considered that this could be a runtime guard because it's dlopen'd. #339

Merging this for now.

@bdice
Copy link
Contributor

bdice commented Feb 10, 2024

/merge

@rapids-bot rapids-bot bot merged commit 9ef242e into rapidsai:branch-24.04 Feb 10, 2024
33 checks passed
rapids-bot bot pushed a commit that referenced this pull request Feb 14, 2024
This PR closes #341.

The kvikio `INTERFACE_COMPILE_DEFINITIONS` were being set based on the packages available during the libkvikio conda build (e.g. CUDA 12.2 since #328), which might not be the same packages/versions as when libkvikio is actually being used (e.g. to build libcudf with CUDA 12.0). Because we built libkvikio with CUDA 12.2 and then tried to use it with CUDA 12.0 devcontainers, the build failed to find the cuFile Stream APIs that were introduced in CUDA 12.2.

This PR defers these definitions until the call to `find_package`, which will then use the exact cuFile features present (if cuFile is available at all) when building a package like cudf that depends on kvikio. The libkvikio example/test binary is built with the cuFile features available at build time, for use in the `libkvikio-tests` conda package. However, this test binary will still be compatible with a runtime where cuFile is unavailable or is version 12.0, as it is dlopen-ing the library and has runtime checks for the batch/stream features it tries to use.

I did local testing of this PR with cudf devcontainers. I tested both 12.0 and 12.2 to reproduce (and fix) the failure, and also tested clean builds of libcudf after removing `libcufile` (to test when cuFile is not found). All seems to work as intended.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Robert Maynard (https://github.com/robertmaynard)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #342
rapids-bot bot pushed a commit that referenced this pull request Feb 20, 2024
Follow-up to #328

For all GitHub Actions configs, replaces uses of the `test-cuda-12.2` branch on `shared-workflows`
with `branch-24.04`, now that rapidsai/shared-workflows#166 has been merged.

### Notes for Reviewers

This is part of ongoing work to build and test packages against CUDA 12.2 across all of RAPIDS.

For more details see:

* rapidsai/build-planning#7

*(created with `rapids-reviser`)*

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)

URL: #343
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
conda feature request New feature or request non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants