Support CUDA 12.2 #161
conda builds and tests on CUDA 12.0.1 and 12.2.2 are segfaulting (build link). I don't see similar errors on recent builds on |
Those are indeed flaky tests; they don't always fail in the same way. Let's first try rerunning them. Based on the log for 12.2, almost all tests passed, and the one that failed is one of the very last ones and likely to pass soon.
Ok thank you! |
Thanks James & Peter! 🙏 Looks like that cleared things up and builds now pass. Though I am going to mark this "do not merge" for the moment while we work through issues in the other PRs.
* Move conda-only dependencies out of `pyproject` and `requirements` sections in `dependencies.yaml`
* Add `rmm`, `cudf`, and `cupy` matrices

Authors:
- Paul Taylor (https://github.com/trxcllnt)

Approvers:
- Bradley Dice (https://github.com/bdice)
- Peter Andreas Entschev (https://github.com/pentschev)
- Ray Douglass (https://github.com/raydouglass)

URL: rapidsai#169
The C++ code has been documented for some time, but the Doxygen build process was not included. This change introduces Doxygen builds and fixes all documentation warnings.

Authors:
- Peter Andreas Entschev (https://github.com/pentschev)

Approvers:
- Jake Awe (https://github.com/AyodeAwe)
- Charles Blackmon-Luca (https://github.com/charlesbluca)

URL: rapidsai#164
785b670 to 591af06
Looks like there is a conflict here. It is a generated file, so we can just regenerate it with |
fixed in 3dbdb61 |
…endencies.yaml (#174)

Contributes to rapidsai/build-planning#13.

Updates `update-version.sh` to correctly handle RAPIDS dependencies like `cudf-cu12==24.2.*`.

This also pulls in some dependency refactoring originally added in #161, which allows greater use of dependencies.yaml globs (and therefore less maintenance effort to support new CUDA versions).

### How I tested this

The portability of this updated `sed` command was tested here: rapidsai/cudf#14825 (comment).

In this repo, I ran the following:

```shell
./ci/release/update-version.sh '0.36.00'
git diff
./ci/release/update-version.sh '0.37.00'
git diff
```

Confirmed that the first `git diff` changed all the things I expected, and that the second one showed 0 changes.

Authors:
- James Lamb (https://github.com/jameslamb)
- Bradley Dice (https://github.com/bdice)
- https://github.com/jakirkham

Approvers:
- Jake Awe (https://github.com/AyodeAwe)
- https://github.com/jakirkham

URL: #174
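A rerun check in the same spirit can be sketched on a throwaway file. This is only an illustration: the `bump_rapids_pins` function, the package list, and the `sed` expression are assumptions made for the example, not the contents of `update-version.sh`.

```shell
# Hypothetical helper (not the real update-version.sh logic): rewrite pins
# such as `cudf-cu12==24.2.*` or `rmm==24.2.*` to a new major.minor version.
bump_rapids_pins() {
  pins_file="$1"
  new_version="$2"
  tmp_file="$(mktemp)"
  # Write to a temp file rather than using `sed -i`, whose syntax differs
  # between GNU and BSD sed.
  sed -E "s/^((cudf|rmm)(-cu[0-9]+)?)==[0-9]+\.[0-9]+\.\*/\1==${new_version}.*/" \
    "$pins_file" > "$tmp_file"
  mv "$tmp_file" "$pins_file"
}

# Illustrative input; the package names and versions are assumptions.
demo_file="$(mktemp)"
printf '%s\n' 'cudf-cu12==24.2.*' 'rmm==24.2.*' 'numpy>=1.21' > "$demo_file"

# First run: the RAPIDS pins should change, the unrelated pin should not.
bump_rapids_pins "$demo_file" "24.4"
first_pass="$(cat "$demo_file")"

# Second run with the same version: nothing should change.
bump_rapids_pins "$demo_file" "24.4"
second_pass="$(cat "$demo_file")"

if [ "$first_pass" = "$second_pass" ]; then
  echo "second run produced no changes"
fi
```

Matching an optional `-cuNN` suffix in the pattern is what lets one expression cover both conda-style names (`rmm`) and wheel-style names (`cudf-cu12`).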
I see a new test failure on the v11.8.0 conda tests. There are quite a few stacktraces and things that look like errors in the logs, but only a single unit test case failure. I think these snippets summarize it well:
NOTE: as of #174, |
Vyas restarted CI as we believe this is an unrelated flaky test |
I'm guessing it's because this PR was started before #145, but it needs updating for the wheel jobs. The conda outputs look sensible.
I was going to ask some questions about the necessity of the CUDA dependency in ucx-py because it's pure Python, but I realize those are less relevant here: ucxx is not a pure Python package, it has an actual C++ dependency, and we do need to compile against CUDA.
/merge |
Follow-up to #161

For all GitHub Actions configs, replaces uses of the `test-cuda-12.2` branch on `shared-workflows` with `branch-24.04`, now that rapidsai/shared-workflows#166 has been merged.

### Notes for Reviewers

This is part of ongoing work to build and test packages against CUDA 12.2 across all of RAPIDS.

For more details see:

* rapidsai/build-planning#7

*(created with `rapids-reviser`)*

Authors:
- James Lamb (https://github.com/jameslamb)

Approvers:
- Peter Andreas Entschev (https://github.com/pentschev)
- Ray Douglass (https://github.com/raydouglass)

URL: #191
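A branch swap like the one in #191 could be scripted roughly as follows. This is a sketch under assumptions: the `retarget_shared_workflows` function, the workflow file contents, and the `example-org` reference are all hypothetical, and it is not the tooling (`rapids-reviser`) that produced the actual change.

```shell
# Hypothetical helper: repoint reusable-workflow references to
# rapidsai/shared-workflows from one branch to another.
# Note: the "." in old_ref is treated as a regex wildcard, which is
# harmless for this illustration.
retarget_shared_workflows() {
  workflow_file="$1"
  old_ref="$2"
  new_ref="$3"
  tmp_file="$(mktemp)"
  # Only lines ending in "@<old_ref>" that point at shared-workflows match.
  sed "s|\(rapidsai/shared-workflows/[^@]*\)@${old_ref}\$|\1@${new_ref}|" \
    "$workflow_file" > "$tmp_file"
  mv "$tmp_file" "$workflow_file"
}

# Illustrative workflow snippet; job names and file names are assumptions.
demo_workflow="$(mktemp)"
printf '%s\n' \
  'jobs:' \
  '  build:' \
  '    uses: rapidsai/shared-workflows/.github/workflows/conda-cpp-build.yaml@test-cuda-12.2' \
  '  checks:' \
  '    uses: example-org/other-workflows/.github/workflows/checks.yaml@test-cuda-12.2' \
  > "$demo_workflow"

retarget_shared_workflows "$demo_workflow" "test-cuda-12.2" "branch-24.04"
cat "$demo_workflow"
```

Anchoring the pattern on the `rapidsai/shared-workflows/` prefix keeps references to other repositories untouched even if they happen to use the same branch name.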
`ucxx`'s CUDA compiler dependency added a `cuda-version` constraint at runtime that meant packages could only be installed with the same CUDA version `ucxx` was built with or newer. As a result, CUDA 12.2 builds of `ucxx` required that CUDA 12.2+ be used at runtime.

However, as we use [CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/index.html) in RAPIDS, we know that even if we built with CUDA 12.2, we can still use packages for other CUDA 12.x versions.

This was largely handled for other dependencies as part of PR #161. However, this wasn't handled for `ucxx`, likely in part because it was handling the CUDA compiler dependency differently from the other packages here. More history about `ucxx`'s CUDA compiler dependency is in PR #108.

This change aligns how the CUDA compiler is handled across packages to make this more consistent. It also ignores the CUDA compiler constraints added at runtime. In all cases the packages handle this themselves by requiring `cuda-version` (properly constrained), and where CUDA 11 is concerned they add `cudatoolkit`.

Thus this change should fix CI issues that were seen due to this overly constrained `cuda-version` by relaxing that constraint.

Authors:
- https://github.com/jakirkham

Approvers:
- Bradley Dice (https://github.com/bdice)
- Peter Andreas Entschev (https://github.com/pentschev)
- Ray Douglass (https://github.com/raydouglass)

URL: #195
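The effect of the over-tight constraint can be illustrated with a small version check. This is a toy model, not how conda resolves packages: the `cuda_version_allowed` function is hypothetical, and the bounds `>=12.2,<13` (inherited from the compiler) and `>=12.0,<13` (relaxed) are assumed forms of the constraints described above.

```shell
# Hypothetical check: succeed if a runtime CUDA version ($1, "major.minor")
# satisfies ">=$2" while staying within the same major version as $2.
cuda_version_allowed() {
  runtime_major="${1%%.*}"
  runtime_minor="${1#*.}"
  min_major="${2%%.*}"
  min_minor="${2#*.}"
  [ "$runtime_major" -eq "$min_major" ] && [ "$runtime_minor" -ge "$min_minor" ]
}

# Compare the assumed strict bound (>=12.2) with the assumed relaxed
# bound (>=12.0) for two runtime environments.
for runtime in 12.0 12.2; do
  if cuda_version_allowed "$runtime" 12.2; then strict=ok; else strict=rejected; fi
  if cuda_version_allowed "$runtime" 12.0; then relaxed=ok; else relaxed=rejected; fi
  echo "runtime CUDA $runtime: strict(>=12.2)=$strict relaxed(>=12.0)=$relaxed"
done
```

Under these assumptions, a CUDA 12.0 environment is rejected by the strict bound but accepted by the relaxed one, which matches the install failures the commit message attributes to the overly constrained `cuda-version`.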
Description
Notes for Reviewers
This is part of ongoing work to build and test packages against CUDA 12.2.2 across all of RAPIDS.
For more details see:
Planning a second round of PRs to revert these references back to a proper `branch-24.{nn}` release branch of `shared-workflows` once rapidsai/shared-workflows#166 is merged.

*(created with `rapids-reviser`)*