Support CUDA 12.2 #672

Merged
merged 23 commits into from
Feb 9, 2024
Commits
d74e254
use CUDA 12.2 for building and testing wheels and conda packages
jameslamb Jan 11, 2024
9ba0dea
update dependencies.yaml
jameslamb Jan 11, 2024
44ea0f4
remove 1.20 env file
jameslamb Jan 11, 2024
b311a8b
Merge branch 'branch-24.04' into test-cuda-12.2
jameslamb Jan 22, 2024
b03f694
Merge branch 'branch-24.04' into test-cuda-12.2
jameslamb Jan 22, 2024
ffb0c47
Merge branch 'branch-24.04' into test-cuda-12.2
jameslamb Jan 22, 2024
a856724
Merge branch 'branch-24.04' into test-cuda-12.2
jameslamb Jan 23, 2024
7176701
Merge branch 'test-cuda-12.2' of github.com:jameslamb/cucim into test…
jameslamb Jan 23, 2024
ccdcae9
Merge branch 'branch-24.04' into test-cuda-12.2
jakirkham Jan 24, 2024
d4cab19
try adding cuda_version to py_build
jameslamb Jan 24, 2024
03e2253
Merge branch 'test-cuda-12.2' of github.com:jameslamb/cucim into test…
jameslamb Jan 24, 2024
651adce
Pin `cuda-version` during `*cucim` install
jakirkham Jan 25, 2024
38856c4
Merge rapidsai/branch-24.04 into jameslamb/test-cuda-12.2
jakirkham Jan 29, 2024
2bed0b3
Drop `cuda-version` install workaround
jakirkham Jan 29, 2024
a378184
Remove leftover CUDA 11.8 workaround attempt
jakirkham Jan 29, 2024
e075700
Run ci/release/update-version.sh 24.04.00.
bdice Jan 29, 2024
3081630
Merge rapidsai/branch-24.04 into jameslamb/test-cuda-12.2
jakirkham Jan 30, 2024
e703868
Merge branch 'branch-24.04' into test-cuda-12.2
jakirkham Jan 30, 2024
b318986
Ignore run-exports from CUDA 12 compiler.
bdice Feb 6, 2024
eca560e
Loosen run-exports from -dev libraries.
bdice Feb 6, 2024
6e837c9
Merge branch 'branch-24.04' into test-cuda-12.2
bdice Feb 6, 2024
b56a9ac
Add cuda-cudart-dev with run-exports ignored.
bdice Feb 8, 2024
cd71cc9
Merge branch 'test-cuda-12.2' of github.com:jameslamb/cucim into test…
bdice Feb 8, 2024
1 change: 1 addition & 0 deletions ci/test_python.sh
@@ -35,6 +35,7 @@ rapids-print-env
 rapids-mamba-retry install \
   --channel "${CPP_CHANNEL}" \
   --channel "${PYTHON_CHANNEL}" \
+  "cuda-version=${RAPIDS_CUDA_VERSION%.*}" \
   "libcucim=${RAPIDS_VERSION_NUMBER}" \
   "cucim=${RAPIDS_VERSION_NUMBER}"
Member

After discussing this offline, we determined that the CUDA 11.8 build was failing because packages were unexpectedly being upgraded to CUDA 12.3 in this step.

To try to fix this, I have pinned cuda-version while installing libcucim and cucim. That appears to resolve the upgrade issue and allows the tests to pass.

That said, we didn't expect to need a cuda-version pin here. It may deserve some investigation of its own (with possible follow-up here and in other RAPIDS projects).
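For readers unfamiliar with the `${RAPIDS_CUDA_VERSION%.*}` expansion used in the pin, here is a minimal sketch of what it does. The helper name `cuda_major_minor` is hypothetical and not part of this PR, and the sketch assumes the variable holds a three-component version such as `11.8.0`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): reduce a CUDA version to major.minor.
cuda_major_minor() {
  local full="$1"
  # "%.*" removes the shortest trailing match of ".*",
  # i.e. the last dot-separated component of the version.
  echo "${full%.*}"
}

cuda_major_minor "11.8.0"   # prints 11.8
cuda_major_minor "12.2.2"   # prints 12.2
```

Note that the expansion always drops exactly one component, so a two-component input such as `11.8` would become `11`. The pin therefore only means "major.minor" if the input has three components.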

Member

Could this be the root cause of what we see here? conda-forge/cupy-feedstock#247 (comment)

Member

With cuda-version added to cupy in conda-forge/cupy-feedstock#249, I think we can now try dropping the cuda-version pin.

Suggested change
-rapids-mamba-retry install \
-  --channel "${CPP_CHANNEL}" \
-  --channel "${PYTHON_CHANNEL}" \
-  "cuda-version=${RAPIDS_CUDA_VERSION%.*}" \
-  "libcucim=${RAPIDS_VERSION_NUMBER}" \
-  "cucim=${RAPIDS_VERSION_NUMBER}"
+rapids-mamba-retry install \
+  --channel "${CPP_CHANNEL}" \
+  --channel "${PYTHON_CHANNEL}" \
+  "libcucim=${RAPIDS_VERSION_NUMBER}" \
+  "cucim=${RAPIDS_VERSION_NUMBER}"

Member Author

I agree! Thanks @jakirkham

Member

For posterity: when we saw the issue previously (before adding the cuda-version workaround above), cuda-version=11.8 did appear in the specs of the environment update on CI:

Transaction

  Prefix: /opt/conda/envs/test

  Updating specs:

   - gputil[version='>=1.4.0']
   - cuda-version=11.8
   - imagecodecs[version='>=2021.6.8']
   - matplotlib-base
   - openslide-python[version='>=1.3.0']
   - pip
   - pooch[version='>=1.6.0']
   - psutil[version='>=5.8.0']
   - pytest-cov[version='>=2.12.1']
   - pytest-lazy-fixture[version='>=0.6.3']
   - pytest-xdist
   - pytest[version='>=6.2.4']
   - python=3.10
   - tifffile[version='>=2022.7.28']

In other words, the solver recognizes that we explicitly requested cuda-version with a specific version constraint.

Despite this, the solver ignores the constraint and upgrades cuda-version anyway, later in the same CI log:

  - cuda-version         11.8  h70ddcb2_2                       conda-forge          Cached
  + cuda-version         12.3  h32bc705_2                       conda-forge            21kB
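A check along the following lines could catch this kind of silent upgrade in CI. This is only a sketch: the function names `planned_cuda_version` and `check_cuda_pin` are hypothetical, and it assumes the transaction log keeps the column layout shown above (`+`, package name, version, ...).

```shell
#!/usr/bin/env bash
# Hypothetical check (not part of the PR): read a solver transaction log on
# stdin and print the cuda-version the solver plans to install, if any.
planned_cuda_version() {
  # Assumed line shape: "  + cuda-version   12.3  h32bc705_2  conda-forge  21kB"
  awk '$1 == "+" && $2 == "cuda-version" { print $3; exit }'
}

# Fail if the planned cuda-version differs from the expected pin.
check_cuda_pin() {
  local expected="$1" planned
  planned="$(planned_cuda_version)"
  if [[ -n "${planned}" && "${planned}" != "${expected}" ]]; then
    echo "cuda-version changed: expected ${expected}, solver planned ${planned}" >&2
    return 1
  fi
}

# Usage with the two log lines quoted in this comment:
printf '%s\n' \
  '  - cuda-version         11.8  h70ddcb2_2   conda-forge   Cached' \
  '  + cuda-version         12.3  h32bc705_2   conda-forge   21kB' \
  | check_cuda_pin "11.8" || echo "pin was not honored"
```

A log with no `+ cuda-version` line passes the check, since nothing is being changed in that case.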

Member

It looks like we still have this issue, but now with CUDA 12.0. Here is the relevant snippet from CI (again from the step where cupy is installed with the PR build of cucim):

  - cuda-version         12.0  hffde075_2                       conda-forge             Cached
  + cuda-version         12.3  h32bc705_2                       conda-forge               21kB

Contributor

@bdice bdice Jan 30, 2024

The CUDA 12 problems should be resolved by the fixes discussed here: rapidsai/build-planning#8 (comment)

