[RELEASE] dask-cuda v25.02 #1438
Open
AyodeAwe wants to merge 22 commits into main from branch-25.02
Forward-merge branch-24.12 into branch-25.02
Forward-merge branch-24.12 into branch-25.02
By default, CI runs on draft PRs. This leads to many CI runs that may be unnecessary. With this PR's change to `.github/copy-pr-bot.yaml`, an `/ok to test` comment from a trusted user is required to trigger CI on draft PRs. Non-draft PRs will run CI by default, assuming that all commits are signed by trusted users. Otherwise an `/ok to test` is required (as before) -- see the `copy-pr-bot` docs at https://docs.gha-runners.nvidia.com/apps/copy-pr-bot/ for more information. Part of rapidsai/build-planning#123. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) URL: #1412
Forward-merge branch-24.12 into branch-25.02
Conda builds are failing due to a missing `setuptools` dependency; this change adds it to fix the failure. Authors: - Peter Andreas Entschev (https://github.com/pentschev) - James Lamb (https://github.com/jameslamb) Approvers: - James Lamb (https://github.com/jameslamb) - Bradley Dice (https://github.com/bdice) URL: #1418
When PyNVML fails to identify CPU affinity appropriately, it may cause an error when launching Dask-CUDA. After extensive discussions in #1381, it seems appropriate to allow continuing if CPU affinity identification fails and to print a warning with a link to documentation instead. New documentation is also added to help with the first steps of troubleshooting. Unfortunately, testing warnings in Distributed plugins seems very hard to do. I couldn't find a way to do that even with `distributed.utils_test.captured_logger`, which runs only after the cluster is created with a `LocalCluster` (or `LocalCUDACluster`). For the `dask cuda worker` CLI, there's no way for us to mock the value passed to `CPUAffinity` to force a warning to be raised, so no tests are added at this time. Closes #1381. Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Benjamin Zaitlen (https://github.com/quasiben) URL: #1420
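The warn-and-continue behaviour described in this commit can be illustrated with a minimal sketch. It is not the code from the PR: `resolve_cpu_affinity`, `broken_query`, and the hint text are hypothetical names chosen for illustration, and the affinity lookup is passed in as a plain callable so the example does not depend on PyNVML or a GPU.

```python
import warnings
from typing import Callable, List, Optional, Sequence

# Hypothetical placeholder text; the real warning links to the troubleshooting docs.
TROUBLESHOOTING_HINT = "see the Dask-CUDA troubleshooting documentation"


def resolve_cpu_affinity(query: Callable[[], Sequence[int]]) -> Optional[List[int]]:
    """Return the CPU indices reported by `query`, or None if the lookup fails.

    `query` stands in for whatever call asks the driver for the CPU set
    associated with a GPU (an assumption made for this sketch).
    """
    try:
        return list(query())
    except Exception as exc:
        # Warn instead of raising, so that worker startup can continue.
        warnings.warn(
            f"Could not determine CPU affinity ({exc!r}); continuing without "
            f"pinning the worker to specific cores; {TROUBLESHOOTING_HINT}.",
            RuntimeWarning,
            stacklevel=2,
        )
        return None


def broken_query() -> Sequence[int]:
    """Simulate a failing affinity lookup."""
    raise RuntimeError("no usable CPU set reported")


if __name__ == "__main__":
    print(resolve_cpu_affinity(lambda: [0, 1, 2, 3]))  # [0, 1, 2, 3]
    print(resolve_cpu_affinity(broken_query))  # None, after a RuntimeWarning
```

Injecting the lookup as a callable is also what makes the fallback path easy to exercise in isolation, which is the part the commit message reports as hard to test inside a running cluster.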
Do not skip `pynvml` if it's not importable, given `pynvml` is a hard-dependency. Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - https://github.com/jakirkham - James Lamb (https://github.com/jameslamb) URL: #1421
Bump `pynvml` from `11` to `12`. This version of `pynvml` also now depends on `nvidia-ml-py` for core functionality. Authors: - https://github.com/jakirkham - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - James Lamb (https://github.com/jameslamb) URL: #1419
Adding this now that wheels are available - **deps(kvikio): add kvikio to CUDA version matrices** - **test(wheels): enable wheel tests in CI** Resolves #1344 Authors: - Gil Forsyth (https://github.com/gforsyth) Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - James Lamb (https://github.com/jameslamb) URL: #1416
Removes testing/handling for "legacy" Dask cuDF (i.e. `DASK_DATAFRAME__QUERY_PLANNING=False`). This PR also adds support for the `"explicit-comms"` config with query-planning enabled (we used to raise an error telling the user to disable query planning). This should be merged **before** rapidsai/cudf#17558 (otherwise Dask-CUDA CI will break). This PR is marked as "breaking", because it technically breaks the `"explicit-comms"` config with the "legacy" version of Dask cuDF (which we are about to remove in 25.02 anyway). Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - James Lamb (https://github.com/jameslamb) - Mads R. B. Kristensen (https://github.com/madsbk) URL: #1417
Follow-up to #1417. Cleans up some imports (some of which don't work for `dask>2024.12.1`). Authors: - Richard (Rick) Zamora (https://github.com/rjzamora) Approvers: - Mads R. B. Kristensen (https://github.com/madsbk) - Peter Andreas Entschev (https://github.com/pentschev) URL: #1424
Numba 0.61.0 was just released with a couple of breaking changes; this PR is required to unblock CI. xref: rapidsai/cudf#17777 Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - Gil Forsyth (https://github.com/gforsyth) URL: #1426
Pull in build dependencies from `pyproject.toml` into Conda's `meta.yaml`. Authors: - https://github.com/jakirkham Approvers: - Peter Andreas Entschev (https://github.com/pentschev) - Ray Douglass (https://github.com/raydouglass) URL: #1425
shellcheck is a fast, static analysis tool for shell scripts. It's good at flagging up unused variables, unintentional glob expansions, and other potential execution and security headaches that arise from the wonders of bash (and other shlangs). This PR adds a pre-commit hook to run shellcheck on all of the sh-lang files in the ci/ directory, and the changes requested by shellcheck to make the existing files pass the check. xref: rapidsai/build-planning#135 Authors: - Gil Forsyth (https://github.com/gforsyth) Approvers: - Bradley Dice (https://github.com/bdice) - Peter Andreas Entschev (https://github.com/pentschev) URL: #1427
A new configuration for the UCX comms module was introduced in rapidsai/rapids-dask-dependency#80. It is designed to help with timeouts in larger clusters, and sometimes even small ones, depending on the architecture. This change documents that new configuration. Authors: - Peter Andreas Entschev (https://github.com/pentschev) Approvers: - Benjamin Zaitlen (https://github.com/quasiben) URL: #1428
Contributes to rapidsai/build-planning#142. `ucx-proc` is no longer necessary, for the reasons described in that issue. This proposes dropping the dependency on it here. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #1429
This PR uses CUDA 12.8.0 to build and test. xref: rapidsai/build-planning#139 Authors: - Bradley Dice (https://github.com/bdice) Approvers: - James Lamb (https://github.com/jameslamb) URL: #1432
This PR points the shared workflow branches back to the default 25.02 branches. xref: rapidsai/build-planning#139 Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #1436
❄️ Code freeze for branch-25.02 and v25.02 release
What does this mean?
Only critical/hotfix level issues should be merged into branch-25.02 until release (merging of this PR).
What is the purpose of this PR?
Merge branch-25.02 into main for the release