Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Depend on released version of Dask #85

Merged
merged 2 commits into from
Feb 21, 2025
Merged

Conversation

TomAugspurger
Copy link
Contributor

@TomAugspurger TomAugspurger commented Feb 12, 2025

This updates the required version of Dask / distributed to 2025.2.0, rather than main.

Paired with the recent changes in dask-upstream-testing, we'll have some coverage for downstream RAPIDS projects against Dask main.

The summary of our version policy is for RAPIDS libraries to depend only on released versions of dask, which is controlled in rapids-dask-dependency. CI runs in repos like rapidsai/cudf will pick up the latest version of dask allowed by rapids-dask-dependency.

To ensure we catch and address issues between Dask and RAPIDS before, rapids-dask-dependency includes a cron job that creates an environment with RAPIDS nightly and Dask main. It then runs upstream tests from Dask and downstream tests (e.g. cudf_dask) in that environment. Any failures there will be addressed by PRs in the appropriate library.

This updates the required version of Dask / distributed to 2025.1.0,
rather than `main`.
TomAugspurger added a commit to TomAugspurger/dask-upstream-testing that referenced this pull request Feb 12, 2025
rapidsai/rapids-dask-dependency#85 changes the
dask dependency to use released versions of Dask. To catch prominant
breakage in downstream libraries (like dask-cudf) with dask main, we'll
add some novel tests here that run downstream (cudf) nightly against
upstream (dask) `main`.
TomAugspurger added a commit to TomAugspurger/dask-upstream-testing that referenced this pull request Feb 12, 2025
rapidsai/rapids-dask-dependency#85 changes the
dask dependency to use released versions of Dask. To catch prominant
breakage in downstream libraries (like dask-cudf) with dask main, we'll
add some novel tests here that run downstream (cudf) nightly against
upstream (dask) `main`.
@TomAugspurger TomAugspurger marked this pull request as ready for review February 12, 2025 19:12
@TomAugspurger TomAugspurger requested a review from a team as a code owner February 12, 2025 19:12
Copy link
Contributor

@galipremsagar galipremsagar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an open request I raised here: rapidsai/dask-upstream-testing#6 (comment), which will be blocking to this PR.

pyproject.toml Outdated
Comment on lines 15 to 16
"dask>=2025.1.0",
"distributed>=2025.1.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"dask>=2025.1.0",
"distributed>=2025.1.0",
"dask==2025.1.0",
"distributed==2025.1.0",

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want this, otherwise if Dask gets released tomorrow everything will break again.

Copy link
Member

@rjzamora rjzamora Feb 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably want this, otherwise if Dask gets released tomorrow everything will break again.

There is technically nothing broken in dask:main right now (and this PR probably won't be merged until after tomorrow). This PR is meant to be paired with rapidsai/dask-upstream-testing#6, and it's essentially a proposal to prevent dask:main commits from breaking all of rapids CI. We really just want dask-upstream-testing to catch the breaking change so we can fix things without blocking all down-stream projects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really just want dask-upstream-testing to catch the breaking change so we can fix things without blocking all down-stream projects.

I will again oppose this. It has been the case in the past that with minimal testing, and sometimes even with extended testing across all RAPIDS projects (as setup today) is hard to catch errors. Now what this is going to do is to essentially NOT test all of RAPIDS against changes with main, which could help us catching early problems and fix them (hopefully) quickly. With this change we push all of that to the time when a new release occurs, which has historically been very problematic for us, especially when those changes occur close to a RAPIDS release. Furthermore, this will not prevent issues from showing up broadly in RAPIDS' CI, it will merely shift when that happens, instead of showing various smaller breakages one at a time it will ultimately show various breakages all at once after a new release.

This is another step going the exact opposite direction from what I've been working for the past 5+ years to make testing "somewhat" useful. If you want to go that direction then I'll really stop watching about any Dask testing and issues whatsoever from this moment onwards and someone will have to take full responsibility for those issues. Until recently I've been the only person in the entirety of RAPIDS who have been always on top of the state of Dask testing, and if things are setup in this way is because the experience I gathered showed this was the most balanced way given the way Dask moves, if now you want to ignore that and change to something you think is best, please take full responsibility going further.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some relevant thoughts in rapidsai/cudf#17999.

@pentschev - Your strong opposition to the proposed plan is well-received, and will definitely be considered carefully. You are correct that the plan will not solve the (very difficult) Dask + RAPIDS maintenance problem. It will only solve the "daily CI fire-drill problem". When we adopt a new release, we are still bound to uncover new problems.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your strong opposition to the proposed plan is well-received, and will definitely be considered carefully.

At this point I'm not interested in having opinions considered or not, I want people making the decisions to stand by them, take the responsibility and ultimately come up with solutions. So far it's been the case decisions are sequentially made that oppose my recommendations, but when SHTF then I'm questioned why problems are piling up and everyone is pissed off by that. IOW, I had to take responsibility for things I recommended against.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOW, I had to take responsibility for things I recommended against.

It is unacceptable for you to take responsibility for anything you recommended against. I absolutely take responsibility for the state of RAPIDS + Dask testing and integration, and I will try to make sure you don't take any heat in the future. You are a fantastic engineer, so I tend to ask for your thoughts often. I will try to be more clear that you are not personally responsible for dask-cudf/dask-cuda maintenance.

@TomAugspurger
Copy link
Contributor Author

Updated to pin (exactly) to dask 2025.2.0.

https://github.com/rapidsai/dask-upstream-testing is currently running dask-cudf's and dask-cuda's tests in an environment with rapids nightly and dask main. I'm going to go through and find other rapids projects that have some exposure to dask main to add those tests to dask-upstream-testing as well.

If we're happy with the overall approach of "regular CI tested against released dask, dask-dev tested nightly in dask-upstream-testing" then we should be able to merge this when we're comfortable with the coverage in dask-upstream-testing.

@TomAugspurger TomAugspurger added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels Feb 20, 2025
@rjzamora
Copy link
Member

Thanks @TomAugspurger !

I believe the next cron job for https://github.com/rapidsai/dask-upstream-testing will kick off in about an hour. I don't expect the tests to pass. However, I do hope that the result gives us confidence that we can update our pinning strategy in rapids-dask-dependency (and we can merge this PR later today).

@galipremsagar - How does that plan sound?

@rjzamora rjzamora dismissed galipremsagar’s stale review February 21, 2025 14:52

Review out of date.

Copy link
Member

@rjzamora rjzamora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RAPIDS CI is breaking every day from changes in dask::main.

For now, we can use https://github.com/rapidsai/dask-upstream-testing to see these failures without blocking RAPIDS CI. Thanks @TomAugspurger !

@rjzamora
Copy link
Member

/merge

@rapids-bot rapids-bot bot merged commit 2993770 into branch-25.04 Feb 21, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants