Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin to dask-2024.9.0 for the rapids 24.10 release #60

Merged
merged 1 commit into from
Sep 23, 2024

Conversation

rjzamora
Copy link
Member

I propose that we pin to dask-2024.9.0 for the rapids 24.10 release. The next release of dask is expected to be just after code freeze.

cc @pentschev @charlesbluca @quasiben @galipremsagar

@rjzamora rjzamora added the non-breaking Introduces a non-breaking change label Sep 23, 2024
@rjzamora rjzamora self-assigned this Sep 23, 2024
@rjzamora rjzamora requested review from vyasr and a team as code owners September 23, 2024 18:29
@galipremsagar galipremsagar added the improvement Improves an existing functionality label Sep 23, 2024
@galipremsagar
Copy link
Contributor

/merge

@rapids-bot rapids-bot bot merged commit cf61252 into rapidsai:branch-24.10 Sep 23, 2024
7 checks passed
@rjzamora rjzamora deleted the pin-2024.9.0 branch September 23, 2024 18:38
@pentschev
Copy link
Member

No objections from me. However, moving forward I would expect to allow time for people to raise any concerns, ideally a minimum of 24h, and not getting merged within 8 minutes.

@galipremsagar
Copy link
Contributor

@pentschev My Apologies for being too quick here ! I'll make sure to wait for the said time from here on.

@pentschev
Copy link
Member

Thanks @galipremsagar !

@rjzamora
Copy link
Member Author

It looks like this may have broken CI, may be necessary to pin the version of dask-expr to 1.1.14.

rapids-bot bot pushed a commit that referenced this pull request Sep 23, 2024
Follow up to #60

Dask 2024.9.0 does not automatically pin the correct version of dask-expr, so we need to be very specific.

Authors:
  - Richard (Rick) Zamora (https://github.com/rjzamora)

Approvers:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

URL: #64
@ilan-gold
Copy link

Hello @rjzamora and @pentschev, why is there a hard pin instead of a lower bound?

@pentschev
Copy link
Member

This is how the Dask dependency is handled in RAPIDS for at least 2-3 years. In summary, the reason is essentially that Dask simply does not provide any forward/backward compatibility and has a really short release cadence, so the only way for us to guarantee some stability is to have a hard pin. For more context see rapidsai/dask-cuda#848 .

@ilan-gold
Copy link

Hmmm @pentschev while I understand the need, and I am not a fan of calendar versioning in many ways for this reason, I do think it's a bit restrictive. For example dask/dask#11290 has not been fixed, which is fairly critical for us, and we are upper pinning as a result. With your pin, though, I'm going to have to fix this now on my own for dask, but I do think there could be a way forward here. It seems part of your issue was minimum pinning, can offer this script which may help: https://github.com/scverse/anndata/blob/main/ci/scripts/min-deps.py seems to work well for us. We could explore making it a package if you were interested.

@ilan-gold
Copy link

Perhaps maintaining pins per-"month" version going back three or four months?

@pentschev
Copy link
Member

I am not a fan of calendar versioning in many ways for this reason, I do think it's a bit restrictive.

I agree with both.

It seems part of your issue was minimum pinning

Not really, it goes both ways. As I said previously, Dask does not provide either forward nor backwards compatibility, so if we test with version X, there's no guarantee either X-1 nor X+1 will work, and leaving a minimum pin is just going to open up the possibility for version X+1 to break all of RAPIDS when it's released as has happened countless times in the past.

Perhaps maintaining pins per-"month" version going back three or four months?

This makes no difference, there's no per-month semantics in the versions, 2024.10.1 breaking something that worked with 2024.10.0 is just as likely as 2024.10.0 breaking something that worked with 2024.9.1.

With your pin, though, I'm going to have to fix this now on my own for dask

While I appreciate your input and desire to help and I'm also sorry to hear you're gonna have to fix something on your own, that decision is unlikely to change until Dask provides some sort of semantic versioning and/or backward/forward compatibility. One of the reasons why rapids-dask-dependency exists is to provide us a chance for fixing issues that cannot be backported or won't have a release in time for the next RAPIDS release simply because Dask does not have policies for either one, which essentially means we have to monkey-patch Dask through rapids-dask-dependency to achieve that, much like what it sounds you're gonna have to do.

@ilan-gold
Copy link

ilan-gold commented Oct 14, 2024

This makes no difference, there's no per-month semantics in the versions, 2024.10.1 breaking something that worked with 2024.10.0 is just as likely as 2024.10.0 breaking something that worked with 2024.9.1.

Ah, maybe I wasn't clear. I meant one-per-month. so once you confirm your release works, you keep it. but now you have something like

[project]
...
dependencies = []
[project.optional-dependencies]
dask202409 = [
    "dask==2024.9.0",
    "distributed==2024.9.0",
    "dask-expr==1.1.14",
]
dask202408 = [
    "dask==2024.8.0",
    "distributed==2024.8.0",
    "dask-expr==1.1.14",
]

and so forth, and then upstream, dask-cuda can allow any one of these as the allowable dependency.

In any case, having written this out, I see what you mean now. The month is completely arbitrary here - it just so happens that things broke for us here on exactly the month instead of on a sub-month version. I am not aware of any stability guarantees from dask in this way

@ilan-gold
Copy link

I guess what this would do is provide some flexibility as opposed to literally just one pin

@pentschev
Copy link
Member

If you need latest features/fixes, you can always use RAPIDS (including Dask-CUDA's) nightlies, they have a lower pin (the version that was pinned for the previous release) but no higher pin until the code freeze stage, which is when we decide what Dask version RAPIDS will depend on. As for flexibility, it really doesn't make any difference for us, again we can't guarantee either 2024.8.0 or 2024.9.0 is "good" until we do extensive testing with all RAPIDS projects that depend on Dask, testing is really the only way that gives us some confidence about stability, if we were to provide flexibility and support multiple Dask versions that would mean we must test on all of them which is a CI cost that is simply too high to maintain for very limited benefit.

TL;DR: if you absolutely need some latest fixes/features, use nightlies, if you're using a RAPIDS release you're gonna get what was the "latest stable" Dask version at the time, where "latest stable" in this context means the latest Dask version that we tested against at the time of code freeze and that presented no obvious errors.

@pentschev
Copy link
Member

Incidentally I've also realized we didn't unpin for 24.12 yet, which will be done when #67 is merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improves an existing functionality non-breaking Introduces a non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants