-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deduplicate scheduler requests in P2P #8899
Deduplicate scheduler requests in P2P #8899
Conversation
Last week @phofl mentioned that it would be great to get more reviews on PRs as that is currently a point of friction for folks actively working on Dask/Distributed. However I find it hard to dive into reviewing a PR like this as it doesn't close an open issue and it doesn't have a description of what the change does or any examples of what this fixes. I'd love to help out more with reviews here, but I don't know where to begin with this one. cc @fjetter |
@jacobtomlinson, it looks like you wrote this just as I added a description. Is this helpful? (Sorry for having it "ready for review" without a description, lost my internet connection for a couple of minutes so the update to the description didn't go out in sync with the change in readiness.) |
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 25 files ± 0 25 suites ±0 10h 16m 54s ⏱️ + 3m 10s For more details on these failures, see this check. Results for commit 2257deb. ± Comparison against base commit fa9806b. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this lock makes sense to me to slow down concurrent connections.
@jacobtomlinson The examples you're listing are surprising for me
I understand some criticism about the first (bad title) and last (lacking context) but the others are fine, aren't they? |
That's fair, I just grepped through the last couple of pages of I'm very happy to help out with PR review if that's something that would be valuable, but folks outside of the Coiled team have a lot less context about why things are being done, so it would be really helpful to have more descriptive PRs. |
My sense is that generally we do provide decent descriptions. I don't really want to dig through the past and discuss bad issues or PRs now. I don't think this is the reason why few people are helping out. I'm happy to revisit this if this is a persistent issue |
Thats totally fair
I'm trying to feed back to you all that it's one of the reasons why I help out less than I used to. I hope you can be receptive to that. I'm also trying to communicate that I want to help more. Sorry that this turned into such a big deal. My intent was to gently nudge so that I can get more involved with reviews. But it's blown up a bit so apologies for any bad feeling generated from this discussion. |
It's noted and appreciated. To me, this seems like your usual chicken-and-egg problem. With little external engagement around reviews or code contributions, there's little time for writing good descriptions (and less value in doing so). With bad descriptions, there's little external engagement. Let's try to get to a minimal standard where we write something that's helpful but also doesn't take up much time. Personally, I'd say: Feel free to ping people on PRs you'd like to review but lack sufficient context. We'll just have to find a good middle ground here so that we don't blow the effort for documentation out of proportion. |
This is a good suggestion! |
In P2P, a worker may send many concurrent requests to the scheduler if it is not aware of a specific shuffle run. This could lead to DDoS-like scenarios on large-scale clusters with P2P rechunking. This PR prohibits concurrent fetch requests.
pre-commit run --all-files