-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-32918][SHUFFLE] RPC implementation to support control plane coordination for push-based shuffle #30163
Conversation
…ination for push-based shuffle
...n/network-shuffle/src/main/java/org/apache/spark/network/shuffle/MergeFinalizerListener.java
Show resolved
Hide resolved
...n/network-shuffle/src/main/java/org/apache/spark/network/shuffle/MergeFinalizerListener.java
Outdated
Show resolved
Hide resolved
Can one of the admins verify this patch? |
...network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockStoreClient.java
Show resolved
Hide resolved
…dd implementation in ExternalBlockStoreClient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A minor comment. Rest looks good to me.
...n/network-shuffle/src/main/java/org/apache/spark/network/shuffle/MergeFinalizerListener.java
Show resolved
Hide resolved
common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/BlockStoreClient.java
Show resolved
Hide resolved
...network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockStoreClient.java
Outdated
Show resolved
Hide resolved
...n/network-shuffle/src/main/java/org/apache/spark/network/shuffle/MergeFinalizerListener.java
Outdated
Show resolved
Hide resolved
...network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockStoreClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is very hard (even impossible) to reason about code changes which are not used (as well as not even tested) so please consider cutting the push based shuffle feature as whole into PRs which covers a bit more meaningful parts on its own.
...network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockStoreClient.java
Show resolved
Hide resolved
@attilapiros thanks for that feedback, and we acknowledge the sentiment for the issues with small patches in a large feature. |
…RPC to shuffle service
@tgravescs, @attilapiros, @Ngone51 are there any other comments ? |
kicked the builds again as the error should have been fixed by 6da8ade |
Thanks for the +1 @tgravescs ! The scala 2.13 error is unrelated. |
Thanks for working on this @zhouyejoe ! |
Thanks everyone on reviewing this PR. |
…ordination for push-based shuffle ### What changes were proposed in this pull request? This is one of the patches for SPIP SPARK-30602 which is needed for push-based shuffle. Summary of changes: This PR introduces a new RPC to be called within Driver. When the expected shuffle push wait time reaches, Driver will call this RPC to facilitate coordination of shuffle map/reduce stages and notify external shuffle services to finalize shuffle block merge for a given shuffle. Shuffle services also respond back the metadata about a merged shuffle partition back to the caller. ### Why are the changes needed? Refer to the SPIP in SPARK-30602. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? This code snippets won't be called by any existing code and will be tested after the coordinated driver changes gets merged in SPARK-32920. Lead-authored-by: Min Shen mshenlinkedin.com Closes #30163 from zhouyejoe/SPARK-32918. Lead-authored-by: Ye Zhou <yezhou@linkedin.com> Co-authored-by: Min Shen <mshen@linkedin.com> Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
What changes were proposed in this pull request?
This is one of the patches for SPIP SPARK-30602 which is needed for push-based shuffle.
Summary of changes:
This PR introduces a new RPC to be called within Driver. When the expected shuffle push wait time reaches, Driver will call this RPC to facilitate coordination of shuffle map/reduce stages and notify external shuffle services to finalize shuffle block merge for a given shuffle. Shuffle services also respond back the metadata about a merged shuffle partition back to the caller.
Why are the changes needed?
Refer to the SPIP in SPARK-30602.
Does this PR introduce any user-facing change?
No
How was this patch tested?
This code snippets won't be called by any existing code and will be tested after the coordinated driver changes gets merged in SPARK-32920.
Lead-authored-by: Min Shen mshen@linkedin.com