
[SPARK-41471][SQL] Reduce Spark shuffle when only one side of a join is KeyGroupedPartitioning #42194

Closed · wants to merge 18 commits

Conversation

@Hisoka-X (Member) commented Jul 28, 2023

What changes were proposed in this pull request?

When only one side of a SPJ (Storage-Partitioned Join) is KeyGroupedPartitioning, Spark currently needs to shuffle both sides using HashPartitioning. However, we may just need to shuffle the other side according to the partition transforms defined in KeyGroupedPartitioning. This is especially useful when the other side is relatively small.

  1. Add a new config `spark.sql.sources.v2.bucketing.shuffle.enabled` to control whether this feature is enabled.
  2. Add `KeyGroupedPartitioner`, used to partition data when we know the transform values of the other side (currently KeyGroupedPartitioning). In `EnsureRequirements`, Spark already knows which partition value maps to which partition id on the KeyGroupedPartitioning side; we save that mapping in `KeyGroupedPartitioner` and use it to shuffle the other side, so that rows with the same key end up in the same partition.
  3. Only the `identity` transform works for now. There is a separate problem: the same transform can report different values between a DS V2 connector implementation and a catalog function, so until that is solved we only support `identity`. E.g., in the test package, `YearFunction`: https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/connector/catalog/functions/transformFunctions.scala#L47 and https://github.com/apache/spark/blob/master/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/InMemoryBaseTable.scala#L143
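The core idea in item 2 can be sketched as follows. This is a conceptual sketch in Python, not Spark's actual implementation (which is Scala); the names and partition values are illustrative. One side already reports a KeyGroupedPartitioning, i.e. a fixed mapping from partition (key) value to partition id, so instead of hash-partitioning both sides we build a partitioner for the other side from that mapping:

```python
# Illustrative sketch: route rows of the non-grouped side into the partition
# layout already reported by the KeyGroupedPartitioning side, so equal keys
# land in the same partition without shuffling the grouped side at all.

def key_grouped_partitioner(value_to_part_id):
    """Return a function mapping a join key to the grouped side's partition id."""
    def partition(key):
        # Keys never seen on the grouped side produce no join matches, so
        # where they go does not matter; send them to partition 0 here.
        return value_to_part_id.get(key, 0)
    return partition

# Hypothetical mapping reported by the KeyGroupedPartitioning side.
value_map = {"2021": 0, "2022": 1, "2023": 2}
part = key_grouped_partitioner(value_map)

rows = [("2023", "a"), ("2021", "b"), ("2023", "c")]
buckets = {}
for key, payload in rows:
    buckets.setdefault(part(key), []).append((key, payload))
# Both "2023" rows end up together in partition 2, matching the grouped side.
```

This is only meant to show why no shuffle is needed on the grouped side: its partition-id-per-value layout is taken as given, and the other side conforms to it.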

Why are the changes needed?

Reduce data shuffle in specific SPJ scenarios

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added new tests.

@Hisoka-X (Member, Author)

cc @sunchao @cloud-fan

@sunchao (Member) commented Aug 3, 2023

Sorry for the delay. I'll take a look at this in the next 1-2 days.

@Hisoka-X force-pushed the SPARK-41471_one_side_keygroup branch from 96bce41 to b2b3a10 on August 9, 2023 06:55
@sunchao (Member) left a comment

Thanks for the update @Hisoka-X , looks much better now!

@sunchao (Member) left a comment

LGTM except one minor comment.

@sunchao (Member) commented Aug 12, 2023

cc @cloud-fan for another check too

@szehon-ho (Contributor) left a comment

I'm wondering: will this work if the other SPJ flags, spark.sql.sources.v2.bucketing.partiallyClusteredDistribution.enabled and spark.sql.sources.v2.bucketing.pushPartValues.enabled, are set to true and we have the case of multiple splits per partition?

In that case, it seems the BatchScanExec of the side with KeyGroupedPartitioning will group partition splits; will that make it out-of-sync with the other side using KeyGroupedPartitioner?

It may also be hard to make this work with #42306, which uses a similar mechanism to group partition splits.

@Hisoka-X (Member, Author)

> I'm wondering, will this work if other SPJ flags: spark.sql.sources.v2.bucketing.partiallyClusteredDistribution.enabled and spark.sql.sources.v2.bucketing.pushPartValues.enabled are set to true, and we have case of multiple split for a partition?
>
> In that case, it seems the BatchScanExec of the side with KeyGroupedPartitioning will group partition splits, will it make it out-of-sync with the other side using KeyGroupedPartitioner?
>
> It may also be hard to work with #42306 which uses a similar mechanism to group partition splits?

This is a problem; let me add a test case for it. Maybe we should use the partition values obtained after grouping the partition splits. Thanks for pointing that out, cc @sunchao

@szehon-ho (Contributor)

> maybe we should use partitionValue which after group partition splits.

Thanks, I think that would be great if we can do that somehow!

@sunchao (Member) commented Aug 15, 2023

Yes, good point @szehon-ho. If spark.sql.sources.v2.bucketing.partiallyClusteredDistribution.enabled and spark.sql.sources.v2.bucketing.pushPartValues.enabled are both turned on, this may not work, since the data on the hash partitioning side is shuffled according to the partition values before the grouping, which contain duplicates.

Perhaps we should use KeyGroupedPartitioning.uniquePartitionValues when computing valueMap.
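The suggestion can be sketched as follows. This is an illustrative Python sketch, not Spark's Scala code; the function name and values are made up. With pushPartValues and partiallyClusteredDistribution both enabled, the grouped side may report one entry per split, so the same partition value can appear several times; building the value-to-partition-id map from the de-duplicated sequence (cf. `KeyGroupedPartitioning.uniquePartitionValues`) keeps the shuffled side in sync with the grouped scan:

```python
# Illustrative sketch: de-duplicate per-split partition values before
# assigning partition ids, so that partition id = index among *unique*
# values, matching the grouped scan after it merges splits.

def build_value_map(partition_values):
    unique = []
    seen = set()
    for v in partition_values:  # preserve first-seen order
        if v not in seen:
            seen.add(v)
            unique.append(v)
    return {v: i for i, v in enumerate(unique)}

# One entry per split: value "2022" has two splits on the grouped side.
per_split_values = ["2021", "2022", "2022", "2023"]
value_map = build_value_map(per_split_values)
# "2023" maps to partition 2, not 3, because the duplicate was collapsed.
```

Had the map been built from the raw per-split list, "2023" would get id 3 while the grouped scan only produces 3 partitions, leaving the two sides out of sync.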

@sunchao (Member) left a comment

Looks good to me, just one nit. Thanks @Hisoka-X !

@sunchao (Member) left a comment

LGTM! (again) Sorry @Hisoka-X I have a few more tiny nits. Otherwise it looks great to me!

@sunchao sunchao closed this in ce12f6d Aug 24, 2023
@sunchao (Member) commented Aug 24, 2023

Merged to master, thanks @Hisoka-X and @szehon-ho !

@Hisoka-X (Member, Author)

Thanks @sunchao and @szehon-ho !

```scala
buildConf("spark.sql.sources.v2.bucketing.shuffle.enabled")
  .doc("During a storage-partitioned join, whether to allow to shuffle only one side." +
    "When only one side is KeyGroupedPartitioning, if the conditions are met, spark will " +
    "only shuffle the other side. This optimization will reduce the amount of data that " +
```
A contributor left a comment:

Shall we make the algorithm smarter? If the other side is large, using KeyGroupedPartitioning may lead to skew, and it would still be better to shuffle both sides with hash partitioning.

Consider an extreme case: one side reports KeyGroupedPartitioning with only one partition. With this optimization, we would end up doing the join in a single task.

A member left a comment:

I think the ShuffleSpec "framework" in EnsureRequirements already takes this into consideration. This PR mainly makes KeyGroupedShuffleSpec behave similarly to HashShuffleSpec so that it can shuffle the other side (by making canCreatePartitioning return true).
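The decision being described can be modeled roughly as follows. This is a heavily simplified Python model of the EnsureRequirements behavior, not Spark's actual Scala code; the class and function names are illustrative. A spec that can create a partitioning for the other side lets Spark shuffle only that other side:

```python
# Simplified model: each side's shuffle spec reports whether it can create a
# partitioning for the other side. If the sides are already compatible, no
# shuffle is needed; otherwise shuffle only the side that can be conformed
# to the other, falling back to shuffling both.

class ShuffleSpec:
    def __init__(self, name, can_create_partitioning):
        self.name = name
        self.can_create_partitioning = can_create_partitioning

def plan_shuffles(left, right, compatible):
    if compatible:
        return []                           # neither side needs a shuffle
    if left.can_create_partitioning:
        return [right.name]                 # shuffle only the right side
    if right.can_create_partitioning:
        return [left.name]                  # shuffle only the left side
    return [left.name, right.name]          # fall back: shuffle both sides

key_grouped = ShuffleSpec("key_grouped_side", can_create_partitioning=True)
other = ShuffleSpec("other_side", can_create_partitioning=False)
```

In this model, the PR amounts to flipping `can_create_partitioning` to true for the key-grouped spec: before, both sides would be shuffled; after, `plan_shuffles(key_grouped, other, compatible=False)` shuffles only `other_side`, while the existing cost/skew reasoning in EnsureRequirements stays in charge of whether to do so at all.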

@Hisoka-X Hisoka-X deleted the SPARK-41471_one_side_keygroup branch September 1, 2023 12:34
szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Feb 7, 2024
…is KeyGroupedPartitioning


Closes apache#42194 from Hisoka-X/SPARK-41471_one_side_keygroup.

Authored-by: Jia Fan <fanjiaeminem@qq.com>
Signed-off-by: Chao Sun <sunchao@apple.com>