[SPARK-44647][SQL] Support SPJ where join keys are less than cluster keys #42306

Closed
Proposes merging 5 commits

Commits on Sep 8, 2023

  1. [SPARK-44647][SQL] Support SPJ where join keys are less than cluster keys
    
    ### What changes were proposed in this pull request?
    - Add a new conf, spark.sql.sources.v2.bucketing.allowJoinKeysSubsetOfPartitionKeys.enabled.
    - Change the key compatibility checks in EnsureRequirements: when this flag is enabled, drop the requirement that every partition key appear in the join keys, so that isKeyCompatible can be true in this case.
    - "Project" partitions onto the join keys in KeyGroupedPartitioning/KeyGroupedShuffleSpec.
    - Add join key grouping to the partition grouping in BatchScanExec.
    
    ### Why are the changes needed?
    - Support Storage Partition Join when the join condition contains only some of the partition keys rather than all of them.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    - Added tests in KeyGroupedPartitioningSuite.
    - Because of apache#37886, we have to select all join keys to trigger SPJ in this case; otherwise the DSV2 scan does not report KeyGroupedPartitioning and SPJ is not triggered. Relaxing this needs to be looked at in a separate PR.
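
    Illustration (not part of the patch): the idea of "projecting" partitions onto the join keys and then grouping them can be sketched in plain Python. This is a conceptual sketch only, not Spark's implementation. The function name `project_and_group`, the column names, and the sample partition values are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def project_and_group(partition_values, partition_keys, join_keys):
    """Hypothetical helper: project each partition's key tuple onto the
    join keys, then group partition indices sharing a projected value."""
    # Positions of the join keys within the full partition key tuple.
    positions = [partition_keys.index(k) for k in join_keys]
    groups = defaultdict(list)
    for idx, values in enumerate(partition_values):
        projected = tuple(values[p] for p in positions)
        groups[projected].append(idx)
    # Sort by projected value so both sides line up in the same order.
    return dict(sorted(groups.items()))


# Assumed example: both tables are partitioned by (id, region),
# but the join condition only uses id.
partition_keys = ["id", "region"]
left = [(1, "eu"), (1, "us"), (2, "eu")]
right = [(1, "ap"), (2, "us"), (3, "eu")]

left_groups = project_and_group(left, partition_keys, ["id"])
right_groups = project_and_group(right, partition_keys, ["id"])

# Projected values present on both sides can be joined group by group
# without a shuffle in this simplified model.
common = sorted(set(left_groups) & set(right_groups))
```

    In this model, the two left partitions with id = 1 collapse into a single group, which is the effect the join key grouping aims for when the join keys are a subset of the partition keys.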
    szehon-ho committed Sep 8, 2023
    1436c5a
  2. Review comments

    szehon-ho committed Sep 8, 2023
    0df6e97

Commits on Sep 9, 2023

  1. Review comments

    szehon-ho committed Sep 9, 2023
    a62e32b
  2. Fix typo

    szehon-ho committed Sep 9, 2023
    6a7ca35

Commits on Sep 10, 2023

  1. Fix sqlconf test

    szehon-ho committed Sep 10, 2023
    e832652