[SPARK-7824][SQL] Collapsing operator reordering and constant folding into a single batch to push down the single side. #6351
merge lastest spark
This optimizer can avoid CartesianProduct
SQL
Plan before modify
Plan after modify
Got it. The root cause here is we moved [...]. I think this change is reasonable.
Would you get the same result (and possibly more) by collapsing
Yes, that can work, but the batch will then contain two different types of optimizer rules. @marmbrus
That is not fundamentally a problem. Honestly some more thought probably needs to be put into the batches. Really the only reasons for splitting are the following:
@marmbrus /cc
ok to test
Test build #33867 has finished for PR 6351 at commit
@@ -36,21 +36,20 @@ object DefaultOptimizer extends Optimizer {
     // SubQueries are only needed for analysis and can be removed before execution.
     Batch("Remove SubQueries", FixedPoint(100),
       EliminateSubQueries) ::
-    Batch("Operator Reordering", FixedPoint(100),
+    Batch("Operator Optimizations", FixedPoint(100),
How about "Operator Reordering and ConstantFolding"?
Test build #33869 has finished for PR 6351 at commit
Test build #33875 has finished for PR 6351 at commit
ok to test
Test build #34717 has finished for PR 6351 at commit
Thanks! Merged to master.
      NullPropagation,
      OptimizeIn,
      ConstantFolding,
      LikeSimplification,
      BooleanSimplification,
      PushPredicateThroughJoin,
I think we don't need to change the rule order inside a batch with fixed point. Rules within a single batch shouldn't be sensitive to execution order.
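The point about fixed-point batches can be illustrated with a toy model. The sketch below is not Catalyst's `RuleExecutor`; `Expr`, `runBatch`, and the two rules are hypothetical names invented for this example. It applies every rule in a batch in order and repeats until the plan stops changing, so for rules like these the order affects only how many passes are needed, not the final plan.

```scala
// Toy expression tree; a hypothetical stand-in for Catalyst expressions.
sealed trait Expr
case class Lit(value: Int) extends Expr
case class Var(name: String) extends Expr
case class Add(left: Expr, right: Expr) extends Expr
case class Mul(left: Expr, right: Expr) extends Expr

object FixedPointSketch {
  type Rule = Expr => Expr

  // Rewrite children first, then try the partial rewrite on the node itself.
  def transformUp(e: Expr)(pf: PartialFunction[Expr, Expr]): Expr = {
    val withNewChildren = e match {
      case Add(l, r) => Add(transformUp(l)(pf), transformUp(r)(pf))
      case Mul(l, r) => Mul(transformUp(l)(pf), transformUp(r)(pf))
      case other     => other
    }
    if (pf.isDefinedAt(withNewChildren)) pf(withNewChildren) else withNewChildren
  }

  // Fold operations whose operands are both literals.
  val constantFolding: Rule = e => transformUp(e) {
    case Add(Lit(a), Lit(b)) => Lit(a + b)
    case Mul(Lit(a), Lit(b)) => Lit(a * b)
  }

  // Drop additive and multiplicative identities.
  val simplifyIdentities: Rule = e => transformUp(e) {
    case Add(x, Lit(0)) => x
    case Add(Lit(0), x) => x
    case Mul(x, Lit(1)) => x
    case Mul(Lit(1), x) => x
  }

  // Apply all rules in order; repeat until a pass changes nothing or
  // maxIterations is reached. Returns the final plan and the passes executed.
  def runBatch(rules: Seq[Rule], maxIterations: Int)(plan: Expr): (Expr, Int) = {
    var current = plan
    var iteration = 0
    var changed = true
    while (changed && iteration < maxIterations) {
      val next = rules.foldLeft(current)((p, rule) => rule(p))
      changed = next != current
      current = next
      iteration += 1
    }
    (current, iteration)
  }

  def main(args: Array[String]): Unit = {
    // x * (3 + -2): folding must produce the literal 1 before the identity rule can fire.
    val plan = Mul(Var("x"), Add(Lit(3), Lit(-2)))
    println(runBatch(Seq(constantFolding, simplifyIdentities), 100)(plan))
    println(runBatch(Seq(simplifyIdentities, constantFolding), 100)(plan))
  }
}
```

Both orderings converge to `Var("x")`; the second one simply needs one extra pass, which is the cost a fixed-point batch pays when rules are listed in a less favourable order.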
…into a single batch.

SQL
```
select * from tableA join tableB on (a > 3 and b = d) or (a > 3 and b = e)
```
Plan before modify
```
== Optimized Logical Plan ==
Project [a#293,b#294,c#295,d#296,e#297]
 Join Inner, Some(((a#293 > 3) && ((b#294 = d#296) || (b#294 = e#297))))
  MetastoreRelation default, tablea, None
  MetastoreRelation default, tableb, None
```
Plan after modify
```
== Optimized Logical Plan ==
Project [a#293,b#294,c#295,d#296,e#297]
 Join Inner, Some(((b#294 = d#296) || (b#294 = e#297)))
  Filter (a#293 > 3)
   MetastoreRelation default, tablea, None
  MetastoreRelation default, tableb, None
```
CombineLimits ==> Limit(If(LessThan(ne, le), ne, le), grandChild), and LessThan is handled in BooleanSimplification, so CombineLimits must come before BooleanSimplification, and BooleanSimplification must come before PushPredicateThroughJoin.

Author: Zhongshuai Pei <799203320@qq.com>
Author: DoingDone9 <799203320@qq.com>

Closes apache#6351 from DoingDone9/master and squashes the following commits:

20de7be [Zhongshuai Pei] Update Optimizer.scala
7bc7d28 [Zhongshuai Pei] Merge pull request apache#17 from apache/master
0ba5f42 [Zhongshuai Pei] Update Optimizer.scala
f8b9314 [Zhongshuai Pei] Update FilterPushdownSuite.scala
c529d9f [Zhongshuai Pei] Update FilterPushdownSuite.scala
ae3af6d [Zhongshuai Pei] Update FilterPushdownSuite.scala
a04ffae [Zhongshuai Pei] Update Optimizer.scala
11beb61 [Zhongshuai Pei] Update FilterPushdownSuite.scala
f2ee5fe [Zhongshuai Pei] Update Optimizer.scala
be6b1d5 [Zhongshuai Pei] Update Optimizer.scala
b01e622 [Zhongshuai Pei] Merge pull request apache#15 from apache/master
8df716a [Zhongshuai Pei] Update FilterPushdownSuite.scala
d98bc35 [Zhongshuai Pei] Update FilterPushdownSuite.scala
fa65718 [Zhongshuai Pei] Update Optimizer.scala
ab8e9a6 [Zhongshuai Pei] Merge pull request apache#14 from apache/master
14952e2 [Zhongshuai Pei] Merge pull request apache#13 from apache/master
f03fe7f [Zhongshuai Pei] Merge pull request apache#12 from apache/master
f12fa50 [Zhongshuai Pei] Merge pull request apache#10 from apache/master
f61210c [Zhongshuai Pei] Merge pull request apache#9 from apache/master
34b1a9a [Zhongshuai Pei] Merge pull request apache#8 from apache/master
802261c [DoingDone9] Merge pull request apache#7 from apache/master
d00303b [DoingDone9] Merge pull request apache#6 from apache/master
98b134f [DoingDone9] Merge pull request apache#5 from apache/master
161cae3 [DoingDone9] Merge pull request apache#4 from apache/master
c87e8b6 [DoingDone9] Merge pull request apache#3 from apache/master
cb1852d [DoingDone9] Merge pull request apache#2 from apache/master
c3f046f [DoingDone9] Merge pull request apache#1 from apache/master
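To see why the boolean simplification has to run before the join push-down, here is a minimal sketch on a toy predicate model. `Pred`, `extractCommon`, and `split` are hypothetical names, not Catalyst's API. The sketch assumes, as the plans in this PR suggest, that simplification factors a shared conjunct out of a disjunction and that push-down only moves conjuncts that reference a single side of the join.

```scala
// Toy predicates; `tables` records which relations an atom references.
sealed trait Pred
case class Atom(text: String, tables: Set[String]) extends Pred
case class And(left: Pred, right: Pred) extends Pred
case class Or(left: Pred, right: Pred) extends Pred

// Result of splitting a join condition by the side each conjunct touches.
case class Split(leftOnly: Seq[Pred], rightOnly: Seq[Pred], joinCondition: Seq[Pred])

object PushDownSketch {
  // Flatten nested Ands into a list of conjuncts.
  def conjuncts(p: Pred): Seq[Pred] = p match {
    case And(l, r) => conjuncts(l) ++ conjuncts(r)
    case other     => Seq(other)
  }

  def references(p: Pred): Set[String] = p match {
    case Atom(_, tables) => tables
    case And(l, r)       => references(l) ++ references(r)
    case Or(l, r)        => references(l) ++ references(r)
  }

  private def all(ps: Seq[Pred]): Pred = ps.reduceLeft[Pred]((a, b) => And(a, b))

  // (c && x) || (c && y)  ==>  c && (x || y)
  def extractCommon(p: Pred): Pred = p match {
    case Or(l, r) =>
      val (ls, rs) = (conjuncts(l), conjuncts(r))
      val common = ls.filter(c => rs.contains(c))
      val lRest = ls.filterNot(c => common.contains(c))
      val rRest = rs.filterNot(c => common.contains(c))
      if (common.isEmpty) p
      else if (lRest.isEmpty || rRest.isEmpty) all(common) // c || (c && y) == c
      else And(all(common), Or(all(lRest), all(rRest)))
    case other => other
  }

  // Only top-level conjuncts that touch a single side can be pushed below the join.
  def split(condition: Pred, left: String, right: String): Split = {
    val (leftOnly, rest) = conjuncts(condition).partition(c => references(c).subsetOf(Set(left)))
    val (rightOnly, both) = rest.partition(c => references(c).subsetOf(Set(right)))
    Split(leftOnly, rightOnly, both)
  }

  def main(args: Array[String]): Unit = {
    val aGt3 = Atom("a > 3", Set("tableA"))
    val bEqD = Atom("b = d", Set("tableA", "tableB"))
    val bEqE = Atom("b = e", Set("tableA", "tableB"))
    val condition = Or(And(aGt3, bEqD), And(aGt3, bEqE))

    // Without simplification the whole Or references both sides: nothing is pushed.
    println(split(condition, "tableA", "tableB"))
    // After factoring out `a > 3`, that conjunct can become a Filter on tableA.
    println(split(extractCommon(condition), "tableA", "tableB"))
  }
}
```

In this model the unsimplified condition yields no single-side conjuncts, while the factored form exposes `a > 3` as a left-only predicate, mirroring the `Filter (a#293 > 3)` that appears under the join in the plan after the change.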