[feature](nereids) Support inner join query rewrite by materialized view #27922
planner.plan(unboundMvPlan, PhysicalProperties.ANY, ExplainLevel.ALL_PLAN);
Plan mvAnalyzedPlan = planner.getAnalyzedPlan();
Plan mvRewrittenPlan = planner.getRewrittenPlan();
Plan mvPlan = mvRewrittenPlan instanceof LogicalResultSink
getAnalyzedPlan is only visible for testing.
Maybe use only the rewritten plan, and replace the lines above with:
Plan rewrittenPlan = planner.plan(plan, PhysicalProperties.ANY, ExplainLevel.REWRITTEN_PLAN);
Yeah, you are right.
-        // TODO Should get struct info from hyper graph and check
-        return false;
+        HyperGraph hyperGraph = structInfo.getHyperGraph();
+        HashSet<JoinType> requiredJoinType = Sets.newHashSet(JoinType.INNER_JOIN, JoinType.LEFT_OUTER_JOIN);
This can be a static member.
OK, I have fixed it.
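The static-member suggestion can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code merged in the PR: `JoinType` is a hypothetical stand-in for the Doris enum, and `JoinTypeCheck`, `SUPPORTED_JOIN_TYPES`, and `allSupported` are names invented for the example.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the Doris JoinType enum; only a few constants are listed.
enum JoinType { INNER_JOIN, LEFT_OUTER_JOIN, RIGHT_OUTER_JOIN, FULL_OUTER_JOIN, CROSS_JOIN }

final class JoinTypeCheck {
    // Built once when the class is loaded instead of on every call,
    // and wrapped so that callers cannot modify it.
    static final Set<JoinType> SUPPORTED_JOIN_TYPES =
            Collections.unmodifiableSet(EnumSet.of(JoinType.INNER_JOIN, JoinType.LEFT_OUTER_JOIN));

    private JoinTypeCheck() {
    }

    // Returns true only if every join type found in the plan is in the supported set.
    static boolean allSupported(Iterable<JoinType> joinTypesInPlan) {
        for (JoinType type : joinTypesInPlan) {
            if (!SUPPORTED_JOIN_TYPES.contains(type)) {
                return false;
            }
        }
        return true;
    }
}
```

Sharing one immutable set avoids allocating a new `HashSet` each time the check runs, which is presumably the point of the review comment.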
         // Can not compensate, bail out
-        if (compensatePredicates == null || compensatePredicates.isEmpty()) {
+        if (compensatePredicates.isEmpty()) {
What happens when the predicates are exactly the same?
If they are the same, compensatePredicates will be always true. This can be checked by calling org.apache.doris.nereids.rules.exploration.mv.Predicates.SplitPredicate#isAlwaysTrue.
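The distinction discussed here, between "cannot compensate" (empty result) and "nothing left to compensate" (always true), can be sketched as below. This is a deliberately simplified illustration under stated assumptions: predicates are modeled as plain strings compared by equality, and `CompensationSketch`, `compensate`, and `ALWAYS_TRUE` are hypothetical names. It is not how Doris's `SplitPredicate` works internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

final class CompensationSketch {
    // Hypothetical marker for a compensation that filters nothing.
    static final String ALWAYS_TRUE = "TRUE";

    private CompensationSketch() {
    }

    // Returns the predicates that must be applied on top of the view:
    //   empty list -> the view cannot answer the query (bail out)
    //   [TRUE]     -> query and view predicates are identical
    //   otherwise  -> the query predicates that the view does not already apply
    static List<String> compensate(Set<String> queryPredicates, Set<String> viewPredicates) {
        if (!queryPredicates.containsAll(viewPredicates)) {
            // The view applies a filter the query does not have, so rows may be missing.
            return Collections.emptyList();
        }
        List<String> residual = new ArrayList<>();
        for (String predicate : queryPredicates) {
            if (!viewPredicates.contains(predicate)) {
                residual.add(predicate);
            }
        }
        return residual.isEmpty() ? Collections.singletonList(ALWAYS_TRUE) : residual;
    }

    static boolean isAlwaysTrue(List<String> compensation) {
        return compensation.size() == 1 && ALWAYS_TRUE.equals(compensation.get(0));
    }
}
```

With this convention an empty result is reserved for failure, so identical predicates never trigger the bail-out branch.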
PR approved by anyone and no changes requested.
# Conflicts: # fe/fe-core/src/main/java/org/apache/doris/qe/StmtExecutor.java
if (context.getSessionVariable().isEnableMaterializedViewRewrite()) {
    planner.addHook(InitMaterializationContextHook.INSTANCE);
}
Would it be better to do this when the planner is initialized?
@@ -39,7 +39,7 @@ public class TableCollector extends DefaultPlanVisitor<Void, TableCollectorConte
     public Void visit(Plan plan, TableCollectorContext context) {
         if (plan instanceof CatalogRelation) {
             TableIf table = ((CatalogRelation) plan).getTable();
-            if (context.getTargetTableTypes().contains(table.getType())) {
+            if (context.getTargetTableTypes().isEmpty() || context.getTargetTableTypes().contains(table.getType())) {
Why add isEmpty? Please add a comment to explain it.
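One plausible reading of the added `isEmpty()` check, which the thread does not confirm, is that an empty target set means "no filter, collect every table". A minimal sketch of that convention follows, including the kind of explanatory comment the reviewer asks for. `TableType`, `TableFilterSketch`, and `collect` are hypothetical names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a table type enum.
enum TableType { OLAP, MATERIALIZED_VIEW, VIEW }

final class TableFilterSketch {
    private TableFilterSketch() {
    }

    // Collects the table types that match the target set.
    // Assumed convention: an empty target set means the caller did not restrict
    // the table types, so every table is collected.
    static List<TableType> collect(List<TableType> tables, Set<TableType> targetTypes) {
        List<TableType> collected = new ArrayList<>();
        for (TableType type : tables) {
            if (targetTypes.isEmpty() || targetTypes.contains(type)) {
                collected.add(type);
            }
        }
        return collected;
    }
}
```

Documenting the convention at the check itself keeps readers from mistaking the empty set for "collect nothing".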
@Override
public boolean equals(Object obj) {
    return super.equals(obj);
}

@Override
public int hashCode() {
    return super.hashCode();
}
Unnecessary code
@Override
public boolean equals(Object obj) {
    return super.equals(obj);
}

@Override
public int hashCode() {
    return super.hashCode();
}
Unnecessary code
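To illustrate why such overrides are unnecessary: an override that only delegates to `super` behaves exactly like the inherited method. The sketch below uses hypothetical classes (`NodeBase`, `WithOverrides`, `WithoutOverrides`) that are not from the Doris code base.

```java
// Hypothetical base class with value-based equality on a single field.
class NodeBase {
    private final int id;

    NodeBase(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof NodeBase && ((NodeBase) obj).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

// Overrides that only call super add no behavior.
class WithOverrides extends NodeBase {
    WithOverrides(int id) {
        super(id);
    }

    @Override
    public boolean equals(Object obj) {
        return super.equals(obj);
    }

    @Override
    public int hashCode() {
        return super.hashCode();
    }
}

// Same observable behavior with the overrides removed.
class WithoutOverrides extends NodeBase {
    WithoutOverrides(int id) {
        super(id);
    }
}
```

Both subclasses compare and hash identically, so removing the delegating overrides changes nothing for callers.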
PR approved by at least one committer and no changes requested.
…iew (apache#27922)
Work in progress. Support inner join query rewrite by materialized view in some scenarios. For example:
> mv = "select lineitem.L_LINENUMBER, orders.O_CUSTKEY " +
>     "from orders " +
>     "inner join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY "
> query = "select lineitem.L_LINENUMBER " +
>     "from lineitem " +
>     "inner join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY "
… condition has alias (#44779)
### What problem does this PR solve?
Related PR: #27922
Problem Summary: The query and MV definition are as follows. `partsupp.public_col as public_col` is an alias, which caused rewriting by materialized view to fail with the message "the graph logic between query and view is different".
    select
        o_custkey, o_orderdate, o_shippriority, o_comment, o_orderkey,
        orders.public_col as col1,
        l_orderkey, l_partkey, l_suppkey,
        lineitem.public_col as col2,
        ps_partkey, ps_suppkey,
        partsupp.public_col as col3,
        partsupp.public_col * 2 as col4,
        o_orderkey + l_orderkey + ps_partkey * 2,
        sum(o_orderkey + l_orderkey + ps_partkey * 2),
        count() as count_all
    from (
        select o_custkey, o_orderdate, o_shippriority, o_comment, o_orderkey,
            orders.public_col as public_col
        from orders
    ) orders
    left join (
        select l_orderkey, l_partkey, l_suppkey,
            lineitem.public_col as public_col
        from lineitem
        where lineitem.public_col is null or lineitem.public_col <> 1
    ) lineitem on l_orderkey = o_orderkey
    inner join (
        select ps_partkey, ps_suppkey,
            partsupp.public_col as public_col
        from partsupp
    ) partsupp on ps_partkey = o_orderkey
    where lineitem.public_col is null
        or lineitem.public_col <> 1 and o_orderkey = 2
    group by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14;
### Release note
Fix rewrite failure by materialized view when a filter or join condition has an alias
Proposed changes
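Work in progress. Support inner join query rewrite by materialized view in some scenarios.
For example:
> mv = "select lineitem.L_LINENUMBER, orders.O_CUSTKEY " +
>     "from orders " +
>     "inner join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY "
> query = "select lineitem.L_LINENUMBER " +
>     "from lineitem " +
>     "inner join orders on lineitem.L_ORDERKEY = orders.O_ORDERKEY "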