[feature](Nereids): one side eager aggregation #28143

Merged on Dec 12, 2023 (2 commits)

jackwener
Member

Proposed changes

Issue Number: close #xxx

@jackwener
Member Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit a49a17f0107ccf4e6acf516da70a7b3f3d19b574, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4748	4508	4494	4494
q2	370	148	166	148
q3	1453	1274	1206	1206
q4	1119	918	888	888
q5	3103	3141	3135	3135
q6	251	129	126	126
q7	976	488	488	488
q8	2185	2209	2176	2176
q9	6655	6672	6658	6658
q10	3205	3237	3264	3237
q11	320	195	185	185
q12	348	199	200	199
q13	4556	3811	3826	3811
q14	240	214	218	214
q15	557	518	530	518
q16	435	379	386	379
q17	992	597	568	568
q18	7434	7653	7438	7438
q19	1522	1370	1422	1370
q20	540	371	782	371
q21	3045	2680	2726	2680
q22	347	277	282	277
Total cold run time: 44401 ms
Total hot run time: 40566 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4470	4456	4462	4456
q2	266	165	175	165
q3	3545	3533	3511	3511
q4	2385	2359	2365	2359
q5	5735	5725	5735	5725
q6	240	122	123	122
q7	2361	1881	1904	1881
q8	3517	3520	3521	3520
q9	9066	8983	9015	8983
q10	3883	3976	3994	3976
q11	503	369	382	369
q12	760	596	582	582
q13	4277	3542	3550	3542
q14	286	262	264	262
q15	569	517	515	515
q16	499	435	452	435
q17	1861	1870	1857	1857
q18	8698	8164	9108	8164
q19	1707	1719	1760	1719
q20	2249	1937	1938	1937
q21	6501	6139	6136	6136
q22	491	420	413	413
Total cold run time: 63869 ms
Total hot run time: 60629 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.06 seconds
stream load tsv: 584 seconds loaded 74807831229 Bytes, about 122 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.6 seconds inserted 10000000 Rows, about 349K ops/s
storage size: 17216251346 Bytes

@jackwener
Member Author

run buildall

@doris-robot

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 68412eb874a571cc9942751c02458f78aae13592, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4671	4420	4478	4420
q2	371	143	159	143
q3	1462	1223	1206	1206
q4	1113	921	888	888
q5	3122	3173	3149	3149
q6	258	131	129	129
q7	1017	487	478	478
q8	2190	2250	2192	2192
q9	6658	6675	6639	6639
q10	3249	3260	3254	3254
q11	318	201	206	201
q12	346	205	211	205
q13	4549	3815	3779	3779
q14	247	209	219	209
q15	574	528	524	524
q16	440	383	383	383
q17	1015	598	603	598
q18	7530	7665	6850	6850
q19	1522	1398	1419	1398
q20	496	301	295	295
q21	3080	2633	2703	2633
q22	348	277	277	277
Total cold run time: 44576 ms
Total hot run time: 39850 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4408	4408	4425	4408
q2	265	162	173	162
q3	3535	3519	3524	3519
q4	2382	2365	2358	2358
q5	5736	5738	5733	5733
q6	238	119	123	119
q7	2369	1879	1895	1879
q8	3538	3527	3542	3527
q9	9022	9047	9014	9014
q10	3898	3998	3988	3988
q11	503	380	388	380
q12	768	586	601	586
q13	4304	3567	3542	3542
q14	284	252	255	252
q15	570	524	522	522
q16	493	466	453	453
q17	1848	1854	1863	1854
q18	8770	8259	8291	8259
q19	1706	1774	1757	1757
q20	2247	1958	1923	1923
q21	6487	6148	6117	6117
q22	511	413	401	401
Total cold run time: 63882 ms
Total hot run time: 60753 ms

@doris-robot

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 43.89 seconds
stream load tsv: 578 seconds loaded 74807831229 Bytes, about 123 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 33 seconds loaded 861443392 Bytes, about 24 MB/s
insert into select: 28.9 seconds inserted 10000000 Rows, about 346K ops/s
storage size: 17219185026 Bytes

logicalAggregate(innerLogicalJoin())
.when(agg -> agg.child().getOtherJoinConjuncts().isEmpty())
.whenNot(agg -> agg.child().children().stream().anyMatch(p -> p instanceof LogicalAggregate))
.when(agg -> agg.getGroupByExpressions().stream().allMatch(e -> e instanceof Slot))
Contributor

Why not put the "one side only" check in the pre-checking stage?

Member Author

This has little impact; I'll change it together with the next PR.
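
To illustrate the suggestion in this thread, here is a minimal, self-contained sketch of a "one side only" pre-check. It does not use the Nereids classes: `OneSideCheck`, `Side`, and `pushDownSide` are hypothetical names, and slots are modeled as plain strings. It assumes the check means that every aggregate input slot comes from the same join child.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for a rule pre-check; not the Nereids API. */
final class OneSideCheck {
    enum Side { LEFT, RIGHT }

    /**
     * Returns the single join side that supplies every aggregate input slot,
     * or empty when there are no inputs, the inputs span both sides,
     * or some input matches neither side.
     */
    static Optional<Side> pushDownSide(List<String> aggInputSlots,
                                       Set<String> leftOutput,
                                       Set<String> rightOutput) {
        if (aggInputSlots.isEmpty()) {
            return Optional.empty();
        }
        if (leftOutput.containsAll(aggInputSlots)) {
            return Optional.of(Side.LEFT);
        }
        if (rightOutput.containsAll(aggInputSlots)) {
            return Optional.of(Side.RIGHT);
        }
        return Optional.empty();
    }
}
```

A predicate of this shape could sit alongside the existing `.when(...)` guards, so that the rule body only runs once a side has been determined. Whether that fits the actual rule is a question for the follow-up PR.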

Count oldTopCnt = (Count) ((Alias) ne).child();

Slot slot = (Slot) oldTopCnt.child(0);
if (leftCntSlotToOutput.containsKey(slot)) {
Contributor

Could we do some pre-processing ahead of time, such as deciding which side will be pushed down, so that at this point we just do the transformation without the additional 'if-else'?

logicalAggregate(innerLogicalJoin())
.when(agg -> agg.child().getOtherJoinConjuncts().isEmpty())
.whenNot(agg -> agg.child().children().stream().anyMatch(p -> p instanceof LogicalAggregate))
.when(agg -> {
Contributor

Same as the comments above: could the 'one side only' check be done up front here?

Sum oldTopSum = (Sum) ((Alias) ne).child();

Slot slot = (Slot) oldTopSum.child(0);
if (leftSumSlotToOutput.containsKey(slot)) {
Contributor

Same as the comments above: could the transformation push to a known side here, without the 'if-else', by moving the side selection into earlier pre-processing?
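
For readers unfamiliar with the rewrite under review, the sketch below shows the identity that one-side eager aggregation generally relies on. It assumes an inner equi-join, grouping on the join key, and non-null values: aggregating one side before the join and then summing the partial results gives the same SUM and COUNT as aggregating after the join. The class and method names are hypothetical and rows are plain arrays; this is not the code from this PR.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo; left rows are {key, value}, right rows are just keys. */
final class EagerAggDemo {
    /** Baseline: join first, then SUM(v) and COUNT(v) grouped by the join key. */
    static Map<Integer, long[]> joinThenAgg(int[][] left, int[] rightKeys) {
        Map<Integer, long[]> out = new HashMap<>();
        for (int[] l : left) {
            for (int rk : rightKeys) {
                if (l[0] == rk) {
                    long[] acc = out.computeIfAbsent(l[0], k -> new long[2]);
                    acc[0] += l[1]; // SUM(v)
                    acc[1] += 1;    // COUNT(v)
                }
            }
        }
        return out;
    }

    /** Eager variant: pre-aggregate the left side by key, join, then sum the partials. */
    static Map<Integer, long[]> aggThenJoin(int[][] left, int[] rightKeys) {
        Map<Integer, long[]> partial = new HashMap<>();
        for (int[] l : left) {
            long[] acc = partial.computeIfAbsent(l[0], k -> new long[2]);
            acc[0] += l[1]; // partial SUM(v) per key
            acc[1] += 1;    // partial COUNT(v) per key
        }
        Map<Integer, long[]> out = new HashMap<>();
        for (int rk : rightKeys) {
            long[] p = partial.get(rk);
            if (p != null) {
                long[] acc = out.computeIfAbsent(rk, k -> new long[2]);
                acc[0] += p[0]; // top SUM over partial sums
                acc[1] += p[1]; // top COUNT becomes a SUM over partial counts
            }
        }
        return out;
    }
}
```

Note that in the eager variant the top-level COUNT is replaced by a SUM over the pushed-down counts, which is why the Count and Sum branches need to know which side produced the partial result.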

Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 12, 2023
Contributor

PR approved by at least one committer and no changes requested.

@jackwener jackwener merged commit d25cbdd into apache:master Dec 12, 2023
@jackwener jackwener deleted the eager branch December 12, 2023 07:38
@wuyanxing

Why is this an RBO rule rather than a CBO path? @jackwener

xzj7019 pushed a commit to xzj7019/doris that referenced this pull request Dec 13, 2023
XuJianxu pushed a commit to XuJianxu/doris that referenced this pull request Dec 14, 2023
HappenLee pushed a commit to HappenLee/incubator-doris that referenced this pull request Jan 12, 2024
Labels
approved Indicates a PR has been approved by one committer. reviewed
5 participants