Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix](nereids) topN filter: use ObjectId as map key instead of Topn node #46551

Merged
merged 1 commit into from
Jan 8, 2025

Conversation

englefly
Copy link
Contributor

@englefly englefly commented Jan 7, 2025

What problem does this PR solve?

Plan node is not good to be hash map key, because two plan nodes in different tree level may be regarded as "equal". for example, in following tree, topn1.equals(topn2) may be true.
Topn filter generator should distinguish them, and hence topn
node is not suitable to be used as hash map key.

topn1
-->some node
-->topn2
-->other node

Issue Number: close #xxx

Related PR: #31485

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Copy link
Contributor

Thearas commented Jan 7, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Copy link
Contributor Author

englefly commented Jan 7, 2025

run buildall

@doris-robot
Copy link

TPC-H: Total hot run time: 32382 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 8f4cbd6bd9e4880fb3e7d8e944ce3da563925228, data reload: false

------ Round 1 ----------------------------------
q1	17683	6207	6017	6017
q2	2057	303	173	173
q3	10479	1254	700	700
q4	10214	862	425	425
q5	7504	2150	1940	1940
q6	199	183	149	149
q7	883	738	604	604
q8	9245	1320	1115	1115
q9	5217	4916	4963	4916
q10	6751	2279	1832	1832
q11	472	281	264	264
q12	348	351	217	217
q13	17766	3692	3070	3070
q14	241	228	206	206
q15	578	514	501	501
q16	629	620	572	572
q17	567	829	336	336
q18	7267	6419	6314	6314
q19	1817	946	548	548
q20	304	320	188	188
q21	2931	2151	1992	1992
q22	354	329	303	303
Total cold run time: 103506 ms
Total hot run time: 32382 ms

----- Round 2, with runtime_filter_mode=off -----
q1	6239	6184	6196	6184
q2	233	325	230	230
q3	2184	2642	2330	2330
q4	1439	1822	1378	1378
q5	4322	4710	4754	4710
q6	189	179	144	144
q7	2149	1945	1809	1809
q8	2582	2753	2648	2648
q9	7308	7209	7187	7187
q10	3041	3292	2845	2845
q11	590	516	481	481
q12	664	743	600	600
q13	3377	3849	3171	3171
q14	274	309	282	282
q15	570	503	514	503
q16	642	683	647	647
q17	1198	1715	1242	1242
q18	7685	7451	7166	7166
q19	772	914	1053	914
q20	1958	1976	1804	1804
q21	5386	5058	4866	4866
q22	603	590	576	576
Total cold run time: 53405 ms
Total hot run time: 51717 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 190703 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 8f4cbd6bd9e4880fb3e7d8e944ce3da563925228, data reload: false

query1	982	383	381	381
query2	6523	2327	2233	2233
query3	6706	215	210	210
query4	33743	23988	23710	23710
query5	4346	638	483	483
query6	305	209	188	188
query7	4625	489	298	298
query8	318	249	242	242
query9	9748	2813	2800	2800
query10	474	316	256	256
query11	17961	15586	15387	15387
query12	162	110	106	106
query13	1680	544	413	413
query14	10579	7362	6809	6809
query15	239	197	187	187
query16	8174	598	461	461
query17	1589	764	568	568
query18	2092	403	315	315
query19	254	174	162	162
query20	124	115	111	111
query21	206	121	105	105
query22	4441	4374	4383	4374
query23	34593	33609	33531	33531
query24	6382	2280	2349	2280
query25	487	444	386	386
query26	1207	273	151	151
query27	2002	460	329	329
query28	5380	2496	2460	2460
query29	689	540	411	411
query30	224	181	152	152
query31	984	901	814	814
query32	74	60	58	58
query33	498	363	316	316
query34	748	827	530	530
query35	792	822	779	779
query36	1010	1061	968	968
query37	121	99	79	79
query38	4097	4133	4070	4070
query39	1529	1428	1426	1426
query40	203	108	98	98
query41	45	44	49	44
query42	119	108	100	100
query43	524	542	477	477
query44	1328	806	810	806
query45	181	172	171	171
query46	850	1036	649	649
query47	1920	1936	1857	1857
query48	372	420	323	323
query49	786	473	396	396
query50	638	677	373	373
query51	6792	7002	6908	6908
query52	104	101	92	92
query53	244	253	181	181
query54	474	483	405	405
query55	78	80	87	80
query56	261	258	229	229
query57	1201	1191	1120	1120
query58	239	224	226	224
query59	2984	3153	2938	2938
query60	282	262	277	262
query61	118	115	127	115
query62	880	802	746	746
query63	242	197	200	197
query64	4279	999	697	697
query65	3287	3203	3219	3203
query66	1069	419	312	312
query67	15879	15787	15466	15466
query68	7696	713	515	515
query69	463	289	257	257
query70	1191	1159	1155	1155
query71	462	286	260	260
query72	6228	3831	3740	3740
query73	663	751	364	364
query74	10306	9262	8875	8875
query75	4056	3162	2674	2674
query76	3738	1197	770	770
query77	771	360	278	278
query78	10032	10090	9401	9401
query79	3657	820	595	595
query80	756	515	472	472
query81	489	260	240	240
query82	602	150	127	127
query83	198	168	143	143
query84	283	91	69	69
query85	768	355	308	308
query86	419	292	305	292
query87	4684	4570	4405	4405
query88	4693	2232	2198	2198
query89	409	336	297	297
query90	1931	191	190	190
query91	146	138	106	106
query92	68	58	58	58
query93	1663	854	543	543
query94	662	373	302	302
query95	345	271	267	267
query96	492	614	282	282
query97	2948	2947	2883	2883
query98	233	203	202	202
query99	1691	1572	1427	1427
Total cold run time: 294210 ms
Total hot run time: 190703 ms

@doris-robot
Copy link

ClickBench: Total hot run time: 31.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 8f4cbd6bd9e4880fb3e7d8e944ce3da563925228, data reload: false

query1	0.06	0.03	0.03
query2	0.07	0.03	0.03
query3	0.23	0.08	0.07
query4	1.60	0.11	0.11
query5	0.42	0.43	0.41
query6	1.15	0.64	0.65
query7	0.02	0.02	0.02
query8	0.04	0.03	0.03
query9	0.59	0.49	0.49
query10	0.54	0.56	0.55
query11	0.15	0.10	0.11
query12	0.14	0.10	0.10
query13	0.60	0.61	0.59
query14	2.84	2.75	2.72
query15	0.90	0.84	0.83
query16	0.38	0.39	0.39
query17	1.06	1.04	1.02
query18	0.25	0.21	0.22
query19	1.97	2.00	1.79
query20	0.02	0.01	0.01
query21	15.35	0.93	0.57
query22	0.76	0.79	0.61
query23	15.38	1.46	0.56
query24	3.06	1.44	1.04
query25	0.20	0.06	0.14
query26	0.31	0.15	0.13
query27	0.07	0.05	0.05
query28	14.01	1.53	1.04
query29	12.58	4.04	3.37
query30	0.25	0.08	0.08
query31	2.82	0.63	0.39
query32	3.24	0.54	0.45
query33	3.09	3.16	3.13
query34	16.80	5.08	4.49
query35	4.49	4.47	4.48
query36	0.78	0.47	0.48
query37	0.09	0.06	0.06
query38	0.04	0.03	0.03
query39	0.03	0.02	0.02
query40	0.17	0.13	0.13
query41	0.07	0.02	0.02
query42	0.04	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 106.69 s
Total hot run time: 31.36 s

@morrySnow
Copy link
Contributor

add related PR in desc

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 8, 2025
Copy link
Contributor

github-actions bot commented Jan 8, 2025

PR approved by at least one committer and no changes requested.

Copy link
Contributor

github-actions bot commented Jan 8, 2025

PR approved by anyone and no changes requested.

@englefly englefly merged commit 811f936 into apache:master Jan 8, 2025
29 of 31 checks passed
@englefly englefly deleted the fix-topn-map branch January 8, 2025 02:25
github-actions bot pushed a commit that referenced this pull request Jan 8, 2025
…ode (#46551)

### What problem does this PR solve?
Plan node is not good to be hash map key, because two plan nodes in
different tree level may be regarded as "equal". for example, in
following tree, topn1.equals(topn2) may be true.
 Topn filter generator should distinguish them, and hence topn
node is not suitable to be used as hash map key.

topn1
    -->some node
        -->topn2
             -->other node

Related PR: #31485
englefly added a commit to englefly/incubator-doris that referenced this pull request Jan 8, 2025
…ode (apache#46551)

### What problem does this PR solve?
Plan node is not good to be hash map key, because two plan nodes in
different tree level may be regarded as "equal". for example, in
following tree, topn1.equals(topn2) may be true.
 Topn filter generator should distinguish them, and hence topn
node is not suitable to be used as hash map key.

topn1
    -->some node
        -->topn2
             -->other node

Related PR: apache#31485

(cherry picked from commit 811f936)
morrySnow pushed a commit that referenced this pull request Jan 8, 2025
…ad of Topn node #46551 (#46584)

Cherry-picked from #46551

Co-authored-by: minghong <zhouminghong@selectdb.com>
englefly added a commit that referenced this pull request Jan 8, 2025
…ode (#46551) (branch-3.0) (#46585)

pick#46551

### What problem does this PR solve?
Plan node is not good to be hash map key, because two plan nodes in
different tree level may be regarded as "equal". for example, in
following tree, topn1.equals(topn2) may be true.
 Topn filter generator should distinguish them, and hence topn
node is not suitable to be used as hash map key.

topn1
    -->some node
        -->topn2
             -->other node

Related PR: #31485

(cherry picked from commit 811f936)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/3.0.4-merged p0_b reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants