[opt](function)Optimize the performance of the pad function under UTF-8. #40162

Merged
zhangstar333 merged 3 commits into apache:master on Sep 9, 2024

Conversation

Mryange
Contributor

@Mryange Mryange commented Aug 30, 2024

Proposed changes

  1. Removed the calculation of str_index.
  2. If pad is constant, calculate pad_index only once.
  3. Do not insert res_chars inside the loop; instead, insert them all together after the loop completes.
mysql [test]>select count(lpad(Title, 100, "abc")) from hits_10m;
+--------------------------------+
| count(lpad(Title, 100, 'abc')) |
+--------------------------------+
|                       10000000 |
+--------------------------------+
1 row in set (3.97 sec)

mysql [test]>select count(lpad(Title, 100, "abc")) from hits_10m;
+--------------------------------+
| count(lpad(Title, 100, 'abc')) |
+--------------------------------+
|                       10000000 |
+--------------------------------+
1 row in set (2.87 sec)
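
The three changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FlatStringColumn`, `utf8_char_len`, `utf8_char_offsets`, and `lpad_const_pad` are hypothetical names. The sketch assumes a constant pad string, treats invalid lead bytes as one-byte characters, and produces an empty row when the pad is empty. It computes the pad's character offsets once, walks each row for at most the target number of characters without building a per-row character index (one reading of item 1), and appends the output bytes to the result column in a single insert after the loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flat string column: row bytes stored back to back, plus the
// end offset of each row.
struct FlatStringColumn {
    std::string chars;
    std::vector<std::size_t> offsets;

    std::string row(std::size_t i) const {
        assert(i < offsets.size());
        const std::size_t begin = (i == 0) ? 0 : offsets[i - 1];
        return chars.substr(begin, offsets[i] - begin);
    }
};

// Byte width of the UTF-8 character introduced by `lead`. Invalid lead bytes
// count as one-byte characters (an assumption of this sketch).
inline std::size_t utf8_char_len(unsigned char lead) {
    if ((lead >> 5) == 0x6) return 2;
    if ((lead >> 4) == 0xE) return 3;
    if ((lead >> 3) == 0x1E) return 4;
    return 1;
}

// Byte offset at which each character of `s` starts, followed by s.size().
inline std::vector<std::size_t> utf8_char_offsets(const std::string& s) {
    std::vector<std::size_t> offsets;
    std::size_t i = 0;
    while (i < s.size()) {
        offsets.push_back(i);
        i = std::min(i + utf8_char_len(static_cast<unsigned char>(s[i])), s.size());
    }
    offsets.push_back(s.size());
    return offsets;
}

inline FlatStringColumn lpad_const_pad(const std::vector<std::string>& rows,
                                       std::size_t target_chars, const std::string& pad) {
    // The pad is constant, so its character offsets are computed once.
    const std::vector<std::size_t> pad_offsets = utf8_char_offsets(pad);
    const std::size_t pad_chars = pad_offsets.size() - 1;

    FlatStringColumn result;
    std::string buffer;  // output bytes accumulate here during the loop
    for (const std::string& row : rows) {
        // Walk at most target_chars characters of the row.
        std::size_t bytes = 0;
        std::size_t chars = 0;
        while (bytes < row.size() && chars < target_chars) {
            bytes = std::min(bytes + utf8_char_len(static_cast<unsigned char>(row[bytes])),
                             row.size());
            ++chars;
        }
        if (chars == target_chars) {
            // The row is long enough: keep its first target_chars characters.
            buffer.append(row, 0, bytes);
        } else if (pad_chars > 0) {
            const std::size_t need = target_chars - chars;
            for (std::size_t k = 0; k < need / pad_chars; ++k) buffer.append(pad);
            buffer.append(pad, 0, pad_offsets[need % pad_chars]);
            buffer.append(row);
        }
        // Empty pad and a short row: nothing is appended, so the row is empty.
        result.offsets.push_back(buffer.size());
    }
    // One bulk insert after the loop instead of one insert per row.
    result.chars.insert(result.chars.end(), buffer.begin(), buffer.end());
    return result;
}
```

In this sketch, padding the rows `hi` and `hello world` to 5 characters with the pad `ab` yields `abahi` and `hello`.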

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@Mryange
Contributor Author

Mryange commented Aug 30, 2024

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 38528 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit ab4e6daf6c3df5db63b4698025ef2a6b45edc4ab, data reload: false

------ Round 1 ----------------------------------
q1	17609	4418	4343	4343
q2	2019	178	191	178
q3	11687	953	1076	953
q4	10532	730	769	730
q5	7760	2873	2783	2783
q6	226	140	144	140
q7	969	637	602	602
q8	9322	2182	2093	2093
q9	6966	6543	6562	6543
q10	6992	2272	2252	2252
q11	463	240	248	240
q12	408	239	237	237
q13	17774	3071	3076	3071
q14	284	238	239	238
q15	513	483	494	483
q16	590	498	495	495
q17	1001	723	758	723
q18	7294	6922	6998	6922
q19	1388	1160	1035	1035
q20	702	344	339	339
q21	3898	3239	3135	3135
q22	1121	993	1020	993
Total cold run time: 109518 ms
Total hot run time: 38528 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4353	4333	4238	4238
q2	387	275	264	264
q3	2860	2634	2651	2634
q4	1997	1643	1635	1635
q5	5618	5740	5777	5740
q6	239	140	139	139
q7	2280	1848	1821	1821
q8	3307	3491	3506	3491
q9	8866	8853	8885	8853
q10	3626	3397	3357	3357
q11	602	508	501	501
q12	833	677	671	671
q13	15129	3200	3286	3200
q14	319	289	292	289
q15	538	477	491	477
q16	650	602	562	562
q17	1861	1578	1553	1553
q18	8034	7854	7773	7773
q19	1734	1536	1622	1536
q20	2170	1937	1912	1912
q21	5734	5586	5836	5586
q22	1145	1068	1026	1026
Total cold run time: 72282 ms
Total hot run time: 57258 ms

@doris-robot

TPC-DS: Total hot run time: 192837 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit ab4e6daf6c3df5db63b4698025ef2a6b45edc4ab, data reload: false

query1	1247	886	857	857
query2	6393	1972	1958	1958
query3	10608	3836	3936	3836
query4	59334	24272	23213	23213
query5	5414	503	503	503
query6	399	164	170	164
query7	5762	294	297	294
query8	293	199	201	199
query9	8965	2504	2480	2480
query10	476	274	275	274
query11	18230	14988	15261	14988
query12	160	112	113	112
query13	1566	382	402	382
query14	11264	7125	7542	7125
query15	227	170	168	168
query16	7555	470	465	465
query17	1153	571	562	562
query18	2053	308	322	308
query19	304	149	149	149
query20	127	110	111	110
query21	203	115	105	105
query22	4807	4648	4674	4648
query23	34309	33539	34186	33539
query24	5958	2885	2856	2856
query25	506	409	396	396
query26	681	160	164	160
query27	1749	285	281	281
query28	3618	2119	2090	2090
query29	654	429	428	428
query30	244	155	155	155
query31	923	782	758	758
query32	84	54	59	54
query33	471	289	301	289
query34	877	487	493	487
query35	834	731	721	721
query36	1065	948	911	911
query37	146	95	98	95
query38	4000	3823	3926	3823
query39	1446	1406	1378	1378
query40	206	117	118	117
query41	48	49	49	49
query42	119	103	97	97
query43	529	485	472	472
query44	1108	752	751	751
query45	199	173	169	169
query46	1098	733	744	733
query47	1905	1835	1835	1835
query48	379	306	299	299
query49	776	458	445	445
query50	814	433	424	424
query51	7124	7179	6931	6931
query52	99	87	87	87
query53	261	182	177	177
query54	566	458	524	458
query55	74	76	77	76
query56	278	250	260	250
query57	1198	1081	1040	1040
query58	215	243	221	221
query59	3079	2864	2721	2721
query60	293	267	264	264
query61	99	101	120	101
query62	758	659	635	635
query63	217	180	188	180
query64	2836	693	685	685
query65	3247	3131	3165	3131
query66	678	346	355	346
query67	15346	15071	15282	15071
query68	3397	599	597	597
query69	410	283	308	283
query70	1155	1141	1117	1117
query71	340	274	280	274
query72	6041	4010	4046	4010
query73	769	340	337	337
query74	9366	8794	8844	8794
query75	3381	2643	2726	2643
query76	1480	1026	1007	1007
query77	539	317	365	317
query78	9785	9042	9123	9042
query79	1040	549	542	542
query80	715	528	517	517
query81	554	232	232	232
query82	258	151	150	150
query83	170	149	149	149
query84	261	75	75	75
query85	745	301	283	283
query86	311	276	296	276
query87	4367	4231	4272	4231
query88	3328	2332	2348	2332
query89	398	286	284	284
query90	2131	195	203	195
query91	175	101	98	98
query92	60	51	55	51
query93	1076	534	541	534
query94	762	303	295	295
query95	357	266	266	266
query96	594	269	273	269
query97	3168	3086	3089	3086
query98	217	198	204	198
query99	1657	1267	1269	1267
Total cold run time: 306661 ms
Total hot run time: 192837 ms

@doris-robot

ClickBench: Total hot run time: 32.31 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit ab4e6daf6c3df5db63b4698025ef2a6b45edc4ab, data reload: false

query1	0.05	0.05	0.04
query2	0.07	0.04	0.03
query3	0.22	0.06	0.05
query4	1.68	0.08	0.07
query5	0.51	0.49	0.49
query6	1.13	0.72	0.74
query7	0.02	0.02	0.01
query8	0.05	0.04	0.04
query9	0.55	0.47	0.47
query10	0.53	0.53	0.55
query11	0.16	0.11	0.11
query12	0.15	0.12	0.12
query13	0.62	0.59	0.59
query14	2.05	2.06	2.10
query15	0.90	0.82	0.80
query16	0.37	0.37	0.41
query17	1.08	1.03	1.07
query18	0.21	0.21	0.20
query19	1.95	1.70	1.71
query20	0.02	0.01	0.01
query21	15.41	0.67	0.67
query22	4.01	6.76	2.29
query23	18.29	1.43	1.30
query24	2.16	0.23	0.22
query25	0.16	0.08	0.08
query26	0.27	0.18	0.18
query27	0.08	0.08	0.08
query28	13.22	1.01	1.00
query29	12.62	3.28	3.26
query30	0.24	0.06	0.05
query31	2.88	0.40	0.39
query32	3.27	0.49	0.49
query33	2.99	2.95	3.03
query34	16.96	4.41	4.41
query35	4.45	4.43	4.44
query36	0.67	0.48	0.49
query37	0.20	0.17	0.16
query38	0.16	0.15	0.16
query39	0.06	0.03	0.04
query40	0.17	0.12	0.12
query41	0.10	0.05	0.05
query42	0.06	0.05	0.04
query43	0.05	0.05	0.04
Total cold run time: 110.8 s
Total hot run time: 32.31 s

@Mryange Mryange changed the title from "[only test now]" to "[opt](function)Optimize the performance of the pad function under UTF-8." on Aug 30, 2024
@Mryange Mryange marked this pull request as ready for review August 30, 2024 10:16
@Mryange
Contributor Author

Mryange commented Aug 30, 2024

run buildall

Contributor

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 38413 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 09aa6ada6fc6dbc0e7223dfdf9ace2d18c323c0b, data reload: false

------ Round 1 ----------------------------------
q1	17612	4383	4340	4340
q2	2010	181	183	181
q3	10589	1202	1181	1181
q4	10132	798	783	783
q5	7981	2948	2871	2871
q6	235	140	138	138
q7	986	624	604	604
q8	9605	2076	2053	2053
q9	7079	6526	6513	6513
q10	7007	2204	2171	2171
q11	441	238	246	238
q12	403	225	225	225
q13	18642	3057	3010	3010
q14	284	243	231	231
q15	536	476	495	476
q16	576	504	511	504
q17	989	739	693	693
q18	7281	6933	6785	6785
q19	1392	998	1019	998
q20	708	323	343	323
q21	3963	3282	3098	3098
q22	1094	997	1023	997
Total cold run time: 109545 ms
Total hot run time: 38413 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4357	4343	4264	4264
q2	386	270	266	266
q3	2918	2665	2671	2665
q4	1932	1677	1650	1650
q5	5395	5413	5410	5410
q6	229	134	133	133
q7	2081	1730	1739	1730
q8	3198	3357	3388	3357
q9	8463	8411	8416	8411
q10	3472	3157	3183	3157
q11	600	509	515	509
q12	828	621	618	618
q13	12564	3084	3029	3029
q14	309	278	276	276
q15	515	500	475	475
q16	640	571	558	558
q17	1826	1490	1473	1473
q18	7861	7433	7491	7433
q19	1685	1583	1494	1494
q20	2059	1816	1829	1816
q21	5357	5181	5156	5156
q22	1092	1012	1040	1012
Total cold run time: 67767 ms
Total hot run time: 54892 ms

@doris-robot

TPC-DS: Total hot run time: 187491 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 09aa6ada6fc6dbc0e7223dfdf9ace2d18c323c0b, data reload: false

query1	909	381	371	371
query2	6467	2048	2065	2048
query3	6661	210	219	210
query4	31175	23182	23122	23122
query5	4177	491	475	475
query6	253	175	164	164
query7	4603	305	288	288
query8	249	208	201	201
query9	8622	2459	2466	2459
query10	426	265	270	265
query11	17842	14913	15063	14913
query12	152	102	103	102
query13	1642	391	386	386
query14	10091	6859	7355	6859
query15	278	179	174	174
query16	8062	475	411	411
query17	1613	561	553	553
query18	2126	310	288	288
query19	332	142	140	140
query20	116	112	113	112
query21	208	103	101	101
query22	4362	4345	4120	4120
query23	33786	33408	34953	33408
query24	11221	2930	2825	2825
query25	514	378	385	378
query26	788	157	156	156
query27	2711	286	284	284
query28	7390	2119	2112	2112
query29	638	429	401	401
query30	315	162	156	156
query31	954	754	796	754
query32	99	58	58	58
query33	742	299	274	274
query34	960	482	484	482
query35	852	735	727	727
query36	1091	905	936	905
query37	149	95	94	94
query38	3981	3835	3856	3835
query39	1447	1383	1396	1383
query40	268	119	115	115
query41	49	46	47	46
query42	112	95	98	95
query43	528	464	478	464
query44	1285	765	770	765
query45	193	164	172	164
query46	1134	746	746	746
query47	1855	1745	1787	1745
query48	362	294	289	289
query49	1031	427	425	425
query50	812	407	419	407
query51	7222	7100	6981	6981
query52	101	90	93	90
query53	258	189	184	184
query54	864	451	448	448
query55	78	78	75	75
query56	271	250	255	250
query57	1182	1092	1093	1092
query58	235	238	241	238
query59	3106	2906	2749	2749
query60	306	266	296	266
query61	100	97	102	97
query62	847	677	678	677
query63	212	185	183	183
query64	4164	672	656	656
query65	3218	3137	3154	3137
query66	1268	343	354	343
query67	15667	15174	15219	15174
query68	3125	580	577	577
query69	393	278	262	262
query70	1153	1133	1021	1021
query71	333	277	274	274
query72	6027	4026	4017	4017
query73	750	331	348	331
query74	9165	8846	8947	8846
query75	3371	2740	2699	2699
query76	1899	986	961	961
query77	501	312	327	312
query78	9526	9009	9031	9009
query79	1045	533	541	533
query80	705	499	500	499
query81	453	241	236	236
query82	246	144	150	144
query83	170	149	151	149
query84	236	86	75	75
query85	731	354	283	283
query86	321	312	256	256
query87	4373	4295	4253	4253
query88	3416	2333	2323	2323
query89	378	292	290	290
query90	1881	197	193	193
query91	125	101	102	101
query92	66	61	52	52
query93	1046	542	540	540
query94	696	290	303	290
query95	367	262	264	262
query96	587	268	272	268
query97	3192	3050	3117	3050
query98	216	206	205	205
query99	1480	1307	1285	1285
Total cold run time: 282705 ms
Total hot run time: 187491 ms

@doris-robot

ClickBench: Total hot run time: 31.94 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 09aa6ada6fc6dbc0e7223dfdf9ace2d18c323c0b, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.68	0.08	0.08
query5	0.49	0.50	0.49
query6	1.13	0.74	0.73
query7	0.02	0.01	0.02
query8	0.05	0.04	0.05
query9	0.55	0.48	0.49
query10	0.53	0.53	0.53
query11	0.15	0.12	0.12
query12	0.15	0.13	0.12
query13	0.61	0.59	0.60
query14	2.07	2.06	2.10
query15	0.89	0.82	0.82
query16	0.38	0.36	0.37
query17	1.05	1.01	0.97
query18	0.22	0.22	0.20
query19	1.80	1.69	1.69
query20	0.01	0.01	0.01
query21	15.40	0.68	0.65
query22	4.26	7.27	1.95
query23	18.34	1.37	1.35
query24	2.10	0.23	0.23
query25	0.15	0.08	0.07
query26	0.27	0.18	0.18
query27	0.08	0.07	0.08
query28	13.17	1.02	0.99
query29	12.63	3.32	3.31
query30	0.24	0.06	0.05
query31	2.86	0.40	0.38
query32	3.26	0.47	0.47
query33	2.97	2.98	3.02
query34	17.09	4.34	4.38
query35	4.41	4.44	4.41
query36	0.66	0.49	0.48
query37	0.19	0.16	0.15
query38	0.15	0.15	0.16
query39	0.05	0.04	0.04
query40	0.16	0.13	0.13
query41	0.09	0.05	0.05
query42	0.06	0.05	0.04
query43	0.04	0.04	0.05
Total cold run time: 110.76 s
Total hot run time: 31.94 s

@Mryange
Contributor Author

Mryange commented Sep 1, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 38523 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 09aa6ada6fc6dbc0e7223dfdf9ace2d18c323c0b, data reload: false

------ Round 1 ----------------------------------
q1	18124	4537	4391	4391
q2	2872	178	182	178
q3	10773	1149	1182	1149
q4	10303	836	794	794
q5	8028	2918	2867	2867
q6	235	140	142	140
q7	975	623	605	605
q8	9403	2102	2070	2070
q9	7243	6540	6554	6540
q10	7007	2213	2201	2201
q11	449	241	254	241
q12	385	227	223	223
q13	17970	3062	3059	3059
q14	283	237	241	237
q15	520	481	478	478
q16	585	491	518	491
q17	983	702	742	702
q18	7333	6979	6826	6826
q19	1387	1123	1139	1123
q20	687	332	340	332
q21	3926	3055	2880	2880
q22	1082	1029	996	996
Total cold run time: 110553 ms
Total hot run time: 38523 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4392	4363	4287	4287
q2	395	272	271	271
q3	2904	2690	2657	2657
q4	2029	1676	1649	1649
q5	5406	5430	5476	5430
q6	227	131	128	128
q7	2128	1770	1728	1728
q8	3228	3374	3400	3374
q9	8475	8547	8502	8502
q10	3452	3174	3193	3174
q11	609	489	512	489
q12	797	613	613	613
q13	12225	3080	3034	3034
q14	315	280	274	274
q15	536	488	481	481
q16	633	555	561	555
q17	1816	1515	1503	1503
q18	7804	7379	7438	7379
q19	1683	1617	1511	1511
q20	2062	1832	1857	1832
q21	5415	5272	5177	5177
q22	1130	1037	1047	1037
Total cold run time: 67661 ms
Total hot run time: 55085 ms

@doris-robot

TPC-DS: Total hot run time: 189125 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 09aa6ada6fc6dbc0e7223dfdf9ace2d18c323c0b, data reload: false

query1	926	376	370	370
query2	6460	2073	1998	1998
query3	6665	225	214	214
query4	33825	23206	23205	23205
query5	4141	505	495	495
query6	262	168	165	165
query7	4569	292	304	292
query8	280	208	217	208
query9	8822	2481	2479	2479
query10	458	267	261	261
query11	16931	15029	15131	15029
query12	152	101	101	101
query13	1635	391	375	375
query14	9695	7432	7261	7261
query15	262	173	182	173
query16	7397	462	435	435
query17	1619	551	559	551
query18	1842	291	287	287
query19	328	144	141	141
query20	117	108	109	108
query21	206	112	104	104
query22	4304	4208	4093	4093
query23	34117	33530	33456	33456
query24	11010	2888	2855	2855
query25	618	396	386	386
query26	1087	158	157	157
query27	2363	284	280	280
query28	7116	2136	2121	2121
query29	757	417	400	400
query30	296	154	153	153
query31	977	752	800	752
query32	93	58	59	58
query33	720	299	289	289
query34	912	470	488	470
query35	864	731	725	725
query36	1091	962	953	953
query37	154	94	98	94
query38	4081	3958	3860	3860
query39	1459	1426	1403	1403
query40	209	123	120	120
query41	50	50	47	47
query42	123	105	103	103
query43	538	499	498	498
query44	1224	772	754	754
query45	204	177	169	169
query46	1102	769	734	734
query47	1917	1820	1800	1800
query48	383	309	306	306
query49	1059	453	447	447
query50	812	413	422	413
query51	7190	7091	7039	7039
query52	104	91	90	90
query53	260	189	185	185
query54	851	460	466	460
query55	78	81	82	81
query56	286	275	271	271
query57	1195	1086	1076	1076
query58	263	233	237	233
query59	3082	2910	2746	2746
query60	303	278	275	275
query61	147	119	120	119
query62	825	665	682	665
query63	236	206	191	191
query64	5033	764	739	739
query65	3263	3187	3166	3166
query66	1476	359	352	352
query67	15430	15333	15504	15333
query68	4543	580	571	571
query69	416	284	287	284
query70	1202	1162	1132	1132
query71	336	282	281	281
query72	6928	4296	4140	4140
query73	765	329	332	329
query74	9289	8766	8808	8766
query75	3608	2726	2735	2726
query76	2268	936	1018	936
query77	488	334	323	323
query78	9774	9055	9113	9055
query79	2346	539	550	539
query80	1272	513	508	508
query81	597	236	232	232
query82	643	155	147	147
query83	236	149	149	149
query84	238	77	76	76
query85	1848	298	287	287
query86	484	299	304	299
query87	4454	4304	4281	4281
query88	4005	2338	2300	2300
query89	396	303	288	288
query90	1931	198	190	190
query91	127	102	102	102
query92	63	53	56	53
query93	2041	551	549	549
query94	986	301	302	301
query95	359	255	254	254
query96	601	267	265	265
query97	3211	3107	3116	3107
query98	224	209	199	199
query99	1536	1288	1287	1287
Total cold run time: 293469 ms
Total hot run time: 189125 ms

@doris-robot

ClickBench: Total hot run time: 32.36 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 09aa6ada6fc6dbc0e7223dfdf9ace2d18c323c0b, data reload: false

query1	0.05	0.04	0.04
query2	0.08	0.04	0.04
query3	0.22	0.05	0.05
query4	1.68	0.09	0.08
query5	0.50	0.49	0.51
query6	1.14	0.75	0.73
query7	0.02	0.01	0.01
query8	0.06	0.05	0.04
query9	0.55	0.50	0.48
query10	0.54	0.54	0.55
query11	0.15	0.11	0.11
query12	0.15	0.12	0.12
query13	0.61	0.59	0.60
query14	2.02	2.08	2.13
query15	0.90	0.84	0.85
query16	0.38	0.37	0.38
query17	1.05	1.05	1.01
query18	0.22	0.19	0.20
query19	1.93	1.86	1.85
query20	0.02	0.01	0.01
query21	15.41	0.70	0.67
query22	4.16	7.61	2.00
query23	18.22	1.42	1.31
query24	2.12	0.21	0.21
query25	0.14	0.09	0.08
query26	0.28	0.18	0.18
query27	0.08	0.08	0.08
query28	13.25	1.03	1.01
query29	12.61	3.34	3.34
query30	0.24	0.06	0.06
query31	2.89	0.40	0.39
query32	3.26	0.48	0.47
query33	2.98	3.01	3.05
query34	17.06	4.42	4.38
query35	4.49	4.41	4.43
query36	0.67	0.49	0.48
query37	0.19	0.16	0.16
query38	0.16	0.15	0.15
query39	0.05	0.04	0.04
query40	0.15	0.12	0.13
query41	0.09	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 110.87 s
Total hot run time: 32.36 s

res_chars, res_offsets);
auto [real_len, skip_chars] = simd::VStringFunctions::skip_leading_utf8(
(const char*)str_data, (const char*)str_data + str_len, len);
if (len <= skip_chars) {
Contributor

This seems weird: skip_chars always seems to be <= len.
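
The invariant behind this comment can be illustrated with a small sketch. The helper below is hypothetical (the name `walk_utf8_up_to` and its handling of invalid or truncated bytes are assumptions, and it is not the function from this PR): it walks at most `limit` UTF-8 characters of a byte range and returns the bytes and characters walked. Because the loop stops as soon as `limit` characters have been counted, the returned character count can never exceed `limit`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: walk at most `limit` UTF-8 characters of [begin, end)
// and return {bytes walked, characters walked}.
// Invalid lead bytes are treated as one-byte characters, and a character cut
// off by `end` is clamped to the remaining bytes (assumptions of this sketch).
inline std::pair<std::size_t, std::size_t> walk_utf8_up_to(const char* begin, const char* end,
                                                           std::size_t limit) {
    assert(begin <= end);
    const char* p = begin;
    std::size_t chars = 0;
    while (p < end && chars < limit) {
        const unsigned char lead = static_cast<unsigned char>(*p);
        std::size_t width = 1;
        if ((lead >> 5) == 0x6) {
            width = 2;
        } else if ((lead >> 4) == 0xE) {
            width = 3;
        } else if ((lead >> 3) == 0x1E) {
            width = 4;
        }
        const std::size_t remaining = static_cast<std::size_t>(end - p);
        p += (width < remaining) ? width : remaining;
        ++chars;  // the loop condition keeps chars <= limit
    }
    return {static_cast<std::size_t>(p - begin), chars};
}
```

With this shape, a caller only ever sees a character count that is smaller than or equal to the limit it passed in.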

@@ -187,6 +187,19 @@ class VStringFunctions {
return p;
}

static inline std::pair<size_t, size_t> skip_leading_utf8(const char* begin, const char* end,
Contributor

Add a comment and change the function name; we should know what the function does. If only one place calls the function, it is better as a lambda.

@Mryange
Contributor Author

Mryange commented Sep 2, 2024

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

}

template <bool str_const, bool len_const, bool pad_const>
void execute_utf8(const ColumnString::Offsets& strcol_offsets,
Contributor

warning: function 'execute_utf8' exceeds recommended size/complexity thresholds [readability-function-size]

    void execute_utf8(const ColumnString::Offsets& strcol_offsets,
         ^
Additional context

be/src/vec/functions/function_string.h:1574: 82 lines including whitespace and comments (threshold 80)

    void execute_utf8(const ColumnString::Offsets& strcol_offsets,
         ^

@doris-robot

TPC-H: Total hot run time: 37769 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 2d14a30a861756222850c50eae479fde69102f75, data reload: false

------ Round 1 ----------------------------------
q1	17619	4684	4340	4340
q2	2011	186	175	175
q3	11476	931	1047	931
q4	10515	731	657	657
q5	7771	2864	2762	2762
q6	234	138	135	135
q7	932	609	594	594
q8	9339	2066	2079	2066
q9	7182	6547	6532	6532
q10	7045	2189	2190	2189
q11	477	245	238	238
q12	388	221	228	221
q13	18913	3052	3066	3052
q14	278	241	236	236
q15	514	491	485	485
q16	608	526	494	494
q17	980	697	706	697
q18	7280	6702	7024	6702
q19	1392	1014	955	955
q20	678	328	331	328
q21	4119	3141	2968	2968
q22	1120	1025	1012	1012
Total cold run time: 110871 ms
Total hot run time: 37769 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4359	4293	4285	4285
q2	385	277	279	277
q3	2897	2633	2684	2633
q4	1939	1674	1652	1652
q5	5647	5681	5807	5681
q6	233	132	135	132
q7	2238	1813	1844	1813
q8	3286	3433	3523	3433
q9	8839	8850	8784	8784
q10	3638	3386	3358	3358
q11	584	493	506	493
q12	823	683	655	655
q13	14359	3299	3293	3293
q14	313	290	306	290
q15	538	485	486	485
q16	661	589	584	584
q17	1839	1560	1540	1540
q18	8199	7879	7766	7766
q19	1716	1555	1670	1555
q20	2167	1933	1959	1933
q21	5788	5460	5401	5401
q22	1111	1018	1100	1018
Total cold run time: 71559 ms
Total hot run time: 57061 ms

@doris-robot

TPC-DS: Total hot run time: 193079 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 2d14a30a861756222850c50eae479fde69102f75, data reload: false

query1	1259	888	865	865
query2	6421	1996	2037	1996
query3	10618	3891	4077	3891
query4	59832	24010	23176	23176
query5	5427	515	505	505
query6	401	162	165	162
query7	5764	309	318	309
query8	299	210	227	210
query9	8867	2485	2476	2476
query10	499	281	263	263
query11	18314	15196	15261	15196
query12	165	102	106	102
query13	1579	398	387	387
query14	11262	6845	7287	6845
query15	239	177	172	172
query16	7626	482	495	482
query17	1102	584	566	566
query18	2166	292	292	292
query19	297	145	139	139
query20	121	108	111	108
query21	207	107	103	103
query22	4589	4530	4483	4483
query23	34259	33749	33173	33173
query24	5972	2886	2876	2876
query25	533	379	387	379
query26	691	157	150	150
query27	1796	276	279	276
query28	3679	2055	2040	2040
query29	710	405	409	405
query30	247	154	162	154
query31	944	732	772	732
query32	84	52	57	52
query33	484	279	290	279
query34	858	479	479	479
query35	859	714	720	714
query36	1075	907	955	907
query37	150	88	92	88
query38	3931	3827	3859	3827
query39	1440	1395	1385	1385
query40	202	116	118	116
query41	46	52	45	45
query42	122	94	95	94
query43	536	489	471	471
query44	1127	746	739	739
query45	201	168	167	167
query46	1109	744	743	743
query47	1897	1783	1788	1783
query48	379	286	292	286
query49	768	434	438	434
query50	832	417	432	417
query51	7220	7068	7030	7030
query52	98	85	89	85
query53	248	178	176	176
query54	583	464	459	459
query55	72	73	76	73
query56	288	271	279	271
query57	1203	1086	1055	1055
query58	224	267	256	256
query59	3128	2939	2934	2934
query60	314	291	288	288
query61	124	119	122	119
query62	722	652	645	645
query63	233	191	195	191
query64	2924	777	768	768
query65	3208	3159	3130	3130
query66	690	341	362	341
query67	15384	15317	15470	15317
query68	3027	584	580	580
query69	410	293	302	293
query70	1205	1143	1129	1129
query71	380	285	287	285
query72	6438	4203	4170	4170
query73	777	336	332	332
query74	9134	9041	8818	8818
query75	3372	2709	2742	2709
query76	1533	1011	997	997
query77	549	335	336	335
query78	9761	9142	9207	9142
query79	2165	527	544	527
query80	1316	523	518	518
query81	565	240	234	234
query82	1281	153	148	148
query83	178	145	148	145
query84	250	75	77	75
query85	929	296	284	284
query86	394	273	305	273
query87	4387	4273	4254	4254
query88	3252	2301	2324	2301
query89	397	284	277	277
query90	1847	190	191	190
query91	127	124	98	98
query92	64	54	50	50
query93	2805	542	545	542
query94	790	286	260	260
query95	342	251	258	251
query96	617	264	264	264
query97	3204	3089	3041	3041
query98	238	207	202	202
query99	1918	1282	1278	1278
Total cold run time: 312300 ms
Total hot run time: 193079 ms

@doris-robot

ClickBench: Total hot run time: 31.85 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 2d14a30a861756222850c50eae479fde69102f75, data reload: false

query1	0.05	0.04	0.04
query2	0.07	0.04	0.04
query3	0.22	0.05	0.05
query4	1.68	0.07	0.07
query5	0.50	0.52	0.49
query6	1.13	0.73	0.72
query7	0.02	0.01	0.02
query8	0.06	0.04	0.04
query9	0.55	0.49	0.50
query10	0.55	0.55	0.55
query11	0.15	0.11	0.12
query12	0.15	0.12	0.12
query13	0.61	0.58	0.60
query14	2.12	2.05	2.06
query15	0.89	0.83	0.81
query16	0.38	0.37	0.36
query17	1.00	0.99	1.00
query18	0.21	0.20	0.20
query19	1.90	1.72	1.74
query20	0.01	0.01	0.01
query21	15.39	0.67	0.66
query22	4.00	7.07	1.78
query23	18.35	1.38	1.32
query24	2.08	0.23	0.22
query25	0.16	0.09	0.08
query26	0.27	0.18	0.18
query27	0.09	0.08	0.08
query28	13.21	1.01	1.01
query29	12.60	3.37	3.32
query30	0.25	0.06	0.05
query31	2.88	0.40	0.39
query32	3.25	0.49	0.48
query33	3.02	3.01	2.99
query34	17.12	4.35	4.40
query35	4.47	4.43	4.50
query36	0.66	0.47	0.48
query37	0.18	0.16	0.15
query38	0.16	0.15	0.14
query39	0.05	0.04	0.04
query40	0.15	0.12	0.14
query41	0.09	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 110.78 s
Total hot run time: 31.85 s

@Mryange
Contributor Author

Mryange commented Sep 2, 2024

run p0

simd::VStringFunctions::iterate_utf8_with_limit_length(
(const char*)str_data, (const char*)str_data + str_len, len);
// If iterate_char_len is greater than len, it indicates that the string's length exceeds len, so truncation.
if (iterate_char_len >= len) {
Contributor

It can never be greater; truncation is needed only when it is equal.
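
A small sketch of the decision this comment describes, under the assumption that the character count comes from a walk that stops after `len` characters. `PadAction` and `choose_pad_action` are hypothetical names used only for illustration. Since the count cannot exceed `len`, testing `==` and testing `>=` select the same rows; the equality form just states the invariant more directly.

```cpp
#include <cassert>
#include <cstddef>

enum class PadAction { Truncate, Pad };

// `walked_chars` is assumed to come from a limited walk that stops after
// `len` characters, so walked_chars <= len always holds.
inline PadAction choose_pad_action(std::size_t walked_chars, std::size_t len) {
    assert(walked_chars <= len);
    // Equal: the string has at least `len` characters, so it is truncated.
    // Smaller: the string ended early, so it must be padded.
    return walked_chars == len ? PadAction::Truncate : PadAction::Pad;
}
```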

@Mryange
Contributor Author

Mryange commented Sep 3, 2024

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.84% (9391/25488)
Line Coverage: 28.29% (77496/273916)
Region Coverage: 27.69% (39990/144424)
Branch Coverage: 24.33% (20346/83626)
Coverage Report: http://coverage.selectdb-in.cc/coverage/c4b4c5ea025af35e8b10964a97c392b9864f4016_c4b4c5ea025af35e8b10964a97c392b9864f4016/report/index.html

@doris-robot

TPC-H: Total hot run time: 38095 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit c4b4c5ea025af35e8b10964a97c392b9864f4016, data reload: false

------ Round 1 ----------------------------------
q1	17695	4403	4304	4304
q2	2018	191	178	178
q3	11923	998	1060	998
q4	10521	800	758	758
q5	7770	2850	2820	2820
q6	228	140	140	140
q7	947	623	604	604
q8	9327	2101	2071	2071
q9	6911	6565	6557	6557
q10	6988	2192	2148	2148
q11	467	250	241	241
q12	412	230	230	230
q13	17770	3092	3077	3077
q14	286	233	230	230
q15	516	512	498	498
q16	588	518	512	512
q17	987	713	667	667
q18	7164	6710	6866	6710
q19	1391	1053	1080	1053
q20	678	328	336	328
q21	3980	3159	2958	2958
q22	1149	1067	1013	1013
Total cold run time: 109716 ms
Total hot run time: 38095 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4346	4273	4329	4273
q2	379	274	268	268
q3	2849	2685	2642	2642
q4	1965	1655	1677	1655
q5	5597	5676	5708	5676
q6	231	137	134	134
q7	2258	1751	1837	1751
q8	3288	3416	3418	3416
q9	8890	8952	8873	8873
q10	3560	3364	3373	3364
q11	603	538	541	538
q12	836	684	675	675
q13	15695	3146	3299	3146
q14	298	297	289	289
q15	521	503	493	493
q16	645	575	568	568
q17	1834	1536	1562	1536
q18	8099	7777	7903	7777
q19	1735	1542	1638	1542
q20	2196	1944	1879	1879
q21	5737	5369	5530	5369
q22	1134	1058	1046	1046
Total cold run time: 72696 ms
Total hot run time: 56910 ms

@doris-robot

TPC-DS: Total hot run time: 192290 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit c4b4c5ea025af35e8b10964a97c392b9864f4016, data reload: false

query1	1266	894	862	862
query2	6294	1886	1915	1886
query3	10747	3976	4069	3976
query4	59327	24075	23231	23231
query5	5244	488	483	483
query6	401	165	153	153
query7	5749	295	289	289
query8	299	221	226	221
query9	8603	2449	2429	2429
query10	459	283	254	254
query11	17838	15284	15101	15101
query12	163	105	107	105
query13	1545	401	381	381
query14	10819	7035	6856	6856
query15	246	179	181	179
query16	7321	469	444	444
query17	1124	558	553	553
query18	1923	290	293	290
query19	256	141	145	141
query20	116	107	112	107
query21	199	105	100	100
query22	4651	4645	4647	4645
query23	34731	33634	33588	33588
query24	5976	2827	2835	2827
query25	505	374	365	365
query26	652	152	160	152
query27	1730	282	271	271
query28	3799	2018	2010	2010
query29	631	406	396	396
query30	235	152	150	150
query31	940	744	761	744
query32	72	52	53	52
query33	412	271	280	271
query34	860	471	475	471
query35	849	719	719	719
query36	1061	914	923	914
query37	145	91	88	88
query38	4033	3948	3937	3937
query39	1468	1398	1393	1393
query40	196	114	111	111
query41	47	46	44	44
query42	116	94	95	94
query43	500	458	456	456
query44	1062	722	737	722
query45	198	172	168	168
query46	1090	750	734	734
query47	1940	1820	1866	1820
query48	368	297	305	297
query49	770	452	423	423
query50	806	417	424	417
query51	6927	6905	6768	6768
query52	94	85	86	85
query53	246	172	176	172
query54	564	450	447	447
query55	74	76	74	74
query56	275	260	253	253
query57	1208	1043	1091	1043
query58	216	228	234	228
query59	3008	3023	2855	2855
query60	298	271	268	268
query61	105	101	100	100
query62	758	657	660	657
query63	216	185	184	184
query64	2400	687	654	654
query65	3218	3113	3141	3113
query66	678	334	341	334
query67	15344	15653	15223	15223
query68	2743	568	556	556
query69	398	282	269	269
query70	1135	1137	1077	1077
query71	342	275	267	267
query72	6019	4103	3975	3975
query73	736	328	328	328
query74	9155	8855	8868	8855
query75	3331	2637	2700	2637
query76	1418	1034	930	930
query77	522	313	304	304
query78	9948	9236	9154	9154
query79	1039	533	509	509
query80	696	505	494	494
query81	513	242	228	228
query82	225	144	145	144
query83	174	145	147	145
query84	253	81	74	74
query85	679	286	279	279
query86	300	295	302	295
query87	4460	4272	4207	4207
query88	2918	2313	2309	2309
query89	371	274	279	274
query90	1977	189	185	185
query91	120	99	100	99
query92	58	48	48	48
query93	1037	522	524	522
query94	745	300	298	298
query95	348	247	253	247
query96	593	265	265	265
query97	3206	3117	3094	3094
query98	212	204	199	199
query99	1505	1246	1245	1245
Total cold run time: 302587 ms
Total hot run time: 192290 ms

@doris-robot

ClickBench: Total hot run time: 32.44 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit c4b4c5ea025af35e8b10964a97c392b9864f4016, data reload: false

query1	0.04	0.04	0.04
query2	0.09	0.04	0.04
query3	0.23	0.04	0.05
query4	1.68	0.08	0.07
query5	0.53	0.50	0.51
query6	1.13	0.74	0.72
query7	0.02	0.02	0.02
query8	0.04	0.04	0.04
query9	0.54	0.48	0.48
query10	0.54	0.53	0.55
query11	0.15	0.11	0.12
query12	0.15	0.13	0.12
query13	0.61	0.59	0.58
query14	2.09	2.07	2.05
query15	0.84	0.85	0.82
query16	0.36	0.38	0.38
query17	1.06	1.02	1.05
query18	0.21	0.19	0.19
query19	1.82	1.78	1.75
query20	0.01	0.01	0.01
query21	15.39	0.65	0.66
query22	4.77	6.32	2.18
query23	18.29	1.47	1.43
query24	2.08	0.23	0.22
query25	0.16	0.09	0.08
query26	0.26	0.18	0.18
query27	0.08	0.08	0.08
query28	13.26	1.01	0.99
query29	12.64	3.26	3.26
query30	0.23	0.05	0.06
query31	2.87	0.40	0.39
query32	3.26	0.48	0.46
query33	2.98	3.01	3.02
query34	17.10	4.42	4.39
query35	4.49	4.53	4.49
query36	0.66	0.49	0.49
query37	0.19	0.16	0.15
query38	0.16	0.15	0.15
query39	0.05	0.04	0.04
query40	0.16	0.12	0.12
query41	0.09	0.05	0.04
query42	0.06	0.05	0.04
query43	0.05	0.04	0.04
Total cold run time: 111.42 s
Total hot run time: 32.44 s

Contributor

@HappenLee HappenLee left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Sep 6, 2024
Contributor

github-actions bot commented Sep 6, 2024

PR approved by at least one committer and no changes requested.

Contributor

github-actions bot commented Sep 6, 2024

PR approved by anyone and no changes requested.

@zhangstar333 zhangstar333 merged commit c38fa02 into apache:master Sep 9, 2024
25 of 28 checks passed
zhangstar333 added a commit that referenced this pull request Sep 13, 2024
… pad function (#40676)

## Proposed changes
This was introduced by PR #40162, and a test case exists in P1:
regression-test/suites/query_p1/test_big_pad.groovy
 

dataroaring pushed a commit that referenced this pull request Oct 9, 2024
…-8. (#40162)

## Proposed changes

1. Removed the calculation of str_index.
2. If pad is constant, calculate pad_index only once.
3. Do not insert res_chars inside the loop; instead, insert them all
together after the loop completes.
```
mysql [test]>select count(lpad(Title, 100, "abc")) from hits_10m;
+--------------------------------+
| count(lpad(Title, 100, 'abc')) |
+--------------------------------+
|                       10000000 |
+--------------------------------+
1 row in set (3.97 sec)

mysql [test]>select count(lpad(Title, 100, "abc")) from hits_10m;
+--------------------------------+
| count(lpad(Title, 100, 'abc')) |
+--------------------------------+
|                       10000000 |
+--------------------------------+
1 row in set (2.87 sec)
```



dataroaring pushed a commit that referenced this pull request Oct 9, 2024
… pad function (#40676)

## Proposed changes
This was introduced by PR #40162, and a test case exists in P1:
regression-test/suites/query_p1/test_big_pad.groovy
 

Labels
approved (Indicates a PR has been approved by one committer.), dev/3.0.3-merged, reviewed

6 participants