[Improvement](function) optimization for substr with ascii string #29799
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
TPC-H: Total hot run time: 38933 ms
TPC-DS: Total hot run time: 178536 ms
ClickBench (TeamCity pipeline, from new machine): Total hot run time: 30.97 s
Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
LGTM
@BiteTheDDDDt Will these changes be synchronized to 2.1.5?
Already included.
Line 235 of function_string.h in 2.1.5 is inconsistent with this change: it lacks the fixed_pos > index.size() check, which produces incorrect results in 2.1.5.
The 2.1.5 and master code should be the same in this regard. Can you provide a specific example that shows the wrong result?
select substr('老人年轻并不等于动的金币第七十二',17); returns null on 2.0.13, but on 2.1.5 it returns '老人年轻并不等于动的金币第七十二'. The fixed_pos > index.size() check is present at line 247 of function_string.h in 2.0.13 but is missing from branch 2.1.5. 2.0.13 is consistent with master.
This behavior was changed by #28352, so I think 2.1 is the same as master, and 2.0 still has the old behavior.
The signature is Substring(VARCHAR str, INT pos, [INT len]), so when pos exceeds the length of str, is the correct result null? However, 2.1.5 currently does not behave that way.
At line 243 of function_string.h in 2.1.5, if we do not check whether fixed_pos is greater than index.size(), will there be index access errors?
@koarz Hi, could you help answer this question? I'm not very familiar with the new logic.
Sorry, when I changed the logic in this part of the code I overlooked this check. Returning an empty result is correct. I did not test the Chinese case, so the error went unnoticed. I'll fix it soon.
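The boundary case discussed in this thread can be illustrated with a minimal C++ sketch. It is not the code in function_string.h: utf8_char_starts and substr_utf8 are hypothetical helpers, only positive 1-based positions are handled, and the out-of-range case is modeled as an empty string (the thread mentions both "null" and "empty result"). The names fixed_pos and index follow the identifiers quoted above.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not the Doris implementation): byte offset of the
// first byte of every UTF-8 character in s.
std::vector<size_t> utf8_char_starts(std::string_view s) {
    std::vector<size_t> index;
    for (size_t i = 0; i < s.size(); ++i) {
        // Continuation bytes have the form 10xxxxxx; any other byte starts a character.
        if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80) {
            index.push_back(i);
        }
    }
    return index;
}

// Hypothetical substr for a positive, 1-based character position.
// Assumption: an out-of-range position yields an empty string.
std::string substr_utf8(std::string_view s, int pos) {
    if (pos < 1) {
        return "";
    }
    std::vector<size_t> index = utf8_char_starts(s);
    size_t fixed_pos = static_cast<size_t>(pos);
    // The guard discussed in the thread: without it, index[fixed_pos - 1]
    // would read past the end of the index when pos exceeds the character count.
    if (fixed_pos > index.size()) {
        return "";
    }
    return std::string(s.substr(index[fixed_pos - 1]));
}
```

With the 16-character string from the example, position 17 falls past the last character, so the guard returns the empty result instead of indexing out of bounds.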
Proposed changes
Optimization for substr with ASCII strings.
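The PR text does not show the change itself, so the following is only a sketch of one plausible shape for an ASCII fast path, under the assumption that the gain comes from skipping UTF-8 character indexing: when every byte is below 0x80, character positions equal byte positions and the slice can be taken directly. is_ascii and substr_ascii are hypothetical names, not Doris functions.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical check: true when every byte is below 0x80, i.e. pure ASCII.
bool is_ascii(std::string_view s) {
    for (unsigned char c : s) {
        if (c >= 0x80) {
            return false;
        }
    }
    return true;
}

// Hypothetical fast path for ASCII input: pos is 1-based, len is a character
// count. Because each character is one byte, no character index is needed.
// Assumption: an out-of-range pos yields an empty string.
std::string substr_ascii(std::string_view s, size_t pos, size_t len) {
    if (pos < 1 || pos > s.size()) {
        return "";
    }
    // string_view::substr clamps len to the remaining length.
    return std::string(s.substr(pos - 1, len));
}
```

A caller would first test is_ascii on the input and fall back to a UTF-8-aware path when the test fails.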
before:
after: