Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pipelineX](local shuffle) Fix local shuffle for colocate/bucket join #28032

Merged
merged 11 commits into from
Dec 6, 2023

Conversation

Gabriel39
Copy link
Contributor

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@Gabriel39
Copy link
Contributor Author

run buildall

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@@ -249,40 +250,55 @@ Status PipelineXFragmentContext::prepare(const doris::TPipelineFragmentParams& r
return Status::OK();
}

Status PipelineXFragmentContext::_plan_local_shuffle() {
Status PipelineXFragmentContext::_plan_local_shuffle(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method '_plan_local_shuffle' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status PipelineXFragmentContext::_plan_local_shuffle(
static Status PipelineXFragmentContext::_plan_local_shuffle(

return Status::OK();
}

Status PipelineXFragmentContext::_plan_local_shuffle(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: method '_plan_local_shuffle' can be made static [readability-convert-member-functions-to-static]

Suggested change
Status PipelineXFragmentContext::_plan_local_shuffle(
static Status PipelineXFragmentContext::_plan_local_shuffle(

const std::vector<TExpr>& texprs,
ExchangeType exchange_type,
bool* do_local_exchange) {
Status PipelineXFragmentContext::_add_local_exchange(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: function '_add_local_exchange' exceeds recommended size/complexity thresholds [readability-function-size]

Status PipelineXFragmentContext::_add_local_exchange(
                                 ^
Additional context

be/src/pipeline/pipeline_x/pipeline_x_fragment_context.cpp:652: 99 lines including whitespace and comments (threshold 80)

Status PipelineXFragmentContext::_add_local_exchange(
                                 ^

Status _add_local_exchange(int pip_idx, int idx, int node_id, ObjectPool* pool,
PipelinePtr cur_pipe, const std::vector<TExpr>& texprs,
ExchangeType exchange_type, bool* do_local_exchange, int num_buckets,
const std::map<int, int>& bucket_seq_to_instance_idx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: parameter 10 is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]

Suggested change
const std::map<int, int>& bucket_seq_to_instance_idx);
std::map<int, int>& bucket_seq_to_instance_idx);

@@ -153,7 +154,10 @@
const TPipelineFragmentParams& params, const RowDescriptor& row_desc,
RuntimeState* state, DescriptorTbl& desc_tbl,
PipelineId cur_pipeline_id);
Status _plan_local_shuffle();
Status _plan_local_shuffle(int num_buckets,
const std::map<int, int>& bucket_seq_to_instance_idx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: parameter 2 is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]

Suggested change
const std::map<int, int>& bucket_seq_to_instance_idx);
std::map<int, int>& bucket_seq_to_instance_idx);

Status _plan_local_shuffle(int num_buckets,
const std::map<int, int>& bucket_seq_to_instance_idx);
Status _plan_local_shuffle(int num_buckets, int pip_idx, PipelinePtr pip,
const std::map<int, int>& bucket_seq_to_instance_idx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: parameter 4 is const-qualified in the function declaration; const-qualification of parameters only has an effect in function definitions [readability-avoid-const-params-in-decls]

Suggested change
const std::map<int, int>& bucket_seq_to_instance_idx);
std::map<int, int>& bucket_seq_to_instance_idx);

@doris-robot
Copy link

TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
Tpch sf100 test result on commit 20e2638f9afcf690c105a410fa17b0dcd76a8513, data reload: false

run tpch-sf100 query with default conf and session variables
q1	4702	4466	4441	4441
q2	374	150	158	150
q3	1454	1222	1180	1180
q4	1115	930	888	888
q5	3141	3193	3240	3193
q6	250	129	124	124
q7	995	494	482	482
q8	2216	2218	2215	2215
q9	6660	6690	7107	6690
q10	3201	3252	3271	3252
q11	326	211	200	200
q12	349	208	203	203
q13	4568	3793	3771	3771
q14	240	208	211	208
q15	573	527	520	520
q16	437	380	382	380
q17	1019	603	587	587
q18	8097	6911	7100	6911
q19	1525	1412	1443	1412
q20	556	308	318	308
q21	3144	2633	2690	2633
q22	348	283	284	283
Total cold run time: 45290 ms
Total hot run time: 40031 ms

run tpch-sf100 query with default conf and set session variable runtime_filter_mode=off
q1	4376	4372	4376	4372
q2	268	161	179	161
q3	3531	3519	3513	3513
q4	2375	2364	2356	2356
q5	5745	5741	5731	5731
q6	239	120	121	120
q7	2368	1860	1827	1827
q8	3537	3528	3537	3528
q9	9001	8941	8962	8941
q10	3902	3982	3975	3975
q11	502	390	388	388
q12	754	592	595	592
q13	4285	3515	3516	3515
q14	279	243	259	243
q15	584	510	519	510
q16	489	420	463	420
q17	1868	1866	1863	1863
q18	8615	8079	8919	8079
q19	1720	1745	1737	1737
q20	2244	1949	1958	1949
q21	6531	6153	6164	6153
q22	495	408	410	408
Total cold run time: 63708 ms
Total hot run time: 60381 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.17 seconds
stream load tsv: 562 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 18 seconds loaded 2358488459 Bytes, about 124 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 28.7 seconds inserted 10000000 Rows, about 348K ops/s
storage size: 17167193003 Bytes

Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Dec 6, 2023
Copy link
Contributor

github-actions bot commented Dec 6, 2023

PR approved by at least one committer and no changes requested.

Copy link
Contributor

github-actions bot commented Dec 6, 2023

PR approved by anyone and no changes requested.

@Gabriel39 Gabriel39 merged commit 1be513b into apache:master Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. meta-change reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants