Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISSUE-2456: Support insert select query #2765

Merged
merged 6 commits into from
Nov 17, 2021

Conversation

ygf11
Copy link
Contributor

@ygf11 ygf11 commented Nov 13, 2021

I hereby agree to the terms of the CLA available at: https://databend.rs/policies/cla/

Summary

support insert select query.

Changelog

  • New Feature

Related Issues

Fixes #2456

Test Plan

Unit Tests

Stateless Tests

@databend-bot
Copy link
Member

Thanks for the contribution!
I have applied any labels matching special text in your PR Changelog.

Please review the labels and make any necessary changes.

@databend-bot databend-bot added pr-feature this PR introduces a new feature to the codebase need-review labels Nov 13, 2021
@codecov-commenter
Copy link

codecov-commenter commented Nov 13, 2021

Codecov Report

Merging #2765 (d459570) into main (387915a) will increase coverage by 0%.
The diff coverage is 57%.

Impacted file tree graph

@@          Coverage Diff           @@
##            main   #2765    +/-   ##
======================================
  Coverage     67%     67%            
======================================
  Files        574     585    +11     
  Lines      31149   31342   +193     
======================================
+ Hits       21041   21232   +191     
- Misses     10108   10110     +2     
Impacted Files Coverage Δ
...y/src/datasources/table/fuse/table_test_fixture.rs 93% <ø> (ø)
query/src/interpreters/interpreter_insert_into.rs 59% <11%> (-29%) ⬇️
query/src/interpreters/interpreter_factory.rs 32% <37%> (-1%) ⬇️
query/src/sql/plan_parser.rs 74% <50%> (-1%) ⬇️
common/planners/src/plan_insert_into.rs 62% <100%> (+2%) ⬆️
common/streams/src/stream_cast.rs 100% <100%> (ø)
...y/src/datasources/table/fuse/index/min_max_test.rs 91% <100%> (+<1%) ⬆️
.../src/datasources/table/memory/memory_table_test.rs 92% <100%> (+<1%) ⬆️
...uery/src/datasources/table/null/null_table_test.rs 90% <100%> (+<1%) ⬆️
...rc/interpreters/interpreter_truncate_table_test.rs 87% <100%> (ø)
... and 31 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 387915a...d459570. Read the comment docs.

@sundy-li
Copy link
Member

I think we should avoid having stream inside the plan, after #2786 .

@ygf11
Copy link
Contributor Author

ygf11 commented Nov 15, 2021

I think we should avoid having stream inside the plan, after #2786 .

Yeah, I see the pr #2786. It makes append more reasonably. I will do this after #2786 merged:)

@ygf11 ygf11 force-pushed the feat_insert_select branch from 159df1f to ade6f26 Compare November 15, 2021 14:50
fn cast_function(output_type: &DataType) -> Result<Box<dyn Function>> {
let function_factory = FunctionFactory::instance();
match output_type {
DataType::Null => function_factory.get("toNull"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use CastFunction::create ?

let iter = output_schema.fields().iter().zip(select_schema.fields());
let mut colunm_vec = vec![];
for (i, (output_field, input_field)) in iter.enumerate() {
let func = cast_function(output_field.data_type())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can keep funcs out of the stream iter, because trait Function is Send + Sync.

}

let select_input_stream = select_executor.execute(None).await?;
let cast_input_stream = select_input_stream.map(move |data_block| match data_block {
Copy link
Member

@sundy-li sundy-li Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to have a struct named CastStream in common/streams folder.

@sundy-li
Copy link
Member

@ygf11 hello, is this pr review for review?

@ygf11
Copy link
Contributor Author

ygf11 commented Nov 17, 2021

hello, is this pr review for review?

Wait a moment, I will push it soon!

@ygf11 ygf11 force-pushed the feat_insert_select branch from 730bffb to d459570 Compare November 17, 2021 14:10
@ygf11 ygf11 marked this pull request as ready for review November 17, 2021 14:10
@ygf11 ygf11 requested a review from BohuTANG as a code owner November 17, 2021 14:10
@@ -52,7 +52,25 @@ impl InterpreterFactory {
PlanNode::TruncateTable(v) => TruncateTableInterpreter::try_create(ctx, v),
PlanNode::UseDatabase(v) => UseDatabaseInterpreter::try_create(ctx, v),
PlanNode::SetVariable(v) => SettingInterpreter::try_create(ctx, v),
PlanNode::InsertInto(v) => InsertIntoInterpreter::try_create(ctx, v),
PlanNode::InsertInto(v) => {
let select = match v.select_plan.clone().take() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job 👍
It would be nice to check and build select plan in the InsertIntoInterpreter, it looks more cleaner

common-datablocks = {path = "../datablocks"}
common-datavalues = {path = "../datavalues"}
common-exception = {path = "../exception"}
common-functions = {path = "../functions"}
Copy link
Member

@sundy-li sundy-li Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's Clion's auto-reorder? I use vscode fail to find this format command.

Copy link
Contributor Author

@ygf11 ygf11 Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in vscode too, the command name is Format Cell:)
I found the order in package section, is always name->version->description..., should I recover this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's better to have alpha orders.

use crate::sql::*;

#[tokio::test]
async fn test_insert_into_select_interpreter() -> Result<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great!

}

Ok(())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a new line.

DROP TABLE t1;
DROP TABLE t2;
DROP TABLE t3;
DROP DATABASE db1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new line.

Copy link
Member

@sundy-li sundy-li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost LGTM.

BTW, if we want parallel insert from the multiple select stream, it's better to add some options to ignore last merge_processor in pipeline.rs. It can be addressed in another pr.

 pub async fn execute(&mut self, parallel_insert:  bool) -> Result<SendableDataBlockStream> {
        if self.last_pipe()?.nums() > 1 && !parallel_insert {
            self.merge_processor()?;
        }
        self.last_pipe()?.first().execute().await
    }

@databend-bot
Copy link
Member

Wait for another reviewer approval

@BohuTANG
Copy link
Member

/lgtm

Awesome work, thank you @ygf11

@databend-bot
Copy link
Member

CI Passed
Reviewers Approved
Let's Merge
Thank you for the PR @ygf11

@databend-bot databend-bot merged commit ab8d2da into databendlabs:main Nov 17, 2021
@ygf11
Copy link
Contributor Author

ygf11 commented Nov 18, 2021

BTW, if we want parallel insert from the multiple select stream, it's better to add some options to ignore last merge_processor in pipeline.rs. It can be addressed in another pr.

I will do this in another pr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-feature this PR introduces a new feature to the codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support insert select query
5 participants