Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for ClickHouse FORMAT on INSERT #1628

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bombsimon
Copy link
Contributor

Adds supports for the SETTINGS and FORMAT keywords used for ClickHouse when inserting data with other syntax than SQL. This can happen e.g. when using the ClickHouse CLI tool to pipe input from files or similar.

Adds supports for the `SETTINGS` and `FORMAT` keywords used for
ClickHouse when inserting data with other syntax than SQL. This can
happen e.g. when using the ClickHouse CLI tool to pipe input from files
or similar.
write!(f, "{format_clause}")?;
}

if self.source.is_none() && self.columns.is_empty() && self.format_clause.is_none() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.source.is_none() && self.columns.is_empty() && self.format_clause.is_none() {
if self.source.is_none() && self.columns.is_empty() {

thinking the format isn't necessarily tied to the columns?

Comment on lines 2467 to 2473
pub enum FormatClause {
Identifier(Ident),
Identifier {
ident: Ident,
expr: Option<Vec<Expr>>,
},
Null,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't look correct to reuse this struct? The existing FormatClause is currently used on a query which has different variants/syntax than on insert?

maybe we can do something like

struct OutputFormatClause {
    name: Ident
    values: Vec<Expr>
}

can we also add doc comments describing the struct/properties with a link to the doc?
https://clickhouse.com/docs/en/sql-reference/formats

@@ -495,6 +495,20 @@ pub struct Insert {
pub priority: Option<MysqlInsertPriority>,
/// Only for mysql
pub insert_alias: Option<InsertAliases>,
/// Settings used in together with a specified `FORMAT`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm double checking if this line is correct, is it required that a format clause is present if setting is present? Maybe we can avoid describing semantics in any case, we just mentiong that this is clickhouse settings with a link to the docs

Comment on lines +1407 to +1411
r#"INSERT INTO tbl FORMAT JSONEachRow {"id": 1, "value": "foo"}, {"id": 2, "value": "bar"}"#,
r#"INSERT INTO tbl FORMAT JSONEachRow ["first", "second", "third"]"#,
r#"INSERT INTO tbl FORMAT JSONEachRow [{"first": 1}]"#,
r#"INSERT INTO tbl FORMAT jsoneachrow {"id": 1}"#,
r#"INSERT INTO tbl (foo) FORMAT JSONAsObject {"foo": {"bar": {"x": "y"}, "baz": 1}}"#,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we dedupe some of the test cases that seem to check case sensitivity? unless the impl is doing special case sensitive handling (which doesnt seem to be the case) I imagine no need to explicitly test for it

@@ -931,7 +931,7 @@ pub const RESERVED_FOR_TABLE_ALIAS: &[Keyword] = &[
Keyword::PREWHERE,
// for ClickHouse SELECT * FROM t SETTINGS ...
Keyword::SETTINGS,
// for ClickHouse SELECT * FROM t FORMAT...
// for ClickHouse SELECT * FROM t FORMAT... or INSERT INTO t FORMAT...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// for ClickHouse SELECT * FROM t FORMAT... or INSERT INTO t FORMAT...
// for ClickHouse SELECT * FROM t FORMAT...

we could also remove the full description, the keyword isn't really tied to any specific syntax/dialect

Comment on lines +11958 to +11965
// Parses format clause used for [ClickHouse]. Formats are different when using `SELECT` and
// `INSERT` and also when using the CLI for pipes. It may or may not take an additional
// expression after the format so we try to parse the expression but allow failure.
//
// Since we know we never take an additional expression in `SELECT` context we never only try
// to parse if `can_have_expression` is true.
//
// <https://clickhouse.com/docs/en/interfaces/formats>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah so per earlier comment I'm imagining if we use different representations for the format clauses the impl isn't affected?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants