Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ClickHouse] Add support for WITH FILL to OrderByExpr #1330

Merged
merged 13 commits into from
Jul 20, 2024

Conversation

nickpresta
Copy link
Contributor

ClickHouse supports the ORDER BY ... WITH FILL modifier:

https://clickhouse.com/docs/en/sql-reference/statements/select/order-by#order-by-expr-with-fill-modifier

WITH FILL itself supports a simple "from", "to", and "step" parameters, and a more sophisticated INTERPOLATE option.

@nickpresta nickpresta changed the title Add support for WITH FILL to OrderByExpr [ClickHouse] Add support for WITH FILL to OrderByExpr Jul 3, 2024
ClickHouse supports the ORDER BY ... WITH FILL modifier:

https://clickhouse.com/docs/en/sql-reference/statements/select/order-by#order-by-expr-with-fill-modifier

WITH FILL itself supports a simple "from", "to", and "step" parameters,
and a more sophisticated INTERPOLATE option.
@nickpresta nickpresta force-pushed the add-order-by-expr-with-fill branch from 972f46a to 815ee5b Compare July 3, 2024 15:11
tests/sqlparser_common.rs Outdated Show resolved Hide resolved
tests/sqlparser_common.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Outdated Show resolved Hide resolved
src/ast/query.rs Outdated Show resolved Hide resolved
src/ast/query.rs Show resolved Hide resolved
@nickpresta
Copy link
Contributor Author

One question I had around the naming - do we like InterpolationArg and Interpolation or would we prefer this to be InterpolateArg and Interpolate to match the keyword and field name?

The values describe and "interpolation" which is why I named it that way to begin with, but I don't feel strongly either way.

src/ast/query.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Outdated Show resolved Hide resolved
src/parser/mod.rs Outdated Show resolved Hide resolved
@nickpresta
Copy link
Contributor Author

nickpresta commented Jul 5, 2024

I added some tests for validation that invalid syntax is rejected in 4068bb3. This required me to add a validation function specifically for GenericDialect and ClickHouseDialect to check that we have WITH FILL when we have an INTERPOLATE and ensuring that INTERPOLATE is in the last OrderByExpr

The only other way I could think to tackle this type of validation would be to change the Query.order_by to not be a Vec<OrderByExpr> but something like:

struct OrderByClause {
  order_bys: Vec<OrderByExpr>,
  interpolate: Option<InterpolateArg>,
}

but that seems much more disruptive to everyone who uses Query.order_by vs keeping this localized to just the the ClickHouse parsing.

@nickpresta nickpresta force-pushed the add-order-by-expr-with-fill branch from ebcd31e to 4068bb3 Compare July 5, 2024 19:01
@nickpresta nickpresta requested a review from iffyio July 5, 2024 19:24
src/ast/query.rs Outdated Show resolved Hide resolved
src/ast/query.rs Outdated Show resolved Hide resolved
@@ -7888,7 +7888,9 @@ impl<'a> Parser<'a> {
let body = self.parse_boxed_query_body(0)?;

let order_by = if self.parse_keywords(&[Keyword::ORDER, Keyword::BY]) {
self.parse_comma_separated(Parser::parse_order_by_expr)?
let order_by_exprs = self.parse_comma_separated(Parser::parse_order_by_expr)?;
self.validate_order_by_exprs_for_interpolate_and_with_fill(&order_by_exprs)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah so the parser generally doesn't do semantic validations explicitly except for maybe trivial cases, in this case it seems correct to have the invalid state non-representable in the AST, it does become a bit invasive of a change but that should be fine since it would leave us with a more accurate representation of the syntax

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can likely skip this validation function in any case. To clarify, I think the current approach having interpolate in the OrderByExpr would be incorrect/misleading (since the property isn't part of the order by expression and isn't applicable to other context where order by is used).
I think we can change the current representation in Query from:

Query {
    pub order_by: Vec<OrderByExpr>,
}

to either

struct Query {
    pub order_by: Vec<OrderByExpr>,
    interpolate: Option<Interpolate>
}

or

struct OrderBy {
    exprs: Vec<OrderByExpr>,
    interpolate: Option<Interpolate>
}
Query {
    pub order_by: OrderBy,
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the structure of Query.order_by (the second example) since INTERPOLATE feels more associated with ORDER BY instead of the base query itself.

4a7167b

src/ast/query.rs Outdated Show resolved Hide resolved
@coveralls
Copy link

coveralls commented Jul 8, 2024

Pull Request Test Coverage Report for Build 10019614247

Details

  • 202 of 207 (97.58%) changed or added relevant lines in 7 files are covered.
  • 5 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.03%) to 89.226%

Changes Missing Coverage Covered Lines Changed/Added Lines %
src/parser/mod.rs 45 46 97.83%
tests/sqlparser_clickhouse.rs 85 89 95.51%
Files with Coverage Reduction New Missed Lines %
src/parser/mod.rs 1 93.3%
tests/sqlparser_common.rs 4 89.64%
Totals Coverage Status
Change from base Build 9944212602: 0.03%
Covered Lines: 27023
Relevant Lines: 30286

💛 - Coveralls

@alamb
Copy link
Contributor

alamb commented Jul 8, 2024

Thanks @nickpresta and @iffyio

It looks like there are still a few suggestions outstanding on this PR. Just FYI I am hoping to get a sqlparser-rs release out tomorrow #1296 and will include this in the release if it is ready by then

@nickpresta
Copy link
Contributor Author

@alamb Thanks for that information. I believe I've addressed the latest feedback from @iffyio (rename + refactor structure).

Let me know if there's anything else I need to do (e.g. changelog updates).

src/ast/query.rs Outdated Show resolved Hide resolved
@nickpresta nickpresta force-pushed the add-order-by-expr-with-fill branch from 3e8b141 to 85fae63 Compare July 8, 2024 17:38
@nickpresta nickpresta requested a review from iffyio July 8, 2024 20:12
Copy link
Member

@git-hulk git-hulk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/ast/query.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple comments, otherwise I think this looks good!

src/parser/mod.rs Outdated Show resolved Hide resolved
src/ast/query.rs Outdated Show resolved Hide resolved
@nickpresta nickpresta requested a review from iffyio July 11, 2024 14:00
Copy link
Contributor

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor comment re pulling out the dialect check, LGTM! cc @alamb

// Parse a set of comma seperated INTERPOLATE expressions (ClickHouse dialect)
// that follow the INTERPOLATE keyword in an ORDER BY clause with the WITH FILL modifier
pub fn parse_interpolations(&mut self) -> Result<Option<Interpolate>, ParserError> {
if dialect_of!(self is ClickHouseDialect | GenericDialect)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can move this dialect check outside the function and to the caller, and have the function only care about the parsing logic.

one minor, we can also avoid nesting the function body by returning early

if !self.parse_keyword(Keyword::INTERPOLATE) {
     return Ok(None)
}

if self.consume_token(...)
Ok(Some(...))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 000e08d. Thanks!

src/ast/query.rs Outdated Show resolved Hide resolved
@nickpresta
Copy link
Contributor Author

Hey @alamb - is this something that can be merged into main? I understand this change won't make it into a version until 0.49.0 is released (I hope 🤞), but I could at least target a commit on main instead of this branch.

Thank you!

@alamb
Copy link
Contributor

alamb commented Jul 20, 2024

Sure thing -- sorry @nickpresta. I am sorry for the delay. I pushed a commit to try and resolve a test failure and hope to merge this in once the CI completes

@alamb alamb merged commit 845a1aa into apache:main Jul 20, 2024
10 checks passed
@nickpresta
Copy link
Contributor Author

No problem at all! I appreciate everyone's effort to get this merged in.

Thank you @alamb, @iffyio, and @git-hulk for your time and all the feedback.

lustefaniak pushed a commit to getsynq/sqlparser-rs that referenced this pull request Sep 11, 2024
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
ayman-sigma pushed a commit to sigmacomputing/sqlparser-rs that referenced this pull request Nov 19, 2024
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants