Enhance object name path segments #1539

ayman-sigma · 2024-11-20T22:05:32Z

Right now ObjectName is just a list of identifiers. We parse each object name path segment as a string identifier. Some dialects have richer types for each path segment. This PR reworks the object name to allow different types for each path segment.

Examples this PR will make it easier to support:

Databricks IDENTIFIER clause. Example: SELECT * FROM myschema.IDENTIFIER(:mytab). The (:mytab) is currently misparsed as TableFunctionArgs. More details: https://docs.databricks.com/en/sql/language-manual/sql-ref-names-identifier-clause.html
Snowflake double-dot notation. Example: SELECT * FROM db..table_name. This indicates use of the default schema, PUBLIC. With this PR, we can use a DefaultSchema variant for the path segment instead of an empty identifier. More details: https://docs.snowflake.com/en/sql-reference/name-resolution#resolution-when-schema-omitted-double-dot-notation
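
To make the shape of the change concrete, here is a minimal, self-contained sketch of an object name whose segments are an enum rather than bare identifiers. It is an illustration only, not the code in this PR: `Ident` is a simplified stand-in, the `DefaultSchema` variant and the `double_dot_example` helper are assumptions based on the description above, and the `Display` behavior is a guess at how such a segment might round-trip.

```rust
use std::fmt;

/// Simplified stand-in for the parser's identifier type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Ident {
    pub value: String,
}

/// One segment of an object name path.
#[derive(Debug, Clone, PartialEq)]
pub enum ObjectNamePart {
    /// A plain identifier such as `myschema`.
    Identifier(Ident),
    /// Hypothetical variant for the empty segment in `db..table_name`.
    DefaultSchema,
}

/// An object name is a list of typed path segments.
#[derive(Debug, Clone, PartialEq)]
pub struct ObjectName(pub Vec<ObjectNamePart>);

impl fmt::Display for ObjectNamePart {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ObjectNamePart::Identifier(ident) => write!(f, "{}", ident.value),
            // The default schema is written as an empty segment.
            ObjectNamePart::DefaultSchema => Ok(()),
        }
    }
}

impl fmt::Display for ObjectName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let parts: Vec<String> = self.0.iter().map(|p| p.to_string()).collect();
        write!(f, "{}", parts.join("."))
    }
}

/// Builds the name for `db..table_name` (hypothetical helper).
pub fn double_dot_example() -> ObjectName {
    ObjectName(vec![
        ObjectNamePart::Identifier(Ident { value: "db".into() }),
        ObjectNamePart::DefaultSchema,
        ObjectNamePart::Identifier(Ident { value: "table_name".into() }),
    ])
}
```

With this layout, the omitted schema is an explicit variant that consumers can match on, rather than an identifier with an empty string.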

Most changes are mechanical except for a couple of locations I commented on below, in addition to ast/mod.rs.

ayman-sigma · 2024-11-20T22:36:17Z

src/parser/mod.rs

@@ -4294,7 +4312,9 @@ impl<'a> Parser<'a> {
        let mut data_type = self.parse_data_type()?;
        if let DataType::Custom(n, _) = &data_type {
            // the first token is actually a name
-            name = Some(n.0[0].clone());
+            match n.0[0].clone() {
+                ObjectNamePart::Identifier(ident) => name = Some(ident),


Once we start adding more variants to the ObjectNamePart enum, we will return a parsing error for the other variants here.
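
A rough sketch of what that could look like, using simplified stand-in types rather than the crate's real ones. `leading_ident`, the `DefaultSchema` variant, and the error messages are hypothetical and only illustrate the "error on non-identifier variants" idea.

```rust
/// Simplified stand-in for the parser's identifier type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum ObjectNamePart {
    Identifier(Ident),
    /// Hypothetical future variant that is not a plain identifier.
    DefaultSchema,
}

/// Simplified stand-in for the parser's error type (assumption).
#[derive(Debug, PartialEq)]
pub struct ParserError(pub String);

/// Returns the first segment as an identifier, or a parse error if the
/// first segment is missing or is some other kind of part.
pub fn leading_ident(parts: &[ObjectNamePart]) -> Result<Ident, ParserError> {
    match parts.first() {
        Some(ObjectNamePart::Identifier(ident)) => Ok(ident.clone()),
        Some(other) => Err(ParserError(format!(
            "expected identifier, found {other:?}"
        ))),
        None => Err(ParserError("empty object name".to_string())),
    }
}
```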

ayman-sigma · 2024-11-20T22:38:26Z

src/parser/mod.rs

@@ -10778,7 +10798,7 @@ impl<'a> Parser<'a> {
        self.expect_token(&Token::LParen)?;
        let aggregate_functions = self.parse_comma_separated(Self::parse_aliased_function_call)?;
        self.expect_keyword(Keyword::FOR)?;
-        let value_column = self.parse_object_name(false)?.0;
+        let value_column = self.parse_period_separated_identifiers()?;


Given that this is a column name, we should parse it as period-separated identifiers and not as an object name.
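
To illustrate the distinction, here is a toy sketch that works on a raw string instead of the parser's token stream. `split_period_separated` is a hypothetical function, not the real `parse_period_separated_identifiers`; it ignores quoting and whitespace. The point is that every segment of a column reference must be a plain identifier, so an object-name-only form such as an empty segment is rejected.

```rust
/// Splits a dotted column reference such as `t.col` into its identifiers.
/// Toy sketch: no support for quoted identifiers or whitespace.
pub fn split_period_separated(input: &str) -> Result<Vec<String>, String> {
    let mut idents = Vec::new();
    for segment in input.split('.') {
        // Each segment must be a non-empty run of identifier characters.
        let valid = !segment.is_empty()
            && segment.chars().all(|c| c.is_alphanumeric() || c == '_');
        if !valid {
            return Err(format!("expected identifier, found {segment:?}"));
        }
        idents.push(segment.to_string());
    }
    Ok(idents)
}
```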

mvzink · 2024-11-20T23:29:59Z

I think ObjectNamePart::Wildcard or something would be better than what I did in #1538, so this seems like a good idea to me.

src/ast/mod.rs

src/parser/mod.rs

iffyio

Thanks @ayman-sigma! Left some minor comments; this looks good to me overall.

src/ast/mod.rs

src/parser/mod.rs

iffyio

LGTM! cc @alamb

alamb · 2024-11-30T13:10:00Z

Hi @ayman-sigma this PR appears to have some conflicts. Is there any chance you can resolve them so we can merge it in?

Thank you!

ayman-sigma · 2024-12-02T03:54:44Z

> Hi @ayman-sigma this PR appears to have some conflicts. Is there any chance you can resolve them so we can merge it in?
>
> Thank you!

@alamb, Done.

alamb

I started trying to update DataFusion to use this change -- it turns out to be fairly invasive.

You can try here: apache/datafusion#13546

(The issue is that we have a bunch of code that handles ObjectName --> Idents.)

I think we can make the DataFusion code better / easier to follow.
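
As a hedged illustration of the kind of downstream change involved (this is not DataFusion's actual code), a consumer that used to read identifiers straight out of an object name now has to decide what to do with non-identifier segments. The types below are simplified stand-ins, and `object_name_to_idents` and the `DefaultSchema` variant are hypothetical.

```rust
/// Simplified stand-in for the parser's identifier type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub struct Ident(pub String);

#[derive(Debug, Clone, PartialEq)]
pub enum ObjectNamePart {
    Identifier(Ident),
    /// Hypothetical non-identifier segment.
    DefaultSchema,
}

pub struct ObjectName(pub Vec<ObjectNamePart>);

/// Hypothetical downstream helper: recover plain identifiers, or `None`
/// if any segment is not an identifier.
pub fn object_name_to_idents(name: &ObjectName) -> Option<Vec<Ident>> {
    name.0
        .iter()
        .map(|part| match part {
            ObjectNamePart::Identifier(ident) => Some(ident.clone()),
            _ => None,
        })
        .collect()
}
```

Centralizing the conversion in one helper like this would keep the churn to call sites that previously indexed into the identifier list directly.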

alamb · 2024-12-11T22:37:12Z

Given the potential for non-trivial downstream conflicts due to this change (look at the list of conflicts it has already collected), I would like to consider it for the next release.

Release sqlparser-rs version 0.53.0 / sqlparser_derive 0.3.0 #1517

ayman-sigma · 2024-12-12T02:49:56Z

> Given the potential for non-trivial downstream conflicts due to this change (look at the list of conflicts it has already collected), I would like to consider it for the next release.
>
> Release sqlparser-rs version 0.53.0 / sqlparser_derive 0.3.0 #1517

Sounds good. Thanks @alamb!

iffyio · 2025-01-18T08:14:45Z

@alamb just wanted to double-check the status of this PR: were there reservations you had, or do you feel this is something we would be able to land?

alamb · 2025-01-18T21:45:20Z

> @alamb just wanted to double-check the status of this PR: were there reservations you had, or do you feel this is something we would be able to land?

My biggest reservation was that it would cause substantial downstream churn (I briefly tried to make the changes to DataFusion and it was painful). So I just haven't had the heart to click the merge button.

I was mentally prepared that, if you merged it, I would sort it out downstream, but I couldn't get myself to inflict the pain on myself ...

iffyio · 2025-01-19T10:40:09Z

Ah yeah this is indeed an invasive change. Alright that makes sense!

In that case, @ayman-sigma, please take a look at resolving the conflicts when you have some time to pick this back up, and we can look to merge it. Sorry for the delay in getting to it.

alamb · 2025-01-19T14:08:08Z

FWIW I did a test upgrade of DataFusion to prepare for the next release, and it already needed some non-trivial changes (changes to FieldAccess specifically):

Test upgrade to sqlparser-rs 0.54 datafusion#14198

ayman-sigma mentioned this pull request Nov 20, 2024

Support snowflake double dot notation for object name #1540

Merged

mvzink reviewed Nov 20, 2024

src/ast/mod.rs (resolved)

src/parser/mod.rs (outdated, resolved)

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from 4b4998e to 18ca48f on November 21, 2024 20:19

iffyio mentioned this pull request Nov 23, 2024

How to best add support for IDENTIFIER() clause #1412

Open

iffyio reviewed Nov 23, 2024

src/ast/mod.rs (resolved)

src/parser/mod.rs (outdated, resolved)

src/parser/mod.rs (outdated, resolved)

src/parser/mod.rs (outdated, resolved)

ayman-sigma requested a review from iffyio November 24, 2024 20:33

iffyio approved these changes Nov 25, 2024

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from e22e3d8 to 176cf13 on November 26, 2024 02:39

ayman-sigma added 8 commits December 1, 2024 19:46

improve object name parts

bcc9067

update readme

532a935

fmt

079a67b

remove comment

331233a

fix vistior tests and address comment

0675d75

rebase and fix tests

10bca54

address comments

1fd0477

fix after rebase

a77b1ea

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from 29f2610 to 6f05bcf on December 2, 2024 03:50

implement spanned for ObjectNamePart

7791973

ayman-sigma force-pushed the ayman/improveObjectNameParts branch from 6f05bcf to 7791973 on December 2, 2024 03:52

alamb reviewed Dec 2, 2024
