Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Location/Spans in to the AST/parse layer #839

Conversation

jakeswenson
Copy link

This is an attempt at starting to plumb a location (or in this case a Span) through to the parsing/AST layer: #757.

Changes

  1. Add a new wrapper type WithSpan<T> which can be used to wrap and attach a span to any AST node. Not sure about this approach yet, but it seemed to work OK. I think this might need to evolve in to a type called WithTrivia<T> where Span and "trivia items" like leading and trailing Token's to an AST node are attached. (i.e. the SELECT keyword for a query, to get a span for the entire select query.) This "trivia" based approach is something I've learned from dotnet/roslyn (the C# compiler). (This trivia based approach was also taken with typescript.)
  2. Add a Span type which is a struct with a start (inclusive) and end (exclusive) Location

ToDo still

  • Finish wiring spans through the tests. This is going to be the most annoying part about this work. Would love to hear folks thoughts on this.

Open Questions

  • Should this work just align to using zkat/miette Diagnostic and SourceSpan.

@ankrgyl
Copy link
Contributor

ankrgyl commented Mar 19, 2023

@jakeswenson thanks for putting this up! Can you talk about how it's different than the approach in #790?

The key problem we're trying to work through right now is how to avoid breaking downstream users of sqlparser by introducing locations.


let with_grant_option =
self.parse_keywords(&[Keyword::WITH, Keyword::GRANT, Keyword::OPTION]);

let granted_by = self
.parse_keywords(&[Keyword::GRANTED, Keyword::BY])
.then(|| self.parse_identifier().unwrap());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this previous possible panic! in the parser for these GRANT statement.


let granted_by = self
.parse_keywords(&[Keyword::GRANTED, Keyword::BY])
.then(|| self.parse_identifier().unwrap());
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another panic! fix in REVOKE

@alamb
Copy link
Contributor

alamb commented Mar 20, 2023

cc @ankrgyl who I believe was doing something similar

@ankrgyl
Copy link
Contributor

ankrgyl commented Mar 21, 2023

@alamb @jakeswenson yes indeed! I posted this above too, but it'd be great to understand the differences between this and #790. In general, I'd prefer to collaborate on that PR vs. work on two separate ones.

@lustefaniak
Copy link
Contributor

I used this PR in our forked https://github.com/getsynq/sqlparser-rs and a tool built on top of that, and it works amazingly well. Adding empty_span() all over the place is painful initially, but in the end, all works as it should and is not much of a burden.

Since I don't have much Rust experience, I can't comment on any technical downsides(if any) of that design, but the "client" code required minimal changes to make it fully work.

@alamb
Copy link
Contributor

alamb commented Oct 2, 2023

This is the commit in case anyone is interested: getsynq@b4547ca

Since this feature keeps being requested, it would be great to get something ready to go

@lustefaniak
Copy link
Contributor

After working on column-level lineage using sqlparser-rs even more, I think we must add location to every instance of Ident and maybe also ObjectName as otherwise it is super annoying to mark every Ident as WithSpan<Ident> and then patching all the tests 🤔

@yuyang-ok
Copy link

Is this going to merge or not?

@alamb
Copy link
Contributor

alamb commented Mar 14, 2024

Is this going to merge or not?

I don't think this PR is mergeable as is (it has conflicts and doesn't pass CI)

As the Location/Span thing keeps coming up it would be awesome for someone to take a stab at implementing it. The core challenge, in my mind, of adding this feature is minimizing downstream crate impacts (sqlparser is used quite widely). I don't think we'll be able to avoid all impacts but being able to minimize the impact is important

@alamb alamb mentioned this pull request Mar 21, 2024
2 tasks
Copy link

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants