Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: consolidate table_reference and invocation identifier into object_reference #150

Merged
merged 16 commits into from
Jun 20, 2023

Conversation

dmfay
Copy link
Collaborator

@dmfay dmfay commented May 25, 2023

Incorporates #149 ; not ready for primetime yet, there are four test failures that mostly stem from incorrect precedence negotiation between object_reference and field -- give it a whirl if you have time!

@matthias-Q if this is too much (it's a lot) you can avoid a lot of the precedence-setting in #149 by setting it in parametric_type as here. It still wants the precedence on relation but I haven't gone too far down that rabbit hole yet.

@dmfay dmfay requested a review from matthias-Q May 25, 2023 04:19
Copy link
Collaborator

@matthias-Q matthias-Q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I like the new _dml_* structure

grammar.js Outdated
'.',
),
),
choice(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a choice of just one element.

@@ -625,7 +642,7 @@ module.exports = grammar({
optional(
$.keyword_only,
),
$.table_reference,
$.object_reference,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not forget to fix the queries downstream in nvim-treesitter. There is a query matching on table_reference

@matthias-Q
Copy link
Collaborator

matthias-Q commented May 25, 2023

The test Update with a JOIN is failing, because the identifier is missing the schema. In the test the assignment is written as a.d = 5;

    assignment: $ => seq(
      field('left', $.identifier), // column name only, no schema/table qualifiers
      '=',
      field('right', $._expression),
    ),

I do not understand why this has been changed from field to identifier

@matthias-Q
Copy link
Collaborator

matthias-Q commented May 25, 2023

I have a variant of your branch, where only two tests fail (Simple select with schema and fully-pathed fields, insert with field name). Should I push it to this PR?

@dmfay
Copy link
Collaborator Author

dmfay commented May 25, 2023

I have a variant of your branch, where only two tests fail (Simple select with schema and fully-pathed fields, insert with field name). Should I push it to this PR?

yes absolutely!

I do not understand why this [update assignment left-hand side] has been changed from field to identifier

I didn't follow up on MySQL multi-table updates (where the lhs can be qualified) until later and forgot to undo that change. I'll wait to change anything until you've pushed your updates though.

@matthias-Q
Copy link
Collaborator

matthias-Q commented May 26, 2023

Not so many changes. I played around with your changes a bit.

The last two tests are breaking because of L1308 - _set_values is commented out. When adding this is a whole lot more breaks. I assume this is what you are referring to with the MySQL updates.

The other one is the fully qualified object_ref. LL1260 object_reference only has schema.name and it should be schema.table.col. But there I get a lot of issues.

[Edit]: I did not comment all in in L1308. Now it is just one test failing, and thats the fully qualified path fields.

@matthias-Q
Copy link
Collaborator

Now I stumbled over your comments before _set_value. There is this test (Insert with field name) and it is valid in MySQL. I was not able to find it in the documentation but I did a quick test against a mysql instance.

@dmfay
Copy link
Collaborator Author

dmfay commented Jun 1, 2023

some experimenting here in a toy subset https://github.com/dmfay/tsqltest

this gets sch.tbl.field, sch.tbl.* (missing a test in main atm!), etc correctly in select clauses but breaks join predicates, order columns, and so on when transferred to the full tree-sitter-sql since field no longer contains an internal object_reference

I think it is promising or at least a hopeful sign though! I'll keep looking into this

@dmfay
Copy link
Collaborator Author

dmfay commented Jun 19, 2023

okay I'm alive! Lots of progress here -- actually the object_reference move is complete, and the remaining issue I can see is with the various optimize statements added in the interim. As pushed this doesn't compile, complaining about conflicts first in vacuum and then in others.

The vacuum conflict is resolved with prec.right but that's a very blunt instrument and there's something extra-fishy going on: one of the conflict paths involves field _vacuum_table_repeat1, although paren_list uses plain repeat not repeat1, and then if you change line 1490 to specify identifier the conflict now implicates create_view_repeat1 which has a paren_list($.identifier)! Possibly we need to lift those into named/anonymous rules?

@matthias-Q
Copy link
Collaborator

I don't understand why we have conflicts here at all. I mean vacuum is close to the root of the tree and it is followed by a node with is almost at the bottom (field; identifier).
From my understanding, this should not lead to any disambiguities as it is a fairly unique parsing "path".

What do you mean by "lifting into named/anonymous rules"?

@dmfay
Copy link
Collaborator Author

dmfay commented Jun 19, 2023

I don't understand why we have conflicts here at all. I mean vacuum is close to the root of the tree and it is followed by a node with is almost at the bottom (field; identifier). From my understanding, this should not lead to any disambiguities as it is a fairly unique parsing "path".

What do you mean by "lifting into named/anonymous rules"?

because the anomalous repeat1s reference other rules entirely, I wonder if tree-sitter is doing something strange with scopes of "shared" anonymous rules like paren_list -- that is, because a paren_list(identifier) appears in the "create view" path first or last, other instances of paren_list(identifier) get mixed up with "create view" syntax. So all's well if we have only the one instance of paren_list(identifier) but the moment we have another we need to have an explicit view_params_paren_list + other_thing_paren_list, or at least that's my current guess.

@dmfay
Copy link
Collaborator Author

dmfay commented Jun 20, 2023

I think it's because of how we structure programs as sequential statements and often have optional ; delimiters. So vacuum tbl ( NEXT_TOKEN could express either a single statement expressing vacuum options or two statements, a simplest-possible vacuum and a parenthesized expression.

That seems like a bit much to try to tackle here so I've added a conflict for _vacuum_table and _compute_stats which lets it compile for now. All tests pass, but we should revisit programs as sequences of delimited statements soon!

@dmfay dmfay marked this pull request as ready for review June 20, 2023 01:39
Copy link
Collaborator

@matthias-Q matthias-Q left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome. I really like that we have a clearer structure now.

There 3 comments, two of them are just about comments in the code.

grammar.js Outdated
Comment on lines 1364 to 1365
// TODO i don't think we have a test for `insert into tbl set ....` and
// it's no dialect i can think of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the docs correclty, then MariaDB allows for this:
https://mariadb.com/kb/en/insert/

@@ -2092,26 +2101,29 @@ module.exports = grammar({
),
optional($.where),
optional($.group_by),
optional($.window_clause),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is is correct that the window_clause can be between group_by and order_by? Does the order matter here? I thought that is should be last.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed Postgres in placing it between having and order by. I haven't seen another product with a standalone window clause.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does make sense as limit and order by apply to the result of set operations (union, intersect, except) & you can see those are a dividing line

grammar.js Outdated Show resolved Hide resolved
grammar.js Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants