Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VALUES statement in MySQL #544

Closed
irigarae opened this issue Sep 23, 2024 · 13 comments
Closed

VALUES statement in MySQL #544

irigarae opened this issue Sep 23, 2024 · 13 comments
Assignees
Labels
enhancement needs analysis I need to think about this!

Comments

@irigarae
Copy link

This :values returns invalid mysql syntax (taking inspiration from special syntax :composite example:

(honey.sql/format
 {:select [:*]
  :from [[:table :t1]
         [{:values [[1 2] [3 4]]}
          [:t2 [:composite :col1 :col2]]]]}
 {:dialect :mysql
  :inline true})

it returns

SELECT * FROM (VALUES (1, 2), (3, 4)) AS `t2` (`col1`, `col2`)

however per the official documentation, https://dev.mysql.com/doc/refman/8.0/en/values.html, the correct syntax is

SELECT * FROM (VALUES ROW(1, 2), ROW(3, 4)) AS `t2` (`col1`, `col2`)
@p-himik
Copy link
Contributor

p-himik commented Sep 23, 2024

:values itself should probably continue working the original way since MySQL also allows plain VALUES clause that works differently.

Unfortunately, handling of :values doesn't seem to allow you to throw in some extra keywords to make it be formatted as VALUES ROW(...) (only VALUES (ROW(...)) which is invalid), but you can create your own formatter quite trivially.

@seancorfield
Copy link
Owner

I think I'll need to add :values-row to support this. I'll give it some thought before I choose an approach.

@seancorfield seancorfield self-assigned this Sep 23, 2024
@seancorfield seancorfield added needs analysis I need to think about this! enhancement labels Sep 23, 2024
@irigarae
Copy link
Author

VALUES is a DML statement introduced in MySQL 8.0.19 which returns a set of one or more rows as a table. In other words, it is a table value constructor which also functions as a standalone SQL statement.

Looking at the official documentation it's used to construct a table, and it contains rows inside. Maybe instead of :values-row (in singular) something like :values-table or :values-rows (notice the plural) might be more appropriate?

And also don't know if there are existing formats like this, but instead of something like {:values-table [[1 2] [3 4]]} maybe :row could be explicit like {:values [[:row 1 2] [:row 3 4]]}? Just leaving ideas, I have no real preference whatsoever.

@p-himik
Copy link
Contributor

p-himik commented Sep 23, 2024

Can't use {:values [[:row 1 2] [:row 3 4]]} because right now it's a valid syntax and might be used by other people for other DBs.

@seancorfield
Copy link
Owner

Re: the name in the DSL, it is standard for HoneySQL to produce SQL keywords that match the DSL keyword (or symbol) so VALUES ROW(1, 2) would come from :values-row (and if there are multiple rows, each of them is still ROW singular).

@seancorfield
Copy link
Owner

OK, I did some experiments and this seems viable:

(format {:values [:row [1 2] [3 4]]})

with just a couple of tweaks to format-values so that's what I'm going to go with.

I just need to figure out an appropriate set of tests and some documentation :)

seancorfield added a commit that referenced this issue Sep 28, 2024
Signed-off-by: Sean Corfield <sean@corfield.org>
@seancorfield
Copy link
Owner

@irigarae You can try this out with the latest SNAPSHOT or git dep. I'll add docs and tests tomorrow probably and may release a full new version next week...

@seancorfield
Copy link
Owner

Example added to docs (in Clause Reference) which also acts as a test.

@irigarae
Copy link
Author

irigarae commented Sep 29, 2024

@irigarae You can try this out with the latest SNAPSHOT or git dep. I'll add docs and tests tomorrow probably and may release a full new version next week...

Don't know if it's intentional, but the test is on #{:row 'rows} (see commit). Don't know if it's intentional to have 'rows or it should've been #{:row 'row}.

$ clj -Sdeps '{:deps {com.github.seancorfield/honeysql {:git/sha "3d48ecac379d6e85535cc94137d4a7115873f1fc"}}}'
user=> (require '[honey.sql])
user=> (honey.sql/format {:select [:*] :from [[{:values [:row [1 2] [3 4]]} :t]]} {:dialect :mysql :inline true})
["SELECT * FROM (VALUES ROW(1, 2), ROW(3, 4)) AS `t`"]
;; this one uses singular `row`
user=> (honey.sql/format '{select [*] from [[{values [row [1 2] [3 4]]} t]]} {:dialect :mysql :inline true})
["SELECT * FROM (VALUES ROW, (1, 2), (3, 4)) AS `t`"]
;; this one uses plural `rows`
user=> (honey.sql/format '{select [*] from [[{values [rows [1 2] [3 4]]} t]]} {:dialect :mysql :inline true})
["SELECT * FROM (VALUES ROW(1, 2), ROW(3, 4)) AS `t`"]

Thanks for the prompt development!


(Off topic) Unrelated, but I’m curious about why you test both the keyword or the symbol. I would’ve expected an initial pass normalizing everything (e.g. all keywords become symbols) and then only test on a single thing. Is there a reason for this design? Maybe performance?

@seancorfield
Copy link
Owner

Thanks @irigarae -- that's a bug. Fixed now.

The reason is so you can mix'n'match quoted symbolic forms and keyword-based forms, especially when using the helper functions, so you can write whichever is easier for your situation -- or your preference. A quoted symbolic form is less typing and can be easier to read -- it looks more like code -- if you don't have "variables" (values) in it, or you're using named parameters.

@irigarae
Copy link
Author

Thanks for everything!

Anyways, I'm not sure I expressed my question clearly. Of course I think it's really useful to mix and match, everybody can use their own preferred style. But what I imagine is that on the parsing/interpretation level it's generally safe to blindly normalize to one thing (e.g., keywords to symbols) and then build all the logic on a single way. E.g., basically the test in this case would be just #{'row}. I wanted to understand why such choice of not normalizing before all the parsing logic. Maybe your response explains that and it's just me that I'm not familiar with the internals. No need to delve deeper!

Just playing with it I found this difference (I can open a bug report if you prefer), which would be avoided if things were normalized somehow:

(sql/format '{select * from t1 join (t2 (:using id)) where (= t1/id 1)} {:dialect :mysql})
;; => ["SELECT * FROM `t1` INNER JOIN `t2` USING (`id`) WHERE `t1`.`id` = ?" 1]
(sql/format '{select * from t1 join (t2 (using id)) where (= t1/id 1)} {:dialect :mysql})
;; => ["SELECT * FROM `t1` INNER JOIN `t2` ON USING(`id`) WHERE `t1`.`id` = ?" 1]

Ok to leave the conversation here, in any case.

@seancorfield
Copy link
Owner

Efficiency. Normalizing everything first would require rebuilding the whole DSL with (say) just keywords and then parsing the whole DSL. In many places, HoneySQL does convert symbols to keywords as part of parsing, but in some places it's just easier to test against the keyword or symbol without doing the conversion.

And, yes, what you ran into is a bug, so please create an issue for that!

@seancorfield
Copy link
Owner

Don't worry about creating an issue -- I found and fixed it. There were a couple of other, similar bugs in other places (also fixed).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement needs analysis I need to think about this!
Projects
None yet
Development

No branches or pull requests

3 participants