Automatically escape table names in DBI interface #618

detule · 2023-11-18T19:18:04Z

This is a solution to the issue reported in #591.

In particular, add a boolean `exact` argument to `odbcConnectionTables` and `odbcConnectionColumns`. Users should set this to `TRUE` to express that any non-null identifier arguments are to be matched exactly. This may lead to performance improvements for some backends.

However, back ends should treat this as optional to implement. It exists only as a way to reap the performance benefits of exact matching, as opposed to allowing pattern value arguments.
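To illustrate the difference between exact and pattern matching: in ODBC catalog functions, pattern value arguments treat `_` as a single-character wildcard and `%` as matching any sequence of characters. The sketch below emulates that matching in base R. `pattern_matches()` is a hypothetical helper written for illustration only, not part of odbc, and it assumes a backslash as the escape character (the real escape character is driver-specific).

```r
# Hypothetical helper (not part of odbc): emulate ODBC-style pattern matching.
# Assumes "\" is the escape character; real drivers may use something else.
pattern_matches <- function(pattern, candidates) {
  chars <- strsplit(pattern, "", fixed = TRUE)[[1]]
  regex <- "^"
  i <- 1
  while (i <= length(chars)) {
    ch <- chars[i]
    if (ch == "\\" && i < length(chars)) {
      # Escaped character: match the next character literally.
      regex <- paste0(regex, "\\Q", chars[i + 1], "\\E")
      i <- i + 2
    } else {
      # "_" matches one character, "%" matches any run, the rest is literal.
      piece <- switch(ch, "_" = ".", "%" = ".*", paste0("\\Q", ch, "\\E"))
      regex <- paste0(regex, piece)
      i <- i + 1
    }
  }
  candidates[grepl(paste0(regex, "$"), candidates, perl = TRUE)]
}

tables <- c("my_tbl", "myXtbl", "my_tbl2")
pattern_matches("my_tbl", tables)    # unescaped: also matches "myXtbl"
pattern_matches("my\\_tbl", tables)  # escaped: matches only "my_tbl"
```

With the unescaped pattern, the lookup has to consider every name that fits the wildcard, which is where an exact match can save work on some back ends.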


hadley · 2023-11-27T13:16:54Z

R/Connection.R

 #' @param ... additional parameters to methods
+#' @param exact Set to TRUE if any non-null identifier arguments are to be interpreted


I think my only remaining question is if we want to use exact = TRUE (or maybe escape = TRUE or pattern = FALSE) or I(). The advantage of an argument is that it makes it quite clear in the function signature; the advantage of I() is that it keeps the two concerns very closely coupled.

detule · 2023-12-01

Good question:

Can go either way on this one, but maybe slightly in favor of the argument, because the user may be interacting with the API by passing in a `DBI::Id` or `DBI::SQL` argument.

Haven't thought too hard about how to pass the `I()` through those containers. It feels like there is a way, but it might start to stretch the proposition.
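One way to picture the `I()` alternative discussed here: a helper checks whether a value carries the `AsIs` class that `I()` adds and, if so, passes it through as a pattern; otherwise it escapes underscores so the value matches literally. `as_catalog_arg()` is a hypothetical name, and escaping with a backslash is an assumption for illustration, not necessarily what the package does.

```r
# Hypothetical sketch: treat I("...") as "this is a pattern, leave it alone",
# and escape everything else so it is matched literally.
as_catalog_arg <- function(x) {
  if (is.null(x)) {
    return(NULL)
  }
  if (inherits(x, "AsIs")) {
    # Wrapped in I(): the caller wants wildcards, so drop the class and pass through.
    return(unclass(x))
  }
  # Plain string: escape any "_" that is not already preceded by a backslash.
  gsub("(?<!\\\\)_", "\\\\_", x, perl = TRUE)
}

writeLines(as_catalog_arg("my_tbl"))   # my\_tbl
writeLines(as_catalog_arg(I("my_%")))  # my_%
```

The coupling is the appeal of this design: the marker travels with each value, so one argument can be a pattern while another is literal.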

hadley · 2023-11-27T13:17:17Z

R/utils.R

+
+# Will iterate over the charsToEsc argument and, for each character,
+# will escape any un-escaped occurrence in `x`.
+escapeChars <- function(x, charsToEsc = c("_"), escChar = "\\\\") {


I think it would be better to call this escapePattern() and hard code it to escape % and _.

detule · 2023-12-01

Hey Hadley:

Renamed it, but I am not sold on escaping `%`. It seems to me that it is so infrequently used in identifiers that if someone is passing a `%` sign they are trying to use a wildcard (but may have forgotten to wrap the input appropriately).

I keep thinking: should we escape the percent character in `odbcConnectionColumns(conn, catalog_name = "my_cat", schema_name = "my_sch", name = "my_tbl", column_name = "%", exact = TRUE)`?

Anyway, I can be moved here; just let me know.

hadley · 2023-12-01

My feeling is that we want to make the leaky abstraction on top of the ODBC API clear, so there are two choices: either a pattern match (the default) or a literal string.

I think what you're recognising in the API is that it's a bit weird to have one parameter that controls whether or not three other arguments are escaped. That's one advantage of the interface that uses I(), but that would mean making escape = TRUE the default, which I know you're not keen on.

So only escaping _ seems like a pragmatic choice and I'm fine with it if you want to revert back to your original idea.
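The two options debated in this thread can be compared with a small sketch. `escape_pattern_sketch()` is a hypothetical stand-in, not the function merged in this PR; it assumes a backslash escape character and skips characters that are already escaped. The default escapes only `_`, and passing `chars = c("_", "%")` shows the stricter variant.

```r
# Hypothetical sketch of the escaping discussed above; not the package's code.
# Assumes "\" is the escape character understood by the driver.
escape_pattern_sketch <- function(x, chars = "_") {
  # Character class of the wildcards to escape, e.g. "[_]" or "[_%]".
  char_class <- paste0("[", paste(chars, collapse = ""), "]")
  # Negative lookbehind: skip wildcards already preceded by a backslash.
  regex <- paste0("(?<!\\\\)(", char_class, ")")
  # Replacement: a literal backslash followed by the matched wildcard.
  gsub(regex, "\\\\\\1", x, perl = TRUE)
}

writeLines(escape_pattern_sketch("my_tbl"))                       # my\_tbl
writeLines(escape_pattern_sketch("my\\_tbl"))                     # my\_tbl (unchanged)
writeLines(escape_pattern_sketch("50%_off", chars = c("_", "%"))) # 50\%\_off
```

Escaping only `_` leaves a bare `%` available as a wildcard, which matches the pragmatic choice described above.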

detule · 2023-12-10

Hey, thanks for continuing to think about this. Good points about the pros and cons. I am still thinking about how the user would pass `I()` through the `DBI::Id` container, since the `AsIs` class does not survive:

    > str(DBI::Id(name = I("abc"))@name)
     Named chr "abc"
     - attr(*, "names")= chr "name"

In the meantime, I think by escaping the identifiers in the DBI API endpoints we've covered most (all?) of the performance gains to be had. Copy your note: let's merge and then iterate as needed.


detule · 2023-12-10T23:39:45Z

Re-based. Will merge when checks are complete.

fh-mthomson · 2023-12-19T16:54:02Z

This is a game changer for being able to actually write to Snowflake! Thank you @detule @fh-afrachioni @hadley!

detule mentioned this pull request Nov 18, 2023

Add odbcSetMetadataId #591

Closed

detule requested a review from hadley November 18, 2023 20:00

detule force-pushed the snowflake/perf_improvements branch from 2f219b7 to 1ec2613 Compare November 18, 2023 20:01

hadley reviewed Nov 19, 2023

R/Connection.R (outdated, resolved)

R/db.R (outdated, resolved)

R/db.R (outdated, resolved)

R/Table.R (outdated, resolved)

R/Connection.R (resolved)

hadley reviewed Nov 20, 2023

R/utils.R (outdated, resolved)

hadley added a commit that referenced this pull request Nov 22, 2023

Escape schema, table, and column names by default

ff0930f

Providing `I()` as an escape hatch if you do want to use pattern matching. Fixes #618

hadley mentioned this pull request Nov 22, 2023

Escape schema, table, and column names by default #622

Closed

hadley changed the title ~~snowflake: odbcConnectionTables|Columns: Performance improvements~~ Automatically escape table names in DBI interface Nov 27, 2023

hadley approved these changes Nov 27, 2023

hadley added a commit that referenced this pull request Dec 7, 2023

Document dbListTables and dbListFields together

43aed86

And inherit less from DBI in order to avoid confusing text. Mostly taken from #618. Fixes #645

hadley mentioned this pull request Dec 7, 2023

Document dbListTables and dbListFields together #655

Merged

hadley added a commit that referenced this pull request Dec 8, 2023

Document dbListTables and dbListFields together (#655)

b46f6cc

And inherit less from DBI in order to avoid confusing text. Mostly taken from #618. Fixes #645 Drop aliases for other DBI functions

detule added 6 commits December 10, 2023 23:18

snowflake: odbcConnectionTables|Columns: Performance improvements

ee2a31a

Add NEWS entry

6f64913

code-review

63aef4d

code-review2

9203750

code-review3

9e2d85a

code-review: escapeChars to escapePattern

f64ac32

detule force-pushed the snowflake/perf_improvements branch from 41e0768 to f64ac32 Compare December 10, 2023 23:38

hadley added 4 commits December 11, 2023 07:13

Update docs; tweak style

0d0bc59

More doc + style tweaks

42be013

Update news bullet

150fe0b

More doc polishing

80c5b9a

hadley merged commit 56494c0 into r-dbi:main Dec 11, 2023
16 checks passed

fh-mthomson mentioned this pull request Dec 20, 2023

dbListTables only returns the first letter of each table name. #561

Closed

fh-afrachioni mentioned this pull request Dec 25, 2023

Performance and usability of joins with copy = TRUE to Snowflake tidyverse/dbplyr#1433

Closed

detule commented Nov 18, 2023

hadley Nov 27, 2023

detule Dec 1, 2023

hadley Nov 27, 2023

detule Dec 1, 2023

hadley Dec 1, 2023

detule Dec 10, 2023

detule commented Dec 10, 2023

fh-mthomson commented Dec 19, 2023
