-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automatically escape table names in DBI interface #618
Conversation
2f219b7
to
1ec2613
Compare
Providing `I()` as an escape hatch if you do want to use pattern matching. Fixes #618
R/Connection.R
Outdated
#' @param ... additional parameters to methods | ||
#' @param exact Se to TRUE if any non-null identifier arguments are to be interpreted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my only remaining question is if we want to use exact = TRUE
(or maybe escape = TRUE
or pattern = FALSE
) or I()
. The advantage of an argument is that it makes it quite clear in the function signature; the advantage of I()
is that it keeps the two concerns very closely coupled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question:
Can go either way on this one, but maybe slightly in favor of the argument because the user may be interacting with the API by passing in a DBI::Id
or DBI::SQL
argument.
Haven't thought too hard about how to pass the I()
through those containers / feels like there is a way but it might start to stretch the proposition.
R/utils.R
Outdated
|
||
# Will iterate over charsToEsc argument and for each: | ||
# will escape any un-escaped occurance in `x`. | ||
escapeChars <- function(x, charsToEsc = c("_"), escChar ="\\\\") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to call this escapePattern()
and hard code it to escape %
and _
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Hadley:
Renamed it - but I am not sold on escaping "%". It seems to me that it is so infrequently used as an identifier that if someone is passing "%" sign they are trying to use a wild card ( but may have forgotten to wrap the input appropriately ).
I keep thinking, should we escape the percent character in odbcConnectionColumns(conn, catalog_name = "my_cat", schema_name = "my_sch", name = "my_tbl", column_name = "%", exact = TRUE)
?
Anyway, I can be moved here just let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My feeling is that we want to make the leaky abstraction on top of the ODBC API clear, so there are two choices, either a pattern match (the default) or a literal string.
I think what you're recognising in the API is that it's a bit weird to have one parameter that controls whether or not three other arguments are escaped. That's one advantage of the interface that uses I()
, but that would mean making escape = TRUE
the default, which I know you're not keen on.
So only escaping _
seems like a pragmatic choice and I'm fine with it if you want to revert back to your original idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey - thanks for continuing to think about this - good points about pros/cons. I am still thinking about how the user would pass I()
through the DBI::Id
container.
> str(DBI::Id(name=I("abc"))@name)
Named chr "abc"
- attr(*, "names")= chr "name"
In the meantime - I think by escaping the identifiers in the DBI
API endpoints we've covered most (all?) of the performance gains to be had. Copy your note on let's merge and then iterate as needed.
41e0768
to
f64ac32
Compare
Re-based. Will merge when checks are complete. |
This is a game changer for being able to actually write to Snowflake! Thank you @detule @fh-afrachioni @hadley! |
This is a solution to the the issue reported in #591
In particular, add an
exact
boolean argument toodbcConnectionTables
andodbcConnectionColumns
. Users should set this toTRUE
to express that any non-null identifier arguments are to be matched exactly. This may lead to performance improvements for some backends.However, this should be considered to be optionally implemented, and only as a way to reap any performance benefits from matching exactly, rather than allowing pattern value arguments.