-
-
Notifications
You must be signed in to change notification settings - Fork 398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parametrize SELECT queries #1777
Parametrize SELECT queries #1777
Conversation
|
||
@classmethod | ||
def parameterizer(cls) -> Parameterizer: | ||
return Parameterizer(placeholder_factory=lambda _: "%s") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
psycopg decided to use non-standard placeholder for Postgres, so we have to account for that.
7ad8250
to
ca5e37d
Compare
Pull Request Test Coverage Report for Build 11958477626Warning: This coverage report may be inaccurate.This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Details
💛 - Coveralls |
tests/test_case_when.py
Outdated
if self.dialect == "mysql": | ||
expected_sql = "SELECT `intnum` `intnum`,CASE WHEN `intnum`>=%s THEN %s ELSE `intnum_null` END `category` FROM `intfields`" | ||
elif self.dialect == "postgres": | ||
if self.is_psycopg: | ||
expected_sql = 'SELECT "intnum" "intnum",CASE WHEN "intnum">=%s THEN %s ELSE "intnum_null" END "category" FROM "intfields"' | ||
else: | ||
expected_sql = 'SELECT "intnum" "intnum",CASE WHEN "intnum">=$1 THEN $2 ELSE "intnum_null" END "category" FROM "intfields"' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we add parameters asserts, if we are not checking them inside query text anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point!
I did a bit of research. Django's equivalent of sql()
method returns SQL with values in place but the SQL itself might be invalid. So it's basically just for reference when you want to have a quick check what SQL the query generates.
In Ruby on Rails they return SQL with values in place too. I read a few issues in their repo and looks like it's just for reference too.
In SQLAlchemy they return the parametrized query without parameters by default but you can pass an argument to the function to have values in place.
I'm a bit torn on this one. It feels to me that returning a not parametrized query with values in place means that we are returning a different query from the one actually being executed. So I'm inclined to introduce a function argument to sql()
to configure whether to return parametrized query or a query with inlined values. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think doing it with parameter is okay, if someone needs it for debug reasons - shouldn't be problem to pass parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added params_inline
arg to QuerySet.sql()
.
tortoise/queryset.py
Outdated
parameterizer = pypika_kwargs.pop("parameterizer", self._db.executor_class.parameterizer()) | ||
return ( | ||
self.query.get_sql(parameterizer=parameterizer, **pypika_kwargs), | ||
parameterizer.values, | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you decided to implement logic of parametrisation in queryset, and not in executor?
If it is about reusing same logic between different backends - it seems to me that it should be no problem to store that logic in base executor
It just seem odd for queryset to get down in such details of query formatting as forming sql string and deriving params from it.
I would imagine queryset as quite abstract entity that doesn't go in such details in making query and pypika query seems to be okay mediator between queryset and executors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why you decided to implement logic of parametrisation in queryset, and not in executor?
Honestly, I haven't thought about it before you asked but it's a great question.
I tried to refactor the code to have parametrization as part of executors and realized a few things:
- The executor interface will need to be extended with a method to return SQL, parametrized or not
- When using
Subquery
and parametrizing the query, pypika's keywords are passed from the parent query, so they will need to be passed to the executor too, basically, making the the logic a bit more complicated - I tried to see how we can have better abstractions but honestly right now pypika is all over the place.
QuerySet
modifies private fields of pypika's query and executor builds its own query, etc. So this PR doesn't change the situation by much.
However, I moved the parametrization code to a private method which doesn't seem't seem to be complicated, it's 5 lines of code - please see _parametrize_query
I'm kinda inclined to think that it's Okay to keep parametrization as part of queryset but please let me know what you think!
ca5e37d
to
0d2fe63
Compare
@@ -215,10 +215,13 @@ def __init__(self, query: "AwaitableQuery") -> None: | |||
self.query = query | |||
|
|||
def get_sql(self, **kwargs: Any) -> str: | |||
return self.query.as_query().get_sql(**kwargs) | |||
self.query._choose_db_if_not_chosen() | |||
return self.query._make_query(**kwargs)[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's seems unclear - what exactly are pypika_kwargs and what can be passed there
Because as I am reading code - it uses ambigious **pypika_kwargs
everywhere, even though only parameter passed I see explicitly used - is parametrizer
What else can be there, that is passed further down to query.get_sql
?
Are we able to give more explicit typing to such params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@abondar the things that could be in kwargs: dialect, with_namespace, secondary_quote_char, alias_quote_char, as_keyword, etc. Basically it holds the context when pypika builds SQL.
I'm not fond of this either but this is a legacy of pypika, you can search for get_sql
there to see what I mean. Since Subquery
extends Term
it needs to use this approach. Same for _make_query
in queryset.py
but I guess the issue became more obvious since I renamed kwargs
to pypika_kwargs
there. This PR does not change anything fundamentally about this approach. I think we can improve this by doing something similar as ResolveContext
but I'm not sure that this PR is a good place for it since it's already quite large.
@abondar, I replaced pypika's version, can you approve again? thanks! |
Description
This PR:
.count()
,exists()
,.values()
,.values_list()
). It does NOT implement parametrization of UPDATE and DELETE queries. I'll work on UPDATE and DELETE parametrization as part of a separate PR since this one is already quite big.AwaitableQuery._execute
uses the same SQL as returned byAwaitableQuery.sql()
by changing the signatures of_execute
and_make_query
.This PR is dependent on changes in pypika-tortoise tortoise/pypika-tortoise#16.⚠️ we should merge them first and release a new version of
pypika-tortoise
!Motivation and Context
Parameterized queries are crucial for preventing SQL injection attacks and but also can improve performance of database operations.
How Has This Been Tested?
Checklist: