Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Printing advanced querybuilder objects does not work #4659

Closed
giovannipizzi opened this issue Jan 13, 2021 · 1 comment · Fixed by #5081
Closed

Printing advanced querybuilder objects does not work #4659

giovannipizzi opened this issue Jan 13, 2021 · 1 comment · Fixed by #5081

Comments

@giovannipizzi
Copy link
Member

giovannipizzi commented Jan 13, 2021

If I instantiate this query builder object:

qb = QueryBuilder().append(DataFactory('structure'), project='id', filters={'extras.optimade.elements': {'contains': ['Si']}})

and then I print it (print(qb)), I get an exception:

NotImplementedError: Don't know how to literal-quote value ['Si']

By taking inspiration from this page, I adapted a bit (for JSONB rather than array, and with a fix for strings:

from datetime import date, datetime, timedelta
from sqlalchemy.orm import Query

def render_query(statement, db_session):
    """
    Generate an SQL expression string with bound parameters rendered inline
    for the given SQLAlchemy statement.
    WARNING: This method of escaping is insecure, incomplete, and for debugging
    purposes only. Executing SQL statements with inline-rendered user values is
    extremely insecure.
    Based on http://stackoverflow.com/questions/5631078/sqlalchemy-print-the-actual-query
    """
    if isinstance(statement, Query):
        statement = statement.statement
    dialect = db_session.bind.dialect

    class LiteralCompiler(dialect.statement_compiler):
        def visit_bindparam(
            self, bindparam, within_columns_clause=False, literal_binds=False, **kwargs
        ):
            return self.render_literal_value(bindparam.value, bindparam.type)

        def render_jsonb_value(self, val, item_type):
            if isinstance(val, list):
                return "{}".format(
                    ",".join([self.render_jsonb_value(x, item_type) for x in val])
                )
            if isinstance(val, str):
                return '"{}"'.format(val)
            return self.render_literal_value(val, item_type)

        def render_literal_value(self, value, type_):
            if isinstance(value, int):
                return str(value)
            elif isinstance(value, (str, date, datetime, timedelta)):
                return "'{}'".format(str(value).replace("'", "''"))
            elif isinstance(value, list):
                return "'[{}]'".format(
                    ",".join(
                        [self.render_jsonb_value(x, type_) for x in value]
                    )
                )
            return super(LiteralCompiler, self).render_literal_value(value, type_)

    return LiteralCompiler(dialect, statement).process(statement)

This now works:

q = qb.get_query()
print(render_query(q, q.session))

returns

SELECT db_dbnode_1.id 
FROM db_dbnode AS db_dbnode_1 
WHERE CAST(db_dbnode_1.node_type AS VARCHAR) LIKE 'data.structure.%' AND CAST((db_dbnode_1.extras #> '{optimade, elements}') AS JSONB) @> '["Si"]'

This is still probably not complete (we should check the other base types as well as dictionaries) but is a good start.
Should we implement this (or an extension of it) in the method returning the __str__ representation of a QueryBuilder?

Pinging @ramirezfranciscof (who maybe could implement this?) and @CasperWA as this is related to his OPTIMADE-related queries

@giovannipizzi
Copy link
Member Author

By the way, this also improves the rendering of other JSONB-related queries. E.g. for this query:

qb = QueryBuilder().append(CalcJobNode, filters={'attributes.last_job_info.wallclock_time_seconds': {'<': 400}})

Compare print(qb):

SELECT db_dbnode_1.id, db_dbnode_1.uuid, db_dbnode_1.node_type, db_dbnode_1.process_type, db_dbnode_1.label, db_dbnode_1.description, db_dbnode_1.ctime, db_dbnode_1.mtime, db_dbnode_1.attributes, db_dbnode_1.extras, db_dbnode_1.user_id, db_dbnode_1.dbcomputer_id 
FROM db_dbnode AS db_dbnode_1 
WHERE CAST(db_dbnode_1.node_type AS VARCHAR) LIKE 'process.calculation.calcjob.%%' AND CASE WHEN (jsonb_typeof(db_dbnode_1.attributes #> %(attributes_1)s) = 'number') THEN CAST((db_dbnode_1.attributes #>> '{last_job_info, wallclock_time_seconds}') AS FLOAT) < 400 ELSE false END

with q = qb.get_query(); print(render_query(q, q.session)):

SELECT db_dbnode_1.id, db_dbnode_1.uuid, db_dbnode_1.node_type, db_dbnode_1.process_type, db_dbnode_1.label, db_dbnode_1.description, db_dbnode_1.ctime, db_dbnode_1.mtime, db_dbnode_1.attributes, db_dbnode_1.extras, db_dbnode_1.user_id, db_dbnode_1.dbcomputer_id 
FROM db_dbnode AS db_dbnode_1 
WHERE CAST(db_dbnode_1.node_type AS VARCHAR) LIKE 'process.calculation.calcjob.%' AND CASE WHEN (jsonb_typeof(db_dbnode_1.attributes #> '{last_job_info, wallclock_time_seconds}') = 'number') THEN CAST((db_dbnode_1.attributes #>> '{last_job_info, wallclock_time_seconds}') AS FLOAT) < 400 ELSE False END

In particular note how #> %(attributes_1)s) is left in the print statement, while it's correctly replaced with #> '{last_job_info, wallclock_time_seconds}') below.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant