♻️ REFACTOR: Move to sqlachemy 1.4 (and v2 API) #5069

chrisjsewell · 2021-08-12T10:18:08Z

This PR is intended to update our use of sqlalchemy to the 2.0 API, which is supported by v1.4: https://docs.sqlalchemy.org/en/14/glossary.html#term-1

(note this currently builds on #5063)

chrisjsewell · 2021-08-12T10:23:49Z

In 95985f0, I have added an initial trial migration for the querybuilder.
I have run into an issue though, encapsulated by the added tests/orm/test_querybuilder.py::TestQueryBuilder::test_joined_node_count test:

    def do_execute(self, cursor, statement, parameters, context=None):
>       cursor.execute(statement, parameters)
E       sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) invalid reference to FROM-clause entry for table "db_dbnode_2"
E       LINE 3: ...b_dblink AS db_dblink_1 ON db_dblink_1.input_id = db_dbnode_...
E                                                                    ^
E       HINT:  There is an entry for table "db_dbnode_2", but it cannot be referenced from this part of the query.
E       
E       [SQL: SELECT count(*) AS count_1 
E       FROM (SELECT db_dbnode_1.id AS db_dbnode_1_id, db_dbnode_1.uuid AS db_dbnode_1_uuid, db_dbnode_1.node_type AS db_dbnode_1_node_type, db_dbnode_1.process_type AS db_dbnode_1_process_type, db_dbnode_1.label AS db_dbnode_1_label, db_dbnode_1.description AS db_dbnode_1_description, db_dbnode_1.ctime AS db_dbnode_1_ctime, db_dbnode_1.mtime AS db_dbnode_1_mtime, db_dbnode_1.attributes AS db_dbnode_1_attributes, db_dbnode_1.extras AS db_dbnode_1_extras, db_dbnode_1.repository_metadata AS db_dbnode_1_repository_metadata, db_dbnode_1.user_id AS db_dbnode_1_user_id, db_dbnode_1.dbcomputer_id AS db_dbnode_1_dbcomputer_id 
E       FROM db_dbnode AS db_dbnode_2, db_dbnode AS db_dbnode_1 JOIN db_dblink AS db_dblink_1 ON db_dblink_1.input_id = db_dbnode_2.id JOIN db_dbnode AS db_dbnode_1 ON db_dblink_1.output_id = db_dbnode_1.id 
E       WHERE CAST(db_dbnode_2.node_type AS VARCHAR) LIKE %(param_1)s AND CAST(db_dbnode_1.node_type AS VARCHAR) LIKE %(param_2)s) AS anon_1]
E       [parameters: {'param_1': 'data.%', 'param_2': 'data.%'}]
E       (Background on this error at: https://sqlalche.me/e/14/f405)

.tox/py38-django/lib/python3.8/site-packages/sqlalchemy/engine/default.py:717: ProgrammingError

when comparing the query statement to that output from develop you get:

diff --git a/x_old.txt b/x_new.txt
index 920060325..ba8541f94 100644
--- a/x_old.txt
+++ b/x_new.txt
@@ -1,3 +1,3 @@
 SELECT db_dbnode_1.id AS db_dbnode_1_id, db_dbnode_1.uuid AS db_dbnode_1_uuid, db_dbnode_1.node_type AS db_dbnode_1_node_type, db_dbnode_1.process_type AS db_dbnode_1_process_type, db_dbnode_1.label AS db_dbnode_1_label, db_dbnode_1.description AS db_dbnode_1_description, db_dbnode_1.ctime AS db_dbnode_1_ctime, db_dbnode_1.mtime AS db_dbnode_1_mtime, db_dbnode_1.attributes AS db_dbnode_1_attributes, db_dbnode_1.extras AS db_dbnode_1_extras, db_dbnode_1.repository_metadata AS db_dbnode_1_repository_metadata, db_dbnode_1.user_id AS db_dbnode_1_user_id, db_dbnode_1.dbcomputer_id AS db_dbnode_1_dbcomputer_id 
-FROM db_dbnode AS db_dbnode_2 JOIN db_dblink AS db_dblink_1 ON db_dblink_1.input_id = db_dbnode_2.id JOIN db_dbnode AS db_dbnode_1 ON db_dblink_1.output_id = db_dbnode_1.id 
+FROM db_dbnode AS db_dbnode_2, db_dbnode AS db_dbnode_1 JOIN db_dblink AS db_dblink_1 ON db_dblink_1.input_id = db_dbnode_2.id JOIN db_dbnode AS db_dbnode_1 ON db_dblink_1.output_id = db_dbnode_1.id 
 WHERE CAST(db_dbnode_2.node_type AS VARCHAR) LIKE %(param_1)s AND CAST(db_dbnode_1.node_type AS VARCHAR) LIKE %(param_2)s

You can see that it has added an additional alias db_dbnode_1, which is overriding the db_dbnode_2 alias: db_dbnode AS db_dbnode_2, db_dbnode AS db_dbnode_1

So my first guess, that even though I do self._raw_columns.pop(0) (which is not ideal in the first place), the second column still retains this "record" of what it was joining to

Note there is also a lot of these warnings:

tests/orm/test_querybuilder.py::TestQueryBuilder::test_joined_node_count
  /Users/chrisjsewell/Documents/GitHub/aiida_core_develop/aiida/orm/querybuilder.py:752: SAWarning: relationship 'Group_permissions.group' will copy column auth_group.id to column auth_group_permissions.group_id, which conflicts with relationship(s): 'Group.permissions' (copies auth_group.id to auth_group_permissions.group_id). If this is not the intention, consider if these relationships should be linked with back_populates, or if viewonly=True should be applied to one or more if they are read-only. For the less common case that foreign key constraints are partially overlapping, the orm.foreign() annotation can be used to isolate the columns that should be written towards.   To silence this warning, add the parameter 'overlaps="permissions"' to the 'Group_permissions.group' relationship.
    self.tag_to_alias_map[tag] = aliased(ormclass)

chrisjsewell · 2021-08-12T10:26:59Z

cc @sphuber, in case you tried any of this in #4855

chrisjsewell · 2021-08-12T14:46:23Z

this is the query before/after self._raw_columns.pop(0)

SELECT db_dbnode_1.id, db_dbnode_1.uuid, db_dbnode_1.node_type, db_dbnode_1.process_type, db_dbnode_1.label, db_dbnode_1.description, db_dbnode_1.ctime, db_dbnode_1.mtime, db_dbnode_1.attributes, db_dbnode_1.extras, db_dbnode_1.repository_metadata, db_dbnode_1.user_id, db_dbnode_1.dbcomputer_id, db_dbnode_2.id AS id_1, db_dbnode_2.uuid AS uuid_1, db_dbnode_2.node_type AS node_type_1, db_dbnode_2.process_type AS process_type_1, db_dbnode_2.label AS label_1, db_dbnode_2.description AS description_1, db_dbnode_2.ctime AS ctime_1, db_dbnode_2.mtime AS mtime_1, db_dbnode_2.attributes AS attributes_1, db_dbnode_2.extras AS extras_1, db_dbnode_2.repository_metadata AS repository_metadata_1, db_dbnode_2.user_id AS user_id_1, db_dbnode_2.dbcomputer_id AS dbcomputer_id_1 
FROM db_dbnode AS db_dbnode_1 JOIN db_dblink AS db_dblink_1 ON db_dblink_1.input_id = db_dbnode_1.id JOIN db_dbnode AS db_dbnode_2 ON db_dblink_1.output_id = db_dbnode_2.id 
WHERE CAST(db_dbnode_1.node_type AS VARCHAR) LIKE 'data.%%' AND CAST(db_dbnode_2.node_type AS VARCHAR) LIKE 'data.%%'

SELECT db_dbnode_1.id, db_dbnode_1.uuid, db_dbnode_1.node_type, db_dbnode_1.process_type, db_dbnode_1.label, db_dbnode_1.description, db_dbnode_1.ctime, db_dbnode_1.mtime, db_dbnode_1.attributes, db_dbnode_1.extras, db_dbnode_1.repository_metadata, db_dbnode_1.user_id, db_dbnode_1.dbcomputer_id 
FROM db_dbnode AS db_dbnode_2, db_dbnode AS db_dbnode_1 JOIN db_dblink AS db_dblink_1 ON db_dblink_1.input_id = db_dbnode_2.id JOIN db_dbnode AS db_dbnode_1 ON db_dblink_1.output_id = db_dbnode_1.id 
WHERE CAST(db_dbnode_2.node_type AS VARCHAR) LIKE 'data.%%' AND CAST(db_dbnode_1.node_type AS VARCHAR) LIKE 'data.%%'

sphuber · 2021-08-12T20:18:29Z

cc @sphuber, in case you tried any of this in #4855

I remember looking at this, but mostly that the whole logic around the private _raw_columns attribute had changed. I thought it completely changed structure and remember having to get rid of it completely. Fact that you are still using it seems to suggest I am misremembering though :D

chrisjsewell · 2021-08-12T20:22:19Z

Fact that you are still using it seems to suggest I am misremembering though

Yeh the _raw_columns is new in 1.4, with _entities being roughly its equivalent in 1.3

but, as per the issue above, I don't know if I can actually get it to work, and its obviously not great to be accessing private attributes anyway.
Going to look through the _build method in more detail, and think if there's a better way, rather than having to start hacking into the sqlalchemy internals

chrisjsewell · 2021-08-12T22:54:11Z

chrisjsewell · 2021-08-15T08:16:55Z

Note:

    # Note: according to https://github.com/sqlalchemy/sqlalchemy/commit/0d1efeec475621b5c2c2aca0632b02edef54c1a6
    # automatic uniquing of rows is turned off for the new 2.0 style of ORM querying,
    # so this sqlalchemy < 1.4 hack is no longer necessary
    # self._query._has_mapper_entities = False

giovannipizzi · 2021-08-26T07:54:30Z

For reference, re:

Note: according to sqlalchemy/sqlalchemy@0d1efee
automatic uniquing of rows is turned off for the new 2.0 style of ORM querying,
so this sqlalchemy < 1.4 hack is no longer necessary
self._query._has_mapper_entities = False

This was actually in response to an issue I had opened a lot of time ago: sqlalchemy/sqlalchemy#4395
(our workaround self._query._has_mapper_entities = False was discussed in #1600)

chrisjsewell · 2021-09-08T11:24:17Z

superseded by #5103

chrisjsewell linked an issue Aug 12, 2021 that may be closed by this pull request

Support SQLAlchemy v1.4 #4813

Closed

add initial modification

9a30f33

chrisjsewell force-pushed the sqla-1.4 branch from 95985f0 to 9a30f33 Compare August 15, 2021 07:27

chrisjsewell mentioned this pull request Aug 16, 2021

👌 IMPROVE: QueryBuilder reprs (incl/ queryhelp -> as_dict) #5081

Merged

chrisjsewell mentioned this pull request Aug 27, 2021

⬆️ UPGRADE: sqlalchemy v1.4 (v1 API) #5103

Merged

chrisjsewell closed this Sep 8, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

♻️ REFACTOR: Move to sqlachemy 1.4 (and v2 API) #5069

♻️ REFACTOR: Move to sqlachemy 1.4 (and v2 API) #5069

chrisjsewell commented Aug 12, 2021

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Aug 12, 2021

chrisjsewell commented Aug 12, 2021 •

edited

Loading

sphuber commented Aug 12, 2021

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Aug 15, 2021

giovannipizzi commented Aug 26, 2021

chrisjsewell commented Sep 8, 2021 •

edited

Loading

♻️ REFACTOR: Move to sqlachemy 1.4 (and v2 API) #5069

♻️ REFACTOR: Move to sqlachemy 1.4 (and v2 API) #5069

Conversation

chrisjsewell commented Aug 12, 2021

chrisjsewell commented Aug 12, 2021 • edited Loading

chrisjsewell commented Aug 12, 2021

chrisjsewell commented Aug 12, 2021 • edited Loading

sphuber commented Aug 12, 2021

chrisjsewell commented Aug 12, 2021 • edited Loading

chrisjsewell commented Aug 12, 2021 • edited Loading

chrisjsewell commented Aug 15, 2021

giovannipizzi commented Aug 26, 2021

chrisjsewell commented Sep 8, 2021 • edited Loading

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Aug 12, 2021 •

edited

Loading

chrisjsewell commented Sep 8, 2021 •

edited

Loading