Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solution for the fast node addition to group for v0.12 #2471

Merged
merged 2 commits into from
Feb 22, 2019

Conversation

szoupanos
Copy link
Contributor

This PR solves issue #1319 and provides a fast and non-ORM way to add nodes to a group. Import/export and more specifically import under SQLA should benefit from this.

@ltalirz Could you, please, verify that it works as expected at one of your databases (the corresponding tests pass)?

@giovannipizzi We should decide which version we will use at aiida/orm/implementation/sqlalchemy/group.py
I currently use the RAW SQLA query which is the fastest. I commented out the version that uses SQLA to construct the query since it needs SQLA 1.1+ and this is not supported at that version of AiiDA. Let's decide if it is worth it to make AiiDA compatible with SQLA 1.1+ or to keep the raw SQL version for 0.12

If we keep the raw SQL approach, maybe, we should do the group addition in batches and launch several SQL queries - I am not sure if I can add 1M of node ids in one INSERT SQL.

@coveralls
Copy link

coveralls commented Feb 11, 2019

Coverage Status

Coverage increased (+7.9%) to 56.073% when pulling b14e551 on szoupanos:issue_1319_for_v0.12 into bbfc73c on aiidateam:release_v0.12.3.

@szoupanos
Copy link
Contributor Author

Also, @sphuber let me know how you think that the flag (to skip the ORM) for the group node addition should be passed to the Backend layer at the provenance redesign.

E.g. Maybe we could make it only available to the backend and when someone wants to use it, should use this interface?

We can discuss in person when you have time.

@giovannipizzi
Copy link
Member

I think it's ok to use raw SQL for 0.12.3, but we should use the commented version for 1.0.
Maybe also good to briefly discuss the interface before merging, as long as it does not take too long (if we manage to have the same interface in 0.12.3 and in 1.0 it's better, otherwise people will have to do things in two different ways).

@ltalirz
Copy link
Member

ltalirz commented Feb 12, 2019

@szoupanos
I tested the following:

  • adding existing nodes to a group: now definitely fast enough for me, thanks a lot!
  • exporting a group with 5800 ParameterData nodes (dicts are empty): 76.4s on my machine.
    That's not great but acceptable (btw perhaps check that this isn't using group additions with the orm)
  • importing an export file containing 5800 ParameterData nodes (5800_parameterdatas.zip): I stopped waiting after 5 minutes
    EDIT: I need to mention that adding the nodes + attributes took about 15s, and I was waiting for 5 minutes at STORING GROUP ELEMENTS...

In conclusion: I'm very happy with the improvement concerning adding nodes to groups.
The import problem remains unsolved. You can either try to address it in this PR or open a separate one for it (let me know) but this is very important to address.

@sphuber
Copy link
Contributor

sphuber commented Feb 12, 2019

@ltalirz Does your last bullet point mean that the biggest part of the import is spent into adding the already imported nodes to a group? Does the add_nodes call in the import just not use the new skip_orm flag or is there something else that is slowing it down?

@ltalirz
Copy link
Member

ltalirz commented Feb 12, 2019

I think that's for @szoupanos to answer...

@szoupanos
Copy link
Contributor Author

I didn't touch the export and it is a common function for both backends (it's worth to see if there are any performance differences between the backends and if there are, if they are significant - but this is a different discussion).

The import of SQLA uses (or at least it should use the new code). Let me have a close look...

@szoupanos
Copy link
Contributor Author

I missed to pass the flag to one of the group.add_nodes.

The current timings for the import of the given export file:

For Django:

real	0m14.009s
user	0m7.374s
sys	0m2.030s

For SQLA:

real	0m27.320s
user	0m17.540s
sys	0m2.720s

(aiidapy2) aiida@ubuntu_vm1_new:~/leo_import_export_exp$ verdi -p issue_1759_dj group list -A
  PK  GroupName      NumNodes  User
----  -----------  ----------  ----------------------
   1  test-group         5881  leopold.talirz@epfl.ch

(aiidapy2) aiida@ubuntu_vm1_new:~/leo_import_export_exp$ verdi -p issue_1759_sqla group list -A
  PK  GroupName      NumNodes  User
----  -----------  ----------  ----------------------
   1  test-group         5881  leopold.talirz@epfl.ch

@ltalirz
Copy link
Member

ltalirz commented Feb 12, 2019

That would be great - please push the fix and I will confirm

@szoupanos szoupanos force-pushed the issue_1319_for_v0.12 branch from c45289c to 5449949 Compare February 12, 2019 14:36
@szoupanos
Copy link
Contributor Author

If we are all happy for 0.12 with this fix (RAW SQL). I would like to clean a bit the group.add_nodes code.

And for the export, it is as I expected (I exported the imported group for every backend):

SQLA

real	1m37.500s
user	1m24.828s
sys	0m2.518s

Django

real	1m36.823s
user	1m22.461s
sys	0m2.602s

@ltalirz
Copy link
Member

ltalirz commented Feb 12, 2019

If we are all happy for 0.12 with this fix (RAW SQL).

I think we are.

And for the export, it is as I expected

As mentioned before, I think this is acceptable for the moment, and there are other, more important problems in sqla [1]. If the export does not involve adding nodes to groups (I guess it doesn't), we should just leave it as it is.

[1] E.g. the following code still scales super-linearly with the number of nodes (I think this goes back to the expire_on_commit problem):

from aiida.orm.data.parameter import ParameterData
N = 10000
node_list = [ParameterData(dict={}).store() for i in range(N)]

But this can be fixed in a separate PR.

@giovannipizzi
Copy link
Member

Also, I would leave, commented, the code using the SQLA ORM, with a comment note saying why we are using raw SQL here (i.e. the version of SQLA is too old and this is fixed in 1.0)

@szoupanos
Copy link
Contributor Author

szoupanos commented Feb 12, 2019

OK, I will do the necessary cleaning and I will update the PR.

@ltalirz
Regarding, export we can discuss, in person, more in order to understand: E.g. if the export time for both backends is identical, I don't see an issue for SQLA, or something to be improved. Maybe we can improve the efficiency of the export code that will affect both backends but this is a different discussion.

Yes, [1] should be related to the expire_on_commit. I think that it is worth seeing the performance of [1] in provenance_redesign with expire_on_commit=True/False and vs Django and then depending on the available time, developments (Django JSONB) and importance of [1], re-discuss.

… the group addition procedure and import method of SQLA should benefit from it
@szoupanos szoupanos force-pushed the issue_1319_for_v0.12 branch from 5449949 to 14c135b Compare February 12, 2019 17:59
@ltalirz ltalirz self-requested a review February 12, 2019 21:06
Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested it and confirm that the verdi import problem is resolved - yay!

Good to merge for me.

@sphuber @giovannipizzi any last comments?

@giovannipizzi
Copy link
Member

Just going back to the new migration here - I was wondering, if we insert only here, since it's not in 1.0.0, I have the feeling that it means that a person going to 0.12.3 cannot then migrate to 1.0? (also, I think here we are not putting all the logic that we have in 1.0 to deduplicate UUID, if this happens). What would be the timing without the migration, would this be worse?

@sphuber
Copy link
Contributor

sphuber commented Feb 13, 2019

Yeah I would be very careful with this. We might be breaking the migration for people who want to go from 0.12* to 1.0.0. If it is not necessary we probably shouldn't, but if we do, we should really check how this would work after we have to merge 0.12* into develop and how people can migrate to 1.0.0

@szoupanos
Copy link
Contributor Author

The migration adds the constraint to the database, a constraint that it is used by the query to avoid duplicate check at the python level. This constraint is also added to the models.

So if we keep it at the models and not in the database, we will have inconsistencies and tests would fail (there are tests in SQLA that check that).

If I am not mistaken, Martin had added this constraint long time ago "somewhere" and maybe it is already at 1.0.*
If it is the case, maybe, we could just copy paste the SQLA migrations until that point to 0.12.*, if they are just small fixes - but of course this needs a closer (and a careful) look

@sphuber
Copy link
Contributor

sphuber commented Feb 13, 2019

Yes it is in 1.0.0. Was added in this commit

@szoupanos
Copy link
Contributor Author

So I did some tests related to the discussion that we had and as far the unique constraint is named differently in the two migartions, everything works fine - but, of course you end up with 2 constraints with the same name.

E.g.

7a6587e16f4c is the migration of this PR
59edaf8a8b79 is the conflicting migration of develop/provenance_redesign

Revision history

7a6587e16f4c -> 59edaf8a8b79 (head), Adding indexes and constraints to the dbnode-dbgroup relationship table
35d4ee9a1b0e -> 7a6587e16f4c, Unique constraints for the db_dbgroup_dbnodes table
89176227b25 -> 35d4ee9a1b0e, Migrating 'hidden' properties from DbAttribute to DbExtra for code.Code. nodes
a6048f0ffca8 -> 89176227b25, Add indexes to dbworkflowdata table
70c7d732f1b2 -> a6048f0ffca8, Updating link types - This is a copy of the Django migration script
e15ef2630a1b -> 70c7d732f1b2, Deleting dbpath table and triggers
<base> -> e15ef2630a1b, Initial schema
aiidadb_issue_1759_sqla=# \d db_dbgroup_dbnodes
                              Table "public.db_dbgroup_dbnodes"
   Column   |  Type   | Collation | Nullable |                    Default
------------+---------+-----------+----------+------------------------------------------------
 id         | integer |           | not null | nextval('db_dbgroup_dbnodes_id_seq'::regclass)
 dbnode_id  | integer |           |          |
 dbgroup_id | integer |           |          |
Indexes:
    "db_dbgroup_dbnodes_pkey" PRIMARY KEY, btree (id)
    "db_dbgroup_dbnodes_dbgroup_id_dbnode_id_key" UNIQUE CONSTRAINT, btree (dbgroup_id, dbnode_id)
    "uix_dbnode_id_dbgroup_id" UNIQUE CONSTRAINT, btree (dbnode_id, dbgroup_id)
    "db_dbgroup_dbnodes_dbgroup_id_idx" btree (dbgroup_id)
    "db_dbgroup_dbnodes_dbnode_id_idx" btree (dbnode_id)
Foreign-key constraints:
    "db_dbgroup_dbnodes_dbgroup_id_fkey" FOREIGN KEY (dbgroup_id) REFERENCES db_dbgroup(id) DEFERRABLE INITIALLY DEFERRED
    "db_dbgroup_dbnodes_dbnode_id_fkey" FOREIGN KEY (dbnode_id) REFERENCES db_dbnode(id) DEFERRABLE INITIALLY DEFERRED

As soon as you use the same name, then you have collisions during the migration, and Alembic complains that there is already a constraint with the same name).

So what I would propose is to silently add the current migration (7a6587e16f4c) to develop and provenance_redesign at the right place (migration order) and then modify the migration 59edaf8a8b79 to also drop the old constraint.

We can also add a check to 59edaf8a8b79, if possible, to verify that the old constraint is there before dropping it. This will cover the users that use develop, stopped at a version just before 59edaf8a8b79 and try to apply 59edaf8a8b79 without having applied 7a6587e16f4c - this is a very rare case scenario but OK...

Let me know

@sphuber
Copy link
Contributor

sphuber commented Feb 13, 2019

So you would insert 59edaf8a8b79 in develop just before 7a6587e16f4c and then change the latter to first drop the constraints that will be added by 59edaf8a8b79, is that correct? Alembic only keeps the current revision of a database, and not some history of revisions it took to arrive at the current state right? If it doesn't then this might work, but let's involve @giovannipizzi as well

@szoupanos
Copy link
Contributor Author

szoupanos commented Feb 13, 2019

I would insert 7a6587e16f4c just after 35d4ee9a1b0e (as it should be for 0.12) and just before 0aebbeab274d of develop/provenance_redesign that currently has 35d4ee9a1b0e as a previous revision.

@szoupanos
Copy link
Contributor Author

59edaf8a8b79 will stay at its place, a few migrations later in develop/provenance redesign, and it will be modified a bit to also remove the constraint of 7a6587e16f4c if it is there before adding it with another name.

@szoupanos
Copy link
Contributor Author

I checked alembic migrations and even if they support merging, my understanding is that it will not cover our case
https://alembic.sqlalchemy.org/en/latest/branches.html#branch-dependencies

                            -- ae1027a6acf -->
                           /                   \
<base> --> 1975ea83b712 -->                      --> mergepoint
                           \                   /
                            -- 27c6a30d7c24 -->

We can have multiple branches but at the mergepoint all the migrations (from both branches) are applied.

Let me know if I missed something.

I will proceed with the solution that we discussed in person the previous days.

@szoupanos szoupanos changed the title [Don't merge] Solution for the fast node addition to group for v0.12 Solution for the fast node addition to group for v0.12 Feb 21, 2019
@szoupanos
Copy link
Contributor Author

The complementary PR for provenance_redesign/develop is the #2514

@ltalirz
Copy link
Member

ltalirz commented Feb 21, 2019

Is it ready to merge?
Do you want @giovannipizzi to give a final look?

@szoupanos
Copy link
Contributor Author

@ltalirz @giovannipizzi @sphuber
For me it is OK to proceed (I had a second look today).
Please merge #2514 first (to provenance_redesign) and then this one to release_v0.12.3

@szoupanos
Copy link
Contributor Author

I missed to pass the flag to one of the group.add_nodes.

The current timings for the import of the given export file:

For Django:

real	0m14.009s
user	0m7.374s
sys	0m2.030s

For SQLA:

real	0m27.320s
user	0m17.540s
sys	0m2.720s

@ltalirz
And just for the history the old import time for SQLA for the given dataset was 27 mins (the one which dropped to 27 secs).
(I wanted to see the old performance and I waited until the end of the import)

@sphuber
Copy link
Contributor

sphuber commented Feb 22, 2019

I merged PR #2514 for the backport of the migration included in this PR, so whenever this is ready, I guess it is good to be merged

@ltalirz ltalirz merged commit 28eea8c into aiidateam:release_v0.12.3 Feb 22, 2019
@szoupanos szoupanos deleted the issue_1319_for_v0.12 branch December 12, 2019 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants