Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrected the graph export set expansion rules that are described at … #2632

Merged
merged 1 commit into from
Apr 5, 2019

Conversation

szoupanos
Copy link
Contributor

…issue 1102

#1102

There were problems when overriding the default behaviour of the rules. Moreover
they were moved to separate methods to resuce the complexity of the export method.

Various tests were added:

  • Export tests that are based on a real workchain were added to ensure that a complex
    and real AiiDA graph based on workchains is exported correctly.
  • Export tests based on a synthetic graph were added that explicitly check every
    flag that changes the default behaviour of the export set expansion rules.

@szoupanos szoupanos requested review from ltalirz and sphuber March 22, 2019 18:27
@szoupanos szoupanos force-pushed the issue_2586_v012 branch 3 times, most recently from 5e08a47 to 163baf6 Compare March 24, 2019 11:46
@ltalirz
Copy link
Member

ltalirz commented Mar 24, 2019

@danieleongari has now done a first test

Here is what we noticed:

  • printing out the selected rules (very helpful!) made us realize that the current implementation of the -C option is very counterintuitive: while all other options turn the corresponding rules on, -C turns the rule off
  -C, --create-reversed           Follow reverse CREATE links (recursively)
                                  when calculating the node set to export.
                                  [default: True]

I think the only proper way of resolving this is to have both --create-reversed and --no-create-reversed options (same for all other options).

  • When turning on all possible rules (-RIX), the export got stuck (we canceled after a few minutes).
    I wonder what it was looking for - would it have tried to take the whole connected subset of the graph? (The DB contains 194125 nodes in total)

  • The options -RXnow seems to produce what we were looking for originally :-)
    We should consider making this the default.

  • Printing uuids instead of PKs for verdi graph generate is a good idea - however

    • it is important to use a shortened version of the uuid (see graph.pdf; to be compared to the original graph - the full uuids make the blobs very wide). There should be tools that tell you how much of the uuid you need to be unique (like on github)
    • it should be possible to switch back to PKs if desired

@ltalirz
Copy link
Member

ltalirz commented Mar 24, 2019

Concerning which subset of the uuid to show: git uses 7 characters for the commit hash (unless conflicts are discovered); we could use 8 characters (the first part of the uuid).

Subsets of uuids are already supported in 0.12, e.g. you can do
verdi node show -u ea70cc12

@szoupanos
Copy link
Contributor Author

szoupanos commented Mar 24, 2019

Thanks a lot Leopold for the testing and the feedback!

Due to time constrains (and for keeping things simple), I would keep this PR to the absolutely necessary to make the export rules work correctly (the UUIDs at the graph generate were left there by accident because they were useful for debugging - but if we all like it, I can keep a shorter UUID version).

The discussion on overriding the arguments can be added at another ticket and the implementation at another PR.

@ltalirz
Copy link
Member

ltalirz commented Mar 24, 2019

Due to time constrains (and for keeping things simple), I would keep this PR to the absolutely necessary to make the export rules work correctly

Sounds good to me. Let's do both the uuid stuff and the changes to options in a separate PR (please remove the uuid changes from this one).

I will review the code changes this evening.

@ltalirz
Copy link
Member

ltalirz commented Mar 24, 2019

However, please keep the printing of selected rules in this PR (perhaps add a print "Rules on following links:" line before)

@szoupanos
Copy link
Contributor Author

The addition of the UUIDs at the graph will be addressed at issue #2636

@szoupanos
Copy link
Contributor Author

Further discussion on the export set expansion rules can be done at #2637

@szoupanos
Copy link
Contributor Author

For me it is OK.

  • Switched back to pks for the AiiDA graph
  • Created the necessary tickets for the AiiDA graph (with UUIDS) and the default behaviour of the export rules

(INPUT, CREATE, RETURN and CALL)
"""

test_case = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these be class variables?
if they are just instance variables, you can remove these two lines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to keep them for clarity - I don't like the approach of using undefined variables even if Python allows it - I find it dirty

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I will document them though

@@ -1481,6 +1483,175 @@ def get_all_node_links(self):
edge_project=['label', 'type'], output_of='input')
return qb.all()

class ComplexGraph:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file already has >1k lines
could you please move the ComplexGraph out into a separate file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose this to be done under another PR. And mainly to 1.0
There is already one created by Casper (#2702)

wc2 = WorkCalculation().store()

pw1 = JobCalculation()
pw1.set_computer(self.test_case.computer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you use the test_case just to get the computer, correct?
you use the test_case just for the computer, correct?
can you just pass the computer to the constructor instead?

this is more explicit, while passing the test case creates a loop where the test case creates the ComplexGraph, which again has a reference to the test case, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I will do that.
I passed the test_case since I wanted to be more generic.
But it is cleaner what you propose.

@ltalirz
Copy link
Member

ltalirz commented Mar 29, 2019

I have to stop the review now, will continue later

@szoupanos
Copy link
Contributor Author

I addressed some of the comments. I would propose to merge this and apply the same changes to 1.0 (since it is important to have correct export rules) and leave the further refactoring to be done under another ticket (since there is one - and Casper has already started working on it).

…s. Moreover

they were moved to separate methods to resuce the complexity of the export method.

Various tests were added:
- Export tests that are based on a real workchain were added to ensure that a complex
  and real AiiDA graph based on workchains is exported correctly.
- Export tests based on a synthetic graph were added that explicitly check every
  flag that changes the default behaviour of the export set expansion rules.
@szoupanos
Copy link
Contributor Author

@ltalirz @sphuber
Can this be merged please?

@ltalirz
Copy link
Member

ltalirz commented Apr 4, 2019 via email

@ltalirz ltalirz self-requested a review April 4, 2019 23:39
Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @szoupanos, looks good to me - I've mainly tested functionality rather than looking at the code since you anyhow don't want to refactor here.

One remark: As mentioned before, putting just one node with -RIX, i.e.

('input_forward: ', True)
('create_reversed: ', True)
('return_reversed: ', True)
('call_reversed: ', True)

currently seems to return basically the whole subset of a graph that is connected to the node.
I guess this is correct, since (according to Giovanni), output links must always be followed.

The recursive algorithm used here seems a bit too slow to do this in practice for a large number of connected nodes (say 10-100k), however, so we should just keep this in mind.

@szoupanos
Copy link
Contributor Author

Thanks a lot Leo!
Yes, with these flags on, according to the following rules the full sub-graph that is connected to the node should be returned.
#1102 (comment)

Regarding the speed of the graph exploration algorithm, I think that it is a simple graph traversal algorithm
https://en.wikipedia.org/wiki/Graph_traversal

If you find a case that you think that it is particularly slow, we can have a look at it (in case there is a bug etc).

@szoupanos szoupanos merged commit aaa8c68 into aiidateam:release_v0.12.4 Apr 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants