Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Merge for issue-363 branch #781

Merged
merged 4 commits into from
Apr 10, 2016
Merged

Fix Merge for issue-363 branch #781

merged 4 commits into from
Apr 10, 2016

Conversation

khuston
Copy link
Contributor

@khuston khuston commented Mar 18, 2016

Changes made in this pull request:

  • Add __radd__ method to GroupBase, making 0 + ag = ag which allows sum(ags) instead of ags[0] + ags[1] + ags[2] + ...
  • Fix one-word docstring typo in TransTable and set attrs=[] if default kwarg attrs=None to Topology() is unchanged by user.
  • Add optional attribute per_object='atom'|'residue'|'segment' to TopologyAttrs which causes Universe._process_attr on universe generation to check that the length of the value array equals n_atoms, n_residues, or n_segments, respectively, or else it raises ValueError.
  • Fixes Issue atomgroup_intersection IndexError for empty TopologyGroup #780 where if an empty bondgroup attempted to create a mask for itself, the resulting mask would be invalid and lead to an error.
  • Fixes Merge for the issue-363 branch
    • Merged Universe retains whatever topology attributes are present in all of the input universes.
    • Bonds, Angles, Dihedrals, and Impropers are included, if the attributes are present in the input universes.
    • All other attributes not present in an empty Topology instance and not in {bonds, angles, dihedrals, impropers} are assumed to be vector-valued topology attributes. Their vector values are concatenated for a new merged TopologyAttr. First a check is made whether the TopologyAttr is subclassed from AtomAttr, ResidueAttr, or SegmentAttr to decide whether atoms.attr, residues.attr, or segments.attr should be concatenated into the new array.
    • Atoms, residues, and segments are re-indexed. Atom index order is the same as the user-given order. Residues and segments from different arguments to Merge are given different indices. It's assumed if the user wanted a single residue/segment from the same pre-merge universe to have a single index in the post-merge universe, the user would just add together the atom groups first instead of providing them as separate arguments to Merge.

Original text:
This might be premature (I don't know if other planned changes will break this), but I found myself wanting to both use the LAMMPS data writer and mda.Merge for my own work. I have only tested a little bit, but I just thought I'd put it up here.

I also changed the default argument for attrs in core.Topology() beacuse
the default argument of None caused an error, whereas using an empty
list instead will return a blank Topology.

PR Checklist

  • Tests?
  • Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Rewrote mda.Merge to work for new topology style.

I also changed the default argument for attrs in core.Topology() beacuse
the default argument of None caused an error, whereas using an empty
list instead will return a blank Topology.
@@ -315,7 +315,7 @@ class Topology(object):
"""

def __init__(self, n_atoms=1, n_res=1, n_seg=1,
attrs=None,
attrs=[],
atom_resindex=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No mutable keyword arguments are a bad idea. Google can tell you why.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.codementor.io/python/tutorial/essential-python-interview-questions

questions 6. The others are also a good read for people new to python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for that. Definitely good to know -- I'll have to review the other points too at some point

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have a look at the Quantified Code report. It flagged this issue and provides links to explanations.

@richardjgowers
Copy link
Member

I don't think there's any more huge changes planned, so yes it's the right time to go around fixing all the things that got broke in the name of progress!

Max is correct, I think the idiom you want is..

def a(b=None):
    if b is None:
        b = []

Otherwise you get very strange bugs

@orbeckst
Copy link
Member

@richardjgowers you're pretty much the only one qualified to review... btw, I added a bright 363 label, just for you and @dotsdl ;-)

@orbeckst
Copy link
Member

(sorry, the 363 label was overkill, just using the milestone instead.)

My first attempt at collecting atom groups for the same universe was
flawed if args contained more than one atom group from the same
universe. Hopefully this one is correct.

I also override __rsum__ on AtomGroups to allow use of sum.
u._topology.n_segments), u)

# Update Universe namespace with segids
for seg in u.segments:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So with things like this, it's much better to move the original version into a method (something like _make_segid_namespace), and then call that method from the two places it gets called from. Then if we change how this works it gets updated everywhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since lines 184-209 and 646-671 are identical, I could put them in a universe._generate_from_topology().

I changed my previously proposed mda.Merge to align more closely with
the old Merge behavior, in that it preserves the atom order as given to
Merge. Given the introduction of residue and segment indices (distinct
from ids), the behavior of this new Merge assigns separate residue and
segment indices to each argument of Merge. That is, if two arguments to
merge are atomgroups from the same universe with the same residue
indices, in the post-merge universe, they will have different residue
indices. If the user doesn't want this, s/he should add together the
atom groups before merging.

My understanding is that resid is just an attribute of a residue,
whereas the residue index is the unique identifier of that residue
instance. I didn't think it would make much sense for Merge to duplicate
atoms by giving them different indices while keeping their residue
indices the same, such that you have doubled or tripled (etc.) the
number of atoms in a residue or segment.

Before I was sending the per-atom segment indices to Topology instead of
the per-residue segment indices. Topology.__init__() or
TransTable.__init__() should probably check that the correct length
index arrays are passed on.

I added an attribute to the TopologyAttrs called `per_object` where the
value of this attribute is 'atom', 'residue', or 'segment' if the length
of the TopologyAttr value should equal n_atoms, n_residues, or
n_segments. The Universe method `_process_attr` now checks if the
TopologyAttr has `per_object` attribute, and raises a ValueError of the
length of the value array is not as expected.

Finally I updated `test_modelling.py`. This actually increased the
number of errors raised during testing by 2, because instead of stopping
at an error in the first few lines of the test file, it goes on and
errors occur for Capping and `atoms.write`.
@khuston
Copy link
Contributor Author

khuston commented Mar 22, 2016

@richardjgowers My latest commit includes the check talked about in issue #703 for attrs that must have length n_atoms, n_residues, or n_segments during _process_attr. The atoms in the output universe are in the same order as they are passed to Merge now. I also updated test_modelling, which is where the Merge tests are. This actually increased the number of test errors by 2, because before the whole file gave a single error when trying to import AtomGroup from the non-existent AtomGroup.py, and now it has 2 errors for "capping" (I don't know what that is) and 1 error because the atoms.write method isn't implemented yet. The Merge tests pass though.

@orbeckst
Copy link
Member

orbeckst commented Apr 8, 2016

IIRC, the "capping" tests were introduced by @jandom as an example for how to change the termini of a protein peptide with specific chemical groups.

@richardjgowers richardjgowers merged commit d365548 into MDAnalysis:issue-363 Apr 10, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants