Fix Merge for issue-363 branch #781

khuston · 2016-03-18T20:56:45Z

Changes made in this pull request:

Add __radd__ method to GroupBase, making 0 + ag = ag which allows sum(ags) instead of ags[0] + ags[1] + ags[2] + ...
Fix one-word docstring typo in TransTable and set attrs=[] if default kwarg attrs=None to Topology() is unchanged by user.
Add optional attribute per_object='atom'|'residue'|'segment' to TopologyAttrs which causes Universe._process_attr on universe generation to check that the length of the value array equals n_atoms, n_residues, or n_segments, respectively, or else it raises ValueError.
Fixes Issue atomgroup_intersection IndexError for empty TopologyGroup #780 where if an empty bondgroup attempted to create a mask for itself, the resulting mask would be invalid and lead to an error.
Fixes Merge for the issue-363 branch
- Merged Universe retains whatever topology attributes are present in all of the input universes.
- Bonds, Angles, Dihedrals, and Impropers are included, if the attributes are present in the input universes.
- All other attributes not present in an empty Topology instance and not in {bonds, angles, dihedrals, impropers} are assumed to be vector-valued topology attributes. Their vector values are concatenated for a new merged TopologyAttr. First a check is made whether the TopologyAttr is subclassed from AtomAttr, ResidueAttr, or SegmentAttr to decide whether atoms.attr, residues.attr, or segments.attr should be concatenated into the new array.
- Atoms, residues, and segments are re-indexed. Atom index order is the same as the user-given order. Residues and segments from different arguments to Merge are given different indices. It's assumed if the user wanted a single residue/segment from the same pre-merge universe to have a single index in the post-merge universe, the user would just add together the atom groups first instead of providing them as separate arguments to Merge.

Original text:
This might be premature (I don't know if other planned changes will break this), but I found myself wanting to both use the LAMMPS data writer and mda.Merge for my own work. I have only tested a little bit, but I just thought I'd put it up here.

I also changed the default argument for attrs in core.Topology() beacuse
the default argument of None caused an error, whereas using an empty
list instead will return a blank Topology.

PR Checklist

Tests?
Docs?
CHANGELOG updated?
Issue raised/referenced?

Rewrote mda.Merge to work for new topology style. I also changed the default argument for attrs in core.Topology() beacuse the default argument of None caused an error, whereas using an empty list instead will return a blank Topology.

kain88-de · 2016-03-18T21:02:13Z

package/MDAnalysis/core/topology.py

@@ -315,7 +315,7 @@ class Topology(object):
    """

    def __init__(self, n_atoms=1, n_res=1, n_seg=1,
-                 attrs=None,
+                 attrs=[],
                 atom_resindex=None,


No mutable keyword arguments are a bad idea. Google can tell you why.

https://www.codementor.io/python/tutorial/essential-python-interview-questions

questions 6. The others are also a good read for people new to python.

Thanks for that. Definitely good to know -- I'll have to review the other points too at some point

Have a look at the Quantified Code report. It flagged this issue and provides links to explanations.

richardjgowers · 2016-03-18T21:10:45Z

I don't think there's any more huge changes planned, so yes it's the right time to go around fixing all the things that got broke in the name of progress!

Max is correct, I think the idiom you want is..

def a(b=None):
    if b is None:
        b = []

Otherwise you get very strange bugs

orbeckst · 2016-03-18T22:50:08Z

@richardjgowers you're pretty much the only one qualified to review... btw, I added a bright 363 label, just for you and @dotsdl ;-)

orbeckst · 2016-03-18T23:01:31Z

(sorry, the 363 label was overkill, just using the milestone instead.)

My first attempt at collecting atom groups for the same universe was flawed if args contained more than one atom group from the same universe. Hopefully this one is correct. I also override __rsum__ on AtomGroups to allow use of sum.

richardjgowers · 2016-03-19T10:04:04Z

package/MDAnalysis/core/universe.py

+        u._topology.n_segments), u)
+
+    # Update Universe namespace with segids
+    for seg in u.segments:


So with things like this, it's much better to move the original version into a method (something like _make_segid_namespace), and then call that method from the two places it gets called from. Then if we change how this works it gets updated everywhere

Since lines 184-209 and 646-671 are identical, I could put them in a universe._generate_from_topology().

I changed my previously proposed mda.Merge to align more closely with the old Merge behavior, in that it preserves the atom order as given to Merge. Given the introduction of residue and segment indices (distinct from ids), the behavior of this new Merge assigns separate residue and segment indices to each argument of Merge. That is, if two arguments to merge are atomgroups from the same universe with the same residue indices, in the post-merge universe, they will have different residue indices. If the user doesn't want this, s/he should add together the atom groups before merging. My understanding is that resid is just an attribute of a residue, whereas the residue index is the unique identifier of that residue instance. I didn't think it would make much sense for Merge to duplicate atoms by giving them different indices while keeping their residue indices the same, such that you have doubled or tripled (etc.) the number of atoms in a residue or segment. Before I was sending the per-atom segment indices to Topology instead of the per-residue segment indices. Topology.__init__() or TransTable.__init__() should probably check that the correct length index arrays are passed on. I added an attribute to the TopologyAttrs called `per_object` where the value of this attribute is 'atom', 'residue', or 'segment' if the length of the TopologyAttr value should equal n_atoms, n_residues, or n_segments. The Universe method `_process_attr` now checks if the TopologyAttr has `per_object` attribute, and raises a ValueError of the length of the value array is not as expected. Finally I updated `test_modelling.py`. This actually increased the number of errors raised during testing by 2, because instead of stopping at an error in the first few lines of the test file, it goes on and errors occur for Capping and `atoms.write`.

khuston · 2016-03-22T02:43:45Z

@richardjgowers My latest commit includes the check talked about in issue #703 for attrs that must have length n_atoms, n_residues, or n_segments during _process_attr. The atoms in the output universe are in the same order as they are passed to Merge now. I also updated test_modelling, which is where the Merge tests are. This actually increased the number of test errors by 2, because before the whole file gave a single error when trying to import AtomGroup from the non-existent AtomGroup.py, and now it has 2 errors for "capping" (I don't know what that is) and 1 error because the atoms.write method isn't implemented yet. The Merge tests pass though.

orbeckst · 2016-04-08T17:30:23Z

IIRC, the "capping" tests were introduced by @jandom as an example for how to change the termini of a protein peptide with specific chemical groups.

Fix Merge for issue-363 branch

36bbbc2

Rewrote mda.Merge to work for new topology style. I also changed the default argument for attrs in core.Topology() beacuse the default argument of None caused an error, whereas using an empty list instead will return a blank Topology.

kain88-de reviewed Mar 18, 2016
View reviewed changes

orbeckst assigned richardjgowers Mar 18, 2016

orbeckst added the issue-363 (topology refactor branch) label Mar 18, 2016

orbeckst removed the issue-363 (topology refactor branch) label Mar 18, 2016

orbeckst added this to the Topology refactor milestone Mar 18, 2016

khuston added 2 commits March 18, 2016 20:50

fix atomgrps for same universe, enable sum

f34db5a

My first attempt at collecting atom groups for the same universe was flawed if args contained more than one atom group from the same universe. Hopefully this one is correct. I also override __rsum__ on AtomGroups to allow use of sum.

cleanup and describe which topologyattr get Merged

44f39f3

richardjgowers reviewed Mar 19, 2016
View reviewed changes

richardjgowers merged commit d365548 into MDAnalysis:issue-363 Apr 10, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix Merge for issue-363 branch #781

Fix Merge for issue-363 branch #781

khuston commented Mar 18, 2016

kain88-de Mar 18, 2016

kain88-de Mar 18, 2016

khuston Mar 18, 2016

orbeckst Mar 18, 2016

richardjgowers commented Mar 18, 2016

orbeckst commented Mar 18, 2016

orbeckst commented Mar 18, 2016

richardjgowers Mar 19, 2016

khuston Mar 19, 2016

khuston commented Mar 22, 2016

orbeckst commented Apr 8, 2016

Fix Merge for issue-363 branch #781

Fix Merge for issue-363 branch #781

Conversation

khuston commented Mar 18, 2016

PR Checklist

kain88-de Mar 18, 2016

Choose a reason for hiding this comment

kain88-de Mar 18, 2016

Choose a reason for hiding this comment

khuston Mar 18, 2016

Choose a reason for hiding this comment

orbeckst Mar 18, 2016

Choose a reason for hiding this comment

richardjgowers commented Mar 18, 2016

orbeckst commented Mar 18, 2016

orbeckst commented Mar 18, 2016

richardjgowers Mar 19, 2016

Choose a reason for hiding this comment

khuston Mar 19, 2016

Choose a reason for hiding this comment

khuston commented Mar 22, 2016

orbeckst commented Apr 8, 2016