-
Notifications
You must be signed in to change notification settings - Fork 663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Transformations from closure into class #2859
Conversation
Hello @yuxuanzhuang! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-09-06 17:15:25 UTC |
e2bd2ba
to
05f2ddd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good idea but it's important not to be doing validity checks every frame
So for most parts I just moved code around, the only change is trans = PositionAverager(3, check_reset=True)
u = mda.Universe(TPR, XTC)
u.trajectory.add_transformations(trans)
u.trajectory[3]
print(u.atoms.positions[0])
u.trajectory[3]
print(u.atoms.positions[0])
u.trajectory[3]
print(u.atoms.positions[0])
|
Codecov Report
@@ Coverage Diff @@
## develop #2859 +/- ##
========================================
Coverage 93.01% 93.02%
========================================
Files 187 187
Lines 24940 24962 +22
Branches 3261 3261
========================================
+ Hits 23198 23220 +22
Misses 1694 1694
Partials 48 48
Continue to review full report at Codecov.
|
There's a segfault in a pickle transformation test https://travis-ci.com/github/MDAnalysis/mdanalysis/jobs/365340906 – do you have an idea what's happening there? |
There are also test failures related to pickle transformations on WIndows, eg https://dev.azure.com/mdanalysis/mdanalysis/_build/results?buildId=658&view=logs&j=6337f8de-d220-572c-a53d-20c279e44ba9&t=38e5c318-25f5-5156-9fa7-6cd72ffb271c&l=99 |
I am not sure why it's happening, but the reason is that I define a universe (PDB, XTC) at the top level of the file (line 44). I guess some specific version of python/numpy/? is not quite happy with that. =================================== ERRORS ====================================
_ ERROR collecting MDAnalysisTests/parallelism/test_pickle_transformation.py __
MDAnalysisTests\parallelism\test_pickle_transformation.py:44: in <module>
u = mda.Universe(TPR, XTC)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\core\universe.py:386: in __init__
in_memory_step=in_memory_step, **kwargs)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\core\universe.py:578: in load_new
self.trajectory = reader(filename, format=format, **kwargs)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\coordinates\XDR.py:149: in __init__
self._load_offsets()
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\coordinates\XDR.py:194: in _load_offsets
data = read_numpy_offsets(fname)
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\mdanalysis-2.0.0.dev0-py3.7-win-amd64.egg\MDAnalysis\coordinates\XDR.py:84: in read_numpy_offsets
return {k: v for k, v in np.load(filename).items()}
C:\hostedtoolcache\windows\Python\3.7.8\x64\lib\site-packages\numpy\lib\npyio.py:444: in load
raise ValueError("Cannot load file containing pickled data "
E ValueError: Cannot load file containing pickled data when allow_pickle=False After I changed it to DCD, it's fine now. |
Uh, oh, ... I think @IAlibay and @lilyminium have been chasing an elusive segfault that seems to be connected to numpy and xtc. Just pinging so that they can comment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only had time to look at the docs in more detail, see comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please update CHANGELOG and there's a small doc typo still in there, otherwise all good
ping @richardjgowers for re-review |
Met a problem when I was wrapping things up. After we change the behaviour of fit_trans = trans.fit_rot_trans(u.atoms, u.atoms)
u.trajectory.add_transformations(fit_trans)
u_dumped = pickle.dumps(u)
u_p = pickle.loads(u_dumped)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-e66d2dd7c314> in <module>
----> 1 u_p = pickle.loads(pickle.dumps(u))
~/mdanalysis/package/MDAnalysis/core/groups.py in _unpickle(u, ix)
113 def _unpickle(u, ix):
114 print(u.__dict__)
--> 115 return u.atoms[ix]
116
117
AttributeError: 'Universe' object has no attribute 'atoms' Not totally sure the exact reason. Seems that in the new process, transformations (in which have EDIT: My hunch is, for pickling
while |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comments, please.
u_p = pickle.loads(pickle.dumps(u)) | ||
u.trajectory[0] | ||
for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use equality comparison for floats
assert_equal(u_ts.positions, u_p_ts.positions) | |
assert_almost_equal(u_ts.positions, u_p_ts.positions) |
u_p = pickle.loads(pickle.dumps(u)) | ||
u.trajectory[0] | ||
for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use equality comparison for floats
assert_equal(u_ts.positions, u_p_ts.positions) | |
assert_almost_equal(u_ts.positions, u_p_ts.positions) |
u_p = pickle.loads(pickle.dumps(u)) | ||
u.trajectory[0] | ||
for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use equality comparison for floats
assert_equal(u_ts.positions, u_p_ts.positions) | |
assert_almost_equal(u_ts.positions, u_p_ts.positions) |
u_p = pickle.loads(pickle.dumps(u)) | ||
u.trajectory[0] | ||
for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use equality comparison for floats
assert_equal(u_ts.positions, u_p_ts.positions) | |
assert_almost_equal(u_ts.positions, u_p_ts.positions) |
u_p = pickle.loads(pickle.dumps(u)) | ||
u.trajectory[0] | ||
for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use equality comparison for floats
assert_equal(u_ts.positions, u_p_ts.positions) | |
assert_almost_equal(u_ts.positions, u_p_ts.positions) |
u_p = pickle.loads(pickle.dumps(u)) | ||
u.trajectory[0] | ||
for u_ts, u_p_ts in zip(u.trajectory[:5], u_p.trajectory[:5]): | ||
assert_equal(u_ts.positions, u_p_ts.positions) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use equality comparison for floats
assert_equal(u_ts.positions, u_p_ts.positions) | |
assert_almost_equal(u_ts.positions, u_p_ts.positions) |
# | ||
import pytest | ||
import pickle | ||
from numpy.testing import assert_equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use this for float comparisons:
from numpy.testing import assert_equal | |
from numpy.testing import assert_almost_equal |
# apply changes to the Timestep object | ||
return ts | ||
|
||
See `MDAnalysis.transformations.translate` for a simple example. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above is not a valid sphinx link.
See `MDAnalysis.transformations.translate` for a simple example. | |
See :mod:`MDAnalysis.transformations.translate` for a simple example. |
are often required for some analyses and visualization, and the functions in | ||
this module allow transformations to be applied on-the-fly. | ||
|
||
A typical transformation class looks like this: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A typical transformation class looks like this: | |
A typical transformation class looks like this (note that we keep its name | |
lowercase because we will treat it as a function, thanks to the ``__call__`` | |
method): |
# or modify an AtomGroup and return Timestep | ||
|
||
return ts | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You showed the abstract class. Now show a concrete example to address @richardjgowers comment. For instance
As a concrete example we will write a transformation that rotates a group of atoms around | |
the z-axis through the center of geometry by a fixed increment for every time step. We will use | |
:meth:`MDAnalysis.core.groups.AtomGroup.rotateby` and simply increase the rotation angle | |
every time the transformation is called:: | |
class spin_atoms(object): | |
def __init__(self, atoms, dphi): | |
"""Rotate atoms by dphi degrees for every time step (around the z axis)""" | |
self.atoms = atoms | |
self.dphi = dphi | |
self.axis = np.array([0, 0, 1]) | |
def __call__(self, ts): | |
phi = self.dphi * ts.frame | |
self.atoms.rotateby(phi, self.axis) | |
return ts | |
This transformation can be used as :: | |
u = mda.Universe(PSF, DCD) | |
u.trajectory.add_transformations(spin_atoms(u.select_atoms("protein"), 1.0)) | |
I currently can't get nglview to work in my notebook so you'll need to check that this actually works... or come up with another example.
EDIT: forgot to increment phi
...
EDIT 2: yes, it's pretty dumb that the phi angle just keeps incrementing, no matter what you do with the trajectory. Perhaps better to do something like phi = ts.frame * self.dphi
EDIT 3: changed it to phi = ts.frame * self.dphi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! it works. And I think it makes sense to just rotate by a fixed angle. (for visualization perhaps)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the snippet.
And yeah, I don't know why we need to use |
Does it still work with dask? |
No, I think it is just low-level func for On the other hand, for performance, it seems that the >>> u = mda.Universe(PSF, DCD)
>>> fit_trans = trans.fit_rot_trans(u.atoms, u.atoms)
>>> u.trajectory.add_transformations(fit_trans)
>>> dmp = pickle.dumps(u)
run `__reduce__` (Universe)
run `__reduce__` (AtomGroup)
run `__reduce__` (Universe)
>>> u_p = pickle.loads(dmp)
run `_unpickle_U`
run `unpickle AtomGroup`
run `_unpickle_U` I have to admit I am a bit confused here...when pickling intertwined stuff, e.g when we pickle them apart, it's not an issue: >>> u = mda.Universe(PSF, DCD)
>>> fit_trans = trans.fit_rot_trans(u.atoms, u.atoms)
>>> dmp = pickle.dumps([u, fit_trans]) # Note they are not linked together here.
run `__reduce__` (AtomGroup)
run `__reduce__` (Universe)
>>> u_p = pickle.loads(dmp)
run `_unpickle_U`
run `unpickle AtomGroup` |
Would be good to understand the path the code takes when it is pickling Universe twice. Even if it were not a performance issue, it is a potential bug somewhere down the line. Btw, is the Universe stored twice in the pickle? |
I think it's a normal behavior of python handling circular reference pickling. class Universe(object):
def __init__(self, top):
self.top = top
def add_trajectory(self,traj):
self.traj = traj
def __reduce__(self):
print('reduce')
return self._unpickle_u, (self.top, self.traj)
@classmethod
def _unpickle_u(cls, top, traj):
print('unpickle')
u = cls(top)
u.traj = traj
return u
class Topology(object):
def __init__(self):
pass
class Trajectory(object):
def __init__(self):
pass
def add_transformation(self, transformation):
self.transformation = transformation
class Transformation(object):
def __init__(self, atomgroup):
self.atomgroup = atomgroup
class Atomgroup(object):
def __init__(self, universe):
self.universe = universe
top = Topology()
traj = Trajectory()
u = Universe(top)
u.add_trajectory(traj)
ag = Atomgroup(u)
trans = Transformation(ag)
traj.add_transformation(trans)
>>> import pickle
>>> u_dmp = pickle.dumps(u)
>>> u_p = pickle.loads(u_dmp)
reduce
reduce
unpickle
unpickle EDIT: f = pickle.dumps(obj)
print(len(f))
81439 (with transformation)
69086 (without transformation)
63713 (u._topology) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at the pickling order. If we only store the data once then I think that's not a problem if it gets pickled twice.
I found a minor typo in the docs (see suggestion). Just correct it. I am already approving the PR.
@richardjgowers , were your comments addressed? |
Co-authored-by: Oliver Beckstein <orbeckst@gmail.com>
From what I can see, the reviewer's comments were addressed. To move the PR towards a merge, I am dismissing the stale review.
* Fixes MDAnalysis#2860 * refactor transformations from closure into class (necessary so that a Universe with transformations can be pickled). * change universe pickling to `__reduce__` (instead of `__setstate__`/`__getstate__`, which did not work with transformations) * add test for pickling a Universe with transformations * update docs for transformations - examples (written as class) - deprecate transformations as closures (still works but cannot be pickled due to limitations in Python's pickle module) * update changelog
Fixes #2860
Changes made in this Pull Request:
PR Checklist