-
Notifications
You must be signed in to change notification settings - Fork 663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Dihedral and Ramachandran Analysis #1997
Conversation
err_msg="error: dihedral angles should " | ||
"match test values") | ||
|
||
def test_ramachandran_identical_frames(self, universe, tmpdir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to have this test? It was included because I was learning from the test_rms.py
file which had it, but I am not sure of the exact reason to include it and @orbeckst isn't either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is needed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for your first contribution. It already looks good. Most of my comments are about style and it would be nice if you address them.
""" | ||
|
||
dihedrals = [atomgroup.dihedral for atomgroup in atomgroups] | ||
angles = [dih.value() for dih in dihedrals] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a list not a numpy array. I would make this function a one liner
return np.array([ag.dihedral.value() for ag in atomgroups])
atoms for residues for which phi and psi are calculated | ||
start : int, optional | ||
starting frame, default None becomes 0. | ||
stop : int, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aren't we moving those to the run function @orbeckst
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you're right, of course – I asked @hfmull to start from analysis.rms.RMSD and forgot that it uses soon-to-be changed call signatures (#1463).
Do we need to keep init start/stop/step for new additions like this one and deprecate them (#1979) right away?? I would say no (don't teach the user something you want them to forget) but I am open to alternative opinions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still keeping them here makes this behave exactly as any other class. Kind of nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes more sense to not allow them here if we want them in run
res_min = np.min(self.atomgroup.universe.select_atoms("protein").residues) | ||
res_max = np.max(self.atomgroup.universe.select_atoms("protein").residues) | ||
if any([(residue < res_min or residue > res_max) for residue in self.residues]): | ||
raise IndexError("Selection exceeds protein length") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this possible? Literally how. I would need to get an atom that is not in the universe topology.
I like to check things but I can't think of a case where this would ever be triggered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if self.atomgroup.residues
contains residues which aren't in protein
I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solvent molecules are listed as residues in the topology files (at least in the .gro files I've been using) so it possible to have a residue group that contains solvent molecules. The idea was to give more feedback on what the problem actually is rather than being confused when phi_selection()
or psi_selection()
returned None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But using ag.issubset(protein)
is easier and more robust for the test anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment. I agree with @orbeckst to use set-semantics because it's easier to understand and more robust. I would also update the warning message to Found atoms outside of protein. Only atoms inside of a 'protein' selection can be used to calculate dihedrals
.
res_max = np.max(self.atomgroup.universe.select_atoms("protein").residues) | ||
if any([(residue < res_min or residue > res_max) for residue in self.residues]): | ||
raise IndexError("Selection exceeds protein length") | ||
elif any([residue == (res_min or res_max) for residue in self.residues]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that does what you think it does. I'm sure you want res == res_min or res == res_max for res in residues
. The parenthesis should be evaluated first resulting in the test residue == True
as the test condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. I'll fix that on the next commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can also make use of set operations here
protein = ag.universe.select_atoms('protein')
...
elif protein.residues[[0, -1]].issubset(self.residues)
It's more robust like @orbeckst has suggested above. I know the set operations are not well documented but they do make such comparisons easier.
warnings.warn("Cannot determine phi and psi angles for the first or last residues") | ||
|
||
self.phi_atoms = [residue.phi_selection() for residue in self.residues | ||
if res_min < residue < res_max] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you already checked that this can never happen. You should be fine I don't see the reason for the if here as long as the previous checks really do work. See comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for that conditional is that the function does not allow selections that exceed the protein but it still allows selections that include the first and last residues. However, because it doesn't make sense to calculate the dihedrals for those, they are excluded. The warning just acts as a signal so the user knows that they included those residues in their selection, but that they are not included in the list of angles. This is just to avoid confusion if the number of residues they listed does not match the number of angle sets given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I understand now. you can pre remove them in the selection.
protein = ag.universe.select_atoms('protein')
# remove first and last residue if part of selection
residues = residues.difference(protein.residues[[0, -1]])
https://www.mdanalysis.org/docs/documentation_pages/core/groups.html
Scroll down to the list of operations.
if title is not None: | ||
ax.set_title(title) | ||
for angles in self.angles: | ||
ax.plot(angles[0],angles[1], color=color, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a better function would be scatter
here. It doesn't need the line plot styling.
ax.scatter(angles[0], angles[1], **kwargs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can make this into a one-line and only call into the plotting routine once. Might make a speed difference if you have a lot of frames.
a = self.angles.reshape((np.prod(self.angles.shape[:2]), 2))
ax.scatter(a[0], a[1], **kwargs)
ax.set_xticks(range(-180,181,60)) | ||
ax.set_yticks(range(-180,181,60)) | ||
ax.set_xlabel(r"$\phi$ (deg)") | ||
ax.set_ylabel(r"$\psi$ (deg)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a more convenient ax.set(xlabel=..., ylabel=..., xticks=..., yticks=...)
function available.
err_msg="error: dihedral angles should " | ||
"match test values") | ||
|
||
def test_ramachandran_identical_frames(self, universe, tmpdir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is needed here.
@@ -0,0 +1,57 @@ | |||
import numpy as np | |||
from numpy.testing import assert_array_almost_equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use assert_almost_equal
. It does some more equality checks then assert_array_almost_equal
be added to the current active axes. | ||
color : string, optional | ||
Color used for markers in the plot; the default color is 'black'. | ||
marker : string, optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a **kwargs
argument to the function. Anything passed to it will automatically be passed to the plot. This gives the greatest freedom for the user.
Oh and please add this to the documentation generated for the users. @orbeckst can show you how (I assume you are working in his lab) |
@kain88-de @richardjgowers thanks for the comments. @hfmull is doing a REU in my lab, with the goal to work more on PMDA. We use the dihedral featurization as the initial introduction (and we hopefully get a faster dihedral featurizer out of this MDAnalysis/pmda#45 and perhaps other PMDA analysis blocks). We just discussed a number of the points that you raised and @hfmull will respond to them as he works through the PR. |
""" | ||
|
||
dihedrals = [atomgroup.dihedral for atomgroup in atomgroups] | ||
angles = [dih.value() for dih in dihedrals] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line isn't wrong, but it would be better to use lib.distances.calc_dihedrals
and do them all at once
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that works?
Great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richardjgowers does dih.value()
take PBC into account by default?
EDIT: yes. Dihedral.value has pbc=True
by default.
lib.distances.calc_dihedrals will want a box
from AtomGroup.dimensions so that you get the same results with old and new dihedral calculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hfmull to use calc_dihedrals
you will need "vertical atomgroups", ie 4 atomgroups, one representing the first atom in each dihedral, the next the second atom in each dihedral. I'd make these in the __init__
of your class. Then you can call calc_dihedrals(ag1.positions, ag2.positions, ag3.positions, ag4.positions, box=ag1.dimensions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking very good but see my additional comments (in addition to what @kain88-de and @richardjgowers said).
Also
- tests for any exceptions/warnings
- add yourself to
AUTHORS
- add stubs to docs (in
analysis_modules.rst
and new fileanalysis/dihedrals.rst
) and add some text to the top of thedihedrals.py
file.
|
||
from MDAnalysis.analysis.base import AnalysisBase | ||
|
||
def dihedral_calc(atomgroups): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The discussion below converged on eliminating this function and using lib.distances.calc_dihedrals
.
@@ -415,5 +415,8 @@ | |||
|
|||
GSD = resource_filename(__name__, 'data/example.gsd') | |||
|
|||
DihedralsArray = resource_filename(__name__, 'data/adk_oplsaa_dihedrals.npy') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also need to add the to __all__
(I only learned that from the test that failed!)
def _conclude(self): | ||
self.angles = np.array(self.angles) | ||
|
||
def plot(self, ax=None, color='k', marker='s', title=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method needs a test that checks that it produces an Axes instance. Something like
ax = rama.plot()
assert isinstance(ax, plt.matplotlib.Axes)
(or whatever the actual instance is – see type(ax)
... or see other tests, try git grep Axes
)
""" | ||
if ax is None: | ||
ax = plt.gca() | ||
ax.axis([-180,180,-180,180]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do people plot the allowed and marginally allowed regions in "standard" Ramachandran plots? Is there a simple function (something like a spline for a density that gives the correct contours) or a dataset that we could use?
It would be nice if we had a helper function to plot them; this method could then use the helper to plot these regions in the background but anyone else could use it, too.
(Not a requirement for this PR but something to think about.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking very good, I only found one small thing and you still have to add the files to the docs.
@@ -0,0 +1,105 @@ | |||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the standard header – copy from the top of rms.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add from __future__ import absolute_import
as the very first import with any import from six.moves
directly below it. Also include the range
import from six.moves
protein = self.atomgroup.universe.select_atoms("protein").residues | ||
|
||
if not residues.issubset(protein): | ||
raise IndexError("Found atoms outside of protein. Only atoms " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would use a value error here. The IndexError
is when you use the square bracket operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look good. Just some very minor details left.
I'm already giving my approval. Once you satisfied @orbeckst he can merge this PR/
ax.set(xticks=range(-180,181,60), yticks=range(-180,181,60), | ||
xlabel=r"$\phi$ (deg)", ylabel=r"$\psi$ (deg)") | ||
a = self.angles.reshape(np.prod(self.angles.shape[:2]), 2) | ||
ax.scatter(a[:,0], a[:,1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to add **kwargs
to the scatter function as an argument.
@@ -0,0 +1,39 @@ | |||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright header should also be added here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pylint will also issue a warning here to add from __future__ import absolute_import
as the top most import
Codecov Report
@@ Coverage Diff @@
## develop #1997 +/- ##
==========================================
+ Coverage 88.48% 88.5% +0.02%
==========================================
Files 142 143 +1
Lines 17202 17249 +47
Branches 2635 2646 +11
==========================================
+ Hits 15221 15267 +46
Misses 1385 1385
- Partials 596 597 +1
Continue to review full report at Codecov.
|
|
||
import matplotlib.pyplot as plt | ||
fig = plt.figure(figsize(5,5)) | ||
ax = fig.add_subplot(111) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a more concise way (and my preferred one)
fig, ax = plt.subplots()
It can also create a grid directly and change the figure size. Also, the line for creating a figure is not valid syntax.
I assume you change the figure size to get an equal aspect ratio. There is a better function called plt.figaspect
you can use. As an argument, it accepts an aspect ratio and returns sensible figure sizes. So you can instead write
fig, ax = plt.subplots(figsize=plt.figaspect(1))
can be accessed using :meth:`Ramachandran.angles`:: | ||
|
||
fig = plt.figure(figsize(5, 5)) | ||
ax = fig.add_subplot(111) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment above
|
||
from MDAnalysis.analysis.dihedrals import Ramachandran | ||
R = Ramachandran(r) | ||
R.run() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preferred is R = Ramachandran(r).run()
as a one-liner. The run
method returns the object itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's originally what I was doing but I've been using rms.py
as a reference so I thought that was the convention. I'll be sure to change it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might still be in older modules. We changed the return value of run only after the analysis class was introduced.
:members: | ||
:inherited-members: | ||
|
||
.. attribute:: angles |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we want to have this as documentation in the class? @orbeckst is the information now rendered twice?
Attributes are not rendered automatically. You need to list them in the docs. However, if there is a Attributes section in the docs already then this one can be removed.
Build the docs locally and visually check.
…--
Oliver Beckstein
email: orbeckst@gmail.com
Am Jul 20, 2018 um 16:12 schrieb Max Linke ***@***.***>:
@kain88-de commented on this pull request.
In package/MDAnalysis/analysis/dihedrals.py:
> + ax.axis([-180,180,-180,180])
+ ax.axhline(0, color='k', lw=1)
+ ax.axvline(0, color='k', lw=1)
+ ax.set(xticks=range(-180,181,60), yticks=range(-180,181,60),
+ xlabel=r"$\phi$ (deg)", ylabel=r"$\psi$ (deg)")
+ for ts in R.angles:
+ ax.scatter(ts[:,0], ts[:,1], color='k', marker='s')
+
+Analysis Class
+--------------
+
+.. autoclass:: Ramachandran
+ :members:
+ :inherited-members:
+
+ .. attribute:: angles
don't we want to have this as documentation in the class? @orbeckst is the information now rendered twice?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
it's not in the class. I do link to have all docs in the class because it
makes it easier to inspect documentation strings in an ipython session
…On Sat, Jul 21, 2018 at 5:18 PM Oliver Beckstein ***@***.***> wrote:
Attributes are not rendered automatically. You need to list them in the
docs. However, if there is a Attributes section in the docs already then
this one can be removed.
Build the docs locally and visually check.
--
Oliver Beckstein
email: ***@***.***
> Am Jul 20, 2018 um 16:12 schrieb Max Linke ***@***.***>:
>
> @kain88-de commented on this pull request.
>
> In package/MDAnalysis/analysis/dihedrals.py:
>
> > + ax.axis([-180,180,-180,180])
> + ax.axhline(0, color='k', lw=1)
> + ax.axvline(0, color='k', lw=1)
> + ax.set(xticks=range(-180,181,60), yticks=range(-180,181,60),
> + xlabel=r"$\phi$ (deg)", ylabel=r"$\psi$ (deg)")
> + for ts in R.angles:
> + ax.scatter(ts[:,0], ts[:,1], color='k', marker='s')
> +
> +Analysis Class
> +--------------
> +
> +.. autoclass:: Ramachandran
> + :members:
> + :inherited-members:
> +
> + .. attribute:: angles
> don't we want to have this as documentation in the class? @orbeckst is
the information now rendered twice?
>
> —
> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub, or mute the thread.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#1997 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AEGnVh0Vj-GfAXNrD1v4MyE_2_Q754_7ks5uI0ZagaJpZM4VVHd5>
.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the last thing I found
then a warning will be raised and they will be removed from the list of | ||
residues, but the analysis will still run. | ||
|
||
Run the analysis with :meth:`Ramachandran.run()`, whcih stores the results in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I missed that. I'll probably go through the docstring one last time to catch anything else like that.
@orbeckst this should be done now |
The linter is unhappy about something. Please fix and ping me.
Or if @kain88-de is available and happy he can also merge. I currently only have limited access.
…--
Oliver Beckstein
email: orbeckst@gmail.com
Am Jul 23, 2018 um 17:00 schrieb Max Linke ***@***.***>:
@orbeckst this should be done now
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Congratulations to your first PR |
Changes made in this Pull Request:
PR Checklist