-
Notifications
You must be signed in to change notification settings - Fork 663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New Dihedral and Ramachandran Analysis #1997
Changes from 8 commits
62a194f
30da783
1f149ad
13ee8a0
abbc3ef
973910c
98c8370
ad4e598
4dabf32
f0d8030
15c2bd9
3b79336
bfc75c9
c5fcfb8
4812dc0
610368e
60b1667
e83bd9a
7561232
193e503
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,121 @@ | ||
import numpy as np | ||
import matplotlib.pyplot as plt | ||
import warnings | ||
|
||
from MDAnalysis.analysis.base import AnalysisBase | ||
|
||
def dihedral_calc(atomgroups): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The discussion below converged on eliminating this function and using |
||
"""Calculates phi and psi angles for a list of AtomGroups over trajectory. | ||
|
||
Parameters | ||
---------- | ||
atomgroups : list of AtomGroups | ||
must be a list of one or more AtomGroups containing 5 atoms in the | ||
correct order (i.e. C-N-CA-C-N) | ||
|
||
Returns | ||
------- | ||
angles : numpy.ndarray | ||
An array of time steps which contain (phi,psi) for all AtomGroups. | ||
""" | ||
|
||
dihedrals = [atomgroup.dihedral for atomgroup in atomgroups] | ||
angles = [dih.value() for dih in dihedrals] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is a list not a numpy array. I would make this function a one liner return np.array([ag.dihedral.value() for ag in atomgroups])
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This line isn't wrong, but it would be better to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, that works? Great! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @richardjgowers does EDIT: yes. Dihedral.value has lib.distances.calc_dihedrals will want a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @hfmull to use |
||
|
||
return angles | ||
|
||
|
||
class Ramachandran(AnalysisBase): | ||
"""Calculate phi and psi dihedral angles of specified residues. | ||
|
||
Note | ||
---- | ||
Run the analysis with :meth:`Ramachandran.run()`, which stores the results | ||
in the array :attr:`Ramachandran.angles`. A axes object can be obtained | ||
with :meth: `Ramachandran.run().plot()`. | ||
|
||
If the residue selection is beyond the scope of the protein, then an error | ||
will be raised. If the residue selection includes the first or last residue | ||
then a warning will be raised, and the final array of angles will not | ||
include those residues. | ||
|
||
""" | ||
def __init__(self, atomgroup, **kwargs): | ||
r"""Parameters | ||
---------- | ||
atomgroup : Atomgroup | ||
atoms for residues for which phi and psi are calculated | ||
start : int, optional | ||
starting frame, default None becomes 0. | ||
stop : int, optional | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. aren't we moving those to the run function @orbeckst There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, you're right, of course – I asked @hfmull to start from analysis.rms.RMSD and forgot that it uses soon-to-be changed call signatures (#1463). Do we need to keep init start/stop/step for new additions like this one and deprecate them (#1979) right away?? I would say no (don't teach the user something you want them to forget) but I am open to alternative opinions. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. still keeping them here makes this behave exactly as any other class. Kind of nice. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It makes more sense to not allow them here if we want them in |
||
Frame index to stop analysis. Default: None becomes | ||
n_frames. Iteration stops *before* this frame number, | ||
which means that the trajectory would be read until the end. | ||
step : int, optional | ||
step between frames, default None becomes 1. | ||
|
||
""" | ||
super(Ramachandran, self).__init__(atomgroup.universe.trajectory, **kwargs) | ||
self.atomgroup = atomgroup | ||
|
||
def _prepare(self): | ||
self.residues = self.atomgroup.residues | ||
res_min = np.min(self.atomgroup.universe.select_atoms("protein").residues) | ||
res_max = np.max(self.atomgroup.universe.select_atoms("protein").residues) | ||
if any([(residue < res_min or residue > res_max) for residue in self.residues]): | ||
raise IndexError("Selection exceeds protein length") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. how is this possible? Literally how. I would need to get an atom that is not in the universe topology. I like to check things but I can't think of a case where this would ever be triggered. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Solvent molecules are listed as residues in the topology files (at least in the .gro files I've been using) so it possible to have a residue group that contains solvent molecules. The idea was to give more feedback on what the problem actually is rather than being confused when There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for the comment. I agree with @orbeckst to use set-semantics because it's easier to understand and more robust. I would also update the warning message to |
||
elif any([residue == (res_min or res_max) for residue in self.residues]): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think that does what you think it does. I'm sure you want There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see what you mean. I'll fix that on the next commit. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can also make use of set operations here
It's more robust like @orbeckst has suggested above. I know the set operations are not well documented but they do make such comparisons easier. |
||
warnings.warn("Cannot determine phi and psi angles for the first or last residues") | ||
|
||
self.phi_atoms = [residue.phi_selection() for residue in self.residues | ||
if res_min < residue < res_max] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you already checked that this can never happen. You should be fine I don't see the reason for the if here as long as the previous checks really do work. See comment above. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The reason for that conditional is that the function does not allow selections that exceed the protein but it still allows selections that include the first and last residues. However, because it doesn't make sense to calculate the dihedrals for those, they are excluded. The warning just acts as a signal so the user knows that they included those residues in their selection, but that they are not included in the list of angles. This is just to avoid confusion if the number of residues they listed does not match the number of angle sets given. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK I understand now. you can pre remove them in the selection. protein = ag.universe.select_atoms('protein')
# remove first and last residue if part of selection
residues = residues.difference(protein.residues[[0, -1]]) https://www.mdanalysis.org/docs/documentation_pages/core/groups.html |
||
self.psi_atoms = [residue.psi_selection() for residue in self.residues | ||
if res_min < residue < res_max] | ||
|
||
self.angles = [] | ||
|
||
def _single_frame(self): | ||
self.phi_angles = dihedral_calc(self.phi_atoms) | ||
self.psi_angles = dihedral_calc(self.psi_atoms) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. why keep those around and attach them to self. I would rather make them normal variables that are not added to the class. |
||
self.angles.append((self.phi_angles,self.psi_angles)) | ||
|
||
def _conclude(self): | ||
self.angles = np.array(self.angles) | ||
|
||
def plot(self, ax=None, color='k', marker='s', title=None): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is better suited as a free function that takes a Ramachandran class as first argument. Also you should allow for arbitrary kwargs and then call the plotting function There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. None of the other analysis classes are using plot from an external function. And at least I think it makes sense to keep it as a method, along the lines of "the method knows how to plot the data". Most people will initially use it to get some quick results, then perhaps use it to plot into their own There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I would omit all styling kwargs (color, marker, title) and just pass the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The method needs a test that checks that it produces an Axes instance. Something like ax = rama.plot()
assert isinstance(ax, plt.matplotlib.Axes) (or whatever the actual instance is – see |
||
"""Plots data into standard ramachandran plot. Each time step in | ||
self.angles is plotted onto the same graph. | ||
|
||
Parameters | ||
---------- | ||
ax : :class:`matplotlib.axes.Axes` | ||
If no `ax` is supplied or set to ``None`` then the plot will | ||
be added to the current active axes. | ||
color : string, optional | ||
Color used for markers in the plot; the default color is 'black'. | ||
marker : string, optional | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. use a |
||
Marker used in plot; the default marker is 'square'. | ||
title : string, optional | ||
Title of axes object; the default `None` leaves plot without a | ||
title. | ||
|
||
Returns | ||
------- | ||
ax : :class:`~matplotlib.axes.Axes` | ||
Axes with the plot, either `ax` or the current axes. | ||
|
||
""" | ||
if ax is None: | ||
ax = plt.gca() | ||
ax.axis([-180,180,-180,180]) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How do people plot the allowed and marginally allowed regions in "standard" Ramachandran plots? Is there a simple function (something like a spline for a density that gives the correct contours) or a dataset that we could use? It would be nice if we had a helper function to plot them; this method could then use the helper to plot these regions in the background but anyone else could use it, too. (Not a requirement for this PR but something to think about.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. |
||
ax.axhline(0, color='k', lw=1) | ||
ax.axvline(0, color='k', lw=1) | ||
ax.set_xticks(range(-180,181,60)) | ||
ax.set_yticks(range(-180,181,60)) | ||
ax.set_xlabel(r"$\phi$ (deg)") | ||
ax.set_ylabel(r"$\psi$ (deg)") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. there is a more convenient |
||
if title is not None: | ||
ax.set_title(title) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. don't to that. Let the user decide them self |
||
for angles in self.angles: | ||
ax.plot(angles[0],angles[1], color=color, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. a better function would be ax.scatter(angles[0], angles[1], **kwargs) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can make this into a one-line and only call into the plotting routine once. Might make a speed difference if you have a lot of frames. a = self.angles.reshape((np.prod(self.angles.shape[:2]), 2))
ax.scatter(a[0], a[1], **kwargs) |
||
marker=marker, linestyle='') |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
import numpy as np | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The copyright header should also be added here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. pylint will also issue a warning here to add |
||
from numpy.testing import assert_array_almost_equal | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. please use |
||
import pytest | ||
import os | ||
|
||
import MDAnalysis as mda | ||
from MDAnalysisTests.datafiles import GRO, XTC, DihedralsArray, GLYDihedralsArray | ||
from MDAnalysis.analysis import dihedrals | ||
|
||
|
||
class TestRamachandran(object): | ||
|
||
@pytest.fixture() | ||
def universe(self): | ||
return mda.Universe(GRO, XTC) | ||
|
||
def test_ramachandran(self, universe): | ||
rama = dihedrals.Ramachandran(universe.select_atoms("protein")).run() | ||
test_rama = np.load(DihedralsArray) | ||
|
||
assert_array_almost_equal(rama.angles, test_rama, 5, | ||
err_msg="error: dihedral angles should " | ||
"match test values") | ||
|
||
def test_ramachandran_single_frame(self, universe): | ||
rama = dihedrals.Ramachandran(universe.select_atoms("protein"), | ||
start=5, stop=6).run() | ||
test_rama = [np.load(DihedralsArray)[5]] | ||
|
||
assert_array_almost_equal(rama.angles, test_rama, 5, | ||
err_msg="error: dihedral angles should " | ||
"match test values") | ||
|
||
def test_ramachandran_identical_frames(self, universe, tmpdir): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it necessary to have this test? It was included because I was learning from the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think this is needed here. |
||
|
||
outfile = os.path.join(str(tmpdir), "dihedrals.xtc") | ||
|
||
# write a dummy trajectory of all the same frame | ||
with mda.Writer(outfile, universe.atoms.n_atoms) as W: | ||
for _ in range(universe.trajectory.n_frames): | ||
W.write(universe) | ||
|
||
universe = mda.Universe(GRO, outfile) | ||
rama = dihedrals.Ramachandran(universe.select_atoms("protein")).run() | ||
test_rama = [np.load(DihedralsArray)[0] for ts in universe.trajectory] | ||
|
||
assert_array_almost_equal(rama.angles, test_rama, 5, | ||
err_msg="error: dihedral angles should " | ||
"match test values") | ||
|
||
def test_ramachandran_residue_selections(self, universe): | ||
rama = dihedrals.Ramachandran(universe.select_atoms("resname GLY")).run() | ||
test_rama = np.load(GLYDihedralsArray) | ||
|
||
assert_array_almost_equal(rama.angles, test_rama, 5, | ||
err_msg="error: dihedral angles should " | ||
"match test values") |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -415,5 +415,8 @@ | |
|
||
GSD = resource_filename(__name__, 'data/example.gsd') | ||
|
||
DihedralsArray = resource_filename(__name__, 'data/adk_oplsaa_dihedrals.npy') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You also need to add the to |
||
GLYDihedralsArray = resource_filename(__name__, 'data/adk_oplsaa_GLY_dihedrals.npy') | ||
|
||
# This should be the last line: clean up namespace | ||
del resource_filename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add the standard header – copy from the top of rms.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add
from __future__ import absolute_import
as the very first import with any import fromsix.moves
directly below it. Also include therange
import from six.moves