-
Notifications
You must be signed in to change notification settings - Fork 648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved, Best-like contact analysis (Issue #702) #708
Changes from 3 commits
755631c
cd67662
a55119a
093a091
c6fe71e
2440ef7
9f67bb2
e60e4d8
876e2a5
d151610
137426e
b9d35d4
216b040
07e0238
acc2f24
e183fe6
e18b55c
c743c58
714dd64
d5e2995
03980b9
a2a3379
c82c185
2e7c4ee
d8df641
395a41a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -773,3 +773,167 @@ def plot_qavg(self, filename=None, **kwargs): | |
|
||
if filename is not None: | ||
savefig(filename) | ||
|
||
|
||
def calculate_contacts(ref, u, selA, selB, radius=4.5, beta=5.0, lambda_constant=1.8): | ||
"""Calculate fraction of native contacts (Q) between two groups | ||
defined by selection strings selA and selB. | ||
|
||
Reference distances are taken from Univere ref. | ||
Distances are taken from trajecotry in Universe u. | ||
|
||
Parameters | ||
---------- | ||
ref: Universe | ||
reference Universe from which the reference distances will be taken | ||
u: Universe | ||
Universe from which the distance timeseries will be taken | ||
selA: string | ||
selection string for group A | ||
selB: string | ||
selection string for group B | ||
radius: float, optional (4.5 Angstroms) | ||
distance cutoff for defining reference contacts (Angstroms) | ||
beta: float, optional (5 Angstroms^-1) | ||
temperature-like, steepness of the shift function for contacts (1/Angstroms) | ||
lambda_constant: float, optional (1.8 unitless) | ||
tolerance, contact is formed when r < r0*lambda_constant (unitless) | ||
|
||
Returns | ||
------- | ||
list | ||
Returns a list of following structure:: | ||
{ | ||
[[t1, q1], [t2, q2], ... [tn, qn]] | ||
} | ||
where t is time in ps and q is the fraction of native contacts | ||
|
||
Examples | ||
-------- | ||
|
||
1. Protein folding | ||
|
||
ref = Universe("villin.gro") | ||
u = Universe("conf_protein.gro", "traj_protein.xtc") | ||
Q = calculate_contacts(u, ref, "protein and not name H*", "protein and not name H*") | ||
|
||
2. A pair of helices | ||
|
||
ref = Universe("glycophorin_dimer.pdb") | ||
u = Universe("conf_protein.gro", "traj_protein.xtc") | ||
Q = calculate_contacts(u, ref, "protein and resid 75-92 and not name H* and segid A", "protein and resid 75-92 and not name H* and segid B") | ||
|
||
Note | ||
---- | ||
For all-atom simulations, radius = 4.5 A and lambda_constant = 1.8 (unitless) | ||
For coarse-grained simulations, radius = 6.0 A and lambda_constant = 1.5 (unitless) | ||
|
||
Reference | ||
--------- | ||
Using the definition from Best, Hummer, and Eaton, "Native contacts determine protein folding mechanisms in atomistic simulations" PNAS (2013) 10.1073/pnas.1311599110 | ||
|
||
Eq. (1) of the SI defines the expression for the fraction of native contacts, $Q(X)$: | ||
.. math:: $$ Q(X) = \frac{1}{|S|} \sum_{(i,j) \in S} \frac{1}{1 + \exp[\beta(r_{ij}(X) - \lambda r_{ij}^0)]}, $$ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The .. math::
Q(X) = \frac{1}{|S|} \sum_{(i,j) \in S} \frac{1}{1 + \exp[\beta(r_{ij}(X) - \lambda r_{ij}^0)]},
should work |
||
|
||
where | ||
|
||
$X$ is a conformation, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. needs to be formatted as list |
||
$r_{ij}(X)$ is the distance between atoms $i$ and $j$ in conformation $X$, | ||
$r^0_{ij}$ is the distance from heavy atom i to j in the native state conformation, | ||
$S$ is the set of all pairs of heavy atoms $(i,j)$ belonging to residues $\theta_i$ and $\theta_j$ such that $|\theta_i - \theta_j| > 3$ and $r^0_{i,} < 4.5 \unicode{x212B}$, | ||
$\beta=5 \unicode{x212B}^{-1}$, | ||
$\lambda=1.8$ for all-atom simulations | ||
|
||
""" | ||
# reference groups A and B from selection strings | ||
refA, refB = ref.select_atoms(selA), ref.select_atoms(selB) | ||
|
||
# 2D float array, reference distances (r0) | ||
dref = distance_array(refA.positions, refB.positions) | ||
|
||
# 2D bool array, select reference distances that are less than the cutoff radius | ||
mask = dref < radius | ||
#print("ref has {:d} contacts within {:.2f}".format(mask.sum(), radius)) | ||
|
||
# group A and B in a trajectory | ||
grA, grB = u.select_atoms(selA), u.select_atoms(selB) | ||
results = [] | ||
|
||
|
||
for ts in u.trajectory: | ||
d = distance_array(grA.positions, grB.positions) | ||
r, r0 = d[mask], dref[mask] | ||
x = 1/(1 + np.exp(beta*(r - lambda_constant * r0))) | ||
|
||
# average/normalize and append to results | ||
results.append(( ts.time, x.sum()/mask.sum() )) | ||
|
||
#results = pd.DataFrame(results, columns=["Time (ps)", "Q"]) | ||
return results | ||
|
||
from MDAnalysis.analysis.distances import distance_array | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Imports up top |
||
from .base import AnalysisBase | ||
logger = logging.getLogger(__name__) | ||
|
||
class BestHummerContacts(AnalysisBase): | ||
r"""Calculate Best-Hummer fraction of native contacts (Q) from a atrajectory | ||
|
||
.. versionadded:: 0.13.0 | ||
""" | ||
def __init__(self, grA, grB, refA, refB, radius=4.5, lambda_constant=1.8, beta=5.0, | ||
start=None, stop=None, step=None): | ||
"""Calculate the persistence length for polymer chains | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Old docs.... copy & paste??? ;-) |
||
|
||
Parameters | ||
---------- | ||
u : Universe | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Old docstring |
||
the thing with frame | ||
grA, grB: AtomGroup | ||
two contacting groups that change over time | ||
refA, refB: AtomGroup | ||
two contacting groups in their reference conformation | ||
radius: float, optional (4.5 Angstroms) | ||
radius within which contacts exist | ||
lambda_constant: float, optional (1.8 unitless) | ||
contact is considered formed between (lambda*r0,r0) | ||
beta: float, optional (5 Angstroms^-1) | ||
softness of the switching function, the lower the softer | ||
start : int, optional | ||
First frame of trajectory to analyse, Default: 0 | ||
stop : int, optional | ||
Last frame of trajectory to analyse, Default: -1 | ||
step : int, optional | ||
Step between frames to analyse, Default: 1 | ||
|
||
|
||
Attributes | ||
---------- | ||
results: list | ||
Fraction of native contacts for each frame | ||
""" | ||
assert(grA.universe == grB.universe) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You don't want an assert here, do a proper if not x==y: raise ValueError with a nice message |
||
self.u = grA.universe | ||
self.grA, self.grB = grA, grB | ||
|
||
r0 = distance_array(refA.positions, refB.positions) | ||
self.r0 = r0 | ||
self.mask = r0 < radius | ||
self.results = [] | ||
|
||
self.beta = beta | ||
self.lambda_constant = lambda_constant | ||
|
||
self._setup_frames(self.u.trajectory, | ||
start=start, | ||
stop=stop, | ||
step=step) | ||
|
||
def _single_frame(self): | ||
grA, grB, r0, mask = self.grA, self.grB, self.r0, self.mask | ||
|
||
d = distance_array(grA.positions, grB.positions) | ||
r, r0 = d[mask], r0[mask] | ||
y = 1/(1 + np.exp(self.beta*(r - self.lambda_constant * r0))) | ||
print y | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably don't want printing? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the print is here to stay, it should use the print function syntax. |
||
|
||
self.results.append(y.sum()/mask.sum()) |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -27,6 +27,51 @@ | |
|
||
from MDAnalysis.core.AtomGroup import AtomGroup | ||
|
||
class CoordinateNeighborSearch(object): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I thought this had been removed a while ago by @kain88-de in #383 or thereabout. Or did you add it?? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added in by accident, removing now |
||
"""This class can be used to find all atoms/residues/segements within the | ||
radius of a given query position. | ||
|
||
This class is using the BioPython KDTree for the neighborsearch. This class | ||
also does not apply PBC to the distance calculattions. So you have to ensure | ||
yourself that the trajectory has been corrected for PBC artifacts. | ||
""" | ||
|
||
def __init__(self, coordinates, bucket_size=10): | ||
""" | ||
:Arguments: | ||
*atom_list* | ||
list of atoms (:class: `~MDAnalysis.core.AtomGroup.AtomGroup`) | ||
*bucket_size* | ||
Number of entries in leafs of the KDTree. If you suffer poor | ||
performance you can play around with this number. Increasing the | ||
`bucket_size` will speed up the construction of the KDTree but | ||
slow down the search. | ||
""" | ||
|
||
self.kdtree = KDTree(dim=3, bucket_size=bucket_size) | ||
self.kdtree.set_coords(coordinates) | ||
|
||
def search(self, coordinates, radius, level='A'): | ||
""" | ||
Return all atoms/residues/segments that are within *radius* of the | ||
atoms in *atoms*. | ||
|
||
:Arguments: | ||
*atoms* | ||
list of atoms (:class: `~MDAnalysis.core.AtomGroup.AtomGroup`) | ||
*radius* | ||
float. Radius for search in Angstrom. | ||
*level* (optional) | ||
char (A, R, S). Return atoms(A), residues(R) or segments(S) within | ||
*radius* of *atoms*. | ||
""" | ||
indices = [] | ||
for coord in np.array(coordinates): | ||
self.kdtree.search(coord, radius) | ||
indices.extend(self.kdtree.get_indices()) | ||
return indices | ||
|
||
search_list = search | ||
|
||
class AtomNeighborSearch(object): | ||
"""This class can be used to find all atoms/residues/segements within the | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice docs!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
succinct :)