Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved, Best-like contact analysis (Issue #702) #708

Closed
wants to merge 26 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
755631c
Improved, Best-like contact analysis (Issue #702)
Feb 9, 2016
cd67662
Revert to changes to contacts.py
Feb 10, 2016
a55119a
Arguments refactored, numpy-style docs
Feb 10, 2016
093a091
Refactor universe identity check between groups, optional scipy.spars…
Feb 10, 2016
c6fe71e
Add test files to package, rename test class for contacts
Feb 10, 2016
2440ef7
Contact analysis: move the example to cookbook
Feb 11, 2016
9f67bb2
Correct typo in tests
Feb 11, 2016
e60e4d8
Fixed module import
Feb 11, 2016
876e2a5
Fixed module imports (ii)
Feb 11, 2016
d151610
Fixed imports (iii)
Feb 11, 2016
137426e
Single class that supports both cutoff and best-hummer contacts
Feb 12, 2016
b9d35d4
Clean-up docs, bzip conf files
Feb 12, 2016
216b040
Trying to make the new Contacts class as close to ContactAnalysis1 cl…
Feb 13, 2016
07e0238
Re-link to gro.bz2 files in setup.py
Feb 14, 2016
acc2f24
Moved all the remaining methods from ContactAnalysis1 to Contacts
Feb 15, 2016
e183fe6
Deprecate the old ContactAnalysis1 class
Feb 15, 2016
e18b55c
Corrected docs, moved things around - general feature cleanup
Feb 23, 2016
c743c58
Test and docstring cleanup
Feb 24, 2016
714dd64
improvements for distances
orbeckst Feb 24, 2016
d5e2995
doc string upgrades in analysis.contacts
orbeckst Feb 24, 2016
03980b9
added back ContactAnalysis
orbeckst Feb 24, 2016
a2a3379
Merge branch 'issue-702-contacts' of github.com:jandom/mdanalysis int…
orbeckst Feb 25, 2016
c82c185
Upated CHANGELOG, brought back the old ContactAnalysis class
Feb 25, 2016
2e7c4ee
Remove any self calls from load method, to make it nice and static
Feb 26, 2016
d8df641
Merge branch 'issue-702-contacts' of https://github.com/orbeckst/mdan…
Feb 26, 2016
395a41a
Merge branch 'orbeckst-issue-702-contacts' into issue-702-contacts
Feb 26, 2016
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
164 changes: 164 additions & 0 deletions package/MDAnalysis/analysis/contacts.py
Original file line number Diff line number Diff line change
Expand Up @@ -773,3 +773,167 @@ def plot_qavg(self, filename=None, **kwargs):

if filename is not None:
savefig(filename)


def calculate_contacts(ref, u, selA, selB, radius=4.5, beta=5.0, lambda_constant=1.8):
"""Calculate fraction of native contacts (Q) between two groups
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice docs!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

succinct :)

defined by selection strings selA and selB.

Reference distances are taken from Univere ref.
Distances are taken from trajecotry in Universe u.

Parameters
----------
ref: Universe
reference Universe from which the reference distances will be taken
u: Universe
Universe from which the distance timeseries will be taken
selA: string
selection string for group A
selB: string
selection string for group B
radius: float, optional (4.5 Angstroms)
distance cutoff for defining reference contacts (Angstroms)
beta: float, optional (5 Angstroms^-1)
temperature-like, steepness of the shift function for contacts (1/Angstroms)
lambda_constant: float, optional (1.8 unitless)
tolerance, contact is formed when r < r0*lambda_constant (unitless)

Returns
-------
list
Returns a list of following structure::
{
[[t1, q1], [t2, q2], ... [tn, qn]]
}
where t is time in ps and q is the fraction of native contacts

Examples
--------

1. Protein folding

ref = Universe("villin.gro")
u = Universe("conf_protein.gro", "traj_protein.xtc")
Q = calculate_contacts(u, ref, "protein and not name H*", "protein and not name H*")

2. A pair of helices

ref = Universe("glycophorin_dimer.pdb")
u = Universe("conf_protein.gro", "traj_protein.xtc")
Q = calculate_contacts(u, ref, "protein and resid 75-92 and not name H* and segid A", "protein and resid 75-92 and not name H* and segid B")

Note
----
For all-atom simulations, radius = 4.5 A and lambda_constant = 1.8 (unitless)
For coarse-grained simulations, radius = 6.0 A and lambda_constant = 1.5 (unitless)

Reference
---------
Using the definition from Best, Hummer, and Eaton, "Native contacts determine protein folding mechanisms in atomistic simulations" PNAS (2013) 10.1073/pnas.1311599110

Eq. (1) of the SI defines the expression for the fraction of native contacts, $Q(X)$:
.. math:: $$ Q(X) = \frac{1}{|S|} \sum_{(i,j) \in S} \frac{1}{1 + \exp[\beta(r_{ij}(X) - \lambda r_{ij}^0)]}, $$
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The $$ should not be necessary:

.. math:: 

   Q(X) = \frac{1}{|S|} \sum_{(i,j) \in S} \frac{1}{1 + \exp[\beta(r_{ij}(X) - \lambda r_{ij}^0)]},

should work


where

$X$ is a conformation,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs to be formatted as list

$r_{ij}(X)$ is the distance between atoms $i$ and $j$ in conformation $X$,
$r^0_{ij}$ is the distance from heavy atom i to j in the native state conformation,
$S$ is the set of all pairs of heavy atoms $(i,j)$ belonging to residues $\theta_i$ and $\theta_j$ such that $|\theta_i - \theta_j| > 3$ and $r^0_{i,} < 4.5 \unicode{x212B}$,
$\beta=5 \unicode{x212B}^{-1}$,
$\lambda=1.8$ for all-atom simulations

"""
# reference groups A and B from selection strings
refA, refB = ref.select_atoms(selA), ref.select_atoms(selB)

# 2D float array, reference distances (r0)
dref = distance_array(refA.positions, refB.positions)

# 2D bool array, select reference distances that are less than the cutoff radius
mask = dref < radius
#print("ref has {:d} contacts within {:.2f}".format(mask.sum(), radius))

# group A and B in a trajectory
grA, grB = u.select_atoms(selA), u.select_atoms(selB)
results = []


for ts in u.trajectory:
d = distance_array(grA.positions, grB.positions)
r, r0 = d[mask], dref[mask]
x = 1/(1 + np.exp(beta*(r - lambda_constant * r0)))

# average/normalize and append to results
results.append(( ts.time, x.sum()/mask.sum() ))

#results = pd.DataFrame(results, columns=["Time (ps)", "Q"])
return results

from MDAnalysis.analysis.distances import distance_array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imports up top

from .base import AnalysisBase
logger = logging.getLogger(__name__)

class BestHummerContacts(AnalysisBase):
r"""Calculate Best-Hummer fraction of native contacts (Q) from a atrajectory

.. versionadded:: 0.13.0
"""
def __init__(self, grA, grB, refA, refB, radius=4.5, lambda_constant=1.8, beta=5.0,
start=None, stop=None, step=None):
"""Calculate the persistence length for polymer chains
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old docs.... copy & paste??? ;-)


Parameters
----------
u : Universe
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Old docstring

the thing with frame
grA, grB: AtomGroup
two contacting groups that change over time
refA, refB: AtomGroup
two contacting groups in their reference conformation
radius: float, optional (4.5 Angstroms)
radius within which contacts exist
lambda_constant: float, optional (1.8 unitless)
contact is considered formed between (lambda*r0,r0)
beta: float, optional (5 Angstroms^-1)
softness of the switching function, the lower the softer
start : int, optional
First frame of trajectory to analyse, Default: 0
stop : int, optional
Last frame of trajectory to analyse, Default: -1
step : int, optional
Step between frames to analyse, Default: 1


Attributes
----------
results: list
Fraction of native contacts for each frame
"""
assert(grA.universe == grB.universe)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't want an assert here, do a proper if not x==y: raise ValueError with a nice message

self.u = grA.universe
self.grA, self.grB = grA, grB

r0 = distance_array(refA.positions, refB.positions)
self.r0 = r0
self.mask = r0 < radius
self.results = []

self.beta = beta
self.lambda_constant = lambda_constant

self._setup_frames(self.u.trajectory,
start=start,
stop=stop,
step=step)

def _single_frame(self):
grA, grB, r0, mask = self.grA, self.grB, self.r0, self.mask

d = distance_array(grA.positions, grB.positions)
r, r0 = d[mask], r0[mask]
y = 1/(1 + np.exp(self.beta*(r - self.lambda_constant * r0)))
print y
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably don't want printing?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the print is here to stay, it should use the print function syntax.


self.results.append(y.sum()/mask.sum())
45 changes: 45 additions & 0 deletions package/MDAnalysis/lib/NeighborSearch.py
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,51 @@

from MDAnalysis.core.AtomGroup import AtomGroup

class CoordinateNeighborSearch(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this had been removed a while ago by @kain88-de in #383 or thereabout.

Or did you add it??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added in by accident, removing now

"""This class can be used to find all atoms/residues/segements within the
radius of a given query position.

This class is using the BioPython KDTree for the neighborsearch. This class
also does not apply PBC to the distance calculattions. So you have to ensure
yourself that the trajectory has been corrected for PBC artifacts.
"""

def __init__(self, coordinates, bucket_size=10):
"""
:Arguments:
*atom_list*
list of atoms (:class: `~MDAnalysis.core.AtomGroup.AtomGroup`)
*bucket_size*
Number of entries in leafs of the KDTree. If you suffer poor
performance you can play around with this number. Increasing the
`bucket_size` will speed up the construction of the KDTree but
slow down the search.
"""

self.kdtree = KDTree(dim=3, bucket_size=bucket_size)
self.kdtree.set_coords(coordinates)

def search(self, coordinates, radius, level='A'):
"""
Return all atoms/residues/segments that are within *radius* of the
atoms in *atoms*.

:Arguments:
*atoms*
list of atoms (:class: `~MDAnalysis.core.AtomGroup.AtomGroup`)
*radius*
float. Radius for search in Angstrom.
*level* (optional)
char (A, R, S). Return atoms(A), residues(R) or segments(S) within
*radius* of *atoms*.
"""
indices = []
for coord in np.array(coordinates):
self.kdtree.search(coord, radius)
indices.extend(self.kdtree.get_indices())
return indices

search_list = search

class AtomNeighborSearch(object):
"""This class can be used to find all atoms/residues/segements within the
Expand Down
Loading