Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fraction of native contacts (Q) analysis a'la Best-Hummer #702

Closed
jandom opened this issue Feb 8, 2016 · 16 comments
Closed

Fraction of native contacts (Q) analysis a'la Best-Hummer #702

jandom opened this issue Feb 8, 2016 · 16 comments

Comments

@jandom
Copy link
Contributor

jandom commented Feb 8, 2016

It's an extension of the MDAnalysis.analysis.contacts module that I wrote. Will knock up some tests and send a PR.

I think this would be useful to people? It's certainly useful to me :P

@hainm
Copy link
Contributor

hainm commented Feb 8, 2016

Yes I think so. +1

@dotsdl
Copy link
Member

dotsdl commented Feb 8, 2016

+1!

@jandom
Copy link
Contributor Author

jandom commented Feb 8, 2016

Awesome, 1+1 equals 2 and we all know it takes 2 to tango or implement a feature :P

Anyhow here is a caveat: I'm not a big fan of classes everywhere, this analysis is so simple that it can (and i personally think 'should') be implemented using a function.

Here is what I've got so far, tested&tried on things like villin and Glycohorin dimer

def calculate_contacts(ref, u, selA, selB, radius=4.5, beta=5.0, alpha=1.5):

    # reference groups A and B from selection strings
    grA, grB = ref.select_atoms(selA), ref.select_atoms(selB)

    # reference distances (r0)
    dref = distance_array(grA.coordinates(), grB.coordinates())

    # select reference distances that are less than the cutoff radius
    mask = dref < radius

    #print("ref has {:d} contacts within {:.2f}".format(mask.sum(), radius))

    # group A and B in a trajectory
    grA, grB = u.select_atoms(selA), u.select_atoms(selB)
    results = []
    for ts in u.trajectory:
        d = distance_array(grA.coordinates(), grB.coordinates())
        x = 1/(1 + np.exp(beta*(d[mask] - alpha * dref[mask])))
        results.append(( ts.time, x.sum()/mask.sum() ))
    results = pd.DataFrame(results, columns=["Time (ps)", "Q"])
    return results

Remarks

  1. not super efficient, for simplicity i calculate the entire distance matrix for an AtomGroup at every frame,
  2. contacts are double counted, again for simplicity (distance matrix element [i,j] is the same as [j,i]) but since we take the mean, it's all good,
  3. introduces dependency on pandas, i guess we want that out?

@orbeckst
Copy link
Member

orbeckst commented Feb 8, 2016

Well, sorry to break it to you: become a fan of the new analysis class that @richardjgowers prototyped. Unified interface for analysis is the next frontier here and we really have enough old code to retrofit.

Similarly: provide a test.

We can live with pandas in the analysis module although it does not seem strictly necessary here. In any case, import it at the top level in your PR and people can discuss it then.

@jandom
Copy link
Contributor Author

jandom commented Feb 8, 2016

Oh, if @richardjgowers has something that'll become standard, I'm more than happy to adopt :)
Cheerio for the comments Oli!

@kain88-de
Copy link
Member

If I remember the Hummer & Best paper correct they change the hard cut-off to a soft sigmoidal-like function. It would be nice if we could use also other functions. Could you make it so that an arbitrary function can be given as a argument to the contact analysis to determine the cut-offs.

@richardjgowers
Copy link
Member

@richardjgowers
Copy link
Member

Ah, and re: not using a class. I usually prefer classes because you can attach most post analysis and saving results in the class object, but that's probably just a matter of taste

@jandom
Copy link
Contributor Author

jandom commented Feb 8, 2016

Okay, I'm looking at this @richardjgowers
https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/analysis/base.py

It actually looks completely sane! Can run return things, or is it supposed to return None and I have to use an instance attribute to share the results with the user?

@richardjgowers
Copy link
Member

Hmm, not sure. You could make it return and save the result inside the class too. But if you override run you'll have to rewrite a little of the parallel stuff or do some supering. So maybe?

@orbeckst
Copy link
Member

orbeckst commented Feb 9, 2016

Generally I would save results as an attribute because when I do interactive work I hate it when I suddenly get gb of arrays on my screen just because I forgot to assign to a variable.... But I suppose standard are emerging here. Quickly check how the other analysis classes like RMSF or the polymer ones do it.

Oliver Beckstein
email: orbeckst@gmail.com

Am Feb 9, 2016 um 3:31 schrieb Richard Gowers notifications@github.com:

Hmm, not sure. You could make it return and save the result inside the class too. But if you override run you'll have to rewrite a little of the parallel stuff or do some supering. So maybe?


Reply to this email directly or view it on GitHub.

@jandom
Copy link
Contributor Author

jandom commented Feb 9, 2016

@orbeckst haha that's so funny - I share the same feeling towards getting some random text file written to disk instead of a numpy array that i clearly asked for :P MDAnalysis way seems to be to write files, so I want to adhere to that.

@kain88-de i like the "pass a callback as argument" idea but i'm afraid my python skill is too low for this, unless you can help me out a bit? Here is the 'best' I came up with....

The best-hummer Q callable for a frame is :

# r - distances at time t
# r0 - reference distances 
# beta - softness
# lambda - tolerance
def foo(r, r0, beta, lambda): return 1/(1 + np.exp(beta*(r - lambda * r0)))

A more general way to put it would be

def foo(r, r0, **kwargs): return 1/(1 + np.exp(kwargs.get('beta')*(r - kwargs.get('lambda') * r0)))

Does this fit your idea of a callback function passed as argument? Where the only necessary arguments are r and r0

@richardjgowers
Copy link
Member

https://github.com/MDAnalysis/mdanalysis/blob/develop/package/MDAnalysis/analysis/polymer.py#L42

I think the persistence length thingy I wrote works ok. So run doesn't return anything, but creates attributes. Then I've got some plotting and line fitting and saving as optional methods.

Anyway, make a PR already @jandom ! :P

@jandom
Copy link
Contributor Author

jandom commented Feb 9, 2016

Getting whipped here :P Yup, was just looking at polymer.py
Thanks, working on the flipping tests... ;)

@kain88-de
Copy link
Member

@kain88-de i like the "pass a callback as argument" idea

Nice. So it depends how we want to design this exactly. I'm personally like the interfaces from numpy and scipy a lot. I think cdist is a good example here.

So my first try for a possible API is this. (Note I haven't looked at BaseAnalysis for it's API)

class Contact(BaseAnalysis):
    def __init__(self, universe, cut, method='cut-off', *method_args):
        self.uni = universe
        self.method = method
        self.cut = cut
        self.method_args = method_args

    # or the appropriate name that doesn't override the function
    def run():
        if self.method == 'cut-off'
            self._cut_off(self.method_args)  # normal hard cut of
        elif self.method == 'soft-cut'
            self._soft_cut(self.method_args)  # best hummer
        else:
            # what ever the user has given
            self._custom_method(self.method, self.method_args) 

jandom pushed a commit to jandom/mdanalysis that referenced this issue Feb 9, 2016
A worked example with GpA dimer in a bilayer

Added tests to the BestHummerContacts calculation

Tests for fraction of native contacts (Q)
kain88-de pushed a commit that referenced this issue Feb 29, 2016
A worked example with GpA dimer in a bilayer

Added tests to the BestHummerContacts calculation

Tests for fraction of native contacts (Q)
@jandom
Copy link
Contributor Author

jandom commented Feb 29, 2016

@kain88-de thanks for closing and all your help on this one

backpropper pushed a commit to backpropper/mdanalysis that referenced this issue Mar 13, 2016
A worked example with GpA dimer in a bilayer

Added tests to the BestHummerContacts calculation

Tests for fraction of native contacts (Q)
backpropper pushed a commit to backpropper/mdanalysis that referenced this issue Mar 13, 2016
A worked example with GpA dimer in a bilayer

Added tests to the BestHummerContacts calculation

Tests for fraction of native contacts (Q)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants