Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct TI error estimation II #61

Merged
merged 5 commits into from
Nov 28, 2018
Merged

Correct TI error estimation II #61

merged 5 commits into from
Nov 28, 2018

Conversation

harlor
Copy link
Contributor

@harlor harlor commented Oct 17, 2018

Sorry for messing up #58.

In summary: I believe, that the TI error estimation in alchemlyb does not reproduce the value that alchemical-analysis calculates.

This PR should change the TI error estimation such that it reproduces the value alchemical-analysis calculates.

@codecov-io
Copy link

codecov-io commented Oct 17, 2018

Codecov Report

Merging #61 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #61      +/-   ##
==========================================
+ Coverage   98.17%   98.18%   +0.01%     
==========================================
  Files           9        9              
  Lines         493      497       +4     
  Branches      100      100              
==========================================
+ Hits          484      488       +4     
  Misses          4        4              
  Partials        5        5
Impacted Files Coverage Δ
src/alchemlyb/estimators/ti_.py 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 82ce951...033cb09. Read the comment docs.

@mrshirts
Copy link

mrshirts commented Oct 17, 2018

So, it does reproduce the values in alchemical-analysis? Specifically, what data sets were used to test, not just that the comparison was done. Can you post the output snippets before/after? Does it work for uneven lambda?

@harlor
Copy link
Contributor Author

harlor commented Oct 17, 2018

I'm currently working on a command line "frontend" for alchemlyb that I can use similarly to alchemical-analysis: https://github.com/harlor/flamel and I used that to compare to alchemical-analysis.

I used different input files for testing (from NVT and NPT simulations) and compared the results for single steps, segments and in total. I show example outputs in the bottom of https://github.com/harlor/flamel/blob/master/README.md that were made with these modifications. Of course I can also show outputs without these modifications - if this is helpful.

Yes, it should work with an even and uneven number of lambdas.

@harlor harlor mentioned this pull request Oct 17, 2018
@mrshirts
Copy link

I used different input files for testing (from NVT and NPT simulations)

"Different input files" is not really sufficient. The locations of the specific files and the commands really are what are needed to be reproducible.

Of course I can also show outputs without these modifications - if this is helpful.

OK, make sure to link output like that in the PR description, so people can see what is being referred to.

I'm currently working on a command line "frontend" for alchemlyb that I can use similarly to alchemical-analysis: https://github.com/harlor/flamel and I used that to compare to alchemical-analysis.

I would HIGHLY recommend getting in contact with @davidlmobley about coordinating. The original plan was for alchemical-analysis to be a front end to alchemlyb. Even if it makes sense to rewrite from scratch, it would be useful to you to be able to copy over any of the graphing functionality for alchemical-analysis.

I.e. better that such improvements involve everyone that is working on them, rather than duplicating effort.

@dotsdl
Copy link
Member

dotsdl commented Oct 19, 2018

Thanks for putting this together @harlor! I'll be able to take some time this weekend to review, I think. I'll need to step through the changes to make sure I understand what's being done here, because I'm not sure I understand what's wrong with the existing implementation.

@harlor
Copy link
Contributor Author

harlor commented Oct 19, 2018

"Different input files" is not really sufficient. The locations of the specific files and the commands really are what are needed to be reproducible.

True. Assuming you have a proper python environment with pip, numpy, pymbar (I use a miniconda enviroment for that) and you start in an empty directory, a reproducible procedure is:

0. Download and install current alchemlyb

git clone git@github.com:alchemistry/alchemlyb.git
cd alchemlyb 
pip install .
cd ..

1. Download flamel:

git clone git@github.com:harlor/flamel.git

2. Get some input files:
I put some input files on my userpage, you can get them by:

wget https://userpage.fu-berlin.de/dominikwille/nvt300K.tgz
tar xzvf nvt300K.tgz

3. Test alchemlyb TI estimator using flamel:

./flamel/flamel.py -t 300 -p ti_ -e TI

You should get an output like:

...
------------ --------------------- 
   States           TI (kJ/mol)    
------------ --------------------- 
   0 -- 1         0.078  +-  0.011 
   1 -- 2         0.241  +-  0.002 
   2 -- 3         0.435  +-  0.003 
   3 -- 4         0.666  +-  0.004 
   4 -- 5         0.924  +-  0.006 
   5 -- 6         1.196  +-  0.008 
   6 -- 7         1.432  +-  0.013 
   7 -- 8         1.510  +-  0.018 
   8 -- 9         1.352  +-  0.021 
   9 -- 10        1.044  +-  0.021 
  10 -- 11        0.734  +-  0.018 
  11 -- 12        0.520  +-  0.014 
  12 -- 13        0.423  +-  0.011 
  13 -- 14        0.375  +-  0.010 
  14 -- 15        0.313  +-  0.010 
  15 -- 16        0.265  +-  0.010 
  16 -- 17        0.237  +-  0.009 
  17 -- 18        0.197  +-  0.009 
  18 -- 19        0.152  +-  0.008 
  19 -- 20        0.123  +-  0.006 
  20 -- 21       -0.135  +-  0.006 
  21 -- 22       -0.377  +-  0.005 
  22 -- 23       -0.625  +-  0.006 
  23 -- 24       -0.934  +-  0.008 
  24 -- 25       -1.313  +-  0.010 
  25 -- 26       -1.742  +-  0.012 
  26 -- 27       -1.471  +-  0.011 
  27 -- 28       -1.755  +-  0.014 
  28 -- 29       -2.032  +-  0.014 
  29 -- 30       -2.381  +-  0.014 
  30 -- 31       -2.752  +-  0.013 
  31 -- 32       -3.136  +-  0.012 
  32 -- 33       -3.569  +-  0.015 
  33 -- 34       -4.009  +-  0.016 
  34 -- 35       -4.479  +-  0.016 
  35 -- 36       -4.956  +-  0.017 
  36 -- 37       -5.475  +-  0.019 
------------ --------------------- 
     coul:      -41.141  +-  0.053 
      vdw:       12.219  +-  0.054 
    TOTAL:      -28.923  +-  0.075 

4. Download and install alchemlyb with this PR

git clone git clone git@github.com:harlor/alchemlyb.git alchemlyb_ti_ee
cd alchemlyb_ti_ee
git checkout ti_ee
pip install .
cd ..

5. Test alchemlyb TI estimator using flamel agian:

./flamel/flamel.py -t 300 -p ti_ -e TI

You should now get an output like:

...
------------ --------------------- 
   States           TI (kJ/mol)    
------------ --------------------- 
   0 -- 1         0.078  +-  0.011 
   1 -- 2         0.241  +-  0.002 
   2 -- 3         0.435  +-  0.003 
   3 -- 4         0.666  +-  0.004 
   4 -- 5         0.924  +-  0.006 
   5 -- 6         1.196  +-  0.008 
   6 -- 7         1.432  +-  0.013 
   7 -- 8         1.510  +-  0.018 
   8 -- 9         1.352  +-  0.021 
   9 -- 10        1.044  +-  0.021 
  10 -- 11        0.734  +-  0.018 
  11 -- 12        0.520  +-  0.014 
  12 -- 13        0.423  +-  0.011 
  13 -- 14        0.375  +-  0.010 
  14 -- 15        0.313  +-  0.010 
  15 -- 16        0.265  +-  0.010 
  16 -- 17        0.237  +-  0.009 
  17 -- 18        0.197  +-  0.009 
  18 -- 19        0.152  +-  0.008 
  19 -- 20        0.123  +-  0.006 
  20 -- 21       -0.135  +-  0.006 
  21 -- 22       -0.377  +-  0.005 
  22 -- 23       -0.625  +-  0.006 
  23 -- 24       -0.934  +-  0.008 
  24 -- 25       -1.313  +-  0.010 
  25 -- 26       -1.742  +-  0.012 
  26 -- 27       -1.471  +-  0.011 
  27 -- 28       -1.755  +-  0.014 
  28 -- 29       -2.032  +-  0.014 
  29 -- 30       -2.381  +-  0.014 
  30 -- 31       -2.752  +-  0.013 
  31 -- 32       -3.136  +-  0.012 
  32 -- 33       -3.569  +-  0.015 
  33 -- 34       -4.009  +-  0.016 
  34 -- 35       -4.479  +-  0.016 
  35 -- 36       -4.956  +-  0.017 
  36 -- 37       -5.475  +-  0.019 
------------ --------------------- 
     coul:      -41.141  +-  0.073 
      vdw:       12.219  +-  0.075 
    TOTAL:      -28.923  +-  0.105 

6. Compare with alchemical analysis:

/path/to/alchemical_analysis.py -p ti -t 300

You should get an output like:

...
------------ --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- 
   States           TI (kJ/mol)     TI-CUBIC (kJ/mol)         DEXP (kJ/mol)         IEXP (kJ/mol)          BAR (kJ/mol)         MBAR (kJ/mol)    
------------ --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- 
   0 -- 1         0.078  +-  0.011      0.076  +-  0.009      0.076  +-  0.024      0.074  +-  0.002      0.074  +-  0.002      0.077  +-  0.001 
   1 -- 2         0.241  +-  0.002      0.238  +-  0.003      0.237  +-  0.002      0.239  +-  0.003      0.238  +-  0.002      0.240  +-  0.001 
   2 -- 3         0.435  +-  0.003      0.431  +-  0.003      0.432  +-  0.004      0.430  +-  0.004      0.432  +-  0.003      0.435  +-  0.002 
   3 -- 4         0.666  +-  0.004      0.664  +-  0.005      0.659  +-  0.006      0.669  +-  0.005      0.664  +-  0.004      0.662  +-  0.002 
   4 -- 5         0.924  +-  0.006      0.923  +-  0.007      0.927  +-  0.007      0.922  +-  0.008      0.924  +-  0.005      0.919  +-  0.003 
   5 -- 6         1.196  +-  0.008      1.198  +-  0.009      1.186  +-  0.011      1.204  +-  0.012      1.197  +-  0.008      1.188  +-  0.004 
   6 -- 7         1.432  +-  0.013      1.446  +-  0.015      1.451  +-  0.016      1.445  +-  0.020      1.444  +-  0.012      1.420  +-  0.006 
   7 -- 8         1.510  +-  0.018      1.532  +-  0.021      1.540  +-  0.027      1.514  +-  0.028      1.534  +-  0.019      1.516  +-  0.009 
   8 -- 9         1.352  +-  0.021      1.365  +-  0.024      1.377  +-  0.030      1.348  +-  0.035      1.373  +-  0.022      1.368  +-  0.011 
   9 -- 10        1.044  +-  0.021      1.043  +-  0.024      1.042  +-  0.027      1.064  +-  0.044      1.039  +-  0.021      1.030  +-  0.010 
  10 -- 11        0.734  +-  0.018      0.725  +-  0.021      0.742  +-  0.022      0.692  +-  0.037      0.723  +-  0.017      0.728  +-  0.008 
  11 -- 12        0.520  +-  0.014      0.509  +-  0.016      0.526  +-  0.018      0.479  +-  0.019      0.516  +-  0.013      0.553  +-  0.007 
  12 -- 13        0.423  +-  0.011      0.419  +-  0.012      0.415  +-  0.014      0.433  +-  0.018      0.423  +-  0.011      0.450  +-  0.006 
  13 -- 14        0.375  +-  0.010      0.378  +-  0.012      0.365  +-  0.013      0.385  +-  0.018      0.373  +-  0.010      0.374  +-  0.005 
  14 -- 15        0.313  +-  0.010      0.312  +-  0.012      0.320  +-  0.013      0.304  +-  0.019      0.314  +-  0.010      0.313  +-  0.004 
  15 -- 16        0.265  +-  0.010      0.262  +-  0.011      0.248  +-  0.013      0.280  +-  0.016      0.265  +-  0.010      0.261  +-  0.004 
  16 -- 17        0.237  +-  0.009      0.240  +-  0.010      0.232  +-  0.011      0.247  +-  0.018      0.236  +-  0.009      0.218  +-  0.003 
  17 -- 18        0.197  +-  0.009      0.197  +-  0.010      0.200  +-  0.011      0.189  +-  0.014      0.197  +-  0.009      0.181  +-  0.003 
  18 -- 19        0.152  +-  0.008      0.150  +-  0.009      0.155  +-  0.011      0.144  +-  0.011      0.151  +-  0.008      0.149  +-  0.002 
  19 -- 20        0.123  +-  0.006      0.121  +-  0.007      0.118  +-  0.009      0.124  +-  0.008      0.124  +-  0.006      0.121  +-  0.002 
  20 -- 21       -0.135  +-  0.006     -0.134  +-  0.006     -0.139  +-  0.010     -0.131  +-  0.007     -0.133  +-  0.005     -0.129  +-  0.003 
  21 -- 22       -0.377  +-  0.005     -0.379  +-  0.006     -0.380  +-  0.007     -0.371  +-  0.008     -0.377  +-  0.005     -0.379  +-  0.003 
  22 -- 23       -0.625  +-  0.006     -0.618  +-  0.007     -0.636  +-  0.008     -0.614  +-  0.008     -0.623  +-  0.006     -0.644  +-  0.003 
  23 -- 24       -0.934  +-  0.008     -0.927  +-  0.009     -0.893  +-  0.009     -0.958  +-  0.013     -0.918  +-  0.007     -0.939  +-  0.004 
  24 -- 25       -1.313  +-  0.010     -1.313  +-  0.011     -1.302  +-  0.016     -1.311  +-  0.013     -1.307  +-  0.010     -1.288  +-  0.005 
  25 -- 26       -1.742  +-  0.012     -1.723  +-  0.015     -1.731  +-  0.016     -1.728  +-  0.019     -1.734  +-  0.012     -1.717  +-  0.006 
  26 -- 27       -1.471  +-  0.011     -1.469  +-  0.013     -1.442  +-  0.014     -1.482  +-  0.017     -1.464  +-  0.011     -1.437  +-  0.005 
  27 -- 28       -1.755  +-  0.014     -1.759  +-  0.016     -1.778  +-  0.020     -1.736  +-  0.020     -1.755  +-  0.014     -1.715  +-  0.006 
  28 -- 29       -2.032  +-  0.014     -2.022  +-  0.017     -2.047  +-  0.022     -2.003  +-  0.020     -2.027  +-  0.014     -2.029  +-  0.006 
  29 -- 30       -2.381  +-  0.014     -2.381  +-  0.016     -2.360  +-  0.022     -2.408  +-  0.019     -2.381  +-  0.014     -2.377  +-  0.007 
  30 -- 31       -2.752  +-  0.013     -2.752  +-  0.015     -2.773  +-  0.020     -2.724  +-  0.017     -2.746  +-  0.013     -2.750  +-  0.007 
  31 -- 32       -3.136  +-  0.012     -3.129  +-  0.014     -3.121  +-  0.018     -3.143  +-  0.018     -3.132  +-  0.012     -3.144  +-  0.008 
  32 -- 33       -3.569  +-  0.015     -3.571  +-  0.017     -3.578  +-  0.020     -3.572  +-  0.024     -3.567  +-  0.015     -3.561  +-  0.008 
  33 -- 34       -4.009  +-  0.016     -4.005  +-  0.019     -4.007  +-  0.026     -4.004  +-  0.022     -4.006  +-  0.016     -4.003  +-  0.009 
  34 -- 35       -4.479  +-  0.016     -4.481  +-  0.018     -4.476  +-  0.024     -4.480  +-  0.023     -4.476  +-  0.016     -4.474  +-  0.010 
  35 -- 36       -4.956  +-  0.017     -4.953  +-  0.019     -4.986  +-  0.026     -4.936  +-  0.026     -4.956  +-  0.017     -4.970  +-  0.011 
  36 -- 37       -5.475  +-  0.019     -5.466  +-  0.020     -5.438  +-  0.026     -5.524  +-  0.029     -5.471  +-  0.019     -5.480  +-  0.014 
------------ --------------------- --------------------- --------------------- --------------------- --------------------- --------------------- 
  Coulomb:      -41.141  +-  0.073    -41.082  +-  0.074    -41.087  +-  0.078    -41.124  +-  0.078    -41.073  +-  0.052    -41.037  +-  0.070 
  vdWaals:       12.219  +-  0.075     12.231  +-  0.075     12.249  +-  0.073     12.189  +-  0.091     12.240  +-  0.052     12.203  +-  0.061 
    TOTAL:      -28.923  +-  0.105    -28.851  +-  0.105    -28.838  +-  0.107    -28.936  +-  0.120    -28.833  +-  0.074    -28.834  +-  0.093
...

@harlor
Copy link
Contributor Author

harlor commented Oct 19, 2018

Thanks for putting this together @harlor! I'll be able to take some time this weekend to review, I think. I'll need to step through the changes to make sure I understand what's being done here, because I'm not sure I understand what's wrong with the existing implementation.

I tried to explain this in: #58

@dotsdl
Copy link
Member

dotsdl commented Oct 19, 2018

Understood, will review. The gist as I understand it is that since we are actually summing integrals between <dH/dl>s, not doing some kind of sum of <dH/dl>s, we are missing some terms here. I appreciate the detailed work you've done to suss this out; this is exactly the kind of work we need for alchemlyb to become a solid core to build upon. Thanks again!

@mrshirts
Copy link

since we are actually summing integrals between <dH/dl>s, not doing some kind of sum of <dH/dl>s, we are missing some terms here.

No. Summing integrals between <dH/dl>s can always be written as a weighted sum of <dH/dl>s. There is no need to include correlations (assuming the dH/dl come from different simulations - if expanded ensemble, a bit more complicated).

I think that @harlor's work fixed, this, though (if he ended up matching the alchemical-analysis code) results.

@harlor
Copy link
Contributor Author

harlor commented Oct 20, 2018

No. Summing integrals between <dH/dl>s can always be written as a weighted sum of <dH/dl>s. There is no need to include correlations (assuming the dH/dl come from different simulations - if expanded ensemble, a bit more complicated).

Agree - That's how I derived this code.

@orbeckst
Copy link
Member

@harlor many thanks for your contribution. But would you mind opening an actual issue for the bug and then reference the issue in the PR? This is important for properly keeping track of problems, both for the CHANGELOG and for the broader community – people tend to browse issues for problems, not so much the PRs.

Thanks a lot!

@orbeckst
Copy link
Member

I think that @harlor's work fixed, this, though (if he ended up matching the alchemical-analysis code) results.

@mrshirts then please do a formal review and accept it or request changes. You can always add comments and ping @dotsdl or myself with questions regarding implementation details. But ultimately someone has to take the responsibility for agreeing to take the code. You seem to be in the best position to do so.

@orbeckst orbeckst requested a review from mrshirts October 25, 2018 21:35
Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I leave the technical details to @mrshirts but this needs tests especially when you're actually reporting a bug.

@harlor
Copy link
Contributor Author

harlor commented Oct 26, 2018

I leave the technical details to @mrshirts but this needs tests especially when you're actually reporting a bug.

Ok I can add a test for that. But first it would be nice if the gromacs parser works for different data sets #59

@harlor harlor mentioned this pull request Oct 26, 2018
@davidlmobley
Copy link

@harlor you don't have a variety of different datasets on hand? We don't usually do TI here so I don't have many handy, but OTOH GROMACS usually writes the data so probably we can find some.

Presumably data from any code will serve if the issue is analysis; perhaps @sukanyasasmal or @hannahbaumann have some TI data handy.

@dominikwille
Copy link

@harlor you don't have a variety of different datasets on hand? We don't usually do TI here so I don't have many handy, but OTOH GROMACS usually writes the data so probably we can find some.

Was already discussed what https://github.com/alchemistry/alchemtest should contain to have a reasonable test coverage for different scenarios? I guess that's the right place to collect some data sets.

@davidlmobley
Copy link

Yes, exactly, we should end up with adequate data in alchemtest.

@harlor
Copy link
Contributor Author

harlor commented Nov 22, 2018

The TI tests now also check the uncertainty.

I also compared the values for the water particle datasets with alchemical-analysis using:

alchemical_analysis.py -g -p lambda_ -t 300 -i 0 -u kBT -r 6

@harlor
Copy link
Contributor Author

harlor commented Nov 26, 2018

@mrshirts By skipping the autocorrelation analysis in alchemical-analysis it is also possible to compare the uncertainty results. What I did for the water particle datasets.

@orbeckst The uncertainties are also tested now - the reference values in the tests can also be reproduced by alchemical-analysis

@davidlmobley
Copy link

That sounds excellent, @harlor ! @orbeckst did you see this?

Did this get all the changes in that Oliver wanted?

@orbeckst
Copy link
Member

I take @mrshirts comment #61 (comment)

I think that @harlor's work fixed, this, though (if he ended up matching the alchemical-analysis code) results.

as his approval.

@orbeckst orbeckst merged commit cd70f41 into alchemistry:master Nov 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants