Time convergence #168

Merged
merged 21 commits into from
Oct 22, 2021
Changes from 7 commits
26 changes: 26 additions & 0 deletions docs/postprocessing.rst
@@ -5,6 +5,32 @@ Tools for postprocessing

Tools are available for postprocessing the dataframes.

Time Convergence
----------------
One way of determining the simulation end point is to compute and plot the
forward and backward convergence of the estimate using
:func:`~alchemlyb.postprocessors.forward_backward_convergence` and
:func:`~alchemlyb.visualisation.plot_convergence`. ::

    >>> from alchemtest.gmx import load_benzene
    >>> from alchemlyb.parsing.gmx import extract_u_nk
    >>> from alchemlyb.visualisation import plot_convergence
    >>> from alchemlyb.postprocessors import forward_backward_convergence

    >>> bz = load_benzene().data
    >>> data_list = [extract_u_nk(xvg, T=300) for xvg in bz['Coulomb']]
    >>> df = forward_backward_convergence(data_list, 'mbar')
    >>> ax = plot_convergence(dataframe=df)
    >>> ax.figure.savefig('dF_t.pdf')

This produces a plot like the following:

.. figure:: images/dF_t.png

   A convergence plot showing that the forward and backward estimates have
   fully converged.

.. autofunction:: alchemlyb.postprocessors.forward_backward_convergence
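
The idea behind the forward and backward analysis can be sketched without alchemlyb. The example below is only an illustration: it uses the plain mean as a stand-in for the free energy estimator, and the function name `forward_backward_means` and the toy series are hypothetical. It assumes the series has at least `num` samples.

```python
from statistics import mean


def forward_backward_means(series, num=10):
    """Toy forward/backward analysis using the mean as a stand-in estimator.

    Returns one (fraction, forward, backward) tuple per time point.
    """
    rows = []
    chunk = len(series) // num  # samples added per time point
    for i in range(1, num + 1):
        n = chunk * i
        forward = mean(series[:n])    # first i/num of the data
        backward = mean(series[-n:])  # last i/num of the data
        rows.append((i / num, forward, backward))
    return rows


# A hypothetical series that starts at 3.0 and then settles at 5.0.
series = [3.0] * 20 + [5.0] * 80
rows = forward_backward_means(series, num=5)
for fraction, forward, backward in rows:
    print(f"{fraction:.1f}  forward={forward:.2f}  backward={backward:.2f}")
```

With all of the data included, the forward and backward values coincide by construction, so the informative part of the plot is how quickly the two curves approach that common value.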

Unit Conversion
---------------
10 changes: 7 additions & 3 deletions docs/visualisation/alchemlyb.visualisation.plot_convergence.rst
@@ -5,9 +5,13 @@ Plot the Forward and Backward Convergence

The function :func:`~alchemlyb.visualisation.plot_convergence` allows
the user to visualise the convergence by plotting the free energy change
computed using the equilibrated snapshots between the proper target time frames
in both forward (data points are stored in `forward` and `forward_error`) and
reverse (data points are stored in `backward` and `backward_error`) directions.
computed using the equilibrated snapshots between the proper target time
frames. The data can be provided either as a DataFrame returned by
:func:`alchemlyb.postprocessors.forward_backward_convergence` or explicitly
as separate sequences for the forward (`forward` and `forward_error`) and
backward (`backward` and `backward_error`) directions.

The y axis can be labelled in other units by setting *units*, which defaults
to :math:`kT`. The user can pass a :class:`matplotlib.axes.Axes` into
the function to have the convergence drawn on a specific set of axes.
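
The two ways of supplying data can be pictured with a small sketch. This is not the alchemlyb implementation: `resolve_convergence_inputs` is a hypothetical helper, a plain dictionary stands in for the convergence DataFrame, and raising `ValueError` on incomplete input is an assumption made for the example. The column names are the ones used in this revision of the pull request.

```python
def resolve_convergence_inputs(forward=None, forward_error=None,
                               backward=None, backward_error=None,
                               table=None):
    """Return the four series, preferring `table` when it is given.

    `table` is a plain mapping standing in for the convergence DataFrame.
    """
    if table is not None:
        # Explicit series are ignored when a table is supplied.
        return (list(table['Forward']), list(table['F. Error']),
                list(table['Backward']), list(table['B. Error']))
    explicit = (forward, forward_error, backward, backward_error)
    if any(series is None for series in explicit):
        raise ValueError('Provide either a table or all four series.')
    return tuple(list(series) for series in explicit)


table = {'Forward': [3.02, 3.05], 'F. Error': [0.03, 0.01],
         'Backward': [3.06, 3.05], 'B. Error': [0.03, 0.01]}
from_table = resolve_convergence_inputs(forward=[0.0], table=table)
from_lists = resolve_convergence_inputs([1.0], [0.1], [2.0], [0.2])
```

Giving the table precedence mirrors the documented behaviour that the explicit arguments are ignored when a DataFrame is passed.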
3 changes: 3 additions & 0 deletions src/alchemlyb/postprocessors/__init__.py
@@ -1,3 +1,6 @@
from .convergence import forward_backward_convergence

__all__ = [
'units',
'forward_backward_convergence'
]
107 changes: 107 additions & 0 deletions src/alchemlyb/postprocessors/convergence.py
@@ -0,0 +1,107 @@
import pandas as pd
import logging
import numpy as np

from ..estimators import MBAR, BAR, TI
from .. import concat

def forward_backward_convergence(df_list, estimator='mbar', num=10):
    ''' The forward and backward convergence of the free energy estimate.

    Generate the free energy change as a function of time in both
    directions, with the specified number of points in the time.

    Parameters
    ----------
    df_list : list
        List of DataFrame of either dHdl or u_nk.
    estimator : {'mbar', 'bar', 'ti'}
        Name of the estimators.
    num : int
        The number of time points.

    Returns
    -------
    DataFrame
        The DataFrame with convergence data. ::

                 Forward  F. Error   Backward  B. Error
            0  33.988935  0.334676  35.666128  0.324426
            1  35.075489  0.232150  35.382850  0.230944
            2  34.919988  0.190424  35.156028  0.189489
            3  34.929927  0.165316  35.242255  0.164400
            4  34.957007  0.147852  35.247704  0.147191
            5  35.003660  0.134952  35.214658  0.134458
            6  35.070199  0.124956  35.178422  0.124664
            7  35.019853  0.116970  35.096870  0.116783
            8  35.035123  0.110147  35.225907  0.109742
            9  35.113417  0.104280  35.113417  0.104280

    '''
    logger = logging.getLogger('alchemlyb.postprocessors.'
                               'forward_backward_convergence')
    logger.info('Start convergence analysis.')
    logger.info('Check data availability.')

    if estimator.lower() == 'mbar':
        logger.info('Use MBAR estimator for convergence analysis.')
        estimator_fit = MBAR().fit
    elif estimator.lower() == 'bar':
        logger.info('Use BAR estimator for convergence analysis.')
        estimator_fit = BAR().fit
    elif estimator.lower() == 'ti':
        logger.info('Use TI estimator for convergence analysis.')
        estimator_fit = TI().fit
    else:  # pragma: no cover
        # Fail early: a warning alone would leave estimator_fit undefined.
        raise ValueError(
            '{} is not a valid estimator.'.format(estimator))

    logger.info('Begin forward analysis')
    forward_list = []
    forward_error_list = []
    for i in range(1, num + 1):
        # i / num is a fraction, so scale it by 100 to log a percentage.
        logger.info('Forward analysis: {:.2f}%'.format(100 * i / num))
Review comment (Member):
fraction of data i/num should be in final dataframe

Review comment (Member):
thanks for adding, please see new comments about making it a float

        sample = []
        for data in df_list:
            sample.append(data[:len(data) // num * i])
        sample = concat(sample)
        result = estimator_fit(sample)
        forward_list.append(result.delta_f_.iloc[0, -1])
        if estimator.lower() == 'bar':
            # Sum the adjacent-state errors in quadrature; j avoids
            # shadowing the outer loop index i.
            error = np.sqrt(sum(
                [result.d_delta_f_.iloc[j, j + 1] ** 2
                 for j in range(len(result.d_delta_f_) - 1)]))
            forward_error_list.append(error)
        else:
            forward_error_list.append(result.d_delta_f_.iloc[0, -1])
        logger.info('{:.2f} +/- {:.2f} kT'.format(forward_list[-1],
                                                  forward_error_list[-1]))

    logger.info('Begin backward analysis')
    backward_list = []
    backward_error_list = []
    for i in range(1, num + 1):
        # i / num is a fraction, so scale it by 100 to log a percentage.
        logger.info('Backward analysis: {:.2f}%'.format(100 * i / num))
        sample = []
        for data in df_list:
            # Parenthesise before negating so the slice length matches the
            # forward slice even when len(data) is not a multiple of num.
            sample.append(data[-(len(data) // num * i):])
        sample = concat(sample)
        result = estimator_fit(sample)
        backward_list.append(result.delta_f_.iloc[0, -1])
        if estimator.lower() == 'bar':
            # Sum the adjacent-state errors in quadrature.
            error = np.sqrt(sum(
                [result.d_delta_f_.iloc[j, j + 1] ** 2
                 for j in range(len(result.d_delta_f_) - 1)]))
            backward_error_list.append(error)
        else:
            backward_error_list.append(result.d_delta_f_.iloc[0, -1])
        logger.info('{:.2f} +/- {:.2f} kT'.format(backward_list[-1],
                                                  backward_error_list[-1]))

    convergence = pd.DataFrame(
        {'Forward': forward_list,
         'F. Error': forward_error_list,
         'Backward': backward_list,
         'B. Error': backward_error_list})
    convergence.attrs = df_list[0].attrs
    return convergence
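
Negative floor division is easy to get wrong when slicing from the end of a series, because Python applies the unary minus before `//` and floor division rounds towards negative infinity. The sketch below compares the two ways of writing the start index of a backward slice; `backward_start` is a hypothetical helper written for this illustration only.

```python
def backward_start(n_samples, num, i, parenthesised=True):
    """Return the (negative) start index of the i-th backward slice."""
    if parenthesised:
        return -(n_samples // num * i)
    return -n_samples // num * i  # parsed as ((-n_samples) // num) * i


n_samples, num = 101, 10
for i in (1, 5, 10):
    forward_len = n_samples // num * i
    print(i, forward_len,
          -backward_start(n_samples, num, i),
          -backward_start(n_samples, num, i, parenthesised=False))
```

The two forms agree whenever the number of samples is a multiple of `num`; otherwise the unparenthesised form selects slightly more samples than the corresponding forward slice.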
38 changes: 38 additions & 0 deletions src/alchemlyb/tests/test_convergence.py
@@ -0,0 +1,38 @@
import pytest

from alchemtest.gmx import load_benzene
from alchemlyb.parsing import gmx
from alchemlyb.postprocessors import forward_backward_convergence

@pytest.fixture()
def gmx_benzene():
    dataset = load_benzene()
    return [gmx.extract_dHdl(dhdl, T=300) for dhdl in dataset['data']['Coulomb']], \
           [gmx.extract_u_nk(dhdl, T=300) for dhdl in dataset['data']['Coulomb']]

def test_convergence_ti(gmx_benzene):
    dHdl, u_nk = gmx_benzene
    convergence = forward_backward_convergence(dHdl, 'TI')
    assert convergence.shape == (10, 4)
    assert convergence.iloc[0, 0] == pytest.approx(3.07, 0.01)
    assert convergence.iloc[0, 2] == pytest.approx(3.11, 0.01)
    assert convergence.iloc[-1, 0] == pytest.approx(3.09, 0.01)
    assert convergence.iloc[-1, 2] == pytest.approx(3.09, 0.01)

def test_convergence_mbar(gmx_benzene):
    dHdl, u_nk = gmx_benzene
    convergence = forward_backward_convergence(u_nk, 'MBAR')
    assert convergence.shape == (10, 4)
    assert convergence.iloc[0, 0] == pytest.approx(3.02, 0.01)
    assert convergence.iloc[0, 2] == pytest.approx(3.06, 0.01)
    assert convergence.iloc[-1, 0] == pytest.approx(3.05, 0.01)
    assert convergence.iloc[-1, 2] == pytest.approx(3.04, 0.01)

def test_convergence_bar(gmx_benzene):
    dHdl, u_nk = gmx_benzene
    convergence = forward_backward_convergence(u_nk, 'BAR')
    assert convergence.shape == (10, 4)
    assert convergence.iloc[0, 0] == pytest.approx(3.02, 0.01)
    assert convergence.iloc[0, 2] == pytest.approx(3.06, 0.01)
    assert convergence.iloc[-1, 0] == pytest.approx(3.05, 0.01)
    assert convergence.iloc[-1, 2] == pytest.approx(3.04, 0.01)
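
For the BAR case, `forward_backward_convergence` builds the total error from the errors between neighbouring states rather than reading a single matrix element. A minimal sketch of that combination follows, using a nested list in place of `d_delta_f_`. The helper name and the numbers are hypothetical, marking non-adjacent entries as NaN is an assumption of this example, and summing in quadrature treats the adjacent-state errors as independent.

```python
import math


def adjacent_error_in_quadrature(d_delta_f):
    """Combine the errors between neighbouring states: sqrt(sum(err**2))."""
    n_states = len(d_delta_f)
    return math.sqrt(sum(d_delta_f[j][j + 1] ** 2
                         for j in range(n_states - 1)))


nan = float('nan')  # assumed marker for pairs of states without an error
d_delta_f = [
    [0.0, 0.3, nan],
    [0.3, 0.0, 0.4],
    [nan, 0.4, 0.0],
]
total = adjacent_error_in_quadrature(d_delta_f)
print(f"{total:.3f}")
```

Only the first superdiagonal is read, so the NaN entries never enter the sum.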
9 changes: 9 additions & 0 deletions src/alchemlyb/tests/test_visualisation.py
@@ -12,6 +12,7 @@
from alchemlyb.visualisation.ti_dhdl import plot_ti_dhdl
from alchemlyb.visualisation.dF_state import plot_dF_state
from alchemlyb.visualisation import plot_convergence
from alchemlyb.postprocessors import forward_backward_convergence

def test_plot_mbar_omatrix():
    '''Just test if the plot runs'''
@@ -126,6 +127,14 @@ def test_plot_dF_state():
    assert isinstance(fig, matplotlib.figure.Figure)
    plt.close(fig)

def test_plot_convergence_dataframe():
    bz = load_benzene().data
    data_list = [extract_u_nk(xvg, T=300) for xvg in bz['Coulomb']]
    df = forward_backward_convergence(data_list, 'mbar')
    ax = plot_convergence(dataframe=df)
    assert isinstance(ax, matplotlib.axes.Axes)
    plt.close(ax.figure)

def test_plot_convergence():
    bz = load_benzene().data
    data_list = [extract_u_nk(xvg, T=300) for xvg in bz['Coulomb']]
21 changes: 18 additions & 3 deletions src/alchemlyb/visualisation/convergence.py
@@ -2,7 +2,10 @@
from matplotlib.font_manager import FontProperties as FP
import numpy as np

def plot_convergence(forward, forward_error, backward, backward_error,
from ..postprocessors.units import get_unit_converter

def plot_convergence(forward=None, forward_error=None, backward=None,
                     backward_error=None, dataframe=None,
                     units='kT', ax=None):
"""Plot the forward and backward convergence.

@@ -16,6 +19,11 @@ def plot_convergence(forward, forward_error, backward, backward_error,
        A list of free energy estimates from the last X% of data.
    backward_error : List
        A list of errors from the last X% of data.
    dataframe : Dataframe
        Output Dataframe from
        :func:`~alchemlyb.postprocessors.forward_backward_convergence`. If a
        Dataframe is provided, `forward`, `forward_error`, `backward` and
        `backward_error` will be ignored.
    units : str
        The label for the unit of the estimate. Default: "kT"
    ax : matplotlib.axes.Axes
@@ -32,12 +40,19 @@ def plot_convergence(forward, forward_error, backward, backward_error,
The code is taken and modified from
`Alchemical Analysis <https://github.com/MobleyLab/alchemical-analysis>`_.

The units variable is for labelling only. Changing it doesn't change the
unit of the underlying variable.
If `dataframe` is not provided, the units variable is for labelling only.
Review comment (Member):
If `data` is not an :class:pandas.Dataframe` produced by :func:`.....forward_backward_convergence`  ...

Changing it doesn't change the unit of the underlying variable.


.. versionadded:: 0.4.0
"""
    if dataframe is not None:
        dataframe = get_unit_converter(units)(dataframe)
        forward = dataframe['Forward'].to_numpy()
        forward_error = dataframe['F. Error'].to_numpy()
Review comment (Member):
How could "F. Error" pass the tests, given that you changed the column names?

        backward = dataframe['Backward'].to_numpy()
        backward_error = dataframe['B. Error'].to_numpy()

    if ax is None:  # pragma: no cover
        fig, ax = plt.subplots(figsize=(8, 6))
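
The docstring distinguishes between relabelling the axis and converting the values. To make that difference concrete, the sketch below converts energies from kT to kJ/mol at a given temperature using the molar gas constant. It does not reproduce alchemlyb's unit converter: the helper name, the input values and the explicit temperature argument are assumptions made for this example.

```python
R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def kt_to_kj_per_mol(values_kt, temperature):
    """Convert energies from kT to kJ/mol at the given temperature in K."""
    factor = R_KJ_PER_MOL_K * temperature
    return [value * factor for value in values_kt]


forward_kt = [3.02, 3.05]  # illustrative values in kT
forward_kj = kt_to_kj_per_mol(forward_kt, temperature=300)
print([round(value, 2) for value in forward_kj])
```

Changing only the axis label would leave the plotted numbers at about 3 while calling them kJ/mol, whereas a real conversion at 300 K scales them by roughly 2.49.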
