It is desirable to simulate the minimum amount of time necessary to reach an acceptable amount of uncertainty in the quantity of interest.
Welcome to the kim-convergence module!
The kim-convergence package is designed to help with automatic equilibration detection and run-length control.
PLEASE NOTE:
the kim-convergence code is under active development and is still in beta
(version 0.0.2). In general, changes to the patch version (the third number)
indicate backward-compatible beta releases, but please be aware that file
formats and APIs may change.
Bug reports are also welcome in the GitHub issues!
!WORK IN PROGRESS!
You need Python 3.7 or later to run kim-convergence. You can have multiple
Python versions (2.x and 3.x) installed on the same system without problems.
To install Python 3 for different Linux flavors, macOS and Windows, packages
are available at
https://www.python.org/getit/
pip is the most popular tool for installing Python packages, and the one included with modern versions of Python.
kim-convergence can be installed with pip:
pip install kim-convergence
NOTE:
Depending on your Python installation, you may need to use pip3 instead of pip.
pip3 install kim-convergence
Depending on your configuration, you may have to run pip like this:
python3 -m pip install kim-convergence
pip can also install the package directly from the git repository:
pip install git+https://github.com/openkim/kim-convergence.git
For more information and examples, see the pip install reference.
conda is the package management tool for Anaconda Python installations.
Installing kim-convergence from the conda-forge channel can be achieved by
adding conda-forge to your channels with:
conda config --add channels conda-forge
conda config --set channel_priority strict
Once the conda-forge channel has been enabled, kim-convergence can be
installed with:
conda install kim-convergence
It is possible to list all of the versions of kim-convergence available on
your platform with:
conda search kim-convergence --channel conda-forge
Basic usage involves importing kim-convergence and using the utility to control the length of the time series data from a simulation run, a sampling approach, or a dump file from a previous simulation.
The main requirement is a get_trajectory function. get_trajectory is a
callback function with the signature
get_trajectory(nstep: int) -> 1darray
if there is only one variable, or
get_trajectory(nstep: int) -> 2darray
if there are several, in which case the returned array has the shape
(number_of_variables, nstep).
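For example,
import numpy as np

rng = np.random.RandomState(12345)
stop = 0  # number of steps handed out so far

def get_trajectory(step: int) -> np.ndarray:
    global stop
    start = stop
    # Never hand out more than 100000 steps in total.
    if 100000 < start + step:
        step = 100000 - start
    stop += step
    # Constant signal of 10 plus uniform noise in [-0.5, 0.5).
    data = np.ones(step) * 10 + (rng.random_sample(step) - 0.5)
    return data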
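As a quick sanity check of the callback contract described above, the following sketch requests data in two chunks and verifies the shape of what comes back. The helper name check_trajectory_shape is hypothetical and not part of kim-convergence; it only mirrors the shape rule stated in this README.

```python
import numpy as np

rng = np.random.RandomState(12345)
stop = 0
MAXIMUM_STEPS = 100000  # same cap as in the example above


def get_trajectory(step: int) -> np.ndarray:
    """Return the next `step` samples, never exceeding MAXIMUM_STEPS in total."""
    global stop
    step = min(step, MAXIMUM_STEPS - stop)
    stop += step
    return np.ones(step) * 10 + (rng.random_sample(step) - 0.5)


def check_trajectory_shape(data: np.ndarray, number_of_variables: int) -> int:
    """Hypothetical helper: validate the shape rule and return the step count."""
    data = np.asarray(data)
    if number_of_variables == 1 and data.ndim == 1:
        return data.shape[0]
    if data.ndim == 2 and data.shape[0] == number_of_variables:
        return data.shape[1]
    raise ValueError(f"unexpected trajectory shape {data.shape}")


first = get_trajectory(1000)
n_first = check_trajectory_shape(first, number_of_variables=1)
rest = get_trajectory(200000)  # asks for more than is left under the cap
n_rest = check_trajectory_shape(rest, number_of_variables=1)
print(n_first, n_rest)
```

The second request is silently shortened to whatever is left under the cap, which is how the example callback signals that no more data are available.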
NOTE:
To pass extra arguments to the get_trajectory function, one can use
the other supported signature,
get_trajectory(nstep: int, args: dict) -> 1darray
or
get_trajectory(nstep: int, args: dict) -> 2darray
where all the extra required parameters and arguments are provided through args.
import numpy as np

rng = np.random.RandomState(12345)
# State and limits travel in a dict instead of a global variable.
args = {'stop': 0, 'maximum_steps': 100000}

def get_trajectory(step: int, args: dict) -> np.ndarray:
    start = args['stop']
    # Never hand out more than maximum_steps in total.
    if args['maximum_steps'] < start + step:
        step = args['maximum_steps'] - start
    args['stop'] += step
    data = np.ones(step) * 10 + (rng.random_sample(step) - 0.5)
    return data
Then call the run_length_control function as below,
import kim_convergence as cr

msg = cr.run_length_control(
    get_trajectory=get_trajectory,
    number_of_variables=1,
    initial_run_length=1000,
    maximum_run_length=100000,
    relative_accuracy=0.01,
    fp_format='json'
)
or
import kim_convergence as cr

msg = cr.run_length_control(
    get_trajectory=get_trajectory,
    get_trajectory_args=args,
    number_of_variables=1,
    initial_run_length=1000,
    maximum_run_length=100000,
    relative_accuracy=0.01,
    fp_format='json'
)
An estimate produced by a simulation typically has an accuracy requirement, and this requirement is an input to the utility. It means that the experimenter wishes to run the simulation only until the estimate meets the requirement. Running the simulation for less than this length would not provide the information needed, while running it longer would be a waste of computing time. In the above example, the accuracy requirement is specified as the relative accuracy.
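With more than one variable, get_trajectory returns a 2D array,
import numpy as np

rng = np.random.RandomState(12345)
stop = 0  # number of steps handed out so far

def get_trajectory(step: int) -> np.ndarray:
    global stop
    start = stop
    # Never hand out more than 100000 steps in total.
    if 100000 < start + step:
        step = 100000 - start
    stop += step
    # Three variables, shape (number_of_variables, step).
    data = np.ones((3, step)) * 10 + (rng.random_sample(3 * step).reshape(3, step) - 0.5)
    return data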
Then call the run_length_control function as below,
import kim_convergence as cr

msg = cr.run_length_control(
    get_trajectory=get_trajectory,
    number_of_variables=3,
    initial_run_length=1000,
    maximum_run_length=100000,
    relative_accuracy=0.01,
    fp_format='json'
)
NOTE:
All the values returned from this get_trajectory function should be finite;
otherwise the code will stop with an error message explaining the issue:
ERROR(@_get_trajectory): there is/are value/s in the input which is/are non-finite or not number.
Thus, one should remove infinite or Not a Number (NaN) values from the
returned array within the get_trajectory function.
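One way to follow this advice is sketched below with plain NumPy. The helper drop_non_finite is a hypothetical name, not part of kim-convergence. For 2D data it drops a whole time step if any variable is non-finite, so that the variables stay aligned. Whether the library accepts fewer samples than it requested is not stated here, so treat that as an assumption and check it for your use case.

```python
import numpy as np


def drop_non_finite(data: np.ndarray) -> np.ndarray:
    """Hypothetical helper: keep only time steps where every value is finite."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        return data[np.isfinite(data)]
    # 2D case, shape (number_of_variables, nstep): drop whole columns so that
    # all variables stay aligned in time.
    keep = np.isfinite(data).all(axis=0)
    return data[:, keep]


raw_1d = np.array([10.1, np.nan, 9.9, np.inf, 10.0])
raw_2d = np.array([[1.0, 2.0, np.nan], [4.0, -np.inf, 6.0]])
clean_1d = drop_non_finite(raw_1d)
clean_2d = drop_non_finite(raw_2d)
print(clean_1d, clean_2d.shape)
```

Calling such a helper on the array just before the return statement of get_trajectory keeps the filtering inside the callback, as the note suggests.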
The run-length control procedure uses the initial_run_length parameter. It
begins at time 0 and calls the get_trajectory function with the provided
number of steps (e.g. initial_run_length=1000). At this point, and with no
assumptions about the distribution of the observable of interest, it tries to
estimate an equilibration time. If it fails to find the transition point, it
requests more data by calling the get_trajectory function again, until it
finds the equilibration time or hits the maximum run length limit
(e.g. maximum_run_length=100000).
After finding an optimal equilibration time, the confidence interval (CI)
generation method is applied to the set of available data points. If the
resulting confidence interval meets the provided accuracy value
(e.g. relative_accuracy=0.01), the simulation is terminated. If not, the
simulation is continued by requesting more data through repeated calls to the
get_trajectory function. This procedure continues until the criterion is met
or the maximum run length limit is reached.
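The extend-until-accurate loop described above can be illustrated with a deliberately simplified sketch. This is not the kim-convergence implementation: it skips equilibration detection entirely, assumes independent samples, and uses a normal-approximation confidence interval with z = 1.96. The names relative_half_width and simple_run_length_control are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(12345)


def get_trajectory(nstep: int) -> np.ndarray:
    """Toy data source: uniform noise around 10, already equilibrated."""
    return 10.0 + (rng.random(nstep) - 0.5)


def relative_half_width(data: np.ndarray, z: float = 1.96) -> float:
    """Normal-approximation CI half-width divided by the sample mean.

    Assumes independent samples; real simulation data are usually correlated.
    """
    half_width = z * data.std(ddof=1) / np.sqrt(data.size)
    return half_width / abs(data.mean())


def simple_run_length_control(initial_run_length, maximum_run_length,
                              relative_accuracy):
    """Extend the run until the accuracy target or the length cap is reached."""
    data = get_trajectory(initial_run_length)
    while relative_half_width(data) > relative_accuracy:
        remaining = maximum_run_length - data.size
        if remaining <= 0:
            return data, False  # hit the cap without converging
        extra = get_trajectory(min(initial_run_length, remaining))
        data = np.concatenate([data, extra])
    return data, True


data, converged = simple_run_length_control(100, 100000, 0.001)
print(data.size, converged)
```

A tighter relative_accuracy forces more calls to get_trajectory, while maximum_run_length bounds the total cost when the target cannot be reached.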
The relative_accuracy, as mentioned above, is the relative precision. It is
defined as the half-width of the estimator's confidence interval, or an
approximated upper confidence limit (UCL), divided by the computed sample mean.
The UCL is calculated as a confidence_coefficient% confidence interval for
the mean, using the portion of the time series data that is in the stationary
region. If the ratio is bigger than relative_accuracy, the length of the time
series is deemed not long enough to estimate the mean with sufficient accuracy,
which means the run should be extended.
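Written as a formula, the stopping test reads as follows. The symbols are introduced here for illustration only and are assumptions, not notation from the package: h is the half-width (or UCL), \bar{x} is the sample mean over the stationary region, and \epsilon stands for relative_accuracy. The absolute value is also an assumption, added so the ratio stays positive for a negative mean.

```latex
\[
  \frac{h}{\lvert \bar{x} \rvert} \le \epsilon ,
  \qquad \text{e.g.} \quad
  \bar{x} = 10,\; h = 0.05
  \;\Longrightarrow\;
  \frac{0.05}{10} = 0.005 \le 0.01 .
\]
```

In this made-up example the run would stop, because the relative half-width of 0.005 is below the requested 0.01.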
The accuracy parameter relative_accuracy specifies the maximum relative error
that will be allowed in the mean value of the data point series. In other words,
it bounds the distance from the confidence limit(s) to the mean (also known as
the precision, half-width, or margin of error) relative to the mean. A value of
0.01 is usually used to request two digits of accuracy, and so forth.
The parameter confidence_coefficient is the confidence coefficient; the value
0.95 is often used. The confidence coefficient can be interpreted as follows:
if thousands of samples of n items are drawn from a population using simple
random sampling and a confidence interval is calculated for each sample, the
proportion of those intervals that will include the true population mean is
confidence_coefficient.
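This interpretation can be checked numerically. The sketch below is independent of kim-convergence and uses only NumPy: it draws many samples from a normal population with a known mean, builds a normal-approximation 95% interval for each, and counts how often the interval contains the true mean. The population parameters and sample sizes are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2021)

true_mean = 10.0
n_items = 100      # items per sample
n_samples = 5000   # number of repeated samples
z = 1.96           # normal quantile for a 0.95 confidence coefficient

# Each row is one sample of n_items drawn from the same population.
samples = rng.normal(loc=true_mean, scale=2.0, size=(n_samples, n_items))
means = samples.mean(axis=1)
half_widths = z * samples.std(axis=1, ddof=1) / np.sqrt(n_items)

# An interval covers the truth when the true mean lies within mean +/- half-width.
covered = np.abs(means - true_mean) <= half_widths
coverage = covered.mean()
print(f"fraction of intervals containing the true mean: {coverage:.3f}")
```

The printed fraction should land close to 0.95, with small deviations from the finite number of samples and from using the normal quantile rather than the Student's t quantile.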
If something is not working as you think it should or would like it to, please get in touch with us! Further, if you have an algorithm or an idea that you would like to try with kim-convergence, please get in touch; we would be glad to help!
Contributions are very welcome.
Copyright (c) 2021, Regents of the University of Minnesota.
All Rights Reserved
Contributors:
Yaser Afshar