Merge pull request #274 from DavAug/doc-tweaks-2
some doc tweaks
DavAug authored Feb 17, 2024
2 parents 988a15d + 6e4071c commit 18e86d9
Showing 1 changed file with 100 additions and 61 deletions.
161 changes: 100 additions & 61 deletions docs/source/getting_started/fitting_models_to_data.rst
@@ -11,20 +11,24 @@ Fitting mechanistic models to data

In the previous tutorial, :doc:`mechanistic_model`, we have seen how we can
implement and simulate treatment response models in Chi. For example, we can simulate
the time course of drug concentration levels following repeated dose
administrations as follows, using the same
1-compartment PK model as before
the time course of drug concentration levels by 1. implementing a model in SBML;
2. setting the route of administration and the dosing regimen; 3. simulating the
drug concentration for a given set of model parameters.

.. literalinclude:: code/3_fitting_models_1.py
:lines: 112-146

.. raw:: html
:file: images/3_fitting_models_1.html

This ability to simulate treatment responses is pretty cool in its own right,
but, at this point, our model has little to do with
real treatment responses. Assuming that our goal is to describe *real* treatment
responses, we need to somehow connect our model to reality.
This functionality to simulate treatment responses is pretty cool in its own
right! For example, it can help us to study nonlinearities in the treatment response
dynamics and to optimise dosing regimens to target a desired treatment response.

However, at this point, the simulated treatment responses have little to do with
real treatment responses. To describe *real* treatment
responses that we may observe in clinical practice, we need to somehow connect
our model to reality.

The most common approach to relate models to real treatment responses is to
compare the model predictions to measurements. Below, we have prepared an example
@@ -43,7 +47,7 @@ two columns that specify measurement times and dose administration times
(``Observable``, ``Value``, ``Observable unit``), and three columns that specify
dose administrations (``Dose``, ``Duration``, ``Dose unit``).

If we download the file and save it in the same directory as the Python script,
Downloading the file and saving it in the same directory as the Python script,
we can visualise the measurements by executing the below script

.. literalinclude:: code/3_fitting_models_1.py
@@ -52,19 +56,21 @@ we can visualise the measurements by executing the below script
.. raw:: html
:file: images/3_fitting_models_2.html

The figure shows that the treatment response dynamics indicated by the measurements
is similar to the treatment response simulated by the one-compartment PK model above.
The figure shows the drug concentration measurements as blue scatter points.
The treatment response dynamics indicated by the measurements
is similar to the simulated treatment response in the previous code block.
But looking more closely at the magnitude of the values,
it appears that the measured values are much smaller
than the simulated ones. We can therefore conclude that, at this point, our
model does not provide an accurate description of the measured treatment
response.

To find a better description of the treatment response, we have two options:
1. we can try to find parameter values that improves the proximity of the
1. we can try to find parameter values that improve the proximity of the
model output to the measurements; or 2. we can define a new mechanistic model and see
whether this new model is able to describe the measurements better. This tutorial
will be about the former and detail how we can find better model parameters.
will be about the former and will detail how we can find better model parameters
for a given model structure.

Estimating model parameters from data: Background
*************************************************
@@ -99,26 +105,31 @@ For simulation, this distribution can be used to sample measurement values and
imitate the measurement process of real treatment responses, see
Section 1.3 in the :doc:`quick_overview` for an example. For parameter estimation,
the distribution can be used to quantify the likelihood with which the observed
measurements had been generated by our model,
see Section 1.4 in the :doc:`quick_overview`.

To account for measurement noise during the parameter estimation, we therefore
measurements would have been generated by our model,
see Section 1.4 in the :doc:`quick_overview`. To account for measurement noise
during the parameter estimation, we therefore
choose to quantify the closeness between the model output and the measurements
using likelihoods. Formally, the measurement process can be described by
distributions of measurement values of the form :math:`p(y | \psi, t, r)`, where :math:`y`
denotes the measurement value, :math:`\psi` denotes the model parameters,
:math:`t` denotes the time, and :math:`r` denotes the dosing
regimen. The likelihood of a single measurement :math:`((t_1, y_1), r)` is
given by the value of the probability density evaluated at the measurement,
:math:`p(y_1 | \psi, t_1, r)`. This value depends on the chosen set of model
parameters, :math:`\psi`. The model parameters with the maximum likelihood are
using likelihoods.

Formally, we denote the measurement distribution by :math:`p(y | \psi, t, r)`,
where :math:`y` denotes the measurement value, :math:`\psi` denotes the model parameters,
:math:`t` denotes the time point of the measurement, and :math:`r` denotes the administered
dosing regimen. With this measurement distribution in place, we can quantify the
likelihood with which any given set of measurements would have been generated by the
model. For example, the likelihood of a measurement :math:`y_1` at time :math:`t_1`
under dosing regimen :math:`r^*` is defined by the value of the probability density
of the measurement distribution evaluated at the measurement,
:math:`p(y_1 | \psi, t_1, r^*)`. Note that this
likelihood depends on the choice of model parameters, :math:`\psi`. The model
parameters with the maximum likelihood are
the parameter values that most closely describe the measurements.

.. note::
The measurement distribution, :math:`p(y | \psi, t, r)`, is uniquely defined
by the mechanistic model output and the error model. For example for
the 1-compartment model, we may denote the simulated drug concentration
values by :math:`c(\psi, t, r)`, where the drug concentration values, :math:`c`, are
The measurement distribution, :math:`p(y | \psi, t, r)`, is defined
by the mechanistic model output and the error model. To illustrate this, let
us denote the simulated drug concentration values of
the 1-compartment PK model by :math:`c(\psi, t, r)`. By definition of
the model, the drug concentration values, :math:`c`, are
a function of the model parameters, :math:`\psi = (a_0, k_a, k_e, v)`, the time,
:math:`t`, and the dosing regimen, :math:`r`.

@@ -158,7 +169,7 @@ the parameter values that most closely describe the measurements.
the probability density.


Assuming the independence of measurements, the likelihood for a dataset with :math:`n` measurements,
Assuming independence of measurements, the likelihood for a dataset with :math:`n` measurements,
:math:`\mathcal{D}=((t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n), r)`, is given
by the product of the individual likelihoods

@@ -170,14 +181,16 @@ independently and identically distributed according to our model of the
measurement process (this does not have to be the case, and is especially unlikely
to be the case when our model fails to describe the measurement process accurately).
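
To make this concrete, the following is a minimal, self-contained sketch in plain
NumPy (not Chi's API) of a Gaussian error model around a simplified one-compartment
elimination curve; the model structure, parameter values, and noise offsets are all
illustrative assumptions.

```python
import numpy as np

def concentration(t, a0, ke):
    # Simplified 1-compartment model: exponential elimination of an
    # initial bolus amount a0 (illustrative; the tutorial's model also
    # includes an absorption compartment and a dosing regimen).
    return a0 * np.exp(-ke * t)

def log_likelihood(psi, times, measurements, sigma=0.1):
    # Gaussian error model: y ~ N(c(psi, t), sigma^2). Assuming
    # independent measurements, the log-likelihood of the dataset is
    # the sum of the individual log-likelihoods (the log of the product).
    a0, ke = psi
    residuals = measurements - concentration(times, a0, ke)
    return np.sum(
        -0.5 * np.log(2 * np.pi * sigma**2) - residuals**2 / (2 * sigma**2))

times = np.array([1.0, 2.0, 4.0, 8.0])
# Synthetic 'measurements': model output plus small, fixed offsets.
data = concentration(times, a0=10, ke=0.3) + np.array([0.05, -0.02, 0.03, -0.01])

# Parameters close to the data-generating ones are more likely than
# badly wrong ones.
print(log_likelihood((10, 0.3), times, data) >
      log_likelihood((10, 1.0), times, data))  # True
```

In Chi, this construction is handled by the mechanistic and error model classes;
the sketch only illustrates the mathematics.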

While it is easy to numerically maximise the likelihood and
to derive maximum likelihood estimates in Chi, see Section 1.4.1 in the :doc:`quick_overview`,
the main objective of Chi is to support the Bayesian inference of model parameters.
With this likelihood in place, Chi makes it straightforward to run numerical optimisation algorithms in order
to derive the maximum likelihood estimates that best describe the measurements,
see Section 1.4.1 in the :doc:`quick_overview`.

Alternatively, Chi also supports Bayesian inference of the model parameters:
Bayesian inference is conceptually different from maximum likelihood estimation
because it does not seek to find a single set of model parameters that "best"
describe the observed measurements. Instead, Bayesian inference acknowledges
the fact that noisy measurements cannot uniquely identify
the "best" model parameters, and therefore instead focuses on deriving
as it does not seek to find a single set of model parameters that "best"
describes the observed measurements. Instead, Bayesian inference acknowledges
the fact that noisy measurements leave uncertainty about which model parameters
best describe the data, and therefore focuses instead on deriving
a distribution of parameter values which
are all consistent with the observed measurements, see Section 1.4.2 in :doc:`quick_overview`
for a more detailed discussion. This
@@ -187,34 +200,35 @@ distribution is derived from the likelihood using Bayes' rule
.. math::
    p(\psi| \mathcal{D} ) = \frac{p(\mathcal{D}| \psi )\, p(\psi)}{p(\mathcal{D} )},
where :math:`p(\psi)` denotes the prior distribution of the model parameters.
The prior distribution can be used to quantify our belief of likely parameter
values for the model before the parameter estimation.
The prior distribution quantifies our beliefs about feasible or likely parameter
values before the parameter estimation, making it possible to inform the
inference results with knowledge that is otherwise hard to quantify.
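
In log form, Bayes' rule turns the product of likelihood and prior into a sum, which
is how it is typically implemented. Below is a hypothetical one-parameter sketch in
plain NumPy (not Chi's implementation; the model, prior, and data are made up).

```python
import numpy as np

def log_prior(psi):
    # Standard-normal prior on the single model parameter.
    return -0.5 * np.log(2 * np.pi) - 0.5 * psi ** 2

def log_likelihood(psi, data):
    # Gaussian error model around a constant model output psi.
    return np.sum(-0.5 * np.log(2 * np.pi) - 0.5 * (data - psi) ** 2)

def unnormalised_log_posterior(psi, data):
    # log p(psi|D) = log p(D|psi) + log p(psi) - log p(D). The last term
    # does not depend on psi, so it can be dropped when comparing
    # parameter values.
    return log_likelihood(psi, data) + log_prior(psi)

data = np.array([1.2, 0.8, 1.1])
# The prior pulls the most probable value below the sample mean: for this
# conjugate Gaussian case the posterior mode is n * mean / (n + 1).
mode = len(data) * data.mean() / (len(data) + 1)
```

The prior's influence shrinks as more data are collected, since the likelihood
contributes one term per measurement while the prior contributes only one.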

Defining the log-posterior
**************************

In Chi, you can estimate model parameters from measurements by inferring posterior
distributions of parameter values using Markov chain
Monte Carlo (MCMC) algorithms. In this tutorial, we will use MCMC algorithms
implemented in the open source Python package
Pints_, but in principle you can also use implementations from other libraries.
In Chi, we can derive posterior distributions from measurements
using Markov chain
Monte Carlo (MCMC) algorithms implemented in the open source Python package
Pints_.
In Sections 1.4.1 and 1.4.2 in the :doc:`quick_overview`,
we showed in some detail how we can define (log-)posterior distributions,
:class:`chi.LogPosterior`, in Chi for this purpose. Here, we want to show
how we can use the :class:`chi.ProblemModellingController` to more easily
construct the :class:`chi.LogPosterior`, provided the measurements are available
in a CSV format of the form of Dataset_1_.
:class:`chi.LogPosterior`, for this purpose. Here, we want to show
how we can use the :class:`chi.ProblemModellingController` to simplify and automate
this process as much as possible.

The tricky bit when implementing log-posteriors for treatment response models
is often that those log-posteriors do not only depend on the treatment
response measurements, :math:`((t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n))`,
but that they also depend on the administered dosing regimen, :math:`r`.
is often that log-posteriors do not only
depend on the treatment response measurements, :math:`((t_1, y_1), (t_2, y_2), \ldots, (t_n, y_n))`,
but also on the administered dosing regimen, :math:`r`.
This can make it tedious to define log-posteriors,
especially when you are inferring parameters across measurements of multiple
individuals with different dosing regimens.
especially when parameters across measurements of multiple
individuals with different dosing regimens are inferred, as dosing regimens of the
model will have to be manually set to the dosing regimens administered during the measurement
process. A mismatch between the dosing regimens would render the inference results invalid.

To simplify the process of constructing a :class:`LogPosterior`, we have
implemented the :class:`chi.ProblemModellingController`.
To eliminate this source of error, we have implemented the :class:`chi.ProblemModellingController`.
The :class:`chi.ProblemModellingController` facilitates the construction of
log-posteriors and reduces the workflow to a simple 4-step approach:

@@ -254,7 +268,7 @@ For the remaining 4 parameters, only positive values make biological sense, so
we choose prior distributions that focus on positive values. For
two model parameters, the volume of distribution and the scale parameter,
negative or zero values are particularly bad as they will break the simulation
(a volume of zero causes a division by zero error, and the lognormal distribution
(1. a volume of zero causes a division by zero error; and 2. the lognormal distribution
is only defined for positive sigma). We therefore use ``pints.LogNormalLogPrior``
to constrain those parameters to strictly positive values.
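
To illustrate why such a prior helps, here is a minimal lognormal log-prior written
from scratch (a sketch of the idea behind ``pints.LogNormalLogPrior``, not its
actual implementation): non-positive parameter values receive a log-density of
:math:`-\infty`, i.e. zero probability, so they can never dominate the posterior.

```python
import numpy as np

def lognormal_log_prior(psi, mu=0.0, sigma=1.0):
    # Log-pdf of a lognormal distribution. Non-positive values get -inf,
    # which keeps optimisers and samplers away from parameter values
    # that would break the simulation.
    psi = float(psi)
    if psi <= 0:
        return -np.inf
    return (
        -np.log(psi * sigma * np.sqrt(2 * np.pi))
        - (np.log(psi) - mu) ** 2 / (2 * sigma ** 2))

print(lognormal_log_prior(1.0))   # finite
print(lognormal_log_prior(-0.5))  # -inf
```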

@@ -274,12 +288,37 @@ Inferring the posterior distribution
************************************

With this :class:`chi.LogPosterior` in place, we can infer the posterior
distribution using any MCMC algorithm of our choice. Recall that MCMC
algorithms infer distributions by sampling from them.
If we sample sufficiently many samples,
the histogram over the samples converges to the posterior
distribution.
To illustrate this, we run an MCMC algorithm to infer the above
distribution using any MCMC algorithm of our choice. Recall that inference, in this context,
means the reconstruction of the posterior distribution from the
:class:`chi.LogPosterior`, which defines the log-pdf of the posterior distribution
up to an unknown constant shift

.. math::
    \log p(\psi| \mathcal{D} ) = \log p(\mathcal{D}| \psi ) + \log p(\psi) + \text{constant}.

By comparison to the above Bayes' rule, you will find that :math:`\text{constant} = -\log p(\mathcal{D})`,
so the constant shift may seem not so 'unknown' after all. However, *unknown* here
does not mean that we cannot write down an expression for it; rather, it refers
to the fact that, for most treatment response models,
:math:`p(\mathcal{D})` is (practically) impossible to evaluate, as
evaluating :math:`p(\mathcal{D})` requires the numerical integration of the
likelihood-prior product over the full parameter space,
:math:`p(\mathcal{D}) = \int \mathrm{d} \psi \, p(\mathcal{D}, \psi ) = \int \mathrm{d} \psi \, p(\mathcal{D}| \psi )\, p(\psi)`.
This renders the value of the constant shift for all intents and purposes unknown.

The unknown shift makes it impossible to make statements about the absolute probability
of parameter values. However, it does allow for relative comparisons of
probabilities -- a fact exploited by MCMC algorithms to circumvent the limitation
of the partially known log-posterior. MCMC algorithms use
the relative comparison of parameter probabilities to generate random samples from the
posterior distribution, opening a gateway to reconstruct the distribution. The
more random samples are generated, the closer the histogram over the samples will
approximate the posterior distribution. In fact, one can show that the histogram
will converge to the posterior distribution as the number of samples approaches
infinity. This makes it possible for MCMC algorithms
to reconstruct any posterior distribution from a :class:`chi.LogPosterior`.
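
The argument above can be sketched in a few lines. The random-walk Metropolis
sampler below (a deliberately bare-bones stand-in for the Pints algorithms, with a
hypothetical standard-normal target) only ever evaluates *differences* of the
log-posterior, so the unknown constant shift cancels.

```python
import numpy as np

def log_posterior(psi):
    # Hypothetical 1D log-posterior, known only up to a constant shift:
    # here log p(psi|D) = -psi^2 / 2 + constant, i.e. a standard normal.
    return -0.5 * psi ** 2

def metropolis(log_posterior, n_samples, start=0.0, step=1.0, seed=42):
    # Random-walk Metropolis: proposals are accepted with probability
    # min(1, p(proposal|D) / p(current|D)); only the log-posterior
    # DIFFERENCE is needed, so the normalising constant cancels.
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    current = start
    current_lp = log_posterior(current)
    for i in range(n_samples):
        proposal = current + step * rng.standard_normal()
        proposal_lp = log_posterior(proposal)
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples[i] = current
    return samples

samples = metropolis(log_posterior, n_samples=20000)
# The histogram of the samples approximates the target distribution.
```

Production MCMC algorithms, such as those implemented in Pints, refine this basic
scheme with adaptive step sizes and more efficient proposal distributions.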

To illustrate this, let us run an MCMC algorithm to infer the above
defined posterior distribution of the 1-compartment PK
model.

