Covariate Likelihood
The CovariateLikelihood class calculates the log-likelihood of a set of data points (a time series) given a population model. It computes the log-likelihood of each data point under a parameterised distribution assigned to all points, and adds these values together.
A Covariate Likelihood element is made of the following components:
- popmodel: A PopModelODE element, usually a reference to the population model used by the STreeLikelihood component of the PhyDyn analysis.
- data: A string made of rows and columns of data in table format, where the first column must correspond to time and the second column to the covariate value of interest.
- covariate-expression: A mathematical expression used to calculate the covariate value of interest.
- One of the following:
  - covariate-distribution: The parametric distribution used to calculate the log-likelihood of each data point, OR
  - distribution-expression: A mathematical expression describing the log-density function of the covariate distribution, for cases where the distribution is not available.
The Covariate Likelihood is usually used as a prior and in conjunction with PhyDyn's tree likelihood component.
Let's say our PhyDyn analysis defines a population model (PopModelODE) containing a (say, non-deme) variable infections that keeps track of the accumulated number of infections over time. Let's assume that we have seroprevalence information (in percent) for two dates, 2020.412 and 2020.6, with confidence intervals that correspond to a standard deviation of 0.5, and that the total population size is 1500000. The covariate likelihood component that calculates the log-likelihood of the prevalence computed from the trajectories of the population model referenced by ID 'seirmodel', given the two data points, is:
<distribution id="seir.seroprevalencelh.t" spec="phydyn.covariate.CovariateLikelihood"
popmodel='@seirmodel' >
<data>
time,sp
2020.412, 6.3
2020.6, 6.5
</data>
<covariate-expression> (infections / 1500000)*100 </covariate-expression>
<covariate-distribution spec="phydyn.covariate.distribution.Normal" mean="sp" sigma="0.5"/>
</distribution>
Note the following:
- The data element generates 2 data points and, for each row (data point), binds the values of the first and second columns to the variables time and sp, respectively.
- The covariate-expression formula is used to calculate the value of prevalence at any given time point in our trajectory. In our example, the covariate-expression is evaluated twice for each population trajectory, once per data point.
- Each point is assigned the Normal distribution with sigma 0.5 and mean equal to the value of sp entered in the table, e.g. the first point has N(6.3, 0.5).
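Expressions follow the same syntax rules used to write the matrix equations, and should be written in terms of the variables used by the population model. When the required distribution is not available, its log-density can be given through a distribution-expression instead of a covariate-distribution: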
<distribution id="seir.seroprevalencelh.t" spec="phydyn.covariate.CovariateLikelihood"
popmodel='@seirmodel' >
<data>
time,sp
2020.412, 6.3
2020.6, 6.5
</data>
<covariate-expression> (infections / 1500000)*100 </covariate-expression>
<distribution-expression> -(log(sigma*sqrt(2*PI))) - 0.5*(spVal - sp)*(spVal-sp)/(sigma*sigma) </distribution-expression>
</distribution>
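To make the arithmetic concrete, the calculation described above can be sketched in Python. This is only an illustration, not PhyDyn's implementation: the helper names (parse_table, normal_log_density, covariate_log_likelihood) and the infections values are hypothetical, and the sketch assumes that spVal in the distribution-expression stands for the value of the covariate-expression and that sigma is the 0.5 standard deviation stated in the text.

```python
import math

# Table from the example: the header row names the columns (time, sp).
DATA = """
time,sp
2020.412, 6.3
2020.6, 6.5
"""


def parse_table(text):
    """Parse a comma-separated table whose first row holds the column names.

    Hypothetical helper for illustration; not PhyDyn's own parser.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    names = [name.strip() for name in lines[0].split(",")]
    return [
        dict(zip(names, (float(value) for value in ln.split(","))))
        for ln in lines[1:]
    ]


def normal_log_density(x, mean, sigma):
    """Normal log-density, written like the distribution-expression above."""
    return (-math.log(sigma * math.sqrt(2 * math.pi))
            - 0.5 * (x - mean) * (x - mean) / (sigma * sigma))


def covariate_log_likelihood(rows, infections_at, pop_size=1_500_000, sigma=0.5):
    """Sum the log-density of every data point for one trajectory."""
    total = 0.0
    for row in rows:
        # covariate-expression: prevalence in percent at the row's time point
        sp_val = infections_at(row["time"]) / pop_size * 100
        total += normal_log_density(sp_val, row["sp"], sigma)
    return total


# Hypothetical trajectory values (assumed, not taken from any real analysis).
hypothetical_infections = {2020.412: 94_500.0, 2020.6: 97_500.0}

rows = parse_table(DATA)
log_lik = covariate_log_likelihood(rows, hypothetical_infections.get)
print(f"log-likelihood of the two data points: {log_lik:.4f}")
```

With these assumed trajectory values the computed prevalence matches the observed sp at both dates, so each point contributes the maximum of the log-density; any other trajectory would give a lower total.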
Let's now consider the case where the standard deviations of the normal distributions that describe each data point are provided as data, i.e. as a column in our table named sigma, and where the total population size per data point (constant at 1500000 in our case) is provided in a column named popSize. The data element looks like this:
<data>
time,sp, sigma, popSize
2020.412, 6.3, 0.45, 1500000
2020.6, 6.5, 0.6, 1500000
</data>
The corresponding Covariate Likelihood component is written as follows:
<distribution id="seir.seroprevalencelh.t" spec="phydyn.covariate.CovariateLikelihood"
popmodel='@seirmodel' >
<covariate-expression> (infections / popSize)*100 </covariate-expression>
<covariate-distribution spec="phydyn.covariate.distribution.Normal" mean="sp" sigma="sigma"/>
<data>
time,sp, sigma, popSize
2020.412, 6.3, 0.45, 1500000
2020.6, 6.5, 0.6, 1500000
</data>
</distribution>
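The effect of taking sigma and popSize from the table can be sketched as follows. As before, this is a hedged Python illustration rather than PhyDyn code: the function names and the infections values are hypothetical, and the sketch assumes each row's sigma and popSize are bound as variables in the same way as time and sp.

```python
import math

# Rows of the extended table, with every column bound by name.
ROWS = [
    {"time": 2020.412, "sp": 6.3, "sigma": 0.45, "popSize": 1_500_000},
    {"time": 2020.6, "sp": 6.5, "sigma": 0.6, "popSize": 1_500_000},
]


def normal_log_density(x, mean, sigma):
    """Log-density of a Normal(mean, sigma) evaluated at x."""
    return (-math.log(sigma * math.sqrt(2 * math.pi))
            - 0.5 * (x - mean) ** 2 / (sigma * sigma))


def row_log_likelihood(row, infections):
    """Log-likelihood of one data point, using that row's sigma and popSize."""
    # covariate-expression: (infections / popSize) * 100
    sp_val = infections / row["popSize"] * 100
    return normal_log_density(sp_val, row["sp"], row["sigma"])


# Hypothetical accumulated infections at the two time points (assumed values).
hypothetical_infections = [94_500.0, 97_500.0]

per_point = [
    row_log_likelihood(row, infections)
    for row, infections in zip(ROWS, hypothetical_infections)
]
total = sum(per_point)
print(per_point, total)
```

Because each row carries its own standard deviation, the two points are weighted differently: the first point (sigma 0.45) penalises a mismatch between computed and observed prevalence more strongly than the second (sigma 0.6).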