Merge pull request #179 from LCBC-UiO/dev
Getting close to rOpenSci submission
osorensen authored Oct 20, 2023
2 parents 6be5bed + 2b585ae commit 1e6deab
Showing 51 changed files with 1,625 additions and 319 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/R-CMD-check.yaml
@@ -26,7 +26,7 @@ jobs:
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
R_KEEP_PKG_SOURCE: yes
-MYPKG_EXTENDED_TESTS: ${{contains(github.event.head_commit.message,
+GALAMM_EXTENDED_TESTS: ${{contains(github.event.head_commit.message,
'run-extended')}}

steps:
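The renamed variable above is set from a GitHub Actions boolean expression, which is typically rendered as the string "true" or "false" in the job environment. A minimal sketch of how R test code might read such a flag follows; the helper name `extended_tests_enabled` is hypothetical, and how galamm's own tests consume the variable is an assumption not shown in this diff.

```r
# Hypothetical helper (not part of galamm): returns TRUE only when the
# workflow has set GALAMM_EXTENDED_TESTS to the string "true".
extended_tests_enabled <- function() {
  identical(tolower(Sys.getenv("GALAMM_EXTENDED_TESTS")), "true")
}

# Simulate a commit whose message contains 'run-extended'.
Sys.setenv(GALAMM_EXTENDED_TESTS = "true")
enabled_when_true <- extended_tests_enabled()

# Simulate a local run where the variable is not set at all;
# Sys.getenv() then returns an empty string.
Sys.unsetenv("GALAMM_EXTENDED_TESTS")
enabled_when_unset <- extended_tests_enabled()
```

A slow test could then begin with a guard such as `if (!extended_tests_enabled()) return(invisible())`, so that it only runs when the commit message requests it.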
1 change: 1 addition & 0 deletions NAMESPACE
@@ -47,6 +47,7 @@ importFrom(stats,coef)
importFrom(stats,deviance)
importFrom(stats,family)
importFrom(stats,fitted)
importFrom(stats,formula)
importFrom(stats,gaussian)
importFrom(stats,logLik)
importFrom(stats,nobs)
229 changes: 229 additions & 0 deletions R/RcppExports.R
@@ -1,6 +1,235 @@
# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

#' Evaluate the deviance at given values of random effects
#'
#'
#' @param parlist An object of class \code{parameters<T>} containing the
#' parameters at which to evaluate the marginal log-likelihood.
#' @param datlist An object of class \code{data<T>} containing the data with
#' which to evaluate the marginal log-likelihood.
#' @param lp Vector with linear predictor values, with arguments of same
#' type as the template \code{T}.
#' @param modvec Reference to a vector of pointers to objects of class
#' \code{Model<T>}, containing the necessary functions specific to the
#' exponential families used in the model.
#' @param solver A solver for sparse linear systems of type
#' \code{Eigen::SimplicialLDLT<Eigen::SparseMatrix<T> >}.
#' @param phi Vector of dispersion parameters, one for each model family.
#' @return Model deviance, after integrating out the random effects. This
#' corresponds to \eqn{-2} times the marginal loglikelihood.
#' @noRd
NULL

#' @title Evaluate marginal log-likelihood
#'
#' @description
#' Implements penalized iteratively reweighted least squares for finding
#' conditional modes of random effects, and returns the resulting marginal
#' log-likelihood. The template \code{T} will typically be one of
#' \code{double}, \code{autodiff::dual1st}, or \code{autodiff::dual2nd}.
#'
#'
#' @param parlist An object of class \code{parameters<T>} containing the
#' parameters at which to evaluate the marginal log-likelihood.
#' @param datlist An object of class \code{data<T>} containing the data with
#' which to evaluate the marginal log-likelihood.
#' @param modvec Reference to a vector of pointers to objects of class
#' \code{Model<T>}, containing the necessary functions specific to the
#' exponential families used in the model.
#' @return An object of class \code{logLikObject<T>}. See its definition for
#' details.
#'
#' @noRd
NULL

#' @title Set up parameter and model family
#'
#' @description
#' Templated wrapper function which sets up the necessary parameters to
#' evaluate the marginal likelihood. The template type \code{T} will typically
#' be one of \code{double}, \code{autodiff::dual1st}, and
#' \code{autodiff::dual2nd}.
#'
#'
#' @param y Double precision vector of response values.
#' @param trials Double precision vector with number of trials. When trials
#' are not applicable, e.g., with Gaussian or Poisson responses, this should
#' be a vector of ones.
#' @param X Fixed effect model matrix.
#' @param Zt Transpose of random effect model matrix.
#' @param Lambdat Lower Cholesky factor of random effect covariance matrix.
#' @param beta Double precision vector of fixed effects.
#' @param theta Double precision vector with the unique elements of
#' \code{Lambdat}.
#' @param theta_mapping Integer vector mapping elements of \code{theta} to the
#' positions in \code{Lambdat}.
#' @param u_init Double precision vector with initial values of random
#' effects. These random effects should be standardized.
#' @param lambda Double precision vector of factor loadings.
#' @param lambda_mapping_X Integer vector mapping elements of
#' \code{lambda} to elements of \code{X}, in row-major order.
#' @param lambda_mapping_Zt List of integer vectors mapping elements of
#' \code{lambda} to non-zero elements of \code{Zt} assuming compressed
#' sparse column format is used. If \code{lambda_mapping_Zt_covs} is of
#' length zero, then each list element in \code{lambda_mapping_Zt} should be
#' of length one, and it will then be multiplied by the corresponding element
#' of \code{Zt}.
#' @param lambda_mapping_Zt_covs List of double precision vectors. Must either
#' be of length zero, or the same length as \code{lambda_mapping_Zt}.
#' Each list element contains potential covariates that the elements of
#' \code{lambda_mapping_Zt} should be multiplied with. If the list is of
#' length 0, all elements of \code{lambda_mapping_Zt} are implicitly
#' multiplied by 1.
#' @param weights Double precision vector of weights, used in heteroscedastic
#' models.
#' @param weights_mapping Integer vector mapping the elements of \code{weights}
#' to the rows of \code{X}.
#' @param family Vector of strings defining the family or families. Each
#' vector element must currently be one of \code{"gaussian"},
#' \code{"binomial"}, or \code{"poisson"}.
#' @param family_mapping Integer vector mapping elements of \code{family} to
#' the rows of \code{X}.
#' @param k Double precision vector with pre-computed constant term in the
#' log-likelihood for each element in \code{family}.
#' @param maxit_conditional_modes Integer specifying the maximum number of
#' iterations in the penalized iteratively reweighted least squares algorithm
#' used to find the conditional modes of the random effects.
#' @param lossvalue_tol Double precision scalar specifying the absolute
#' convergence criterion for the penalized iteratively reweighted least
#' squares algorithm used to find the conditional modes of the random
#' effects.
#' @param reduced_hessian Boolean specifying whether the Hessian matrix of
#' second derivatives should be computed only with respect to \code{beta}
#' and \code{lambda}, in that order. This may be useful for getting a very
#' rough estimate of the inverse covariance matrix, when the full Hessian is
#' not positive definite.
#'
#' @return An \code{Rcpp::List} with the following elements. The element
#' \code{logLik} will always be present, while the others are present or not
#' depending on the template type \code{T}.
#' * \code{logLik} Laplace approximate marginal log-likelihood at the
#' parameter values specified.
#' * \code{g} If \code{T} is \code{autodiff::dual1st} or
#' \code{autodiff::dual2nd}, the gradient is provided in this element as
#' a double precision vector.
#' * \code{H} If \code{T} is \code{autodiff::dual2nd}, the Hessian matrix
#' is provided in this element as a double precision matrix.
#' * \code{u} If \code{T} is \code{autodiff::dual2nd}, the conditional
#' modes of the standardized random effects are provided as a double
#' precision vector in this element.
#' * \code{V} If \code{T} is \code{autodiff::dual2nd}, the diagonal matrix
#' \eqn{V} with \eqn{b''(\nu_{i}) / \phi_{g(i)}} on the diagonal is
#' included in this element. See the paragraph below equation (13) in
#' \insertCite{sorensenLongitudinalModelingAgeDependent2023}{galamm} for
#' details.
#' * \code{phi} If \code{T} is \code{autodiff::dual2nd}, double precision
#' scalar containing the dispersion parameter of the model.
#' @noRd
NULL

#' @title Evaluate the marginal likelihood
#'
#' @description
#' This function evaluates the Laplace approximate marginal likelihood of a
#' generalized additive latent and mixed model at a given set of parameters.
#' The code uses elements generated by \code{lme4::glFormula}, and the
#' documentation of \code{lme4} should be consulted for further details.
#'
#' @srrstats {G1.4a} Internal function documented.
#'
#' @param y Double precision vector of response values.
#' @param trials Double precision vector with number of trials. When trials
#' are not applicable, e.g., with Gaussian or Poisson responses, this should
#' be a vector of ones.
#' @param X Fixed effect model matrix.
#' @param Zt Transpose of random effect model matrix.
#' @param Lambdat Lower Cholesky factor of random effect covariance matrix.
#' @param beta Double precision vector of fixed effects.
#' @param theta Double precision vector with the unique elements of
#' \code{Lambdat}.
#' @param theta_mapping Integer vector mapping elements of \code{theta} to the
#' positions in \code{Lambdat}.
#' @param u_init Double precision vector with initial values of random
#' effects. These random effects should be standardized.
#' @param lambda Double precision vector of factor loadings.
#' @param lambda_mapping_X Integer vector mapping elements of
#' \code{lambda} to elements of \code{X}, in row-major order.
#' @param lambda_mapping_Zt List of integer vectors mapping elements of
#' \code{lambda} to non-zero elements of \code{Zt} assuming compressed
#' sparse column format is used. If \code{lambda_mapping_Zt_covs} is of
#' length zero, then each list element in \code{lambda_mapping_Zt} should be
#' of length one, and it will then be multiplied by the corresponding element
#' of \code{Zt}.
#' @param lambda_mapping_Zt_covs List of double precision vectors. Must either
#' be of length zero, or the same length as \code{lambda_mapping_Zt}.
#' Each list element contains potential covariates that the elements of
#' \code{lambda_mapping_Zt} should be multiplied with. If the list is of
#' length 0, all elements of \code{lambda_mapping_Zt} are implicitly
#' multiplied by 1.
#' @param weights Double precision vector of weights, used in heteroscedastic
#' models.
#' @param weights_mapping Integer vector mapping the elements of \code{weights}
#' to the rows of \code{X}.
#' @param family Vector of strings defining the family or families. Each
#' vector element must currently be one of \code{"gaussian"},
#' \code{"binomial"}, or \code{"poisson"}.
#' @param family_mapping Integer vector mapping elements of \code{family} to
#' the rows of \code{X}.
#' @param k Double precision vector with pre-computed constant term in the
#' log-likelihood for each element in \code{family}.
#' @param maxit_conditional_modes Integer specifying the maximum number of
#' iterations in the penalized iteratively reweighted least squares algorithm
#' used to find the conditional modes of the random effects.
#' @param lossvalue_tol Double precision scalar specifying the absolute
#' convergence criterion for the penalized iteratively reweighted least
#' squares algorithm used to find the conditional modes of the random
#' effects.
#' @param gradient Boolean specifying whether to compute the gradient of the
#' log-likelihood with respect to all elements of \code{theta}, \code{beta},
#' \code{lambda}, and \code{weights}, in that order. If
#' \code{gradient = TRUE} and \code{hessian = FALSE}, forward mode
#' automatic differentiation with first-order dual numbers is used. If also
#' \code{hessian = TRUE}, then second-order dual numbers are used instead.
#' @param hessian Boolean specifying whether to compute the Hessian matrix of
#' second derivatives of the log-likelihood with respect to all elements of
#' \code{theta}, \code{beta}, \code{lambda}, and \code{weights}, in that
#' order. If \code{hessian = TRUE}, forward mode automatic differentiation
#' with second-order dual numbers is used.
#' @param reduced_hessian Boolean specifying whether the Hessian matrix of
#' second derivatives should be computed only with respect to \code{beta}
#' and \code{lambda}, in that order. This may be useful for getting a very
#' rough estimate of the inverse covariance matrix, when the full Hessian is
#' not positive definite.
#'
#' @return An \code{Rcpp::List}, which will be converted to a \code{list} in
#' \code{R}, with the following elements. The element \code{logLik} will always
#' be present, while the others are present or not depending on the arguments
#' \code{gradient} and \code{hessian}.
#' * \code{logLik} Laplace approximate marginal log-likelihood at the
#' parameter values specified.
#' * \code{g} If \code{gradient = TRUE} or \code{hessian = TRUE}, the
#' gradient is provided in this element as a double precision vector.
#' * \code{H} If \code{hessian = TRUE}, the Hessian matrix is provided in
#' this element as a double precision matrix.
#' * \code{u} If \code{hessian = TRUE}, the conditional modes of the
#' standardized random effects are provided as a double precision vector
#' in this element.
#' * \code{V} If \code{hessian = TRUE}, the diagonal matrix \eqn{V} with
#' \eqn{b''(\nu_{i}) / \phi_{g(i)}} on the diagonal is included in this
#' element. See the paragraph below equation (13) in
#' \insertCite{sorensenLongitudinalModelingAgeDependent2023}{galamm} for
#' details.
#' * \code{phi} If \code{hessian = TRUE}, double precision vector containing
#' the dispersion parameter of the model, for each model family.
#'
#' @details
#' For many models, not all parameters exist. For example, without
#' heteroscedastic residuals, the weights don't exist, and other models don't
#' have factor loadings. For these cases, the corresponding argument (to
#' \code{weights} or \code{lambda}) should be a correctly typed vector of
#' length zero.
#' @noRd
marginal_likelihood <- function(y, trials, X, Zt, Lambdat, beta, theta, theta_mapping, u_init, lambda, lambda_mapping_X, lambda_mapping_Zt, lambda_mapping_Zt_covs, weights, weights_mapping, family, family_mapping, k, maxit_conditional_modes, lossvalue_tol, gradient, hessian, reduced_hessian = FALSE) {
.Call(`_galamm_marginal_likelihood`, y, trials, X, Zt, Lambdat, beta, theta, theta_mapping, u_init, lambda, lambda_mapping_X, lambda_mapping_Zt, lambda_mapping_Zt_covs, weights, weights_mapping, family, family_mapping, k, maxit_conditional_modes, lossvalue_tol, gradient, hessian, reduced_hessian)
}
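The `gradient` argument documented above refers to forward-mode automatic differentiation with first-order dual numbers. As a rough illustration of that idea only, here is a base R sketch; the helper names (`dual`, `dual_add`, `dual_mul`, `dual_exp`) are hypothetical and are not the `autodiff` C++ types that the package documentation mentions.

```r
# Minimal first-order dual numbers in base R (hypothetical helpers, not the
# package's implementation). A dual number carries a value and the derivative
# of that value with respect to one chosen input.
dual <- function(val, eps = 0) list(val = val, eps = eps)

# Arithmetic rules: each operation propagates the derivative with the value.
dual_add <- function(a, b) dual(a$val + b$val, a$eps + b$eps)
dual_mul <- function(a, b) dual(a$val * b$val, a$eps * b$val + a$val * b$eps)
dual_exp <- function(a) dual(exp(a$val), exp(a$val) * a$eps)

# Example function f(x) = x * exp(x) + x, with f'(x) = (1 + x) * exp(x) + 1.
f <- function(x) dual_add(dual_mul(x, dual_exp(x)), x)

# Seed the input with derivative 1 to differentiate with respect to x.
res <- f(dual(0.5, 1))
res$val  # value f(0.5)
res$eps  # derivative f'(0.5)
```

A single evaluation of `f` yields both the value and the exact derivative, which is why one function call can return `logLik` together with `g`. Second-order dual numbers extend the same bookkeeping to second derivatives.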
15 changes: 10 additions & 5 deletions R/VarCorr.R
@@ -15,7 +15,7 @@ NULL
#' @name VarCorr
#' @aliases VarCorr VarCorr.galamm
#'
-#' @return An object of class \code{VarCorr.galamm}.
+#' @return An object of class \code{c("VarCorr.galamm", "VarCorr.merMod")}.
#' @export
#'
#' @seealso [print.VarCorr.galamm()] for the print function.
@@ -33,6 +33,10 @@ NULL
#' # Extract information on variance and covariance
#' VarCorr(mod)
#'
#' # Convert to data frame
#' # (this invokes lme4's function as.data.frame.VarCorr.merMod)
#' as.data.frame(VarCorr(mod))
#'
VarCorr.galamm <- function(x, sigma = 1, ...) {
useSc <- Reduce(function(`&&`, y) y$family == "gaussian",
family(x),
@@ -46,19 +50,20 @@ VarCorr.galamm <- function(x, sigma = 1, ...) {
names(x$model$lmod$reTrms$cnms)
),
useSc = useSc,
-class = "VarCorr.galamm"
+class = c("VarCorr.galamm", "VarCorr.merMod")
)
}
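The two-element class vector set in `VarCorr.galamm` above relies on S3 dispatch falling through to the second class when no method exists for the first. The following base R sketch illustrates that mechanism with a hypothetical generic `to_df` and hypothetical classes `child_vc` and `parent_vc`; it does not call galamm or lme4.

```r
# Hypothetical generic and classes, for illustration only.
to_df <- function(x, ...) UseMethod("to_df")

# A method is defined only for the parent class.
to_df.parent_vc <- function(x, ...) {
  data.frame(grp = names(x$sd), sdcor = unname(x$sd))
}

# Object with a two-element class vector, child class first.
vc <- structure(
  list(sd = c(id = 1.2, Residual = 0.5)),
  class = c("child_vc", "parent_vc")
)

# No to_df.child_vc exists, so dispatch falls through to to_df.parent_vc.
vc_df <- to_df(vc)

# The object still reports membership in both classes.
is_parent <- inherits(vc, "parent_vc")
```

Listing the more specific class first means any method written for it (such as a print method) takes precedence, while everything else is inherited from the second class.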


#' @title Print method for variance-covariance objects
#'
#' @srrstats {G1.4} Function documented with roxygen2.
-#' @srrstats {G2.3b} Argument "comp" is case sensitive, as is documented here.
#' @srrstats {G2.1a} Expected data types provided for all inputs.
+#' @srrstats {G2.3a} match.arg() used on "comp" argument.
+#' @srrstats {G2.3b} Argument "comp" is case sensitive, as is documented here.
#'
-#' @param x An object of class \code{VarCorr.galamm}, returned from
-#' \code{\link{VarCorr.galamm}}.
+#' @param x An object of class \code{c("VarCorr.galamm", "VarCorr.merMod")},
+#' returned from \code{\link{VarCorr.galamm}}.
#' @param digits Optional argument specifying the number of digits to use when
#' printing.
#' @param comp Character vector of length 1 or 2 specifying which variance
1 change: 1 addition & 0 deletions R/confint.R
@@ -1,6 +1,7 @@
#' @title Confidence intervals for model parameters
#'
#' @srrstats {G1.4} Function documented with roxygen2.
#' @srrstats {G2.3a} match.arg() used on "method" argument.
#' @srrstats {G2.3b} Arguments parm and method are case sensitive, as stated in
#' their documentation.
#' @srrstats {G2.1a} Expected data types provided for all inputs.
16 changes: 16 additions & 0 deletions R/data.R
@@ -6,6 +6,8 @@
#' \insertCite{skrondalGeneralizedLatentVariable2004;textual}{galamm}, where
#' the dataset is used.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `epilep` A data frame with 236 rows and 7 columns:
#' \describe{
#' \item{subj}{Subject ID.}
@@ -27,6 +29,8 @@
#' Very basic mixed response dataset with one set of normally distributed
#' responses and one set of binomially distributed responses.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `mresp` A data frame with 4000 rows and 5 columns:
#' \describe{
#' \item{id}{Subject ID.}
@@ -46,6 +50,8 @@
#' responses and one set of binomially distributed responses. The normally
#' distributed responses follow two different residual standard deviations.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `mresp` A data frame with 4000 rows and 5 columns:
#' \describe{
#' \item{id}{Subject ID.}
@@ -72,6 +78,8 @@
#' dataset is used. See also
#' \insertCite{rabe-heskethCorrectingCovariateMeasurement2003;textual}{galamm}.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `diet` A data frame with 236 rows and 7 columns:
#' \describe{
#' \item{id}{Subject ID.}
@@ -99,6 +107,8 @@
#' Simulated dataset with residual standard deviation that varies between
#' items.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `hsced` A data frame with 1200 rows and 5 columns:
#' \describe{
#' \item{id}{Subject ID.}
@@ -119,6 +129,8 @@
#' \insertCite{woodGeneralizedAdditiveModels2017a}{galamm}, and depend on the
#' explanatory variable x.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `cognition` A data frame with 14400 rows and 7 columns:
#' \describe{
#' \item{id}{Subject ID.}
@@ -142,6 +154,8 @@
#' Simulated dataset for use in examples and testing with a latent covariate
#' interacting with an observed covariate.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `latent_covariates` A data frame with 600 rows and 5 columns:
#' \describe{
#' \item{id}{Subject ID.}
@@ -166,6 +180,8 @@
#' interacting with an observed covariate. In this data, each response has been
#' measured six times for each subject.
#'
#' @srrstats {G5.1} Dataset used to test package is exported.
#'
#' @format ## `latent_covariates_long` A data frame with 800 rows and 5
#' columns:
#' \describe{