From 79026562789e403431246635296460f2bd630199 Mon Sep 17 00:00:00 2001
From: Anestis Touloumis
Date: Mon, 10 Jul 2023 14:03:15 +0100
Subject: [PATCH] 230710: submitted to CRAN

---
 R/SimCorMultRes-data.R      | 22 +++++++++++-----------
 README.Rmd                  |  5 +++--
 README.md                   | 18 ++++++++++--------
 inst/CITATION               | 13 ++++++-------
 inst/NEWS.Rd                |  2 +-
 man/simulation.Rd           | 22 +++++++++++-----------
 vignettes/SimCorMultRes.Rmd |  6 +++---
 7 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/R/SimCorMultRes-data.R b/R/SimCorMultRes-data.R
index 1b9f290..86fb0c4 100644
--- a/R/SimCorMultRes-data.R
+++ b/R/SimCorMultRes-data.R
@@ -1,22 +1,22 @@
 #' Simulated Correlation Parameters
 #'
-#' Simulated dataset to examine the approximation of the correlation matrix
-#' of the latent variables generated by NORTA to the correlation matrix of
-#' the normal distribution used in the intermediate step of NORTA.
+#' A simulated dataset to explore the association between the correlation
+#' parameter of bivariate normally distributed random variables used in the
+#' intermediate step of the NORTA method and the correlation parameter of the
+#' resulting non-normal random responses generated by the NORTA method for all
+#' the threshold approached implemented in this package.
 #'
 #' @format
 #' A data frame with 100 rows and 4 columns:
 #' \describe{
 #' \item{rho}{numeric indicating the true value of the correlation parameter.}
-#' \item{normal}{numeric indicating the (simulated) estimated correlation
-#' parameter when the marginal distribution of each of the latent variables is
-#' normal.}
-#' \item{logistic}{numeric indicating the (simulated) estimated correlation
-#' parameter when the marginal distribution of each of the latent variables is
+#' \item{normal}{numeric indicating the simulated correlation parameter when
+#' the marginal distribution of each of the latent variables is normal.}
+#' \item{logistic}{numeric indicating the simulated correlation parameter
+#' when the marginal distribution of each of the latent variables is
 #' logistic.}
-#' \item{gumbel}{numeric indicating the (simulated) estimated correlation
-#' parameter when the marginal distribution of each of the latent variables is
-#' Gumbel.}
+#' \item{gumbel}{numeric indicating the simulated correlation parameter when
+#' the marginal distribution of each of the latent variables is Gumbel.}
 #' }
 #' @examples
 #' plot(rho - normal ~ rho, data = simulation, type = "l", col = "blue",
diff --git a/README.Rmd b/README.Rmd
index b3f7164..72a952b 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -80,7 +80,7 @@ This package provides five core functions to simulate correlated binary (`rbin`)
 - `rmult.clm` to simulate correlated ordinal responses under a marginal cumulative link model,
 - `rmult.crm` to simulate correlated ordinal responses under a marginal continuation-ratio link model.
 
-All five functions, assume that you provide either the correlation matrix of the multivariate normal distribution in NORTA (via `cor.matrix`) or the values of the latent responses (via the `rlatent`). A simulation study (described in Section 3.5 of the vignette) suggests that the correlation matrix of the multivariate normal distribution in NORTA (via `cor.matrix`) could be treated as a good approximation of the true correlation matrix of the latent variables generated by the NORTA method regardless of their marginal distributions for all the thresholds implemented in `SimCorMultRes`.
+All five functions, assume that you provide either the correlation matrix of the multivariate normal distribution in NORTA (via `cor.matrix`) or the values of the latent responses (via the `rlatent`). Based on a simulation study (see Section 3.5 of the vignette and dataset `simulation`), it is indicated that the correlation matrix of the multivariate normal distribution used in the NORTA method (via `cor.matrix`) can be considered a reliable approximation of the actual correlation matrix of the latent responses generated by the NORTA method. This appears to be the case irrespective of the marginal distributions of the latent responses for all the threshold approaches implemented in `SimCorMultRes`.
 
 There are also two utility functions:
 
@@ -110,7 +110,8 @@ latent_correlation_matrix <- toeplitz(c(1, 0.9, 0.9, 0.9))
 ## use rbin function to create the desired dataset
 simulated_binary_responses <- rbin(clsize = cluster_size,
                                    intercepts = beta_intercepts,
-                                   betas = beta_coefficients, xformula = ~ x,
+                                   betas = beta_coefficients,
+                                   xformula = ~ x,
                                    cor.matrix = latent_correlation_matrix,
                                    link = "probit")
 library("gee")
diff --git a/README.md b/README.md
index 4d91c14..76b8263 100644
--- a/README.md
+++ b/README.md
@@ -72,11 +72,13 @@ regression model for continuous random vectors as proposed by Touloumis
 All five functions, assume that you provide either the correlation
 matrix of the multivariate normal distribution in NORTA (via
 `cor.matrix`) or the values of the latent responses (via the `rlatent`).
-A simulation study (described in Section 3.5 of the vignette) suggests
-that the correlation matrix of the multivariate normal distribution in
-NORTA (via `cor.matrix`) could be treated as a good approximation of the
-true correlation matrix of the latent variables generated by the NORTA
-method regardless of their marginal distributions for all the thresholds
+Based on a simulation study (see Section 3.5 of the vignette and dataset
+`simulation`), it is indicated that the correlation matrix of the
+multivariate normal distribution used in the NORTA method (via
+`cor.matrix`) can be considered a reliable approximation of the actual
+correlation matrix of the latent responses generated by the NORTA
+method. This appears to be the case irrespective of the marginal
+distributions of the latent responses for all the threshold approaches
 implemented in `SimCorMultRes`.
 
 There are also two utility functions:
@@ -131,11 +133,11 @@ browseVignettes("SimCorMultRes")
 
 ## How to cite
 
-    To cite SimCorMultRes in publications use:
+    To cite 'SimCorMultRes' in publications, please use:
 
       Touloumis A (2016). "Simulating Correlated Binary and Multinomial
       Responses under Marginal Model Specification: The SimCorMultRes
-      Package." _The R Journal_, *8*(2), -12. R package version 1.9.0,
+      Package." _The R Journal_, *8*(2), 79-91. R package version 1.9.0,
       .
 
     A BibTeX entry for LaTeX users is
@@ -149,7 +151,7 @@ browseVignettes("SimCorMultRes")
         volume = {8},
         number = {2},
         note = {R package version 1.9.0},
-        pages = {-12},
+        pages = {79-91},
         url = {https://journal.r-project.org/archive/2016/RJ-2016-034/index.html},
       }
 
diff --git a/inst/CITATION b/inst/CITATION
index cb4f967..a1050f6 100644
--- a/inst/CITATION
+++ b/inst/CITATION
@@ -1,15 +1,14 @@
 note <- sprintf("R package version %s", meta$Version)
 
 bibentry(bibtype = "Article",
-         header = "To cite SimCorMultRes in publications use:",
+         header = "To cite 'SimCorMultRes' in publications, please use:",
          title = "Simulating Correlated Binary and Multinomial Responses under Marginal Model Specification: The SimCorMultRes Package",
          author = as.person("Anestis Touloumis"),
-         year = 2016,
+         year = "2016",
          journal= "The R Journal",
-         volume= 8,
-         number= 2,
+         volume= "8",
+         number= "2",
          note = note,
-         pages= {79-91},
-         url = "https://journal.r-project.org/archive/2016/RJ-2016-034/index.html"
-         )
+         pages= "79-91",
+         url = "https://journal.r-project.org/archive/2016/RJ-2016-034/index.html")
diff --git a/inst/NEWS.Rd b/inst/NEWS.Rd
index a7a66f2..ee08f7f 100644
--- a/inst/NEWS.Rd
+++ b/inst/NEWS.Rd
@@ -1,7 +1,7 @@
 \name{NEWS}
 \title{NEWS file for the \pkg{SimCorMultRes} package}
 
-\section{Changes in Version 1.9.0 (2023-06-28)}{
+\section{Changes in Version 1.9.0 (2023-07-10)}{
 \subsection{MINOR CHANGES}{
 \itemize{
 \item{Added R journal paper as vignette.}
diff --git a/man/simulation.Rd b/man/simulation.Rd
index 2161268..f8a45b5 100644
--- a/man/simulation.Rd
+++ b/man/simulation.Rd
@@ -8,24 +8,24 @@
 A data frame with 100 rows and 4 columns:
 \describe{
  \item{rho}{numeric indicating the true value of the correlation parameter.}
- \item{normal}{numeric indicating the (simulated) estimated correlation
- parameter when the marginal distribution of each of the latent variables is
- normal.}
- \item{logistic}{numeric indicating the (simulated) estimated correlation
- parameter when the marginal distribution of each of the latent variables is
+ \item{normal}{numeric indicating the simulated correlation parameter when
+ the marginal distribution of each of the latent variables is normal.}
+ \item{logistic}{numeric indicating the simulated correlation parameter
+ when the marginal distribution of each of the latent variables is
  logistic.}
- \item{gumbel}{numeric indicating the (simulated) estimated correlation
- parameter when the marginal distribution of each of the latent variables is
- Gumbel.}
+ \item{gumbel}{numeric indicating the simulated correlation parameter when
+ the marginal distribution of each of the latent variables is Gumbel.}
 }
 }
 \usage{
 simulation
 }
 \description{
-Simulated dataset to examine the approximation of the correlation matrix
-of the latent variables generated by NORTA to the correlation matrix of
-the normal distribution used in the intermediate step of NORTA.
+A simulated dataset to explore the association between the correlation
+parameter of bivariate normally distributed random variables used in the
+intermediate step of the NORTA method and the correlation parameter of the
+resulting non-normal random responses generated by the NORTA method for all
+the threshold approached implemented in this package.
 }
 \examples{
 plot(rho - normal ~ rho, data = simulation, type = "l", col = "blue",
diff --git a/vignettes/SimCorMultRes.Rmd b/vignettes/SimCorMultRes.Rmd
index fd7c336..54b8fde 100644
--- a/vignettes/SimCorMultRes.Rmd
+++ b/vignettes/SimCorMultRes.Rmd
@@ -471,12 +471,12 @@ apply(simulated_nominal_dataset$Ysim, 2, table) / sample_size
 ```
 
 
-## A note on NORTA implementation
+## A note on the correlation matrix
 
-In `SimCorMultRes`, the user specifies the correlation matrix of the multivariate normal distribution (denoted by $\mathbf R$) used in the intermediate step of the NORTA method and not the correlation matrix of the latent variables. The motivation is that when all the marginal distributions of the correlated latent variables are logistic, then the correlation matrix $\mathbf R$ and that of the latent variables will be close [@Touloumis2016]. This approximation is also used in `SimCorMultRes` regardless of the marginal distribution of the latent variables.
+In `SimCorMultRes`, the user provides the correlation matrix (denoted as $\mathbf{R}$) for the multivariate normal distribution used in the intermediate step of the NORTA method, rather than the correlation matrix of the latent responses used in the corresponding threshold approach. This choice is motivated by the observation that when all the marginal distributions of the correlated latent responses follow a logistic distribution, the correlation matrix $\mathbf{R}$ and the correlation matrix of the latent responses are expected to be similar, as noted by Touloumis (2016). Therefore, in `SimCorMultRes`, this approximation is employed irrespective of the marginal distribution of the latent `responses`.
 
-To evaluate the validity of this approximation for the marginal distributions employed in `SimCorMultRes`, a simulation study was conducted. For a fixed sample size $N$ and a correlation parameter $\rho$, $N$ independent bivariate random vectors $\{\mathbf y_{i}: i = 1, \ldots, N \}$ from a bivariate normal distribution with mean vector the zero vector and covariance matrix the correlation matrix
+To evaluate the validity of this approximation for the threshold approaches employed in `SimCorMultRes`, a simulation study was conducted. For a fixed sample size $N$ and a correlation parameter $\rho$, $N$ independent bivariate random vectors $\{\mathbf y_{i}: i = 1, \ldots, N \}$ from a bivariate normal distribution with mean vector the zero vector and covariance matrix the correlation matrix
 \[
 \mathbf R =
 \begin{bmatrix}
 1 & \rho\\