Problem with population cosinors #1

Open
1 of 2 tasks
shah-in-boots opened this issue Mar 23, 2021 · 7 comments

@shah-in-boots
Owner

shah-in-boots commented Mar 23, 2021

Problem with population cosinors, reported by DV.

  • Confirm issue on local workspace
  • Re-evaluate statistics behind population mean

I found the following error when running cosinor to calculate a population-mean cosinor:

Error in data.frame(population = rep(names(kfits), sapply(kfits, length)),  :
  arguments imply differing number of rows: 0, 520

To the best of my knowledge, my file and the relevant variables have the same
format as the Twins example file. Twins runs fine, but my file does not.

Although I didn't find the cause, I worked around it by running
cosinor_pop_impl with kfits converted to a data frame.

fits <- data.frame(
  population = rep(names(as.data.frame(kfits)),
                   sapply(as.data.frame(kfits), length)),
  yhat = unlist(kfits)
)

Originally, it returned

names (kfits)
NULL
sapply((kfits), length)
   [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 .....

After declaring kfits as data frame, it returns

names(as.data.frame(kfits))
[1] "1" "2" "3" "4" "5" "6"
sapply(as.data.frame(kfits), length)
  1   2   3   4   5   6
520 520 520 520 520 520

The report above is from a message by DV. The code below, also from DV, is a minimal working example that reproduces the problem.

# cosinor test

library(card)

# data frame for the simulated data: 6 subjects x 520 time points
pmc.df.t <- as.data.frame(matrix(NA, 3120, 3))
names(pmc.df.t) <- c("time", "subject", "HR") # variable names

t <- 1:520 # time
pmc.df.t[, 1] <- rep(1:520, 6) # time points, repeated for each subject
pmc.df.t[, 2] <- rep(1:6, each = 520) # subject identifier (six subjects)

set.seed(1) # seed for the random number generator

# generates six different signals with some noise
for (i in 1:6) {
  M <- rnorm(1, mean = 70, sd = 5)    # MESOR
  A <- rnorm(1, mean = 3, sd = 0.1)   # amplitude
  phi <- rnorm(1, mean = 60, sd = 10) # acrophase
  e <- rnorm(520, mean = 0, sd = 5)   # noise
  hr.curve <- M + A * cos((2 * pi / 260) * t + phi) + e
  pmc.df.t[520 * (i - 1) + (1:520), 3] <- hr.curve
  plot(t, hr.curve)
}

# cosinor model
pmc.model.t <- cosinor(HR ~ time, data = pmc.df.t, tau = c(260), population = "subject")
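
A possible explanation for the error, offered as a sketch rather than a confirmed diagnosis: sapply() simplifies its result to a matrix when every group returns a vector of the same length, and a matrix has no names(). The example above has exactly 520 observations per subject, whereas groups of unequal size would come back as a named list. The toy lists below are hypothetical stand-ins for the per-subject fitted values, and the lapply() variant is only one way to keep a list; it is not necessarily the fix used in card.

```r
# Toy stand-ins for per-subject fitted values (names are hypothetical).
equal_groups   <- list(s1 = c(1, 2, 3), s2 = c(4, 5, 6))
unequal_groups <- list(s1 = c(1, 2, 3), s2 = c(4, 5))

# Equal lengths: sapply() simplifies to a 3 x 2 matrix, so names() is NULL
# and sapply(..., length) sees six single cells rather than two groups.
kfits_matrix <- sapply(equal_groups, identity)
is.matrix(kfits_matrix)
names(kfits_matrix)
sapply(kfits_matrix, length)

# Unequal lengths: sapply() cannot simplify and returns a named list.
kfits_list <- sapply(unequal_groups, identity)
is.list(kfits_list)
names(kfits_list)

# lapply() always returns a list, whatever the group sizes.
kfits_safe <- lapply(equal_groups, identity)
fits <- data.frame(
  population = rep(names(kfits_safe), lengths(kfits_safe)),
  yhat = unlist(kfits_safe, use.names = FALSE)
)
```

With a list, rep(names(...), lengths(...)) and unlist(...) have the same length, so data.frame() no longer sees 0 rows against 520.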
@shah-in-boots shah-in-boots self-assigned this Mar 23, 2021
davorvr added a commit to davorvr/card that referenced this issue Aug 28, 2021
@AlistairCarr

Hi,

I'm experiencing the same issue as described above with the population mean cosinor function. I posted a query about it on Stack Overflow (https://stackoverflow.com/questions/77469061/how-to-carry-out-population-mean-cosinor-analysis-in-r-using-card-package).

Using the as.data.frame(kfits) fix described above by DV solved that error message. However, I found the coefficients from the resulting population mean cosinor model with that altered code were the same as those in the normal cosinor model (i.e., they were wrong).

I'm not particularly proficient in R or using GitHub. Will there be any future updates of the 'card' package with this issue resolved? Or could you suggest an alternative approach that might be used?

Thanks!

@shah-in-boots
Owner Author

shah-in-boots commented Nov 13, 2023

Hi @AlistairCarr - thanks for your comment. I actually had an issue with a CRAN check about a year ago, and I had limited time and resources to fix it at the time.

Let me take a look at this issue in the next 2-3 weeks, and see if I can at least repair it in the development version. If I can, you should be able to install the package from GitHub directly.

If I don't think I can fix it for whatever reason, I'll let you know as well.

@shah-in-boots
Owner Author

However, I found the coefficients from the resulting population mean cosinor model with that altered code were the same as those in the normal cosinor model (i.e., they were wrong).

@AlistairCarr could you help me by sharing either the code or the discrepancy you found? I'm brushing up on the internals of the software this week. When you say "they were wrong", do you mean that the program reported the population mean cosinor as if it were an individual cosinor? Thanks for the clarification!
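
For context, here is a minimal base-R sketch of the distinction in question. It assumes a single-component model fitted by ordinary least squares, with the population-mean estimate taken as the average of the per-subject coefficients (as cosinor_pop_impl does with colMeans). The simulated data, column names, and the amplitude() helper are all hypothetical. The subjects are given different numbers of observations on purpose: with identical sampling times for every subject, the pooled fit and the averaged fit would give the same point estimates.

```r
set.seed(42)
tau <- 24

# Three hypothetical subjects with 12, 24 and 36 hourly observations.
dat <- do.call(rbind, lapply(1:3, function(id) {
  hour <- seq_len(12 * id) - 1
  data.frame(
    id = id,
    hour = hour,
    y = 60 + 5 * id + 3 * cos(2 * pi * hour / tau - id) +
      rnorm(length(hour), sd = 0.5)
  )
}))

# Linearised cosinor terms: y = M + beta * x + gamma * z + error
dat$x <- cos(2 * pi * dat$hour / tau)
dat$z <- sin(2 * pi * dat$hour / tau)

# "Normal" cosinor: one fit that pools every observation.
pooled <- coef(lm(y ~ x + z, data = dat))

# Population-mean cosinor: one fit per subject, then average the coefficients.
per_subject <- sapply(split(dat, dat$id), function(d) coef(lm(y ~ x + z, data = d)))
pop_mean <- rowMeans(per_subject)

# Amplitude is derived from the (averaged) beta and gamma terms.
amplitude <- function(b) unname(sqrt(b["x"]^2 + b["z"]^2))

pooled
pop_mean
amplitude(pooled)
amplitude(pop_mean)
```

If a population model returns exactly the pooled coefficients on unbalanced data, that suggests the subjects were never split into separate groups.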

@AlistairCarr

Hi, thanks for getting back to me. And just to feed back: it's a great package and I've really enjoyed using it. I find it much more intuitive and flexible for graphing than alternative packages that can do cosinor analyses.

The linked Stack Overflow question has code for the problem I'm experiencing, which sounds the same as the one DV experienced.

I hope the code below explains what I mean by 'being wrong'. Your vignette gives example coefficient output for a 'normal' cosinor model (model 'm' in the code below) and a population mean cosinor model (model 'm_pop'). When I incorporate DV's suggested edit into the code for the 'cosinor_pop_impl' function, the coefficients I get from that model ('m_pop2') are equal to those of the 'normal' cosinor model ('m') rather than those of the population mean cosinor model ('m_pop').

I hope that makes sense? Apologies if my coding vocabulary is not up to scratch. As I say, I don't really have much experience in R.

`

rm(list = ls())

install.packages('card')

library(card)

twins <- card::twins

# model coefficient outputs using card package

# Normal cosinor model

m <- cosinor(rDYX ~ hour, data = twins, tau = 24)

summary(m)

#Population cosinor model

m_pop <- cosinor(rDYX ~ hour, data = twins, tau = c(24), population = "patid")

summary(m_pop)

################## model coefficient output using edited 'card' code ###############

#underlying 'card' code, edited with DV's suggestion to make kfits as.data.frame

# DV's suggested fix was (shown for reference only; kfits is defined inside cosinor_pop_impl):
#
# fits <- data.frame(
#   population = rep(names(as.data.frame(kfits)),
#                    sapply(as.data.frame(kfits), length)),
#   yhat = unlist(kfits)
# )

# Cosinor Implementation {{{ ====

# Single Component Cosinor Implementation

#' @description Model fitting algorithm for cosinor. Results in output that defines the new S3 class, as seen by [hardhat::new_model], which generates the new_cosinor function.
#' @noRd
cosinor_impl <- function(predictors, outcomes, tau) {

# Parameters for normal equations

# Formal equation
# y(t) = M + A*cos(2*pi*t/tau + phi)
# A = Amplitude
# phi = acrophase (measure of the time of overall high values in cycle)
# M = MESOR
# y(t) = M + beta*x + gamma*z + error(t)
# beta = A*cos(phi)
# gamma = -A*sin(phi)
# x = cos(2*pi*t/tau)
# z = sin(2*pi*t/tau)
# Where N is number of observations iterated through by i
# RSS = sum[(y - (M + beta*x + gamma*z))^2]

# Normal equations (where M, beta, gamma are the coefficients to solve for)
# sum(y) = M*n + beta*sum(x) + gamma*sum(z)
# sum(y*x) = M*sum(x) + beta*sum(x^2) + gamma*sum(x*z)
# sum(y*z) = M*sum(z) + beta*sum(x*z) + gamma*sum(z^2)

# Multiple components... is an extension of single component
# y(t) = M + sum_j[A_j*cos(2*pi*t/tau_j + phi_j)]
# y(t) = M + sum_j[beta_j * x_j + gamma_j * z_j] + error(t)

# Number of parameters will be the number of taus
# e.g. single component = 3 parameters, where 3 = 2p + 1 (p = 1 component)
p <- length(tau)

# Single parameters
y <- outcomes
t <- predictors
n <- length(t)

# Normal equation for 3 components
# Normal equations (where M, beta, gamma are the coefficients to solve for)
# sum(y) = M*n + beta*sum(x) + gamma*sum(z)
# sum(y*x) = M*sum(x) + beta*sum(x^2) + gamma*sum(x*z)
# sum(y*z) = M*sum(z) + beta*sum(x*z) + gamma*sum(z^2)
# d = Su (for single component, 3 equations with 3 unknowns)

# For multiple components, the matrix must be expanded

# Need to create number of x values to match number of taus
# x1, x2, z1, z2 in this case

for (i in 1:p) {
assign(paste0("x", i), cos((2 * pi * t) / tau[i]))
assign(paste0("z", i), sin((2 * pi * t) / tau[i]))
}

# Creating a new dataframe with all variables
model <- data.frame(y, t, mget(paste0("x", 1:p)), mget(paste0("z", 1:p)))

# The formula, where the intercept will be the MESOR (not included)
f <- stats::formula(
paste0("y ~ ", paste0("x", 1:p, " + ", "z", 1:p, collapse = " + "))
)

# Can create a model frame here using two approaches
# Base R and with hardhat
m <- stats::model.frame(f, model)
xmat <- stats::model.matrix(f, m)
ymat <- as.matrix(y)

# Solving for coefficients

# Solve for coefficients, including amplitude and acrophase
coefs <- solve(t(xmat) %*% xmat) %*% t(xmat) %*% ymat
mesor <- coefs[1]

for (i in 1:p) {

# Beta and gamma terms
assign(paste0("beta", i), unname(coefs[paste0("x", i),]))
assign(paste0("gamma", i), unname(coefs[paste0("z", i),]))

# Amplitude
assign(paste0("amp", i), sqrt(get(paste0("beta", i))^2 + get(paste0("gamma", i))^2))

# Phi / acrophase
sb <- sign(get(paste0("beta", i)))
sg <- sign(get(paste0("gamma", i)))
theta <- atan(abs(get(paste0("gamma", i)) / get(paste0("beta", i))))

if ((sb == 1 | sb == 0) & sg == 1) {
  phi <- -theta
} else if (sb == -1 & (sg == 1 | sg == 0)) {
  phi <- theta - pi
} else if ((sb == -1 | sb == 0) & sg == -1) {
  phi <- -theta - pi
} else if (sb == 1 & (sg == -1 | sg == 0)) {
  phi <- theta - (2 * pi)
}

assign(paste0("phi", i), phi)

}

coefs <- unlist(c(mesor = mesor, mget(paste0("amp", 1:p)), mget(paste0("phi", 1:p)), mget(paste0("beta", 1:p)), mget(paste0("gamma", 1:p))))

# Predicted / output

# y(t) = M + b1 * x1 + g1 * z1 + b2 * x2 + g2 * z2
# y(t) = M + amp1 * cos(2*pi*t/tau1 + phi1) + amp2 * cos(2*pi*t/tau2 + phi2)
pars <- list()
for (i in 1:p) {
pars[[i]] <- get(paste0("amp", i)) * cos(2 * pi * t / tau[i] + get(paste0("phi", i)))
}
df <- data.frame(mesor = mesor, matrix(unlist(pars), ncol = length(pars), byrow = FALSE))
yhat <- rowSums(df)

# Model Output

# Model coefficients
coef_names <- names(coefs)
coefs <- unname(coefs)

# Fit and residuals
fitted.values <- yhat
residuals <- y - yhat

# List to return

list(

# Raw coefficients
coefficients = coefs,
coef_names = coef_names,

# Fitted and residual values
fitted.values = fitted.values,
residuals = residuals,

# Overall model of cosinor
model = model,

# Matrices used
xmat = xmat

)

}

# Population Mean Cosinor Implementation

#' @description Model fitting algorithm for population-mean cosinor. Uses the
#' cosinor_impl() algorithm to derive individual parameters.
#' @noRd
cosinor_pop_impl <- function(predictors, outcomes, tau, population) {

# Population cosinor parameter setup

# Period
p <- length(tau) # Number of parameters ... single cosinor ~ 2p + 1 = 3

# Create data frame for split/apply approach
df <- na.omit(data.frame(predictors, outcomes, population))

# Remove patients with too few observations, i.e. at most 2p + 1 (will cause a det ~ 0 error)
counts <- by(df, df[, "population"], nrow)
lowCounts <- as.numeric(names(counts[counts <= 2*p + 1]))
df <- subset(df, !(population %in% lowCounts))

# Message about population count removal
if (length(lowCounts) != 0) {
message(length(lowCounts), " subjects were removed due to having insufficient observations.")
}

# Population parameters
k <- length(unique(df$population)) # Number of individuals
y <- df$outcomes
t <- df$predictors
n <- length(t)
population <- df$population

# Need to create number of x values to match number of taus
# x1, x2, z1, z2 in this case

for (i in 1:p) {
assign(paste0("x", i), cos((2 * pi * t) / tau[i]))
assign(paste0("z", i), sin((2 * pi * t) / tau[i]))
}

# Creating a new dataframe with all variables
model <- data.frame(y, t, mget(paste0("x", 1:p)), mget(paste0("z", 1:p)), population)

# Create matrix that we can apply cosinor to subgroups
kCosinors <- with(
df,
by(df, population, function(x) {
cosinor_impl(x$predictors, x$outcomes, tau)
})
)

# Coefficients

# Fits of individual cosinors (with DV's as.data.frame edit applied)
kfits <- sapply(kCosinors, stats::fitted)
fits <- data.frame(
population = rep(names(as.data.frame(kfits)),
sapply(as.data.frame(kfits), length)),
yhat = unlist(kfits)
)

# Matrix of coefficients
tbl <- sapply(kCosinors, stats::coef, USE.NAMES = TRUE)
coef_names <- c("mesor", paste0("amp", 1:p), paste0("phi", 1:p), paste0("beta", 1:p), paste0("gamma", 1:p))
rownames(tbl) <- coef_names
xmat <- t(tbl)

# Get mean for each parameter (mesor, beta, gamma), ignoring averaged amp/phi

coefs <- colMeans(xmat, na.rm = TRUE)

for (i in 1:p) {

# Beta and gamma terms
assign(paste0("beta", i), unname(coefs[paste0("beta", i)]))
assign(paste0("gamma", i), unname(coefs[paste0("gamma", i)]))

# Amplitude
assign(paste0("amp", i), sqrt(get(paste0("beta", i))^2 + get(paste0("gamma", i))^2))

# Phi / acrophase
sb <- sign(get(paste0("beta", i)))
sg <- sign(get(paste0("gamma", i)))
theta <- atan(abs(get(paste0("gamma", i)) / get(paste0("beta", i))))

if ((sb == 1 | sb == 0) & sg == 1) {
  phi <- -theta
} else if (sb == -1 & (sg == 1 | sg == 0)) {
  phi <- theta - pi
} else if ((sb == -1 | sb == 0) & sg == -1) {
  phi <- -theta - pi
} else if (sb == 1 & (sg == -1 | sg == 0)) {
  phi <- theta - (2 * pi)
}

assign(paste0("phi", i), phi)

# Final coefs
coefs[paste0("amp", i)] <- get(paste0("amp", i))
coefs[paste0("phi", i)] <- get(paste0("phi", i))

}

# Model output

# Fitted values
# y(t) = M + A*cos(2*pi*t/tau + phi)

# Individual fits

# Overall model
yhat <- fits$yhat

# List of values to return (must be same as cosinor_impl)

list(

# Raw coefficients
coefficients = unname(coefs),
coef_names = coef_names,

# Fitted and residual values
fitted.values = yhat,
residuals = y - yhat,

# Overall population cosinor data set, including subject names
model = model,

# Matrices used (for population cosinor, is the coefficient matrix)
xmat = xmat

)

}

## coefficient outputs of mean population cosinor model using above edited cosinor_pop_impl function

m_pop2 <- cosinor_pop_impl(twins$hour, twins$rDYX, 24, "patid")

#coefficients from cosinor_pop_impl function = coefficients of normal cosinor model (model m) rather than population cosinor model (model m_pop)
m_pop2$coefficients

m_pop$coefficients

m$coefficients

`

@shah-in-boots
Owner Author

@AlistairCarr - I appreciate the thorough response. I'll take a closer look and see if we can resolve this. I imagine that it's an inappropriate data reference somewhere when passing the population dataset. When I started writing this, I was not consistent with my non-standard evaluation. I think it'll be 1-2 weeks to get to a resolution, at least on the development version.
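
One way such a data reference could go wrong, sketched here as an assumption rather than a confirmed diagnosis of card: if a column name is passed as a string where a vector of subject identifiers is expected, data.frame() recycles that single string across every row, so all observations land in one group. The toy vectors below are hypothetical; the column names mirror those used in cosinor_pop_impl.

```r
# Hypothetical data: two subjects, four observations each.
hour  <- rep(c(0, 6, 12, 18), times = 2)
value <- c(10, 12, 9, 7, 20, 23, 19, 18)
patid <- rep(c("s1", "s2"), each = 4)

# Passing the column *name*: the string is recycled to all eight rows.
by_string <- data.frame(predictors = hour, outcomes = value, population = "patid")

# Passing the column *values*: each row keeps its own subject identifier.
by_column <- data.frame(predictors = hour, outcomes = value, population = patid)

length(unique(by_string$population))
length(unique(by_column$population))
```

With a single group, a split/apply step fits one cosinor to the whole data set, which would be consistent with population coefficients matching those of the ordinary model.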

@AlistairCarr

Thanks for the update. Good luck!

@shah-in-boots
Owner Author

A development version of the population cosinor is in progress:

  • Identified a key issue in maintaining power in uneven samples across a population.
  • Limitations are in the harmonic and Taylor series estimates of confidence intervals / error methods.

@AlistairCarr - thanks for bringing this all to my attention again. I expect a working version in the next 1-2 weeks.

@shah-in-boots shah-in-boots added this to the v0.2.0 milestone Apr 3, 2024
shah-in-boots pushed a commit that referenced this issue Oct 14, 2024
… class from sapply in the chain, which should now be resolved, however need to test coefficients in additional dataframes #1