Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix anova.cca #701

Merged
merged 1 commit into from
Oct 22, 2024
Merged

Fix anova.cca #701

merged 1 commit into from
Oct 22, 2024

Conversation

TuomasBorman
Copy link
Contributor

In constrained ordination models such as dbrda, variables that are collinear and do not explain any variance are excluded from the model. For example, if a user models patient ID as a random effect along with other explanatory variables (sample type, diagnosis), collinear variables such as diagnosis may be excluded

data ~ sample_type + diagnosis + Condition(patient_id)

When testing for the significance of the variance explained by each variable, anova.cca(by = "margin") calls permutest() for each variable.

mods <- suppressMessages(

If a variable is excluded from the model due to collinearity, no constrained variance is available, and permutest() returns a result with num = 0.

if (is.null(x$CCA)) {

The current implementation works if the excluded variable is the only variable in the model. In this case, the method successfully handles the situation. However, if the model contains both variables that explain variance and variables that do not, the function produces an error.

The error is caused because this line below outputs a list instead of matrix or numeric scalar:

Fval <- sapply(mods, function(x) x$num)

Single numeric value is detected by this line that converts it to matrix:

if (length(Fval) == 1)

If the values are not in matrix format, this line below causes an error:

Fval <- sweep(-Fval, 1, big$num, "+")

I fixed the error by ensuring that the values are in matrix format (ncol = N variables, nrow = N permutations). If certain variable was excluded from the model, the corresponding column is filled with 0s.

The code below reproduces the problem

library(vegan)
data(dune)
data(dune.env)

# Add collinear variable
dune.env[["Use_detail"]] <- as.character(dune.env[["Use"]])
dune.env[1, "Use_detail"] <- "something"
# WORKS
rda <- rda(dune ~ Manure + Condition(Use_detail), dune.env)
anova.cca(rda, by = "margin")
# DO NOT WORK
rda <- rda(dune ~ Manure + Use + Condition(Use_detail), dune.env)
anova.cca(rda, by = "margin")

-Tuomas

@jarioksa jarioksa added the bug label Oct 18, 2024
@jarioksa
Copy link
Contributor

Simplest way to reproduce this is:

library(vegan)
data(dune, dune.env)
mod <- rda(dune ~ Management + Use + Condition(Use), data=dune.env)
anova(mod, by = "margin")

The fix seems to work, but I still want to see what is the best place for the fix.

@jarioksa
Copy link
Contributor

Your analysis is correct. The key line is indeed 104 where sapply fails to simplify to a matrix when the aliased term is a scalar 0 and non-aliased a vector of length nperm. Another place to correct this is after sapply:

diff --git a/R/anova.ccabyterm.R b/R/anova.ccabyterm.R
index 7cb61f0c..8deb6ed4 100644
--- a/R/anova.ccabyterm.R
+++ b/R/anova.ccabyterm.R
@@ -105,6 +105,8 @@
     ## Had we an empty model we need to clone the denominator
     if (length(Fval) == 1)
         Fval <- matrix(Fval, nrow = nperm)
+    if (is.list(Fval))
+        Fval <- do.call(cbind, Fval)
     Fval <- sweep(-Fval, 1, big$num, "+")
     Fval <- sweep(Fval, 2, Df, "/")
     Fval <- sweep(Fval, 1, scale, "/")

What do you think of this?

This will list the aliased term with 0 degrees of freedom, and arbitrary F-value – could be Inf, -Inf or NaN and with this respect your fix is better, and similar to anova(..., by = "terms").

@TuomasBorman
Copy link
Contributor Author

I have no strong opinion.

Your fix is simple, but I would rather fix the sapply call directly. I think that would simplify the code. Your solution would introduce second fix to sapply (1st: length(Fval) == 1 and 2nd: is.list(Fval)) that could be fixed directly.

Also, I would prefer same formatting as in anova(by="terms", ...)

@jarioksa jarioksa merged commit 6275efd into vegandevs:master Oct 22, 2024
7 checks passed
jarioksa added a commit that referenced this pull request Nov 4, 2024
jarioksa added a commit that referenced this pull request Nov 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants