Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

outputs of rbmi::lsmeans() don't match those of emmeans::lsmeans() #412

Closed
adcascone opened this issue Mar 11, 2024 · 4 comments · Fixed by #413
Closed

outputs of rbmi::lsmeans() don't match those of emmeans::lsmeans() #412

adcascone opened this issue Mar 11, 2024 · 4 comments · Fixed by #413
Labels
bug Something isn't working

Comments

@adcascone
Copy link

On pg. 70 of the rbmi reference manual, users are directed to the references of lsmeans() for identical implementations of estimating the least square means from a linear model in SAS and in R via the emmeans package.

However, the outputs of rbmi::lsmeans() don't match those of emmeans::lsmeans()


  iris2 <- iris[ iris$Species %in% c("versicolor", "virginica"), ]
  iris2$Species <- factor(iris2$Species)
  
  emmeans::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), specs = "Species",data=iris2, weights="proportional")
    #  Species    lsmean     SE df lower.CL upper.CL
    # versicolor   6.49 0.0796 95     6.33     6.65
    # virginica    5.99 0.0768 95     5.84     6.14
  rbmi:::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), Species = "versicolor",.weights = "proportional")$est[[1]]
    #6.512539
  rbmi:::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), Species = "virginica",.weights = "proportional")$est[[1]]
    #6.011461

Environment

  • OS: Linux
  • R version 4.3.2
  • rbmi version 1.2.5

Thank you,
Arianna

@adcascone adcascone added the bug Something isn't working label Mar 11, 2024
@gowerc
Copy link
Collaborator

gowerc commented Mar 19, 2024

Hi @adcascone,

Thanks for raising this. Digging into this a bit more I don't think this is a bug but more of a "what are you estimating" issue. For reference the discrepancy comes from how we handle interactions between continuous covariates where we use mean(A*B) whilst emmeans uses mean(A) * mean(B). For additional reference it appears that SAS's default is equivalent to what we've implemented however they do provide options for the user to select between the two (see Setting Covariate Values ).

Talking to @wolbersm, if I have understood him correctly, the difference comes down to whether you are wanting to construct a single hypothetical patient (in which case the emmeans approach makes more sense) or if you are wanting to construct a counterfactual e.g. what would have happened if all patients had taken drug vs all patients taking placebo (which is what our current implementation represents).

If you would definitely prefer to use the hypothetical patient approach we could look to add an option to rbmi to allow users to select between the two.

(@wolbersm - Please correct me if I've got any of the details above wrong)

@wolbersm
Copy link
Collaborator

Hi both

Thanks @gowerc! Just to small additional clarification to what @gowerc wrote:

rbmi is primarily targeted to the analysis of randomized clinical trials.
In this setting it is often useful to report the estimated (counterfactual) population mean outcomes if all subjects had received intervention or control treatment, respectively. This can be done via standardization and, for linear models, equivalently via LSMEANS as per our implementation. See e.g. this slide deck (slide 30f) for further regarding the connection of standadization and LSMEANS: https://baselbiometrics.github.io/home/docs/trainings/20210202/2_Bornkamp.pdf

@wolbersm
Copy link
Collaborator

So, while I think our choice is reasonable, it could be useful to document in the help pages that rbmi::lsmeans() may give different results to emmeans::lsmeans() if interaction terms are included in the model. What do you think @gowerc?

@gowerc
Copy link
Collaborator

gowerc commented Mar 22, 2024

Just to clarify we ended up making the following changes which hopefully address this issue (please re-open if you feel it hasn't been sufficiently addressed)

  • Renamed the weighting "proportional" to "counterfactual" to be better represent what this does
  • Introduced weighting "proportional_em" to mirror the emmeans "proportional" weighting
  • The weighting "proportional" still exists for backwards compatibility and is an alias for "counterfactual" but now gives a message suggesting the user to update to either "proportional_em" or "counterfactual" to be explicit
  • Updated our documentation to hopefully make this clearer.

Hope this helps !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants