Improve loo functionality #496

ahartikainen · 2019-01-05T08:39:27Z

loo functionality should be more verbose and return more informative data. (see Model Selection lecture in StanCon Helsinki by @avehtari)

Also, the docstrings need extensive examples of loo usage.

avehtari · 2019-01-07T08:32:58Z

I'm in the process of adding more explanations to the loo package about interpreting loo output (approximate SE of elpd_diff, p_loo, k): stan-dev/loo#81. I'll comment here when those are ready.

avehtari · 2019-01-20T09:11:49Z

Because of some confusing questions in discourse.mc-stan.org I found out that loo() help says

Calculates leave-one-out (LOO) cross-validation for out of sample predictive model fit, following Vehtari et al. (2015).

The year should be 2017 (http://link.springer.com/article/10.1007/s11222-016-9696-4), but more importantly, that paper uses the log score. However, the code computes

loo_lppd_i = -2 * _logsumexp(log_weights, axis=0)

that is, log score multiplied by -2.

As the lppd has a specific definition in Vehtari et al. (2017) which doesn't include -2, I recommend removing that -2. If you want to print out a value multiplied by -2, call it something else and make it clear that it's lppd multiplied by -2. This will make it much easier to answer questions about whether one model is better than another.
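To make the difference between the two scales concrete, here is a minimal sketch of the pointwise LOO log score next to its -2 multiple. It uses plain importance sampling rather than the Pareto smoothed version from the paper, and it is not the ArviZ or PyMC code: the function names and the nested-list layout of `log_lik` are assumptions made for this illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def is_loo_pointwise(log_lik):
    """Plain importance-sampling LOO (no Pareto smoothing).

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and
    observation i. Returns elpd_loo_i on the log scale, one per observation.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    pointwise = []
    for i in range(n_obs):
        neg_ll = [-log_lik[s][i] for s in range(n_draws)]
        # elpd_loo_i = -log( (1/S) * sum_s 1 / p(y_i | theta_s) )
        pointwise.append(math.log(n_draws) - logsumexp(neg_ll))
    return pointwise


# Toy input: 3 draws, 2 observations (made-up numbers).
log_lik = [[-1.0, -2.0], [-1.2, -1.8], [-0.9, -2.1]]
elpd_loo = sum(is_loo_pointwise(log_lik))  # log score, as in the paper
minus_two_elpd = -2 * elpd_loo             # the same quantity times -2
```

Higher `elpd_loo` is better, while lower `minus_two_elpd` is better, which is why the returned value needs a name that says which one it is.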

avehtari · 2019-01-20T09:15:16Z

I also noticed that the waic() help refers to the DIC paper by Spiegelhalter et al., but it should refer to one of Watanabe's papers. The code again has -2, but Watanabe didn't multiply by -2, so again, if you want to keep -2, you have to be explicit that the function is returning something different from what is in the paper.
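For reference, WAIC on the log scale (no -2) can be sketched as below, following the form given in Vehtari et al. (2017): the pointwise log predictive density minus the posterior variance of the log likelihood. This is a rough illustration, not the ArviZ or PyMC implementation. The function name `waic_pointwise`, the nested-list layout, and the use of the sample variance (n - 1 denominator) are assumptions made here.

```python
import math
import statistics


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def waic_pointwise(log_lik):
    """Pointwise elpd_waic on the log scale.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and
    observation i. Needs at least two draws for the variance term.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    pointwise = []
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # lppd_i = log( (1/S) * sum_s p(y_i | theta_s) )
        lppd_i = logsumexp(column) - math.log(n_draws)
        # p_waic_i = variance over draws of log p(y_i | theta_s)
        p_waic_i = statistics.variance(column)
        pointwise.append(lppd_i - p_waic_i)
    return pointwise


# Toy input: 2 draws, 1 observation (made-up numbers).
elpd_waic = sum(waic_pointwise([[-1.0], [-3.0]]))
```

Multiplying `elpd_waic` by -2 gives a deviance-scale value, which is a different quantity and would need a different name.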

ahartikainen · 2019-01-20T09:53:50Z

Hmm, where does that -2 come from?

@aloctavodia @junpenglao

pymc-devs/pymc@90c7286#diff-e41ef58a3a4077bc00fca799f90d12a4

And this

pymc-devs/pymc@5d8a8c5#diff-e41ef58a3a4077bc00fca799f90d12a4

ahartikainen · 2019-01-20T13:06:54Z

So it is the deviance scale. I think we could let the user decide the scale, or return all scales (deviance, log, neglog?).
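One possible shape for a user-selectable scale is sketched below. The helper `convert_elpd` and the scale names are hypothetical, chosen only for this illustration, and do not describe an existing API.

```python
def convert_elpd(elpd, scale="log"):
    """Convert a log-scale elpd value to the requested scale.

    Hypothetical scale names:
      "log"          -> elpd          (higher is better)
      "negative_log" -> -1 * elpd     (lower is better)
      "deviance"     -> -2 * elpd     (lower is better)
    """
    multipliers = {"log": 1.0, "negative_log": -1.0, "deviance": -2.0}
    if scale not in multipliers:
        raise ValueError(f"unknown scale {scale!r}; expected one of {sorted(multipliers)}")
    return multipliers[scale] * elpd
```

Keeping the log scale as the stored value and converting only for display would keep the result consistent with the definition in the paper.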

avehtari · 2019-01-20T19:19:52Z

The R loo package stores and shows elpd_loo and looic. elpd_loo is clear, as it has the specific definition in Vehtari et al. (2017). I'm not happy with looic, as it still doesn't indicate whether the multiplier is -1 or -2, but it's been there for a while. (DIC is the only information criterion indicating the deviance in the name. Unfortunately, the original definition of deviance really makes sense only in the case of point estimates, and thus "deviance scale" is also a somewhat misleading name for -2 times the log score.)

avehtari · 2019-02-21T21:12:53Z

There is now additional documentation in the loo package. See the pull request stan-dev/loo#98 (review). The glossary will also appear on the loo web pages after the next release in spring.

OriolAbril mentioned this issue May 23, 2019, in "pointwise elpd diagnostics (text formatting and plot)" #678 (merged)

aloctavodia closed this as completed in #678 Jun 9, 2019
