wrong averaging of AR standard deviations #307
cc @leabeusch I defer this issue to v1.0, as I don't want to add "numerically changing" results in v0.9.
We may have to add: (see `mesmer/mesmer/stats/auto_regression.py`, lines 227 to 230 at `acd80e4`)
* auto_regression: return covariance
* add reference to #307
* CHANGELOG
I started looking into this. At the moment, this is what we do (see `mesmer/mesmer/stats/_auto_regression.py`, lines 60 to 105 at `f8287ba`).
What we need to do to get the average std is average the variances, taking into account the number of observations (in the residuals).

Now I have a question. We first average over ensemble members and then over scenarios. This means that we give all scenarios equal weight, no matter how many ensemble members each holds. Or, from a different perspective, we give different ensemble members different weights, depending on how many of them are in one scenario. Is that what we want?

From a coding perspective it would be easier to give each ensemble member equal weight, since then we can simply stack the ensemble and scenario dimensions and do the averaging. Otherwise we would also have to think about how to do the "second averaging" over scenarios for the standard deviation. Given that we assume the variance of the driving white-noise process should be the same for each ensemble member and scenario, I think stacking the dimensions would be most intuitive (and I think the same holds true for the AR process).
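The two weighting choices can be sketched in plain numpy (the array shapes and names below are illustrative, not mesmer's actual data layout):

```python
import numpy as np

# Hypothetical residuals: two scenarios with different numbers of
# ensemble members (sizes are illustrative).
rng = np.random.default_rng(0)
scen_a = [rng.standard_normal(100) for _ in range(5)]  # 5 ensemble members
scen_b = [rng.standard_normal(100) for _ in range(2)]  # 2 ensemble members

# Option A: stack ensemble and scenario dimensions
# -> equal weight per ensemble member
var_per_member = np.mean([m.var() for m in scen_a + scen_b])

# Option B: average within each scenario first, then across scenarios
# -> equal weight per scenario (down-weights members of scen_a)
var_per_scenario = np.mean([
    np.mean([m.var() for m in scen_a]),
    np.mean([m.var() for m in scen_b]),
])
```

The two options only agree when every scenario has the same number of ensemble members.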
For the linear regression we stack all the data and weight each scenario equally (that means we down-weight ensemble members for scenarios with many ensemble members - but that's not yet implemented for the new mesmer code path...). The idea is that we don't want the hist or ssp585 scenarios to dominate the estimates. (I think this argument is less important for the AR process and the std.) We cannot stack the data for the AR process, as there are boundaries (we could stack the residuals, though). Lukas would maybe argue to do it anyway. So we could go two ways (which is my difficulty here).
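The scenario-equal weighting for the linear regression can be sketched as a weighted least-squares fit in which each sample carries weight 1 / (number of members in its scenario); this is a minimal illustration in plain numpy, not mesmer's actual code path:

```python
import numpy as np

# Two hypothetical scenarios with unequal ensemble sizes.
rng = np.random.default_rng(1)
n_members = {"ssp126": 2, "ssp585": 10}

X, y, w = [], [], []
for scen, n in n_members.items():
    for _ in range(n):
        x = rng.standard_normal(50)
        X.append(x)
        y.append(2.0 * x + 0.1 * rng.standard_normal(50))
        # each scenario contributes total weight 1, regardless of
        # how many ensemble members it has
        w.append(np.full(50, 1.0 / n))

X = np.concatenate(X)
y = np.concatenate(y)
w = np.concatenate(w)

# weighted least squares for a single slope (no intercept)
slope = np.sum(w * X * y) / np.sum(w * X * X)
```

With these weights, adding more members to ssp585 sharpens its scenario-mean contribution but does not let it dominate the fit.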
😕 So are we actually correct in taking the mean of the variances? Not sure if that answers your questions?

Question: in our application the mean of the residuals is 0, right? Because the mean plays into the combined variance:

```python
import numpy as np

# ---
# same size a & b - mean != 0
a = np.random.randn(100)
b = np.random.randn(100)

var_dir = np.var(np.concatenate([a, b]))
var_via = (a.var() + b.var()) / 2 + ((a.mean() - b.mean()) / 2) ** 2

np.testing.assert_allclose(var_dir, var_via)

# ---
# different size a & b - mean == 0
a = np.random.randn(100)
b = np.random.randn(20)
a -= a.mean()
b -= b.mean()

var_dir = np.var(np.concatenate([a, b]))
var_via = (a.var() * a.size + b.var() * b.size) / (a.size + b.size)

np.testing.assert_allclose(var_dir, var_via)
```

I did not manage to find a formula for different sizes and mean != 0.
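For what it's worth, there is a general formula covering different sizes and nonzero means at once: the pooled variance (law of total variance, with `ddof=0` as above). Each group contributes its variance plus the squared offset of its mean from the overall weighted mean, weighted by its size. A sketch:

```python
import numpy as np

# different size a & b, mean != 0: pooled variance via the
# law of total variance (within-group + between-group part)
a = np.random.randn(100) + 0.5
b = np.random.randn(20) - 1.0

n_a, n_b = a.size, b.size
n = n_a + n_b
m = (n_a * a.mean() + n_b * b.mean()) / n  # overall weighted mean

var_dir = np.var(np.concatenate([a, b]))
var_via = (n_a * (a.var() + (a.mean() - m) ** 2)
           + n_b * (b.var() + (b.mean() - m) ** 2)) / n

np.testing.assert_allclose(var_dir, var_via)
```

The two special cases above fall out of this: equal sizes reduce the between-group term to `((a.mean() - b.mean()) / 2) ** 2`, and zero means remove it entirely.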
I don't mean stacking them for the fitting but simply stacking them for the averaging. So I just wanted to say that we could compute the average over all ensemble members, regardless of which scenario they are in.
Hm, I read Lea's paper again and I understand this better now. Did she/you ever try to weight the ensemble members equally? That would be interesting. But for now I guess we should keep it as is. Then I would opt for averaging the variances with
I mean, since the ensemble members generally have the same number of time steps, it's not very wrong. But averaging the standard deviations definitely is.
I guess the approach is to always transform to a mean of 0 first, which we do, so that is fine, no?
So as I see it, there are now several options for doing the weighting according to the number of time steps and ensemble members. And when we do implement the weighting, we not only need to do it for the variances but also for the coefficients and intercepts! Option one:
🙃 I prefer either one or three. We could also implement a choice between the two.
Or let the user give weights?
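Whichever weighting is chosen (time steps, ensemble members, or user-supplied), the averaging itself reduces to `np.average` with a `weights` argument; a sketch with illustrative numbers:

```python
import numpy as np

# Per-member variances and two candidate weightings (values illustrative).
variances = np.array([1.1, 0.9, 1.3, 0.8])
n_timesteps = np.array([165, 165, 86, 86])    # e.g. hist vs ssp lengths
user_weights = np.array([1.0, 1.0, 0.5, 0.5])  # user-supplied

# weight by number of time steps
var_by_length = np.average(variances, weights=n_timesteps)

# weight by user-supplied weights
var_by_user = np.average(variances, weights=user_weights)
```

`np.average` normalizes the weights internally, so only their relative sizes matter.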
In `train_gv_AR` we take the average over the AR params, including the standard deviation. However, this is wrong: you have to average the covariances (or even take the size of the samples into account: https://stats.stackexchange.com/q/25848).

See `mesmer/mesmer/calibrate_mesmer/train_gv.py`, lines 195 to 208 at `acd80e4`.
These are used here: `mesmer/mesmer/create_emulations/create_emus_gv.py`, line 176 at `acd80e4`.
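A minimal numpy illustration of why averaging the standard deviations is biased (independent of mesmer): the square root is concave, so by Jensen's inequality the mean of the stds is at most the square root of the mean variance, with equality only when all stds are equal.

```python
import numpy as np

# Two groups with different noise levels.
stds = np.array([0.5, 2.0])

mean_of_stds = stds.mean()                   # what averaging stds gives: 1.25
std_of_mean_var = np.sqrt((stds**2).mean())  # correct pooling: ~1.458

# Jensen: mean(std) <= sqrt(mean(var))
assert mean_of_stds < std_of_mean_var
```

Emulations driven by the too-small std would systematically underestimate the variability of the white-noise process.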
Originally posted by @mathause in #306 (comment)