
wrong averaging of AR standard deviations #307

Open
mathause opened this issue Sep 27, 2023 · 7 comments

Comments

@mathause
Member

mathause commented Sep 27, 2023

In train_gv_AR we take the average over the AR params - including the standard deviation. However, this is wrong: we have to average the (co-)variances (or even account for the sample sizes: https://stats.stackexchange.com/q/25848)

params_scen = list()
for scen in gv.keys():
    data = gv[scen]

    # create temporary DataArray
    data = xr.DataArray(data, dims=("run", "time"))

    params = _fit_auto_regression_xr(data, dim="time", lags=AR_order_sel)
    params = params.mean("run")

    params_scen.append(params)

params_scen = xr.concat(params_scen, dim="scen")
params_scen = params_scen.mean("scen")

These are used here:

covariance=AR_std_innovs**2, # pass the (co-)variance!

Originally posted by @mathause in #306 (comment)
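To see why averaging the standard deviations directly is biased, here is a quick numpy check (sample sizes and spreads are made up): the mean of the standard deviations systematically underestimates the standard deviation implied by the mean of the variances, by the AM-QM inequality.

```python
import numpy as np

# illustrative only: two "ensemble members" with different residual spread
rng = np.random.default_rng(0)
a = rng.normal(0, 1, size=1000)
b = rng.normal(0, 3, size=1000)

mean_of_stds = (a.std() + b.std()) / 2              # what averaging stds gives
std_of_mean_var = np.sqrt((a.var() + b.var()) / 2)  # sqrt of the averaged variances

# the mean of stds is <= the sqrt of the mean variance (AM-QM inequality),
# strictly smaller whenever the stds differ
assert mean_of_stds < std_of_mean_var
```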

@mathause
Member Author

cc @leabeusch

I defer this issue to v1.0 as I don't want to add "numerically changing" results in v0.9

@mathause
Member Author

We may have to add nobs from AR_result to be able to calculate a more correct mean covariance:

AR_result = AR_model.fit()
intercept = AR_result.params[0]
coeffs = AR_result.params[1:]
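As a sketch of what carrying nobs through could look like, here is a numpy-only AR(p) fit that also reports the number of observations entering the regression (function name and return signature are my own, not MESMER or statsmodels API; the statsmodels result object carries a comparable count):

```python
import numpy as np

# illustrative numpy-only AR(p) fit that also reports nobs
def fit_ar(data, lags):
    n = data.size
    nobs = n - lags  # observations actually entering the regression

    # design matrix: intercept column + the lagged values
    X = np.column_stack(
        [np.ones(nobs)] + [data[lags - k : n - k] for k in range(1, lags + 1)]
    )
    y = data[lags:]

    params, *_ = np.linalg.lstsq(X, y, rcond=None)
    resids = y - X @ params
    variance = resids.var()

    return params[0], params[1:], variance, nobs
```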

@mathause mathause added the bug Something isn't working label Sep 27, 2023
@mathause mathause added this to the v1.0.0 milestone Sep 27, 2023
mathause added a commit to mathause/mesmer that referenced this issue Sep 27, 2023
mathause added a commit that referenced this issue Sep 27, 2023
* auto_regression: return covariance

* add reference to #307

* CHANGELOG
@veni-vidi-vici-dormivi
Collaborator

I started looking into this. At the moment, this is what we do:

def _fit_auto_regression_scen_ens(*objs, dim, ens_dim, lags):
    """
    fit an auto regression and potentially calculate the mean over ensemble members
    and scenarios

    Parameters
    ----------
    *objs : iterable of DataArray
        A list of ``xr.DataArray`` to estimate the auto regression over.
    dim : str
        Dimension along which to fit the auto regression.
    ens_dim : str
        Dimension name of the ensemble members.
    lags : int
        The number of lags to include in the model.

    Returns
    -------
    :obj:`xr.Dataset`
        Dataset containing the estimated parameters of the ``intercept``, the AR
        ``coeffs`` and the ``variance`` of the residuals.

    Notes
    -----
    Calculates the mean auto regression, first over the ensemble members, then over
    all scenarios.
    """

    ar_params_scen = list()
    for obj in objs:
        ar_params = fit_auto_regression(obj, dim=dim, lags=int(lags))

        # BUG/ TODO: fix for v1, see https://github.com/MESMER-group/mesmer/issues/307
        ar_params["standard_deviation"] = np.sqrt(ar_params.variance)

        if ens_dim in ar_params.dims:
            ar_params = ar_params.mean(ens_dim)

        ar_params_scen.append(ar_params)

    ar_params_scen = xr.concat(ar_params_scen, dim="scen")

    # return the mean over all scenarios
    ar_params = ar_params_scen.mean("scen")

    return ar_params

What we need to do to get the average std is average the variances, taking into account the number of observations (in the residuals):

$\mathrm{avg}(\sigma) = \sqrt{\frac{\sum_{l=0}^{k} (n_l-1) \cdot \sigma_l^2}{\sum_{l=0}^{k} (n_l-1)}}$ with $l$ indexing the ensemble members (for one scenario (?, see below)).
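A minimal sketch of that pooled estimate (function name and signature are my own, not MESMER API):

```python
import numpy as np

def pooled_std(stds, nobs):
    """Pooled standard deviation of several samples, assuming equal (zero) means.

    stds : per-member residual standard deviations
    nobs : per-member number of observations
    """
    stds = np.asarray(stds, dtype=float)
    nobs = np.asarray(nobs)

    # weight each variance by its degrees of freedom, then normalize
    pooled_var = np.sum((nobs - 1) * stds**2) / np.sum(nobs - 1)
    return np.sqrt(pooled_var)
```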

Now I have a question. We first average over ensemble members and then over scenarios. This means that we give all scenarios equal weight, no matter how many ensemble members they hold. Or, from a different perspective, we give different ensemble members different weights, depending on how many of them are in one scenario. Is that what we want?

From a coding perspective it would be easier to give each ensemble member equal weight, since then we can simply stack the ensemble and scenario dimensions and then do the averaging. Otherwise we would also have to think about how to do the "second averaging" of the standard deviation over the scenarios. Given that we assume the variance of the driving white noise process should be the same for each ensemble member and scenario, I think stacking the dimensions would be most intuitive (and I think the same actually holds for the AR process).

@mathause
Member Author

For the linear regression we stack all the data and we weight each scenario equally (that means we down-weight ensemble members for scenarios with many ensemble members - but that's not yet implemented for the new mesmer code path...). The idea is that we don't want the hist or ssp585 scenarios to dominate the estimates. (I think this argument is less important for the AR process and the std.)

We cannot stack the data for the AR process as there are boundaries (we could stack the residuals). Lukas would maybe argue to do it anyways.

So we could go two ways (which is my difficulty here)

  1. Follow the formula and weight the variances by the number of data points (n_year * n_ens).
  2. Do a weighted variance where we weight the variance by 1 / (n_year * n_ens)

😕

So are we actually correct in taking the mean of the variances?

Not sure if that answers your questions?


Question: in our application, is the mean of the residuals 0? Because the mean plays into the mean variance...

$$\mathrm{Var}(Y) = \mathrm{E}[\mathrm{Var}(Y \mid X)] + \mathrm{Var}(\mathrm{E}[Y \mid X])$$

import numpy as np

# ---
# same size a & b - mean != 0

a = np.random.randn(100)
b = np.random.randn(100)

var_dir = np.var(np.concatenate([a, b]))
var_via = (a.var() + b.var()) / 2 + ((a.mean() - b.mean()) / 2)**2

np.testing.assert_allclose(var_dir, var_via)

# ---
# different size a & b - mean == 0

a = np.random.randn(100)
b = np.random.randn(20)

a -= a.mean()
b -= b.mean()

var_dir = np.var(np.concatenate([a, b]))
var_via = (a.var()*a.size + b.var() * b.size) / (a.size + b.size)

np.testing.assert_allclose(var_dir, var_via)

I did not manage to find a formula for different size and mean != 0.
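For what it's worth, a combined formula for different sizes and nonzero means does follow from the law of total variance quoted above (sketch only; note `np.var` uses `ddof=0`, which makes the identity exact):

```python
import numpy as np

# different size a & b, mean != 0: combine via the law of total variance
rng = np.random.default_rng(0)
a = rng.normal(size=100) + 0.5
b = rng.normal(size=20) - 0.3

n = a.size + b.size
mean_all = (a.sum() + b.sum()) / n

var_dir = np.var(np.concatenate([a, b]))
var_via = (
    (a.var() * a.size + b.var() * b.size) / n        # E[Var(Y|X)]
    + (a.size * (a.mean() - mean_all) ** 2
       + b.size * (b.mean() - mean_all) ** 2) / n    # Var(E[Y|X])
)

np.testing.assert_allclose(var_dir, var_via)
```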

@veni-vidi-vici-dormivi
Collaborator

We cannot stack the data for the AR process as there are boundaries

I don't mean stacking them for fitting but simply stacking them for the averaging. So I just wanted to say that we could just compute the average over all ensemble members, regardless of which scenario they are in.

The idea is that we don't want the hist or ssp585 scenarios to dominate the estimates. (I think this argument is less important for the AR process and the std.)

Hm, I read Lea's paper again and I understand this better now. Did she/you ever try to weight the ensemble members equally? That would be interesting. But for now I guess we should keep it as is. Then I would opt for averaging the variances with n_years and n_ens, as in your option 1.

So are we actually correct by taking the mean of the variances?

I mean, since the ensemble members generally have the same number of time steps, it's not very wrong. But averaging the standard deviations definitely is.

I did not manage to find a formula for different size and mean =! 0.

I guess the approach is to always transform to a mean of 0 first, which we do, so that is fine, no?

@veni-vidi-vici-dormivi
Collaborator

veni-vidi-vici-dormivi commented Aug 22, 2024

So as I see it there are now several options to do the weighting according to number of time steps and ensemble members. And when we do implement the weighting we do not only need to do it for the variances but also for the coefficients and intercepts!

The options:

  1. Leave as is, meaning scenarios are weighted equally, not considering the number of ensemble members or time steps
  2. Weigh each scenario by number of ensemble members but not by number of time steps
  3. Weigh each scenario by number of time steps and number of ensemble members
  4. Weigh each ensemble member by the number of time steps it has but then not by the number of members for the scenario.

🙃

I prefer either one or three. We could also implement a choice between the two.
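A toy comparison of options 1 and 3 (all numbers invented, just to show the two weightings):

```python
import numpy as np

# hypothetical per-scenario values, purely illustrative
variances = np.array([1.2, 0.8, 1.0])  # mean residual variance per scenario
n_ens = np.array([10, 3, 5])           # ensemble members per scenario
n_year = np.array([165, 86, 86])       # time steps per scenario

# option 1: every scenario counts the same
w_equal = np.ones_like(variances) / variances.size

# option 3: weight by number of time steps * number of ensemble members
w_size = n_ens * n_year / (n_ens * n_year).sum()

var_opt1 = (w_equal * variances).sum()
var_opt3 = (w_size * variances).sum()
```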

@veni-vidi-vici-dormivi veni-vidi-vici-dormivi removed the bug Something isn't working label Aug 22, 2024
@veni-vidi-vici-dormivi
Collaborator

Or let the user give weights?
