
wrong averaging of AR standard deviations #307

Open
mathause opened this issue Sep 27, 2023 · 7 comments

Comments

@mathause
Member

mathause commented Sep 27, 2023

In train_gv_AR we take the average over the AR params - including the standard deviation. However, this is wrong: we have to average the (co-)variances (or even account for the sample sizes: https://stats.stackexchange.com/q/25848)

params_scen = list()
for scen in gv.keys():
    data = gv[scen]

    # create temporary DataArray
    data = xr.DataArray(data, dims=("run", "time"))

    params = _fit_auto_regression_xr(data, dim="time", lags=AR_order_sel)
    params = params.mean("run")

    params_scen.append(params)

params_scen = xr.concat(params_scen, dim="scen")
params_scen = params_scen.mean("scen")

These are used here:

covariance=AR_std_innovs**2, # pass the (co-)variance!

Originally posted by @mathause in #306 (comment)
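To see why averaging the standard deviations directly is biased, here is a quick numpy check (sample sizes and spreads are made up): the mean of the standard deviations systematically underestimates the standard deviation implied by the mean of the variances, by the AM-QM inequality.

```python
import numpy as np

# illustrative only: two "ensemble members" with different residual spread
rng = np.random.default_rng(0)
a = rng.normal(0, 1, size=1000)
b = rng.normal(0, 3, size=1000)

mean_of_stds = (a.std() + b.std()) / 2              # what averaging stds gives
std_of_mean_var = np.sqrt((a.var() + b.var()) / 2)  # sqrt of the averaged variances

# the mean of stds is <= the sqrt of the mean variance (AM-QM inequality),
# strictly smaller whenever the stds differ
assert mean_of_stds < std_of_mean_var
```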

@mathause
Member Author

cc @leabeusch

I defer this issue to v1.0 as I don't want to add "numerically changing" results in v0.9

@mathause
Member Author

We may have to add nobs from AR_result to be able to calculate a more correct mean covariance:

AR_result = AR_model.fit()
intercept = AR_result.params[0]
coeffs = AR_result.params[1:]
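As a sketch of what carrying nobs through could look like, here is a numpy-only AR(p) fit that also reports the number of observations entering the regression (function name and return signature are my own, not MESMER or statsmodels API; the statsmodels result object carries a comparable count):

```python
import numpy as np

# illustrative numpy-only AR(p) fit that also reports nobs
def fit_ar(data, lags):
    n = data.size
    nobs = n - lags  # observations actually entering the regression

    # design matrix: intercept column + the lagged values
    X = np.column_stack(
        [np.ones(nobs)] + [data[lags - k : n - k] for k in range(1, lags + 1)]
    )
    y = data[lags:]

    params, *_ = np.linalg.lstsq(X, y, rcond=None)
    resids = y - X @ params
    variance = resids.var()

    return params[0], params[1:], variance, nobs
```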

@mathause mathause added the bug Something isn't working label Sep 27, 2023
@mathause mathause added this to the v1.0.0 milestone Sep 27, 2023
mathause added a commit to mathause/mesmer that referenced this issue Sep 27, 2023
mathause added a commit that referenced this issue Sep 27, 2023
* auto_regression: return covariance

* add reference to #307

* CHANGELOG
@veni-vidi-vici-dormivi
Collaborator

I started looking into this. At the moment, this is what we do:

def _fit_auto_regression_scen_ens(*objs, dim, ens_dim, lags):
    """
    fit an auto regression and potentially calculate the mean over ensemble members
    and scenarios

    Parameters
    ----------
    *objs : iterable of DataArray
        A list of ``xr.DataArray`` to estimate the auto regression over.
    dim : str
        Dimension along which to fit the auto regression.
    ens_dim : str
        Dimension name of the ensemble members.
    lags : int
        The number of lags to include in the model.

    Returns
    -------
    :obj:`xr.Dataset`
        Dataset containing the estimated parameters of the ``intercept``, the AR
        ``coeffs`` and the ``variance`` of the residuals.

    Notes
    -----
    Calculates the mean auto regression, first over the ensemble members, then over
    all scenarios.
    """

    ar_params_scen = list()
    for obj in objs:
        ar_params = fit_auto_regression(obj, dim=dim, lags=int(lags))

        # BUG/ TODO: fix for v1, see https://github.com/MESMER-group/mesmer/issues/307
        ar_params["standard_deviation"] = np.sqrt(ar_params.variance)

        if ens_dim in ar_params.dims:
            ar_params = ar_params.mean(ens_dim)

        ar_params_scen.append(ar_params)

    ar_params_scen = xr.concat(ar_params_scen, dim="scen")

    # return the mean over all scenarios
    ar_params = ar_params_scen.mean("scen")

    return ar_params

What we need to do to get the average std is average the variances, taking into account the number of observations (in the residuals):

$\mathrm{avg}(\sigma) = \sqrt{\frac{\sum_{l=0}^{k} (n_l-1) \cdot \sigma_l^2}{\sum_{l=0}^{k} (n_l-1)}}$ with $l$ indexing the ensemble members (for one scenario (?, see below)).
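A minimal sketch of that pooled estimate (function name and signature are my own, not MESMER API):

```python
import numpy as np

def pooled_std(stds, nobs):
    """Pooled standard deviation of several samples, assuming equal (zero) means.

    stds : per-member residual standard deviations
    nobs : per-member number of observations
    """
    stds = np.asarray(stds, dtype=float)
    nobs = np.asarray(nobs)

    # weight each variance by its degrees of freedom, then normalize
    pooled_var = np.sum((nobs - 1) * stds**2) / np.sum(nobs - 1)
    return np.sqrt(pooled_var)
```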

Now I have a question. We first average over ensemble members and then over scenarios. This means that we give all scenarios equal weight, no matter how many ensemble members they hold. Or, from a different perspective, we give different ensemble members different weights, depending on how many of them are in one scenario. Is that what we want?

From a coding perspective it would be easier to give each ensemble member equal weight, since then we can simply stack the ensemble and scenario dimensions and then do the averaging. Otherwise we would also have to think about how to do the "second averaging" of the standard deviation over the scenarios. Given that we assume the variance of the driving white noise process should be the same for each ensemble member and scenario, I think stacking the dimensions would be most intuitive (and I think the same actually holds for the AR process).

@mathause
Member Author

For the linear regression we stack all the data and we weight each scenario equally (that means we down-weight ensemble members for scenarios with many ensemble members - but that's not yet implemented for the new mesmer code path...). The idea is that we don't want the hist or ssp585 scenarios to dominate the estimates. (I think this argument is less important for the AR process and the std.)

We cannot stack the data for the AR process as there are boundaries (we could stack the residuals). Lukas would maybe argue to do it anyways.

So we could go two ways (which is my difficulty here)

  1. Follow the formula and weight the variances by the number of data points (n_year * n_ens).
  2. Do a weighted variance where we weight the variance by 1 / (n_year * n_ens)

😕

So are we actually correct in taking the mean of the variances?

Not sure if that answers your questions?


Question: in our application, is the mean of the residuals 0? Because the mean plays into the mean variance...

$$\mathrm{Var}(Y) = \mathrm{E}[\mathrm{Var}(Y \mid X)] + \mathrm{Var}(\mathrm{E}[Y \mid X])$$

import numpy as np

# ---
# same size a & b - mean != 0

a = np.random.randn(100)
b = np.random.randn(100)

var_dir = np.var(np.concatenate([a, b]))
var_via = (a.var() + b.var()) / 2 + ((a.mean() - b.mean()) / 2)**2

np.testing.assert_allclose(var_dir, var_via)

# ---
# different size a & b - mean == 0

a = np.random.randn(100)
b = np.random.randn(20)

a -= a.mean()
b -= b.mean()

var_dir = np.var(np.concatenate([a, b]))
var_via = (a.var()*a.size + b.var() * b.size) / (a.size + b.size)

np.testing.assert_allclose(var_dir, var_via)

I did not manage to find a formula for different size and mean != 0.
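For what it's worth, a combined formula for different sizes and nonzero means does follow from the law of total variance quoted above (sketch only; note `np.var` uses `ddof=0`, which makes the identity exact):

```python
import numpy as np

# different size a & b, mean != 0: combine via the law of total variance
rng = np.random.default_rng(0)
a = rng.normal(size=100) + 0.5
b = rng.normal(size=20) - 0.3

n = a.size + b.size
mean_all = (a.sum() + b.sum()) / n

var_dir = np.var(np.concatenate([a, b]))
var_via = (
    (a.var() * a.size + b.var() * b.size) / n        # E[Var(Y|X)]
    + (a.size * (a.mean() - mean_all) ** 2
       + b.size * (b.mean() - mean_all) ** 2) / n    # Var(E[Y|X])
)

np.testing.assert_allclose(var_dir, var_via)
```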

@veni-vidi-vici-dormivi
Collaborator

We cannot stack the data for the AR process as there are boundaries

I don't mean stacking them for fitting but simply stacking them for the averaging. So I just wanted to say that we could just compute the average over all ensemble members, regardless of which scenario they are in.

The idea is that we don't want the hist or ssp585 scenarios to dominate the estimates. (I think this argument is less important for the AR process and the std.)

Hm, I read Lea's paper again and I understand this better now. Did she/you ever try to weight the ensemble members equally? That would be interesting. But for now I guess we should keep it as is. Then I would opt for averaging the variances with n_years and n_ens, as in your option 1.

So are we actually correct by taking the mean of the variances?

I mean, since the ensemble members generally have the same number of time steps, it's not very wrong. But averaging the standard deviations definitely is.

I did not manage to find a formula for different size and mean =! 0.

I guess the approach is to always transform to a mean of 0 first, which we do, so that is fine, no?

@veni-vidi-vici-dormivi
Collaborator

veni-vidi-vici-dormivi commented Aug 22, 2024

So as I see it there are now several options to do the weighting according to number of time steps and ensemble members. And when we do implement the weighting we do not only need to do it for the variances but also for the coefficients and intercepts!

The options:

  1. Leave as is, meaning scenarios are weighted equally, not considering the number of ensemble members or time steps
  2. Weigh each scenario by number of ensemble members but not by number of time steps
  3. Weigh each scenario by number of time steps and number of ensemble members
  4. Weigh each ensemble member by the number of time steps it has but then not by the number of members for the scenario.

🙃

I prefer either one or three. We could also implement a choice between the two.
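A toy comparison of options 1 and 3 (all numbers invented, just to show the two weightings):

```python
import numpy as np

# hypothetical per-scenario values, purely illustrative
variances = np.array([1.2, 0.8, 1.0])  # mean residual variance per scenario
n_ens = np.array([10, 3, 5])           # ensemble members per scenario
n_year = np.array([165, 86, 86])       # time steps per scenario

# option 1: every scenario counts the same
w_equal = np.ones_like(variances) / variances.size

# option 3: weight by number of time steps * number of ensemble members
w_size = n_ens * n_year / (n_ens * n_year).sum()

var_opt1 = (w_equal * variances).sum()
var_opt3 = (w_size * variances).sum()
```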

@veni-vidi-vici-dormivi veni-vidi-vici-dormivi removed the bug Something isn't working label Aug 22, 2024
@veni-vidi-vici-dormivi
Collaborator

Or let the user give weights?
