Unrealistic correlation between sterodynamic and VLM components due to default Monte Carlo seed (1234) #349

Open
jetesdal opened this issue Nov 21, 2024 · 3 comments

Comments

@jetesdal
Contributor

jetesdal commented Nov 21, 2024

I have observed an artificial correlation between the sterodynamic and vertical land motion components in the FACTS output, which is visible in scatter plots. See an example for the tide gauge location at Tema (Ghana) using workflow '1e':
figure-Copy1

The correlation appears to arise because both modules use the same default random seed (1234) for their Monte Carlo sampling:

See in tlm_sterodynamics_postprocess.py:

parser.add_argument('--seed', help="Seed value for random number generator", default=1234, type=int)

and in kopp14_verticallandmotion_postprocess.py

parser.add_argument('--seed', help="Seed value for random number generator", default=1234, type=int)
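A minimal sketch of the mechanism, using only the Python standard library. It does not reproduce the actual FACTS sampling code; the helper `draw_normal`, the Gaussian form, and the means and standard deviations are illustrative assumptions. It only shows that two generators started from the same seed produce the same underlying sequence, so samples built from them are linearly related even when their location and scale differ.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def draw_normal(seed, mean, sd, n):
    """Hypothetical stand-in for a module's Monte Carlo draw."""
    rng = random.Random(seed)  # independent generator per "module"
    return [rng.gauss(mean, sd) for _ in range(n)]


n = 2000
# Arbitrary illustrative parameters, not values from FACTS.
stero = draw_normal(1234, 200.0, 50.0, n)
vlm_same_seed = draw_normal(1234, -30.0, 80.0, n)
vlm_new_seed = draw_normal(5678, -30.0, 80.0, n)

r_same = pearson(stero, vlm_same_seed)
r_new = pearson(stero, vlm_new_seed)
print(f"same seed: r = {r_same:.3f}")
print(f"different seeds: r = {r_new:.3f}")
```

With identical seeds the two samples are affine transforms of the same standard normal draws, so the correlation is essentially 1; with different seeds it should be close to zero.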

The unintended correlation can lead to a misleading variance decomposition that overstates the role of interaction effects, where the variance of the total is larger than the sum of the component variances. This is particularly visible at any location with sizable sea level change due to vertical land motion, such as Tema (Ghana):

figure (1)
figure (2)
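The variance argument can be made concrete with a small sketch. For two components, Var(A + B) = Var(A) + Var(B) + 2 Cov(A, B), so any correlation between the components shows up as a residual "interaction" term. The code below is an illustration under assumed Gaussian draws with arbitrary parameters; `draw_normal` and `interaction_fraction` are hypothetical helpers, not part of FACTS, and the real decomposition involves more than two components.

```python
import random
import statistics


def draw_normal(seed, mean, sd, n):
    """Hypothetical stand-in for a module's Monte Carlo draw."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


def interaction_fraction(a, b):
    """Share of Var(a + b) not explained by Var(a) + Var(b).

    For two components this residual equals 2 * Cov(a, b) / Var(a + b).
    """
    total = [x + y for x, y in zip(a, b)]
    var_total = statistics.pvariance(total)
    interaction = var_total - statistics.pvariance(a) - statistics.pvariance(b)
    return interaction / var_total


n = 2000
# Arbitrary illustrative parameters, not values from FACTS.
stero = draw_normal(1234, 200.0, 50.0, n)
vlm_same_seed = draw_normal(1234, -30.0, 80.0, n)
vlm_new_seed = draw_normal(5678, -30.0, 80.0, n)

frac_same = interaction_fraction(stero, vlm_same_seed)
frac_new = interaction_fraction(stero, vlm_new_seed)
print(f"interaction share, same seed: {frac_same:.2%}")
print(f"interaction share, different seeds: {frac_new:.2%}")
```

In this toy setup the shared seed produces a large positive interaction share, while independent seeds leave only a small residual from sampling noise.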

From the figures above, the "Interaction Effect" often appears to be more than 20% of the variance in total sea level change, which I suspect is mostly because of the artificial correlation between the sterodynamic and vertical land motion components. I wonder how to address this. Should the default seeds be changed so that they differ across components/modules? Are there any other modules that apply Monte Carlo sampling with a default seed of 1234?
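One way to check which other modules share the default is to scan the source tree for `--seed` arguments. The sketch below assumes the modules declare the seed with `parser.add_argument('--seed', ..., default=<int>, ...)` in the same style as the two lines quoted above; `find_default_seeds` is a hypothetical helper, and the simple regular expression will miss declarations that are formatted differently (for example, a help string containing a closing parenthesis).

```python
import re
from pathlib import Path

# Matches add_argument('--seed', ..., default=<int>, ...) within a single call.
SEED_ARG = re.compile(
    r"""add_argument\(\s*['"]--seed['"][^)]*?default\s*=\s*(\d+)""",
    re.DOTALL,
)


def find_default_seeds(root):
    """Map each default seed value to the .py files under root that declare it."""
    hits = {}
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        for match in SEED_ARG.finditer(text):
            hits.setdefault(int(match.group(1)), []).append(str(path))
    return hits
```

Calling `find_default_seeds("modules")` on a checkout (the directory name here is an assumption) would return a dictionary such as `{1234: [...paths...]}`, making it easy to see which modules collide on the same default.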

@bobkopp
Collaborator

bobkopp commented Nov 21, 2024

This is a common default, but usually not an issue because of different relationships between the random number draws and the sea level outcome. In this case, I guess the underlying statistical form is similar enough it is a problem. This does suggest we need some standard method for assigning a unique default seed to each module, but for the moment, see what happens if you change the seed. Seed is a model parameter, so no code change is needed to affect this; it should be possible to do this in the experiment file by specifying the seed parameter.
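As one possible shape for such a standard method, the sketch below derives a distinct default seed for each module from a shared base seed and the module's name. This is an illustration only, not an existing FACTS mechanism: `module_seed` and the module name strings are hypothetical, and hashing is just one of several reasonable ways to do this.

```python
import hashlib


def module_seed(base_seed: int, module_name: str) -> int:
    """Derive a reproducible 32-bit seed from a base seed and a module name.

    The same (base_seed, module_name) pair always yields the same seed,
    while different module names yield different seeds with very high
    probability.
    """
    key = f"{base_seed}:{module_name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big")  # value in [0, 2**32)


# Hypothetical module identifiers, used here only for illustration.
seed_stero = module_seed(1234, "tlm/sterodynamics")
seed_vlm = module_seed(1234, "kopp14/verticallandmotion")
print(seed_stero, seed_vlm)
```

A scheme like this keeps a single user-facing base seed for reproducibility while avoiding identical random streams across modules.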

@jetesdal
Contributor Author

jetesdal commented Nov 21, 2024

Hi @bobkopp! Thanks for your quick response to this issue. I followed your suggestion and ran a test by specifying a different seed for VLM in the experiment configuration file:

k14vlm:
    module_set: "kopp14"
    module: "verticallandmotion"
    options:
        seed: 5678

This should set the seed to 5678 in the VLM module instead of the default of 1234, which is still used in the tlm/sterodynamics module.

The figures below show the comparison between the original config file and a new run with the modified config file (coupling.ssp245.seed), looking at the tide gauge location at Tema, Ghana where the uncertainty in VLM is relatively high.

figure (10)
The original experiment (coupling.ssp245) exhibits a strong correlation between sterodynamics and VLM (left panel, ssp245). After changing the seed value in the VLM module to 5678, the correlation is no longer apparent (right panel, ssp245.seed).

In the new experiment, the interaction effect is very small, so the sum of the component variances closely matches the total variance:
figure (8)

I think this will affect the projection of total sea level change. You can see this by comparing the 5-95 interval from the .total.workflow.wf1e.local.nc output:
figure (11)
I think the wider spread in the original setting (see blue projection called "ssp245") is because of the correlation between sterodynamics and VLM. With the modified configuration file, the uncertainty range is more constrained (see red projection called "ssp245.seed").

@bobkopp
Collaborator

bobkopp commented Nov 22, 2024

This should be addressed systematically, but for the moment, why don't you do a pull request to change the default seed for the vlm module?
