Scientific Defaults #5

Merged
merged 9 commits into from
Oct 21, 2024


lispandfound
Contributor

This PR introduces the new workflow changes for scientific defaults. These defaults are an accumulation of all of the magic numbers I could find in the old workflow codebase. The workflow stages read these defaults from, and write them to, realisations.

Why These Changes?

The old workflow kept defaults across a number of different files. Some default values, such as those for the high frequency simulations, were recorded in code and not recognised as values with scientific significance for simulations. Consolidating the defaults has three advantages:

  1. Simulations are more easily reproducible because the defaults and the code can no longer be inconsistent with each other. That is, the codebase can no longer use scientific defaults from 2022 for velocity modelling and EMOD3D while using last week's HF code (which, in the old workflow, carries its own HF defaults).
  2. Simulation defaults are easily shared and tweaked because they live in just one place.
  3. It is easier to swap out defaults, for example to run the same realisation at 200m instead of 400m.

Summary of the files involved

The workflow/default_parameters directory contains four sets of defaults: 24.2.2.1, 24.2.2.2, 24.2.2.4 and develop. The first three are the closest attempt to copy over the defaults from the old workflow, and they should not deviate significantly from the old workflow's default values. The develop defaults are for a 400m simulation designed for testing. They are not intended for any real experiments.

The defaults.py module contains code to load the defaults for a given version (the valid versions are listed in DefaultsVersion). The defaults are loaded by the updated realisations module (not part of this PR).
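
As a rough illustration of version-keyed loading, here is a minimal sketch. It is not the code in defaults.py: the `load_defaults` function, the one-subdirectory-per-version layout, and the use of JSON files (chosen only so the sketch runs with the standard library) are all assumptions. Only the `DefaultsVersion` name and the four version strings come from this PR.

```python
import json
from enum import Enum
from pathlib import Path


class DefaultsVersion(str, Enum):
    """Valid defaults versions. The member names are hypothetical."""

    V24_2_2_1 = "24.2.2.1"
    V24_2_2_2 = "24.2.2.2"
    V24_2_2_4 = "24.2.2.4"
    DEVELOP = "develop"


def load_defaults(version: DefaultsVersion, root: Path) -> dict:
    """Load every defaults file for one version into a single dictionary.

    Assumes (hypothetically) that `root` holds one subdirectory per version,
    each containing one JSON file per group of parameters.
    """
    version_dir = root / version.value
    if not version_dir.is_dir():
        raise FileNotFoundError(f"No defaults found for version {version.value}")
    # Key each group of parameters by its file name without the extension.
    return {
        path.stem: json.loads(path.read_text())
        for path in sorted(version_dir.glob("*.json"))
    }
```

Under these assumptions, swapping one set of defaults for another is a matter of passing a different `DefaultsVersion` member, and an unknown version string fails early when the enum is constructed.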

@lispandfound
Contributor Author

I will introduce test cases for the defaults version module when I add the changes to the realisation configuration.


@joelridden joelridden left a comment


A few comments. I wonder whether we could have one default config, with 24_2_2_1, for example, replacing only the values that matter for 100m? That could reduce the number of duplicated config lines, though I'm not sure whether it would limit us at all.

@lispandfound
Contributor Author

Part of the problem with the hierarchical approach that Joel suggests for the YAML defaults is that I am not sure which variables depend on the resolution and which don't, so it's hard to know what to override. Secondly, I am a fan of keeping the defaults simple, so that you can pass just one file around and publish just one file in a paper to show what the simulation settings were.
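
For reference, the hierarchical approach under discussion would amount to something like the sketch below: a base set of defaults plus a smaller per-version override, merged key by key. None of this is in the PR. The `merge_defaults` name and the idea of nested dictionaries are assumptions made purely for illustration.

```python
from copy import deepcopy


def merge_defaults(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` applied recursively.

    Nested dictionaries are merged key by key. Any other value in
    `overrides` replaces the corresponding value from `base`.
    """
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings, so descend rather than replace wholesale.
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged
```

The merge itself is straightforward. The harder part is deciding which keys belong in each override, and what a reader sees on disk is then two files rather than one self-contained set of defaults.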

@lispandfound lispandfound merged commit c01c54c into main Oct 21, 2024
2 checks passed