Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronized config-user.yml with version from ESMValTool #1453

Merged
merged 2 commits into from
Feb 1, 2022

Conversation

schlunma
Copy link
Contributor

@schlunma schlunma commented Feb 1, 2022

Description

Apply changes from ESMValGroup/ESMValTool#2507


Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.


To help with the number pull requests:

@schlunma schlunma added the documentation Improvements or additions to documentation label Feb 1, 2022
@schlunma schlunma added this to the v2.5.0 milestone Feb 1, 2022
@schlunma schlunma self-assigned this Feb 1, 2022
Copy link
Contributor

@valeriupredoi valeriupredoi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cheers, Manu!

@codecov
Copy link

codecov bot commented Feb 1, 2022

Codecov Report

Merging #1453 (707a5df) into main (2496320) will increase coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1453   +/-   ##
=======================================
  Coverage   89.66%   89.67%           
=======================================
  Files         196      196           
  Lines       10339    10339           
=======================================
+ Hits         9270     9271    +1     
+ Misses       1069     1068    -1     
Impacted Files Coverage Δ
esmvalcore/_config/_config.py 96.24% <0.00%> (+0.75%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2496320...707a5df. Read the comment docs.

@schlunma schlunma merged commit 7d43072 into main Feb 1, 2022
@schlunma schlunma deleted the unify_config_user branch February 1, 2022 17:06
Comment on lines -48 to +84
default: ~/climate_data
CMIP3: [~/cmip3_inputpath1, ~/cmip3_inputpath2]
CMIP5: [~/cmip5_inputpath1, ~/cmip5_inputpath2]
CMIP6: [~/cmip6_inputpath1, ~/cmip6_inputpath2]
OBS: ~/obs_inputpath
OBS6: ~/obs6_inputpath
obs4MIPs: ~/obs4mips_inputpath
ana4mips: ~/ana4mips_inputpath
native6: ~/native6_inputpath
RAWOBS: ~/rawobs_inputpath
default: ~/default_inputpath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason for changing this back to the old default from before #1217? I liked the new default value here because it would work for the data that gets automatically downloaded with the --offline=False method.

I think the old example was confusing because it contains a mix of lists of paths and paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest I didn't really think about this and used the options that were use in ESMValTool. Showcasing the possibility to use multiple paths and also list all available projects seemed like a good idea. If a user wants to use this file with automatic downloads they need to change the default value of offline anyway, and in the file there is a comment:

Uncomment the lines below to locate data that has been automatically
# downloaded from ESGF (using "offline: false")

I'm also fine with the old one, but don't really see the use. For me this is an example file which needs to be altered anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind the changes I made in #1217 was to finally make it possible to run the tool for the first time without editing the example config-user.yml, as documented here. That failed slightly because people objected to automatically downloading the required data, but if you specify --offline=False on the command line it still works.

Showcasing per project paths and lists of paths is also good, but may we could add a few more commented-out examples instead to showcase that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will add that to #1457

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bouweandela you're such a romantic 😆 We should have such a first-time-user file be used by a dedicated test (GA most probably) that runs exactly what a first time user would run every week or so

Comment on lines -54 to +94
CMIP3: ESGF
CMIP5: ESGF
CMIP6: ESGF
CORDEX: ESGF
obs4MIPs: ESGF
CMIP3: default
CMIP5: default
CMIP6: default
CORDEX: default
obs4MIPs: default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the reason for changing this back to the old default from before #1217? I liked the new default value, because it encourages storing the data in such a structured way, that enables you to find out what version of the files you have. If you download CMIP data and store it with the default (i.e. no subdirectories) format, it becomes very hard to tell what version of the files you have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but that assumes an ESGF-like structure (ie you have downloaded data) - we should NOT encourage people to download data

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above

###############################################################################
---

# Directory where results are stored
output_dir: ./esmvaltool_output
output_dir: ~/esmvaltool_output
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that this is a sensible default (it's what I always use in my own configuration file), it is not necessarily the most expected. Most tools write to the current working directory, if you do not tell them where to write.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, we don't, or, we write wherever the config tells it to, if that default may lead to spawning a large number of output dirs, the user will probably blame us first upon finding out they have 300 esmvaltool_output dirs because they don't know how to do pwd and figure out where they are. Also, they may get lost and say they never ran that particular run 😁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with V here, writing to a path that doesn't depend on the pwd seems like the better option here.

@bouweandela
Copy link
Member

Note that the defaults are also set here:

# set defaults
defaults = {
'auxiliary_data_dir': 'auxiliary_data',
'compress_netcdf': False,
'config_developer_file': None,
'drs': {},
'download_dir': '~/climate_data',
'exit_on_warning': False,
'extra_facets_dir': tuple(),
'max_parallel_tasks': None,
'offline': True,
'output_file_type': 'png',
'output_dir': 'esmvaltool_output',
'profile_diagnostic': False,
'remove_preproc_dir': True,
'resume_from': [],
'run_diagnostic': True,
'save_intermediary_cubes': False,
}

This pull request changes the defaults only in config-user.yml, leading to the situation where not specifying a value will give you a different default than using the default setting from the example, except if you're using the experimental API which uses the default values from the example.

@valeriupredoi
Copy link
Contributor

I think @schlunma just wanted to make the example config file more user-friendly, and not operate on any actual defaults, Bouwe

@schlunma
Copy link
Contributor Author

schlunma commented Feb 2, 2022

Note that the defaults are also set here:

# set defaults
defaults = {
'auxiliary_data_dir': 'auxiliary_data',
'compress_netcdf': False,
'config_developer_file': None,
'drs': {},
'download_dir': '~/climate_data',
'exit_on_warning': False,
'extra_facets_dir': tuple(),
'max_parallel_tasks': None,
'offline': True,
'output_file_type': 'png',
'output_dir': 'esmvaltool_output',
'profile_diagnostic': False,
'remove_preproc_dir': True,
'resume_from': [],
'run_diagnostic': True,
'save_intermediary_cubes': False,
}

This pull request changes the defaults only in config-user.yml, leading to the situation where not specifying a value will give you a different default than using the default setting from the example, except if you're using the experimental API which uses the default values from the example.

Actually that is not 100% true. The only options that I changed (apart from the rootpath which does not have a default in ESMValCore/esmvalcore/_config/_config.py) are drs, remove_preproc_dir and output_dir. drs already differed in the old version, remove_preproc_dir now actually matches the entry (it didn't before) and output_dir doesn't match its entry now. However, there are other entries (e.g. auxiliary_data) which do not match their entries in the new and old version.

I agree with you that we should be consistent here, but this PR didn't worsen this. I will open address this in another PR.

@schlunma schlunma mentioned this pull request Feb 2, 2022
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants