Commit 40e00e8

schlunma, bouweandela, valeriupredoi, and Rémi Kazeroni authored
Combined offline and always_search_esgf into a single option search_esgf (#1935)
Co-authored-by: Bouwe Andela <b.andela@esciencecenter.nl> Co-authored-by: Valeriu Predoi <valeriu.predoi@gmail.com> Co-authored-by: Rémi Kazeroni <remi.kazeroni@dlr.de>
1 parent 6d285d2 commit 40e00e8

26 files changed, +717 -296 lines changed

.github/workflows/run-tests.yml

+1-1
Original file line number | Diff line number | Diff line change
@@ -15,7 +15,7 @@ name: Test
1515

1616
# runs on a push on main and at the end of every day
1717
on:
18-
# triggering on push without branch name will run tests everytime
18+
# triggering on push without branch name will run tests every time
1919
# there is a push on any branch
2020
# turn it on only if needed
2121
push:

doc/changelog.rst

+3-3
@@ -232,7 +232,7 @@ Highlights
232232

233233
- The new preprocessor :func:`~esmvalcore.preprocessor.extract_location` can extract arbitrary locations on the Earth using the `geopy <https://pypi.org/project/geopy/>`__ package that connects to OpenStreetMap. For details, see :ref:`Extract location <extract_location>`.
234234
- Time ranges can now be extracted using the `ISO 8601 format <https://en.wikipedia.org/wiki/ISO_8601>`_. In addition, wildcards are allowed, which makes the time selection much more flexible. For details, see :ref:`Recipe section: Datasets <Datasets>`.
235-
- The new preprocessor :func:`~esmvalcore.preprocessor.ensemble_statistics` can calculate arbitrary statitics over all ensemble members of a simulation. In addition, the preprocessor :func:`~esmvalcore.preprocessor.multi_model_statistics` now accepts the keyword ``groupy``, which allows the calculation of multi-model statistics over arbitrary multi-model ensembles. For details, see :ref:`Ensemble statistics <ensemble statistics>` and :ref:`Multi-model statistics <multi-model statistics>`.
235+
- The new preprocessor :func:`~esmvalcore.preprocessor.ensemble_statistics` can calculate arbitrary statistics over all ensemble members of a simulation. In addition, the preprocessor :func:`~esmvalcore.preprocessor.multi_model_statistics` now accepts the keyword ``groupy``, which allows the calculation of multi-model statistics over arbitrary multi-model ensembles. For details, see :ref:`Ensemble statistics <ensemble statistics>` and :ref:`Multi-model statistics <multi-model statistics>`.
236236

237237
This release includes
238238

@@ -327,7 +327,7 @@ Automatic testing
327327
- Switch to Mambaforge in Github Actions tests (`#1438 <https://github.com/ESMValGroup/ESMValCore/pull/1438>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
328328
- Turn off conda lock file creation on any push on `main` branch from Github Action test (`#1489 <https://github.com/ESMValGroup/ESMValCore/pull/1489>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
329329
- Add DRS path test for IPSLCM files (`#1490 <https://github.com/ESMValGroup/ESMValCore/pull/1490>`__) `Stéphane Sénési <https://github.com/senesis>`__
330-
- Add a test module that runs tests of `iris` I/O everytime we notice serious bugs there (`#1510 <https://github.com/ESMValGroup/ESMValCore/pull/1510>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
330+
- Add a test module that runs tests of `iris` I/O every time we notice serious bugs there (`#1510 <https://github.com/ESMValGroup/ESMValCore/pull/1510>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
331331
- [Github Actions] Trigger Github Actions tests (`run-tests.yml` workflow) from a comment in a PR (`#1520 <https://github.com/ESMValGroup/ESMValCore/pull/1520>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
332332
- Update Linux condalock file (various pull requests) github-actions[bot]
333333

@@ -617,7 +617,7 @@ Automatic testing
617617
- Report coverage for tests that run on any pull request (`#994 <https://github.com/ESMValGroup/ESMValCore/pull/994>`__) `Bouwe Andela <https://github.com/bouweandela>`__
618618
- Install ESMValTool sample data from PyPI (`#998 <https://github.com/ESMValGroup/ESMValCore/pull/998>`__) `Javier Vegas-Regidor <https://github.com/jvegasbsc>`__
619619
- Fix tests for multi-processing with spawn method (i.e. macOSX with Python>3.8) (`#1003 <https://github.com/ESMValGroup/ESMValCore/pull/1003>`__) `Barbara Vreede <https://github.com/bvreede>`__
620-
- Switch to running the Github Action test workflow every 3 hours in single thread mode to observe if Sementation Faults occur (`#1022 <https://github.com/ESMValGroup/ESMValCore/pull/1022>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
620+
- Switch to running the Github Action test workflow every 3 hours in single thread mode to observe if Segmentation Faults occur (`#1022 <https://github.com/ESMValGroup/ESMValCore/pull/1022>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
621621
- Revert to original Github Actions test workflow removing the 3-hourly test run with -n 1 (`#1025 <https://github.com/ESMValGroup/ESMValCore/pull/1025>`__) `Valeriu Predoi <https://github.com/valeriupredoi>`__
622622
- Avoid stale cache for multimodel statistics regression tests (`#1030 <https://github.com/ESMValGroup/ESMValCore/pull/1030>`__) `Bouwe Andela <https://github.com/bouweandela>`__
623623
- Add newer Python versions in OSX to Github Actions (`#1035 <https://github.com/ESMValGroup/ESMValCore/pull/1035>`__) `Barbara Vreede <https://github.com/bvreede>`__

doc/quickstart/configure.rst

+41-30
@@ -53,38 +53,40 @@ omitted in the file.
5353
# Includes log files and performance stats.
5454
output_dir: ~/esmvaltool_output
5555
56-
# Directory for storing downloaded climate data
57-
download_dir: ~/climate_data
58-
59-
# Disable automatic downloads --- [true]/false
60-
# Disable the automatic download of missing CMIP3, CMIP5, CMIP6, CORDEX,
61-
# and obs4MIPs data from ESGF by default. This is useful if you are working
62-
# on a computer without an internet connection.
63-
offline: true
64-
65-
# Search ESGF for files even when files are available locally --- true/[false]
66-
# This option is useful to make sure you have the latest version of all files.
67-
# Remember to set ``offline: false`` if this option is set to ``true``.
68-
always_search_esgf: false
69-
7056
# Auxiliary data directory
7157
# Used by some recipes to look for additional datasets.
7258
auxiliary_data_dir: ~/auxiliary_data
7359
60+
# Automatic data download from ESGF --- [never]/when_missing/always
61+
# Use automatic download of missing CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs
62+
# data from ESGF. ``never`` disables this feature, which is useful if you are
63+
# working on a computer without an internet connection, or if you have limited
64+
# disk space. ``when_missing`` enables the automatic download for files that
65+
# are not available locally. ``always`` will always check ESGF for the latest
66+
# version of a file, and will only use local files if they correspond to that
67+
# latest version.
68+
search_esgf: never
69+
70+
# Directory for storing downloaded climate data
71+
# Make sure to use a directory where you can store multiple GBs of data. Your
72+
# home directory on a HPC is usually not suited for this purpose, so please
73+
# change the default value in this case!
74+
download_dir: ~/climate_data
75+
7476
# Rootpaths to the data from different projects
75-
# This default setting will work if files have been downloaded by the
76-
# ESMValTool via ``offline=False``. Lists are also possible. For
77-
# site-specific entries, see the default ``config-user.yml`` file that can be
78-
# installed with the command ``esmvaltool config get_config_user``. For each
79-
# project, this can be either a single path or a list of paths. Comment out
80-
# these when using a site-specific path.
77+
# This default setting will work if files have been downloaded by ESMValTool
78+
# via ``search_esgf``. Lists are also possible. For site-specific entries,
79+
# see the default ``config-user.yml`` file that can be installed with the
80+
# command ``esmvaltool config get_config_user``. For each project, this can
81+
# be either a single path or a list of paths. Comment out these when using a
82+
# site-specific path.
8183
rootpath:
8284
default: ~/climate_data
8385
8486
# Directory structure for input data --- [default]/ESGF/BADC/DKRZ/ETHZ/etc.
85-
# This default setting will work if files have been downloaded by the
86-
# ESMValTool via ``offline=False``. See ``config-developer.yml`` for
87-
# definitions. Comment out/replace as per needed.
87+
# This default setting will work if files have been downloaded by ESMValTool
88+
# via ``search_esgf``. See ``config-developer.yml`` for definitions. Comment
89+
# out/replace as per needed.
8890
drs:
8991
CMIP3: ESGF
9092
CMIP5: ESGF
@@ -136,11 +138,19 @@ omitted in the file.
136138
# ``config-developer.yml`` for an example. Set to ``null`` to use the default.
137139
config_developer_file: null
138140
139-
The ``offline`` setting can be used to disable or enable automatic downloads from ESGF.
140-
If ``offline`` is set to ``false``, the tool will automatically download
141-
any CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs data that is required to run a recipe
142-
but not available locally and store it in ``download_dir`` using the ``ESGF``
141+
The ``search_esgf`` setting can be used to disable or enable automatic
142+
downloads from ESGF.
143+
If ``search_esgf`` is set to ``never``, the tool does not download any data
144+
from the ESGF.
145+
If ``search_esgf`` is set to ``when_missing``, the tool will download any CMIP3,
146+
CMIP5, CMIP6, CORDEX, and obs4MIPs data that is required to run a recipe but
147+
not available locally and store it in ``download_dir`` using the ``ESGF``
143148
directory structure defined in the :ref:`config-developer`.
149+
If ``search_esgf`` is set to ``always``, the tool will first check the ESGF for
150+
the needed data, regardless of any local data availability; if the data found
151+
on ESGF is newer than the local data (if any) or the user specifies a version
152+
of the data that is available only from the ESGF, then that data will be
153+
downloaded; otherwise, local data will be used.
144154

145155
The ``auxiliary_data_dir`` setting is the path to place any required
146156
additional auxiliary data files. This is necessary because certain
@@ -199,9 +209,10 @@ The ``esmvaltool run`` command can automatically download the files required
199209
to run a recipe from ESGF for the projects CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs.
200210
The downloaded files will be stored in the ``download_dir`` specified in the
201211
:ref:`user configuration file`.
202-
To enable automatic downloads from ESGF, set ``offline: false`` in
203-
the :ref:`user configuration file` or provide the command line argument
204-
``--offline=False`` when running the recipe.
212+
To enable automatic downloads from ESGF, set ``search_esgf: when_missing`` or
213+
``search_esgf: always`` in the :ref:`user configuration file`, or provide the
214+
corresponding command line arguments ``--search_esgf=when_missing`` or
215+
``--search_esgf=always`` when running the recipe.
205216

206217
.. note::
207218

doc/quickstart/find_data.rst

+6-3
@@ -483,9 +483,12 @@ retrieval parameters is explained below.
483483

484484
Enabling automatic downloads from the ESGF
485485
------------------------------------------
486-
To enable automatic downloads from ESGF, set ``offline: false`` in
487-
the :ref:`user configuration file` or provide the command line argument
488-
``--offline=False`` when running the recipe.
486+
To enable automatic downloads from ESGF, set ``search_esgf: when_missing`` (use
487+
local files whenever possible) or ``search_esgf: always`` (always search ESGF
488+
for latest version of files and only use local data if it is the latest
489+
version) in the :ref:`user configuration file`, or provide the corresponding
490+
command line arguments ``--search_esgf=when_missing`` or
491+
``--search_esgf=always`` when running the recipe.
489492
The files will be stored in the ``download_dir`` set in
490493
the :ref:`user configuration file`.
491494

doc/quickstart/run.rst

+11-3
@@ -60,12 +60,20 @@ It is also possible to explicitly change values from the config file using flags
6060
esmvaltool run --argument_name argument_value recipe_example.yml
6161
6262
To automatically download the files required to run a recipe from ESGF, set
63-
``offline`` to ``false`` in the :ref:`user configuration file`
64-
or run the tool with the command
63+
``search_esgf`` to ``when_missing`` (use local files whenever possible) or
64+
``always`` (always search ESGF for latest version of files and only use local
65+
data if it is the latest version) in the :ref:`user configuration file` or run
66+
the tool with the corresponding commands
6567

6668
.. code:: bash
6769
68-
esmvaltool run --offline=False recipe_example.yml
70+
esmvaltool run --search_esgf=when_missing recipe_example.yml
71+
72+
or
73+
74+
.. code:: bash
75+
76+
esmvaltool run --search_esgf=always recipe_example.yml
6977
7078
This feature is available for projects that are hosted on the ESGF, i.e.
7179
CMIP3, CMIP5, CMIP6, CORDEX, and obs4MIPs.

doc/recipe/overview.rst

+4-4
@@ -129,10 +129,10 @@ Reading facet values from file names is not yet supported.
129129
See :ref:`CMOR-DRS` for more information on this kind of file organization.
130130

131131
When (some) files are available locally, the tool will not automatically look
132-
for more files on ESGF. To populate a recipe with all available datasets from
133-
ESGF, ``offline`` should be set to ``false`` and ``always_search_esgf`` should
134-
be set to ``true`` in the
135-
:ref:`user configuration file<user configuration file>`.
132+
for more files on ESGF.
133+
To populate a recipe with all available datasets from ESGF, ``search_esgf``
134+
should be set to ``always`` in the :ref:`user configuration file<user
135+
configuration file>`.
136136

137137
For more control over which datasets are selected, it is recommended to use
138138
a Python script or `Jupyter notebook <https://jupyter.org/>`_ to compose

environment.yml

+1
@@ -26,6 +26,7 @@ dependencies:
2626
- nested-lookup
2727
- netcdf4
2828
- numpy
29+
- packaging
2930
- pandas
3031
- pillow
3132
- pip!=21.3

esmvalcore/_main.py

+19-1
@@ -329,6 +329,7 @@ def run(self,
329329
max_years=None,
330330
skip_nonexistent=None,
331331
offline=None,
332+
search_esgf=None,
332333
diagnostics=None,
333334
check_level=None,
334335
**kwargs):
@@ -357,6 +358,19 @@ def run(self,
357358
If True, the run will not fail if some datasets are not available.
358359
offline: bool, optional
359360
If True, the tool will not download missing data from ESGF.
361+
362+
.. deprecated:: 2.8.0
363+
This option has been deprecated in ESMValCore version 2.8.0 and
364+
is scheduled for removal in version 2.10.0. Please use the
365+
options `search_esgf=never` (for `offline=True`) or
366+
`search_esgf=when_missing` (for `offline=False`). These are
367+
exact replacements.
368+
search_esgf: str, optional
369+
If `never`, disable automatic download of data from the ESGF. If
370+
`when_missing`, enable the automatic download of files that are not
371+
available locally. If `always`, always check ESGF for the latest
372+
version of a file, and only use local files if they correspond to
373+
that latest version.
360374
diagnostics: list(str), optional
361375
Only run the selected diagnostics from the recipe. To provide more
362376
than one diagnostic to filter use the syntax 'diag1 diag2/script1'
@@ -384,12 +398,16 @@ def run(self,
384398
session['max_years'] = max_years
385399
if offline is not None:
386400
session['offline'] = offline
401+
if search_esgf is not None:
402+
session['search_esgf'] = search_esgf
387403
if skip_nonexistent is not None:
388404
session['skip_nonexistent'] = skip_nonexistent
389405
session['resume_from'] = parse_resume(resume_from, recipe)
390406
session.update(kwargs)
391407

392408
self._run(recipe, session)
409+
# Print warnings about deprecated configuration options again:
410+
CFG.reload()
393411

394412
@staticmethod
395413
def _create_session_dir(session):
@@ -421,7 +439,7 @@ def _run(self, recipe: Path, session) -> None:
421439
console_log_level=session['log_level'])
422440
self._log_header(session['config_file'], log_files)
423441

424-
if not session['offline']:
442+
if session['search_esgf'] != 'never':
425443
from .esgf._logon import logon
426444
logon()
427445

esmvalcore/_recipe/recipe.py

+11-8
@@ -957,21 +957,22 @@ def _set_use_legacy_supplementaries(self):
957957
logger.info("Running with --use-legacy-supplementaries=True")
958958
self.session['use_legacy_supplementaries'] = True
959959

960-
# Also set the global config because it is used to check if
961-
# mismatching shapes should be ignored when attaching
960+
# Also adapt the global config if necessary because it is used to check
961+
# if mismatching shapes should be ignored when attaching
962962
# supplementary variables in `esmvalcore.preprocessor.
963963
# _supplementary_vars.add_supplementary_variables` to avoid having to
964964
# introduce a new function argument that is immediately deprecated.
965-
option = 'use_legacy_supplementaries'
966-
CFG[option] = self.session[option]
965+
session_use_legacy_supp = self.session['use_legacy_supplementaries']
966+
if session_use_legacy_supp is not None:
967+
CFG['use_legacy_supplementaries'] = session_use_legacy_supp
967968

968969
def _log_recipe_errors(self, exc):
969970
"""Log a message with recipe errors."""
970971
logger.error(exc.message)
971972
for task in exc.failed_tasks:
972973
logger.error(task.message)
973974

974-
if self.session['offline'] and any(
975+
if self.session['search_esgf'] == 'never' and any(
975976
isinstance(err, InputFilesNotFound)
976977
for err in exc.failed_tasks):
977978
logger.error(
@@ -983,8 +984,10 @@ def _log_recipe_errors(self, exc):
983984
"configuration file %s", self.session['config_file'])
984985
logger.error(
985986
"To automatically download the required files to "
986-
"`download_dir: %s`, set `offline: false` in %s or run the "
987-
"recipe with the extra command line argument --offline=False",
987+
"`download_dir: %s`, set `search_esgf: when_missing` or "
988+
"`search_esgf: always` in %s, or run the recipe with the "
989+
"extra command line argument --search_esgf=when_missing or "
990+
"--search_esgf=always",
988991
self.session['download_dir'],
989992
self.session['config_file'],
990993
)
@@ -1310,7 +1313,7 @@ def run(self):
13101313
filled_recipe = self.write_filled_recipe()
13111314

13121315
# Download required data
1313-
if not self.session['offline']:
1316+
if self.session['search_esgf'] != 'never':
13141317
esgf.download(self._download_files, self.session['download_dir'])
13151318

13161319
self.tasks.run(max_parallel_tasks=self.session['max_parallel_tasks'])
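
Note on the three modes: the documentation changes in this commit describe `never` as disabling ESGF access, `when_missing` as downloading only files that are not available locally, and `always` as checking ESGF regardless of local availability. The sketch below restates that decision as a standalone function. It is an illustration of the documented behaviour only, not the code ESMValCore uses; the function name `should_search_esgf` and the `local_files_found` parameter are hypothetical.

```python
def should_search_esgf(search_esgf: str, local_files_found: bool) -> bool:
    """Hypothetical helper: decide whether ESGF should be queried.

    Mirrors the documented meaning of the ``search_esgf`` option;
    it does not reproduce ESMValCore internals.
    """
    if search_esgf == "never":
        # Automatic downloads are disabled entirely.
        return False
    if search_esgf == "when_missing":
        # Only look on ESGF if nothing was found locally.
        return not local_files_found
    if search_esgf == "always":
        # Always check ESGF for the latest version.
        return True
    raise ValueError(f"Unknown search_esgf value: {search_esgf!r}")
```

Under `always`, the documentation adds that local files are still used when they correspond to the latest version found on ESGF; that version comparison is outside the scope of this sketch.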
