Fix the syntax for adding multiple ensemble members from the same dataset #678

Merged 4 commits on Jun 22, 2020
Changes from 3 commits
16 changes: 9 additions & 7 deletions doc/recipe/overview.rst
@@ -38,7 +38,7 @@ the following:
     documentation:
       description: |
         Recipe to produce time series figures of the derived variable, the
-        Atlantic meriodinal overturning circulation (AMOC).
+        Atlantic meridional overturning circulation (AMOC).
         This recipe also produces transect figures of the stream functions for
         the years 2001-2004.

@@ -102,8 +102,8 @@ Here it is an example concatenating the `historical` experiment with `rcp85`
     datasets:
       - {dataset: CanESM2, project: CMIP5, exp: [historical, rcp85], ensemble: r1i1p1, start_year: 2001, end_year: 2004}

-It is also possible to define the ensemble as a list, although it is useful only
-case the two experiments have different ensemble names
+It is also possible to define the ensemble as a list when the two experiments have different ensemble names.
+In this case, the specified datasets are concatenated into a single cube:

 .. code-block:: yaml

@@ -113,22 +113,24 @@ case the two experiments have different ensemble names
 ESMValTool also supports a simplified syntax to add multiple ensemble members from the same dataset.
 In the ensemble key, any element in the form `(x:y)` will be replaced with all numbers from x to y (both inclusive),
 adding a dataset entry for each replacement. For example, to add ensemble members r1i1p1 to r10i1p1
-you can use the following abreviatted syntax:
+you can use the following abbreviated syntax:

 .. code-block:: yaml

     datasets:
-      - {dataset: CanESM2, project: CMIP5, exp: historical, ensemble: r(1:10)i1p1, start_year: 2001, end_year: 2004}
+      - {dataset: CanESM2, project: CMIP5, exp: historical, ensemble: "r(1:10)i1p1", start_year: 2001, end_year: 2004}

 It can be included multiple times in one definition. For example, to generate the datasets definitions
 for the ensemble members r1i1p1 to r5i1p1 and from r1i2p1 to r5i1p1 you can use:

 .. code-block:: yaml

     datasets:
-      - {dataset: CanESM2, project: CMIP5, exp: historical, ensemble: r(1:5)i(1:2)p1, start_year: 2001, end_year: 2004}
+      - {dataset: CanESM2, project: CMIP5, exp: historical, ensemble: "r(1:5)i(1:2)p1", start_year: 2001, end_year: 2004}

 Please, bear in mind that this syntax can only be used in the ensemble tag.
+Also, note that the combination of multiple experiments and ensembles, like
+exp: [historical, rcp85], ensemble: [r1i1p1, "r(2:3)i1p1"] has not been yet supported and will raise an error.

 Note that this section is not required, as datasets can also be provided in the
 Diagnostics_ section.
@@ -140,7 +142,7 @@ Diagnostics_ section.
 Recipe section: ``preprocessors``
 =================================

-The preprocessor section of the recipe includes one or more preprocesors, each
+The preprocessor section of the recipe includes one or more preprocessors, each
 of which may call the execution of one or several preprocessor functions.

 Each preprocessor section includes:
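
To illustrate the `(x:y)` syntax documented in the hunks above, the following sketch enumerates the ensemble names that a pattern stands for. It is not the ESMValCore implementation: the function name `expand_ensemble_pattern` is hypothetical, and it uses `itertools.product` instead of the recursive approach taken in `_recipe.py`.

```python
import itertools
import re

# Matches one "(x:y)" group and captures both bounds.
RANGE = re.compile(r'\((\d+):(\d+)\)')


def expand_ensemble_pattern(pattern):
    """Return every ensemble name described by ``pattern``.

    Hypothetical helper for illustration only; assumes the pattern
    contains no literal curly braces.
    """
    # One inclusive range per "(x:y)" group, in order of appearance.
    ranges = [range(int(lo), int(hi) + 1) for lo, hi in RANGE.findall(pattern)]
    # Turn "r(1:5)i(1:2)p1" into the format template "r{}i{}p1".
    template = RANGE.sub('{}', pattern)
    # The Cartesian product varies the last group fastest.
    return [template.format(*combo) for combo in itertools.product(*ranges)]


print(expand_ensemble_pattern('r(1:5)i(1:2)p1'))
```

For `r(1:5)i(1:2)p1` this yields ten names, starting at `r1i1p1` and ending at `r5i2p1`. A pattern without any `(x:y)` group is returned unchanged as a single-element list.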
33 changes: 22 additions & 11 deletions esmvalcore/_recipe.py
@@ -1015,20 +1015,31 @@ def _expand_ensemble(variables):
         """
         expanded = []
         regex = re.compile(r'\(\d+:\d+\)')
+
+        def expand_ensemble(variable):
+            ens = variable.get('ensemble', "")
+            match = regex.search(ens)
+            if match:
+                start, end = match.group(0)[1:-1].split(':')
+                for i in range(int(start), int(end) + 1):
+                    expand = deepcopy(variable)
+                    expand['ensemble'] = regex.sub(str(i), ens, 1)
+                    expand_ensemble(expand)
+            else:
+                expanded.append(variable)
+
         for variable in variables:
             ensemble = variable.get('ensemble', "")
-            if not isinstance(ensemble, str):
+            if isinstance(ensemble, (list, tuple)):
+                for elem in ensemble:
+                    if regex.search(elem):
+                        raise RecipeError(
+                            f"In variable {variable}: ensemble expansion "
+                            "cannot be combined with ensemble lists")
                 expanded.append(variable)
-                continue
-            match = regex.search(ensemble)
-            if not match:
-                expanded.append(variable)
-                continue
-            start, end = match.group(0)[1:-1].split(':')
-            for i in range(int(start), int(end) + 1):
-                expand = deepcopy(variable)
-                expand['ensemble'] = regex.sub(str(i), ensemble, 1)
-                expanded.append(expand)
+            else:
+                expand_ensemble(variable)
+
         return expanded

     def _initialize_variables(self, raw_variable, raw_datasets):
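
Judging from the removed lines, the old loop substituted only the first `(x:y)` group and appended the result, so a second group stayed unexpanded; the added nested function recurses until no group is left. The sketch below shows that difference on plain strings. The helper names `expand_first_only` and `expand_recursive` are hypothetical and do not exist in ESMValCore.

```python
import re

RANGE = re.compile(r'\(\d+:\d+\)')


def expand_first_only(ensemble):
    """Mimic the removed loop: substitute the first range, then stop."""
    match = RANGE.search(ensemble)
    if not match:
        return [ensemble]
    start, end = match.group(0)[1:-1].split(':')
    return [RANGE.sub(str(i), ensemble, count=1)
            for i in range(int(start), int(end) + 1)]


def expand_recursive(ensemble):
    """Mimic the added function: recurse until no range remains."""
    match = RANGE.search(ensemble)
    if not match:
        return [ensemble]
    start, end = match.group(0)[1:-1].split(':')
    result = []
    for i in range(int(start), int(end) + 1):
        # Replace only the first group, then expand whatever is left.
        result.extend(expand_recursive(RANGE.sub(str(i), ensemble, count=1)))
    return result


print(expand_first_only('r(1:2)i(1:2)p1'))
print(expand_recursive('r(1:2)i(1:2)p1'))
```

The first call leaves `i(1:2)` in each result, while the second returns four fully expanded names in depth-first order, which is the ordering the unit test in this PR asserts.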
42 changes: 42 additions & 0 deletions tests/unit/test_recipe.py
@@ -0,0 +1,42 @@
+import pytest
+
+from esmvalcore._recipe import Recipe
+from esmvalcore._recipe_checks import RecipeError
+
+
+class TestRecipe:
+    def test_expand_ensemble(self):
+
+        datasets = [
+            {
+                'dataset': 'XYZ',
+                'ensemble': 'r(1:2)i(2:3)p(3:4)',
+            },
+        ]
+
+        expanded = Recipe._expand_ensemble(datasets)
+
+        ensembles = [
+            'r1i2p3',
+            'r1i2p4',
+            'r1i3p3',
+            'r1i3p4',
+            'r2i2p3',
+            'r2i2p4',
+            'r2i3p3',
+            'r2i3p4',
+        ]
+        for i, ensemble in enumerate(ensembles):
+            assert expanded[i] == {'dataset': 'XYZ', 'ensemble': ensemble}
+
+    def test_expand_ensemble_nolist(self):
+
+        datasets = [
+            {
+                'dataset': 'XYZ',
+                'ensemble': ['r1i1p1', 'r(1:2)i1p1']
+            },
+        ]
+
+        with pytest.raises(RecipeError):
+            Recipe._expand_ensemble(datasets)