Supporting GFS interval greater than 24 hours #260

Closed
malloryprow opened this issue Feb 9, 2021 · 6 comments · Fixed by #2928
Labels
feature New feature or request

Comments

@malloryprow
Contributor

This issue came up when a user manually changed the GFS interval in their XML to a value greater than 24 hours. However, the user left config.metp unchanged, so VRFYBACK_HRS kept its default of 24 hours, meaning the verification would run for the cycle date minus 24 hours. This caused problems in the gfsmetp tasks because the GFS cycles had not been run for those dates. When the user updated VRFYBACK_HRS to 120, the verification ran as expected.
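The mismatch can be illustrated with a minimal sketch. The function names, the dates, and the assumption that GFS cycles fall at a fixed interval from a first GFS cycle are illustrative only; none of this is taken from config.metp or EMC_verif-global.

```python
from datetime import datetime, timedelta


def verification_date(cycle: datetime, vrfyback_hrs: int) -> datetime:
    """Date the verification looks back to: cycle date minus VRFYBACK_HRS."""
    return cycle - timedelta(hours=vrfyback_hrs)


def is_gfs_cycle(date: datetime, first_gfs: datetime, interval_hrs: int) -> bool:
    """True if `date` falls on a GFS cycle, given the first GFS cycle and interval."""
    if date < first_gfs:
        return False
    elapsed = date - first_gfs
    return elapsed % timedelta(hours=interval_hrs) == timedelta(0)


first_gfs = datetime(2021, 1, 1, 0)   # hypothetical first GFS cycle
cycle = datetime(2021, 1, 11, 0)      # a GFS cycle when the interval is 120 h

# The default VRFYBACK_HRS=24 points at a date with no GFS run
print(is_gfs_cycle(verification_date(cycle, 24), first_gfs, 120))   # False
# VRFYBACK_HRS=120 points at the previous GFS cycle
print(is_gfs_cycle(verification_date(cycle, 120), first_gfs, 120))  # True
```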

The goal of this issue is to rework how gfs_cyc gets set and used, and to create a variable that the scripts connecting EMC_verif-global and the global workflow can key off of to get the right verification dates, so users don't have to make the changes in config.metp.

From Kate: "It could become an hour value rather than a toggle value. It would default to "24" (similar to the current gfs_cyc=1). Then users could provide a different value at setup time (or between setup steps by changing config.base) and could set it to a value divided by the assimilation frequency (6hrs)."

We will need to consider how this could impact and affect other places where gfs_cyc is used.

@malloryprow
Contributor Author

malloryprow commented Feb 12, 2021

@KateFriedman-NOAA I just had an idea for this that would potentially fix things for the metp tasks while leaving gfs_cyc alone. If we set up setup_workflow.py and setup_workflow_fcstonly.py to add INTERVAL_GFS to the environment variables for the "gfsmetp*" tasks, I can use that to adjust the verification date. I understand, though, wanting to address the bigger picture with gfs_cyc and users manually adjusting the XML to set INTERVAL_GFS. I know you said your plate is pretty full for the rest of the month, but I just wanted to get this idea written down.
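A rough sketch of what that could look like on the metp side is below. It assumes INTERVAL_GFS reaches the task as an "HH:MM:SS" string and that looking back exactly one GFS interval is the desired adjustment; both are assumptions for illustration, and the function names are hypothetical.

```python
import os
from datetime import datetime, timedelta


def interval_hours(value: str) -> int:
    """Convert an "HH:MM:SS" interval string to whole hours (format assumed)."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    if minutes or seconds:
        raise ValueError(f"interval must be a whole number of hours: {value!r}")
    return hours


def metp_verification_date(cycle: datetime, env=None) -> datetime:
    """Look back one GFS interval; fall back to 24 h if INTERVAL_GFS is unset."""
    env = os.environ if env is None else env
    hours = interval_hours(env.get("INTERVAL_GFS", "24:00:00"))
    return cycle - timedelta(hours=hours)


print(metp_verification_date(datetime(2021, 2, 12, 0), {"INTERVAL_GFS": "120:00:00"}))
# 2021-02-07 00:00:00
```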

WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Jun 7, 2021
Replaces the old gfs_cyc method for determining cycling frequency with one that
explicitly sets the step length. In place of gfs_cyc there are two new variables
to determine step sizing: STEP_GFS and STEP_DA. Both are of the timedelta type.

STEP_GFS determines the frequency of GFS and defaults to 6 hours
STEP_DA determines the frequency of DA and also defaults to 6 hours

If STEP_GFS is set to 00:00:00, only the DA system is run
If STEP_DA is set to 00:00:00, only the GFS system is run (free-forecast mode)

Note that free-forecast runs currently require setting STEP_DA to zero, in
addition to using the free-forecast layout.

If STEP_DA is non-zero, the first cycle is DA-only, so this should be kept in
mind when setting up your experiments (especially if you want GFS to run at
certain times of day). If both are non-zero, STEP_GFS should always be an
integer multiple of STEP_DA, otherwise unexpected behavior may occur. Other
than setting STEP_DA to zero for free forecast, STEP_DA is unlikely to be
changed from its default value in the near future for GFS, but the new method
is flexible for any future incorporation of a LAM system that requires faster DA
cycling.
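
The rules above can be sketched as a small check. This is an illustration, not CROW code: the function name is hypothetical, rejecting the case where both steps are zero is an assumption, and raising an error for a non-multiple is a design choice of the sketch (the commit only warns of unexpected behavior).

```python
from datetime import timedelta

ZERO = timedelta(0)


def run_mode(step_gfs=timedelta(hours=6), step_da=timedelta(hours=6)) -> str:
    """Classify a STEP_GFS / STEP_DA pair according to the rules above."""
    if step_gfs < ZERO or step_da < ZERO:
        raise ValueError("steps must not be negative")
    if step_gfs == ZERO and step_da == ZERO:
        # Assumption: with both steps zero there is nothing to run
        raise ValueError("STEP_GFS and STEP_DA cannot both be zero")
    if step_gfs == ZERO:
        return "DA only"
    if step_da == ZERO:
        return "free forecast"
    if step_gfs % step_da != ZERO:
        raise ValueError("STEP_GFS must be an integer multiple of STEP_DA")
    return "cycled"


print(run_mode())                                                  # cycled
print(run_mode(step_gfs=timedelta(hours=24), step_da=ZERO))        # free forecast
```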

Examples:

Full-cycling with DA & GFS:
STEP_GFS: !timedelta "06:00:00"
STEP_DA: !timedelta "06:00:00"
(since these are defaults, nothing needs to be placed in the case file, but you may
wish to for clarity)

Normal DA with GFS at each 00z starting 2013-04-01:
SDATE: 2013-03-31t18:00:00
STEP_GFS: !timedelta "24:00:00"

Free forecast with GFS every 24h at 00z starting 2013-04-01:
SDATE: 2013-04-01t00:00:00
STEP_GFS: !timedelta "24:00:00"
STEP_DA: !timedelta "00:00:00"

CROW is updated to add a new tool that converts a timedelta to a string.

The wave model, which formerly used gfs_cyc to determine the wave cycling interval,
now determines WAVHCYC directly from STEP_GFS. Since this variable is only a number
of hours, this will only work for whole-hour GFS frequencies.
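
The whole-hour restriction can be illustrated with a short sketch. The function name is hypothetical and this is not the CROW implementation; it only shows why a sub-hourly STEP_GFS cannot be expressed as an hour count.

```python
from datetime import timedelta


def wavhcyc_from_step(step_gfs: timedelta) -> int:
    """Return the wave cycling interval in hours, rejecting fractional hours."""
    hours, remainder = divmod(step_gfs, timedelta(hours=1))
    if remainder:
        raise ValueError(f"STEP_GFS must be a whole number of hours: {step_gfs}")
    return hours


print(wavhcyc_from_step(timedelta(hours=24)))  # 24
```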

Case files have been updated to use the new settings in place of gfs_cyc, as well as
set STEP_DA to zero for free forecast cases. However, they have not all been tested.

Additional testing is needed to ensure these changes are fully operational for DA
cycling runs.

Refs: NOAA-EMC#260, NOAA-EMC#314
@JianKuang-Intelsat
Contributor

@malloryprow Has this issue been addressed?

@malloryprow
Contributor Author

@JianKuang-UMD It has not

@WalterKolczynski-NOAA
Contributor

I had a solution in the coupled-crow branch, but now we've moved away from that.

@KateFriedman-NOAA KateFriedman-NOAA removed their assignment Mar 16, 2022
@aerorahul
Contributor

Out of scope for current development and capabilities.

@WalterKolczynski-NOAA
Contributor

This has become relevant again with GEFS reforecast

@WalterKolczynski-NOAA WalterKolczynski-NOAA added feature New feature or request and removed coupled labels May 1, 2024
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Sep 17, 2024
To facilitate longer and more flexible GFS cadences, the `gfs_cyc`
variable is replaced with a specified interval. Up front, this is
reflected in a change in the arguments for setup_exp to:

```
--interval <n_hours>
```

Where `n_hours` is the interval (in hours) between gfs forecasts.
`n_hours` must be a multiple of 6. If 0, no gfs will be run (only
gdas; only valid for cycled mode). The default value is 6 (every
cycle).

In cycled mode, there is an additional argument to control which
cycle will be the first gfs cycle:

```
--sdate_gfs <YYYYMMDDHH>
```

The default if not provided is `--idate` + 6h (first full cycle).

As part of this change, some validation of the dates has
been added. `--edate` has also been made optional and defaults to
`--idate` if not provided.
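
The defaults and checks described so far can be sketched with argparse. This is a hedged illustration rather than the actual setup script: the function names and the `mode` parameter are hypothetical stand-ins, and the exact checks and error messages are assumptions.

```python
import argparse
from datetime import datetime, timedelta


def parse_cycle(text: str) -> datetime:
    """Parse a YYYYMMDDHH cycle string."""
    return datetime.strptime(text, "%Y%m%d%H")


def parse_dates(argv, mode="cycled"):
    """Parse and validate the date arguments (`mode` is a hypothetical stand-in)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--idate", type=parse_cycle, required=True)
    parser.add_argument("--edate", type=parse_cycle, default=None)
    parser.add_argument("--interval", type=int, default=6)
    parser.add_argument("--sdate_gfs", type=parse_cycle, default=None)
    args = parser.parse_args(argv)

    # --edate is optional and defaults to --idate
    if args.edate is None:
        args.edate = args.idate
    if args.edate < args.idate:
        raise ValueError("--edate must not precede --idate")
    if args.interval < 0 or args.interval % 6 != 0:
        raise ValueError("--interval must be a non-negative multiple of 6")
    if args.interval == 0 and mode != "cycled":
        raise ValueError("--interval 0 (gdas only) is only valid in cycled mode")
    # In cycled mode the first gfs cycle defaults to --idate + 6h
    if mode == "cycled" and args.sdate_gfs is None:
        args.sdate_gfs = args.idate + timedelta(hours=6)
    return args


args = parse_dates(["--idate", "2024010100", "--interval", "24"])
print(args.edate, args.sdate_gfs)
# 2024-01-01 00:00:00 2024-01-01 06:00:00
```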

During `config.base` template-filling, `INTERVAL_GFS` (renamed from
`STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as
`--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as
`gfs_cyc` was being used downstream by verif-global. That has been
removed, and instead workflow will be responsible for only running
metp on the correct cycles. This also removes "do nothing" metp
tasks that exit immediately, because only the last GFS cycle in a
day would actually process verification.

Now, metp has its own cycledef and always runs at 18z, regardless
of whether gfs is running at 18z or not. This is simpler than
trying to determine the last gfs cycle of a day when it could
change from day to day. To facilitate this change, support for the
undocumented rocoto dependency tag `taskvalid` is added, as the
metp task needs to know whether the cycle has a gfsarch task or not.
metp will trigger on gfsarch completing (as before), or gdasarch
completing if there is no gfsarch.

metp tasks are no longer generated for forecast-only, as the pgbanl
files (copies of the 1p00 pgbanl files) are not generated for f-o
anyway. If metp is needed for f-o, additional work will be needed.

Additionally, a couple EE2 issues with the metp job are resolved
(even though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Depends on NOAA-EMC/EMC_verif-global#137
Resolves NOAA-EMC#260
Refs NOAA-EMC#1299
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Sep 28, 2024
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this issue Oct 1, 2024
EricSinsky-NOAA pushed a commit to EricSinsky-NOAA/global-workflow that referenced this issue Oct 24, 2024
To facilitate longer and more flexible GFS cadences, the `gfs_cyc`
variable is replaced with a specified interval. Up front, this is
reflected in a change in the arguments for setup_exp to:
```
--interval <n_hours>
```
Where `n_hours` is the interval (in hours) between gfs forecasts.
`n_hours` must be a multiple of 6. If 0, no gfs will be run (only
gdas; only valid for cycled mode). The default value is 6 (every cycle).
(This is a change from current behavior of 24.)

In cycled mode, there is an additional argument to control which cycle
will be the first gfs cycle:
```
--sdate_gfs <YYYYMMDDHH>
```
The default if not provided is `--idate` + 6h (first full cycle). This
is the same as current behavior when `gfs_cyc` is 6, but may vary from
current behavior for other cadences.

As part of this change, some validation of the dates has been
added. `--edate` has also been made optional and defaults to `--idate`
if not provided.

During `config.base` template-filling, `INTERVAL_GFS` (renamed from
`STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as
`--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as
`gfs_cyc` was being used downstream by verif-global. That has been
removed, and instead workflow will be responsible for only running metp
on the correct cycles. This also removes "do nothing" metp tasks that
exit immediately, because only the last GFS cycle in a day would
actually process verification.

Now, metp has its own cycledef and will (a) always run at 18z,
regardless of whether gfs is running at 18z or not, if the interval is
less than 24h; (b) use the same cycledef as gfs if the interval is 24h
or greater. This is simpler than trying to determine the last gfs cycle
of a day when it could change from day to day. To facilitate this
change, support for the undocumented rocoto dependency tag `taskvalid`
is added, as the metp
task needs to know whether the cycle has a gfsarch task or not. metp
will trigger on gfsarch completing (as before), or look backwards for
the last gfsarch to exist.
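
The cycle selection in (a) and (b) can be sketched as follows. The function names are hypothetical, and the choice to start the daily metp cycles at 18z on the day of the first gfs cycle is an assumption of the sketch; the real workflow expresses this through rocoto cycledefs rather than Python lists.

```python
from datetime import datetime, timedelta


def gfs_cycles(sdate_gfs: datetime, edate: datetime, interval_hrs: int) -> list:
    """List the gfs cycles from sdate_gfs through edate at the given interval."""
    if interval_hrs <= 0:
        return []  # an interval of 0 means no gfs is run
    step = timedelta(hours=interval_hrs)
    cycles = []
    current = sdate_gfs
    while current <= edate:
        cycles.append(current)
        current += step
    return cycles


def metp_cycles(sdate_gfs: datetime, edate: datetime, interval_hrs: int) -> list:
    """Daily 18z cycles if the interval is under 24h, else the gfs cycles."""
    if interval_hrs >= 24:
        return gfs_cycles(sdate_gfs, edate, interval_hrs)
    # Assumption: daily metp cycles start at 18z on the day of the first gfs cycle
    first = sdate_gfs.replace(hour=18, minute=0, second=0, microsecond=0)
    return gfs_cycles(first, edate, 24) if interval_hrs > 0 else []


# 12-hourly gfs: metp runs once per day at 18z
print(metp_cycles(datetime(2024, 1, 1, 6), datetime(2024, 1, 3, 0), 12))
# 48-hourly gfs: metp runs on the gfs cycles themselves
print(metp_cycles(datetime(2024, 1, 1, 0), datetime(2024, 1, 5, 0), 48))
```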

Additionally, a couple EE2 issues with the metp job are resolved (even
though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Also corrects some dependency issues with the extractvars job for replay and the replay CI test.

Depends on NOAA-EMC/EMC_verif-global#137
Resolves NOAA-EMC#260
Refs NOAA-EMC#1299

---------

Co-authored-by: David Huber <david.huber@noaa.gov>
5 participants