Use more cycledefs for task control #1078

Merged

@WalterKolczynski-NOAA WalterKolczynski-NOAA commented Oct 17, 2022

Description
Splits the existing rocoto cycle definitions up to offer better job control. This means that only the jobs that are due to run will appear in a cycle's job list from rocotostat/rocotoviewer. It also allows for the removal of some of the cycleexist dependencies that were there solely to prevent the job from running in the half cycle. A side effect of this change is that the half-cycle will be recognized as a completed cycle, fixing the bug with archive jobs starting in the fourth cycle (#1003).

The gdas cycledef has been split into gdas_half for the first half-cycle and gdas for the other GDAS cycles. Tasks that run during that first half-cycle therefore run on two cycledefs.
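As a rough illustration of the split (not the actual code in workflow_xml.py), the two cycledef strings could be generated as below. The function name gdas_cycledefs, its arguments, and the six-hour default interval are assumptions made for this sketch; only the group names gdas_half and gdas come from this PR.

```python
from datetime import datetime, timedelta
from typing import List

ROCOTO_FMT = "%Y%m%d%H%M"  # rocoto cycledef timestamps are YYYYMMDDHHMM


def gdas_cycledefs(sdate: datetime, edate: datetime,
                   interval: timedelta = timedelta(hours=6)) -> List[str]:
    """Hypothetical helper: one cycledef for the half-cycle, one for the rest."""
    total = int(interval.total_seconds())
    step = f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"

    first = sdate.strftime(ROCOTO_FMT)
    second = (sdate + interval).strftime(ROCOTO_FMT)  # first full cycle
    last = edate.strftime(ROCOTO_FMT)

    return [
        # The half-cycle group starts and ends on the same cycle.
        f'<cycledef group="gdas_half">{first} {first} {step}</cycledef>',
        # Every later GDAS cycle belongs to the regular group.
        f'<cycledef group="gdas">{second} {last} {step}</cycledef>',
    ]


for line in gdas_cycledefs(datetime(2021, 12, 20, 18), datetime(2021, 12, 22, 0)):
    print(line)
```

A task that must run in the half-cycle as well as in later cycles would then list both groups in the cycledefs attribute of its rocoto task element (for example cycledefs="gdas_half,gdas"), while a task that should skip the half-cycle lists only gdas.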

For gfs, instead of slicing perpendicular to time, a new cycledef gfs_cont (continuity) was created in parallel to the existing gfs cycledef; gfs_cont omits the first cycle. This was done because only one job (aerosol_init) currently skips the first cycle, and it avoids having to provide two cycledefs for every gfs task but one.
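The parallel arrangement for gfs can be sketched the same way. This is an assumption-laden illustration, not the PR's code: gfs_cycledefs and cycledef_for are hypothetical names, the interval is passed in by the caller, and the guard that drops gfs_cont for a single-cycle experiment is a guess at sensible behavior.

```python
from datetime import datetime, timedelta
from typing import List

FMT = "%Y%m%d%H%M"  # rocoto cycledef timestamps are YYYYMMDDHHMM


def gfs_cycledefs(sdate: datetime, edate: datetime, interval: timedelta) -> List[str]:
    """Hypothetical helper: gfs spans every cycle, gfs_cont skips the first."""
    hours, rem = divmod(int(interval.total_seconds()), 3600)
    step = f"{hours:02d}:{rem // 60:02d}:{rem % 60:02d}"

    defs = [f'<cycledef group="gfs">{sdate.strftime(FMT)} {edate.strftime(FMT)} {step}</cycledef>']

    cont_start = sdate + interval
    if cont_start <= edate:  # assumed guard: no gfs_cont when there is only one cycle
        defs.append(
            f'<cycledef group="gfs_cont">{cont_start.strftime(FMT)} {edate.strftime(FMT)} {step}</cycledef>'
        )
    return defs


def cycledef_for(task_name: str) -> str:
    """Only aerosol_init skips the first cycle; every other gfs task uses gfs."""
    return "gfs_cont" if task_name == "aerosol_init" else "gfs"
```

With this layout each gfs task still names exactly one cycledef, which is the point of running gfs_cont in parallel rather than splitting gfs in time.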

Since some time math is now being done on sdate in workflow_xml.py, we now keep the dates as datetime objects and convert them to strings only when writing the cycledef strings.
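A minimal sketch of why the conversion is deferred: arithmetic on a datetime handles day, month, and year rollover, which string manipulation would not. The helper names parse_sdate and to_cycledef_time and the ten-digit YYYYMMDDHH input format are assumptions for this example.

```python
from datetime import datetime, timedelta


def parse_sdate(text: str) -> datetime:
    """Parse a YYYYMMDDHH string (assumed input format) into a datetime."""
    return datetime.strptime(text, "%Y%m%d%H")


def to_cycledef_time(when: datetime) -> str:
    """Format a datetime as the YYYYMMDDHHMM string used in a cycledef."""
    return when.strftime("%Y%m%d%H%M")


sdate = parse_sdate("2021123118")
second_cycle = sdate + timedelta(hours=6)  # rolls over into the next year
print(to_cycledef_time(second_cycle))  # prints 202201010000
```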

In order to access the pygw utilities in the workflow directory, a symlink is created in workflow pointing to the pygw location in ush. A better solution may be found in the future.
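For readers unfamiliar with the approach, a relative symlink of this kind can be created as sketched below. However the link is actually set up in this PR, the function name link_pygw and the directory arguments here are hypothetical; the real pygw path under ush is not spelled out in this description.

```python
import os
from pathlib import Path


def link_pygw(workflow_dir: Path, pygw_src: Path) -> Path:
    """Create (or refresh) workflow_dir/pygw as a relative symlink to pygw_src."""
    link = workflow_dir / "pygw"
    if link.is_symlink():
        link.unlink()  # replace a stale link rather than failing

    # A relative target keeps the link valid if the whole checkout is moved.
    target = os.path.relpath(pygw_src, start=workflow_dir)
    link.symlink_to(target, target_is_directory=True)
    return link
```

Once the link exists, scripts in the workflow directory can import pygw as if it were a local package.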

Fixes #1003

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • 4½-cycle cold-start C96/48 cycled test on Orion
  • 4½-cycle warm-start C384/192 cycled test on Orion
  • Coupled prototype test on Orion
  • 3-cycle forecast-only C96 ATMA test on Orion (test aerosol_init)
  • GDASapp

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes

@WalterKolczynski-NOAA

Will need some help confirming that the GDASapp job flow is still correct, since I don't know how to test that app myself.

I'd also love some confirmation testing on all the other modes. I tried to hit everything that might matter for these changes.

@WalterKolczynski-NOAA WalterKolczynski-NOAA marked this pull request as ready for review October 17, 2022 18:52

@aerorahul aerorahul left a comment

The gfs_cont is meant to represent "continuous aerosols". In other words, these are no longer "parallel" forecasts but are now "sequential", since every subsequent CDATE depends on the aerosol backgrounds (restarts) from the forecast valid at the CDATE initialized from the previous CDATE.

In that frame, continuous aerosols operate like the "cycled" system, which is sequential.

@CoryMartin-NOAA

> The gfs_cont is meant to represent "continuous aerosols". In other words, these are no longer "parallel" forecasts but are now "sequential", since every subsequent CDATE depends on the aerosol backgrounds (restarts) from the forecast valid at the CDATE initialized from the previous CDATE.
>
> In that frame, continuous aerosols operate like the "cycled" system, which is sequential.

Yeah I guess that is a confusing name then... This is really a temporary thing though so I won't fight it, as the 'aerosol init' will be replaced by aerosol DA soon.

@aerorahul

> > The gfs_cont is meant to represent "continuous aerosols". In other words, these are no longer "parallel" forecasts but are now "sequential", since every subsequent CDATE depends on the aerosol backgrounds (restarts) from the forecast valid at the CDATE initialized from the previous CDATE.
> >
> > In that frame, continuous aerosols operate like the "cycled" system, which is sequential.
>
> Yeah I guess that is a confusing name then... This is really a temporary thing though so I won't fight it, as the 'aerosol init' will be replaced by aerosol DA soon.

@CoryMartin-NOAA
There will still be those who wish to run continuous aerosols without DA. I can't say why they would want to (perhaps cost savings from not having to run DA), but there will be some who would argue for such capability. In essence, continuous aerosols is "Identity DA".

@CoryMartin-NOAA

@aerorahul not disagreeing there. I guess my point is, why is it a separate 'GFS' and not just an option to turn on/off?


@CoryMartin-NOAA CoryMartin-NOAA left a comment

gfs_cont is less than ideal but otherwise seems fine to me

@WalterKolczynski-NOAA

@CoryMartin-NOAA did you test to make sure I didn't break the GDASapp?

@CoryMartin-NOAA

@WalterKolczynski-NOAA I have not yet. To be honest, it is only used by like 2-3 people at the moment so I wouldn't worry too much about breaking GDASApp compatibility at this time, since all of our stuff is definitely still in flux. In my opinion, don't let us hold you up (unless @guillaumevernieres or @RussTreadon-NOAA want to make sure it works before merging)

@guillaumevernieres

> @WalterKolczynski-NOAA I have not yet. To be honest, it is only used by like 2-3 people at the moment so I wouldn't worry too much about breaking GDASApp compatibility at this time, since all of our stuff is definitely still in flux. In my opinion, don't let us hold you up (unless @guillaumevernieres or @RussTreadon-NOAA want to make sure it works before merging)

I'm with @CoryMartin-NOAA , you don't need to worry about the GDASApp stuff.

@WalterKolczynski-NOAA

> > @WalterKolczynski-NOAA I have not yet. To be honest, it is only used by like 2-3 people at the moment so I wouldn't worry too much about breaking GDASApp compatibility at this time, since all of our stuff is definitely still in flux. In my opinion, don't let us hold you up (unless @guillaumevernieres or @RussTreadon-NOAA want to make sure it works before merging)
>
> I'm with @CoryMartin-NOAA , you don't need to worry about the GDASApp stuff.

Okay. It's a simple change later if anything is wrong, I just hate breaking things (I do enough of that unknowingly).

@WalterKolczynski-NOAA WalterKolczynski-NOAA merged commit 6f5fa79 into NOAA-EMC:develop Oct 20, 2022
@WalterKolczynski-NOAA WalterKolczynski-NOAA deleted the feature/cycledefs branch October 20, 2022 03:37
KateFriedman-NOAA added a commit to KateFriedman-NOAA/global-workflow that referenced this pull request Jan 30, 2023
Successfully merging this pull request may close these issues.

Archive jobs fail for fourth full cycle when starting with a half-cycle
4 participants