GFSv16.2.1 #899

Merged
KateFriedman-NOAA merged 538 commits into operations from operations_v16.2.1
Jul 8, 2022

KateFriedman-NOAA (Member) commented Jul 7, 2022

Description

This PR merges the GFSv16.2.1 updates for WCOSS2 into the operations branch (RFC 9682). It contains several bug fixes for the GFSv16.2 package that resolve issues with the gfs_forecast job (wave restart calculation) and the gfs_atmos_postsnd (bufr sounding) job. All of these changes were known and tested before WCOSS2 go-live, but because of their timing they had to be implemented through an RFC after go-live, which is why the version number was incremented.

RFC 9682 - GFS v16.2.1 - On WCOSS2, update GFS package to v16.2.1. This includes fixes for two issues:
● Source code and resource configuration changes to address bufr sounding job failures.
● Script changes to address the gfs_forecast job restart issue that happened when the f1 filesystem was degraded.
To be implemented on July 5 at 1330Z.

Changes:

  1. Updates related to the gfs_bufr postsnd job:
     • Correction to gfs_bufr code (by @BoCui-NOAA)
     • Change to gfs_bufr build to no longer build with -qopenmp.
     • Adjustments to job resources based on testing by @BoCui-NOAA, @WeiWei-NCO, and GDIT.
  2. Update the gfs_forecast job calculation of the rerun starting time based on whether wave restarts exist.
  3. A memory adjustment to the FBWIND job.
  4. An update made to the transfer_rdhpcs_gfs_nawips.list parm file by NCO.
  5. New release notes for the RFC upgrade to GFSv16.2.1 (docs/Release_Notes.gfs.v16.2.1.md).
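
For readers unfamiliar with the gfs_forecast change in the list above, the idea of picking a rerun starting time only where wave restarts also exist can be sketched roughly as follows. This is an illustration in Python, not the code in this PR: the actual logic is in the workflow's shell scripts, and the function name, the restart file names, and the flat directory layout below are all assumptions made for the example.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional


def find_rerun_start(cycle: datetime, restart_dir: Path,
                     interval_hours: int, max_hours: int,
                     require_wave: bool = True) -> Optional[datetime]:
    """Return the latest restart time usable for a rerun, or None.

    Hypothetical helper: searches backward from the end of the forecast
    and accepts a time only if an atmosphere restart exists and, when
    require_wave is True, a wave restart exists for the same time.
    """
    hour = max_hours
    while hour >= interval_hours:
        valid = cycle + timedelta(hours=hour)
        stamp = valid.strftime("%Y%m%d.%H0000")
        # File names here are placeholders, not the workflow's real names.
        atm_ok = (restart_dir / f"{stamp}.coupler.res").exists()
        wave_ok = (restart_dir / f"{stamp}.restart.ww3").exists()
        if atm_ok and (wave_ok or not require_wave):
            return valid
        hour -= interval_hours
    # No usable restart found: the caller would start from the beginning
    # or stop with an error, depending on the job's policy.
    return None
```

With waves required, a time that has only an atmosphere restart is skipped in favor of an earlier time where both restarts are present.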

Refs #399
Closes #399

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • This change requires a documentation update

How Has This Been Tested?

  • Clone and Build tests on WCOSS2
  • Cycled test on WCOSS2

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

KateFriedman-NOAA and others added 30 commits January 12, 2022 20:40
- update GLDAS tag to gldas_gfsv16_release.v1.24.0
- update WAFS to gfs_wafs.v6.2.6

Refs: #399
- remove "-o" after compath.py in COMIN definitions
- add "${envir}" and move closing ")" forward in line

Refs: #399
Incorporate ecflow feedback from NCO - part 1
- update WAFS tag in sorc/checkout.sh and release notes

Refs: #399
…NOAA/global-workflow into feature/ops-wcoss2

* 'feature/ops-wcoss2' of https://github.com/KateFriedman-NOAA/global-workflow:
  revert ecflow include files to NCO versions.  Will adapt as necessary for proper use
- update COMIN paths in GEMPAK JJOB scripts for COMINukmet, COMINecmwf,
and COMINnam to add the respective systems to the end of the path
definition

Refs: #399
Updates to ROTDIR/COMIN definitions related to compath.py and new GLDAS/WAFS tags
Load compiler env. and modules in the ecf scripts.
Request exclusive nodes for large memory jobs or jobs requesting all cores on WCOSS2 in ecf scripts.
KateFriedman-NOAA and others added 21 commits May 16, 2022 18:25
Bo Cui updated gfs_bufr.sh to improve error handling

Refs: #399, #790
- remove hyper=true in jgdas_atmos_analysis_calc.ecf
- add export nth_echgres=$nth_echgres_gfs when CDUMP=gfs in
config.analcalc; for correct thread setting at runtime
- add export nth_echgres=4 to analcalc block in config.resources
- add export nth_echgres_gfs=12 to analcalc block in config.resources

Refs: #399
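
The thread selection described in the commit message above can be modeled as below. This is a sketch under assumptions: the real settings are shell exports in config.resources and config.analcalc, and the function name and dictionary layout here are hypothetical. Only the variable names and the values 4 and 12 come from the commit message.

```python
def echgres_threads(cdump: str, resources: dict) -> int:
    """Pick the echgres thread count for a given CDUMP.

    Mirrors the described behavior: gfs uses nth_echgres_gfs,
    every other CDUMP keeps the default nth_echgres.
    """
    if cdump == "gfs":
        return resources["nth_echgres_gfs"]
    return resources["nth_echgres"]


# Values taken from the commit message for the analcalc block.
analcalc_resources = {"nth_echgres": 4, "nth_echgres_gfs": 12}
gfs_threads = echgres_threads("gfs", analcalc_resources)
gdas_threads = echgres_threads("gdas", analcalc_resources)
```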
Hand-off tag to NCO is now EMC-v16.2.0.7

Refs: #399
Final pre-production freeze updates for GFSv16.2.0 package on WCOSS2
- NCO updated the default path for HOMENHC and tested it in prod on
WCOSS2 during NHC test

Refs: #399
- Based on testing on Dogwood after some WCOSS2 updates some memory and
resource adjustments were made by NCO.
- Memory updates to the gempak, awips, and fbwnd job ecf scripts.
- Resource adjustments to remedy oversubscription errors in the post and
postsnd jobs.

Refs: #399
The gfspostsnd job was oversubscribing CPUs on WCOSS2 after updates on
Dogwood. Updating resources settings to get them matching and working.

Refs: #399
- Add updated memory values for awips and gempak jobs into resource
configs to match similar updates in ecf scripts

Refs: #399
WCOSS2 GFSv16.2.0 resource updates and NHC change
Co-authored-by: Rahul Mahajan <aerorahul@users.noreply.github.com>
for ops: Update calculation of starting time of rerun based on if wave restarts exist
- Resource adjustments for the postsnd bufr_sounding job (based on GDIT
and Bo Cui testing).
- Fix to gfs_bufr code (Bo Cui).
- gfs_bufr build update to no longer build with "-qopenmp".

Refs: #399
Merge remote-tracking branch 'origin/feature/ops-wcoss2' into operations_v16.2.1

* origin/feature/ops-wcoss2: (422 commits)
  Add release notes for GFSv16.2.1
  Update to transfer_rdhpcs_gfs_nawips.list by NCO
  Bug fix for gfs_bufr (postsnd) job
  Adjust memory value for fbwind job
  update logic for linking wave restart files for rerun
  Update no rerun wave linking logic statement
  add error message and exit if wave model does not have restart on rerun
  add checking for wave restarts for RERUN date calculation
  Matching memory updates for awips/gempak in config
  Update prior GFS version in v16.2.0 release notes
  Update gfspostsnd job resources - oversubscribing
  Memory and resource adjustments to some jobs (NCO)
  Update to HOMENHC default path in JGLOBAL_ATMOS_TROPCY_QC_RELOC
  Update EMC tag name in v16.2.0 release notes
  Resource updates for analysis_calc job on WCOSS2
  Updated error handling in gfs_bufr script
  Add -g and -traceback flags to utility builds if missing
  EnKf forecast serial netcdf updates and DELTIM=200
  Add HOMEobsproc back to config.base.nco.static
  Update "excl" to "exclhost" in workflow_utils.py
  Update ecf PBS excl to exclhost
  Remove reference to HOMEobsproc in NCO config.base
  Update GFSv16.2.0 release notes for new hand-off tag
  Update WCOSS2 env file cpu-bind flags for threading
  Update UPP tag to upp_v8.1.2
  Remove nco_ver from build.ver - not needed
  Update release notes to update prior version
  Update GFSv16.2.0 release notes to reflect new tag
  Increase post_master job to 126 tasks
  Update enkfgdas_sfc job to use 60GB
  Add gsl module load needed by nco module
  Set hyper=true for gdas_atmos_analysis_calc job
  Optimized gfs_forecast job resource configuration
  Add WCOSS2 operations gfs defs files
  Add missing --init flag to GSI checkout submodule
  update file Release_Notes.gfs.v16.1.7.txt
  add Release_Notes.gfs.v16.1.7.txt
  Code update to syndat_getjtbul.fd for v16.1.7
  Update HOMEobsproc paths in config.base.nco.static
  Update obsproc package settings in dev config.base
  Update prep.sh to use new WCOSS2 obsproc packages
  Add obsproc/prepobs run versions to wcoss2.ver
  Add needed gempak subfolder to gempak ush scripts
  Update GSI submodule command and release notes
  GFS v16.1.6 update: Turn off uv 224 VADWND
  Update GLDAS tag to gldas_gfsv16_release.v.2.0.0
  Update DMPDIR and BASE_GIT paths for WCOSS2
  Update Externals.cfg with GFSv16.2.0 component versions
  Update release notes for current ops version
  Move all PBS place settings to separate line
  Remove commented out lines from transfer lists
  Update WAFS tag to gfs_wafs.v6.2.8
  GFS v16.1.6: Update release notes and comment in config.anal
  Update npe_node_fcst_gfs in config.resources.emc.dyn
  Updates to support wcoss2.ver
  Fold in transfer parm list updates from NCO
  Move transfer lists into new transfer folder
  Update wave job resources with NCO feedback
  Update EMC tag name in v16.2.0 release notes
  Updates to run.ver and create wcoss2.ver
  Script updates from NCO
  GFS v16.1.6: GSI update to add commercial GPSRO in DO-4
  Move excl setting into resource line in ecf scripts
  Update gfs_forecast job resources
  Update several versions in run.ver
  Add OMP_PLACES=cores for fcst block in WCOSS2.env
  Update compilation flags for gaussian_sfcanl build
  Add the following scripts changed to remove module load libjpeg:  jgdas_wave_prep.ecf  jgfs_wave_prep.ecf
  Remove hardwired DELL path util/ush/make_tif.sh
  Remove esmf from enkf fcst
  1.   A check on job/ush/script from HOMEgfs, I found the following reference to USE_CFP:   gldas_forcing.sh   exgdas_atmos_chgres_forenkf.sh   exgdas_atmos_gldas.sh   exgdas_enkf_update.sh   exglobal_atmos_analysis.sh   exglobal_diag.sh
  Correct analysis job walltimes in config.resources
  The following scripts changed to remove module load wgrib2:   jenkfgdas_sfc.ecf   jgfs_wave_prdgen_bulls.ecf   jgdas_wave_postsbs.ecf
  Adjust analysis job walltimes for ops
  Add missing EXPDIR setting to JGDAS_ATMOS_GEMPAK
  Remove non-WCOSS2 references in nco.static configs
  Update EMC tag name in release notes
  Change npe_analdiag to 96
  Remove npe_node_eupd=9 setting on WCOSS2
  Update several GSI/EnKF job resources
  Update to correct infinite loop in gempak script
  Add missing character to GLDAS tag in release notes
  Update GLDAS tag to gldas_gfsv16_release.v.1.28.0
  Remove excl for gfswaveprep job PBS directive
  Update GFSv16.2.0 release notes
  GEMPAK_META script updates from Boi
  Update GEMPAK scripts
  Update gempak job in setup_workflow scripts
  Back out comment of job variable in awips scripts
  Resource adjustments for eobs, waveprep, gfspost
  Resource updates for analysis and eobs
  Change RUN to RUN2 in awips scripts
  Change RUN to RUN2 in gempak pgrb2 spec script
  Correct config list for wavepostbndpntbll job
  Comment out job variable in awips ecf scripts
  Reduce gdas analysis job walltime back to 40mins
  Remove nth_max usage in WCOSS2.env
  A few resource updates from NCO and WCOSS_C removal
  Update analysis job walltime to 50mins
  Optimization resource updates from NCO
  remove obsproc ecfs and there references from suite.def.  work needs to find a proper trigger for the remaining dump job
  bringing in changes from @WeiWei-NCO after his testing
  Update analysis job walltime to 50 mins
  Update gdasechgres job resources
  Update esfc and analysis job resources
  Update C384 and C768 values in config.fv3.emc.dyn
  Update config.resource.emc.dyn with tested values
  Cleanup of config.fv3.nco.static
  Add COMIN_OBS/COMIN_GES_OBS and related xml support
  Add COMIN_OBS/COMIN_GES_OBS and related xml support
  Update resources for gdasesfc job in ecf script
  Numerous resource updates based on optimization
  Update for C768 gdasfcst job resource settings
  Update analysis job ecf resource settings
  Update to WCOSS2 env file for waveprep job
  Add missing get_awipsgroups function to fcstonly
  Update memory setting in workflow_utils for gfs
  update resources for more jobs in ecf scripts from NCO
  update resources for more jobs in ecf scripts from NCO
  update resources for jgfs_atmos_tropcy_qc_reloc.ecf.  Remove developer overwrite section
  update resources for jgdas_atmos_tropcy_qc_reloc.ecf.  Remove developer overwrite section
  update resources for more jobs to include memory in ecf scripts.  wave init jobs need modules for Intel loaded
  update resources for wave jobs to include memory in ecf scripts
  update resources for atmos chgres for enkf in ecf scripts
  update resources for atmos pp wafs_gcip in ecf scripts
  update resources for atmos gempak_meta in ecf scripts
  update resources for atmos gempak in ecf scripts
  update resources for wave init, post and prep jobs in ecf scripts
  fix resource allocations for some jobs that NCO flagged were allocating too many cores
  Remove unneeded COM paths from wavepostsbs JJOB
  Update GLDAS tag in release notes
  Update GLDAS tag to gldas_gfsv16_release.v1.25.0
  wave init jobs just need cray-pals per NCO.  remove rest
  put NCO identified changes from the global-workflow in a branch
  remove gdas remnant from enkfgdas jobnames
  some PBS jobnames were hardwired gdas or gfs, while some inherited from %RUN%. This commit uses %RUN% to make it consistent and possibly will open the door for further consolidation between gdas and gfs families
  post/anl job is the same as the forecast hour.  create a link, instead of having a copy
  Update post job resources in config.resources.nco.static
  Update post jobs ecf script resources
  Update release notes for new EMC tag
  Update workflow_utils.py to support exclusive
  Add imagemagick_ver=7.0.8-7 to run.ver
  Update NCO resource config for memory
  Update v16.2.0 release notes for ecf script linking
  Add memory setting to jgfs_atmos_wafs_master.ecf
  Add memory setting to jgfs_atmos_wafs_grib2_0p25.ecf
  Add memory setting to jgfs_atmos_wafs_grib2.ecf
  Add memory setting to jgfs_atmos_wafs_blending_0p25.ecf
  Add memory setting to jgfs_atmos_wafs_blending.ecf
  Add memory setting to jgfs_atmos_awips_g2_master.ecf
  Add memory setting to jgfs_atmos_awips_master.ecf
  Add excl tag to jgfs_atmos_gempak.ecf
  Add excl tag to jgfs_forecast.ecf
  add pesky blank lines at the end of script. reviewers are brutal
  add script that sets up the links to the master.ecf that loop over forecast hours
  add gitignore in appropriate places to ignore links.  update defs to the consistent grib_wafs ecf tasks
  remove duplicate jgfs_atmos_wafs_f*.ecf files and rename f00 as master
  remove duplicate jgfs_atmos_awips_g2_f*.ecf files and rename f000 as master
  remove duplicate jgfs_atmos_awips_f*.ecf files and rename f000 as master
  remove duplicate gfs_atmos_post_fxxx.ecf files and rename f000 as master
  fix typo that causes the opposite effect
  ignore gdas/atmos/post/ forecast hour ecf links
  remove duplicate gdas_atmos_post_fxxx.ecf files and rename f000 as master
  remove remnant from PR555 that copied gdas/enkf to enkfgdas. add gitignore in enkfgdas/post to ignore links
  remove duplicate enkfgdas_post_fxxx.ecf files and rename f003 as master
  add script that sets up the links to the master.ecf that loop over forecast hours
  add gitignore in appropriate places to ignore links.  update defs to the consistent grib_wafs ecf tasks
  remove duplicate jgfs_atmos_wafs_f*.ecf files and rename f00 as master
  remove duplicate jgfs_atmos_awips_g2_f*.ecf files and rename f000 as master
  remove duplicate jgfs_atmos_awips_f*.ecf files and rename f000 as master
  remove duplicate gfs_atmos_post_fxxx.ecf files and rename f000 as master
  fix typo that causes the opposite effect
  ignore gdas/atmos/post/ forecast hour ecf links
  remove duplicate gdas_atmos_post_fxxx.ecf files and rename f000 as master
  remove remnant from PR555 that copied gdas/enkf to enkfgdas. add gitignore in enkfgdas/post to ignore links
  remove duplicate enkfgdas_post_fxxx.ecf files and rename f003 as master
  request exclusive node where ncpus=128
  remove memory requests of 500gb and request exclusive node instead
  every ecf script that loads compiler dependent module, now loads PrgEnv-intel, craype and intel
  every ecf script that loads compiler dependent module, now loads PrgEnv-intel, craype and intel
  every ecf script that loads compiler dependent module, now loads PrgEnv-intel, craype and intel
  every ecf script that loads cray-mpich, now loads PrgEnv-intel, craype and intel.  Ignore swp files
  Correct COMIN paths in GEMPAK driver scripts
  Update COMIN paths for ukmet, ecmwf, and nam
  Update WAFS tag to gfs_wafs.v6.2.7
  add #PBS -l debug=true to all .ecf files
  Correct COMIN definitions
  Update GLDAS and WAFS tags in release notes
  Update GLDAS tag to gldas_gfsv16_release.v1.24.0
  revert ecflow include files to NCO versions.  Will adapt as necessary for proper use
  Remove NCO if-block from JJOB scripts
  Update ROTDIR in config.base.nco.static
  Remove ecflow post assignment in envir-p1.h
  Remove remark from envir-p1.h and head.h Update analysis ecflow script to use 128 for wcoss2 Remove extra CDATE
  Reference to NCO version:  - Move enkf out of gdas and rename it to enkfgdas.      Include all ecflow definition files job name      Include all ecflow scripts name and job/log name  - Move "model=gfs" to the top on each job except all jobs under obsproc.      obsproc will no longer be part of GFS. Therefore leave it without change for testing purpose.  - Remove the source of model_ver from each ecflow script except all jobs under obsproc.      obsproc will no longer be part of GFS. Therefore leave it without change for testing purpose.
  Update enkf structure changes in ecflow definition files
  Update WAFS tag to gfs_wafs.v6.2.6
  Remove envvar from module-setup.*.inc scripts
  Remove envvar from WCOSS2 driver scripts
  Update machine-setup based on NCO feedback
  ...
KateFriedman-NOAA added the "production update" label (Processing update in production) Jul 7, 2022
KateFriedman-NOAA added this to the WCOSS2 - GFSv16.2.0 milestone Jul 7, 2022
KateFriedman-NOAA self-assigned this Jul 7, 2022
KateFriedman-NOAA linked an issue (3 tasks) Jul 7, 2022 that may be closed by this pull request
aerorahul (Contributor) left a comment

Looks good.

Going forward, if there are changes in ecf/, please also open PRs into develop.

KateFriedman-NOAA merged commit 88bbfa8 into operations Jul 8, 2022
KateFriedman-NOAA deleted the operations_v16.2.1 branch July 8, 2022 16:08
Successfully merging this pull request may close these issues.

GFSv16.2.0 - Port global-workflow operations branch to WCOSS2