Currently, the wrapper scripts no longer run as-is on Hera (and possibly other RDHPCS systems).
Expected behavior
Following the documentation's instructions for running the workflow with standalone scripts should allow users to run the experiment to completion on any supported system.
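Current behavior
When I run the workflow according to the instructions in the documentation for running with standalone scripts, the initial tasks all fail with error messages on Hera.
Machines affected
Running the make_grid wrapper scripts on Hera led to a variety of errors. I have not yet tested on other systems.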
Steps To Reproduce
After generating the workflow and running ./run_make_grid.sh, the following errors came up:
+ /scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 108: /source_util_funcs.sh: No such file or directory
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 109: source_config_for_task: command not found
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 110: /job_preamble.sh: No such file or directory
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 129: -f: command not found
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 145: print_info_msg: command not found
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 155: check_for_preexist_dir_file: command not found
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 156: mkdir_vrfy: command not found
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 165: mkdir_vrfy: command not found
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 174: /exregional_make_grid.sh: No such file or directory
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 176: print_err_msg_exit: command not found
touch: cannot touch ‘/make_grid_task_complete.txt’: Read-only file system
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID: line 208: job_postamble: command not found
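The empty path prefixes in these messages (for example /source_util_funcs.sh) suggest that a directory variable was unset by the time the J-job ran. One way this can happen is when a wrapper sets variables without exporting them, so child processes never see them. Below is a minimal sketch of that behavior, not taken from the actual wrapper scripts: DEMO_DIR and the temporary variable file are hypothetical stand-ins.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a file of experiment variable definitions.
unset DEMO_DIR
vars_file="$(mktemp)"
printf 'DEMO_DIR=/tmp/demo_ush\n' > "$vars_file"

# Command run in a child process, mimicking a J-job that builds a path
# from a variable it expects to inherit from the wrapper.
child='echo "${DEMO_DIR}/source_util_funcs.sh"'

# Without allexport: the variable exists only in the sourcing shell,
# so the child sees an empty prefix.
without_a="$( . "$vars_file"; bash -c "$child" )"

# With allexport (set -a): variables assigned while it is on are exported
# and therefore visible to the child.
with_a="$( set -a; . "$vars_file"; bash -c "$child" )"

echo "without set -a: $without_a"
echo "with set -a:    $with_a"
rm -f "$vars_file"
```

In the first case the child builds the path from an empty variable, which matches the shape of the "No such file or directory" messages above.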
After changing set -x to set -xa in all wrapper files and rerunning ./run_make_grid.sh, the following error appears:
========================================================================
Entering script: "JREGIONAL_MAKE_GRID"
In directory: "/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs"
This is the J-job script for the task that generates grid files.
========================================================================
Specified directory or file (dir_or_file) already exists:
dir_or_file = "/scratch2/NAGAPE/epic/Gillian.Petro/expt_dirs/standalone/grid"
Moving (renaming) preexisting directory or file to:
old_dir_or_file = "/scratch2/NAGAPE/epic/Gillian.Petro/expt_dirs/standalone/grid_old001"
========================================================================
Entering script: "exregional_make_grid.sh"
In directory: "/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/scripts"
This is the ex-script for the task that generates grid files.
========================================================================
/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/scripts/exregional_make_grid.sh: line 63: ulimit: stack size: cannot modify limit: Operation not permitted
End exregional_make_grid.sh at Mon Aug 7 22:13:29 UTC 2023 with error code 1 (time elapsed: 00:00:00)
FATAL ERROR:
ERROR:
From script: "JREGIONAL_MAKE_GRID"
Full path to script: "/scratch2/NAGAPE/epic/Gillian.Petro/ufs-srweather-app/jobs/JREGIONAL_MAKE_GRID"
Call to ex-script corresponding to J-job "JREGIONAL_MAKE_GRID" failed.
Exiting with nonzero status.
End JREGIONAL_MAKE_GRID at Mon Aug 7 22:13:29 UTC 2023 with error code 1 (time elapsed: 00:00:02)
Detailed Description of Fix (optional)
Likely, we will need to update the wrapper scripts and/or their documentation to implement the changes from #847 that allow Jenkins tests to run the wrappers successfully.
To correct the "ulimit: stack size: cannot modify limit: Operation not permitted" failure, the ush/machine/hera.yaml file will need to be modified.
@MichaelLueken Is this something that will work on all systems? If so, could I just tell users in the docs to make that adjustment in their config file when running w/o Rocoto?
@gspetro-NOAA - Unfortunately, I can't say whether this will work on all systems or not. This is only being used to allow the wrapper scripts to run on Hera. Additional testing will be required to see if this will work on other systems.
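For testing on other systems, a quick probe like the one below may help show whether a given shell session is allowed to raise the stack size limit at all. This is only a diagnostic sketch, not the fix discussed above; it assumes bash and does not touch any SRW configuration.

```shell
#!/usr/bin/env bash
# Report the current soft and hard stack size limits.
soft="$(ulimit -S -s)"
hard="$(ulimit -H -s)"
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"

# Attempt the change in a subshell so this shell's limits are left untouched.
# Without -S or -H, bash tries to set both limits, which fails for
# unprivileged users when the hard limit is lower than the requested value.
if ( ulimit -s unlimited ) 2>/dev/null; then
  can_raise=yes
else
  can_raise=no
fi
echo "ulimit -s unlimited permitted: $can_raise"
```

If the probe reports "no" on a login node, the same "cannot modify limit" error would be expected when a wrapper script is run there directly.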