Workflow does not work on MacOS due to bash and UNIX utility differences #369

Closed
mkavulich opened this issue Nov 18, 2020 · 1 comment · Fixed by #539

@mkavulich
Collaborator

A number of instances of UNIX utilities and bash syntax do not work on MacOS due to various incompatibilities.

The main problem is that the bash version installed on MacOS is very old due to licensing issues (version 3.2 from 2007!), so it does not recognize the newer syntax that is used extensively in the workflow generation script and other bash scripts. In addition, some UNIX utilities on MacOS differ in functionality from their GNU counterparts.

The specific problems I've found so far are:

  • bash does not allow the ${VARIABLE^^} syntax for capitalizing strings, nor the converse ${VARIABLE,,} syntax for making them lowercase
  • readlink does not accept the "-f" flag
  • sed does not accept the "-r" flag

There may be others. I am currently working on getting the generation script working, and from there will fix problems as I go through the entire workflow.
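
For illustration, the three incompatibilities listed above can be worked around with constructs that should behave the same under bash 3.2 on MacOS and under newer bash on Linux. This is only a sketch: the function names (`to_upper`, `to_lower`, `digits_to_n`, `resolve_path`) are hypothetical and are not taken from the workflow, and it assumes a `sed` that accepts `-E` (BSD sed does, as do recent GNU sed releases).

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from regional_workflow).

# Replacement for ${VAR^^} and ${VAR,,}, which need bash 4 or newer.
to_upper() { printf '%s' "$1" | tr '[:lower:]' '[:upper:]'; }
to_lower() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }

# Extended regular expressions: "-E" instead of the GNU-only "-r".
digits_to_n() { sed -E 's/[0-9]+/N/g'; }

# Rough stand-in for "readlink -f": follow symlinks one hop at a time,
# then print the physical directory plus the final file name.
resolve_path() {
  local target="$1" dir link
  while [ -L "$target" ]; do
    dir="$(cd -P "$(dirname "$target")" && pwd)"
    link="$(readlink "$target")"
    case "$link" in
      /*) target="$link" ;;        # absolute link target
      *)  target="$dir/$link" ;;   # relative to the link's directory
    esac
  done
  dir="$(cd -P "$(dirname "$target")" && pwd)"
  printf '%s/%s\n' "$dir" "$(basename "$target")"
}
```

For example, `MACHINE="$(to_upper "$MACHINE")"` would take the place of `MACHINE="${MACHINE^^}"`. The `resolve_path` sketch does not handle every corner case that GNU `readlink -f` does (for instance, a path of `/` itself).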

@mkavulich mkavulich self-assigned this Nov 18, 2020
mkavulich added a commit that referenced this issue Feb 1, 2021
## DESCRIPTION OF CHANGES: 
This PR adds generic platforms to the regional_workflow, not specific to any one machine, that should allow users to run the ufs-srweather-app on any UNIX-based machine, without a workflow manager, so long as the NCEPLIBS and other prerequisites have been properly installed. This can be done using the scripts described in regional_workflow/ush/wrappers/README.md; additional documentation is currently being written.

Users can utilize these options by setting the MACHINE variable in config.sh to either "LINUX" or "MACOS". The LINUX option should allow most users to run the ufs-srweather-app on a generic Linux OS machine. The MACOS option is for MacOS/Darwin operating systems; this needs to be kept separate because the MacOS version of bash is very old and missing some functionality, and because several GNU/Linux utilities have different functionality and/or names on MacOS.
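
As a rough illustration of why a separate MACOS setting is useful, a script could choose utility names based on MACHINE. The sketch below is an assumption, not the code in this PR: `select_utils` is a hypothetical function, and `gsed`/`greadlink` are the names under which Homebrew installs GNU sed and GNU readlink, which may not match how a given Mac is set up.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: pick utility names from the MACHINE setting.
select_utils() {
  case "$1" in
    MACOS)
      # Assumes GNU tools were installed under their "g"-prefixed names.
      SED="gsed"
      READLINK="greadlink"
      ;;
    LINUX)
      SED="sed"
      READLINK="readlink"
      ;;
    *)
      echo "Unsupported MACHINE: $1" >&2
      return 1
      ;;
  esac
}
```

After `select_utils "$MACHINE"`, later commands would call `"$SED"` and `"$READLINK"` rather than the bare utility names.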

## TESTS CONDUCTED: 
The "Generic Linux" test was run on Cheyenne (GNU 9.1.0 compilers) as a fresh install, including a stand-alone install of NCEPLIBS, with no reference to staged or pre-built input files. The workflow was not run through rocoto or by submitting jobs directly via PBS; instead, the entire workflow was run interactively on a compute node (using the `qinteractive` command), which emulated running the workflow on a machine with no job scheduler.

On MacOS (Catalina, 10.15.7) with GNU 10.1.0 compilers, I was able to generate the workflow and run it end-to-end successfully. Currently there is a bug in UFS_UTILS that makes the make_orog test fail; UFS UTILS PR245 must be merged to fix this.

## ISSUE (optional): 
Resolves #369
christinaholtNOAA pushed a commit to christinaholtNOAA/regional_workflow that referenced this issue Feb 10, 2021
…ity#402)

Resolves ufs-community#369
christinaholtNOAA referenced this issue in NOAA-GSL/regional_workflow Feb 17, 2021
* Add MACOS and Generic Linux options for regional_workflow (#402)

Resolves #369

* Source bash utils in the workflow launch script and set the workflow manager as Rocoto. (#426)

## DESCRIPTION OF CHANGES: 
Added sourcing of bash utilities to avoid an undefined-variable error for $SED when using the workflow launch script. Added Rocoto as the workflow manager on Gaea.

## TESTS CONDUCTED: 
Tested on Gaea. Release branch end-to-end tests (aside from 3km runs) were run on Hera and all passed.

## CONTRIBUTORS (optional): 
@climbfuji, @mkavulich, @gsketefian

* Run with LINUX + rocoto.

* Adding reference configs for Hera

* Updating configs to work on Hera.

* Add configurable options needed for linux.

* Add modulefiles needed for linux.

* Remove the first instance of "RUN_CMD_FCST" in var_defns.sh to avoid potential undefined variable issues (#433)

## DESCRIPTION OF CHANGES: 
It was found that if `set -u` is in the user's default bash environment, the launch script or individual run scripts will fail because a variable is used before it is defined; this is likely to occur if any of these scripts are submitted from a crontab. The cause was the way the default run command was set up for MacOS and generic LINUX platforms, which was a bit of a hack that resulted in RUN_CMD_FCST being defined twice in var_defns.sh. The fix deletes the first instance of RUN_CMD_FCST in var_defns.sh so that it no longer references an undefined variable early on.

This potential bug does not affect Tier 1 supported platforms, only MacOS and generic Linux.
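
To show the failure mode in isolation: under `set -u`, expanding an unset variable aborts the script, while the `${VAR:-default}` form expands safely. The sketch below is not the fix made in this PR (which deletes the first definition); `run_cmd_or_default` is a hypothetical helper used only to demonstrate the behavior.

```shell
#!/usr/bin/env bash
set -u  # treat expansion of unset variables as an error

# Hypothetical helper: print RUN_CMD_FCST if it is set and non-empty,
# otherwise print the fallback passed as the first argument.
run_cmd_or_default() {
  # ${RUN_CMD_FCST:-...} does not trigger the "unbound variable" error.
  printf '%s\n' "${RUN_CMD_FCST:-$1}"
}
```

By contrast, a bare `echo "$RUN_CMD_FCST"` at the same point would stop the script with an "unbound variable" error if the variable had not been defined yet.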

## TESTS CONDUCTED: 
Tested on affected MacOS platform and the fix worked. Also ran end-to-end tests on Hera and Cheyenne (still running) as a sanity check.

* Other mods for running on Hera.

* Fix needed for create_diag_table_file change.

Co-authored-by: Michael Kavulich <kavulich@ucar.edu>
Co-authored-by: JeffBeck-NOAA <55201531+JeffBeck-NOAA@users.noreply.github.com>
@mkavulich
Collaborator Author

Fixed in the release branch by #402; a fix is pending in the develop branch.

mkavulich added a commit that referenced this issue Sep 22, 2021
## DESCRIPTION OF CHANGES: 
This change will add the capability to run regional_workflow (as part of the SRW app) on MacOS and generic LINUX platforms. Most of these changes are identical to those in #402 (hash bc08607) but some additional modifications needed to be made due to intervening changes in the develop branch.

## TESTS CONDUCTED: 
Ran the Graduate Student Test on the new platforms:
 - my personal Mac (MacOS Catalina 10.15.7) with GNU 9.4.0 compilers
 - a Cheyenne compute node as a faux "stand-alone" machine, with Intel 19.1.1 compilers

Ran suite of end-to-end tests on Cheyenne (intel/19.1.1) and Hera (intel/18.0.5.274). All passed as expected.

Tests also passed on WCOSS, MacOS Mojave, RedHat Linux.

## ISSUE: 
Will resolve #369
christinaholtNOAA pushed a commit to christinaholtNOAA/regional_workflow that referenced this issue May 3, 2022