Update ecflow package for wcoss2 GFS transition #555

Merged
466 changes: 237 additions & 229 deletions ecf/defs/prod00.def

Large diffs are not rendered by default.

466 changes: 237 additions & 229 deletions ecf/defs/prod06.def

Large diffs are not rendered by default.

466 changes: 237 additions & 229 deletions ecf/defs/prod12.def

Large diffs are not rendered by default.

466 changes: 237 additions & 229 deletions ecf/defs/prod18.def

Large diffs are not rendered by default.

54 changes: 0 additions & 54 deletions ecf/include/envir-p1-old.h

This file was deleted.

41 changes: 41 additions & 0 deletions ecf/include/envir-p1.h
@@ -0,0 +1,41 @@
# envir-p1.h
export job=${job:-$PBS_JOBNAME}
export jobid=${jobid:-$job.$PBS_JOBID}

export RUN_ENVIR=emc
export envir=%ENVIR%
export MACHINE_SITE=%MACHINE_SITE%
export SENDDBN=${SENDDBN:-%SENDDBN:YES%}
export SENDDBN_NTC=${SENDDBN_NTC:-%SENDDBN_NTC:YES%}
if [[ "$envir" == prod && "$SENDDBN" == YES ]]; then
export eval=%EVAL:NO%
if [ $eval == YES ]; then
export SIPHONROOT=${UTILROOT}/para_dbn
else
export SIPHONROOT=/lfs/h1/ops/prod/dbnet_siphon
fi
export SIPHONROOT=${UTILROOT}/fakedbn
else
export SIPHONROOT=${UTILROOT}/fakedbn
fi
Comment on lines +10 to +20
Contributor

Regardless of which conditions hold, this logic always ends with:
export SIPHONROOT=${UTILROOT}/fakedbn
The unconditional assignment after the inner if/else overrides both of its branches. Also, the variable eval is exported only when envir == prod and SENDDBN is YES, presumably for use further down.
Is that the desired outcome?
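
The observation in the comment above can be checked with a small sketch that wraps the same branching in a function and runs every input combination. The function name `siphonroot_for` and the placeholder `UTILROOT` value are assumptions made for illustration; the `[[ ... && ... ]]` test from the diff is written as two `[ ]` tests so the sketch also runs under a plain POSIX shell.

```shell
# Placeholder value; the real UTILROOT comes from the job environment.
UTILROOT=/hypothetical/util

# Hypothetical wrapper around the SIPHONROOT logic shown in envir-p1.h.
siphonroot_for() {
  local envir="$1" senddbn="$2" eval_flag="$3" siphonroot
  if [ "$envir" = prod ] && [ "$senddbn" = YES ]; then
    if [ "$eval_flag" = YES ]; then
      siphonroot=${UTILROOT}/para_dbn
    else
      siphonroot=/lfs/h1/ops/prod/dbnet_siphon
    fi
    # Unconditional assignment: overrides both branches of the inner if/else.
    siphonroot=${UTILROOT}/fakedbn
  else
    siphonroot=${UTILROOT}/fakedbn
  fi
  echo "$siphonroot"
}

# Print the result for every combination of the three inputs.
for envir in prod para test; do
  for senddbn in YES NO; do
    for eval_flag in YES NO; do
      echo "envir=$envir SENDDBN=$senddbn eval=$eval_flag -> $(siphonroot_for "$envir" "$senddbn" "$eval_flag")"
    done
  done
done
```

Every printed line ends in the fakedbn path, which is the behavior the reviewer describes.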

Contributor Author

Developers will always use fakedbn for testing. The eval export is fine here.

Contributor

What does eval do? Is it used anywhere other than in the if test?
If developers will always use fakedbn, then the first if test ("$envir" == prod) is unnecessary, and this whole block boils down to:

export SIPHONROOT=${UTILROOT}/fakedbn
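
For contrast, here is a sketch of how the block would behave if the conditional selection were meant to take effect, that is, with the overriding assignment dropped. The thread does not say this is wanted (the author states that developers always use fakedbn), so this is only an illustration. The function name `select_siphonroot` and the `UTILROOT` value are hypothetical.

```shell
# Placeholder value; the real UTILROOT comes from the job environment.
UTILROOT=/hypothetical/util

# Hypothetical variant: same branches as the diff, without the final override.
select_siphonroot() {
  local envir="$1" senddbn="$2" eval_flag="$3"
  if [ "$envir" = prod ] && [ "$senddbn" = YES ]; then
    if [ "$eval_flag" = YES ]; then
      echo "${UTILROOT}/para_dbn"
    else
      echo /lfs/h1/ops/prod/dbnet_siphon
    fi
  else
    echo "${UTILROOT}/fakedbn"
  fi
}

SIPHONROOT=$(select_siphonroot prod YES NO)
export SIPHONROOT
echo "SIPHONROOT=$SIPHONROOT"
```

With this variant the three paths in the diff each become reachable, whereas the one-line form suggested in the comment always selects fakedbn.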


export DBNROOT=$SIPHONROOT

if [[ ! " prod para test " =~ " ${envir} " && " ops.prod ops.para " =~ " $(whoami) " ]]; then err_exit "ENVIR must be prod, para, or test [envir-p1.h]"; fi
export DATAROOT=/lfs/h2/emc/stmp/Lin.Gan/RUNDIRS/ecfops
export COMROOT=/lfs/h2/emc/ptmp/Lin.Gan/ecfops/com
lgannoaa marked this conversation as resolved.
export COREROOT=/lfs/h2/emc/ptmp/production.core/$jobid
export NWROOT=/lfs/h1/ops/prod
export SENDECF=${SENDECF:-YES}
export SENDCOM=${SENDCOM:-YES}
export KEEPDATA=${KEEPDATA:-%KEEPDATA:NO%}
export TMPDIR=${TMPDIR:-${DATAROOT:?}}
if [ -n "%PDY:%" ]; then
export PDY=${PDY:-%PDY:%}
export CDATE=${PDY}%CYC:%
fi
if [ -n "%COMPATH:%" ]; then export COMPATH=${COMPATH:-%COMPATH:%}; fi
if [ -n "%MAILTO:%" ]; then export MAILTO=${MAILTO:-%MAILTO:%}; fi
if [ -n "%DBNLOG:%" ]; then export DBNLOG=${DBNLOG:-%DBNLOG:%}; fi


79 changes: 51 additions & 28 deletions ecf/include/head.h
@@ -4,45 +4,68 @@ export PS4='+ $SECONDS + '

# Variables needed for communication with ecFlow
export ECF_NAME=%ECF_NAME%
#export ECF_HOST=%ECF_HOST%
export ECF_HOST=%ECF_LOGHOST%
export ECF_PORT=%ECF_PORT%
export ECF_PASS=%ECF_PASS%
export ECF_TRYNO=%ECF_TRYNO%
export ECF_RID=$LSB_JOBID

# Tell ecFlow we have started
# POST_OUT variable enables LSF post_exec to communicate with ecFlow
if [ -d /opt/modules ]; then
# WCOSS TO4 (Cray XC40)
. /opt/modules/default/init/sh
module load ecflow
POST_OUT=/gpfs/hps/tmpfs/ecflow/ecflow_post_in.$LSB_BATCH_JID
else
# WCOSS Phase 3 (Dell PowerEdge)
. /usrx/local/prod/lmod/lmod/init/sh
. /gpfs/dell1/nco/ops/nwprod/versions/ecflow_p3.ver
module load ips/$ips_ver
module load EnvVars/$EnvVars_ver
module load ecflow/$ecflow_ver
POST_OUT=/var/lsf/ecflow_post_in.$USER.$LSB_BATCH_JID
export ECF_RID=${ECF_RID:-${PBS_JOBID:-$$}}
lgannoaa marked this conversation as resolved.
export ECF_JOB=%ECF_JOB%
export ECF_JOBOUT=%ECF_JOBOUT%
export ecflow_ver=%ecflow_ver%

if [ -d /apps/ops/prod ]; then # On WCOSS2
echo "Running 'module reset'"
module reset
module load envvar/1.0
module load PrgEnv-intel/8.1.0
module load craype/2.7.8
module load intel/19.1.3.304
fi
ecflow_client --init=${ECF_RID}

cat > $POST_OUT <<ENDFILE
ECF_NAME=${ECF_NAME}
ECF_HOST=${ECF_HOST}
ECF_PORT=${ECF_PORT}
ECF_PASS=${ECF_PASS}
ECF_TRYNO=${ECF_TRYNO}
ECF_RID=${ECF_RID}
ENDFILE
export HOMEgfs=/lfs/h2/emc/global/noscrub/$USER/git/feature-ops-wcoss2
. ${HOMEgfs}/versions/run.ver
export gfs_ver=v16.2

if [ -d /apps/ops/prod ]; then # On WCOSS2
export ECF_ROOT=/apps/ops/prod/nco/core/ecflow.v5.6.0.7
. ${ECF_ROOT}/versions/run.ver
Comment on lines +30 to +31
Contributor

I don't understand what this is doing. Why the export, and why is prod_util loaded in head.h?

module load prod_util/${prod_util_ver}
module load prod_envir/${prod_envir_ver}

echo "Running module load ecflow/$ecflow_ver"
module load ecflow/$ecflow_ver
echo "ecflow module location: $(module display ecflow |& head -2 | tail -1 | sed 's/:$//')"
export ECF_ROOT=/apps/ops/prod/nco/core/ecflow.v5.6.0.7
Contributor

The ecflow version is specified in two ways: through the variable ecflow_ver and through the hard-coded 5.6.0.7. Which one is it? Please specify the version consistently.

Contributor

This variable is set by the ecflow module. It should not be overwritten. Load the appropriate module instead.
Please remove.
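
One way to act on the two comments above is to leave ECF_ROOT to the module and only verify that it agrees with the requested version. The sketch below assumes that the install path embeds the version string, as in the path shown in this diff; the helper name `ecflow_root_matches_version` and the stand-in values are hypothetical.

```shell
# Hypothetical helper: succeeds when ECF_ROOT is non-empty and its path
# contains the requested version string.
ecflow_root_matches_version() {
  local root="$1" ver="$2"
  [ -n "$root" ] && [ -n "$ver" ] || return 1
  case "$root" in
    *"$ver"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Stand-in values; in head.h these would be whatever the job environment and
# `module load ecflow/$ecflow_ver` provide.
ecflow_ver=5.6.0.7
ECF_ROOT=/apps/ops/prod/nco/core/ecflow.v5.6.0.7

if ecflow_root_matches_version "$ECF_ROOT" "$ecflow_ver"; then
  echo "ECF_ROOT is consistent with ecflow_ver=$ecflow_ver"
else
  echo "ECF_ROOT=$ECF_ROOT does not match ecflow_ver=$ecflow_ver" >&2
fi
```

A check like this reads the variable without assigning it, so the module remains the single source of the version.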

export ECF_PORT=34326
Contributor

Please remove the user-specific port setting. This value is set in the ecflow module for each user and is unique to that user.
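
Instead of hard-coding a port, the script could fail early when the value is missing from the environment. The following is a minimal sketch of such a guard; `require_env` is a hypothetical helper, not part of the PR, and the port number is a stand-in.

```shell
# Hypothetical helper: returns non-zero when the named variable is unset or empty.
require_env() {
  # Indirect lookup of the variable whose name is passed in $1.
  eval "_require_env_value=\${$1:-}"
  if [ -z "$_require_env_value" ]; then
    echo "ERROR: $1 is not set; expected it to come from the environment" >&2
    return 1
  fi
}

ECF_PORT=12345   # stand-in for a value provided by the environment
if require_env ECF_PORT; then
  echo "Using ECF_PORT=$ECF_PORT"
fi
```

The diff already uses the built-in form of this check elsewhere, for example `${DATAROOT:?}`, which aborts the script when the variable is unset or empty.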

export ECF_HOST=ddecflow02
export ECF_INCLUDE=/lfs/h2/emc/global/noscrub/$USER/git/feature-ops-wcoss2/ecf/include
Contributor

This is hardwired to a specific branch, which will no longer exist once it is merged and deleted.
I don't understand the need for this entire section (lines 40-47). It seems to be tailored to a specific branch and a specific location.
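
One possible way to address the concern above is to derive the branch-dependent paths from a single root variable, so that no branch name appears in head.h. In this sketch `GFS_CHECKOUT` is a hypothetical variable and its default is a placeholder; only the `ecf/include` layout is taken from the diff.

```shell
# GFS_CHECKOUT is a hypothetical variable naming the root of the checkout.
# The default below is a placeholder, not a real location.
GFS_CHECKOUT=${GFS_CHECKOUT:-/hypothetical/path/to/checkout}

# Paths that the diff hard-codes are derived from the single root instead.
export HOMEgfs=$GFS_CHECKOUT
export ECF_INCLUDE=$HOMEgfs/ecf/include

echo "HOMEgfs=$HOMEgfs"
echo "ECF_INCLUDE=$ECF_INCLUDE"
```

Moving to a different branch or location then means changing one value outside the include file.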

export ECF_HOME=/lfs/h2/emc/global/noscrub/$USER/ecflow/submit
export ECF_DATA_ROOT=/lfs/h2/emc/global/noscrub/$USER/ecflow
export ECF_OUTPUTDIR=/lfs/h2/emc/global/noscrub/$USER/ecflow/output
export ECF_COMDIR=/lfs/h2/emc/global/noscrub/$USER/ecflow/submit
export ECF_COMDIR=/lfs/h2/emc/ptmp/$USER/ecflow/submit
ecflow_client --alter change variable ECF_INCLUDE /lfs/h2/emc/global/noscrub/$USER/git/feature-ops-wcoss2/ecf/include /

echo "Listing modules from head.h:"
module list
fi

timeout 300 ecflow_client --init=${ECF_RID}

POST_OUT=/lfs/h2/emc/stmp/$USER/RUNDIRS/ecfops/tmp/posts/ecflow_post_in.$USER.${PBS_JOBID}
mkdir -p /lfs/h2/emc/stmp/$USER/RUNDIRS/ecfops/tmp/posts
echo 'export ECF_NAME=${ECF_NAME}' > $POST_OUT
echo 'export ECF_HOST=${ECF_HOST}' >> $POST_OUT
echo 'export ECF_PORT=${ECF_PORT}' >> $POST_OUT
echo 'export ECF_PASS=${ECF_PASS}' >> $POST_OUT
echo 'export ECF_TRYNO=${ECF_TRYNO}' >> $POST_OUT
echo 'export ECF_RID=${ECF_RID}' >> $POST_OUT
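
A note on quoting in the lines above: the removed here-document (`<<ENDFILE`, unquoted) expanded the variables when the file was written, while the new single-quoted `echo` lines write the text `${ECF_NAME}` literally. Whether the literal form is intended depends on how the file is consumed, which this diff does not show. The sketch below only demonstrates the shell behavior, using a stand-in value.

```shell
ECF_NAME=/suite/family/task   # stand-in value for illustration

# Single quotes: the text ${ECF_NAME} is written as-is, with no expansion.
single=$(echo 'export ECF_NAME=${ECF_NAME}')

# Double quotes: the variable is expanded when the line is written.
double=$(echo "export ECF_NAME=${ECF_NAME}")

echo "single quotes: $single"
echo "double quotes: $double"
```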

# Define error handler
ERROR() {
set +ex
if [ "$1" -eq 0 ]; then
msg="Killed by signal (likely via bkill)"
msg="Killed by signal (likely via qdel)"
else
msg="Killed by signal $1"
fi
7 changes: 5 additions & 2 deletions ecf/include/model_ver.h
@@ -1,2 +1,5 @@
. ${NWROOT:?}/versions/${model:?}.ver
eval export HOME${model}=${NWROOT}/${model}.\${${model}_ver:?}
# . ${NWROOT:?}/versions/${model:?}.ver
# eval export HOME${model}=${NWROOT}/${model}.\${${model}_ver:?}

. /lfs/h2/emc/global/noscrub/Lin.Gan/git/feature-ops-wcoss2/ecf/versions/${model:?}.ver
export HOMEgfs=/lfs/h2/emc/global/noscrub/Lin.Gan/git/feature-ops-wcoss2
51 changes: 51 additions & 0 deletions ecf/scripts/enkfgdas/analysis/create/jenkfgdas_diag.ecf
@@ -0,0 +1,51 @@
#PBS -S /bin/bash
Contributor

These EnKF jobs are new, and we agreed to add them in a follow-up PR to avoid further bloating this one.

#PBS -N enkfgdas_diag_%CYC%
#PBS -j oe
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=00:06:00
#PBS -l place=vscatter,select=1:mpiprocs=128:ompthreads=1:ncpus=128:mem=500GB

model=gfs
%include <head.h>
%include <envir-p1.h>

set -x

export NET=%NET:gfs%
export RUN=%RUN%
export CDUMP=%RUN%

############################################################
# Load modules
############################################################
module load cray-mpich/${cray_mpich_ver}
module load cray-pals/${cray_pals_ver}
module load cfp/${cfp_ver}
module load hdf5/${hdf5_ver}
module load netcdf/${netcdf_ver}

module list

#############################################################
# WCOSS environment settings
#############################################################
export cyc=%CYC%
export cycle=t%CYC%z
export USE_CFP=YES

############################################################
# CALL executable job script here
############################################################
${HOMEgfs}/jobs/JGDAS_ENKF_DIAG

if [ $? -ne 0 ]; then
ecflow_client --msg="***JOB ${ECF_NAME} ERROR RUNNING J-SCRIPT ***"
ecflow_client --abort
exit
fi

%include <tail.h>
%manual

%end
53 changes: 53 additions & 0 deletions ecf/scripts/enkfgdas/analysis/create/jenkfgdas_select_obs.ecf
@@ -0,0 +1,53 @@
#PBS -S /bin/bash
#PBS -N enkfgdas_select_obs_%CYC%
#PBS -j oe
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=00:10:00
#PBS -l place=vscatter,select=30:mpiprocs=16:ompthreads=8:ncpus=128:mem=500GB

model=gfs
%include <head.h>
%include <envir-p1.h>

set -x

export NET=%NET:gfs%
export RUN=%RUN%
export CDUMP=%RUN%

############################################################
# Load modules
############################################################
module load cray-mpich/${cray_mpich_ver}
module load cray-pals/${cray_pals_ver}
module load cfp/${cfp_ver}
module load python/${python_ver}
module load hdf5/${hdf5_ver}
module load netcdf/${netcdf_ver}
module load crtm/${crtm_ver}

module list

#############################################################
# WCOSS environment settings
#############################################################
export cyc=%CYC%
export cycle=t%CYC%z
export USE_CFP=YES

############################################################
# CALL executable job script here
############################################################
${HOMEgfs}/jobs/JGDAS_ENKF_SELECT_OBS

if [ $? -ne 0 ]; then
ecflow_client --msg="***JOB ${ECF_NAME} ERROR RUNNING J-SCRIPT ***"
ecflow_client --abort
exit
fi

%include <tail.h>
%manual

%end
52 changes: 52 additions & 0 deletions ecf/scripts/enkfgdas/analysis/create/jenkfgdas_update.ecf
@@ -0,0 +1,52 @@
#PBS -S /bin/bash
#PBS -N enkfgdas_update_%CYC%
#PBS -j oe
#PBS -q %QUEUE%
#PBS -A %PROJ%-%PROJENVIR%
#PBS -l walltime=00:30:00
#PBS -l place=vscatter,select=50:mpiprocs=10:ompthreads=12:ncpus=120:mem=500GB

model=gfs
%include <head.h>
%include <envir-p1.h>

set -x

export NET=%NET:gfs%
export RUN=%RUN%
export CDUMP=%RUN%

############################################################
# Load modules
############################################################
module load cray-mpich/${cray_mpich_ver}
module load cray-pals/${cray_pals_ver}
module load cfp/${cfp_ver}
module load python/${python_ver}
module load hdf5/${hdf5_ver}
module load netcdf/${netcdf_ver}

module list

#############################################################
# WCOSS environment settings
#############################################################
export cyc=%CYC%
export cycle=t%CYC%z
export USE_CFP=YES

############################################################
# CALL executable job script here
############################################################
${HOMEgfs}/jobs/JGDAS_ENKF_UPDATE

if [ $? -ne 0 ]; then
ecflow_client --msg="***JOB ${ECF_NAME} ERROR RUNNING J-SCRIPT ***"
ecflow_client --abort
exit
fi

%include <tail.h>
%manual

%end