Feature/met on hera #552

Merged
merged 4 commits into from
Jul 16, 2021

siwei-noaa
Contributor

DESCRIPTION OF CHANGES:

  1. Add an if condition in tests/run_experiments.sh to produce an informative error message when the MET and MET+ paths are not available on a machine.
  2. Remove MET/MET+ paths in tests/baseline_configs/config.verification.sh
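
A minimal sketch of the kind of guard item 1 describes, not the code merged in this PR. The function name `set_met_paths` is hypothetical, the Hera MET path is the one that appears in the diff under review, and the METplus path is a placeholder.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set machine-specific MET/MET+ locations, or fail
# with an informative message when none are known for the machine.
set_met_paths() {
  local machine="$1"
  case "${machine}" in
    HERA)
      met_install_dir="/contrib/met/10.0.0"   # path taken from the diff under review
      metplus_path="/path/to/METplus"         # placeholder, not a real location
      ;;
    *)
      printf 'ERROR: MET and MET+ paths are not specified for MACHINE="%s".\n' \
        "${machine}" >&2
      return 1
      ;;
  esac
}

# Example: a machine with no known MET installation triggers the error.
if set_met_paths "JET"; then
  echo "MET_INSTALL_DIR=${met_install_dir}"
else
  echo "Stopping before any workflow is generated."
fi
```

Returning a nonzero status from the helper lets the caller decide whether to exit, which keeps the check usable from both scripts and interactive shells.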

TESTS CONDUCTED:

A test run was conducted and finished successfully.

DEPENDENCIES:

To have MET verification run successfully, the observational data (e.g., CCPA, MRMS, NDAS) must be available.

DOCUMENTATION:

N/A

ISSUE (optional):

This is a follow up PR to complete the previous one in https://github.com/NOAA-EMC/regional_workflow/pull/537

CONTRIBUTORS (optional):

@gsketefian contributed the revision.

#
METPLUS_PATH=\"${metplus_path}\"
MET_INSTALL_DIR=\"${met_install_dir}\""

Collaborator

What's the point of these two lines?

METPLUS_PATH="${metplus_path}"
MET_INSTALL_DIR="${met_install_dir}""

I don't see where the lower-case versions of these variables are defined.

Contributor Author

I made a mistake in defining the variables. Lines 804 and 805 should use the lower-case names. I will correct this.

tests/run_experiments.sh
#
if [ "${RUN_TASK_VX_GRIDSTAT}" = "TRUE" ] || [ "${RUN_TASK_VX_POINTSTAT}" = "TRUE" ]; then
if [ "$MACHINE" = "HERA" ]; then
MET_INSTALL_DIR="/contrib/met/10.0.0"
Collaborator

@gsketefian gsketefian Jul 16, 2021

I am not sure how the test worked if on lines 792 and 793, MET_INSTALL_DIR and METPLUS_PATH are defined (instead of met_install_dir and metplus_path), but on the right-hand sides of lines 804 and 805, the lower-case variables are used. That should have failed with an "undefined variable" error (because set -u is activated at the start of this script). Use the lowercase ones on lines 792-793.
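
To illustrate the point about `set -u`, here is a minimal sketch. It runs the statements in a child shell so the failure does not stop the calling script; the helper name `check_unset_reference` is hypothetical, and the variable names and path mirror those in the diff.

```bash
#!/usr/bin/env bash
# Run a small script in a child bash with "set -u" enabled. The uppercase
# variable is defined, but the lowercase one is referenced, which bash
# treats as an error ("unbound variable") under set -u.
check_unset_reference() {
  bash -c '
    set -u
    MET_INSTALL_DIR="/contrib/met/10.0.0"
    echo "MET_INSTALL_DIR=${met_install_dir}"
  ' 2>/dev/null
}

if check_unset_reference; then
  echo "reference succeeded"
else
  echo "reference failed: met_install_dir is unset"
fi
```

Because the child shell exits with a nonzero status at the unset reference, the `else` branch is the one that runs.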

Collaborator

Also, the thing you should test is running this on a machine other than Hera to see if this error-catching actually works.

@gsketefian
Collaborator

@siwei-noaa Can you also run this on a machine other than Hera to see if this error-catching actually works? Thanks.

@siwei-noaa
Contributor Author

Yes, I can test this on Jet. Do we need observation data to trigger the expected failure, or is that not necessary? Have you ever run the regional workflow on Jet? If so, I would like to mimic your configuration. Thank you.

@gsketefian
Collaborator

@siwei-noaa You don't need observation data because the workflow should not even be generated. You don't even need to build the code. Just clone it, do the manage_externals step, then try running run_experiments.sh just for the config.MET_verification.sh test. You should get an error (the error message you put in as part of this PR).

We have run regional_workflow on Jet (@JeffBeck-NOAA runs it all the time), but you probably don't need to get all his configuration settings, etc.

@siwei-noaa
Contributor Author

@gsketefian Could you please tell me how to run run_experiments.sh quickly? I just tried and got an error.

@gsketefian
Collaborator

@siwei-noaa Can you point me to the directory on Jet in which you're running run_experiments.sh? What is the error you're getting (you're supposed to get an error)?

@siwei-noaa
Contributor Author

The dir is: /mnt/lfs4/BMC/wrfruc/she/ufs-srweather-app/regional_workflow/
I just did ./run_experiments.sh
The error asked me for the machine and account name. I have them in ush/config.sh, but I do not know how to connect config.sh to run_experiments.sh.
@gsketefian

@gsketefian
Collaborator

@siwei-noaa First, create under the tests directory a text file named my_expts.txt that contains the single line "MET_verification". Then run run_experiments.sh as follows (replace "rtrr" with the name of your account):

./run_experiments.sh expts_file="my_expts.txt" machine=jet account=rtrr

On Jet, I copied your directory to mine and ran the test, and it gave the expected error (see below), so you're actually good to go! You can try the above command too to make sure you get the same behavior as I did. Approving now.

[Gerard.Ketefian@(jet)fe7] /lfs1/.../regional_workflow/tests (feature/MET_on_Hera)
$ pwd
/mnt/lfs1/BMC/gsd-fv3/Gerard.Ketefian/UFS_CAM/siwei/ufs-srweather-app/regional_workflow/tests

[Gerard.Ketefian@(jet)fe7] /lfs1/.../regional_workflow/tests (feature/MET_on_Hera)
$ ./run_experiments.sh expts_file="my_expts.txt" machine=jet account=rtrr

The arguments to the script in file

  "/mnt/lfs1/BMC/gsd-fv3/Gerard.Ketefian/UFS_CAM/siwei/ufs-srweather-app/regional_workflow/tests/run_experiments.sh"

have been set as follows:

  declare -- expts_file="my_expts.txt"
  declare -- machine="jet"
  declare -- account="rtrr"
  declare -- expt_basedir=""
  declare -- testset_name=""
  declare -- use_cron_to_relaunch=""
  declare -- cron_relaunch_intvl_mnts=""
  declare -- verbose=""
  declare -- stmp=""
  declare -- ptmp=""

Reading in list of forecast experiments from file
  expts_list_fp = "/mnt/lfs1/BMC/gsd-fv3/Gerard.Ketefian/UFS_CAM/siwei/ufs-srweather-app/regional_workflow/tests/my_expts.txt"
and storing result in the array "all_lines" (one array element per expe-
riment)...

All lines from experiments list file (expts_list_fp) read in, where:
  expts_list_fp = "/mnt/lfs1/BMC/gsd-fv3/Gerard.Ketefian/UFS_CAM/siwei/ufs-srweather-app/regional_workflow/tests/my_expts.txt"
Contents of file are (line by line, each line within single quotes, and
before any processing):

'MET_verification'


After processing, the number of experiments to run (num_expts) is:
  num_expts = 1
The list of forecast experiments to run (one experiment per line) is gi-
ven by:
  'MET_verification'


Processing experiment "MET_verification" ...

ERROR:
  From script:  "run_experiments.sh"
  Full path to script:  "/mnt/lfs1/BMC/gsd-fv3/Gerard.Ketefian/UFS_CAM/siwei/ufs-srweather-app/regional_workflow/tests/run_experiments.sh"
The MET and MET+ paths (MET_INSTALL_DIR and MET_INSTALL_DIR) have not been specified for
this machine (MACHINE): MACHINE= "JET"
Exiting with nonzero status.



@siwei-noaa
Contributor Author

Perfect, thank you. @gsketefian

@gsketefian gsketefian merged commit 140f1b0 into ufs-community:develop Jul 16, 2021
@chan-hoo
Collaborator

@siwei-noaa @gsketefian @JeffBeck-NOAA, have you run this WE2E test on Hera? I got an error saying the model files do not exist:
extrn_mdl_source_dir = "/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files/FV3GFS/2019041500". I can't see the data directory for the date (20190415) on Hera.

@gsketefian
Collaborator

@chan-hoo I did run @siwei-noaa's experiment on Hera (see my reply to him above). It should give the following error since MET and MET+ have not been properly set up on Hera:

ERROR:
  From script:  "run_experiments.sh"
  Full path to script:  "/mnt/lfs1/BMC/gsd-fv3/Gerard.Ketefian/UFS_CAM/siwei/ufs-srweather-app/regional_workflow/tests/run_experiments.sh"
The MET and MET+ paths (MET_INSTALL_DIR and MET_INSTALL_DIR) have not been specified for
this machine (MACHINE): MACHINE= "JET"
Exiting with nonzero status.

I did not get to the point of your error; it shouldn't get to that point. Do you have an experiment directory on Hera I can look at?

@chan-hoo
Collaborator

@gsketefian, here is my dir on Hera: /scratch2/NCEPDEV/fv3-cam/Chan-hoo.Jeon/ufs_srw_app/srw_dev_test/expt_dirs/MET_verification

@gsketefian
Collaborator

@chan-hoo Sorry, I got confused. It's supposed to work on Hera and fail on other machines. I guess then this is a question for @siwei-noaa. @siwei-noaa, which date did you use for your test?

@chan-hoo
Collaborator

@siwei-noaa, I think you may have used your own config.sh file or did not update the 'config.MET_verification.sh' file. Neither that file nor 'run_experiments.sh' specifies 'CCPA_OBS_DIR', 'MRMS_OBS_DIR', or 'NDAS_OBS_DIR'. I used the data in '/scratch2/BMC/det/UFS_SRW_app/v1p0/obs_data/' on Hera for testing the verification task.

@gsketefian
Collaborator

@chan-hoo I'm guessing @siwei-noaa didn't use the run_experiments.sh script but instead called generate_FV3LAM_wflow.sh using his own config.sh. But he can confirm. If so, this PR needs to be retested using run_experiments.sh.

@chan-hoo
Collaborator

@gsketefian, I was adding wcoss part to run_experiments.sh. If you and @siwei-noaa agree, I'll update run_experiments.sh in my PR.

@gsketefian
Collaborator

gsketefian commented Jul 22, 2021

@chan-hoo You mean in a new PR, you're going to add wcoss stanzas to run_experiments.sh? That works for me.

But we still need to make sure that MET_verification works on Hera. It seems like a matter of just adding the FV3GFS files for 2019041500 to the staged-data directory. @siwei-noaa probably has those somewhere since he ran the case successfully somehow.

And I guess also adding the variables 'CCPA_OBS_DIR', 'MRMS_OBS_DIR', and 'NDAS_OBS_DIR' somewhere, like @chan-hoo said.

@chan-hoo
Collaborator

chan-hoo commented Jul 22, 2021

@gsketefian, ok. I'll wait until @siwei-noaa resolves this issue, and then update it for wcoss.

@gsketefian
Collaborator

@siwei-noaa @chan-hoo I created a new issue for this (#560). Things have changed a bit since I merged in my big WE2E reorg PR (#531). The issue has more info, so please take a look. Thx.

@siwei-noaa
Contributor Author

@chan-hoo and @gsketefian, sorry for the late response; I was on annual leave last week.

First, I would like to confirm some information about my WE2E test on Hera: (1) my test used date 20190415, (2) I did not run run_experiments.sh but instead called generate_FV3LAM_wflow.sh, (3) I used staged files from "/scratch2/BMC/fv3lam/RRFS_baseline/model_data/FV3GFS" and observation files from "/scratch2/BMC/fv3lam/RRFS_baseline/obs_data/". These directories only have files for our RRFS baseline experiments, and I do not know the directories for general use.

Then, could you please tell me what I should do next? I saw that https://github.com/NOAA-EMC/regional_workflow/issues/560 wants to include CCPA_OBS_DIR, MRMS_OBS_DIR, and NDAS_OBS_DIR. As I mentioned above, do we have a general directory that includes these observations?

@chan-hoo
Collaborator

@siwei-noaa, as can be seen in 'regional_workflow/tests/WE2E/run_WE2E_tests.sh' (Lns 932-945), the user-staged external data should be located in the designated directories. If you have permission, you can put the data for 20190415 into those directories. You should add the *_OBS_DIR parameters somewhere between Lns 1019-1039.

@gsketefian
Collaborator

@siwei-noaa @chan-hoo I fetched the 20190415 nemsio tar file from HPSS, and it contained only the analysis, no forecast files. So LBCs couldn't be generated. That's why I then tried another date a year later -- 20200415. That worked, and the data is in that "staged_extrn_mdl_files" directory on Hera. Not sure how @siwei-noaa generated the LBCs. See Issue #560 for details.

@siwei-noaa
Contributor Author

@gsketefian @chan-hoo I did not generate the LBCs. All these files are from Jamie Wolff (/scratch2/BMC/fv3lam/RRFS_baseline/model_data/FV3GFS), and they are ready to be used.

@siwei-noaa
Contributor Author

@chan-hoo Thank you for the information on run_WE2E_tests.sh. May I ask whether /scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files is a permanent directory for staged files? I am asking because I think the dates of the staged and observation files would vary case by case. Or should we stay on 20190415?

@gsketefian
Collaborator

@siwei-noaa On Hera, the directory
/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files
is where we currently keep all the external model files needed for the WE2E tests. I am currently trying to get a role account (on Hera, Jet, and Orion) that a few of us can use so we can move the contents of this directory under that account.

I wonder how Jamie got the LBC files. She may have done something that is not doable as part of the workflow. @jwolff-ncar , can you say how you got the FV3GFS forecasts from which to generate LBCs? They are not in the nemsio tar file that the workflow gets from NOAA-HPSS.

@jwolff-ncar
Contributor

To keep the tests consistent and to avoid adding more data to disk, shouldn't we simply use the GST date (20190615)? The data we need for any WE2E test should be staged on Hera at /scratch2/BMC/det/UFS_SRW_app/v1p0/obs_data/. The data we are using for the RRFS baselining was from Jili (we did not pull that data ourselves).

@chan-hoo
Collaborator

I agree with @jwolff-ncar.

@gsketefian
Collaborator

@jwolff-ncar Yes, that is fine. I just guessed another date thinking that's the time of year you wanted.

The tests by default call the make_ics and make_lbcs tasks, so we would normally start from the FV3GFS files and generate the ICs and LBCs from those (instead of starting with pre-made IC and LBC files). @siwei-noaa, can you change the date in config.MET_verification.sh to the GST date and set any other machine-specific parameters in run_WE2E_tests.sh (e.g. CCPA_OBS_DIR, MRMS_OBS_DIR, and NDAS_OBS_DIR, any others?) right after where the MET and MET+ directories are set for Hera (do it the same way MET_INSTALL_DIR and METPLUS_PATH are set, i.e. lowercase variables first, then further below use uppercase)? Then try running the MET_verification test using run_WE2E_tests.sh:

> cd regional_workflow/tests/WE2E
> more my_tests.txt
MET_verification
> ./run_WE2E_tests.sh tests_file=my_tests.txt machine=hera account=rtrr

Thanks. Please let me know if you have any questions.
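
The lowercase-then-uppercase pattern described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the contents of run_WE2E_tests.sh: the function name `append_vx_settings`, the `ccpa`/`mrms`/`ndas` subdirectory names, and the scratch config file are hypothetical; only the obs_data base directory comes from this thread.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: set machine-specific values in lowercase variables,
# then write them to an experiment config file under uppercase names.
append_vx_settings() {
  local machine="$1" config_fp="$2"
  local obs_basedir ccpa_obs_dir mrms_obs_dir ndas_obs_dir

  if [ "${machine}" = "HERA" ]; then
    obs_basedir="/scratch2/BMC/det/UFS_SRW_app/v1p0/obs_data"  # from this thread
    ccpa_obs_dir="${obs_basedir}/ccpa"   # subdirectory names are assumptions
    mrms_obs_dir="${obs_basedir}/mrms"
    ndas_obs_dir="${obs_basedir}/ndas"
  else
    echo "ERROR: observation directories not specified for MACHINE=\"${machine}\"." >&2
    return 1
  fi

  # Append the uppercase settings that the experiment configuration would read.
  cat >> "${config_fp}" <<EOF
CCPA_OBS_DIR="${ccpa_obs_dir}"
MRMS_OBS_DIR="${mrms_obs_dir}"
NDAS_OBS_DIR="${ndas_obs_dir}"
EOF
}

# Example: write the settings to a scratch file and show the result.
tmp_config="$(mktemp)"
append_vx_settings "HERA" "${tmp_config}" && cat "${tmp_config}"
rm -f "${tmp_config}"
```

Keeping the machine-specific values in lowercase variables means the uppercase names only ever appear in the generated config, which avoids the mix-up flagged earlier in this review.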

@siwei-noaa
Contributor Author

@gsketefian Sure, I can do this. Should my changes be based on https://github.com/NOAA-EMC/regional_workflow/issues/560 or on the latest develop branch?

@gsketefian
Collaborator

@siwei-noaa Please start with the latest develop branch (any new PR you want to create you should start with the latest develop branch).

@siwei-noaa
Contributor Author

@gsketefian Should I create 'my_tests.txt' myself? It seems this file does not exist.

@gsketefian
Collaborator

@siwei-noaa Yes, you have to create it (and call it whatever you want, just use the same name in the call to run_WE2E_tests.sh). It is just a text file containing the names of the tests you want to run. In this case, it contains just one line that says "MET_verification".
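
For example, the file can be created from the shell like this. A minimal sketch: `write_tests_file` is a hypothetical helper, and the file name is arbitrary.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a tests file with one test name per line.
write_tests_file() {
  local tests_fp="$1"
  shift
  printf '%s\n' "$@" > "${tests_fp}"
}

# Example: a file that lists only the MET_verification test.
tests_dir="$(mktemp -d)"
write_tests_file "${tests_dir}/my_tests.txt" "MET_verification"
cat "${tests_dir}/my_tests.txt"

# The file would then be passed to the test script, as in the command
# quoted earlier in this thread (not executed here):
#   ./run_WE2E_tests.sh tests_file=my_tests.txt machine=hera account=rtrr
rm -rf "${tests_dir}"
```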

@chan-hoo
Collaborator

@siwei-noaa, you can refer to my doc (Section 4.4): https://drive.google.com/file/d/12r9mSHSYgI5O3Whgeit9pIGmfIcoRm7T/view?usp=sharing

and my sample file on my github:
https://github.com/chan-hoo/regional_workflow_config/tree/main/hera/WE2E

The structure of WE2E has been changed recently. So, it is not the same as that in the user's guide of SRW App.

@siwei-noaa
Contributor Author

@gsketefian and @chan-hoo I see, thank you for the information.

@siwei-noaa
Contributor Author

@gsketefian It seems "/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files/FV3GFS/" does not have the required IC files for the GST date (20190615). Should I still use GST date rather than 20200415? However, the issue of using 20200415 is that the observations are not available at "/scratch2/BMC/det/UFS_SRW_app/v1p0/obs_data/".

@jwolff-ncar
Contributor

jwolff-ncar commented Jul 29, 2021 via email

@siwei-noaa
Contributor Author

Thank you for the information, @jwolff-ncar. However, it seems that this directory is also missing certain required files, such as gfs.atmanl.nemsio.
@gsketefian, should I update "extrn_mdl_source_basedir" to "/scratch2/BMC/det/UFS_SRW_app/v1p0/model_data" in run_WE2E_tests.sh?

@gsketefian
Collaborator

@siwei-noaa It looks like the files do exist in that directory that's in my space. They're here:
/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files/FV3GFS/2019061500

I think the directory "/scratch2/BMC/det/UFS_SRW_app/v1p0/model_data" is just for the v1.0 release. If you change "extrn_mdl_source_basedir" to "/scratch2/BMC/det/UFS_SRW_app/v1p0/model_data", you will probably break many of the other tests since the files for those are in my directory. We will need to create a new data directory in a common space for the tests in the develop branch and move into it the files for only those dates that are needed (since we will probably end up removing some of the tests). I am waiting on getting a role account to do that (Curtis gave his approval, so it should be any time now).

Since my directory already contains the files you need, your test should work. Can you say what the error you're getting is?

@siwei-noaa
Contributor Author

@gsketefian. I see. Here is the error:

File fp does NOT exist on disk:
  fp = "/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files/FV3GFS/2019061500/gfs.atmanl.nemsio"
Please ensure that the directory specified by extrn_mdl_source_dir exists
and that all the files specified in the array extrn_mdl_fns_on_disk exist
within it:
  extrn_mdl_source_dir = "/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files/FV3GFS/2019061500"
  extrn_mdl_fns_on_disk = ( "gfs.atmanl.nemsio" "gfs.sfcanl.nemsio" )
Exiting with nonzero status.
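
A quick way to see which of the expected files are missing before rerunning the test. This is a sketch, not part of the workflow: `list_missing_files` is a hypothetical helper, and the directory and file names are copied from the error above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each expected file that is absent from a
# directory, and return nonzero if any are missing.
list_missing_files() {
  local source_dir="$1"
  shift
  local fn num_missing=0
  for fn in "$@"; do
    if [ ! -f "${source_dir}/${fn}" ]; then
      echo "missing: ${source_dir}/${fn}"
      num_missing=$((num_missing + 1))
    fi
  done
  [ "${num_missing}" -eq 0 ]
}

# Example with the directory and file names from the error message.
extrn_mdl_source_dir="/scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/staged_extrn_mdl_files/FV3GFS/2019061500"
extrn_mdl_fns_on_disk=( "gfs.atmanl.nemsio" "gfs.sfcanl.nemsio" )
list_missing_files "${extrn_mdl_source_dir}" "${extrn_mdl_fns_on_disk[@]}" \
  || echo "Some external model files are missing."
```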

@gsketefian
Collaborator

@siwei-noaa I think that error can be fixed if we specify the file format to be grib2. Can you replace the two lines

EXTRN_MDL_NAME_ICS="FV3GFS"
EXTRN_MDL_NAME_LBCS="FV3GFS"

in config.MET_verification.sh with these 4 lines and retry?

EXTRN_MDL_NAME_ICS="FV3GFS"
FV3GFS_FILE_FMT_LBCS="grib2"
EXTRN_MDL_NAME_LBCS="FV3GFS"
FV3GFS_FILE_FMT_ICS="grib2"

Thanks.

SamuelTrahanNOAA pushed a commit to SamuelTrahanNOAA/regional_workflow that referenced this pull request Sep 15, 2023