[develop] Introduce DA data preprocessing to workflow #778

mkavulich · 2023-05-08T00:58:00Z

DESCRIPTION OF CHANGES:

This is a set of contributions as part of the RRFS/dev merge. The major capabilities include:

A new task (get_da_obs) that retrieves and stages observation files for the subsequent data processing tasks. Most of this logic is existing consolidated logic that has been moved out of the process* tasks.
Add workflow entries for running get_da_obs, process_radarref, process_lightning, and process_bufrobs tasks
Add a new config.rrfs.yaml file for RRFS-specific options; this will be expanded over time as more capabilities are added.
Add a new test (process_obs) to test these initial observation processing tasks. Note that in order to run this test, the app must be built with the rrfs_utils and gsi builds, which is not part of the default build at this time.
More verbose errors for run_WE2E_tests.py
Improvements to retrieve_data.py
- Rename --file_type and --external_model arguments to --file_fmt and --data_type, respectively. These are less confusing option names since both model data and obs can be retrieved with this script.
- Fix bug where script was not looking for files in additional locations if they exist
- Fix bug where script failed if any files are missing from zip archive; this should not be a failure condition unless all are missing
- Make some arguments optional if it's possible to continue without them
- Give more helpful/verbose error messages if an invalid/unavailable data store was selected
- Add new unit tests "test_rap_obs_from_hpss" and "test_rap_e_obs_from_hpss"

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)

TESTS CONDUCTED:

Ran fundamental tests on Hera and Jet, all passed. Ran new "process_obs" test on Hera and Jet that just tests the observation preprocessing tasks, confirmed that all expected tasks ran and completed successfully.

DEPENDENCIES:

None

DOCUMENTATION:

None

ISSUE:

Resolves #675

CHECKLIST

My code follows the style guidelines in the Contributor's Guide
I have performed a self-review of my own code using the Code Reviewer's Guide
I have commented my code, particularly in hard-to-understand areas
My changes do not require updates to the documentation (explain).
- RRFS documentations will be updated when merge complete
My changes generate no new warnings
New and existing tests pass with my changes

CONTRIBUTORS (optional):

Workflow changes contributed by @christinaholtNOAA.

- "archive_internal_dir" must be an empty string for top-level archive files - Rename "file_type" argument to "file_fmt" - Rename "external_model" argument to more generic/correct "data_type" - If only some files can not be un-zipped, do not fail. We'll handle this more gracefully later if it needs to be an error - Remove default location logic for now; couldn't get it working - Re-add explicit disk path requirement; may re-visit later - For fatal error, do not imply that script continues - Remove debug prints

"rap_e" files in addition to "rap".

specifically HPSS fails with some kind of logging error. Needs more investigation. This commit removes some un-needed, un-initialized variables, adds modulefiles for the new task on each platform, and adds some unit tests for retrieve_data.py to test new capabilities. The new task entry has been added to parm/wflow/coldstart.yaml for testing, but it should be removed prior to merging this PR since the task is not integrated with the rest of the workflow yet.

… tasks going forward. Clearly the boundary conditions tasks will need to eventually change also.

- Link obs needed for task_process_bufr - Put obs in $DATA/obs rather than $DATA/gsi - Put in logic for getting "rap_e" files at 00 and 12z cycles; I hope we can extract this logic into the config file later with Jinja - Add "task_process_bufr" entry to warmstart.yaml with a dependency on task get_da_obs - Add "CYCLE_TYPE" variable to config_defaults - Start a new config.rrfs.yaml file

- Set machine-specific paths in machine files, this should have been obvious... - Use RRFS_NA_13km domain for RRFS - Re-organize Jet and Hera machine files for section consistency - Fix grammar error in retrieve_data.py

- Move date variables to J-job level - Keep da_obs in same directory for community and nco mode (since process_bufr does the same) - In process_bufr, look for da_obs in staging directory

- Add default directory to Jet machine file...only available there for now - Add task_process_lightning to warmstart workflow

…ve to Hera. Also fix config.rrfs.yaml to better settings for test case.

…to avoid duplicate error messages, search for RRFS data on disk as well as hpss

… on Hera!

Removing the CYCLE_TYPE from config_defaults because it is a run-time, cycle-dependent setting, not an experiment level one.

This is similar to what is done for external model files so that we can use the machine file information about which data stores a platform has access to.

Matching new standard outlined in Issue ufs-community#633

…obs_task Had to manually revert some renaming changes in ush/test_retrieve_data.py

…obs_task

MichaelLueken · 2023-05-08T15:30:44Z

@mkavulich Following the merging of PR #728 this morning, there is now a conflict in ush/test_retrieve_data.py. Please merge the latest develop into your feature/add_get_obs_task branch at your earliest convenience.

…obs_task

…ata isnt ready by 00z two days later

…l problems when running all unit tests

mkavulich · 2023-05-10T03:22:33Z

@MichaelLueken I have merged in the latest changes from develop, and fixed some problems with the merged unit tests. I can not figure out why the linter is unhappy (I ran the linter on Jet for develop and got 9.74/10, so I don't know what I'm supposed to be aiming for). I have asked @christinaholtNOAA to take a look.

MichaelLueken · 2023-05-10T13:58:26Z

@mkavulich It looks like the new linting GitHub Action test gave you a score of 9.98/10. I see the following message:

************* Module test_python.test_calculate_cost
tests/test_python/test_calculate_cost.py:1:0: R0801: Similar lines in 2 files
==test_python.test_set_FV3nml_ens_stoch_seeds:30
==test_python.test_set_FV3nml_sfc_climo_filenames:26
        test_dir = os.path.dirname(os.path.abspath(__file__))
        USHdir = os.path.join(test_dir, "..", "..", "ush")
        PARMdir = os.path.join(USHdir, "..", "parm")

        # Create a temporary experiment directory structure
        self.tmp_dir = tempfile.TemporaryDirectory(
            dir=os.path.dirname(__file__),
            prefix="expt",
            )
        EXPTDIR = self.tmp_dir.name (duplicate-code)

-----------------------------------

christinaholtNOAA · 2023-05-10T14:22:47Z

@mkavulich may have been the first one to check my work on-prem. I've pinned the GitHub actions to pylint v2.6, which is consistent with what's available in an old version of regional_workflow on Jet/Hera. I need to open an Issue/PR and bump that version to the pylint version available in the appropriate supported environments. This should help folks debug locally before pushing.

I'm sorry for the inconvenience.

Edit: I opened Issue #787 to document the need for this change.

EdwardSnyder-NOAA · 2023-05-10T17:56:40Z

Ran the process_obs WE2E test, config.rrfs.yaml experiment, and test_retrieve_data.py unit tests on Hera without any issues. All code checks out as well. The only suggestion I have is to perhaps rename the config.rrfs.yaml to something else, since we aren't releasing the full RRFS app, and the naming could be deceiving.

…obs_task

EdwardSnyder-NOAA · 2023-05-16T15:28:03Z

Approving based on our talk yesterday that the config.rrfs.yaml will be renamed.

tests/test_python/test_retrieve_data.py

christinaholtNOAA · 2023-05-18T15:10:06Z

It looks like many of the major platform/compiler combos passed the Jenkins tests and Gaea's lack of Intel compiler module held up the others (best I can tell).

Can anyone help us get this PR through the pipeline given that the PR itself looks fine, but not Gaea?

natalie-perlin · 2023-05-18T16:18:14Z

@christinaholtNOAA - Gaea has been updated, and new regression tests now pass with the newly updated stack for the C3 and C4 partitions, see WM issue-1755 (1)
, WM issue-1755 (2)

At this point it is a matter of updating the ufs-weather-model repository pointing to the new software stack.

christinaholtNOAA · 2023-05-18T16:24:46Z

Okay, but what does that mean for this PR?

natalie-perlin · 2023-05-18T16:32:21Z

@christinaholtNOAA - Could Gaea possibly be added to the testing process? (still pending updates to the stack location)

christinaholtNOAA · 2023-05-18T17:12:23Z

@natalie-perlin Are you suggesting that we update the version of the weather model in this PR to make the Gaea tests pass?

That is way outside the scope of the work in this PR, and it would be great if we could find a solution that lets us merge the current PR as-is since the failures are due to factors outside our control and we have the approvals.

We're also only adding new capabilities here, not modifying old ones.

natalie-perlin · 2023-05-18T17:30:03Z

@christinaholtNOAA - Oh, the question was whether any changes in machine file for Gaea are expected, but not implemented due to issues following the upgrade. If no changes are expected for Gaea, and it only holds the PR from being fully passed, then no objections!

MichaelLueken · 2023-05-19T13:18:06Z

On Cheyenne GNU, the nco_grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15_thompson_mynn_lam3km test failed intermittently in run_post. This is an expected failure.

The Gaea tests will fail until PR #799 is merged.

On Hera GNU, the get_from_NOMADS_ics_FV3GFS_lbcs_FV3GFS test failed in the get_extrn_ics/lbcs tasks. Since this test has been added, it fails intermittently with this same issue.

On Hera Intel, the grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1beta test is failing in the run_fcst task. This is an expected failure.

Since all of the failures are expected, I will go ahead and merge this work now.

…community#778) * Construct var_defns components from dictionary. * Bring back config_defaults.yaml * Add support for sourcing yaml file into shell script. * Remove newline for printing config, json config fix. * Make QUILTING a sub-dictionary in predef_grids * Reorganize config_defaults.yaml by task and feature. * Bug fix with QUILTING=true. * Structure a dictionary based on a template dictionary. * Convert all WE2E config files to yaml. * Take care of problematic chars when converting to shell string. * Process only selected keys of config. * Add symlinked yaml config files. * Actually use yaml config files for WE2E tests. * Delete all shell WE2E configs. * Don't check for single quotes in test description. * Make WE2E work with yaml configs. * Make yaml default config format. * Bug fix in run_WE2E script. * Add utility to check validity of yaml config file. * Add config utility interface in ush directory. * Remove unused check_expt_config_vars script. * Add description to default config. * Reorganize source_config. * Add XML as one of the config formats. * Update custom_ESGgrid config. * Bug fix due to update. * Change ensemble seed. * Change POST_OUTPUT group due to merge. * Make xml and ini configs work. * Maintain config structure down to var_defns. * Add function to load structured shell config, put description under metadata * Flatten dicts before importing env now that shell config is structured. * Support python regex for selecting dict keys. * Add capability of sourcing task specific portion of config file. * Access var_defns via env variable. * Make names of tasks consistent with ex- and j- job script names. * Append pid to temp file. * Prettify user config, don't use " in xml texts. * Compare timestamp of csv vs all files instead of directory. * Fixes for some pylint suggestions. * Convert new configs to yaml. * Format python files with black (no functional change). * More readable yaml/json formats by using more data types. Only datetime type is now in quotes. * More readable yaml config files for WE2E and default configs. * Make config_defaults itself more readable. * Correct pyyaml list indentation issue. * Fix indentation in all config files. * Use unquoted WTIME in config_defaults * Cosmotic changes. * Fix due to merge. * Make __init__.py clearer. * Fixes due to merge. * Minor edits of comments. * Remove wcoss_dell_p3 from workflow (ufs-community#810) * remove wcoss_dell_p3 * remove block for tide and gyre * Replace deprecated NCAR python environment with conda on Cheyenne (ufs-community#812) * Fix issue on get_extrn_lbcs when FCST_LEN_HRS>=40 with netcdf (ufs-community#814) * activate b file on hpss for >40h * add a new we2e test for fcst_len_hrs>40 * reduce fcst time for we2e * Convert new test case to yaml. * Fix formatting due to merge. * Convert new test case to yaml. * Fix unittest. * Merge develop * Remove exception logic from __init__.py * Minor change to cmd concat. * Make grid gen methods return dictionary, simplifis code a lot. * Add a comment why we are suppressing yaml import exception. * Minor change to beautify unittest output. * Add status badge for functional tests. * Reorder tasks in config_default and we2e test cases to match order in FV3LAM.xml * Keep single quotes and newlines in we2e test description. * Revert back to not rounding to 10 digits Co-authored-by: Chan-Hoo.Jeon-NOAA <60152248+chan-hoo@users.noreply.github.com> Co-authored-by: Michael Kavulich <kavulich@ucar.edu>

mkavulich and others added 29 commits April 19, 2023 20:27

First draft, doesnt work

7cbc445

Emergency commit, still not working

340f9ab

First draft of get obs exscript. Modified data_locations to look for

61af001

"rap_e" files in addition to "rap".

It works! Just needed to specify HPSS partition

5429625

Start new workflow yaml "warmstart.yaml" that will be used for GSI/DA…

d0472d8

… tasks going forward. Clearly the boundary conditions tasks will need to eventually change also.

Prevent user from entering empty --tests argument to run_WE2E_tests.py

7471a0a

Fix incorrect path after rebase, add alternate path for older RAP data

9d1b37b

Some better design choices after putting some more thought into it

158b7e1

- Set machine-specific paths in machine files, this should have been obvious... - Use RRFS_NA_13km domain for RRFS - Re-organize Jet and Hera machine files for section consistency - Fix grammar error in retrieve_data.py

Data now successfully staged for process_bufr task!

24aac12

- Move date variables to J-job level - Keep da_obs in same directory for community and nco mode (since process_bufr does the same) - In process_bufr, look for da_obs in staging directory

Add NLDN lightning data to get_da_obs

a99fd5b

- Add default directory to Jet machine file...only available there for now - Add task_process_lightning to warmstart workflow

Adding NLDN lightning to get_da_obs. Not fully tested yet, need to mo…

6679d59

…ve to Hera. Also fix config.rrfs.yaml to better settings for test case.

Use "else" statement in final try block for generate_FV3LAM_wflow.py …

64dbc63

…to avoid duplicate error messages, search for RRFS data on disk as well as hpss

Hot diggity, got the process_bufr and process_lightning tasks working…

1f1c497

… on Hera!

Untested first pass at defining the early workflow

252eade

Partial incorporation of Christina's changes from ufs-community#725

a612a0f

CYCLE_TYPE is a run-time arg.

870722e

Removing the CYCLE_TYPE from config_defaults because it is a run-time, cycle-dependent setting, not an experiment level one.

Make paths and data stores configurable.

769c1af

This is similar to what is done for external model files so that we can use the machine file information about which data stores a platform has access to.

Remove duplicative information from machine files.

de99914

Renaming process_bufr to process_bufrobs.

4f56a54

Matching new standard outlined in Issue ufs-community#633

Working prepbufr getting and processing workflow.

412fa6d

This version of rrfs_beta might work.

5530100

Turn it off since we can't use it.

c9fe408

Merge remote-tracking branch 'upstream/develop' into feature/add_get_…

ec0eae1

…obs_task Had to manually revert some renaming changes in ush/test_retrieve_data.py

Updates to fix failing unit tests

f2aca37

Merge remote-tracking branch 'upstream/develop' into feature/add_get_…

c88704a

…obs_task

Update get_da_obs modulefiles for latest develop changes

5ab55b8

mkavulich requested a review from gsketefian as a code owner May 8, 2023 00:58

This was referenced May 8, 2023

Get and process radar data in separate workflow tasks. #782

Closed

Add get_obs tasks to workflow using retrieve_data.py. #675

Closed

mkavulich added 4 commits May 8, 2023 21:58

Merge remote-tracking branch 'upstream/develop' into feature/add_get_…

54df375

…obs_task

Add command-line instructions for unit tests

0c4e3c6

Change "two days ago" tests to "three days ago"...occasionally some d…

cbccf74

…ata isnt ready by 00z two days later

Remove references to current working directory; this causes occasiona…

751a11f

…l problems when running all unit tests

Allow this sort of similarity in tests.

b158542

Merge remote-tracking branch 'upstream/develop' into feature/add_get_…

69c6abd

…obs_task

EdwardSnyder-NOAA approved these changes May 16, 2023

View reviewed changes

MichaelLueken reviewed May 16, 2023

View reviewed changes

tests/test_python/test_retrieve_data.py Outdated Show resolved Hide resolved

christinaholtNOAA added 3 commits May 16, 2023 19:36

Update wording to reflect that this is not RRFS.

a537195

Rename the community config for da cycling.

c31a0b6

Attempt to fix test that only checks the URL.

ae8ebc8

MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label May 17, 2023

MichaelLueken merged commit 4892952 into ufs-community:develop May 19, 2023

mkavulich mentioned this pull request May 22, 2023

[develop] Integrate Unified Workflow templater tool #793

Merged

37 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[develop] Introduce DA data preprocessing to workflow #778

[develop] Introduce DA data preprocessing to workflow #778

mkavulich commented May 8, 2023

MichaelLueken commented May 8, 2023

mkavulich commented May 10, 2023

MichaelLueken commented May 10, 2023

christinaholtNOAA commented May 10, 2023 •

edited

Loading

EdwardSnyder-NOAA commented May 10, 2023

EdwardSnyder-NOAA commented May 16, 2023

christinaholtNOAA commented May 18, 2023

natalie-perlin commented May 18, 2023

christinaholtNOAA commented May 18, 2023

natalie-perlin commented May 18, 2023

christinaholtNOAA commented May 18, 2023

natalie-perlin commented May 18, 2023

MichaelLueken commented May 19, 2023

[develop] Introduce DA data preprocessing to workflow #778

[develop] Introduce DA data preprocessing to workflow #778

Conversation

mkavulich commented May 8, 2023

DESCRIPTION OF CHANGES:

Type of change

TESTS CONDUCTED:

DEPENDENCIES:

DOCUMENTATION:

ISSUE:

CHECKLIST

CONTRIBUTORS (optional):

MichaelLueken commented May 8, 2023

mkavulich commented May 10, 2023

MichaelLueken commented May 10, 2023

christinaholtNOAA commented May 10, 2023 • edited Loading

EdwardSnyder-NOAA commented May 10, 2023

EdwardSnyder-NOAA commented May 16, 2023

christinaholtNOAA commented May 18, 2023

natalie-perlin commented May 18, 2023

christinaholtNOAA commented May 18, 2023

natalie-perlin commented May 18, 2023

christinaholtNOAA commented May 18, 2023

natalie-perlin commented May 18, 2023

MichaelLueken commented May 19, 2023

christinaholtNOAA commented May 10, 2023 •

edited

Loading