Determine list of WE2E tests for fundamental test set
#277
Comments
@gsketefian Thanks for your effort to put together a first draft of this list. For ease of conversation I have replicated the list you described above in its entirety here:
I would like to propose a slightly reduced list of tests, in two parts, the first being one that has (in my opinion) no drawbacks, with the second being potentially controversial.
Can definitely remove
In my opinion, these three tests can be removed without reducing any coverage of grids, suites, or basic functionality in the test suite:
This would result in significant core hour savings, reducing the cost from 113 to 76 "units" as described above; a 33% savings!
Should be removed, but open to debate
In my opinion, we should remove the tests that correspond to physics suites and features that are no longer supported/deprecated. These can still be tested in the comprehensive suite, but we shouldn't waste time and resources testing these for every single change to the code:
This would result in another ~16 "unit" savings; in total, almost 50% savings from the original proposed list.
Other thoughts
Once this list is finalized and adopted, I would recommend we put a file
@mkavulich All that sounds good to me. Note that ~6 new tests have been added to the set since I created the list, so I need to update the list and I/we have to figure out which of the new ones need to go into the fundamental set. Also, when running the comprehensive set, the script that does it needs to update that set on the fly, since any given PR might be changing the set (add/delete tests, rename/move tests, update the tests.fundamental file, etc.).
Okay, so after some offline discussion, we have a proposed path forward:
How the "fundamental test suite" can be run
The test config files for those tests considered "fundamental" will be modified to include a "FUNDAMENTAL_TEST=TRUE" variable (or similar). The run_WE2E_tests.sh script will be modified to look for this variable, and users can run the script with the "--fundamental=true" flag (or similar) to automatically run the proper set of tests. Because some tests only work on certain platforms (e.g. those tests that retrieve data from NOAA HPSS), the config file variable can have the appropriate platform-specific logic to avoid running those tests that will always fail on a given platform.
List of fundamental tests
Several tests from the above list have been removed because they are known failures, or for other reasons that I specified in a previous comment.
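The flag-based selection proposed in this comment could be sketched roughly as follows. This is only an illustration, not the actual change to run_WE2E_tests.sh: the variable name FUNDAMENTAL_TEST is the placeholder used in the proposal, the "config.<test_name>.sh" file naming and the helper names are assumptions, and the platform-specific logic mentioned above is not handled.

```python
import re
from pathlib import Path

# Hypothetical flag name taken from the proposal ("FUNDAMENTAL_TEST=TRUE" or similar).
# Matches an uncommented shell-style assignment, with or without quotes.
FLAG_PATTERN = re.compile(
    r'^\s*FUNDAMENTAL_TEST\s*=\s*"?(\w+)"?\s*(#.*)?$', re.MULTILINE
)


def is_fundamental(config_text):
    """Return True if the config text sets FUNDAMENTAL_TEST to TRUE."""
    match = FLAG_PATTERN.search(config_text)
    return bool(match) and match.group(1).upper() == "TRUE"


def find_fundamental_tests(test_configs_dir):
    """Collect names of flagged tests under a test_configs-style directory.

    Assumes (for illustration) that each test lives in a file named
    config.<test_name>.sh inside a category subdirectory.
    """
    names = []
    for path in sorted(Path(test_configs_dir).rglob("config.*.sh")):
        if path.is_symlink():
            continue  # symlinks are alternate names for tests counted elsewhere
        if is_fundamental(path.read_text()):
            names.append(path.name[len("config."):-len(".sh")])
    return names
```

Keeping the flag in each test's own config file means the fundamental set follows a test when it is renamed or moved, which is the concern raised earlier about PRs changing the set.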
Description
The purpose of this issue is to propose and discuss a set of WE2E tests to run for each PR into the ufs-srweather-app and/or regional_workflow repositories. These are referred to as "fundamental" tests in the SRW App's Code Management Document.
Solution
A spreadsheet has been created that contains the complete set of WE2E tests (as of 20220525) with various parameters (e.g. grid on which the forecast runs, forecast length, time step, etc) listed for each. There are a total of 80 tests, and the rows highlighted in blue (of which there are 57) are the ones proposed to be included in the fundamental test set.
Note that there are 4 categories of WE2E tests:
grids_extrn_mdls_suites_community (contains 41 tests)
grids_extrn_mdls_suites_nco (contains 3 tests)
release_SRW_v1 (contains 1 test)
wflow_features (contains 37 tests; two are duplicates, i.e. symlinks to other tests, so really only 35 unique tests)
These categories correspond to subdirectories under the ufs-srweather-model/regional_workflow/tests/WE2E/test_configs directory.
As a first cut, the following procedure is used to determine which tests from the four categories to include in the set of fundamental tests.
From the category grids_extrn_mdls_suites_community, include the tests that use the 25km grids (RRFS_CONUS_25km, RRFS_CONUScompact_25km, and CONUS_25km_GFDLgrid) as well as those that use the (very small in extent) 3km grid over Indianapolis (SUBCONUS_Ind_3km). This means excluding tests that use the 13km and 3km grids over the CONUS, Alaska, and North America as well as the mid-sized 3km sub-CONUS grid (RRFS_SUBCONUS_3km). This results in a total of 20 tests from the grids_extrn_mdls_suites_community category to include in the fundamental set of tests.
Similarly, from the category grids_extrn_mdls_suites_nco, include only tests that run on the 25km grid. The only test in this category (out of a total of 3) that runs on such a grid is nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR. Thus, there is only one test from this category to include in the fundamental set of tests.
From the release_SRW_v1 category, include the test GST_release_public_v1 (which is the only test in this category) since it is the only long-term test in the complete set of WE2E tests (runs out to 48 hours). (We might want to rename this category or just move this test into one of the others.)
Finally, from the wflow_features category, include all tests except template_vars and nco_inline_post, since these are just symlinks to (i.e. alternate names for) the deactivate_tasks test in wflow_features and the nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR test in grids_extrn_mdls_suites_nco, respectively, both of which are already included in the fundamental set of tests. This results in a total of 35 tests from this category to include in the fundamental test set. (Note that for platforms that do not have access to NOMADS, the test get_from_NOMADS_ics_FV3GFS_lbcs_FV3GFS_fmt_nemsio should not be run.)
With the above proposed approach, the total number of WE2E tests to be included in the fundamental tests is 20+1+35+1=57. The combined relative cost of these fundamental tests is about 113 units, where 1 unit of cost corresponds to running a 6-hour forecast on the RRFS_CONUS_25km grid with a time step of 40 sec. For comparison, the relative cost of running the full set of WE2E tests (of which there are 80) is 1622. (The relative cost of each WE2E test is listed in the spreadsheet.)
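The grid-based part of the procedure above can be sketched as a simple name filter. This is a rough illustration only: it assumes test names embed the grid as "grid_<GRID>_ics_...", the helper name is made up, and the example test names other than the nco test are illustrative rather than taken from the spreadsheet. The per-category counts are the ones stated above.

```python
# Grids proposed for the fundamental set (names as given in the issue text).
FUNDAMENTAL_GRIDS = (
    "RRFS_CONUS_25km",
    "RRFS_CONUScompact_25km",
    "CONUS_25km_GFDLgrid",
    "SUBCONUS_Ind_3km",
)


def uses_fundamental_grid(test_name):
    """Return True if the test name embeds one of the fundamental grids.

    Assumes (for illustration) the naming pattern "grid_<GRID>_ics_...".
    Matching on the delimited form keeps RRFS_SUBCONUS_3km from being
    mistaken for SUBCONUS_Ind_3km or any of the 25km grids.
    """
    return any(f"grid_{grid}_ics_" in test_name for grid in FUNDAMENTAL_GRIDS)


# Illustrative test names following the assumed naming pattern.
example_tests = [
    "grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16",
    "grid_RRFS_CONUS_3km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v16",
    "grid_RRFS_SUBCONUS_3km_ics_HRRR_lbcs_RAP_suite_HRRR",
    "nco_grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_HRRR",
]
selected = [name for name in example_tests if uses_fundamental_grid(name)]

# Per-category counts stated in the procedure above.
fundamental_counts = {
    "grids_extrn_mdls_suites_community": 20,
    "grids_extrn_mdls_suites_nco": 1,
    "release_SRW_v1": 1,
    "wflow_features": 35,
}
total_fundamental = sum(fundamental_counts.values())
```

The wflow_features and release_SRW_v1 categories are not selected by grid, so a name filter like this would only cover the first two categories.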
Notes on "cost":
To calculate the relative cost of a test, first calculate the absolute cost as follows:
    abs_cost = nx*ny*num_time_steps*num_fcsts
where nx and ny are the number of points in the horizontal (x and y) directions, num_time_steps is the number of time steps, and num_fcsts is the number of forecasts the test runs (this will be greater than 1 for tests that include multiple starting dates and/or run ensembles). Note that this cost calculation does not differentiate between different physics suites. Relative cost is then calculated using
    rel_cost = abs_cost/abs_cost_ref
where abs_cost_ref is the cost of running a single (num_fcsts = 1) 6-hour forecast on the RRFS_CONUS_25km grid (with nx = 219 and ny = 131) with a time step of 40 sec (so that num_time_steps = 6*3600/40 = 540), i.e.
    abs_cost_ref = 219*131*540*1 = 15,492,060
Related To
PR #776 in regional_workflow introduced the calculation of relative cost for the WE2E tests. PR #278 provides the necessary update to the documentation.
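The relative-cost calculation described under Notes on "cost" can be sketched as below. This is not the code from PR #776; it assumes the absolute cost is the product of nx, ny, num_time_steps, and num_fcsts, as the variable definitions suggest, and the function and argument names are made up for illustration.

```python
def abs_cost(nx, ny, fcst_len_hrs, dt_sec, num_fcsts=1):
    """Absolute cost, assumed to be nx * ny * num_time_steps * num_fcsts."""
    num_time_steps = fcst_len_hrs * 3600 / dt_sec
    return nx * ny * num_time_steps * num_fcsts


# Reference: one 6-hour forecast on RRFS_CONUS_25km (nx=219, ny=131), dt = 40 sec,
# i.e. 540 time steps.
ABS_COST_REF = abs_cost(nx=219, ny=131, fcst_len_hrs=6, dt_sec=40)


def rel_cost(nx, ny, fcst_len_hrs, dt_sec, num_fcsts=1):
    """Cost relative to the reference forecast (1 unit = ABS_COST_REF)."""
    return abs_cost(nx, ny, fcst_len_hrs, dt_sec, num_fcsts) / ABS_COST_REF


# A 48-hour forecast on the same grid with the same time step costs 8 units.
cost_48hr = rel_cost(nx=219, ny=131, fcst_len_hrs=48, dt_sec=40)
```

Under this definition, cost scales linearly with forecast length and number of forecasts, and inversely with the time step, which is why tests on the 3km CONUS grids dominate the cost of the full set.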