Need capability to exclude machines for tests in rt.conf #305

Closed
climbfuji opened this issue Dec 1, 2020 · 0 comments · Fixed by #349

Description

Currently we can either leave the list of machines in rt.conf empty to run a test on all machines, or we can specify a list of all machines to run on. Sometimes, tests need to be skipped on only one machine, which means that we have to repeat a RUN line several times.

Solution

Implement a syntax that tells rt.sh to skip a test on the machines listed in the MACHINES column, for example by prefixing the first entry with a -.
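
A minimal sketch of how the proposed rule could be evaluated, written in Python for illustration only (rt.sh itself is a shell script). The function name `should_run`, the whitespace-separated MACHINES format, and the machine names are assumptions, not part of rt.sh or rt.conf.

```python
def should_run(machine, machines_field):
    """Decide whether a test runs on `machine` given the MACHINES column.

    Assumed conventions (hypothetical, for illustration):
      - empty column          -> run on all machines
      - first entry starts "-" -> run on all machines except those listed
      - otherwise             -> run only on the machines listed
    """
    entries = machines_field.split()
    if not entries:
        # Empty column: run everywhere.
        return True
    if entries[0].startswith("-"):
        # Exclusion list: drop the leading "-" from the first entry.
        first = entries[0][1:]
        excluded = ([first] if first else []) + entries[1:]
        return machine not in excluded
    # Plain inclusion list.
    return machine in entries


# Hypothetical machine names.
print(should_run("machine_a", "-machine_a machine_b"))  # excluded
print(should_run("machine_c", "-machine_a machine_b"))  # not excluded
```

With this convention, a test that must be skipped on a single machine needs only one RUN line.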

Alternatives

Better solutions?

@climbfuji climbfuji added the enhancement New feature or request label Dec 1, 2020
epic-cicd-jenkins pushed a commit that referenced this issue Apr 17, 2023
## DESCRIPTION OF CHANGES:
This PR renames and modifies several existing grids and creates some new ones.  Details below.

Note that the 3 grids required for the ufs-srweather-app release are created in this PR:  RRFS_CONUS_25km, RRFS_CONUS_13km, RRFS_CONUS_3km.  These pass their tests (2 tests per grid) except for RRFS_CONUS_13km, which fails one of its two tests in the make_ics task.  This may need further attention, possibly after the PR is merged.

### CONUS and SUBCONUS grids of ESGgrid type:
* There are currently four such grids:  GSD_HRRR25km, GSD_HRRR13km, GSD_HRRR3km, and GSD_SUBCONUS3km.
* These have been renamed to RRFS_CONUS_25km, RRFS_CONUS_13km, RRFS_CONUS_3km, and RRFS_SUBCONUS_3km, respectively.
* The grid parameters (both native and write-component) have been adjusted as follows:
  * The outer boundaries of the three CONUS grids are reset such that they are as close to the HRRRX grid boundary as possible but such that each grid, including its 4-cell-wide halo, is within the HRRRX domain.  This is to ensure that the HRRRX can be used to generate ICs for these CONUS grids.
  * The write-component grid corresponding to each of these four native grids is reset such that it is as large as possible but still lies completely within the native grid (without any halo points).
  * There is no longer a GFDLgrid type grid corresponding to these grids; they are all strictly of type ESGgrid.  Setting GRID_GEN_METHOD to "GFDLgrid" with any of these grids will result in an error during the workflow generation step.
* The following WE2E tests have been added to test these grids:
  * grid_RRFS_CONUS_25km_FV3GFS_FV3GFS - run on the RRFS_CONUS_25km grid using FV3GFS for ICs/LBCs and the GFS_v16beta physics suite.
  * grid_RRFS_CONUS_13km_FV3GFS_FV3GFS - same as above but for the RRFS_CONUS_13km grid.
  * grid_RRFS_CONUS_3km_FV3GFS_FV3GFS - same as above but for the RRFS_CONUS_3km grid.
  * grid_RRFS_SUBCONUS_3km_FV3GFS_FV3GFS - same as above but for the RRFS_SUBCONUS_3km grid.
  * grid_RRFS_CONUS_25km_HRRRX_RAPX - run on the RRFS_CONUS_25km grid using HRRRX for ICs, RAPX for LBCs, and the GSD_SAR physics suite.
  * grid_RRFS_CONUS_13km_HRRRX_RAPX - same as above but for the RRFS_CONUS_13km grid.
  * grid_RRFS_CONUS_3km_HRRRX_RAPX - same as above but for the RRFS_CONUS_3km grid.
  * grid_RRFS_SUBCONUS_3km_HRRRX_RAPX - same as above but for the RRFS_SUBCONUS_3km grid.
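
The GRID_GEN_METHOD restriction described above could be checked along the following lines. This is only a sketch in Python, not the workflow's actual code: the grid names and the required method are taken from the list above, while the dictionary and function names are hypothetical.

```python
# Grid names and required method come from the description above;
# REQUIRED_GRID_GEN_METHOD and check_grid_gen_method are hypothetical names.
REQUIRED_GRID_GEN_METHOD = {
    "RRFS_CONUS_25km": "ESGgrid",
    "RRFS_CONUS_13km": "ESGgrid",
    "RRFS_CONUS_3km": "ESGgrid",
    "RRFS_SUBCONUS_3km": "ESGgrid",
}


def check_grid_gen_method(predef_grid_name, grid_gen_method):
    """Raise ValueError if the grid requires a different generation method."""
    required = REQUIRED_GRID_GEN_METHOD.get(predef_grid_name)
    if required is not None and grid_gen_method != required:
        raise ValueError(
            f"Grid {predef_grid_name} requires GRID_GEN_METHOD="
            f'"{required}", but got "{grid_gen_method}".'
        )


check_grid_gen_method("RRFS_CONUS_3km", "ESGgrid")  # accepted silently
```

The same table-driven pattern would extend to grids that are restricted to another generation method.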

### CONUS grids of GFDLgrid type:
* There are currently two GFDLgrid type grids that are actively tested:  EMC_CONUS_coarse and EMC_CONUS_3km.
* Two new ones similar to these have been introduced:  CONUS_25km_GFDLgrid and CONUS_3km_GFDLgrid.
* CONUS_25km_GFDLgrid is similar to EMC_CONUS_coarse except that it is adjusted to be centered about the "center" point (with respect to the "parent" tile 6) and to have an average cell size of about 25.1km (whereas EMC_CONUS_coarse was not symmetric and had an average cell size of about 23.? km).  In centering the grid, the number of grid points in each direction was chosen to allow more choices in setting LAYOUT_X, LAYOUT_Y, and BLOCKSIZE.
* CONUS_3km_GFDLgrid is similar to EMC_CONUS_3km except that it is adjusted to be centered about the "center" point (with respect to the "parent" tile 6).  In centering the grid, the number of grid points in each direction was chosen to allow more choices in setting LAYOUT_X, LAYOUT_Y, and BLOCKSIZE.
* The write-component grids corresponding to these two native grids have been set such that they are as large as possible but still lie completely within the native grid (without any halo points).
* These grids are present to allow testing of the GFDLgrid capability in the workflow as new PRs are merged.
* The current EMC_CONUS_coarse and EMC_CONUS_3km grids are retained for now but will be removed soon in another PR.
* Setting GRID_GEN_METHOD to "ESGgrid" is not allowed with either of these grids; if set, it will result in an error during the workflow generation step.
* The following WE2E tests have been added to test these grids:
  * grid_CONUS_25km_GFDLgrid_FV3GFS_FV3GFS - run on the CONUS_25km_GFDLgrid grid using FV3GFS for ICs/LBCs and the GFS_v16beta physics suite.
  * grid_CONUS_3km_GFDLgrid_FV3GFS_FV3GFS - same as above but on the CONUS_3km_GFDLgrid grid.
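
To illustrate why the number of grid points affects the available choices for LAYOUT_X and LAYOUT_Y, here is a small Python sketch. It assumes that a layout value must divide the number of grid cells in its direction evenly; that constraint, the function name, and the cell counts used are assumptions for illustration, not values from these grids.

```python
def valid_layouts(n_cells):
    """Return every layout value that divides n_cells evenly.

    Assumes (for illustration) that the layout in a given direction
    must be an exact divisor of the cell count in that direction.
    """
    return [d for d in range(1, n_cells + 1) if n_cells % d == 0]


# Hypothetical cell counts: a highly composite count offers many layouts,
# while a prime count offers only the trivial ones.
print(valid_layouts(200))
print(valid_layouts(199))
```

Under that assumption, choosing a cell count with many divisors leaves more room to tune the decomposition.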

### Alaska grids:
* There are currently three Alaska grids of ESGgrid type:  EMC_AK (a 3-km grid), GSD_RRFSAK_3km, and GSD_HRRR_AK_50km.  Only GSD_HRRR_AK_50km is actively tested.
* The GSD_RRFSAK_3km grid is renamed to RRFS_AK_3km and its write-component grid adjusted so that it covers as much of the native grid as possible without going outside it.
* A new Alaska grid, RRFS_AK_13km, has been added.  This is meant to be a replacement for GSD_HRRR_AK_50km, which was deemed too coarse for testing (and will be removed in a future PR).
* The Alaska grids RRFS_AK_3km and RRFS_AK_13km fail most of the tests (see below) and need further adjustment.  They are included in this PR in order to allow users to try them and suggest modifications.
* The following WE2E tests have been added to test these grids:
  * grid_RRFS_AK_13km_FV3GFS_FV3GFS - run on the RRFS_AK_13km grid using FV3GFS for ICs/LBCs and the GFS_v16beta physics suite.
  * grid_RRFS_AK_3km_FV3GFS_FV3GFS - same as above but for the RRFS_AK_3km grid.
  * grid_RRFS_AK_13km_RAPX_RAPX - run on the RRFS_AK_13km grid using RAPX for ICs/LBCs and the GSD_SAR physics suite.
  * grid_RRFS_AK_3km_RAPX_RAPX - same as above but for the RRFS_AK_3km grid.

## TESTS CONDUCTED: 
Ran the 14 new WE2E tests added in this PR.  Test results are as follows:
```
* grid_RRFS_CONUS_13km_FV3GFS_FV3GFS            SUCCESS

* grid_RRFS_CONUS_13km_HRRRX_RAPX               FAILURE - In make_ics task
  * Log file:  PET00 ESMCI_Mesh_Regrid_Glue.C:313 c_esmc_regrid_create() Arguments are incompatible  - - There exist destination points (e.g. id=1) which can't be mapped to any source cell
  * Boundary of native grid may be too close to that of HRRRX grid.

* grid_RRFS_CONUS_25km_FV3GFS_FV3GFS            SUCCESS

* grid_RRFS_CONUS_25km_HRRRX_RAPX               SUCCESS

* grid_RRFS_CONUS_3km_FV3GFS_FV3GFS             SUCCESS

* grid_RRFS_CONUS_3km_HRRRX_RAPX                SUCCESS

* grid_RRFS_SUBCONUS_3km_FV3GFS_FV3GFS          FAILURE - In run_fcst task
  * Log file:  FATAL from PE   719: compute_qs: saturation vapor pressure table overflow, nbad=      1
  * Fails only after hour 5, so probably not a grid issue.

* grid_RRFS_SUBCONUS_3km_HRRRX_RAPX             SUCCESS

* grid_CONUS_25km_GFDLgrid_FV3GFS_FV3GFS        SUCCESS

* grid_CONUS_3km_GFDLgrid_FV3GFS_FV3GFS         SUCCESS

* grid_RRFS_AK_13km_FV3GFS_FV3GFS               SUCCESS

* grid_RRFS_AK_13km_RAPX_RAPX                   FAILURE - In run_fcst task
  * Log file:
       AVOST IN VILKA     Table index=  -2147483648
  I,J=           1           1 LU_index =           15 Psfc[hPa] =
    914.224907314199      Tsfc =    278.090454101562

* grid_RRFS_AK_3km_FV3GFS_FV3GFS                FAILURE - In run_fcst task
  * Log file:  
  forrtl: severe (174): SIGSEGV, segmentation fault occurred
  *** longjmp causes uninitialized stack frame ***: /scratch2/BMC/det/Gerard.Ketefian/UFS_CAM/PR_feature_predef_grids/ufs-srweather-app/exec/fv3_gfs.x terminated
  * Fails only after hour 2, so probably not a grid issue.

* grid_RRFS_AK_3km_RAPX_RAPX                    FAILURE - In run_fcst task
  * Log file:
       AVOST IN VILKA     Table index=  -2147483648                       
  I,J=           1           1 LU_index =           15 Psfc[hPa] =        
    847.226062635191      Tsfc =    294.777954101562                      

```
Note that:
* The RRFS_CONUS_XXkm grids pass 5 of 6 tests.  The failure of test grid_RRFS_CONUS_13km_HRRRX_RAPX in the make_ics task occurs because chgres_cube could not map one or more "destination" cells on the RRFS_CONUS_13km grid to any "source" cell on the HRRRX external grid.  This may be fixed by making the RRFS_CONUS_13km grid smaller, but it would be better to fix chgres_cube so that this case works (since the RRFS_CONUS_13km grid, including its 4-cell-wide halo, is already completely within the HRRRX domain).
* The RRFS_SUBCONUS_3km grid (2 tests) passes for the HRRRX/RAPX ICs/LBCs case but fails in the run_fcst task for the FV3GFS ICs/LBCs case.  Since the failure occurs only after hour 5, it is probably not grid-related.
* The CONUS_XXkm_GFDLgrid grids pass their tests (2 tests).
* The RRFS_AK_XXkm grids pass only 1 of 4 tests.  These failures need to be investigated further.
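
The per-family counts above can be reproduced from the test listing with a short tally. This is a Python sketch: the test names and outcomes are copied from the results above, while the family labels and function name are introduced here for illustration.

```python
# True = SUCCESS, False = FAILURE, as reported in the listing above.
RESULTS = {
    "grid_RRFS_CONUS_13km_FV3GFS_FV3GFS": True,
    "grid_RRFS_CONUS_13km_HRRRX_RAPX": False,
    "grid_RRFS_CONUS_25km_FV3GFS_FV3GFS": True,
    "grid_RRFS_CONUS_25km_HRRRX_RAPX": True,
    "grid_RRFS_CONUS_3km_FV3GFS_FV3GFS": True,
    "grid_RRFS_CONUS_3km_HRRRX_RAPX": True,
    "grid_RRFS_SUBCONUS_3km_FV3GFS_FV3GFS": False,
    "grid_RRFS_SUBCONUS_3km_HRRRX_RAPX": True,
    "grid_CONUS_25km_GFDLgrid_FV3GFS_FV3GFS": True,
    "grid_CONUS_3km_GFDLgrid_FV3GFS_FV3GFS": True,
    "grid_RRFS_AK_13km_FV3GFS_FV3GFS": True,
    "grid_RRFS_AK_13km_RAPX_RAPX": False,
    "grid_RRFS_AK_3km_FV3GFS_FV3GFS": False,
    "grid_RRFS_AK_3km_RAPX_RAPX": False,
}

# Test-name prefix -> family label (labels are illustrative).
FAMILY_PREFIXES = {
    "grid_RRFS_CONUS_": "RRFS_CONUS",
    "grid_RRFS_SUBCONUS_": "RRFS_SUBCONUS",
    "grid_CONUS_": "CONUS_GFDLgrid",
    "grid_RRFS_AK_": "RRFS_AK",
}


def tally(results):
    """Return {family: (passed, total)} for the given test results."""
    counts = {}
    for test, passed in results.items():
        family = next(
            label for prefix, label in FAMILY_PREFIXES.items()
            if test.startswith(prefix)
        )
        ok, total = counts.get(family, (0, 0))
        counts[family] = (ok + int(passed), total + 1)
    return counts


for family, (ok, total) in tally(RESULTS).items():
    print(f"{family}: {ok} of {total} passed")
```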