Create new baseline to fix the issue with develop-20230504 and recover baseline on gaea #1749

Merged: 11 commits, May 12, 2023

jkbk2004
Collaborator

@jkbk2004 jkbk2004 commented May 10, 2023

Description

  • Issue: PR #1731 (PBL, Convection and Microphysics Update for HR2) picked up minor ccpp physics commits during the review process. However, the user's ccpp fork branch was not fully synced to those newly committed changes when the develop-20230504 baseline was created.
  • Fix: The develop branch code base is correct, so a new baseline, develop-20230510, needs to be created.
  • Gaea baselines are regenerated with the hpc stack update (intel-classic-2022.0.2), and the gaea9 node option (dedicated to the old c3 image) is removed.

Top of commit queue on: TBD

Input data additions/changes

  • No changes are expected to input data.
  • There will be new input data.
  • Input data will be updated.

Anticipated changes to regression tests:

  • No changes are expected to any regression test.
  • Changes are expected to the following tests:

All cases affected by #1731 need to be updated:
INTEL FAILED TESTS:
Test cpld_control_p8_mixedmode 001 failed in run_test failed
Test cpld_control_gfsv17 002 failed in run_test failed
Test cpld_control_p8 003 failed in run_test failed
Test cpld_control_qr_p8 005 failed in run_test failed
Test cpld_2threads_p8 007 failed in run_test failed
Test cpld_decomp_p8 008 failed in run_test failed
Test cpld_mpi_p8 009 failed in run_test failed
Test cpld_control_ciceC_p8 010 failed in run_test failed
Test cpld_control_c192_p8 011 failed in run_test failed
Test cpld_control_noaero_p8 013 failed in run_test failed
Test cpld_debug_p8 015 failed in run_test failed
Test cpld_debug_noaero_p8 016 failed in run_test failed
Test control_flake 021 failed in run_test failed
Test control_latlon 024 failed in run_test failed
Test control_wrtGauss_netcdf_parallel 025 failed in run_test failed
Test control_c192 027 failed in run_test failed
Test control_c384 028 failed in run_test failed
Test control_c384gdas 029 failed in run_test failed
Test control_stochy 030 failed in run_test failed
Test control_lndp 032 failed in run_test failed
Test control_iovr4 033 failed in run_test failed
Test control_iovr5 034 failed in run_test failed
Test control_p8 035 failed in run_test failed
Test control_qr_p8 037 failed in run_test failed
Test control_decomp_p8 039 failed in run_test failed
Test control_2threads_p8 040 failed in run_test failed
Test control_p8_lndp 041 failed in run_test failed
Test control_p8_rrtmgp 042 failed in run_test failed
Test control_p8_mynn 043 failed in run_test failed
Test merra2_thompson 044 failed in run_test failed
Test control_csawmg 075 failed in run_test failed
Test control_csawmgt 076 failed in run_test failed
Test control_ras 077 failed in run_test failed
Test control_p8_faster 079 failed in run_test failed
Test hafs_regional_atm 121 failed in run_test failed
Test hafs_regional_atm_ocn 123 failed in run_test failed
Test hafs_regional_atm_wav 124 failed in run_test failed
Test hafs_regional_atm_ocn_wav 125 failed in run_test failed
Test hafs_global_multiple_4nests_atm 129 failed in run_test failed
Test hafs_regional_specified_moving_1nest_atm 130 failed in run_test failed
Test hafs_regional_storm_following_1nest_atm_ocn 132 failed in run_test failed
Test hafs_regional_storm_following_1nest_atm_ocn_wav 135 failed in run_test failed
Test atmwav_control_noaero_p8 157 failed in run_test failed
Test control_atmwav 158 failed in run_test failed
Test atmaero_control_p8 159 failed in run_test failed
Test atmaero_control_p8_rad 160 failed in run_test failed
Test atmaero_control_p8_rad_micro 161 failed in run_test failed

GNU FAILED TESTS:
Test control_c48 001 failed in run_test failed
Test control_stochy 002 failed in run_test failed
Test control_ras 003 failed in run_test failed
Test control_p8 004 failed in run_test failed
Test control_flake 005 failed in run_test failed
Test control_diag_debug 023 failed in run_test failed
Test control_ras_debug 031 failed in run_test failed
Test control_stochy_debug 032 failed in run_test failed
Test control_debug_p8 033 failed in run_test failed
Test control_wam_debug 039 failed in run_test failed
Test cpld_control_p8 051 failed in run_test failed
Test cpld_control_nowave_noaero_p8 052 failed in run_test failed
Test cpld_debug_p8 053 failed in run_test failed

Subcomponents involved:

  • AQM
  • CDEPS
  • CICE
  • CMEPS
  • CMakeModules
  • FV3
  • GOCART
  • HYCOM
  • MOM6
  • NOAHMP
  • WW3
  • stochastic_physics
  • none

Combined with PR's (If Applicable):

Commit Queue Checklist:

  • Link PR's from all sub-components involved
  • Confirm reviews completed in sub-component PR's
  • Add all appropriate labels to this PR.
  • Run full RT suite on either Hera/Cheyenne with both Intel/GNU compilers
  • Add list of any failed regression tests to "Anticipated changes to regression tests" section.

Linked PR's and Issues:

Fixes #1746, #1724, and #1232

Testing Day Checklist:

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
  • Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.

Testing Log (for CM's):

  • RDHPCS
    • Intel
      • Hera
      • Orion
      • Jet
      • Gaea
      • Cheyenne
    • GNU
      • Hera
      • Cheyenne
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
    • Completed
  • opnReqTest
    • N/A
    • Log attached to comment

@jkbk2004
Collaborator Author

@BrianCurtis-NOAA @zach1221 leave any comment

@zach1221
Collaborator

Thanks, @jkbk2004 . I'll let you know when my set of BL are created.

@zach1221
Collaborator

I checked in on the RT jobs for hera. Looks like gnu is almost finished. Intel still has plenty of tests to run yet, a little over halfway finished.

@jkbk2004 changed the title from "Create new baseline to fix the issue with develop-20230504" to "Create new baseline to fix the issue with develop-20230504 and recover baseline on gaea" on May 12, 2023
@jkbk2004
Collaborator Author

Automated RT Failure Notification
Machine: hera
Compiler: intel
Job: RT
[RT] Repo location: /scratch1/NCEPDEV/nems/emc.nemspara/autort/pr/1345751926/20230511124514/ufs-weather-model
[RT] Error: Test rap_control 058 failed in check_result failed
[RT] Error: Test rap_control 058 failed in run_test failed
[RT] Error: Test regional_spp_sppt_shum_skeb 059 failed in check_result failed
[RT] Error: Test regional_spp_sppt_shum_skeb 059 failed in run_test failed
[RT] Error: Test rap_decomp 060 failed in check_result failed
[RT] Error: Test rap_decomp 060 failed in run_test failed
[RT] Error: Test rap_2threads 061 failed in check_result failed
[RT] Error: Test rap_2threads 061 failed in run_test failed
[RT] Error: Test rap_sfcdiff 063 failed in check_result failed
[RT] Error: Test rap_sfcdiff 063 failed in run_test failed
[RT] Error: Test rap_sfcdiff_decomp 064 failed in check_result failed
[RT] Error: Test rap_sfcdiff_decomp 064 failed in run_test failed
[RT] Error: Test hrrr_control 066 failed in check_result failed
[RT] Error: Test hrrr_control 066 failed in run_test failed
[RT] Error: Test hrrr_control_decomp 067 failed in check_result failed
[RT] Error: Test hrrr_control_decomp 067 failed in run_test failed
[RT] Error: Test hrrr_control_2threads 068 failed in check_result failed
[RT] Error: Test hrrr_control_2threads 068 failed in run_test failed
[RT] Error: Test rrfs_v1beta 070 failed in check_result failed
[RT] Error: Test rrfs_v1beta 070 failed in run_test failed
[RT] Error: Test rrfs_v1nssl 071 failed in check_result failed
[RT] Error: Test rrfs_v1nssl 071 failed in run_test failed
[RT] Error: Test rrfs_v1nssl_nohailnoccn 072 failed in check_result failed
[RT] Error: Test rrfs_v1nssl_nohailnoccn 072 failed in run_test failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm 073 failed in check_result failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm 073 failed in run_test failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm_2threads 074 failed in check_result failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm_2threads 074 failed in run_test failed
[RT] Error: Test rrfs_conus13km_hrrr_warm 075 failed in check_result failed
[RT] Error: Test rrfs_conus13km_hrrr_warm 075 failed in run_test failed
[RT] Error: Test rrfs_smoke_conus13km_radar_tten_warm 076 failed in check_result failed
[RT] Error: Test rrfs_smoke_conus13km_radar_tten_warm 076 failed in run_test failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm_debug 084 failed in check_result failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm_debug 084 failed in run_test failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm_debug_2threads 085 failed in check_result failed
[RT] Error: Test rrfs_smoke_conus13km_hrrr_warm_debug_2threads 085 failed in run_test failed
[RT] Error: Test rrfs_conus13km_hrrr_warm_debug 086 failed in check_result failed
[RT] Error: Test rrfs_conus13km_hrrr_warm_debug 086 failed in run_test failed
[RT] Error: Test control_CubedSphereGrid_debug 087 failed in check_result failed
[RT] Error: Test control_CubedSphereGrid_debug 087 failed in run_test failed
[RT] Error: Test control_wrtGauss_netcdf_parallel_debug 088 failed in check_result failed
[RT] Error: Test control_wrtGauss_netcdf_parallel_debug 088 failed in run_test failed
[RT] Error: Test control_stochy_debug 089 failed in check_result failed
[RT] Error: Test control_stochy_debug 089 failed in run_test failed
[RT] Error: Test control_lndp_debug 090 failed in check_result failed
[RT] Error: Test control_lndp_debug 090 failed in run_test failed
[RT] Error: Test control_csawmg_debug 091 failed in check_result failed
[RT] Error: Test control_csawmg_debug 091 failed in run_test failed
[RT] Error: Test control_csawmgt_debug 092 failed in check_result failed
[RT] Error: Test control_csawmgt_debug 092 failed in run_test failed
[RT] Error: Test control_ras_debug 093 failed in check_result failed
[RT] Error: Test control_ras_debug 093 failed in run_test failed
[RT] Error: Test control_diag_debug 094 failed in check_result failed
[RT] Error: Test control_diag_debug 094 failed in run_test failed
[RT] Error: Test control_debug_p8 095 failed in check_result failed
[RT] Error: Test control_debug_p8 095 failed in run_test failed
[RT] Error: Test regional_debug 096 failed in check_result failed
[RT] Error: Test regional_debug 096 failed in run_test failed
[RT] Error: Test rap_control_debug 097 failed in check_result failed
[RT] Error: Test rap_control_debug 097 failed in run_test failed
[RT] Error: Test hrrr_control_debug 098 failed in check_result failed
[RT] Error: Test hrrr_control_debug 098 failed in run_test failed
[RT] Error: Test rap_unified_drag_suite_debug 099 failed in check_result failed
[RT] Error: Test rap_unified_drag_suite_debug 099 failed in run_test failed
[RT] Error: Test rap_diag_debug 100 failed in check_result failed
[RT] Error: Test rap_diag_debug 100 failed in run_test failed
[RT] Error: Test rap_cires_ugwp_debug 101 failed in check_result failed
[RT] Error: Test rap_cires_ugwp_debug 101 failed in run_test failed
[RT] Error: Test rap_unified_ugwp_debug 102 failed in check_result failed
[RT] Error: Test rap_unified_ugwp_debug 102 failed in run_test failed
[RT] Error: Test rap_lndp_debug 103 failed in check_result failed
[RT] Error: Test rap_lndp_debug 103 failed in run_test failed
[RT] Error: Test rap_progcld_thompson_debug 104 failed in check_result failed
[RT] Error: Test rap_progcld_thompson_debug 104 failed in run_test failed
[RT] Error: Test rap_noah_debug 105 failed in check_result failed
[RT] Error: Test rap_noah_debug 105 failed in run_test failed
[RT] Error: Test rap_sfcdiff_debug 106 failed in check_result failed
[RT] Error: Test rap_sfcdiff_debug 106 failed in run_test failed
[RT] Error: Test rap_noah_sfcdiff_cires_ugwp_debug 107 failed in check_result failed
[RT] Error: Test rap_noah_sfcdiff_cires_ugwp_debug 107 failed in run_test failed
[RT] Error: Test rrfs_v1beta_debug 108 failed in check_result failed
[RT] Error: Test rrfs_v1beta_debug 108 failed in run_test failed
[RT] Error: Test rap_clm_lake_debug 109 failed in check_result failed
[RT] Error: Test rap_clm_lake_debug 109 failed in run_test failed
[RT] Error: Test rap_flake_debug 110 failed in check_result failed
[RT] Error: Test rap_flake_debug 110 failed in run_test failed
[RT] Error: Test control_wam_debug 111 failed in check_result failed
[RT] Error: Test control_wam_debug 111 failed in run_test failed
[RT] Error: Test regional_spp_sppt_shum_skeb_dyn32_phy32 112 failed in check_result failed
[RT] Error: Test regional_spp_sppt_shum_skeb_dyn32_phy32 112 failed in run_test failed
[RT] Error: Test rap_control_dyn32_phy32 113 failed in check_result failed
[RT] Error: Test rap_control_dyn32_phy32 113 failed in run_test failed
[RT] Error: Test hrrr_control_dyn32_phy32 114 failed in check_result failed
[RT] Error: Test hrrr_control_dyn32_phy32 114 failed in run_test failed
[RT] Error: Test rap_2threads_dyn32_phy32 115 failed in check_result failed
[RT] Error: Test rap_2threads_dyn32_phy32 115 failed in run_test failed
[RT] Error: Test hrrr_control_2threads_dyn32_phy32 116 failed in check_result failed
[RT] Error: Test hrrr_control_2threads_dyn32_phy32 116 failed in run_test failed
[RT] Error: Test hrrr_control_decomp_dyn32_phy32 117 failed in check_result failed
[RT] Error: Test hrrr_control_decomp_dyn32_phy32 117 failed in run_test failed
[RT] Error: Test rap_control_dyn64_phy32 120 failed in check_result failed
[RT] Error: Test rap_control_dyn64_phy32 120 failed in run_test failed
[RT] Error: Test rap_control_debug_dyn32_phy32 121 failed in check_result failed
[RT] Error: Test rap_control_debug_dyn32_phy32 121 failed in run_test failed
[RT] Error: Test hrrr_control_debug_dyn32_phy32 122 failed in check_result failed
[RT] Error: Test hrrr_control_debug_dyn32_phy32 122 failed in run_test failed
[RT] Error: Test rap_control_dyn64_phy32_debug 123 failed in check_result failed
[RT] Error: Test rap_control_dyn64_phy32_debug 123 failed in run_test failed
[RT] Error: Test datm_cdeps_control_cfsr 142 failed in check_result failed
[RT] Error: Test datm_cdeps_control_cfsr 142 failed in run_test failed
[RT] Error: Test datm_cdeps_control_gefs 144 failed in check_result failed
[RT] Error: Test datm_cdeps_control_gefs 144 failed in run_test failed
[RT] Error: Test datm_cdeps_iau_gefs 145 failed in check_result failed
[RT] Error: Test datm_cdeps_iau_gefs 145 failed in run_test failed
[RT] Error: Test datm_cdeps_stochy_gefs 146 failed in check_result failed
[RT] Error: Test datm_cdeps_stochy_gefs 146 failed in run_test failed
[RT] Error: Test datm_cdeps_ciceC_cfsr 147 failed in check_result failed
[RT] Error: Test datm_cdeps_ciceC_cfsr 147 failed in run_test failed
[RT] Error: Test datm_cdeps_bulk_cfsr 148 failed in check_result failed
[RT] Error: Test datm_cdeps_bulk_cfsr 148 failed in run_test failed
[RT] Error: Test datm_cdeps_bulk_gefs 149 failed in check_result failed
[RT] Error: Test datm_cdeps_bulk_gefs 149 failed in run_test failed
[RT] Error: Test datm_cdeps_mx025_cfsr 150 failed in check_result failed
[RT] Error: Test datm_cdeps_mx025_cfsr 150 failed in run_test failed
[RT] Error: Test datm_cdeps_mx025_gefs 151 failed in check_result failed
[RT] Error: Test datm_cdeps_mx025_gefs 151 failed in run_test failed
[RT] Error: Test datm_cdeps_multiple_files_cfsr 152 failed in check_result failed
[RT] Error: Test datm_cdeps_multiple_files_cfsr 152 failed in run_test failed
[RT] Error: Test datm_cdeps_3072x1536_cfsr 153 failed in check_result failed
[RT] Error: Test datm_cdeps_3072x1536_cfsr 153 failed in run_test failed
[RT] Error: Test datm_cdeps_gfs 154 failed in check_result failed
[RT] Error: Test datm_cdeps_gfs 154 failed in run_test failed
[RT] Error: Test datm_cdeps_debug_cfsr 155 failed in check_result failed
[RT] Error: Test datm_cdeps_debug_cfsr 155 failed in run_test failed
[RT] Error: Test datm_cdeps_control_cfsr_faster 156 failed in check_result failed
[RT] Error: Test datm_cdeps_control_cfsr_faster 156 failed in run_test failed
[RT] Error: Test datm_cdeps_lnd_gswp3 157 failed in check_result failed
[RT] Error: Test datm_cdeps_lnd_gswp3 157 failed in run_test failed
[RT] Error: Test regional_atmaq 165 failed in check_result failed
[RT] Error: Test regional_atmaq 165 failed in run_test failed
[RT] Error: Test regional_atmaq_faster 167 failed in check_result failed
[RT] Error: Test regional_atmaq_faster 167 failed in run_test failed
Please make changes and add the following label back: hera-intel-RT
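
Each failing test in the notification above appears twice, once for check_result and once for run_test. A minimal sketch of collapsing such a log into one entry per test follows. It assumes only the line shape visible in this thread ("Test <name> <number> failed in <stage> failed"); the function name `summarize_failures` is hypothetical and is not part of the RT scripts.

```python
import re
from collections import defaultdict

# Assumed line shape, copied from the log lines in this thread:
#   "[RT] Error: Test <name> <number> failed in <stage> failed"
# The "[RT] Error:" prefix is optional so the shorter form used in the
# PR description ("Test <name> <number> failed in run_test failed") also matches.
LINE_RE = re.compile(r"Test\s+(\S+)\s+(\d+)\s+failed in\s+(\S+)\s+failed")


def summarize_failures(log_text):
    """Return {(test number, test name): sorted list of failing stages}."""
    stages = defaultdict(set)
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            name, number, stage = match.groups()
            stages[(number, name)].add(stage)
    # Sort by test number (zero-padded strings sort correctly), then by name.
    return {key: sorted(value) for key, value in sorted(stages.items())}


sample = """\
[RT] Error: Test rap_control 058 failed in check_result failed
[RT] Error: Test rap_control 058 failed in run_test failed
[RT] Error: Test regional_atmaq 165 failed in check_result failed
[RT] Error: Test regional_atmaq 165 failed in run_test failed
Test control_c48 001 failed in run_test failed
"""

summary = summarize_failures(sample)
for (number, name), failed_stages in summary.items():
    print(number, name, ", ".join(failed_stages))
```

A test that lists both stages here compared against a baseline and failed the check. A test that lists only run_test would need a look at its run log instead.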

@DeniseWorthen
Collaborator

@jkbk2004 What happened w/ the hera.intel test? It looks like the 0510 baseline does not have all the tests in it.

@jkbk2004
Collaborator Author

> @jkbk2004 What happened w/ the hera.intel test? It looks like the 0510 baseline does not have all the tests in it.

It looks like we created new baselines on hera.intel only for the cases with changed results. I see missing-baseline complaints, and I am comparing those cases manually. It will take a bit of time, but I think I might be able to push a new hera.intel log. I will confirm.
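
The missing-baseline check described here amounts to a set difference between the tests the suite expects and the tests present in the new baseline. A minimal sketch follows. It assumes a layout with one subdirectory per test under the baseline root, which is an illustration only and not confirmed by this thread; the function name `missing_baselines` and the demo directory are hypothetical.

```python
import tempfile
from pathlib import Path


def missing_baselines(expected_tests, baseline_root):
    """Return the expected test names that have no directory under baseline_root.

    Assumption: each test stores its baseline in a subdirectory named after
    the test. Adjust the name mapping if the real layout differs.
    """
    root = Path(baseline_root)
    present = {entry.name for entry in root.iterdir() if entry.is_dir()}
    return sorted(set(expected_tests) - present)


# Demo against a throwaway directory that holds baselines for two tests only.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("control_p8", "cpld_control_p8"):
        (Path(tmp) / name).mkdir()

    expected = ["control_p8", "cpld_control_p8", "rap_control", "datm_cdeps_gfs"]
    missing = missing_baselines(expected, tmp)
    print(missing)
```

Running a check like this right after a baseline is created would flag a partial baseline before the full RT suite is queued against it.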

@DeniseWorthen
Collaborator

DeniseWorthen commented May 12, 2023

@jkbk2004 So you tried to just create new baselines for the failed tests to save time? Was that really a good idea---we waited all day in the Q for hera.intel just to have a massive list of failed tests.

@jkbk2004
Collaborator Author

> @jkbk2004 So you tried to just create new baselines for the failed tests to save time? Was that really a good idea---we waited all day in the Q for hera.intel just to have a massive list of failed tests.

It's not a good idea to run tests selectively with so many cases changing results. Miscommunication, I guess. @zach1221 FYI. But there is not much we can do about the hera fairshare issue.

@zach1221
Collaborator

@DeniseWorthen it was my fault. I misunderstood which baseline needed to be created as new, based on the PR description. My apologies. We're working to get it resolved as soon as possible.

@DeniseWorthen
Collaborator

Thanks @zach1221. I think the slow turnaround has gotten everyone frustrated. I know I am, having our main platform basically useless for us.

@zach1221
Collaborator

Hera.intel RT is done. Sending out final approvals.

Successfully merging this pull request may close these issues.

baseline time stamp
6 participants