Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new unit tests sumchk and bcstchk and update tests #606

Merged
merged 6 commits into from
Jun 10, 2021

Conversation

apcraig
Copy link
Contributor

@apcraig apcraig commented Jun 7, 2021

PR checklist

  • Short (1 sentence) summary of your PR:
    Update testing and add sumchk and bcstchk unit tests
  • Developer(s):
    apcraig
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    Full tests suites run on cheyenne on 3 compilers, https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#d7d59a49665d432dfd06216d16915c3b52d4b98f. Note that test results for tx1 change because the ns_boundary_type was changed from 'open' to 'tripole'.
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

Update testing

  • Set default debug_model_step=0 (was 99999999)
  • Add debug_model_[i,j,iblk,task] to define the debug_model diagnostic point in local grid index space. If this point is not set and debug_model is turned on, it will use lonpnt(1),latpnt(1).
  • Rename forcing_diag namelist/variable to debug_forcing to be more consistent with other "debug_" namelist variables
  • Rename the local variable forcing_debug in ice_forcing.F90 to local_debug to avoid confusion with global varaible debug_forcing.
  • Add namelist variable optics_file and optics_file_fieldname. Was hardwired in ice_forcing_bgc.F90.
  • Update setting of nbtrcr_sw and allocation of trcrn_sw. nbtrcr_sw was not set in icepack after it was computed and trcrn_sw was allocated before nbtrcr_sw was computed. This impacts the dedd_algae implementation which still isn't working.
  • move default distribution_wgt_file for gx1 to set_nml.gx1
  • update test suite, add testing of debug_model_[i,j,iblk,task], add addtional testing of maskhalo
  • add sumchk unit test to test ice_global_reduction methods including on center and Nface locations
  • add bcstchk unit test to test ice_broadcast methods
  • add get_rank to ice_communicate.F90
  • add global_[min/max]val_scalar_int_nodist method to ice_global_reductions.F90
  • add tripole output in ice_blocks.F90 with debug_blocks
  • add grid_type and ns_boundary_type tripole check
  • update set_nml.tx1 to set ns_boundary_type to 'tripole', was 'open'. NOTE: this changes answers for tx1 tests
  • update lsum16 global sum to revert to double precision if NO_R16 is set
  • update documentation

- Set default debug_model_step=0 (was 99999999)
- Add debug_model_[i,j,iblk,task] to define the debug_model diagnostic point
  in local grid index space.  If this point is not set and debug_model
  is turned on, it will use lonpnt(1),latpnt(1).
- Rename forcing_diag namelist/variable to debug_forcing to be more
  consistent with other "debug_" namelist variables
- Rename the local variable forcing_debug in ice_forcing.F90 to local_debug
  to avoid confusion with global varaible debug_forcing.
- Add namelist variable optics_file.  Was hardwired in ice_forcing_bgc.F90
- Update optics file variable name to read, still hardwired in model.
- Update setting of nbtrcr_sw and allocation of trcrn_sw.   nbtrcr_sw
  was not set in icepack after it was computed and trcrn_sw was allocated
  before nbtrcr_sw was computed.  This impacts the dedd_algae implementation
  which still isn't working.
- move default distribution_wgt_file for gx1 to set_nml.gx1
- update test suite, add testing of debug_model_[i,j,iblk,task], add addtional
  testing of maskhalo
- update documentation
- update ice_broadcast to sync up serial and mpi versions
- add get_rank to ice_communicate.F90
- add global_[min/max]val_scalar_int_nodist method to ice_global_reductions.F90
- add tripole output in ice_blocks.F90 with debug_blocks
- update set_nml.tx1 to set ns_boundary_type to 'tripole', was 'open'
Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a big chunk of work, making the code more consistent and (hopefully) robust. Please add some documentation that explain what the unit tests are doing.

@@ -1683,6 +1685,56 @@ function global_maxval_scalar_int (scalar, dist) &

end function global_maxval_scalar_int

!***********************************************************************

function global_maxval_scalar_int_nodist (scalar, communicator) &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are the new global_max/minval_scalar_int_nodist functions needed?

Copy link
Contributor Author

@apcraig apcraig Jun 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. The answer is a little complicated. I added this because I needed it for testing. In the bcstchk (ice_broadcast check) unit test, I broadcast data from one pe to others using the ice_broadcast (mpi or serial) infrastructure. Then I have to validate that. To validate it, I have each pe check it's results and then I broadcast that result to all pes, so now there is a global understanding of whether the test passed. To do that, I need MPI when running on multiple pes but I can't have any MPI when running on a single pe testing serial. So, that broadcast (which is really implemented as an MPI_MAX global reduction) needed to live in the serial/mpi directories rather than in the unit test because we don't have a "NO_MPI" cpp yet. Because all of the global_reduction infrastructure requires a "distribution" (but the ice_broadcast does not and the bcstchk has no grid/decomposition defined), I decided to implement a global reduction with just a communicator, basically as a simple layer on MPI, but it had to be in the serial/mpi directories.

If this is undesirable, there are a few other things I could do

  • I could leverage the existing global reduction infrastructure, but this would mean the bcstchk unit tester would have to initialize a grid/decomposition even though it's not needed for the unit test.
  • We could add a cpp something like "NO_MPI" and I could then leverage that in the bcstchk itself and implement the MPI global reduction there.

I prefer not to do the first option. The second option is something that maybe we want to do eventually anyway, so I could start to implement it now. All the mpi and "no_mpi" code is segregated into the serial and mpi directories right now. Ultimately, we may want to merge those directories in which case we'd need a cpp.

!
! author: Phil Jones, LANL
! Oct. 2004: Adapted from POP version by William H. Lipscomb, LANL

#ifndef SERIAL_REMOVE_MPI
use mpi ! MPI Fortran module
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the serial comm directory, why have anything "mpi"? Is this the first step in merging the comm directories?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Experience tells us that maintaining multiple "versions" of the same code (i.e. comm/mpi and comm/serial or io/io_binary, io/io_netcdf, and io/io_pio2) is not straight-forward. What often happens is a change is made in one version but not another. And that inconsistency can take a while to discover. Another option is to merge the code together, but then CPPs are needed. Ultimately, it's a tradeoff between having multiple sets of source code and using CPPs.

In this case, where possible, I'm trying to use the same code on the mpi and serial directories minus a CPP that is hardcoded into the serial side. ice_global_reductions.F90 and ice_broadcast.F90 are implemented this way. That means a simple "diff" can tell whether the codes are synced up. In practice, what I'm doing is introducing the CPPs while keeping two separate directories. So a bit of the worst of both scenarios. But until we decide to merge the directories and use CPPs, I feel it's a viable way to maintain the code in the two directories.

cicecore/drivers/unittest/bcstchk/bcstchk.F90 Show resolved Hide resolved
@@ -3,6 +3,7 @@ runtype = 'initial'
ice_ic = 'default'
grid_format = 'bin'
grid_type = 'tripole'
ns_boundary_type = 'tripole'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There probably should be a check during the initialization that if grid_type = 'tripole', then ns_boundary_type = 'tripole'. I haven't heard of any east-west tripole grids, and I don't think CICE's boundary conditions could handle that anyhow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, I'll add it.

@@ -2948,6 +2951,10 @@ subroutine init_zbgc
endif
if (.NOT. dEdd_algae) nbtrcr_sw = 1

! tcraig, added 6/1/21, why is nbtrcr_sw set here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took a quick look at the code and it wasn't obvious. I suspect that there's 2 considerations which conflict with each other, making it confusing:

  1. The need the count up all of the tracer indices, to make sure we have the right number
  2. An attempt to keep the BGC code as modular as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes an issue in the current bgc sw implementation, but there are still others are pointed out in #437.

Comment on lines 971 to 972
! fieldname='bcint_enh_mam_cice'
fieldname='modalBCabsorptionParameter5band'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to keep this comment ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. I think ultimately we should have a namelist specify the fieldname. Let me implement that now and I'll test and push an update.

@@ -654,6 +658,7 @@ zbgc_nml
"``mu_max_phaeo``", "real", "maximum growth rate phaeocystis per day", "0.851"
"``mu_max_sp``", "real", "maximum growth rate small plankton per day", "0.851"
"``nitratetype``", "real", "mobility type between stationary and mobile nitrate", "-1.0"
"``optics_file``", "string", "optics file associated with modal aersols", "unknown_optics_file"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/aersols/aerosols/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you.

Comment on lines 12 to 13
use ice_constants, only: field_loc_center
use ice_constants, only: field_loc_Nface
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why this is changed here...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sumchk is testing the ice_global_reduction infrastructure. If you look at that infrastructure, it relies heavily on the information contained in the grid and decomposition including the location and size of the halo, whether the grid is a tripole grid, and so forth. For non-tripole grids, the global reductions operate on the grids (sans halo areas) and the field_loc value makes no difference. However, for the tripole grids, there is special logic for field_loc_Nface and field_loc_NEface in order to remove redundant points on the tripole zipper.

I have added a test of sumchk with the tx1 grid and in order to trigger the tripole logic in the global reduction, I have to set the field location as Nface or NEface. I picked Nface arbitrarily. This has no impact on non-tripole grids. If I stick with "center" for tripole grids, I don't trigger the tripole logic. One could argue I should test center and Nface for tripole and non-tripole grids, but I decided to just do Nface for both. Maybe I'll revise that.

- add grid_type and ns_boundary_type tripole check
- update sumchk unit test to check both Nface and center points.  these are treated
  differently for tripole grids.
- update documentation of unit tests
@apcraig
Copy link
Contributor Author

apcraig commented Jun 7, 2021

I updated the documentation, added optics_file_fieldname namelist, and extended the sumchk test to include center and Nface field locations. A brief description of each unit test is now at the top of each unit test .F90 file and in the user guide, https://cice-consortium-cice--606.org.readthedocs.build/en/606/user_guide/ug_testing.html#unit-testing.

@apcraig apcraig merged commit d6eb125 into CICE-Consortium:master Jun 10, 2021
@apcraig apcraig mentioned this pull request Aug 10, 2021
phil-blain added a commit to phil-blain/CICE that referenced this pull request Nov 5, 2021
'forcing_diag' was renamed to 'debug_forcing' in d6eb125 (Add new unit
tests sumchk and bcstchk and update tests (CICE-Consortium#606), 2021-06-09), but this
instance was not renamed in the doc.
apcraig pushed a commit that referenced this pull request Nov 5, 2021
'forcing_diag' was renamed to 'debug_forcing' in d6eb125 (Add new unit
tests sumchk and bcstchk and update tests (#606), 2021-06-09), but this
instance was not renamed in the doc.
@apcraig apcraig deleted the cov02 branch August 17, 2022 20:58
phil-blain added a commit to phil-blain/CICE that referenced this pull request Apr 10, 2023
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests
(CICE-Consortium#606), 2021-06-09), the namelist variables
'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed
to print diagnostics for the grid point specified by those variables, if
they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The
"Diagnostic files" subsection of the  "Implementation" section of the
User guide was updated, but not the "Debugging hints" subsection of the
"Troubleshooting" section, where the various namelist flags and
subroutines useful for debugging are mentioned.

Adjust the description of 'debug_model' there, and also mention that in
contrast to 'print_points', this option outputs diagnostics several
times per time step.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Sep 1, 2023
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests
(CICE-Consortium#606), 2021-06-09), the namelist variables
'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed
to print diagnostics for the grid point specified by those variables, if
they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The
"Diagnostic files" subsection of the  "Implementation" section of the
User guide was updated, but not the "Debugging hints" subsection of the
"Troubleshooting" section, where the various namelist flags and
subroutines useful for debugging are mentioned.

Adjust the description of 'debug_model' there, and also mention that in
contrast to 'print_points', this option outputs diagnostics several
times per time step. While at it, mention that 'print_points' output is
every 'diagfreq' time steps.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Jan 26, 2024
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests
(CICE-Consortium#606), 2021-06-09), the namelist variables
'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed
to print diagnostics for the grid point specified by those variables, if
they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The
"Diagnostic files" subsection of the  "Implementation" section of the
User guide was updated, but not the "Debugging hints" subsection of the
"Troubleshooting" section, where the various namelist flags and
subroutines useful for debugging are mentioned.

Adjust the description of 'debug_model' there, and also mention that in
contrast to 'print_points', this option outputs diagnostics several
times per time step. While at it, mention that 'print_points' output is
every 'diagfreq' time steps.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Feb 2, 2024
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests
(CICE-Consortium#606), 2021-06-09), the namelist variables
'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed
to print diagnostics for the grid point specified by those variables, if
they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The
"Diagnostic files" subsection of the  "Implementation" section of the
User guide was updated, but not the "Debugging hints" subsection of the
"Troubleshooting" section, where the various namelist flags and
subroutines useful for debugging are mentioned.

Adjust the description of 'debug_model' there, and also mention that in
contrast to 'print_points', this option outputs diagnostics several
times per time step. While at it, mention that 'print_points' output is
every 'diagfreq' time steps.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Feb 9, 2024
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests
(CICE-Consortium#606), 2021-06-09), the namelist variables
'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed
to print diagnostics for the grid point specified by those variables, if
they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The
"Diagnostic files" subsection of the  "Implementation" section of the
User guide was updated, but not the "Debugging hints" subsection of the
"Troubleshooting" section, where the various namelist flags and
subroutines useful for debugging are mentioned.

Adjust the description of 'debug_model' there, and also mention that in
contrast to 'print_points', this option outputs diagnostics several
times per time step. While at it, mention that 'print_points' output is
every 'diagfreq' time steps.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Feb 12, 2024
In d6eb125 (Add new unit tests sumchk and bcstchk and update tests
(CICE-Consortium#606), 2021-06-09), the namelist variables
'debug_model_{i,j,iblk,task}' were added and 'debug_model' was changed
to print diagnostics for the grid point specified by those variables, if
they are defined, and fall back to (lonpnt(1), latpnt(1)) instead. The
"Diagnostic files" subsection of the  "Implementation" section of the
User guide was updated, but not the "Debugging hints" subsection of the
"Troubleshooting" section, where the various namelist flags and
subroutines useful for debugging are mentioned.

Adjust the description of 'debug_model' there, and also mention that in
contrast to 'print_points', this option outputs diagnostics several
times per time step. While at it, mention that 'print_points' output is
every 'diagfreq' time steps.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants