Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fast vectorized EVP kernel #205

Closed
wants to merge 4 commits into from
Closed

Conversation

mhrib
Copy link
Contributor

@mhrib mhrib commented Oct 12, 2018

New vectorized EVP kernel:

  • vectorized - contiguous vectors.
  • Running OpenMP on the GLOBAL net on one single node ie. reduce communication and avoiding HALO updates to every single block every iteration.
  • Able to run CICE mixed MPI+OpenMP on several nodes. Use Node "my_task==master_task" for EVP iterations:
  • Works for "regular" circular grids - eg. GX3. Ie. do a "circular" halo update.
  • Not tested for tri-pole grids. Doubt that it works. A stop is introduced in the code.
  • Namelist: Shift between "CICE standard" EVP and this EVP via. namelist. Default is evp_kernel=0 = "CICE standard"
  • Use environment "OMP_NUM_THREADS" to select number of threads used by the evp_kernel.
  • There is a natural "overhead" associated with gather/scatter and defining vectors. Therefore it becomes more efficient for "large" setups.

CODE CORRECTION??: Both valid for this kernel, but also for the CICE stepu subroutine:

  • Is "strocnx,strocny" used at all from subroutine "stepu"? Is it needed here? It is also calculated in subroutine "dyn_finish" short after the EVP-loop. Therefore I have removed the calculation in this EVP_kernel, but it can easily be re-introduced (using slightly more memory) by searching for: "!strocn"

--o--

Namelist:
&dynamics_nml
evp_kernel_ver = 0 ! 0: CICE (default) , 2: kernel_v2

Environment:
OMP_NUM_THREADS

Option: REAL4 internally instead of REAL8:

  • Not implemented as a namelist switch, but can be tried out simply by this "poor-man-option":

mv evp_kernel1d.F90 evp_kernel1d_r8.F90
cat evp_kernel1d_r8.F90 | sed s/DBL_KIND/REAL_KIND/g > evp_kernel1d.F90

--o--

  • Developer(s):
    Mads Hvid Ribergaard and Jacob Weismann Poulsen, DMI

  • Are the code changes bit for bit, different at roundoff level, or more substantial?
    In most cases it should be "bit-to-bit". Depending on how the code is translated during compilation and which math libs are used.

  • Does this PR create or have dependencies on Icepack or any other models?
    Not other than CICE normally have

  • Is the documentation being updated with this PR? No
    If not, does the documentation need to be updated separately at a later time?

  • Other Relevant Details:

Basic structure

  • evp_copyin() : gather
  • evp_kernel() : loop stress/stepu/halo_update
  • evp_copyout() : scatter
    There is a natural "overhead" associated with gather/scatter and defining vectors. Therefore it becomes more efficient for "large" setups.

Affected files:
./cicecore/cicedynB/dynamics/evp_kernel1d.F90 (The core, new file)
./cicecore/cicedynB/dynamics/ice_dyn_evp.F90 (switch between CICE/new kernels)
./cicecore/cicedynB/dynamics/ice_dyn_shared.F90 (namelist)
./cicecore/cicedynB/general/ice_init.F90 (namelist)
./cicecore/cicedynB/infrastructure/comm/mpi/ice_gather_scatter.F90 ()
./cicecore/cicedynB/infrastructure/comm/serial/ice_gather_scatter.F90 (
)
(*) Updated gather_scatter_ext to take care of integers + logicals (icetmask,iceumask)

Possible OMP issues (see below):
./cicecore/cicedynB/analysis/ice_diagnostics.F90 (2x)
./cicecore/cicedynB/analysis/ice_history.F90
./cicecore/cicedynB/dynamics/ice_dyn_eap.F90
./cicecore/cicedynB/dynamics/ice_transport_driver.F90 (2x)
./cicecore/cicedynB/dynamics/ice_transport_remap.F90
./cicecore/cicedynB/dynamics/ice_dyn_evp.F90
./cicecore/cicedynB/general/ice_init.F90
./cicecore/drivers/cice/CICE_RunMod.F90

OpenMP Issues ??:

  • Maybe OpenMP raises?. I have not tested it carefully, but a previous CICE version shows some raises. I have added a comment line just before the OMPs, that I did comment out last time. But all OMP's stays un-commented in this PR.
    Search for "!MHRI: CHECK THIS OMP"
    NOTE: I did only un-comment the OMPs to check its runs smoothly. Ie. this only means, that there is possible a thread-issue in one of the files - not necessary all of them.

@mhrib
Copy link
Contributor Author

mhrib commented Nov 26, 2018

This PR can be closed as it is replaced by #252.

@mhrib mhrib closed this Nov 26, 2018
@apcraig apcraig mentioned this pull request Feb 1, 2019
@mhrib mhrib deleted the EVP_kernel branch November 8, 2023 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants