-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor tracer initialization #235
Conversation
Some clarifying details:
I also started removing redundant info from the namelist, for the 'easy' tracers (age, etc). For those, I kept the on/off flags (tr_iage, etc) and set the number of tracers associated with that flag (n_trage etc) in the code instead of namelist. Note: it would be better to be consistent with the names of these numbers and use n_iage, n_FY, n_lvl, n_pnd instead of n_trage etc, if possible. |
I have completed step 1 and think I understand the rest, but am having some trouble understanding step 2. Can you clarify. When I look at n_trage, it is a local variable and is used only in a limited way,
Looking at n_trbri, if this is a namelist, why are we keeping it a namelist and overwriting the value based on tr_brine (another namelist)? What is happening here? And should the right hand side of the equals sign always be 1? I see it's 2 for tr_lvl. Do we actually want this,
instead of
thus preserving the namelist value? Also, just to be clear, the n_ are the number of tracers for that tracer group? nt_ is the start index of that tracer group in the tracer array? tr_ is a logical indicating whether that tracer group is active. Correct? It seems like n_ could just be hardwired for most tracers and never changed. tr_ would indicate whether it's active and nt_ would index it (if it's active) in the tracer array. Why do we need to zero out the n_ variables if the tracer is inactive. It that used as a flag later in the code instead of the tr_ flag? Also, the variable names and their relationship are not easy to figure out. Just look at this,
why aren't we using n_trbgcs instead of n_trzs? It would be nice to have greater consistency between these variable names, could we require a direct naming relationship between n_tr[var] and tr_[var] or something similar. I know this might not be possible in all cases, but at least for as many as we can. I find it a bit confusing. |
Yes, the names are confusing and this is why I suggested simplifying them. It helps to remember where they came from -- the n_trage etc were originally the cpp defs, and they were there so they could be turned off to reduce the memory footprint of the big tracer array. Many of them were just 1, i.e. on/off, and we also had a namelist flag that could then turn off the process or capability at runtime. Now with dynamic allocation, the n_trage etc are redundant in namelist, but they are still useful in the code to count up the total number of tracers for the allocation. So I started using the logical flags tr_[trcr] as the primary input, setting the tracer numbers n_[trcr] in the code (except for BGC, see below). E.g. tr_brine should be in namelist, n_trbri should be moved out of namelist and preferably renamed to n_brine. It would be assumed 0 unless tr_brine=T, in which case it's 1. Some of the tracer flags turn on/off more than one tracer (e.g. lvl and some bgc).
Yes to all of that. The n_ variables are zeroed out because of the (old) bit of code that adds them up all at once, which we can delete (item 3b above). I think n_aero and other bgc numbers are used elsewhere in the code; my preference is to initialize all of the n_ variables to 0 and then set them to whatever they need to be based on the logical flags. The logical flags should wrap ALL related tracer calculations -- the logicals are the primary, defining inputs. The wrench in all of this is that we have the option of using an arbitrary (up to some max) number of bgc tracers. I decided not to hardwire the bgc tracers to a particular number, mainly because of the aerosols in CESM and I'm not sure if @njeffery varies the number that she uses, depending on the scenario. For those, we'll still have the logical for consistency with the other tracers, and it will override the numbers being nonzero. However if one of the bgc numbers is zero, then it makes sense to set the logical to false, to prevent loop weirdness. I'd like to rename the flags and numbers to be more consistent -- need to make a list of them...... |
Thanks, that's helpful. I'm actually writing up a list myself right now, just to better understand what's there. Having the count_tracers subroutine with everything together helps a lot. I'l share the list, should have it in a few minutes. May have more questions then. |
It is useful to be able to vary the bgc tracer numbers to suit the forcing or the ocean bgc model.
n
…________________________________
From: Elizabeth Hunke <notifications@github.com>
Sent: Monday, November 12, 2018 4:15:30 PM
To: CICE-Consortium/CICE
Cc: Jeffery, Nicole; Mention
Subject: Re: [CICE-Consortium/CICE] Refactor tracer initialization (#235)
Yes, the names are confusing and this is why I suggested simplifying them. It helps to remember where they came from -- the n_trage etc were originally the cpp defs, and they were there so they could be turned off to reduce the memory footprint of the big tracer array. Many of them were just 1, i.e. on/off, and we also had a namelist flag that could then turn off the process or capability at runtime. Now with dynamic allocation, the n_trage etc are redundant in namelist, but they are still useful in the code to count up the total number of tracers for the allocation. So I started using the logical flags tr_[trcr] as the primary input, setting the tracer numbers n_[trcr] in the code (except for BGC, see below). E.g. tr_brine should be in namelist, n_trbri should be moved out of namelist and preferably renamed to n_brine. It would be assumed 0 unless tr_brine=T, in which case it's 1. Some of the tracer flags turn on/off more than one tracer (e.g. lvl and some bgc).
Also, just to be clear, the n_ are the number of tracers for that tracer group? nt_ is the start index of that tracer group in the tracer array? tr_ is a logical indicating whether that tracer group is active. Correct? It seems like n_ could just be hardwired for most tracers and never changed. tr_ would indicate whether it's active and nt_ would index it (if it's active) in the tracer array. Why do we need to zero out the n_ variables if the tracer is inactive. It that used as a flag later in the code instead of the tr_ flag?
Yes to all of that. The n_ variables are zeroed out because of the (old) bit of code that adds them up all at once, which we can delete (item 3b above). I think n_aero and other bgc numbers are used elsewhere in the code; my preference is to initialize all of the n_ variables to 0 and then set them to whatever they need to be based on the logical flags. The logical flags should wrap ALL related tracer calculations -- the logicals are the primary, defining inputs.
The wrench in all of this is that we have the option of using an arbitrary (up to some max) number of bgc tracers. I decided not to hardwire the bgc tracers to a particular number, mainly because of the aerosols in CESM and I'm not sure if @njeffery<https://github.com/njeffery> varies the number that she uses, depending on the scenario. For those, we'll still have the logical for consistency with the other tracers, and it will override the numbers being nonzero. However if one of the bgc numbers is zero, then it makes sense to set the logical to false, to prevent loop weirdness.
I'd like to rename the flags and numbers to be more consistent -- need to make a list of them......
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#235 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AH1K6iwpaD_y5kaASSXRlHR2ukiDXHkJks5uugESgaJpZM4YV62_>.
|
…ad code, add input init aborts for bgc
I just pushed a bunch of changes. These should address 1, 2, 3, and 4 on the list. Some of 8 is also done, abort checks on the bgc input have been added, and 9 variables were removed from CICE, 5 are namelist (max_ntrcr, max_nsw, n_bgc, nltrcr, n_trbgcz, n_trzs, n_trzaero, n_trbri, n_trbgcs). I am working on 5 and 7 now, I think they are related. |
I have verified bit-for-bit on conrad for all 4 compilers. The only failure was the alt01 which requires an update to the namelist setting (not yet done) to be compatible with the updates to the input checks. |
I just pushed a few more changes, including updates to alt01. @eclare108213, have a look when you have a chance and lets figure out what still needs to be done. |
This looks good, @apcraig, thanks! I'll pull the changes and test them here at LANL. These are some notes just from looking through the code changes, which either of us might address: Once we start changing Icepack, let's fix this to properly use the logical flag (unless nltrcr is a multiplier), from ice_step_mod.F90: Do we have a test with all pond parameterizations off and using the ccsm3 shortwave? Or do we need to keep this in alt01 and put the formdrag test in a different option set? For abort_flag=16 in ice_init.F90, the error message should be
I'm a little concerned that we might get out-of-bounds errors when aerosols are off. For now, we are specifying a single array index for tracers that are not in use, but the aero tracers needs many indices because they are always treated in loops like
That's why we initialized nt_aero as
in the original "bad kludge". If we have a testing option that has aerosols turned off and debugging on, it should catch the problem, if it exists. Remove the format statements from the bottom of input_data? |
I will run some tests, especially debug with n_aero set to 0. I'll also try to address some of the comments with code changes. I will test on conrad. |
running debug tests on conrad for all alt and box configs. getting some aborts. will try to fix. format statements are needed in input_data from what I can see. I'll look at the alt tests and try to move formdrag to a different alt test. |
I just read through ice_init.F90 and ice_init_column.F90, where the biggest changes are, and fixed a few things, nothing answer-changing. My base_suite built and the runs are still sitting in the queue. From the original list above, it looks like 6 is not started, 5 and 7 are partially complete, and everything else is done along with some additional enhancements that weren't in the list. 5, 6, 7 could become a 'cleanup' task in a new PR. 7 is captured in issue #238. |
I just pushed a new update. I added several new debug tests and updated the alt01 (now with cesm shortwave and no ponds) and alt04 (now with lvl ponds and formdrag) tests. Things are working pretty well. A couple of the new debug tests with the box configuration are failing, but all the alt tests with debug on are passing. The nt_aero and n_aero settings do not seem to be causing any problems in debug mode. Remember that when tr_aero is off, nt_aero is pointing at the unused tracer and n_aero=0. So it's likely the array sections in arguments are valid, and as long as they're not being used, should be OK. That seems to be true based on testing. See current results here, hash 770bd8f. https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks I am not particularly worried about the box failure. We should try to fix it, but I would not hold up the release for it either. I also agree that 6 on the list was never done, 5 and 7 were started but need quite a bit more thought and refactoring. @eclare108213 before or after the merge, could you review the model output for settings/namelist. I think we should make sure we are writing out the right stuff and that it's happening in the correct place during the init sequence. I also think if there are any other "abort checks" that could be added, that we should try to do that. I think it would be good to merge this before the weekend if we can, and we feel it's ready. We can continue to additional changes on a new branch as needed. I think this refactor has been very good to do! |
Question for @njeffery: We are checking the numbers of tracers that are allocated - are these checks correct, or e.g. should we check that n_fep+n_fed <= icepack_max_fe and similarly for doc, dic?
|
@apcraig I suggest to go ahead and merge, if you are happy with this so far, to get a new baseline since alt01 changed (right? That’s the only fail in my baseline tests yesterday). Then we can continue cleaning up in a new PR. |
This was forgotten back in 8b0ae03 (Refactor tracer initialization (CICE-Consortium#235), 2018-11-16)
This was forgotten back in 8b0ae03 (Refactor tracer initialization (CICE-Consortium#235), 2018-11-16)
This was forgotten back in 8b0ae03 (Refactor tracer initialization (CICE-Consortium#235), 2018-11-16)
This was forgotten back in 8b0ae03 (Refactor tracer initialization (CICE-Consortium#235), 2018-11-16)
This was forgotten back in 8b0ae03 (Refactor tracer initialization (CICE-Consortium#235), 2018-11-16)
* drivers/hadgem3: add missing 'subname' and use existing 'subname's * drivers/hadgem3/CICE_InitMod: update 'init_lvl' call Add the required 'iblk' argument. * drivers/hadgem3/CICE_RunMod: remove uneeded 'dt' arguments The subroutines 'prep_radiation', 'zsal_diags', 'bgc_diags' and 'hbrine_diags' do not take a 'dt' argument anymore, so remove it. * drivers/hadgem3/CICE_RunMod: get 'Lsub' from Icepack * drivers/hadgem3/CICE_RunMod: remove 'da_state_update' subroutine This subroutine is inside an 'ICE_DA' CPP, which is not used in any configuration. Remove it. * drivers/hadgem3/CICE_RunMod: remove stray '+' This '+' sign was copy-pasted there in error in 29b99b6 (CICE: Floe size distribution (#382), 2019-12-07). Remove it. * drivers/hadgem3: remove obsolete 'check_finished_file' subroutine Remove the call to 'check_finished_file' as well as the definition of the subroutine, as the 'hadgem3' driver is not used on machine 'bering' and it's unclear if machine 'bering' still exists. * drivers/hadgem3: fix cice_init so it calls 'count_tracers' This was forgotten back in 8b0ae03 (Refactor tracer initialization (#235), 2018-11-16) * drivers/hadgem3/CICE_RunMod: add call to 'save_init' The hadgem3 driver was not updated when 'save_init' was added in 83686a3 (Implement box model test from 2001 JCP paper (#151), 2018-10-22). As this subroutine is necessary to ensure proper initialization of the model, add it now. * drivers/hadgem3/CICE_RunMod: tweak loop indices in 'coupling_prep' Other drivers use 'ilo,ihi' and 'jlo,jhi' here. Do the same. * drivers/hagdem3: update driver to new time manager * drivers/hadgem3: pass 'mpi_comm_opa' explicitely to init_communicate In 066070e (Fix minor issues in documentation, key_ CPPs, bfbcomp return codes (#532), 2020-11-23), the 'ice_communicate' module was updated to remove CPP macros relating to the OASIS coupler (key_oasis*) and to the NEMO ocean model (key_iomput). These CPPs were used to make the correct MPI communicator accessible to the 'init_communicate' subroutine. However, that subroutine already accepts an optional MPI communicator as argument and it was deemed cleaner to require the driver layer to explicitely pass the communicator instead of making it accessible through 'use' statements. Update the 'hadgem3' driver, used for coupling with NEMO, to explicitely pass the NEMO MPI communicator 'mpi_comm_opa' to 'init_communicate'. * drivers: add 'nemo_concepts' driver Historically the 'hadgem3' driver has been used when compiling a single NEMO-CICE executable at ECCC. Going forward, all driver-level adjustements will be done in our own driver, 'nemo_concepts', 'CONCEPTS' being the name of the multi-departmental collaboration around using the NEMO ocean model. Copy CICE_InitMod, CICE_RunMod and CICE_FinalMod from the 'hadgem3' directory to a new 'nemo_concepts' directory under 'drivers/direct'. The following commits will clean up this new driver and port over some in-house adjustments. * drivers/nemo_concepts: remove unused 'writeout_finished_file' subroutine This subroutine was only called on machine 'bering', which is not an ECCC machine and probably does not exist anymore anyway. Remove it. * drivers/nemo_concepts: call 'scale_fluxes' with 'aice_init' Since 'merge_fluxes' is called with aice_init, it is more consistent to also call 'scale_fluxes', in 'coupling_prep' with 'aice_init' instead of 'aice'. Copy this in-house change to the new 'nemo_concepts' driver.
Cleaning up tracer initialization and namelists
Developer(s): E. Hunke
Please suggest code Pull Request reviewers in the column at right.
Are the code changes bit for bit, different at roundoff level, or more substantial? BFB
Please include the link to test results or paste the summary block from the bottom of the testing output below.
161 of 165 tests PASSED
0 of 165 tests FAILED
3 of 165 tests PENDING
Does this PR create or have dependencies on Icepack or any other models? Yes.
remove initializations for multiple init calls Icepack#230
Is the documentation being updated with this PR? (Y/N) No
If not, does the documentation need to be updated separately at a later time? (Y/N) Yes
Note: "Documentation" includes information on the wiki and .rst files in doc/source/,
which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.
This code is NOT READY to be merged. It's a mess (but it's a bit-for bit mess). It'll be a few days before I get back to this, so if anyone wants to jump in, go for it... I hope it's now arranged in the proper order to dynamically allocate the tracer arrays using ntrcr.
Here are the steps I suggest be done next, not necessarily in this order -- these are somewhat negotiable.
combine input_data2 and input_zbgc0 into a routine called “count_tracers”
keep n_trbgcz, n_aero, n_zaero, n_algae, n_doc etc in namelist, and remove the following n_[trcr] and set them in the code based on namelist logical flags:
I did this for the age, FY, lvl and pnd tracers.
in the combined input_data2 and input_zbgc0:
a. instead of this:
do this:
b. remove the old calculation of max_ntrcr (or use it as a sanity check)
change skl_bgc to tr_bgcs
change z_tracers to tr_bgcz
FYI: for max numbers for bgc tracers, see options/set_env.bgc in icepack