-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tech doc updates (for master) #209
Conversation
…to reflect: a) The gmtb-scm code repository is now public. b) The EMC NEMS repository is no longer in VLab as it has been moved to GitHub in August 2019. GMTB is forking this repository and no longer mirrorring it. c) Given that the CCPP-enabling capabilities in the FV3, NEMS, and NEMSfv3gfs repositories were committed to EMC's master in July 2019, the primary branch for CCPP development in the GitHub.com/NCAR fork or mirrors is no longer a special GMTB area (gmtb/ccpp) but is now the primary code (master for NEMSfv3gfs and FV3 and develop for NEMS). The difference in terminogy (master versus develop) is due to the transition of EMC's codes to GitHub and use of GitFlow (NEMS is now on GitHub using GitFlow, so develop is the proper origin/upstream). d) A clarification on Section 7.4.1 (Creating a PR) asking developers to, when relevant, note in the PR message the existence of associated PRs in separate repositories.
… reflect changes in the NEMSfv3gfs regression tests implemented when the CCPP-enabling changes to FV3, NEMS, and NEMSfv3gfs were committed to EMC master in July 2019.
Codecov Report
@@ Coverage Diff @@
## master #209 +/- ##
=======================================
Coverage 47.28% 47.28%
=======================================
Files 14 14
Lines 1343 1343
=======================================
Hits 635 635
Misses 708 708 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least one typo plus some suggestions and questions.
@@ -342,7 +342,7 @@ Regression testing is the process of testing changes to the programs to make sur | |||
Overview of the RTs | |||
^^^^^^^^^^^^^^^^^^^ | |||
|
|||
The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master, compares the results from the non-CCPP code to the *official baseline* and is called ``rt.conf``. Before running the RT script ``rt.sh`` in the same directory, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. | |||
The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master is called ``rt.conf`` and runs four types of configurations: IPD PROD, IPD REPRO, CCPP PROD, and CCPP REPRO. For the IPD configurations, CCPP is not used, that is, the code is compiled with ``CCPP=N``. The PROD configurations use the compiler flags used in NCEP operations for for superior performance, while the REPRO configurations remove certain compiler flags to create b4b identical results between CCPP and IPD configurations. Before running the RT script ``rt.sh`` in directory ``./tests``, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"operations for for superior" ==> "operations for superior"
"location for the RT run directories underneath which directories" really needs a comma
==> "location for the RT run directories, underneath which directories"
I do not see where I would modify the script -- it looks like all environment variables. Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mods look fine to me. Good catch with the typo and comma.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will address the typo and comma. Variables RUNDIR_ROOT and NEMS_COMPILER can be set in your shell or entered in the script. Variable ACCNR needs to be set in the script. So, I think this part of the text is okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, export ACCNR=gmtb
and then running the script works fine.
|
||
This command and all others below produce log output in ``./tests/log_machine.compiler``. These log files contain information on the location of the run directories that can be used as templates for the user. Each ``rt*.conf`` contains one or more compile commands preceding a number of tests. | ||
|
||
|
||
Baselines | ||
^^^^^^^^^^^^^^^^^^^ | ||
|
||
Regression testing is only possible on machines for which baselines exist. EMC maintains *official baselines* of non-CCPP runs on *Jet* and *Wcoss* created with the Intel compiler. GMTB maintains additional baselines on *Theia, Jet, Cheyenne*, and *Gaea*. While GMTB is trying to keep up with changes to the official repositories, baselines maintained by GMTB are not guaranteed to be up-to-date. | ||
Regression testing is only possible on machines for which baselines exist. EMC maintains *official baselines* on *Theia* and *Wcoss* created with the Intel compiler. GMTB maintains additional baselines on *Jet*, *Cheyenne*, and *Gaea*. While GMTB is trying to keep up with changes to the official repositories, baselines maintained by GMTB are not guaranteed to be up-to-date. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are not guaranteed to be up-to-date
For each GMTB-stored baseline, it is clear which code version was tested? If so, I would state this (along with how to determine the tested code revision).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regression test baselines sit in directories .../trunk-YYYYMMDD/...
- the procedure is to copy the latest baseline directory over from theia (because it also contains the input data and some o the config files) and then update them by running the tests in "create" mode with the version of the code that corresponds to this date tag. Thus, either the directories are there (and the baseline is up to date), or they aren't there at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The connection between the baseline and the version of the code used to create the baseline is done solely by the date in the directory name of the baseline. In order to obtain the code used to create a baseline dated YYYYMMDD, a person would obtain top of the master as of this date.
To make this more clear, I will change text from "Note that yyyymmdd
is the year, month and day the RT was created." to "Note that yyyymmdd
is the year, month and day the baseline was created using top of master code."
+---------------------------------------------+----------------------+ | ||
| https://github.com/NCAR/ccpp-physics | master | | ||
+---------------------------------------------+----------------------+ | ||
| https://github.com/NCAR/ccpp-framework | master | | ||
+---------------------------------------------+----------------------+ | ||
| https://github.com/NCAR/NEMS | gmtb/ccpp | | ||
| https://github.com/NCAR/NEMS | develop | | ||
+---------------------------------------------+----------------------+ | ||
| https://github.com/NCAR/FMS | GFS-FMS | | ||
+---------------------------------------------+----------------------+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it unusual that users are pointed to branches rather than tags. What sort of users are targeted here?
Also, aren't most of the repositories checked out as submodules? Those never create branches (and usually do not even clone the full branch or repository).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe unusual, but equivalent. We are always keeping the branches listed above in sync, i.e. they head of each of them is guaranteed to work with all the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should list branches and tags. The branch information is needed if someone is going to create a pull request for a certain branch of NEMSfv3gfs and its submodules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But tags will change much more often than the technical documentation!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This documentation is for top of master. Target audience is model developers, which will likely be modifying one or more submodules of NEMSfv3gfs. Because they will be creating PR of innovations, they cannot work from fixed tags.
All repositories under NEMSfv3gfs are submodules. Those are ccpp-physics, ccpp-framework, FV3, NEMS etc. Developers will be modifying one or more of these submodules.
Note that we also offer a public release of CCPP with the Single-Column Model. In that case, we point to tags because the public release is stable and does not change.
@@ -67,9 +66,9 @@ https://github.com/NCAR/ccpp-framework | |||
|
|||
https://github.com/NCAR/ccpp-physics | |||
|
|||
Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories. Both CCPP repositories are public (no GitHub account required) and may be used directly to read or create forks. Write permission is generally restricted, however. The SCM repository is private, to request access please send a message and your GitHub username to gmtb-help@ucar.edu. | |||
Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories. Both CCPP repositories and the SCM repositories are public (no GitHub account required) and may be used directly to read or create forks. Write permission is generally restricted, however. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up on line 63, I think "As for NEMSfv3gfs" would be easier to understand as "As with NEMSfv3gfs"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will make this change, tks.
@@ -162,7 +161,7 @@ Start with checking out the main repository from the NCAR GitHub | |||
|
|||
.. code-block:: console | |||
|
|||
git clone -b gmtb/ccpp https://github.com/NCAR/NEMSfv3gfs | |||
git clone -b https://github.com/NCAR/NEMSfv3gfs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? I thought the -b
switch required a branch name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe you need to specify the branch name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the "-b", so it will read
git clone https://github.com/NCAR/NEMSfv3gfs
@@ -162,7 +161,7 @@ Start with checking out the main repository from the NCAR GitHub | |||
|
|||
.. code-block:: console | |||
|
|||
git clone -b gmtb/ccpp https://github.com/NCAR/NEMSfv3gfs | |||
git clone -b https://github.com/NCAR/NEMSfv3gfs | |||
cd NEMSfv3gfs | |||
git submodule init | |||
git submodule update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason this recipe is used instead of using --recurse-submodules
? Something like:
git clone -b master --recurse-submodules https://github.com/NCAR/NEMSfv3gfs
cd NEMSfv3gfs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can answer this one. NEMS contains a nested submodule, which at this point still lives on Vlab. The code of this submodule is not required for the "normal" user (only when running the NEMSCompset regression test version, which is different from the regression tests run through rt.sh). Because it is in Vlab, a --recursive
leads to a checkout error. Doing it the way Ligia described it avoids this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This update of the tech document looks good to me, with the corrections suggested by Steve and Dom.
- Fix typo and comma use - Make a few sentences more clear - Fix bug in git clone command
@@ -342,7 +342,7 @@ Regression testing is the process of testing changes to the programs to make sur | |||
Overview of the RTs | |||
^^^^^^^^^^^^^^^^^^^ | |||
|
|||
The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master, compares the results from the non-CCPP code to the *official baseline* and is called ``rt.conf``. Before running the RT script ``rt.sh`` in the same directory, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. | |||
The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master is called ``rt.conf`` and runs four types of configurations: IPD PROD, IPD REPRO, CCPP PROD, and CCPP REPRO. For the IPD configurations, CCPP is not used, that is, the code is compiled with ``CCPP=N``. The PROD configurations use the compiler flags used in NCEP operations for superior performance, while the REPRO configurations remove certain compiler flags to create b4b identical results between CCPP and IPD configurations. Before running the RT script ``rt.sh`` in directory ``./tests``, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories, underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand why an environment variable needs to be set in the script.
Is there a reason there cannot be a standard bash environment override such as:
ACCNR=${ACCNR:-""}
with a test for a blank ACCNR located before it is used (say around line 555)?
if [ -z "${ACCNR}" ]; then
echo "ERROR: set ACCNR to a valid account key"
fi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment somewhere else in this thread. export ACCR=gmtb
works just as fine, I am doing this all the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See here: https://github.com/NCAR/NEMSfv3gfs/blob/8a1e235d57a13d40e8ba71d3ce174d62dfa4f265/tests/rt.sh#L569 for theia and some other platforms, no ACCNR is set a priori. The reason is that there used to be a script that was supposed to determine the account number automatically (based on the user's groups and certain rules), but this doesn't work anymore since the transition to slurm (and didn't work so well either beforehand), hence no default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the script could be modified (around line 555) to override a shell environment variable. However, at this time I am just documenting the script as it is. I will change the text from "the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are..." to "the user has to set one or more environment variables on the working shell or in the script. The environment variables are..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify the situation.
On the WCOSS platforms (all phases/partitions), the variable ACCNR is set explicitly in the script and the user has to modify the corresponding lines. On all other platforms (cheyenne, theia, jet, gaea), ACCNR is explicitly NOT set in the host environment, and for one of them there is even a comment as to why:
# DO NOT SET AN ACCOUNT EVERYONE IS NOT A MEMBER OF
# USE AN ENVIRONMENT VARIABLE TO SET ACCOUNT
# ACCNR=cmp
This allows the user to do exactly what the comment says.
This is on purpose I think, because only the code managers and others involved in the actual hand-off to NCO are working on these systems. We should probably restrict our documentation to the user development platforms, in which case all variables can and should be set in the form of environment variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving but I believe the user interface could be improved and simplified with a few lines of code in rt.sh.
…ession test script rt.sh.
Let me know if the following language is acceptable. If not, please suggest a language. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience with my picky comments on rt.sh. The script may not be the most user-friendly, but it was written (not by me!) for a specific purpose and it does serve this purpose quite well.
Approved, waiting to see if Grant wants to add something.
…Test script rt.sh.
@@ -435,25 +435,6 @@ In case a user does not have write permissions to ``$STMP (/scratch4/NCEPDEV/stm | |||
RUNDIR_ROOT=${RUNDIR_ROOT:-${PTMP}/${USER}/FV3_RT}/rt_$$ | |||
|
|||
|
|||
Non-CCPP vs CCPP Tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked this explanation of creating your own baseline. Perhaps it is not the job of CCPP documentation to explain to users how to do so, but is this information found anywhere else? I realize that the specifics of non-CCPP vs CCPP in this section is longer valid, but the underlying instructions regarding personal baselines was nice and I'm sad to see it go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved.
Updates to the CCPP Technical Documentation to reflect changes in the code management and regression tests of CCPP when used with NEMSfv3gfs. No source code was altered, so no regression tests are needed.