Tech doc updates (for master) #209

Merged
merged 5 commits into from
Aug 22, 2019
Changes from 3 commits
27 changes: 4 additions & 23 deletions doc/CCPPtechnical/source/BuildingRunningHostModels.rst
@@ -342,7 +342,7 @@ Regression testing is the process of testing changes to the programs to make sur
Overview of the RTs
^^^^^^^^^^^^^^^^^^^

The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master, compares the results from the non-CCPP code to the *official baseline* and is called ``rt.conf``. Before running the RT script ``rt.sh`` in the same directory, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree.
The RT configuration files are located in ``./tests`` relative to the top-level directory of NEMSfv3gfs and have names ``rt*.conf``. The default RT configuration file, supplied with the NEMSfv3gfs master, is called ``rt.conf`` and runs four types of configurations: IPD PROD, IPD REPRO, CCPP PROD, and CCPP REPRO. For the IPD configurations, CCPP is not used, that is, the code is compiled with ``CCPP=N``. The PROD configurations use the compiler flags used in NCEP operations for superior performance, while the REPRO configurations remove certain compiler flags to create bit-for-bit (b4b) identical results between CCPP and IPD configurations. Before running the RT script ``rt.sh`` in directory ``./tests``, the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are ``ACCNR`` (mandatory unless the user is a member of the default project *nems*; sets the account to be charged for running the RTs), ``NEMS_COMPILER`` (optional for the ``intel`` compiler option, set to ``gnu`` to switch), and potentially ``RUNDIR_ROOT``. ``RUNDIR_ROOT`` allows the user to specify an alternative location for the RT run directories, underneath which directories called ``rt_$PID`` are created (``$PID`` is the process identifier of the ``rt.sh`` invocation). This may be required on systems where the user does not have write permissions in the default run directory tree.
Collaborator

I do not understand why an environment variable needs to be set in the script.
Is there a reason there cannot be a standard bash environment override such as:

ACCNR=${ACCNR:-""}

with a test for a blank ACCNR located before it is used (say around line 555)?

if [ -z "${ACCNR}" ]; then
    echo "ERROR: set ACCNR to a valid account key"
fi
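
A minimal sketch of how the override and the check suggested above could be combined into one helper. The function name `check_accnr` is hypothetical and is not part of `rt.sh`; the account value `gmtb` in the usage note is only an example.

```shell
# Hypothetical helper, not part of rt.sh: keep an ACCNR supplied by the
# caller's environment and fail early when none was provided.
check_accnr() {
    # Standard override: use the caller's value, otherwise an empty string.
    ACCNR=${ACCNR:-""}
    if [ -z "${ACCNR}" ]; then
        echo "ERROR: set ACCNR to a valid account key" >&2
        return 1
    fi
    echo "Using account: ${ACCNR}"
}
```

With this pattern, `export ACCNR=gmtb` followed by `check_accnr` reports the account in use, while an unset or empty ACCNR produces the error message and a non-zero status.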

Collaborator

See my comment elsewhere in this thread. export ACCNR=gmtb works just fine; I do this all the time.

Collaborator

See here: https://github.com/NCAR/NEMSfv3gfs/blob/8a1e235d57a13d40e8ba71d3ce174d62dfa4f265/tests/rt.sh#L569
For theia and some other platforms, no ACCNR is set a priori. The reason is that there used to be a script that was supposed to determine the account number automatically (based on the user's groups and certain rules), but it no longer works since the transition to slurm (and did not work well beforehand either), hence no default.

Collaborator Author

I believe the script could be modified (around line 555) to override a shell environment variable. However, at this time I am just documenting the script as it is. I will change the text from "the user has to set one or more environment variables and potentially modify the script to change the location of the automatically created run directories. The environment variables are..." to "the user has to set one or more environment variables on the working shell or in the script. The environment variables are..."

Collaborator

Just to clarify the situation.

On the WCOSS platforms (all phases/partitions), the variable ACCNR is set explicitly in the script and the user has to modify the corresponding lines. On all other platforms (cheyenne, theia, jet, gaea), ACCNR is explicitly NOT set in the host environment, and for one of them there is even a comment as to why:

#  DO NOT SET AN ACCOUNT EVERYONE IS NOT A MEMBER OF
#  USE AN ENVIRONMENT VARIABLE TO SET ACCOUNT
#  ACCNR=cmp

This allows the user to do exactly what the comment says.

This is on purpose I think, because only the code managers and others involved in the actual hand-off to NCO are working on these systems. We should probably restrict our documentation to the user development platforms, in which case all variables can and should be set in the form of environment variables.


.. code-block:: console

@@ -356,15 +356,15 @@ Running the full default RT suite defined in ``rt.conf`` using the script ``rt.s

./rt.sh -f

This command can only be used on a NOAA machine using the Intel compiler, where the output of a non-CCPP build using the default Intel version is compared against the *official baseline*. For information on testing the CCPP code, or using alternate computational platforms, see the following sections.
This command can only be used on a NOAA machine using the Intel compiler, where an *official baseline* is available. For information on testing the CCPP code, or using alternate computational platforms, see the following sections.

This command and all others below produce log output in ``./tests/log_machine.compiler``. These log files contain information on the location of the run directories that can be used as templates for the user. Each ``rt*.conf`` contains one or more compile commands preceding a number of tests.


Baselines
^^^^^^^^^^^^^^^^^^^

Regression testing is only possible on machines for which baselines exist. EMC maintains *official baselines* of non-CCPP runs on *Jet* and *Wcoss* created with the Intel compiler. GMTB maintains additional baselines on *Theia, Jet, Cheyenne*, and *Gaea*. While GMTB is trying to keep up with changes to the official repositories, baselines maintained by GMTB are not guaranteed to be up-to-date.
Regression testing is only possible on machines for which baselines exist. EMC maintains *official baselines* on *Theia* and *Wcoss* created with the Intel compiler. GMTB maintains additional baselines on *Jet*, *Cheyenne*, and *Gaea*. While GMTB is trying to keep up with changes to the official repositories, baselines maintained by GMTB are not guaranteed to be up-to-date.
Collaborator

are not guaranteed to be up-to-date

For each GMTB-stored baseline, is it clear which code version was tested? If so, I would state this (along with how to determine the tested code revision).

Collaborator

Regression test baselines sit in directories .../trunk-YYYYMMDD/... The procedure is to copy the latest baseline directory over from theia (because it also contains the input data and some of the config files) and then update it by running the tests in "create" mode with the version of the code that corresponds to this date tag. Thus, either the directories are there (and the baseline is up to date), or they aren't there at all.

Collaborator Author

The connection between the baseline and the version of the code used to create the baseline is done solely by the date in the directory name of the baseline. In order to obtain the code used to create a baseline dated YYYYMMDD, a person would obtain top of the master as of this date.
To make this more clear, I will change text from "Note that yyyymmdd is the year, month and day the RT was created." to "Note that yyyymmdd is the year, month and day the baseline was created using top of master code."
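
The date convention described above can be sketched as follows. Both function names are hypothetical, and the mapping from a date tag to a commit is an assumption: it picks the last commit on the branch before the end of that day, which may not be exactly the revision used to create the baseline.

```shell
# Hypothetical helpers; names are illustrative and not part of the RT scripts.

# Print the yyyymmdd tag embedded in a baseline path such as
# .../trunk-yyyymmdd or .../trunk-yyyymmdd/${COMPILER}.
baseline_date() {
    rest=${1##*trunk-}           # drop everything up to and including "trunk-"
    echo "$rest" | cut -c1-8     # keep the eight date digits
}

# Print the last commit on a branch (default: master) made before the end of
# the baseline's date. Must be run inside a clone of the repository.
commit_for_baseline() {
    tag=$(baseline_date "$1")
    day="$(echo "$tag" | cut -c1-4)-$(echo "$tag" | cut -c5-6)-$(echo "$tag" | cut -c7-8)"
    git rev-list -n 1 --before="${day} 23:59:59" "${2:-master}"
}
```

For example, `baseline_date "$RTPWD"` would recover the date tag from the baseline directory, and `commit_for_baseline "$RTPWD"` would suggest a candidate commit to check out.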


When porting the code to a new machine, it is useful to start by establishing a *personal baseline*. Future runs of the RT can then be compared against the *personal baseline* to ascertain that the results have not been inadvertently affected by code developments. The ``rt.sh -c`` option is used to create a *personal baseline*.

@@ -404,7 +404,7 @@ The *official baseline* directory is defined as:
RTPWD=$DISKNM/trunk-yyyymmdd/${COMPILER} # on Cheyenne
RTPWD=$DISKNM/trunk-yyyymmdd # elsewhere

Note that ``yyyymmdd`` is the year, month and day the RT was created.
Note that ``yyyymmdd`` is the year, month and day the baseline was created using top of master code.

.. warning:: Modifying ``$DISKNM`` will break the RTs!

@@ -435,25 +435,6 @@ In case a user does not have write permissions to ``$STMP (/scratch4/NCEPDEV/stm
RUNDIR_ROOT=${RUNDIR_ROOT:-${PTMP}/${USER}/FV3_RT}/rt_$$
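
As a hedged illustration of the expansion on the line above: a `RUNDIR_ROOT` exported by the user takes precedence, otherwise the default parent directory is used, and in both cases `rt_` plus an identifier is appended. The helper name and its arguments are hypothetical; `rt.sh` itself appends `$$`, the process ID of the running shell.

```shell
# Hypothetical helper mirroring the RUNDIR_ROOT expansion; not part of rt.sh.
# $1: default parent directory (stands in for ${PTMP}/${USER}/FV3_RT)
# $2: identifier to append (rt.sh uses $$, the shell's process ID)
resolve_rundir_root() {
    echo "${RUNDIR_ROOT:-$1}/rt_$2"
}
```

A user without write permission in the default tree would therefore only need to export `RUNDIR_ROOT` before invoking `rt.sh`.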


Non-CCPP vs CCPP Tests
Collaborator

I liked this explanation of creating your own baseline. Perhaps it is not the job of CCPP documentation to explain to users how to do so, but is this information found anywhere else? I realize that the specifics of non-CCPP vs CCPP in this section are no longer valid, but the underlying instructions regarding personal baselines were nice and I'm sad to see them go.

^^^^^^^^^^^^^^^^^^^^^^

While the official EMC RTs do not execute the CCPP code, GMTB provides RTs to exercise the CCPP in its various modes: ``rt_ccpp_standalone.conf`` tests the CCPP with dynamic build and ``rt_ccpp_static.conf`` tests the CCPP with static build. These tests compare the results of runs done using the CCPP against a previously generated *personal baseline* created without the CCPP by running ``rt_ccpp_ref.conf``. For this comparison, both the non-CCPP *personal baseline* and the tests using the CCPP are performed with code built with the :term:`REPRO` compiler options.

The command below should be used to create a *personal baseline* using non-CCPP code compiled in :term:`REPRO` mode.

.. code-block:: console

./rt.sh -l rt_ccpp_ref.conf -c fv3 # create own reg. test baseline

Once the *personal baseline* in REPRO mode has been created, the CCPP tests can be run to compare against it. Use the ``-l`` option to select the test suite and the ``-m`` option to compare against the *personal baseline*.

.. code-block:: console

./rt.sh -l rt_ccpp_standalone.conf -m # dynamic build
./rt.sh -l rt_ccpp_static.conf -m # static build


Compatibility between the Code Base, the SDF, and the Namelist in the UFS Atmosphere
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

27 changes: 14 additions & 13 deletions doc/CCPPtechnical/source/CodeManagement.rst
@@ -14,7 +14,7 @@ This chapter describes the organization of the code, provides instruction on the
UFS Atmosphere
-----------------------

The :term:`UFS` Atmosphere source code is contained in the NEMSfv3gfs code repository. CCPP users and developers should use the NEMSfv3gfs code and submodules maintained by GMTB in GitHub. For codes whose authoritative repository is in VLab, GMTB synchronizes these with VLab at periodic intervals.
The :term:`UFS` Atmosphere source code is contained in the NEMSfv3gfs code repository. CCPP users and developers should use the NEMSfv3gfs code and submodules maintained by GMTB in GitHub. For codes whose authoritative repository is managed by EMC, GMTB synchronizes these with EMC's at periodic intervals. Note that EMC is incrementally transitioning its repositories from VLab to GitHub, and therefore this code management is expected to change in the future.

https://github.com/NCAR/NEMSfv3gfs

@@ -32,23 +32,22 @@ https://github.com/NCAR/FMS

Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories.

Some of these repositories are public (no GitHub account required) and some are private. The public repositories (ccpp-framework, ccpp-physics, and Flexible Modeling System - FMS) may be used directly to read or create forks. Write permission is generally restricted, however. The private repositories require access - please send a request and your GitHub username to gmtb-help@ucar.edu.

The primary development by GMTB, including the latest CCPP developments, are maintained in the following branches:
Some of these repositories are public (no GitHub account required) and some are private. The public repositories (ccpp-framework, ccpp-physics, NEMS, and Flexible Modeling System - FMS) may be used directly to read or create forks. Write permission is generally restricted, however. The private repositories require access - please send a request and your GitHub username to gmtb-help@ucar.edu.

The following branches are recommended for CCPP users and developers:

+---------------------------------------------+----------------------+
| Repository (GMTB development version) | Branch name |
+=============================================+======================+
| https://github.com/NCAR/NEMSfv3gfs | gmtb/ccpp |
| https://github.com/NCAR/NEMSfv3gfs | master |
+---------------------------------------------+----------------------+
| https://github.com/NCAR/FV3 | gmtb/ccpp |
| https://github.com/NCAR/FV3 | master |
+---------------------------------------------+----------------------+
| https://github.com/NCAR/ccpp-physics | master |
+---------------------------------------------+----------------------+
| https://github.com/NCAR/ccpp-framework | master |
+---------------------------------------------+----------------------+
| https://github.com/NCAR/NEMS | gmtb/ccpp |
| https://github.com/NCAR/NEMS | develop |
+---------------------------------------------+----------------------+
| https://github.com/NCAR/FMS | GFS-FMS |
+---------------------------------------------+----------------------+
Collaborator

I find it unusual that users are pointed to branches rather than tags. What sort of users are targeted here?
Also, aren't most of the repositories checked out as submodules? Those never create branches (and usually do not even clone the full branch or repository).

Collaborator

Maybe unusual, but equivalent. We always keep the branches listed above in sync, i.e. the head of each of them is guaranteed to work with all the others.

Collaborator

Maybe we should list branches and tags. The branch information is needed if someone is going to create a pull request for a certain branch of NEMSfv3gfs and its submodules.

Collaborator

But tags will change much more often than the technical documentation!

Collaborator Author

This documentation is for top of master. The target audience is model developers, who will likely be modifying one or more submodules of NEMSfv3gfs. Because they will be creating PRs for their innovations, they cannot work from fixed tags.
All repositories under NEMSfv3gfs are submodules. Those are ccpp-physics, ccpp-framework, FV3, NEMS etc. Developers will be modifying one or more of these submodules.
Note that we also offer a public release of CCPP with the Single-Column Model. In that case, we point to tags because the public release is stable and does not change.
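
For reference, the branch recommendations in the updated table of this diff can be written as a small lookup. This is only a sketch; the function name is hypothetical, and the branch names are copied from the new table rows (master for NEMSfv3gfs, FV3, ccpp-physics and ccpp-framework; develop for NEMS; GFS-FMS for FMS).

```shell
# Hypothetical lookup; branch names taken from the updated table in this PR.
recommended_branch() {
    case "$1" in
        NEMSfv3gfs|FV3|ccpp-physics|ccpp-framework) echo "master" ;;
        NEMS) echo "develop" ;;
        FMS) echo "GFS-FMS" ;;
        *) echo "unknown repository: $1" >&2; return 1 ;;
    esac
}
```

A developer preparing a PR against NEMS would then target the branch printed by `recommended_branch NEMS`.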

@@ -61,15 +60,15 @@ CCPP developers should use the SCM code and submodules maintained by GMTB in Git

https://github.com/NCAR/gmtb-scm

As for NEMSfv3gfs, there are two submodules referenced in the gmtb-scm repository:
As with NEMSfv3gfs, there are two submodules referenced in the gmtb-scm repository:

https://github.com/NCAR/ccpp-framework

https://github.com/NCAR/ccpp-physics

Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories. Both CCPP repositories are public (no GitHub account required) and may be used directly to read or create forks. Write permission is generally restricted, however. The SCM repository is private, to request access please send a message and your GitHub username to gmtb-help@ucar.edu.
Users have read-only access to these repositories and as such cannot accidentally destroy any important (shared) branches of these authoritative repositories. Both CCPP repositories and the SCM repository are public (no GitHub account required) and may be used directly to read or create forks. Write permission is generally restricted, however.
Collaborator

Up on line 63, I think "As for NEMSfv3gfs" would be easier to understand as "As with NEMSfv3gfs"

Collaborator Author

I will make this change, tks.


The primary development by GMTB, including the latest CCPP developments, are maintained in the following branches:
The following branches are recommended for CCPP users and developers:

+----------------------------------------+-------------------+
| Repository (GMTB development version) | Branch name |
@@ -162,7 +161,7 @@ Start with checking out the main repository from the NCAR GitHub

.. code-block:: console

git clone -b gmtb/ccpp https://github.com/NCAR/NEMSfv3gfs
git clone https://github.com/NCAR/NEMSfv3gfs
cd NEMSfv3gfs
git submodule init
git submodule update
Collaborator

Is there a reason this recipe is used instead of using --recurse-submodules? Something like:

git clone -b master --recurse-submodules https://github.com/NCAR/NEMSfv3gfs
cd NEMSfv3gfs

Collaborator

I can answer this one. NEMS contains a nested submodule, which at this point still lives on VLab. The code of this submodule is not required for the "normal" user (only when running the NEMSCompset regression test version, which is different from the regression tests run through rt.sh). Because it is in VLab, a --recursive leads to a checkout error. Doing it the way Ligia described avoids this problem.
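
A minimal sketch of the non-recursive checkout described above. The helper names `run` and `clone_top_level_only` and the `DRY_RUN` switch are assumptions added for illustration; the git commands are those from the documentation, with `git submodule update --init` used as the one-step form of `git submodule init` followed by `git submodule update`. Without `--recursive`, nested submodules such as the one inside NEMS are not fetched.

```shell
# Hypothetical helpers; not part of NEMSfv3gfs.

# Echo the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Clone a branch, then initialize and update first-level submodules only
# (no --recursive, so nested submodules are left untouched).
# $1: repository URL, $2: branch, $3: target directory
clone_top_level_only() {
    run git clone -b "$2" "$1" "$3" &&
    run git -C "$3" submodule update --init
}
```

Setting `DRY_RUN=1` before calling `clone_top_level_only https://github.com/NCAR/NEMSfv3gfs master NEMSfv3gfs` prints the two git commands without contacting any server.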

@@ -181,7 +180,7 @@ Checking out remote branches as submodules means that your local branches are in

cd NEMS
git remote update
git checkout upstream/gmtb/ccpp
git checkout upstream/develop
cd ..

However, if you are making changes in a repository (submodule or main repository), you must create a local branch, for example in NEMSfv3gfs:
@@ -217,7 +216,7 @@ As opposed to branches without modifications described in step 3, changes to the

cd FV3
git remote update
git pull upstream gmtb/ccpp
git pull upstream master


-----------------------------------
@@ -371,6 +370,8 @@ Go to the github.com web interface, and navigate to your repository fork and bra
| Fill in a detailed description, including reporting on any testing you did
| Click on “Create pull request”

If your development also requires changes in other repositories, you must open PRs in those repositories as well. In the PR message for each repository, please note the associated PRs submitted to other repositories.

Several people (aka CODEOWNERS) are automatically added to the list of reviewers on the right hand side. If others should be reviewing the code, click on the “reviewers” item on the right hand side and enter their GitHub usernames.

Once the PR has been approved, the change is merged to master by one of the code owners. If there are pending conflicts, this means that the code is not up to date with the trunk. To resolve those, pull the target branch from upstream as described above, solve the conflicts and push the changes to the branch on your fork (this also updates the PR).