Add support for nvhpc on ncar machines #192

Open
wants to merge 15 commits into
base: main


mnlevy1981
Collaborator

I also added a "--debug" command line argument to build_examples.sh, along with some module load statements for running on the NCAR machines. The module loads are skipped on casper nodes (including crhtc nodes).
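
The host check described above might look something like the following. This is only a minimal sketch, not the code in build_examples.sh: the function name `needs_module_loads` and the hostname prefixes are assumptions made for illustration, and no real module names are shown.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 (true) when module load statements should run.
# The hostname patterns below are illustrative assumptions, not taken from the PR.
needs_module_loads() {
  local host="$1"
  case "${host}" in
    casper*|crhtc*) return 1 ;;  # skip on casper nodes, including crhtc nodes
    *)              return 0 ;;  # any other host: run the module loads
  esac
}

if needs_module_loads "$(uname -n)"; then
  echo "would run module load statements here"
else
  echo "skipping module loads on this host"
fi
```

Keeping the decision in one small function makes it easy to test with made-up hostnames without touching the module environment.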

@mnlevy1981
Collaborator Author

It would be great if the compiler flags set in standalone/templates/ncar-*.mk matched what is used in CESM. Eventually, I think we want to use the CIME build system (or at least cime/CIME/scripts/configure to create the Makefile based on configure_machines.xml), but as a short-term fix I'd like to get the various NCAR templates cleaned up -- they are all modifications of different files from mom-ocean/mkmf, so they look fairly different from one another. If they could all follow the same basic pattern and then also get FFLAGS and CFLAGS "right" (matching CESM for that compiler), that would be a useful temporary step.

To run MOM standalone, use ./build_examples_cesm.sh just like
./build_examples.sh. To use the CESM compiler flags (or at least the
rough conversion I've done), add a "--cesm" arg to the run. NVHPC
with the CESM flags is not working, and the Intel debug build makes it
clear there is an issue. The GNU and Intel CESM versions (non-debug)
build and run. See the templates for the changes between non-CESM and
CESM flags.

If you would like to use the builds without building yourself, check out
this repo on derecho in
/glade/u/home/manishrv/documents/installs/mom_interface_pr_192/components/mom/standalone/build.
…,GNU, INTEL)

Made changes to the intel and nvhpc compiler flags so they build properly
and run with double_gyre. Unfortunately, that meant diverging slightly from
the CESM compiler flags; see the comments in each template for more
information. All of them can now be built and run, but each template needs
some workshopping to make sure it is good. For example, I took out
-Ktrap=fp in NVHPC DEBUG, and I still need to confirm that it can be taken out.
@mnlevy1981 mnlevy1981 marked this pull request as ready for review September 24, 2024 17:12
Member

@alperaltuntas alperaltuntas left a comment


@mnlevy1981, @manishvenu, was there a specific reason for removing the REPRO flag? Could we consider adding it back, with the option to control it through a CLI argument like --repro, similar to how --debug works?

@manishvenu
Collaborator

manishvenu commented Sep 24, 2024

> @mnlevy1981, @manishvenu, was there a specific reason for removing the REPRO flag? Could we consider adding it back, with the option to control it through a CLI argument like --repro, similar to how --debug works?

Hey @alperaltuntas ,

I think @mnlevy1981 would be better able to answer on the general reasoning hah, but, loosely, I think we discussed there not being a need for OPT, VERBOSE, or OPENMP.

So, on the code side, REPRO is now the default mode: if "--debug" isn't passed, we default to REPRO. The flags that were removed were OPT, VERBOSE, and OPENMP.

Happy to re-add any of them and test them really quick if need be.

Thanks,
Manish

@mnlevy1981
Collaborator Author

> @mnlevy1981, @manishvenu, was there a specific reason for removing the REPRO flag? Could we consider adding it back, with the option to control it through a CLI argument like --repro, similar to how --debug works?

yeah, @manishvenu summed it up -- the previous templates set

DEBUG =
REPRO =
VERBOSE =
OPENMP =

but we set REPRO=1 and didn't have a mechanism for changing any of the other variables. We definitely want to be able to set DEBUG=1, but DEBUG and REPRO are mutually exclusive so the default is REPRO=1,DEBUG= and the --debug flag sets REPRO=,DEBUG=1 instead. I removed the OPENMP options because we'll let the GPU-ification program come up with useful flags, and we can add VERBOSE back in later if there is ever a need for it.
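
The mutually exclusive defaults described above can be sketched in shell. This is a minimal sketch under stated assumptions, not the actual contents of build_examples.sh: the function name `set_build_mode` is hypothetical, and only the "--debug" flag and the REPRO / DEBUG variable names come from this discussion.

```shell
#!/bin/bash
# Hypothetical sketch: choose between REPRO (the default) and DEBUG builds.
# Exactly one of the two variables is non-empty after the function returns.
set_build_mode() {
  REPRO=1
  DEBUG=
  local arg
  for arg in "$@"; do
    case "${arg}" in
      --debug) REPRO=; DEBUG=1 ;;  # --debug turns REPRO off and DEBUG on
    esac
  done
}

set_build_mode "$@"
# These values could then be forwarded to make as REPRO=... DEBUG=... (assumption).
echo "REPRO=${REPRO} DEBUG=${DEBUG}"
```

Because both variables are assigned together in one place, there is no way to end up with REPRO=1 and DEBUG=1 at the same time.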

Member

@alperaltuntas alperaltuntas left a comment


Awesome, thanks!

@mnlevy1981
Collaborator Author

An update -- the changes @manishvenu put in have let us do more extensive testing in the standalone driver, and that testing has found issues in both MOM6 and MARBL. I'm going to address those with PRs (the MARBL PR is marbl-ecosys/MARBL#470, but the MOM6 PR hasn't been opened yet) and update .gitmodules, at which point this PR will be ready to merge. [I also have one more commit to make the different standalone/templates/ncar* files more consistent.]

General formatting is the same; the only differences are the actual flags and
whatnot used by the different compilers
@mnlevy1981
Collaborator Author

MOM6 PR mentioned in #192 (comment) is NCAR/MOM6#305

3 participants