Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frontend for TSMP2 #24

Merged
merged 25 commits into from
Sep 26, 2024
Merged

Frontend for TSMP2 #24

merged 25 commits into from
Sep 26, 2024

Conversation

s-poll
Copy link
Member

@s-poll s-poll commented Aug 29, 2024

This PR introduces a top-level script that makes building TSMP2 models simple and easy to understand and enhances the cmake script with additional options.

The main changes include:

  • The introduction of submodules for the model components.
  • An extension of CmakeLists.txt. The model components are now explicitly switched on (defaults off). The path to the source code of the model components is now set to the submodules by default.
  • A top-level shell script build_tsmp2.sh has been introduced.

Smaller changes:

  • Log the building
  • Take default machine environment.
  • copy the environment file to the install_dir

The defaults for building are defined in CmakeLists.txt with the exception of the source and build directory.

Why was the top-level shell script chosen?

  • CMake does not seem to be able to choose the source and build directory depending on the options. A suggested solution from Stackoverflow was to use an additional cmake-based script, which then calls the cmake command from TSMP2. However, even though this variant has the advantage that each option can be easily passed on, it is less obvious which options are set in cmakelists.txt and which options are set in script.cmake. Code duplication would unfortunately be unavoidable. A shell script makes the separation between the build environment and the front-end clear.

@jjokella : Even though, if the changes should not affect you, would be great if you could have a check, if this effects pdaf.

README.md Outdated Show resolved Hide resolved
build_tsmp2.sh Outdated Show resolved Hide resolved
@jjokella
Copy link
Contributor

Thanks @s-poll , this PR looks great!

I had a slight confusion with the ouptut in the log file:

== TSMP2 settings ==                                                                                                                                                                           
====================                                                                                                                                                                           
MODEL-ID: eCLM-PDAF                                                                                                                                                                            
TSMP2DIR: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend                                                                                                                               
BUILDDIR: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bld/JURECADC_eCLM-PDAF                                                                                                        
INSTALLDIR: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bin/JURECADC_eCLM-PDAF                                                                                                      
CMAKE command:                                                                                                                                                                                 
cmake -S /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend -B /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bld/JURECADC_eCLM-PDAF    -DeCLM=ON -DPDAF=ON      -DCMAKE_INSTALL_PR
EFIX=/p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bin/JURECADC_eCLM-PDAF |& tee /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bld/eCLM-PDAF_2024-08-30_17-52.log            eCLMSRC: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/models/eCLM                                                                                                                    
MODEL_ID: eCLM                                                                                                                                                                                 
MODELCOUNT: 1   

In particular the MODEL-ID and MODEL_ID output. I think the first comes from build_tsmp2.sh, the second then from CMake (could not find exact point). Maybe it can be made clearer, which outputs are from the shell-wrapper, in particular where the output from the shell wrapper ends.

@jjokella
Copy link
Contributor

jjokella commented Aug 30, 2024

Regarding PDAF, I was up to now using PDAF_SRC as the check if PDAF is activated or not.

cmake/BuildeCLM.cmake:13:if(DEFINED PDAF_SRC)
cmake/BuildeCLM.cmake:43:if(DEFINED PDAF_SRC)
cmake/BuildOASIS3MCT.cmake:37:if(DEFINED PDAF_SRC)
cmake/BuildCLM3.5.cmake:8:if(DEFINED PDAF_SRC)

I guess, I should replace if(DEFINED PDAF_SRC) by if(${PDAF})?

The existing code now breaks for eCLM-PDAF (at least of PDAF_SRC is not explicitly set), because PDAF_SRC is not set anymore during compilation of eCLM.

With this change implemented, I get a NetCDF-error in the very last step:

ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_def_var_zstandard'
ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_def_var_quantize'
ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_inq_var_quantize'
ld: /p/software/jurecadc/stages/2023/software/netCDF-Fortran/4.6.0-ipsmpi-2022a/lib/libnetcdff.so: undefined reference to `nc_inq_var_zstandard'
make[3]: *** [Makefile:144: /p/scratch/cjibg36/keller5/DA/TSMP-PDAF/TSMP2-frontend/bin/JURECADC_eCLM-PDAF/bin/tsmp-pdaf] Error 1
gmake[2]: *** [CMakeFiles/PDAF-Framework.dir/build.make:86: PDAF-Framework/src/PDAF-Framework-stamp/PDAF-Framework-build] Error 2
gmake[1]: *** [CMakeFiles/Makefile2:167: CMakeFiles/PDAF-Framework.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2

With that, I have to leave it for now. Maybe you know, what could be changed to the CMake-commands that could lead to NetCDF-issue like that.

@kvrigor kvrigor self-requested a review September 2, 2024 08:05
-clean up confusing messages in CMakeLists.txt
-add message to indicate cmake call
-change ICON build according to model component not comp_srcs
Copy link
Member

@kvrigor kvrigor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @s-poll! I have left a number of comments on build_tsmp2.sh. When writing a front-end script, we have to be more careful on designing the interface, since interfaces (or APIs in general) are expected to not change over time. Here are the general principles I observe when doing so:

  1. Follow established syntax conventions for writing command-line options. See 1, 2)
  2. Avoid feature bloat. it's better to have fewer options than more. A shellscript with few options is easier to maintain and easier to use. Ideally, the front-end should have a small set of options that covers most simple use cases. If more build options are necessary, either use cmake directly or pass those extra options to cmake via the -- syntax. Here's an example of how -- can be used:
$ cmake --build
Usage: cmake --build <dir>             [options] [-- [native-options]]
       cmake --build --preset <preset> [options] [-- [native-options]]
Options:
  <dir>          = Project binary directory to be built.
  --preset <preset>, --preset=<preset>
                 = Specify a build preset.
  --list-presets
                 = List available build presets.
  --parallel [<jobs>], -j [<jobs>]
                 = Build in parallel using the given number of jobs. 
                   If <jobs> is omitted the native build tool's 
                   default number is used.
                   The CMAKE_BUILD_PARALLEL_LEVEL environment variable
                   specifies a default parallel level when this option
                   is not given.
  --target <tgt>..., -t <tgt>... 
                 = Build <tgt> instead of default targets.
  --config <cfg> = For multi-configuration tools, choose <cfg>.
  --clean-first  = Build target 'clean' first, then build.
                   (To clean only, use --target 'clean'.)
  --verbose, -v  = Enable verbose output - if supported - including
                   the build commands to be executed. 
  --             = Pass remaining options to the native tool.

build_tsmp2.sh Outdated Show resolved Hide resolved
build_tsmp2.sh Show resolved Hide resolved
build_tsmp2.sh Outdated Show resolved Hide resolved
build_tsmp2.sh Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
build_tsmp2.sh Outdated Show resolved Hide resolved
build_tsmp2.sh Show resolved Hide resolved
build_tsmp2.sh Show resolved Hide resolved
build_tsmp2.sh Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
-add version in build_tsmp2.sh
-fix spelling quiet in build_tsmp2.sh
-change bash shebang for more portability in build_tsmp2.sh
-add pipefail in build_tsmp2.sh
-rephrase tip in README
-change header to Building TSMP2 with CMake in README
-use function for OASIS3 build check
-shift MODELCOUNT to function
-add TSMP2_ENV output in build_tsmp2.sh log
README.md Outdated Show resolved Hide resolved
models/eCLM Outdated Show resolved Hide resolved
@kvrigor
Copy link
Member

kvrigor commented Sep 6, 2024

Both ./build_tsmp2.sh --eCLM --ParFlow --ICON and ./build_tsmp2.sh --eCLM --PDAF works in JURECA 🎉

@jjokella it'd be nice if you could also test again, at least for eCLM-PDAF.. also if you've observed more issues kindly raise them here

I think we can merge Stefan's PR once ./build_tsmp2.sh has been verified working and all the open issues above are resolved.

@s-poll
Copy link
Member Author

s-poll commented Sep 25, 2024

I would like to merge the PR. Are there any objections?

In my test all combinations worked (have not checked COSMO,CLM3.5).

In the testing, however, I found two weaknesses, which might be tackled in future.

  1. The bld directory is not deleted to ensure the workflow for small code modifications. For a clean building one should delete bld/SYSTEMNAME_MODEL-ID before building. In future one could set build_opt to indicate a clean build (default?), or also just parts (project, build, install). Maybe also include rebuilding specific components.
  2. It seems that some of the building procedures are modifying files within the original model source code (especially with PDAF). As far as I have seen this is not connected to the shell-based frontend. I would suggest that this is discussed in an issue. For now, I included the --force option when user choose to overwrite the model component.

@kvrigor
Copy link
Member

kvrigor commented Sep 26, 2024

Looks good @s-poll ! I've also re-tested the latest changes and the build script works.

@jjokella
Copy link
Contributor

Hi @s-poll , @kvrigor

eCLM-PDAF also builds now for me using the simple ./build_tsmp2.sh --eCLM --PDAF. Therefore, I think this PR can be merged for now.

Regarding PDAF and in-repo changes. This is true. For a start I have integrated PDAF into TSMP2 by adding the make-command inside the PDAF repo:

TSMP2/cmake/BuildPDAF.cmake

Lines 156 to 165 in 7ea08d8

# make pdaf
ExternalProject_Add(PDAF
PREFIX PDAF
SOURCE_DIR ${PDAF_SRC}/src
BUILD_IN_SOURCE TRUE
CONFIGURE_COMMAND ""
BUILD_COMMAND make ${PDAF_ENV_VARS} clean ../lib/libpdaf-d.a
INSTALL_COMMAND ""
DEPENDS ${PDAF_DEPENDENCIES}
)

From TSMP2-side, a better CMake-integration would be nice. From PDAF side this includes some more work.

See also https://github.com/HPSCTerrSys/pdaf/blob/tsmp-pdaf-patched/make.arch/cmake.h for the Makefile header file that reads the variables set in TSMP2's CMake files.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants