Compile script for Perlmutter CPU #4398
Let us explicitly load
should give you libxml2. |
I think that a ticket to NERSC is due here. What are they expecting us to do for common libraries such as libxml2 and boost on Perlmutter? I'll note that the official way to get libxml2 might appear to be
but since this e4s install doesn't provide boost and may only provide libxml2 as a side product of installing other packages, I am not sure how worthwhile this route is. So one unfortunate possibility is that we end up installing our own libxml2 and boost, either directly or via our own spack. This situation is not an improvement for us over previous machines, but perhaps it represents what is maintainable at NERSC. At an appropriate point we can ask NERSC to install QMCPACK for users, since DOE BES has asked for this to happen. However, realistically we'll need a working script before they can make a module for us. |
I made some changes per suggestions. Libxml2 is made available via |
Attaching |
If I remember correctly, e4s maintainers asked questions on github about |
@aannabe could you rerun ctest on complex with |
Test this please |
Attaching complex test results with |
Looks like this has caught some actual bugs. Likely in our use of MPI or perhaps the MPI wrapper has a problem that only surfaces with this MPICH. Interesting that these have only just shown up but it highlights the merits of running on more platforms and with different MPI etc. |
@correaa any idea? |
MPI_Broadcast seems to be returning an error code. That error code has an associated message in MPI, which is wrapped into a std::system_error runtime exception. Isn't there a string with a message or error code in the trace? Would it be possible to have a big try/catch for std::exception and print e.what()? |
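A minimal sketch of the "big try/catch" idea suggested above. The helper name `run_and_report` is hypothetical and not part of QMCPACK or the MPI wrapper. It assumes only what the comment states, that the failure arrives as a `std::system_error` (or some other `std::exception`). The callable passed in stands for whatever code path triggers the failing broadcast.

```cpp
#include <cassert>
#include <exception>
#include <sstream>
#include <stdexcept>
#include <string>
#include <system_error>

// Hypothetical helper: runs `body` and returns a printable report.
// std::system_error is caught first so its numeric code and category
// are reported in addition to the what() message.
template<class F>
std::string run_and_report(F&& body)
{
  std::ostringstream report;
  try
  {
    body();
    report << "ok";
  }
  catch (const std::system_error& e)
  {
    report << "system_error code=" << e.code().value()
           << " category=" << e.code().category().name()
           << " what=" << e.what();
  }
  catch (const std::exception& e)
  {
    // Any other standard exception: only what() is available.
    report << "exception what=" << e.what();
  }
  return report.str();
}
```

In a test driver, the returned string could be written to `std::cerr` together with the MPI rank, so that the message carried by the exception shows up in the ctest log instead of a bare abort.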
Test this please |
@ye-luo, was this solved eventually? Looking again, it could be related to a bug in the support for runtime exceptions in the system or in a debugger. (Is this running in a debugger?) I am very interested because
|
Proposed changes
I've been looking at compiling the CPU version on Perlmutter since Cori will be retired soon. The added script compiles the real and complex versions for CPU-only nodes. Some dependencies, such as LibXml2, are handled via spack as this is not provided by default.
I didn't have luck with GNU compilers and/or the cc, CC wrappers provided by NERSC. However, the mpicc, mpic++ MPI wrappers with Cray compile without problems.
For the real build, all unit tests pass. 99% of deterministic tests pass, and there are 5 failures related to HEG (see attached).
For the complex build, 2 unit tests are failing. For the deterministic tests, 84% pass. The failures are related to HEG + Gaussian basis bulk systems (see attached).
I didn't explore the GPU build yet.
What type(s) of changes does this code introduce?
Does this introduce a breaking change?
What systems has this change been tested on?
NERSC/Perlmutter-CPU
Checklist
cplx_tests.txt
real_tests.txt