Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make CMake GPU options compact #5250

Closed
ye-luo opened this issue Dec 6, 2024 · 4 comments
Closed

Make CMake GPU options compact #5250

ye-luo opened this issue Dec 6, 2024 · 4 comments

Comments

@ye-luo
Copy link
Contributor

ye-luo commented Dec 6, 2024

My old attempt was QMC_GPU=no/NVIDIA/AMD/INTEL and I no longer think it is a good simplification see discussion #3930

My new suggestions is

  1. -DQMC_GPU="openmp;cuda", translate to internally ENABLE_OFFLOAD=ON and ENABLE_CUDA=ON
  2. -DQMC_GPU="hip", translate to internally ENABLE_CUDA and QMC_CUDA2HIP

@prckent @PDoakORNL what do you think?

@prckent
Copy link
Contributor

prckent commented Dec 19, 2024

I think this adds complexity and another layer we don't really need. What are you/we hoping to achieve with it?

Unless we are providing a good and simple default setting for these different architectures, we will end up writing explicit instructions for each vendor and architecture anyway => we don't need this QMC_GPU feature since users will be copying from our documentation and provided installation scripts. Therefore, we don't need this additional layer.

For comparison, GROMACS has https://manual.gromacs.org/current/install-guide/exotic.html and https://manual.gromacs.org/documentation/current/install-guide/index.html

@ye-luo
Copy link
Contributor Author

ye-luo commented Dec 19, 2024

All the ENABLE_OFFLOAD, ENABLE_CUDA, QMC_CUDA2HIP, ENABLE_SYCL will be internal variables derived from QMC_GPU, not exposed to users. QMC_GPU="openmp;cuda" is way more intuitive to read and edit compared to "-DENABLE_OFFLOAD=ON -DENABLE_CUDA=ON" the latter is error prone. I frequently type "ENABLE" wrong and my eye hurts checking them.

Our documentation can be significantly simplified as

vendor QMC_GPU value
NVIDIA "openmp;cuda"
AMD "openmp;hip"
Intel "openmp;sycl"

Users learn one option instead of 4, basically like GMX_GPU of GROMACS.

@prckent
Copy link
Contributor

prckent commented Dec 31, 2024

OK, on reflection lets try this or something close to it. I'll note we also need to specify the GPU architecture via QMC_GPU_ARCHS. I suggest that we set a default QMC_GPU based on this so that in practice most users won't need to do anything other than specify the architecture. e.g. the user puts QMC_GPU_ARCHS=gfx1234 and gets QMC_GPU=openmp;hip by default.

Edit: Simpler still -- we could set defaults for ENABLE_OFFLOAD, ENABLE_CUDA, QMC_CUDA2HIP, ENABLE_SYCL based on QMC_GPU_ARCHS and avoid creating QMC_GPU entirely. Regular users therefore only have one setting to specify for GPU builds. The need to change these defaults is quite rare -- e.g. compiler people looking to optimize pure OpenMP offload, or test builds aiming to activate only one set of code paths. Recognizing Peter's comments in the prior closed issue, any explicit ENABLE_xyz settings should override the defaults implied by QMC_GPU_ARCHS (easy to implement).

@ye-luo
Copy link
Contributor Author

ye-luo commented Dec 31, 2024

The default value of potentially interacting ENABLE_OFFLOAD, ENABLE_CUDA, QMC_CUDA2HIP, ENABLE_SYCL is non-trivial. That is the motivation of QMC_GPU which is treated as the single source of truth. ENABLE_OFFLOAD, ENABLE_CUDA, QMC_CUDA2HIP, ENABLE_SYCL all become non-cached and derived from QMC_GPU. QMC_GPU does have full control over individual switches. Basically QMC_GPU is a syntactic sugar but only one direction instead of bi-direction conversion.

The following assumes ENABLE_OFFLOAD, ENABLE_CUDA, QMC_CUDA2HIP, ENABLE_SYCL being replaced by QMC_GPU.

Currently if QMC_GPU_ARCHS is not set, we make a guess based on QMC_GPU and external tools. If both are specified, we don't do cross check in our cmake setting but the compiler toolchain catches error if there is any. Only power users needs to specify both.

What you suggested was to add another case that QMC_GPU_ARCHS is specified but QMC_GPU is not. I think we can have this smart default of QMC_GPU case. This will be a second step after introducing QMC_GPU.

In summary, QMC_GPU is for selecting programming models (software), QMC_GPU_ARCHS is for the hardware and I try to keep them independent. Users may rely on QMC_GPU_ARCHS to set QMC_GPU but developers need to access the cases like offload without CUDA or CUDA without offload. From software perspective, hard-coding GPU architecture at compilation isn't great. NVIDIA, Intel have nvptx and spir-v and AMD is working its own flavored spir-v. Not specifying QMC_GPU_ARCHS is actually a good thing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants