Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add spack variant for MPI GPU support #1102

Merged
merged 9 commits into from
Mar 4, 2024

Conversation

msimberg
Copy link
Collaborator

I've initially set it to force contiguous buffers always (it's a cmake_dependent_option so it's off if DLAF_WITH_MPI_GPU_SUPPORT is off).

Suggestions for naming etc. welcome.

@msimberg msimberg self-assigned this Feb 21, 2024
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg requested review from RMeli, aurianer, rasolca and albestro and removed request for RMeli and aurianer February 21, 2024 09:10
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg marked this pull request as ready for review February 21, 2024 14:12
Copy link
Collaborator

@albestro albestro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can just think of GPU-aware MPI, so something like mpi-gpu-aware or gpu-aware-mpi. But not sure it is a good idea to "separate" the variant name from the actual config option in CMake.

spack/packages/dla-future/package.py Show resolved Hide resolved
@msimberg
Copy link
Collaborator Author

I can just think of GPU-aware MPI, so something like mpi-gpu-aware or gpu-aware-mpi. But not sure it is a good idea to "separate" the variant name from the actual config option in CMake.

Yeah, I mainly wanted to keep them consistent, but "GPU-aware MPI" is also a good option. Note that we haven't released a version with the fixed CMake variables nor pushed the option upstream to spack, so there's still room to rename it. "GPU-aware MPI" was not an option in #1088, but I think it probably should've been an option.

Copy link
Member

@RMeli RMeli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like gpu-aware-mpi as suggested by @albestro (also in CMake), sounds a bit clearer. But I don't really mind.

It's not worth to expose DLAF_WITH_MPI_GPU_SUPPORT_FORCE_CONTIGUOUS to users?

@msimberg
Copy link
Collaborator Author

msimberg commented Feb 23, 2024

I can just think of GPU-aware MPI, so something like mpi-gpu-aware or gpu-aware-mpi. But not sure it is a good idea to "separate" the variant name from the actual config option in CMake.

I like gpu-aware-mpi as suggested by @albestro (also in CMake), sounds a bit clearer. But I don't really mind.

In that case I suppose the alternative would be:

  • spack variant:
    • mpi-gpu-aware
    • mpi-gpu-force-contiguous
    • mpi-cpu-force-contiguous
  • cmake options and defines:
    • DLAF_WITH_MPI_GPU_AWARE
    • DLAF_WITH_MPI_GPU_FORCE_CONTIGUOUS
    • DLAF_WITH_MPI_CPU_FORCE_CONTIGUOUS

I like this as well and I'll happily change it. @rasolca, @aurianer, thoughts? I think in #1088 I was quite biased by the MPICH environment variable MPICH_GPU_SUPPORT_ENABLED.

It's not worth to expose DLAF_WITH_MPI_GPU_SUPPORT_FORCE_CONTIGUOUS to users?

I wouldn't at this point but happy to be overruled. My reasoning is that with the benchmarks so far forcing it to be contiguous is never slower and often a lot faster so I consider turning it off pretty much a debugging switch and nothing more. Note that the CMake variable also defaults it to on. If we have enough results to support that it never makes sense to disable it we could just remove the CMake option.

@RMeli
Copy link
Member

RMeli commented Feb 23, 2024

My reasoning is that with the benchmarks so far forcing it to be contiguous is never slower and often a lot faster so I consider turning it off pretty much a debugging switch and nothing more.

Yes, I totally get this. My reasoning is that I often compile DLA-Future as a dependency in Spack, and having these options exposed is often useful. But I'm likely the only (?) person to benefit from this (i.e. an actual CP2K user should not care about a "debugging switch"), so I'm happy to leave it out if that's less confusing/error prone. (I can always add a custom Spack package on my end if needed... And make @albestro cringe ;) ).

If we have enough results to support that it never makes sense to disable it we could just remove the CMake option.

I would consider this carefully. Isn't it possible that in the future the situation will be reversed?

@msimberg
Copy link
Collaborator Author

My reasoning is that with the benchmarks so far forcing it to be contiguous is never slower and often a lot faster so I consider turning it off pretty much a debugging switch and nothing more.

Yes, I totally get this. My reasoning is that I often compile DLA-Future as a dependency in Spack, and having these options exposed is often useful. But I'm likely the only (?) person to benefit from this (i.e. an actual CP2K user should not care about a "debugging switch"), so I'm happy to leave it out if that's less confusing/error prone. (I can always add a custom Spack package on my end if needed... And make @albestro cringe ;) ).

Yeah, I get this too... I suppose this is a bit similar to the hdf5 variant. I suppose I'm mostly trying to avoid the situation where we add a variant that we then want to change/remove/rename in the upstream spack package. I was at some point considering if we could instead make this a runtime option instead and in that case we wouldn't need the variant. That said, this is not a hill I want to die on, so if it's useful it's useful and we add it.

@msimberg msimberg marked this pull request as draft February 26, 2024 14:23
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg marked this pull request as ready for review February 26, 2024 17:57
@msimberg msimberg requested review from RMeli and albestro February 26, 2024 17:57
@rasolca rasolca merged commit bb69c87 into eth-cscs:master Mar 4, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants