Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make QMC_SIMD_ALIGNMENT configured via CMake #2981

Merged
merged 5 commits into from
Mar 5, 2021

Conversation

ye-luo
Copy link
Contributor

@ye-luo ye-luo commented Mar 3, 2021

Proposed changes

Renamed QMC_CLINE to QMC_SIMD_ALIGNMENT.
Set QMC_SIMD_ALIGNMENT (SIMD vector length in bytes, not necessarily cache line size) explicitly via CMake rather than depending on compiler defined macros.
The default value is determined by check __AVX512F__ macro. If found, set 64 otherwise 32.
Basically the compiler macro logic is moved to CMake and the value is configurable but checked in CMake.

Background:
The old code was fine when targeting only one architecture (CPU).

  #if defined(__AVX512F__)
    #define QMC_SIMD_ALIGNMENT 64
  #else
    #define QMC_SIMD_ALIGNMENT 32
  #endif

When OpenMP offload is used, the source code is compiled for two targets CPU and GPU.

int foo(Vector<float, aligned_allocator<float, QMC_SIMD_ALIGNMENT>>)
{
  #pragma omp target
  { offload kernel }
}

The compiler transformed offload kernel carries C++ mangled function name foo with QMC_CLINE being part of it. For this reason, the caller (CPU pass) and the callee(GPU pass) doesn't match and kernels failed being loaded on GPU. https://bugs.llvm.org/show_bug.cgi?id=45401

What type(s) of changes does this code introduce?

  • Bugfix

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

Bora, Summit, Tulip, Theta

Checklist

  • Yes. This PR is up to date with current the current state of 'develop'

@ye-luo ye-luo force-pushed the move-alignment-to-cmake branch from a10a66a to bd2e49e Compare March 3, 2021 18:33
@ye-luo ye-luo force-pushed the move-alignment-to-cmake branch from 75dc37f to 0ad4e20 Compare March 4, 2021 03:31
@ye-luo ye-luo changed the title [WIP] Move QMC_CLINE to CMake. Move QMC_CLINE to CMake. Mar 4, 2021
@ye-luo ye-luo changed the title Move QMC_CLINE to CMake. Make QMC_CLINE configured via CMake Mar 4, 2021
Copy link
Contributor

@prckent prckent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Please change QMC_CLINE to a self-describing and longer name. e.g. QMC_SIMD_ALIGNMENT. What CLINE means is not obvious.

  2. Rename CheckSIMDAlignement.cmake ->CheckSIMDAlignment.cmake

CMakeLists.txt Outdated
# Check SIMD alignment for CPU only
# This is intentionally placed before adding OpenMP offload compile options
# to avoid contamination from device compilation pass
#---------------------------------------------------------------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add an example of the potential problem, as detailed in your PR text.

@ye-luo ye-luo changed the title Make QMC_CLINE configured via CMake Make QMC_SIMD_ALIGNMENT configured via CMake Mar 5, 2021
Copy link
Contributor

@prckent prckent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ye!

@ye-luo
Copy link
Contributor Author

ye-luo commented Mar 5, 2021

@prckent This should make the nightly test error on sulfur go away without need of passing "-march=skylake" as a workaround.

@ye-luo ye-luo merged commit 8de11ad into QMCPACK:develop Mar 5, 2021
@ye-luo ye-luo deleted the move-alignment-to-cmake branch March 5, 2021 16:18
@prckent
Copy link
Contributor

prckent commented Mar 6, 2021

Nightly scripts have been updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants