Nightly Sycl unit test failures with intel/2023.1.0, intel/2024.1.0 on Intel Ponte Vecchio #1961
Updating the issue with failures as of SHA 32aa75a
Configuration 1 (no TPLs):
salloc -N 1 -p PV
source /projects/x86-64-icelake-rocky8/spack-config/blake-setup-user-module-env.sh
module purge
module load cmake intel-oneapi-compilers/2023.1.0 intel-oneapi-dpl/2022.1.0 git
# Required for the hashmap accumulator
export ZES_ENABLE_SYSMAN=1
# Configuration
$KOKKOSKERNELS_PATH/cm_generate_makefile.bash --with-sycl --arch=INTEL_PVC --compiler=/projects/x86-64-icelake-rocky8/compilers/intel-oneapi-compilers/2023.1.0/gcc/8.5.0/base/6g2jkiv/compiler/2023.1.0/linux/bin-llvm/clang++ --cxxflags="-fp-model=precise" --shared --kokkos-cmake-flags=-DKokkos_ENABLE_ONEDPL=OFF --kokkos-path=$KOKKOS_PATH
Test failures on PVC:
23:43:24 The following tests FAILED:
23:43:24 15 - sparse_sycl (SEGFAULT)
23:43:24 16 - blocksparse_sycl (Failed)
Configuration 2 (oneMKL):
salloc -N 1 -p PV
source /projects/x86-64-icelake-rocky8/spack-config/blake-setup-user-module-env.sh
module purge
module load git cmake intel-oneapi-compilers/2023.1.0 intel-oneapi-dpl/2022.1.0 intel-oneapi-mkl/2023.1.0 intel-oneapi-tbb/2021.9.0
# Required for the hashmap accumulator
export ZES_ENABLE_SYSMAN=1
# Configuration
$KOKKOSKERNELS_PATH/cm_generate_makefile.bash --with-sycl --arch=INTEL_PVC --compiler=icpx --cxxflags="-fp-model=precise" --shared --with-tpls=mkl --kokkos-cmake-flags=-DKokkos_ENABLE_ONEDPL=OFF --kokkos-path=$KOKKOS_PATH
Test failures on PVC:
|
Joe installed Intel oneAPI 2024.1.0 on Blake, and I tested the MKL configuration above.
Test failures:
Configuration (Sycl backend, intel/2024.1.0 with mkl/2024.0.0):
source /projects/x86-64-icelake-rocky8/spack-config/blake-setup-user-module-env.sh
module purge
module load cmake intel-oneapi-compilers/2024.1.0 intel-oneapi-dpl/2022.5.0 intel-oneapi-tbb/2021.12.0 intel-oneapi-mkl/2024.0.0
module list
# Required for the hashmap accumulator
export ZES_ENABLE_SYSMAN=1
# Configuration
$KOKKOSKERNELS_PATH/cm_generate_makefile.bash --with-sycl --arch=INTEL_PVC --compiler=icpx --cxxflags="-fp-model=precise -Wno-pass-failed" --shared --with-tpls=mkl --kokkos-path=$KOKKOS_PATH
make -j16
# Unit tests
export ONEAPI_DEVICE_SELECTOR=ext_oneapi_level_zero:gpu
ctest --output-on-failure
|
I've been poking around with this: In the SpGEMM, it seems that however, I've tried replacing
auto v = sycl::atomic_ref<std::remove_reference_t<decltype(*addr)>,
                          sycl::memory_order::relaxed,
                          sycl::memory_scope::device,
                          sycl::access::address_space::global_space>(*addr);
v += val;
but no luck so far. I've also tried running the Kokkos Core atomics unit tests built with the same Core that I use for the Kernels unit tests, and the Core atomic unit tests all pass.
|
Reimplementing
template <typename InPtr, typename T>
KOKKOS_INLINE_FUNCTION T *alignPtr(InPtr p) {
  std::uintptr_t ptrVal = reinterpret_cast<std::uintptr_t>(p);
  while (ptrVal % alignof(T)) {
    ++ptrVal;
  }
  return reinterpret_cast<T *>(ptrVal);
}
seems to make the SpGEMM unit tests pass. However, using the equivalent
template <typename InPtr, typename T>
KOKKOS_INLINE_FUNCTION T *alignPtr(InPtr p) {
  std::uintptr_t ptrVal = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<T *>((ptrVal + alignof(T) - 1) / alignof(T) * alignof(T));
}
does not. May be a SYCL compiler issue (unless
|
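One way to rule out a logic difference between the two variants is to compare them on the host, outside of SYCL. The sketch below is only an illustration under stated assumptions: the names alignPtrLoop, alignPtrDiv and alignVariantsAgree are hypothetical, KOKKOS_INLINE_FUNCTION is dropped, and the template parameters are reordered so that T can be given explicitly. It says nothing about what the device compiler generates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-only copy of the loop-based variant (hypothetical name).
template <typename T, typename InPtr>
T *alignPtrLoop(InPtr p) {
  std::uintptr_t ptrVal = reinterpret_cast<std::uintptr_t>(p);
  while (ptrVal % alignof(T)) {
    ++ptrVal;
  }
  return reinterpret_cast<T *>(ptrVal);
}

// Host-only copy of the divide/multiply variant (hypothetical name).
template <typename T, typename InPtr>
T *alignPtrDiv(InPtr p) {
  std::uintptr_t ptrVal = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<T *>((ptrVal + alignof(T) - 1) / alignof(T) * alignof(T));
}

// Returns true if both variants produce the same, correctly aligned address
// for every byte offset into a small buffer. The pointers are only compared,
// never dereferenced.
template <typename T>
bool alignVariantsAgree() {
  alignas(64) static char buffer[256];
  for (std::size_t offset = 0; offset < 128; ++offset) {
    char *p = buffer + offset;
    T *a = alignPtrLoop<T>(p);
    T *b = alignPtrDiv<T>(p);
    if (a != b) return false;
    if (reinterpret_cast<std::uintptr_t>(a) % alignof(T) != 0) return false;
  }
  return true;
}
```

If a check like alignVariantsAgree<double>() holds on the host, the arithmetic itself is equivalent, and any difference in test results would have to come from how the device code is compiled or executed.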
unsigned int f1(unsigned int i, unsigned int align) // today
{
  return ((i + align - 1) & (~(align - 1)));
}
unsigned int f2(unsigned int i, unsigned int align)
{
  return ((i + align - 1) / align * align);
}
unsigned int f3(unsigned int i, unsigned int align) // gcc
{
  return (i + align - 1) & (-align);
}
unsigned int f4(unsigned int i, unsigned int align)
{
  while (i % align) {
    ++i;
  }
  return i;
}
Only in clang-trunk x86 in godbolt: f1 and f3 compile to the same instructions. f2 and f4 are each different again.
|
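A caveat when comparing these four forms: the mask-based ones (f1, f3) round up correctly only when align is a power of two, while the divide and loop forms (f2, f4) work for any nonzero align. Differing instruction sequences are therefore expected even without a compiler bug. The sketch below uses hypothetical names (roundUpMask, roundUpDiv, roundUpNeg, roundUpLoop, allAgree) for host-side copies of f1 through f4 and assumes inputs small enough that i + align - 1 does not wrap.

```cpp
#include <cassert>

// Host-side copies of f1..f4 under hypothetical names.
constexpr unsigned roundUpMask(unsigned i, unsigned align) {  // f1
  return (i + align - 1) & ~(align - 1);
}
constexpr unsigned roundUpDiv(unsigned i, unsigned align) {  // f2
  return (i + align - 1) / align * align;
}
constexpr unsigned roundUpNeg(unsigned i, unsigned align) {  // f3
  return (i + align - 1) & (0u - align);  // same value as unsigned -align
}
constexpr unsigned roundUpLoop(unsigned i, unsigned align) {  // f4
  while (i % align) {
    ++i;
  }
  return i;
}

// Returns true if all four forms agree for every i in [0, maxValue].
inline bool allAgree(unsigned align, unsigned maxValue) {
  for (unsigned i = 0; i <= maxValue; ++i) {
    const unsigned expected = roundUpLoop(i, align);
    if (roundUpMask(i, align) != expected) return false;
    if (roundUpDiv(i, align) != expected) return false;
    if (roundUpNeg(i, align) != expected) return false;
  }
  return true;
}
```

For example, with align = 12 and i = 13 the divide form gives 24 while the mask form gives 16, so the forms are interchangeable only for power-of-two alignments such as those returned by alignof.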
Status update as of 7/9/2024, following the merge of some recent fixes:
Sycl + PV, no MKL
Failing tests:
Failure output snips:
Sycl + PV, with MKL
Failing tests:
Failure output snips:
sparse_sycl:
blocksparse_sycl
|
Status update 7/12/2024: After the recent gemv fallback updates, the Sycl builds are in better shape with only the
|
Testing with the Sycl backend on Intel Ponte Vecchio on the new Blake showed a couple of failing sub-tests (failure output listed below the failing executable), depending on which environment variables are set:
Default (ZES_ENABLE_SYSMAN unset)
ZES_ENABLE_SYSMAN=1
Reproducer (Blake PV queue):
SHAs:
kokkos/kokkos@7e299b4
acdd896
Edit: Added shas used in the testing