Blas1 asum: work around for openblas error with short vectors (#2384)
Signed-off-by: Carl William Pearson <cwpears@sandia.gov>
cwpearson authored Oct 16, 2024
1 parent 978823c commit b052734
Showing 1 changed file with 12 additions and 0 deletions.
12 changes: 12 additions & 0 deletions blas/tpls/KokkosBlas_Host_tpl.cpp
@@ -790,6 +790,18 @@ double HostBlas<std::complex<double> >::nrm2(KK_INT n, const std::complex<double
}
template <>
double HostBlas<std::complex<double> >::asum(KK_INT n, const std::complex<double>* x, KK_INT x_inc) {
  // See issue 2005.
  // On some platforms with OpenBLAS < 0.3.26, dzasum returns 0 for vectors with fewer than 16 entries.
  // This has been observed on some (not all) systems with:
  //   clang 14.0.6 / 15.0.7 AND OpenBLAS 0.3.23 AND Sapphire Rapids CPU
  // Unfortunately, it is not clear exactly what the trigger is, so short vectors are summed directly.
  if (n > 0 && n < 16) {
    double ret = 0.0;
    for (int i = 0; i < n; ++i) {
      ret += Kokkos::abs(x[i].real()) + Kokkos::abs(x[i].imag());
    }
    return ret;
  }
  return F77_FUNC_DZASUM(&n, x, &x_inc);
}
template <>
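
Note on the fallback loop: as committed, it reads `x[i]`, which matches `dzasum` only when `x_inc` is 1. The following is a minimal standalone sketch of a stride-aware variant, not the code in this commit. The name `asum_fallback` is hypothetical, `std::abs` stands in for `Kokkos::abs` so the snippet compiles without Kokkos, and plain `int` stands in for `KK_INT`. Returning 0 for non-positive `n` or `x_inc` is an assumption modeled on the reference BLAS `dzasum`.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>

// Hypothetical helper (not part of KokkosKernels): sums |Re(x_k)| + |Im(x_k)|
// over n entries of x, visiting every x_inc-th element.
// Assumption: like reference BLAS dzasum, the result is 0 when n <= 0 or x_inc <= 0.
inline double asum_fallback(int n, const std::complex<double>* x, int x_inc) {
  if (n <= 0 || x_inc <= 0) return 0.0;
  double ret = 0.0;
  for (int i = 0; i < n; ++i) {
    // Widen before multiplying so the element offset cannot overflow int.
    const std::size_t k = static_cast<std::size_t>(i) * static_cast<std::size_t>(x_inc);
    ret += std::abs(x[k].real()) + std::abs(x[k].imag());
  }
  return ret;
}
```

With `x_inc == 1` this computes the same sum as the loop in the commit, so it would only matter for callers that pass a non-unit stride.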