Merge pull request #1552 from kliegeois/Fix_SPMV_SIMD_Padding
Avoid the SIMD code branch if the batched size is not a multiple of the vector length
srajama1 authored Sep 28, 2022
2 parents caf47dc + 3562c23 commit 70b4605
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion batched/sparse/impl/KokkosBatched_Spmv_TeamVector_Impl.hpp
@@ -219,7 +219,7 @@ KOKKOS_INLINE_FUNCTION int TeamVectorSpmvInternal::invoke(
const OrdinalType ys1) {
#if !defined(__CUDA_ARCH__) && !defined(__HIP_DEVICE_COMPILE__)
if (member.team_size() == 1) {
-if (N_team > 1 && valuess0 == 1) {
+if (N_team > 1 && valuess0 == 1 && valuess1 % N_team == 0) {
/*
Left layout as valuess0 = 1 and non-zero vector length given at
compilation time Here we use the SIMD data type which is using Intel
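
The added condition can be illustrated in isolation. The sketch below is not the Kokkos Kernels implementation: `can_use_simd_branch`, `PackSplit`, and `split_into_packs` are hypothetical names introduced only for illustration. It assumes, following the commit message, that `valuess1` (the stride along the second dimension) equals the batch size when `valuess0 == 1`, and that `N_team` is the SIMD vector length.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Kokkos Kernels) that mirrors the patched
// condition: the SIMD branch is taken only when the vector length is greater
// than one, the batch dimension is contiguous (stride 1), and the batch size
// is an exact multiple of the vector length.
inline bool can_use_simd_branch(int n_team, std::ptrdiff_t valuess0,
                                std::ptrdiff_t valuess1) {
  return n_team > 1 && valuess0 == 1 && valuess1 % n_team == 0;
}

// Hypothetical illustration of why the extra check matters: splitting a batch
// into packs of n_team entries leaves `leftover` entries that do not fill a
// whole pack unless batch_size is a multiple of n_team.
struct PackSplit {
  std::ptrdiff_t full_packs;  // number of completely filled SIMD packs
  std::ptrdiff_t leftover;    // entries that would only partially fill a pack
};

// Assumes n_team > 0.
inline PackSplit split_into_packs(std::ptrdiff_t batch_size, int n_team) {
  return {batch_size / n_team, batch_size % n_team};
}
```

Under these assumptions, a batch of 10 entries with a vector length of 4 gives two full packs and two leftover entries, so the patched condition rejects the SIMD branch and the code presumably falls through to the non-SIMD path. A batch of 8 entries with the same vector length passes the check.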