Reduce write cache pollution to improve IAKV performance (#2457) (#2476)
Co-authored-by: Chunyuan WU <chunyuan.wu@intel.com>
liangan1 and chunyuan-w authored Jan 17, 2024
1 parent df2387e commit c95eb77
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions csrc/cpu/aten/kernels/MaskedMultiHeadAttentionKrnl.cpp
@@ -798,7 +798,8 @@ scale_dot_product_for_indirect_access_kv_cache(
 }
 }
 }
-flag_access[thread_id][bi][hi] = 1;
+if (flag_access[thread_id][bi][hi] == 0)
+  flag_access[thread_id][bi][hi] = 1;
 }
 }
 }
@@ -1102,7 +1103,8 @@ scale_dot_product_for_indirect_access_kv_cache_half(
 flag_access[thread_id][bi][hi]);
 }
 }
-flag_access[thread_id][bi][hi] = 1;
+if (flag_access[thread_id][bi][hi] == 0)
+  flag_access[thread_id][bi][hi] = 1;
 }
 }
 }
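
Note on the change: both hunks replace an unconditional store to flag_access with a check-before-write. The commit title suggests the motivation is to avoid issuing a store (and dirtying the cache line that holds the flag) on every loop iteration once the flag is already 1. The following is a minimal sketch of that pattern, not the kernel's actual code; AccessFlags, mark_always, mark_if_unset, and count_stores are hypothetical names introduced only for illustration, and the flat uint8_t layout is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the kernel's flag_access[thread_id][bi][hi]
// array, flattened into one contiguous buffer (layout is an assumption).
struct AccessFlags {
  std::size_t batch;
  std::size_t heads;
  std::vector<std::uint8_t> data;

  AccessFlags(std::size_t threads, std::size_t batch_, std::size_t heads_)
      : batch(batch_), heads(heads_), data(threads * batch_ * heads_, 0) {}

  // Returns a reference to the flag for (thread t, batch b, head h).
  std::uint8_t& at(std::size_t t, std::size_t b, std::size_t h) {
    const std::size_t idx = (t * batch + b) * heads + h;
    assert(idx < data.size());
    return data[idx];
  }
};

// Old pattern: a store is issued on every call, even if the flag is already 1.
inline void mark_always(std::uint8_t& flag) { flag = 1; }

// New pattern: read first and store only when the value would change.
// Returns true only when a store was actually issued.
inline bool mark_if_unset(std::uint8_t& flag) {
  if (flag == 0) {
    flag = 1;
    return true;
  }
  return false;
}

// Counts the stores issued when the same flag is marked `iterations` times
// with the check-before-write pattern.
inline std::size_t count_stores(AccessFlags& flags, std::size_t iterations) {
  std::size_t stores = 0;
  for (std::size_t i = 0; i < iterations; ++i) {
    if (mark_if_unset(flags.at(0, 0, 0))) {
      ++stores;
    }
  }
  return stores;
}
```

With the guarded version, repeated marking of the same flag results in a single store followed by reads only, so the final flag value is identical to the unconditional version while the number of writes drops. Whether this yields a measurable speedup depends on the hardware and access pattern; the commit itself reports no numbers.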