Add sfence after non-temporal stores on x86
Non-temporal stores are weakly ordered, so we definitely need an sfence after the copy loops, see:
https://doc.rust-lang.org/core/arch/x86/fn._mm_sfence.html

Even more interesting, the discussion on NT stores in Rust:
rust-lang/rust#114582

And on broken NT stores in LLVM:
llvm/llvm-project#64521

Oh boy ...
ackxolotl committed Aug 20, 2024
1 parent ac9c40a commit 27a8477
Showing 1 changed file with 2 additions and 0 deletions.
benches/memcpy.rs
@@ -39,6 +39,7 @@ unsafe fn memcpy_avx(mut src: *const u8, mut dst: *mut u8, count: usize) {
         _mm_prefetch::<_MM_HINT_T2>(src as *const i8);
         dst = dst.add(32);
     }
+    _mm_sfence();
 }
 
 #[cfg(all(any(target_arch = "x86", target_arch = "x86_64"), target_feature = "sse", target_feature = "avx512f"))]
@@ -51,6 +52,7 @@ unsafe fn memcpy_avx512(mut src: *const u8, mut dst: *mut u8, count: usize) {
         _mm_prefetch::<_MM_HINT_T2>(src as *const i8);
         dst = dst.add(64);
     }
+    _mm_sfence();
 }
 
 #[cfg(all(target_arch = "aarch64", target_feature = "neon"))]
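
For readers without the full file at hand, the pattern this commit enforces can be sketched in isolation. The following is a minimal illustration, not the code from benches/memcpy.rs: the names `Aligned`, `copy_nt` and `copy_block` are hypothetical, it uses 16-byte SSE2 streaming stores instead of the AVX/AVX-512 variants in the benchmark, and it omits the prefetching. The only point it makes is that the `_mm_sfence()` call comes after the last non-temporal store and before the function returns.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{__m128i, _mm_loadu_si128, _mm_sfence, _mm_stream_si128};

/// Hypothetical 64-byte-aligned buffer, so the streaming stores below
/// satisfy their 16-byte alignment requirement.
#[repr(align(64))]
struct Aligned([u8; 256]);

/// Copies `count` bytes with non-temporal stores, then fences.
///
/// # Safety
/// `src` and `dst` must be valid for `count` bytes and must not overlap,
/// `dst` must be 16-byte aligned, and `count` must be a multiple of 16.
#[cfg(target_arch = "x86_64")]
unsafe fn copy_nt(src: *const u8, dst: *mut u8, count: usize) {
    debug_assert!(count % 16 == 0 && (dst as usize) % 16 == 0);
    unsafe {
        let mut offset = 0;
        while offset < count {
            // Unaligned load from the source, streaming store to the destination.
            let chunk = _mm_loadu_si128(src.add(offset) as *const __m128i);
            _mm_stream_si128(dst.add(offset) as *mut __m128i, chunk);
            offset += 16;
        }
        // Order the weakly ordered streaming stores before any later stores.
        _mm_sfence();
    }
}

/// Assumed fallback for other architectures: a plain copy, no fence needed.
#[cfg(not(target_arch = "x86_64"))]
unsafe fn copy_nt(src: *const u8, dst: *mut u8, count: usize) {
    unsafe { std::ptr::copy_nonoverlapping(src, dst, count) }
}

/// Safe wrapper used for illustration: copies a fixed 256-byte block.
fn copy_block(src: &[u8; 256]) -> Aligned {
    let mut dst = Aligned([0u8; 256]);
    // SAFETY: both buffers are 256 bytes long, distinct, and `dst` is 64-byte aligned.
    unsafe { copy_nt(src.as_ptr(), dst.0.as_mut_ptr(), src.len()) };
    dst
}
```

Placing the fence inside the copy routine, as the commit does, means callers never see memory written by streaming stores that have not yet been ordered, which is the requirement described in the `_mm_sfence` documentation linked in the commit message.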