Fix nonlocal unsoundness vs. atomic store #1
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
MOVNTI, MOVNTDQ, and friends weaken TSO when next to other stores. As most stores are not nontemporal, LLVM uses simple stores when lowering LLVMIR like
atomic store ... release
on x86. These facts could allow something like the following code to be emitted:But these stores are NOT ordered with respect to each other! Nontemporal stores induce the CPU to use write-combining buffers. These writes will be resolved in bursts instead of at once, and the write may be further deferred until a serialization point. Even a non-temporal write to any other location will not force the deferred writes to be resolved first. Thus, assuming cache-line-sized buffers of 64 bytes, the CPU may resolve these writes in e.g. this actual order:
This could e.g. result in other threads accessing this address after the flag is set, thus accessing memory via safe code that was assumed to be correctly synchronized. This could result in observing tearing or other inconsistent program states, especially as the number of writes, thus the number of write buffers that may begin retiring simultaneously, thus the chance of them resolving in an unfortunate order, increases. If using
&mut [u8]
to write uninitialized memory is permitted ( per rust-lang/unsafe-code-guidelines#346 ), it could even result in an access to&[u8]
actually being reading uninitialized memory in safe code!To guarantee program soundness, code using nontemporal stores must currently use SFENCE in its safety boundary, unless and until LLVM decides this combination of facts should be considered a miscompilation and motivation to choose lowerings that do not require explicit SFENCE.
For further information, see: