Use global_asm to include the sse/avx impls #272
Draft
This PR uses Rust's `global_asm` to include the sse2/sse41/avx2/avx512 assembly functions used by blake3. This reduces the toolchain requirements for this crate's users, which is especially nice for cross-compilation. We no longer require a working C compiler/assembler to get the fast asm version of blake3 on x86_64!

You'll notice this PR commits a certain amount of code crimes to get it working:

- It uses the `windows_gnu.S` version of the code, as it doesn't require running the preprocessor on it. However, this means it always uses the `win64` ABI instead of the `C` ABI. As such, the FFI definitions have to be changed to reflect this (e.g. `extern "win64"`). While this works, it feels terrible.
- It uses `build.rs` to fix up the assembly so it can be included in `global_asm` without issues. In particular, the initial `.intel_syntax` is removed (that's the default in `global_asm`), and the `{`s and `}`s are doubled to escape them (as `{}` is used for formatting).
- It replaces `.section .text` with `.text` and `.section .rodata` with `.static_data`, to make it compile.

While I think most of this is fine, fixing the ABI so we can keep using `extern "C"` would be a bit nicer.
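
For readers who want a concrete picture of the assembly fixups described in this PR, here is a minimal sketch of that kind of text rewriting. It is not the code from this PR: the function name `fixup_asm` is hypothetical, and the exact matching rules (whole-line matches on the section directives, dropping only the first `.intel_syntax` line) are assumptions made for illustration.

```rust
/// Hypothetical helper (not taken from this PR): rewrites assembler source
/// text so that it can be embedded in `global_asm!`.
fn fixup_asm(source: &str) -> String {
    let mut out = String::with_capacity(source.len());
    let mut removed_intel_syntax = false;

    for line in source.lines() {
        let trimmed = line.trim();

        // Drop only the initial `.intel_syntax` directive, since the
        // description says that syntax is already the default.
        if !removed_intel_syntax && trimmed.starts_with(".intel_syntax") {
            removed_intel_syntax = true;
            continue;
        }

        // Swap the section directives exactly as listed in the description.
        // Assumption: each directive sits alone on its own line.
        let line = match trimmed {
            ".section .text" => ".text",
            ".section .rodata" => ".static_data",
            _ => line,
        };

        // Double the braces so the macro's format-string parsing treats
        // them as literal `{` and `}` characters.
        out.push_str(&line.replace('{', "{{").replace('}', "}}"));
        out.push('\n');
    }

    out
}
```

In a build script, the result would typically be written to a file under `OUT_DIR` and pulled in with `include_str!`, though how this PR wires that up is not shown here.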