Legalize imul.i64x2 for both AVX and non-AVX x86 CPUs #1759

Merged
merged 9 commits into from
Jun 3, 2020

abrown
Contributor

@abrown abrown commented May 26, 2020

The convert_i64x2_imul custom legalization checks the ISA flags for AVX512DQ or AVX512VL support and legalizes imul.i64x2 to an x86_pmullq in this case; if not, it uses a lengthy SSE2-compatible instruction sequence. For this logic to work, we need:

  • the AVX512 instruction to be defined as a separate Cranelift instruction, x86_pmullq (this additional instruction would go away in the new backend)
  • a mechanism for accessing the x86 TargetIsa so that we can inspect its flags during legalization
  • a new SSE2 instruction, x86_pmuludq for implementing the SSE2-compatible instruction sequence
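The description does not spell out the SSE2-compatible sequence. As a rough scalar sketch of the usual decomposition (not the PR's actual legalization code), a 64-bit lane product can be assembled from three 32x32-bit multiplies of the kind `pmuludq` provides. The helper names below (`pmuludq`, `shr32`, `shl32`, `paddq`, `imul_i64x2_sse2`) are illustrative models of the corresponding vector operations, assumed for this example only:

```rust
/// Scalar model of `pmuludq`: multiply the low 32 bits of each 64-bit lane,
/// producing a full 64-bit product per lane (this cannot overflow a u64).
fn pmuludq(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    const LOW: u64 = 0xFFFF_FFFF;
    [(a[0] & LOW) * (b[0] & LOW), (a[1] & LOW) * (b[1] & LOW)]
}

/// Per-lane logical shift right by 32 (moves the high half into the low half).
fn shr32(a: [u64; 2]) -> [u64; 2] {
    [a[0] >> 32, a[1] >> 32]
}

/// Per-lane shift left by 32 (moves the low half into the high half).
fn shl32(a: [u64; 2]) -> [u64; 2] {
    [a[0] << 32, a[1] << 32]
}

/// Per-lane wrapping 64-bit addition.
fn paddq(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    [a[0].wrapping_add(b[0]), a[1].wrapping_add(b[1])]
}

/// Hypothetical model of an SSE2-style `imul.i64x2`:
/// a * b mod 2^64 = lo(a)*lo(b) + ((hi(a)*lo(b) + lo(a)*hi(b)) << 32).
/// The hi(a)*hi(b) term is shifted out entirely and can be skipped.
fn imul_i64x2_sse2(a: [u64; 2], b: [u64; 2]) -> [u64; 2] {
    let lo_lo = pmuludq(a, b);
    let hi_lo = pmuludq(shr32(a), b);
    let lo_hi = pmuludq(a, shr32(b));
    let cross = shl32(paddq(hi_lo, lo_hi));
    paddq(lo_lo, cross)
}
```

Each lane of the result should equal `a[i].wrapping_mul(b[i])`, which is the same value a single 64-bit lane multiply would produce.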

@abrown abrown force-pushed the i64x2-mul branch 2 times, most recently from e48a82c to fdcfd1c Compare May 26, 2020 17:11
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels May 26, 2020
@github-actions
Subscribe to Label Action

cc @bnjbvr

This issue or pull request has been labeled: "cranelift", "cranelift:area:aarch64", "cranelift:area:x64", "cranelift:meta"

Thus the following users have been cc'd because of the following labels:

  • bnjbvr: cranelift

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Member

@bnjbvr bnjbvr left a comment

Looks pretty good! I wonder if the legalization couldn't be done in the meta-language directly using an ISA predicate (and thus avoiding the Any and downcasting).
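For readers unfamiliar with the `Any`-and-downcast pattern mentioned here, a minimal sketch follows. All types and fields (`TargetIsa`, `X86Isa`, `OtherIsa`, `has_avx512dq`, `has_avx512vl`, `can_use_pmullq`) are hypothetical stand-ins, not Cranelift's real definitions, and the `||` simply mirrors the "AVX512DQ or AVX512VL" wording of the PR description:

```rust
use std::any::Any;

// Hypothetical trait standing in for a target-independent ISA interface.
trait TargetIsa {
    /// Expose the concrete type so callers can downcast to it.
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical x86 ISA carrying the feature flags of interest.
struct X86Isa {
    has_avx512dq: bool,
    has_avx512vl: bool,
}

impl TargetIsa for X86Isa {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Hypothetical non-x86 ISA, to show the failed-downcast path.
struct OtherIsa;

impl TargetIsa for OtherIsa {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Decide whether the single-instruction lowering is available.
/// Falls back to `false` when the ISA is not the x86 one.
fn can_use_pmullq(isa: &dyn TargetIsa) -> bool {
    match isa.as_any().downcast_ref::<X86Isa>() {
        Some(x86) => x86.has_avx512dq || x86.has_avx512vl,
        None => false,
    }
}
```

An ISA predicate in the meta-language would move this decision out of hand-written legalization code, which is the trade-off raised in this comment.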

cranelift/codegen/meta/src/isa/x86/instructions.rs (outdated)
cranelift/codegen/src/isa/x86/enc_tables.rs (outdated)
cranelift/codegen/src/isa/x86/enc_tables.rs
cranelift/codegen/src/isa/x86/enc_tables.rs
Member

@bnjbvr bnjbvr left a comment

Unless guarding-legalizations-against-ISA-settings is easy to implement, the proposed approach here works fine. Thanks for the patch!

Without this special instruction, legalizing to the AVX512 instruction AND the SSE instruction sequence is impossible. This extra instruction would be rendered unnecessary by the x64 backend.
This is necessary when we would like to check specific ISA flags, e.g.
This instruction multiplies the lower 32 bits of each 64-bit lane of two i64x2 vectors, treating them as unsigned integers, and produces an i64x2; this is necessary for lowering Wasm's i64x2.mul.
This instruction does not exist in the SSE2 feature set; it can be added later with a VEX/EVEX encoding.
The `convert_i64x2_imul` custom legalization checks the ISA flags for AVX512DQ or AVX512VL support and legalizes `imul.i64x2` to an `x86_pmullq` in this case; if not, it uses a lengthy SSE2-compatible instruction sequence.
@abrown abrown merged commit 5db384c into bytecodealliance:master Jun 3, 2020
@abrown abrown deleted the i64x2-mul branch June 3, 2020 23:28
Labels
cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. cranelift:wasm cranelift Issues related to the Cranelift code generator
3 participants