Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mono][llvm][jit] Adding support for Vector128::ExtractMostSignificantBits intrinsic on ARM64 with LLVM + refactoring miniJIT opcodes #89431

Merged
merged 1 commit into from
Jul 27, 2023

Conversation

matouskozak
Copy link
Member

Adding support for Vector128.ExtractMostSignificantBits intrinsic on ARM64 LLVM.

Implementation

The implementation for LLVM follows the same approach as with miniJIT introduced in #84345.

To unify LLVM and miniJIT approaches, the previously introduced opcodes OP_ARM64_USHL and OP_ARM64_EXT_IMM were removed and replaced with INTRINS_AARCH64_ADV_SIMD_USHL and OP_XLOWER/OP_XUPPER as these were already supported on LLVM.

This change introduces a slight change in generated assembly by miniJIT. Specifically, in byte/sbyte element types scenario the code changes from:

    and.16b        v0, v0, v1
    ldr                 q1, 0x60
    ushl.16b       v0, v0, v1
    eor.16b         v1, v1, v1
    **ext.16b         v2, v1, v0, #0x8**
    addv.16b      b2, v2
    umov.b         w1, v2[0]
    ext.16b         v0, v0, v1, #0x8
    addv.16b      b0, v0
    umov.b         w0, v0[0]
    lsl                  x0, x0, #8
    orr                 w0, w0, w1

to:

    and.16b        v0, v0, v1
    ldr                 q1, 0x60
    ushl.16b       v0, v0, v1
    eor.16b         v1, v1, v1
    **mov.8b         v1, v0**
    addv.16b      b1, v1
    umov.b         w1, v1[0]
    **eor.16b        v16, v16, v16**
    ext.16b         v0, v0, v16, #0x8
    addv.16b      b0, v0
    umov.b         w0, v0[0]
    lsl                  x0, x0, #8
    orr                 w0, w0, w1

The change removed 1xext.16b instruction and added two new ones (mov.8b and eor.16b). If this change introduces any significant regressions we can revert the change for miniJIT.

An extract of generated code for byte element type using LLVM fullAOT:

    and.16b        v0, v1, v0
    ushl.16b       v0, v0, v2
    ext.16b         v1, v0, v0, #0x8
    addv.8b        b0, v0
    addv.8b        b1, v1
    fmov             w9, s0
    fmov             w8, s1
    orr                 w0, w9, w8, lsl #8

Contributes to #76025.

/cc: @ivanpovazan

@matouskozak
Copy link
Member Author

matouskozak commented Jul 25, 2023

/azp run runtime-llvm

@azure-pipelines
Copy link

Azure Pipelines failed to run 1 pipeline(s).

@matouskozak
Copy link
Member Author

/azp run runtime-llvm

@azure-pipelines
Copy link

No commit pushedDate could be found for PR 89431 in repo dotnet/runtime

@matouskozak
Copy link
Member Author

/azp run runtime-llvm

@azure-pipelines
Copy link

No commit pushedDate could be found for PR 89431 in repo dotnet/runtime

@matouskozak
Copy link
Member Author

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@matouskozak matouskozak marked this pull request as ready for review July 26, 2023 15:51
Copy link
Member

@fanyang-mono fanyang-mono left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Well done :)

@matouskozak
Copy link
Member Author

The failing CI lanes look unrelated or are already tracked.

Copy link
Member

@ivanpovazan ivanpovazan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
Thank you for verifying that the performance is not affected with this change (any further optimization can be addressed later - in a separate PR).

@matouskozak matouskozak merged commit e928f16 into dotnet:main Jul 27, 2023
129 of 152 checks passed
@ghost ghost locked as resolved and limited conversation to collaborators Aug 26, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants