Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement StoreSelectedScalar for Arm64 #93223

Merged
merged 24 commits into from
Nov 16, 2023

Conversation

SwapnilGaikwad
Copy link
Contributor

@SwapnilGaikwad SwapnilGaikwad commented Oct 9, 2023

Contribute towards #84510.

// AdvSimd
// ST2 (single structure)
public static unsafe void StoreSelectedScalar64x2(byte*   address, (Vector64<byte>   Value1, Vector64<byte>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(sbyte*  address, (Vector64<sbyte>  Value1, Vector64<sbyte>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(short*  address, (Vector64<short>  Value1, Vector64<short>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(ushort* address, (Vector64<ushort> Value1, Vector64<ushort> Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(int*    address, (Vector64<int>    Value1, Vector64<int>    Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(uint*   address, (Vector64<uint>   Value1, Vector64<uint>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(float*  address, (Vector64<float>  Value1, Vector64<float>  Value2) value, byte index);

// ST3 (single structure)
public static unsafe void StoreSelectedScalar64x3(byte*   address, (Vector64<byte>   Value1, Vector64<byte>   Value2, Vector64<byte>   Value3) value, byte index);
public static unsafe void StoreSelectedScalar64x3(sbyte*  address, (Vector64<sbyte>  Value1, Vector64<sbyte>  Value2, Vector64<sbyte>  Value3) value, byte index);
public static unsafe void StoreSelectedScalar64x3(short*  address, (Vector64<short>  Value1, Vector64<short>  Value2, Vector64<short>  Value3) value, byte index);
public static unsafe void StoreSelectedScalar64x3(ushort* address, (Vector64<ushort> Value1, Vector64<ushort> Value2, Vector64<ushort> Value3) value, byte index);
public static unsafe void StoreSelectedScalar64x3(int*    address, (Vector64<int>    Value1, Vector64<int>    Value2, Vector64<int>    Value3) value, byte index);
public static unsafe void StoreSelectedScalar64x3(uint*   address, (Vector64<uint>   Value1, Vector64<uint>   Value2, Vector64<uint>   Value3) value, byte index);
public static unsafe void StoreSelectedScalar64x3(float*  address, (Vector64<float>  Value1, Vector64<float>  Value2, Vector64<float>  Value3) value, byte index);

// ST4 (single structure)
public static unsafe void StoreSelectedScalar64x4(byte*   address, (Vector64<byte>   Value1, Vector64<byte>   Value2, Vector64<byte>   Value3, Vector64<byte>   Value4) value, byte index);
public static unsafe void StoreSelectedScalar64x4(sbyte*  address, (Vector64<sbyte>  Value1, Vector64<sbyte>  Value2, Vector64<sbyte>  Value3, Vector64<sbyte>  Value4) value, byte index);
public static unsafe void StoreSelectedScalar64x4(short*  address, (Vector64<short>  Value1, Vector64<short>  Value2, Vector64<short>  Value3, Vector64<short>  Value4) value, byte index);
public static unsafe void StoreSelectedScalar64x4(ushort* address, (Vector64<ushort> Value1, Vector64<ushort> Value2, Vector64<ushort> Value3, Vector64<ushort> Value4) value, byte index);
public static unsafe void StoreSelectedScalar64x4(int*    address, (Vector64<int>    Value1, Vector64<int>    Value2, Vector64<int>    Value3, Vector64<int>    Value4) value, byte index);
public static unsafe void StoreSelectedScalar64x4(uint*   address, (Vector64<uint>   Value1, Vector64<uint>   Value2, Vector64<uint>   Value3, Vector64<uint>   Value4) value, byte index);
public static unsafe void StoreSelectedScalar64x4(float*  address, (Vector64<float>  Value1, Vector64<float>  Value2, Vector64<float>  Value3, Vector64<float>  Value4) value, byte index);


// AdvSimd.Arm64
// ST2 (single structure)
public static unsafe void StoreSelectedScalar128x2(byte*   address, (Vector128<byte>   Value1, Vector128<byte>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(sbyte*  address, (Vector128<sbyte>  Value1, Vector128<sbyte>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(short*  address, (Vector128<short>  Value1, Vector128<short>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(ushort* address, (Vector128<ushort> Value1, Vector128<ushort> Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(int*    address, (Vector128<int>    Value1, Vector128<int>    Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(uint*   address, (Vector128<uint>   Value1, Vector128<uint>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(long*   address, (Vector128<long>   Value1, Vector128<long>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(ulong*  address, (Vector128<ulong>  Value1, Vector128<ulong>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(float*  address, (Vector128<float>  Value1, Vector128<float>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(double* address, (Vector128<double> Value1, Vector128<double> Value2) value, byte index);

// ST3 (single structure)
public static unsafe void StoreSelectedScalar128x3(byte*   address, (Vector128<byte> Value1,    Vector128<byte> Value2,      Vector128<byte> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(sbyte*  address, (Vector128<sbyte> Value1,   Vector128<sbyte> Value2,     Vector128<sbyte> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(short*  address, (Vector128<short> Value1,   Vector128<short> Value2,     Vector128<short> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(ushort*  address, (Vector128<ushort> Value1,  Vector128<ushort> Value2,    Vector128<ushort> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(int*  address, (Vector128<int> Value1,     Vector128<int> Value2,       Vector128<int> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(uint*  address, (Vector128<uint> Value1,    Vector128<uint> Value2,      Vector128<uint> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(long*  address, (Vector128<long> Value1,    Vector128<long> Value2,      Vector128<long> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(ulong*  address, (Vector128<ulong> Value1,    Vector128<ulong> Value2,      Vector128<ulong> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(float*  address, (Vector128<float> Value1,   Vector128<float> Value2,     Vector128<float> Value3), byte index);
public static unsafe void StoreSelectedScalar128x3(double*  address, (Vector128<double> Value1,   Vector128<double> Value2,     Vector128<double> Value3), byte index);

// ST4 (single structure)
public static unsafe void StoreSelectedScalar128x4(byte*   address, (Vector128<byte> Value1,        Vector128<byte> Value2,      Vector128<byte> Value3,      Vector128<byte> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(sbyte*   address, (Vector128<sbyte> Value1,       Vector128<sbyte> Value2,     Vector128<sbyte> Value3,     Vector128<sbyte> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(short*   address, (Vector128<short> Value1,       Vector128<short> Value2,     Vector128<short> Value3,     Vector128<short> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(ushort*   address, (Vector128<ushort> Value1,      Vector128<ushort> Value2,    Vector128<ushort> Value3,    Vector128<ushort> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(int*   address, (Vector128<int> Value1,         Vector128<int> Value2,       Vector128<int> Value3,       Vector128<int> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(uint*   address, (Vector128<uint> Value1,        Vector128<uint> Value2,      Vector128<uint> Value3,      Vector128<uint> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(long*   address, (Vector128<long> Value1,        Vector128<long> Value2,      Vector128<long> Value3,      Vector128<long> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(ulong*   address, (Vector128<ulong> Value1,        Vector128<ulong> Value2,      Vector128<ulong> Value3,      Vector128<ulong> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(float*   address, (Vector128<float> Value1,       Vector128<float> Value2,     Vector128<float> Value3,     Vector128<float> Value4), byte index);
public static unsafe void StoreSelectedScalar128x4(double*   address, (Vector128<double> Value1,       Vector128<double> Value2,     Vector128<double> Value3,     Vector128<double> Value4), byte index);      

@ghost ghost added the community-contribution Indicates that the PR has been added by a community member label Oct 9, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI new-api-needs-documentation labels Oct 9, 2023
@dotnet-issue-labeler
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@ghost
Copy link

ghost commented Oct 9, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Contribute towards #84510.

// AdvSimd
public static unsafe void StoreSelectedScalar64x2(byte*   address, (Vector64<byte>   Value1, Vector64<byte>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(sbyte*  address, (Vector64<sbyte>  Value1, Vector64<sbyte>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(short*  address, (Vector64<short>  Value1, Vector64<short>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(ushort* address, (Vector64<ushort> Value1, Vector64<ushort> Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(int*    address, (Vector64<int>    Value1, Vector64<int>    Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(uint*   address, (Vector64<uint>   Value1, Vector64<uint>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar64x2(float*  address, (Vector64<float>  Value1, Vector64<float>  Value2) value, byte index);

// AdvSimd.Arm64
public static unsafe void StoreSelectedScalar128x2(byte*   address, (Vector128<byte>   Value1, Vector128<byte>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(sbyte*  address, (Vector128<sbyte>  Value1, Vector128<sbyte>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(short*  address, (Vector128<short>  Value1, Vector128<short>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(ushort* address, (Vector128<ushort> Value1, Vector128<ushort> Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(int*    address, (Vector128<int>    Value1, Vector128<int>    Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(uint*   address, (Vector128<uint>   Value1, Vector128<uint>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(long*   address, (Vector128<long>   Value1, Vector128<long>   Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(ulong*  address, (Vector128<ulong>  Value1, Vector128<ulong>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(float*  address, (Vector128<float>  Value1, Vector128<float>  Value2) value, byte index);
public static unsafe void StoreSelectedScalar128x2(double* address, (Vector128<double> Value1, Vector128<double> Value2) value, byte index);
Author: SwapnilGaikwad
Assignees: -
Labels:

area-CodeGen-coreclr, new-api-needs-documentation, community-contribution

Milestone: -

@SwapnilGaikwad
Copy link
Contributor Author

This patch would emit the following assembly sequence

1. When the index is a constant value
AdvSimd.StoreSelectedScalar64x2((Int32*)_dataTable.outArrayPtr, (AdvSimd.LoadVector64((Int32*)(_dataTable.inArray1Ptr)), AdvSimd.LoadVector64((Int32*)(_dataTable.inArray2Ptr))), 1);
...
            stp     d8, d16, [fp, #0x10]	// [V18 tmp17], [V19 tmp18]
            ldr     q16, [fp, #0x10]	// [V04 tmp3]
            ldr     q17, [fp, #0x18]	// [V04 tmp3+0x08]
            st2     {v16.s, v17.s}[1], [x20]
...
2. When index is not a constant
AdvSimd.StoreSelectedScalar64x2((Int32*)_dataTable.outArrayPtr, (AdvSimd.LoadVector64((Int32*)(_dataTable.inArray1Ptr)), AdvSimd.LoadVector64((Int32*)(_dataTable.inArray2Ptr))), elemIndex);
...
 mov     x0, x21
movz    x2, #0xC8A0      // code for System.Runtime.Intrinsics.Arm.AdvSimd:StoreSelectedScalar64x2(ulong,System.ValueTuple`2[System.Runtime.Intrinsics.Vector64`1[int],System.Runtime.Intrinsics.Vector64`1[int]],ubyte)
movk    x2, #0x5DB8 LSL #16
movk    x2, #0xFFFF LSL #32
ldr     x2, [x2]
...

; Assembly listing for method System.Runtime.Intrinsics.Arm.AdvSimd:StoreSelectedScalar64x2(ulong,System.ValueTuple`2[System.Runtime.Intrinsics.Vector64`1[short],System.Runtime.Intrinsics.Vector64`1[short]],ubyte) (FullOpts)

G_M60890_IG01:  ;; offset=0x0000
            stp     fp, lr, [sp, #-0x20]!
            mov     fp, sp
            stp     d0, d1, [fp, #0x10]	// [V01 arg1], [V01 arg1+0x08]
						;; size=12 bbWeight=1 PerfScore 2.50
G_M60890_IG02:  ;; offset=0x000C
            ldr     q16, [fp, #0x10]	// [V01 arg1]
            ldr     q17, [fp, #0x18]	// [V01 arg1+0x08]
            uxtb    w1, w1
            adr     x2, [G_M60890_IG03]
            add     x2, x2, x1,  LSL #3
            br      x2
						;; size=24 bbWeight=1 PerfScore 7.00
G_M60890_IG03:  ;; offset=0x0024
            st2     {v16.h, v17.h}[0], [x0]
            b       G_M60890_IG07
						;; size=8 bbWeight=1 PerfScore 2.00
G_M60890_IG04:  ;; offset=0x002C
            st2     {v16.h, v17.h}[1], [x0]
            b       G_M60890_IG07
						;; size=8 bbWeight=1 PerfScore 2.00
G_M60890_IG05:  ;; offset=0x0034
            st2     {v16.h, v17.h}[2], [x0]
            b       G_M60890_IG07
						;; size=8 bbWeight=1 PerfScore 2.00
G_M60890_IG06:  ;; offset=0x003C
            st2     {v16.h, v17.h}[3], [x0]
						;; size=4 bbWeight=1 PerfScore 1.00
G_M60890_IG07:  ;; offset=0x0040
            ldp     fp, lr, [sp], #0x20
            ret     lr
						;; size=8 bbWeight=1 PerfScore 2.00

3. When the index is variable and out of bounds The current test crashes with an error.
Illegal instruction (core dumped)

Not sure whether it's the desired behaviour. Also, I haven't considered the possibility of creating a ROP gadget with the sequence for Case 2 above. The sequence of instruction where value of an index can be used control the jump target.

@kunalspathak
Copy link
Member

As seen in #93197, the API names should be just StoreSelectedScalar because you can deduce the shape and tuple size from the parameters.

Copy link
Member

@kunalspathak kunalspathak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will double check about the immediate operand out of range scenario.

src/coreclr/jit/emitarm64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsicarm64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsicarm64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsiclistarm64.h Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsiclistarm64.h Outdated Show resolved Hide resolved
@ghost ghost added the needs-author-action An issue or pull request that requires more info or actions from the author. label Oct 9, 2023
@kunalspathak
Copy link
Member

I will double check about the immediate operand out of range scenario.

Alright, so if you follow #93197, the immOp is handled in it.

@kunalspathak
Copy link
Member

  1. When the index is variable and out of bounds

They should throw IndexOutOfBoundException

@ghost ghost removed the needs-author-action An issue or pull request that requires more info or actions from the author. label Oct 16, 2023
@SwapnilGaikwad SwapnilGaikwad changed the title Implement StoreSelectedScalarNx2 for Arm64 Implement StoreSelectedScalar for Arm64 Oct 16, 2023
src/coreclr/jit/hwintrinsic.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/lsraarm64.cpp Show resolved Hide resolved
src/coreclr/jit/hwintrinsiclistarm64.h Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsiccodegenarm64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/emitarm64.cpp Outdated Show resolved Hide resolved
src/coreclr/jit/hwintrinsic.cpp Outdated Show resolved Hide resolved
@ghost ghost added needs-author-action An issue or pull request that requires more info or actions from the author. and removed needs-author-action An issue or pull request that requires more info or actions from the author. labels Oct 16, 2023
@kunalspathak
Copy link
Member

RunBasicScenario_UnsafeRead segfaults with the following error

Interesting. What is the cause of it?

@SwapnilGaikwad
Copy link
Contributor Author

Now we get out of range exception for all the invalid scenarios (with constant or variable index) with all the test cases.

System.ArgumentOutOfRangeException: Specified argument was out of the range of valid values.

I never got to ensure from the assembly that it throws InvalidIndexException. 🤔

This mystery is also resolved now. There was a check for array out of bounds missing. Now added it. Can be confirmed from the assembly for StoreSelectedScalar as well.

...
G_M56847_IG04:
      ldr     x0, [fp, #0x38]	// [V00 arg0]
      ldr     q16, [fp, #0x20]	// [V01 arg1]
      ldr     w1, [fp, #0x1C]	// [V02 arg2]
      uxtb    w1, w1
      cmp     w1, #4
      blo     G_M56847_IG05
      bl      CORINFO_HELP_THROW_ARGUMENTOUTOFRANGEEXCEPTION
...
Full assembly
; Assembly listing for method System.Runtime.Intrinsics.Arm.AdvSimd:StoreSelectedScalar(ulong,System.Runtime.Intrinsics.Vector128`1[int],ubyte) (MinOpts)
; Emitting BLENDED_CODE for generic ARM64 - Unix
; MinOpts code
; debuggable code
; fp based frame
; fully interruptible
; No PGO data
; Final local variable assignments
;
;  V00 arg0         [V00    ] (  1,  1   )    long  ->  [fp+0x38]  do-not-enreg[]
;  V01 arg1         [V01    ] (  1,  1   )  simd16  ->  [fp+0x20]  HFA(simd16)  do-not-enreg[S] <System.Runtime.Intrinsics.Vector128`1[int]>
;  V02 arg2         [V02    ] (  1,  1   )   ubyte  ->  [fp+0x1C]  do-not-enreg[]
;# V03 OutArgs      [V03    ] (  1,  1   )  struct ( 0) [sp+0x00]  do-not-enreg[XS] addr-exposed "OutgoingArgSpace"
;
; Lcl frame size = 48

G_M56847_IG01:  ;; offset=0x0000
            stp     fp, lr, [sp, #-0x40]!
            mov     fp, sp
            str     x0, [fp, #0x38]	// [V00 arg0]
            str     w1, [fp, #0x1C]	// [V02 arg2]
            str     q0, [fp, #0x20]	// [V01 arg1]
						;; size=20 bbWeight=1 PerfScore 4.50
G_M56847_IG02:  ;; offset=0x0014
            movz    x0, #0x45A8
            movk    x0, #0x4004 LSL #16
            movk    x0, #0xFFFF LSL #32
            ldr     w0, [x0]
            cbz     w0, G_M56847_IG04
						;; size=20 bbWeight=1 PerfScore 5.50
G_M56847_IG03:  ;; offset=0x0028
            bl      CORINFO_HELP_DBG_IS_JUST_MY_CODE
						;; size=4 bbWeight=0.50 PerfScore 0.50
G_M56847_IG04:  ;; offset=0x002C
            ldr     x0, [fp, #0x38]	// [V00 arg0]
            ldr     q16, [fp, #0x20]	// [V01 arg1]
            ldr     w1, [fp, #0x1C]	// [V02 arg2]
            uxtb    w1, w1
            cmp     w1, #4
            blo     G_M56847_IG05
            bl      CORINFO_HELP_THROW_ARGUMENTOUTOFRANGEEXCEPTION
						;; size=28 bbWeight=1 PerfScore 9.00
G_M56847_IG05:  ;; offset=0x0048
            ldr     w1, [fp, #0x1C]	// [V02 arg2]
            uxtb    w1, w1
            adr     x2, [G_M56847_IG06]
            add     x2, x2, x1,  LSL #3
            br      x2
						;; size=20 bbWeight=1 PerfScore 5.00
G_M56847_IG06:  ;; offset=0x005C
            st1     {v16.s}[0], [x0]
            b       G_M56847_IG10
						;; size=8 bbWeight=1 PerfScore 2.00
G_M56847_IG07:  ;; offset=0x0064
            st1     {v16.s}[1], [x0]
            b       G_M56847_IG10
						;; size=8 bbWeight=1 PerfScore 2.00
G_M56847_IG08:  ;; offset=0x006C
            st1     {v16.s}[2], [x0]
            b       G_M56847_IG10
						;; size=8 bbWeight=1 PerfScore 2.00
G_M56847_IG09:  ;; offset=0x0074
            st1     {v16.s}[3], [x0]
						;; size=4 bbWeight=1 PerfScore 1.00
G_M56847_IG10:  ;; offset=0x0078
            nop
            nop
						;; size=8 bbWeight=1 PerfScore 1.00
G_M56847_IG11:  ;; offset=0x0080
            ldp     fp, lr, [sp], #0x40
            ret     lr
						;; size=8 bbWeight=1 PerfScore 2.00

; Total bytes of code 136, prolog size 20, PerfScore 48.10, instruction count 34, allocated bytes for code 136 (MethodHash=5b1221f0) for method System.Runtime.Intrinsics.Arm.AdvSimd:StoreSelectedScalar(ulong,System.Runtime.Intrinsics.Vector128`1[int],ubyte) (MinOpts)
; ============================================================

Interestingly, for some tests I got IndexOutOfRangeException earlier because the test was executing without an error but failing in the validation stage where it was accessing the out of range index. Only the first test was failing with an illegal instruction probably because of the code layout. Invalid index tried to jump to location that did not contain a valid index. For the further tests, the jump target at invalid index contained a valid instruction so it did not throw any exception during the execution.

@kunalspathak
Copy link
Member

Seems the indices of few test cases needs adjustment for them to not throw Exception.

@SwapnilGaikwad
Copy link
Contributor Author

SwapnilGaikwad commented Nov 2, 2023

Seems the indices of few test cases needs adjustment for them to not throw Exception.

Had to move the range check after recomputing the simdSize 🤦 . Now the current failures seems while sending the test data to helix system.

@SwapnilGaikwad
Copy link
Contributor Author

can you double check?

Sure! It's tricky to debug, and it isn't obvious why compilation of System.Runtime.Serialization.SerializationGuard would fail. I'll try rebasing/merging main, if the error persists then can debug further.

@kunalspathak
Copy link
Member

can you double check?

Sure! It's tricky to debug, and it isn't obvious why compilation of System.Runtime.Serialization.SerializationGuard would fail. I'll try rebasing/merging main, if the error persists then can debug further.

From what I remember, these tests were passing earlier and failing with the recent commit and are in arm64, so most likly related. I could be wrong.

@SwapnilGaikwad
Copy link
Contributor Author

can you double check?

Sure! It's tricky to debug, and it isn't obvious why compilation of System.Runtime.Serialization.SerializationGuard would fail. I'll try rebasing/merging main, if the error persists then can debug further.

From what I remember, these tests were passing earlier and failing with the recent commit and are in arm64, so most likly related. I could be wrong.

You're right, it seems to be related to the arg check. Not sure how to debug it though. I'll try to see if I can reproduce the failure. If that doesn't progress, should I put the explicit check as before?

@kunalspathak
Copy link
Member

kunalspathak commented Nov 6, 2023

If that doesn't progress, should I put the explicit check as before?

I would say just inspect the code changes and see if the code path for existing SelectedScalar with single vector is unchanged. That should fix the problem.

@kunalspathak
Copy link
Member

@SwapnilGaikwad - can you resolve the merge conflicts?

Copy link
Member

@fanyang-mono fanyang-mono left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to Mono codebase looks good to me.

Change-Id: I81d386adbbcc195b777f436329f24438b0cca10c
@kunalspathak
Copy link
Member

The change to Mono codebase looks good to me.

@fanyang-mono - was the change essentially to make sure that mono continue to ignore newer APIs whose 2nd parameter is tuple and only support the existing StoreSelectedScalar API?

@fanyang-mono
Copy link
Member

fanyang-mono commented Nov 16, 2023

The change to Mono codebase looks good to me.

@fanyang-mono - was the change essentially to make sure that mono continue to ignore newer APIs whose 2nd parameter is tuple and only support the existing StoreSelectedScalar API?

Unfortunately, yes, Mono is not super friendly with handling API with 4 and more input arguments. We need to add additional code to handle those cases, rather than automatically handled by the machinery like Vector64/128. There is an issue to track Mono side of the work related to these new set of API's.

@kunalspathak kunalspathak merged commit 2e912a3 into dotnet:main Nov 16, 2023
203 of 205 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Dec 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member new-api-needs-documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants