Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Intrinsify EqualityComparer<T>.Default.Equals and GetHashCode calls #6475

Closed
omariom opened this issue Aug 10, 2016 · 6 comments
Closed
Assignees
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Milestone

Comments

@omariom
Copy link
Contributor

omariom commented Aug 10, 2016

Extracted from New methods for EqualityComparer.

Tuples could directly benefit from that. As well as any code which gets IEqualityComparer<T> externally - it could devirtualize calls to Equals and GetHashCode if the instance passed in is EqualityComparer<T>.Default.

Quoting @jkotas :

The .NET Native compiler for UWP does similar transformation of EqualityComparer.Default.Equals to reduce binary size for full AOT. You can see the class library side of the implementation here: https://github.com/dotnet/corert/blob/master/src/System.Private.Reflection.Execution/src/Internal/IntrinsicSupport/EqualityComparerHelpers.cs. The compiler side of the implementation is closed source unfortunately, but there is no rocket science in it.
This optimization would make RyuJIT better AOT codegenerator for CoreRT project, closer to the .NET Native compiler for UWP compiler.

The beauty of doing this as JIT optimization is that it will kick in for all existing code, without any changes.

@jamesqo
Copy link
Contributor

jamesqo commented Aug 11, 2016

Would be great to have this, although it may require duplicating some more logic with EqualityComparer.CreateComparer. +1

@omariom
Copy link
Contributor Author

omariom commented Apr 19, 2017

Does it have a chance to appear in 2.0?

@jkotas
Copy link
Member

jkotas commented Apr 19, 2017

I am not aware of anybody working on this right now, so it is pretty unlikely.

@AndyAyersMS AndyAyersMS self-assigned this Sep 21, 2017
@AndyAyersMS
Copy link
Member

Am going to start in on this.

@AndyAyersMS
Copy link
Member

I have this partially working now, see master...AndyAyersMS:EqualityComparerDefault
For

public static bool Compare(ref ValueTuple<int, int> a, ref ValueTuple<int, int> b)
{
    return a.Equals(b);
}

we now get the following code

; Total bytes of code 23, prolog size 0

G_M57469_IG02:
   mov      eax, dword ptr [rdx]
   mov      edx, dword ptr [rdx+4]
   cmp      dword ptr [rcx], eax
   jne      SHORT G_M57469_IG03
   cmp      dword ptr [rcx+4], edx
   sete     al
   movzx    rax, al
   jmp      SHORT G_M57469_IG04

G_M57469_IG03:
   xor      eax, eax

G_M57469_IG04:
   ret

instead of

; Total bytes of code 108, prolog size 7

G_M57469_IG01:
   push     rdi
   push     rsi
   push     rbx
   sub      rsp, 48
   mov      rsi, rcx

G_M57469_IG02:
   mov      edi, dword ptr [rdx]
   mov      ebx, dword ptr [rdx+4]
   mov      rcx, 0x7FFF427D3028
   mov      edx, 79
   call     CORINFO_HELP_CLASSINIT_SHARED_DYNAMICCLASS
   mov      rcx, 0x1EE27732820
   mov      rcx, gword ptr [rcx]
   mov      edx, dword ptr [rsi]
   mov      r8d, edi
   mov      rax, qword ptr [rcx]
   mov      rax, qword ptr [rax+72]
   call     qword ptr [rax+32] ;; virtual
   test     eax, eax
   je       SHORT G_M57469_IG03
   mov      rcx, 0x1EE27732820
   mov      rcx, gword ptr [rcx]
   mov      edx, dword ptr [rsi+4]
   mov      r8d, ebx
   mov      rax, qword ptr [rcx]
   mov      rax, qword ptr [rax+72]
   call     qword ptr [rax+32] ;; virtual
   jmp      SHORT G_M57469_IG04

G_M57469_IG03:
   xor      eax, eax

G_M57469_IG04:
   add      rsp, 48
   pop      rbx
   pop      rsi
   pop      rdi
   ret

This uses an explicit expansion of Equals by the jit when T is a simple integer type. It would be cleaner if the jit could determine what class Default exactly refers to, then regular devirt/inline could kick in and we'd handle all the methods on the comparer and use their actual implementations.

I'm not sure how to best go about doing that. It seems like the knowledge of this mapping might need to be hard coded on the VM side?

Also we might want to tone down struct promotion a bit, so we don't eagerly load all the fields like we do here. Or else move the loads closer to their uses....

@jkotas
Copy link
Member

jkotas commented Sep 21, 2017

I'm not sure how to best go about doing that. It seems like the knowledge of this mapping might need to be hard coded on the VM side?

Yes. .NET Native has some code for it - it is probably not 100% reusable for RyuJIT, but bits and pieces may be. As you have said, the basic idea is that the virtual comparer gets replaced by a concrete one - like https://github.com/dotnet/corert/blob/280f6a0e29954d31c94263d6137d215ca71282fd/src/System.Private.CoreLib/src/Internal/IntrinsicSupport/EqualityComparerHelpers.cs#L155.

AndyAyersMS referenced this issue in AndyAyersMS/coreclr Sep 21, 2017
Mark `EqualityComparer<T>.Default`'s getter as `[Intrinsic]` so
the jit knows there is something special about it. Extend the jit's
named intrinsic recognizer to recognize this method.

Add a new jit interface method to determine the exact type returned
by `EqualityComparer<T>.Default`, given `T`. Compute the return type by
mirroring the logic used in the actual implementation.

Invoke this interface method when trying to devirtualize calls where
the 'this' object in the call comes from `EqualityComparer<T>.Default`.
Handle both the early and late devirtualization possibilties.

The devirtualized method can then be inlined if devirtualization
happens early; inlining uses the normal jit heuristics here.

Closes #6688.
AndyAyersMS referenced this issue in AndyAyersMS/coreclr Sep 24, 2017
Mark `EqualityComparer<T>.Default`'s getter as `[Intrinsic]` so
the jit knows there is something special about it. Extend the jit's
named intrinsic recognizer to recognize this method.

Add a new jit interface method to determine the exact type returned
by `EqualityComparer<T>.Default`, given `T`. Compute the return type by
mirroring the logic used in the actual implementation.

Invoke this interface method when trying to devirtualize calls where
the 'this' object in the call comes from `EqualityComparer<T>.Default`.
Handle both the early and late devirtualization possibilties.

The devirtualized method can then be inlined if devirtualization
happens early; inlining uses the normal jit heuristics here.

Closes #6688.
AndyAyersMS referenced this issue in AndyAyersMS/coreclr Sep 26, 2017
Mark `EqualityComparer<T>.Default`'s getter as `[Intrinsic]` so
the jit knows there is something special about it. Extend the jit's
named intrinsic recognizer to recognize this method.

Add a new jit interface method to determine the exact type returned
by `EqualityComparer<T>.Default`, given `T`. Compute the return type by
mirroring the logic used in the actual implementation. Bail out when
`T` is not final as those cases won't simplify down much and lead to
code bloat.

Invoke this interface method when trying to devirtualize calls where
the 'this' object in the call comes from `EqualityComparer<T>.Default`.

The devirtualized methods can then be inlined. Since the specific comparer
`Equal` and `GetHashCode` methods look more complicated in IL than they
really are, mark them with AggressiveInlining attributes.

If devirtualization and inlining happen, it is quite likely that the value
of the comparer object itself is not used in the body of the comparer. This
value comes from a static field cache on the comparer helper.

When the comparer value is ignored, the jit removes the field access since it
is non-faulting. It also removes the the class init helper that is there to
ensure that the (no-longer accessed) field is properly initialized. This helper
has relatively high overhead even in the fast case where the class has been
initialized aready.

Add a perf test.

Closes #6688.
AndyAyersMS referenced this issue in dotnet/coreclr Sep 27, 2017
Mark `EqualityComparer<T>.Default`'s getter as `[Intrinsic]` so
the jit knows there is something special about it. Extend the jit's
named intrinsic recognizer to recognize this method.

Add a new jit interface method to determine the exact type returned
by `EqualityComparer<T>.Default`, given `T`. Compute the return type by
mirroring the logic used in the actual implementation. Bail out when
`T` is not final as those cases won't simplify down much and lead to
code bloat.

Invoke this interface method when trying to devirtualize calls where
the 'this' object in the call comes from `EqualityComparer<T>.Default`.

The devirtualized methods can then be inlined. Since the specific comparer
`Equal` and `GetHashCode` methods look more complicated in IL than they
really are, mark them with AggressiveInlining attributes.

If devirtualization and inlining happen, it is quite likely that the value
of the comparer object itself is not used in the body of the comparer. This
value comes from a static field cache on the comparer helper.

When the comparer value is ignored, the jit removes the field access since it
is non-faulting. It also removes the the class init helper that is there to
ensure that the (no-longer accessed) field is properly initialized. This helper
has relatively high overhead even in the fast case where the class has been
initialized aready.

Add a perf test.

Closes #6688.
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@msftgits msftgits added this to the Future milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 29, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI enhancement Product code improvement that does NOT require public API changes/additions optimization tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

5 participants