Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New methods for EqualityComparer<T> #6407

Closed
omariom opened this issue Jul 29, 2016 · 14 comments
Closed

New methods for EqualityComparer<T> #6407

omariom opened this issue Jul 29, 2016 · 14 comments

Comments

@omariom
Copy link
Contributor

omariom commented Jul 29, 2016

EqualityComparer<T>.Default is convenient when we need the ability to compare values to be passed around as a thing. But if just need to call Equals(x,y) in a generic context we have to pay the cost of not inlined dynamicaly dispatched calls plus checks that the static field holding the default comparer has been initialized. It is especially bad for primitive and simple structs.

I propose adding the following static methods to EqualityComparer<T>:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool DefaultEquals(T x, T y)
{
    if (typeof(T) == typeof(int))  return ((int)(object)x).Equals((int)(object)y);
    if (typeof(T) == typeof(long)) return ((long)(object)x).Equals((long)(object)y);

    // the same for other primitive values types 
    // and possibly for DateTime,DateTimeOffset, TimeSpan

    return Default.Equals(x, y);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static int DefaultGetHashCode(T obj)
{
    if (typeof(T) == typeof(int))  return ((int)(object)obj).GetHashCode();
    if (typeof(T) == typeof(long)) return ((long)(object)obj).GetHashCode();

    // the same for other primitive values types 
    // and possibly for DateTime,DateTimeOffset, TimeSpan

    return Default.GetHashCode(obj);
}

Types and code that could benefit from the change:

1. Any code that calls EqualityComparer.Default.Equals and EqualityComparer.Default.GetHashCode where T is a primitive type.

2. Tuples

This is how a call to ValueTuple<int, long>.Equals looks now:

; Assembly listing for method CoreClrTest.Program:TestTupleEquals(struct,struct):bool
       push     rdi
       push     rsi
       push     rbx
       sub      rsp, 32
       mov      rsi, rcx
       mov      edi, dword ptr [rdx]
       mov      rbx, qword ptr [rdx+8]
       mov      rcx, 0xD1FFAB1E
       mov      edx, 68
       call     CORINFO_HELP_CLASSINIT_SHARED_DYNAMICCLASS
       mov      rcx, 0xD1FFAB1E
       mov      rcx, gword ptr [rcx]
       mov      edx, dword ptr [rsi]
       mov      r8d, edi
       mov      rax, qword ptr [rcx]
       mov      rax, qword ptr [rax+72]
       call     qword ptr [rax+32]System.Collections.Generic.EqualityComparer`1[Int32][System.Int32]:Equals(int,int):bool:this
       test     eax, eax
       je       SHORT G_M17236_IG03
       mov      rcx, 0xD1FFAB1E
       mov      edx, 72
       call     CORINFO_HELP_CLASSINIT_SHARED_DYNAMICCLASS
       mov      rcx, 0xD1FFAB1E
       mov      rcx, gword ptr [rcx]
       mov      rdx, qword ptr [rsi+8]
       mov      r8, rbx
       mov      rax, qword ptr [rcx]
       mov      rax, qword ptr [rax+72]
       call     qword ptr [rax+32]System.Collections.Generic.EqualityComparer`1[Int64][System.Int64]:Equals(long,long):bool:this
       jmp      SHORT G_M17236_IG04
G_M17236_IG03:
       xor      eax, eax
G_M17236_IG04:
       add      rsp, 32
       pop      rbx
       pop      rsi
       pop      rdi
       ret      
; Total bytes of code 130, prolog size 7 

and how it could look with direct calls to DefaulsEquals.

; Assembly listing for method CoreClrTest.Program:TestTupleEquals(struct,struct):bool
       sub      rsp, 40
       mov      eax, dword ptr [rdx]
       mov      rdx, qword ptr [rdx+8]
       mov      r8d, dword ptr [rcx]
       cmp      r8d, eax
       sete     al
       movzx    rax, al
       test     eax, eax
       je       SHORT G_M17242_IG03
       mov      rax, qword ptr [rcx+8]
       cmp      rax, rdx
       sete     al
       movzx    rax, al
       jmp      SHORT G_M17242_IG04
G_M17242_IG03:
       xor      eax, eax
G_M17242_IG04:
       add      rsp, 40
       ret      
; Total bytes of code 48, prolog size 4

3. Dictionaries

Currently each successful lookup requires at least 2 interface calls to comparer's GetHashCode and Equals . We could check if the comparer Dictionary gets is null or the default one and devitalize the calls as DefaultGetHashCode and DefaultEquals.
In my experiments it makes lookups 40-45% faster when TKey is int.

p.s.

Wishes

We could implement DefaultGetHashCode and DefaultEquals as:

public static bool DefaultGetHashCode(T obj)
{
    if (typeof(T).IsValueType && !typeof(T).IEnum) 
        return obj.GetHashCode();

    return Default.GetHashCode(obj);
}
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static bool DefaultEquals(T x, T y)
{
    if (typeof(T).IsValueType && x is IEquatable<T>)  
        return ((IEquatable<T>)(object)x).Equals((T)(object)y);

    return Default.Equals(x, y);
}

but typeof(T).IsValueType, typeof(T).IEnum are not JIT time constants yet and x is IEquatable<T> is not treated by JIT as generic constraint.

/cc @stephentoub @jkotas @KrzysztofCwalina @jamesqo @mikedn

@jamesqo
Copy link
Contributor

jamesqo commented Jul 29, 2016

But if just need to call Equals(x,y) in a generic context we have to pay the cost of not inlined dynamicaly dispatched calls

This would probably best be done as a CLR optimization, rather than adding new public APIs that make up for deficiencies in the runtime. I've read for example that the JVM is able to inline virtual methods, although I don't know how easy this would be to do for the CLR.

plus checks that the static field holding the default comparer has been initialized.

I thought that was fixed with dotnet/coreclr#4340.

I propose adding the following static methods to EqualityComparer:

We'd end up having to maintain 2 different implementations of Default.Equals: the virtual methods, and the ones you propose adding. It could introduce bugs where the behavior of the two methods differ.

  1. Any code that calls EqualityComparer.Default.Equals and EqualityComparer.Default.GetHashCode.

Not necessarily, since most of those calls are made as them typed as IEqualityComparer<T>. Most container types take an IEqualityComparer<T> as part of their constructor so you can plug in whatever custom comparison logic you wish.

  1. Tuples

When the C# 7 feature is finally introduced, I think most people will find it more natural to compare the items one at a time, e.g.

(name, age) = GetPersonTuple();
if (name == "John" && age == 13)
{
    // ...
}

as opposed to

if (GetPersonTuple.Equals(("John", 13)))
{
}

If people are concerned with performance, they should just avoid calling ValueTuple.Equals and go with the first form.

We could check if the comparer Dictionary gets is null or the default one and devitalize the calls as DefaultGetHashCode and DefaultEquals.

That sounds very flaky, and would add an additional untaken branch for the cases where the comparer passed in was not the default comparer. It would also require additional bookkeeping logic on the part of Dictionary.

@OzieGamma
Copy link

  1. Tuples

In C# 7 instead of writing:

(name, age) = GetPersonTuple();
if (name == "John" && age == 13)
{
    // ...
}

I think the following code would be as/more natural:

(name, age) = GetPersonTuple();
if ((name, age) == ("Jhon", 13))
{
    // ...
}

OR

tuple = GetPersonTuple();
if (tuple == ("Jhon", 13))
{
    // ...
}

Since creating a Tuple is going to be more easy, I expect more of them to be created. I would even use the (a, b, c) == (A, B, C) syntax instead of all the &&. It is just easier to understand.

When/if Tuple patterns arrive in C#, people might write it as (a, b, c) is (A, B, C) instead. If those optimizations are not to hard to implement they would definitely be worth it.

@omariom
Copy link
Contributor Author

omariom commented Jul 29, 2016

@jamesqo

This would probably best be done as a CLR optimization, rather than adding new public APIs that make up for deficiencies in the runtime.

Yes, but I think this specific case is unlikely to be implemented soon.

I thought that was fixed with dotnet/coreclr#4340.

Can it be because mscorlib is always crossgened?
Anyway it is still a dynamically dispatched call.

We'd end up having to maintain 2 different implementations of Default.Equals : the virtual methods, and the ones you propose adding.

They will serve different purposes: direct calls to DefaultEquals and passing instance of IEquailityComparer<T> to LINQ methods, for example.

It could introduce bugs where the behavior of the two methods differ.

We will be here and won't allow it to happen )

Not necessarily, since most of those calls are made as them typed as IEqualityComparer .

An example of real code:

        protected bool Set<T>(ref T field, T value, [CallerMemberName] string propertyName = null)
        {
            if (string.IsNullOrEmpty(propertyName))
                ThrowArgumentNullException(nameof(propertyName));

            if (!EqualityComparer<T>.Default.Equals(field, value))
            {
                field = value;
                RaisePropertyChanged(propertyName);
                return true;
            }

            return false;
        }

EqualityComparer<T>.Default.Equals is called 3 times in ConcurrentDictionary, 3 times in Dictionary, 1 time in List<T> etc

That sounds very flaky, and would add an additional untaken branch for the cases where the comparer passed in was not the default comparer. It would also require additional bookkeeping logic on the part of Dictionary.

Just 2 aggressively inlined methods in Dictionary. Minimal changes for FindEntry, for example.
Devirtualization is quite common optimization practice. And 40% better perf speaks for itself. The cost of the check and the branch is minimal compared with interface dispatch - the slowest of all .NET's dynamic dispatches. The case requires thorough perf tests, of course.

Imagine a Dictionary<TupleValue<int, long>, Foo>. Every check for equality leads to one interface call, and then two virtual calls. It is just to compare 4 primitive variables.

@omariom
Copy link
Contributor Author

omariom commented Jul 29, 2016

@OzieGamma

I think the following code would be as/more natural:

tuple = GetPersonTuple();
if (tuple == ("Jhon", 13))
{
    // ...
}

Good idea! ValueTuple doesn't currently have equality operators. You can create an issue and a PR in CoreFx.

@KrzysztofCwalina
Copy link
Member

KrzysztofCwalina commented Jul 29, 2016

I understand the concerns raised by @jamesqo, but I have to say I think the wins here are just too good to ignore, and I am skeptical we can get similar wins by doing the work in the runtime[s] anytime soon. i.e. I would be nice we our optimized solved this problem, but I think these API is a more practical (viable from cost/value perspective) way to get similar wins.

@jamesqo
Copy link
Contributor

jamesqo commented Jul 29, 2016

@omariom

An example of real code:

In the specific example you've shown, I think the overhead of raising the event/etc. and whatnot will be much greater than the overhead of a single virtual call.

1 time in List etc

It's not called at all in List. I recently submitted a PR in fact (#6212) to optimize List.Contains to use Array.IndexOf instead of calling EqualityComparer.Equals per iteration of the loop, since with the former it's just a single virtual call as opposed to N virtual calls. (IndexOf/LastIndexOf is specialized for all the comparers.) So this further limits the new API's use case to places that don't search through an array for an item, since in those cases Array.IndexOf / Array.LastIndexOf could simply be substituted to give a perf speedup. (In fact, it may be even slower for types like enums where we have to fall back to Default.Equals.)

@omariom @KrzysztofCwalina Maybe we should instead add DefaultEquals as an internal method, and let the standard collection types which live in mscorlib use that. (Or at least put the method in the System.Runtime.CompilerServices namespace.) I can't imagine anyone outside of collection writers/micro-optimizers who would want to use the new public method, when they could just write Default.Equals.

@omariom
Copy link
Contributor Author

omariom commented Jul 29, 2016

@jamesqo

Maybe we should instead add DefaultEquals as an internal method

ValueTuple is in CoreFx.

@GSPP
Copy link

GSPP commented Jul 30, 2016

A different solution would be to make the JIT use the type information of values stored in a static readonly field. The JIT is able to look at those values already. E.g. static readonly bool b = false; ... if (b) ... is statically deleted (I just re-tested it to make sure).

Here, the JIT could use the known receiver type to devirtualize the calls.

If there are concerns regarding reflection setting readonly fields: a) The JIT already disregards that and b) it could generate if (receiverTypeMatchesExpection) DirectCall(); else VCall();. This check would almost always be true, except if the field was (illegally) overwritten using reflection. The branch is predictable, so it costs ~1 cycle.

This optimization would help in all kinds of situations.

@jamesqo
Copy link
Contributor

jamesqo commented Jul 30, 2016

Here, the JIT could use the known receiver type to devirtualize the calls.

As @omariom mentioned before, it doesn't look like the JIT does any devirtualization at the moment.

I think a more practical thing to do would be to recognize code patterns of EqualityComparer<T>.Default.Equals or GetHashCode and transform those to regular Equals / GetHashCode calls on the object itself for different generic instantiations. Then, when/if RyuJIT does do devirtualization, that can be removed and devirtualization will be used in place.

@jkotas
Copy link
Member

jkotas commented Jul 31, 2016

The .NET Native compiler for UWP does similar transformation of EqualityComparer<T>.Default.Equals to reduce binary size for full AOT. You can see the class library side of the implementation here: https://github.com/dotnet/corert/blob/master/src/System.Private.Reflection.Execution/src/Internal/IntrinsicSupport/EqualityComparerHelpers.cs. The compiler side of the implementation is closed source unfortunately, but there is no rocket science in it.

This optimization would make RyuJIT better AOT codegenerator for CoreRT project, closer to the .NET Native compiler for UWP compiler.

@omariom
Copy link
Contributor Author

omariom commented Jul 31, 2016

@jkotas @jamesqo
That's even better. It will allow to devirtualize and may be inline calls to any IEquatable<T> struct, like TupleValue or user defined structs. Would be interesting to see how much faster dictionary lookups (with tuple keys) become.

@omariom
Copy link
Contributor Author

omariom commented Aug 5, 2016

What have been decided?
@jkotas, will you create a separate issue?

/cc @KrzysztofCwalina

@jkotas
Copy link
Member

jkotas commented Aug 5, 2016

Even if we introduced a new API, we would still need to do special things for it in the runtime/JIT to make it work well - without performance cliffs. So this issue looks like runtime/JIT optimization to me. The new APIs maybe nice, but they are not required to realize the perf benefits.

The beauty of doing this as JIT optimization is that it will kick in for all existing code, without any changes.

@omariom
Copy link
Contributor Author

omariom commented Aug 5, 2016

Then I am closing this issue in favor of a JIT specific one.

https://github.com/dotnet/coreclr/issues/6688

@omariom omariom closed this as completed Aug 5, 2016
@msftgits msftgits transferred this issue from dotnet/coreclr Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 30, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

6 participants