-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve dictionary lookup perf for OrdinalIgnoreCase #36252
Improve dictionary lookup perf for OrdinalIgnoreCase #36252
Conversation
Tagging subscribers to this area: @eiriktsarpalis |
Also, the above chart should clearly demonstrate that |
[Fact] | ||
public void Dictionary_Comparer_NonRandomizedStringComparers() | ||
{ | ||
RunTest(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be better to use MemberData
in order to get separate pass/fail results for each of these comparers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that initially, but it was a bit cryptic to match the failure back to the argument instance. Maybe there's some secret sauce to getting a friendlier error message in that case?
Right now you get a line number as part of the exception if there's a failure. And while it's not ideal, at least it's a straightforward mapping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. Possibly it would require some trick such as passing a 2nd parameter that was the comparer name.
private NonRandomizedStringEqualityComparer() { } | ||
// Flag indicates whether this instance wraps EqualityComparer<string>.Default | ||
// or StringComparer.Ordinal. | ||
private readonly bool _wrapsOrdinalComparer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to guess this makes the dictionaries one pointer bigger. But I learned something new to me, it seems in at least some cases the CLR will pack boolean fields into the object header, apparently:
using System;
class Program
{
static void Main(string[] args)
{
int size = 50000;
X[] array = new X[size];
long before = GC.GetTotalMemory(true);
for (int i = 0; i < array.Length; i++)
{
array[i] = new X();
}
long after = GC.GetTotalMemory(true);
Console.WriteLine((after - before)/ size);
GC.KeepAlive(array);
}
}
class X // comment out all bools, and it prints 24 (on 64 bit)
{
bool a;
bool b;
bool c;
bool d;
bool e;
bool f;
bool g;
bool h; // with bools a-h it still shows 24
// bool i; // uncomment this and it shows 32
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkotas do I understand that right or am I misreading what I did?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The minimum object size with CoreCLR GC is 12 bytes on 32-bit platforms and 24 bytes on 64-bit platforms. If the payload is less, the object size is padded to the minimum size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - padding, of course. Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's on the singleton comparer instance, not the dictionary instance. That's why there are two singletons for the non-randomized equality comparer: one that wraps EqualityComparer<string>.Default
, and one that wraps StringComparer.Ordinal
. Their runtime behavior is identical. The only difference is that for compat I need to make sure that when you call the Dictionary<...>.Comparer
property getter I give you back the same comparer instance you used to instantiate the dictionary. So this information is smuggled via the comparer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer your other question re: how bools are stored, all objects in the CLR are at least 3 pointers (24 bytes on x64) in size. The first two pointers are collectively referred to as the "object header". The object's member data always begins just after these two pointers.
Due to how the GC handles variable-length objects (string
, Array
, Utf8String
), the length field must always be the first thing directly after the object header. When walking objects, the GC always attempts to dereference this field, even for normal fixed-length objects which don't have a length field. It'll eventually get ignored for these types of objects. But the interesting consequence to this is that for all objects there must be a pointer's worth of dereferenceable data immediately after the object header. It can be junk, but it must be dereferenceable. For normal fixed-size objects, the VM starts putting the member data at where this length field would have been.
That's why the object size doesn't change when you have between 0 - 8 bools, because they take up the space that was already carved out for this object's member data. But as soon as you have a 9th bool it spills over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GrabYourPitchforks but that wouldn't require each object to to be padded to at least 12/24 bytes, only the current gc segment would need to be at least 4/8 byte longer to ensure that there's valid memory after the last object on the gc segment to dereference there. The primary reason i've read for why it is that way is (plug is a continous memory range of surviving objects and gap is a continous range of dead objects / free space)
(from Pro .NET Memory Management)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Suchiman Sure, I suppose you could do this as an optimization, but why bother? The number of zero-length objects (basically, System.Object instances and nothing else) allocated is vanishingly small compared to the number of "useful" object instances allocated. Doesn't seem worthwhile for the GC to change from its existing logic due since we'd expect no real-world gains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GrabYourPitchforks this doesn't just affect zero-length objects but all objects that are smaller than 4/8 bytes, e.g. a few bools, but the point i was trying to make, was that this is unlikely to be the cause but rather an effect (if the objects are always that large, one can save on the check) of the objects being padded. The likely primary cause is that the GC needs the additional space for the plan phase of the GC (as seen in the previous excerpt from the book).
BTW DictionarySlim beats even the newer numbers you have posted here for OrdinalIgnoreCase. I'm now replacing many dictionaries with DictionarySlim ... so any improvements we could make there would be great too. |
{ | ||
internal static new IEqualityComparer<string?> Default { get; } = new NonRandomizedStringEqualityComparer(); | ||
internal static readonly IEqualityComparer<string?> AroundDefaultComparer = new NonRandomizedStringEqualityComparer(wrapsOrdinalComparer: false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Around
in the name looks odd to me. Should the names match StringComparer - Ordinal
and OrdinalIgnoreCase
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see what this means now.
I would not describe the relationship between NonRandomized and regular comparer as one wrapping the other. There is no pointer from the NonRandomized to the regular one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, Around confused me as well.
@mjsabby You mean the prototype |
{ | ||
internal interface INonRandomizedEqualityComparer<in T> : IEqualityComparer<T> | ||
{ | ||
IEqualityComparer<T> GetComparerForSerialization() => GetRandomizedComparer(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have two different methods? I would expect one should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Future-proofing, mainly. Check out the discussion over at #4761 (comment).
Summary: you may have types whose default hash code calculation is non randomized (e.g., int
, long
). If the dictionary starts to become unbalanced, you'll want to fall back to a randomized hash code calculation routine. But the default comparer is the non-randomized one, not the randomized one, so GetComparerForSerialization
would return the non-randomized comparer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add these abstractions only once they are required.
|
||
namespace System.Collections.Generic | ||
{ | ||
internal interface INonRandomizedEqualityComparer<in T> : IEqualityComparer<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this interface necessary? Can we just make NonRandomizedStringEqualityComparer unsealed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment it's only for string. But this also provides the infrastructure to have it work for other types as well, such as Utf8String
. (I didn't do that as part of this PR, but it was in one of the early prototypes back in the coreclr repo.)
You recommend deleting it for now and crossing that bridge later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Future proofing like this works against our goals around framework linkability.
@mjsabby are you intending to use string keys in your |
@danmosemsft @GrabYourPitchforks sorry, yes we copied the code and wrote a version that is doing case insensitive comparisons. We also freeze them to disk so we also had to rip out the seed. So my comments are not valid for this PR. |
|
||
namespace System.Collections.Generic | ||
{ | ||
internal interface INonRandomizedEqualityComparer<in T> : IEqualityComparer<T> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Regardless of whether we keep this interface in the end, I'd say I find the name a bit confusing, in the sense that most IEqualityComparer instances out there are non-randomized but do not implement this subtype. Perhaps IRandomizableEqualityComparer might be more suitable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe IFastEqualityComparer<in T>
? The premise of the interface is "My GetHashCode
routine is faster than a normal GetHashCode
routine, but I'm not collision-resistant. I'm intended as an implementation detail for collections who want optimistically to assume the data isn't malicious, but I also provide a fallback in case things go bad."
src/libraries/System.Private.CoreLib/src/System/Collections/Generic/Dictionary.cs
Outdated
Show resolved
Hide resolved
@GrabYourPitchforks, |
@stephentoub Absolutely! I'll update this PR to incorporate the earlier feedback and also get to |
Great, thanks. |
@jkotas I started looking at this again. Not seeing a way to work around having the interface or an abstract base class if we want to support both Ordinal and OrdinalIgnoreCase. Here's the implementation as proposed here: internal interface INonRandomized : IEqualityComparer<string> { }
internal sealed class NonRandomizedStringOrdinal : INonRandomized { }
internal sealed class NonRandomizedStringOrdinalIgnoreCase : INonRandomized { } Here, the dictionary has a reference to one of the two concrete implementations. The dictionary makes a virtual call to I could use a single concrete implementation Would this be a sufficient reason for keeping the interface around? It would allow us to get away with only a single indirection in the hot path. It introduces some extra indirections in |
Agree. I believe that NonRandomizedStringEqualityComparer should work as the base class (it does not need to be actually abstract):
My issue with internal |
This would be a significant win, it would be nice to get this committed before we fork for 5.0. |
Major changes since last iteration:
Those last two points deserve a little more discussion. We're in the process of updating internal SDL guidance with regard to cryptographic and non-cryptographic hash algorithms, and this also served as a good opportunity to talk with the Crypto Board regarding whether we're following best practice with our usage of these hash algorithms. Per those discussions, the best hygiene is to use different seeds per instance rather than an app-global seed where possible. So, once the Dictionary or HashSet type sees 100 collisions, it will instantiate a brand new equality comparer with a brand new seed. This does incur some overhead, but only after 100 collisions are seen, which means that the dictionary is already in its "be safe rather than fast" fallback behavior. Since this involves the creation of new non-randomized and randomized string equality comparer types, there's now the issue of what to do when serializing dictionary instances. Rather than complicate the serialization logic to special-case all of these components, I changed the |
IEqualityComparer<string?> GetUnderlyingEqualityComparer(); | ||
} | ||
|
||
internal static class InternalStringEqualityComparer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: The static method can be in IInternalStringEqualityComparer
. No need to have a separate type with just a single method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have other examples of where we've introduced static methods in interfaces?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not have much, if any. It got enabled in C# not too long ago. C# allows it now so we should be free to use it. Using it as internal implementation detail is low risk - we can undo it if necessary.
Would it be possible to gather updated perf numbers for Dictionary and HashSet now? Just to check. |
@danmosemsft Here are the updated perf numbers. In this table, Linguistic (
|
Those numbers look great to me. |
Failing test is tracked at #40416. |
string.GetHashCode
(and all built-inStringComparer.GetHashCode
routines) have built-in randomness to help prevent hash collision attacks. This means that collection types likeDictionary<string, ...>
,HashSet<string>
, and others (where TKey = string) get these protections for free automatically.However, hash code randomization does incur some perf penalty, so
Dictionary<string, ...>
special-cases two specific code paths. If you instantiate a dictionary instance vianew Dictionary<string, ...>(comparer: null)
ornew Dictionary<string, ...>(EqualityComparer<string>.Default)
, the dictionary instance will swap out the comparer and use a faster non-randomized comparer, falling back to the randomized comparer only when the dictionary's internal buckets become unbalanced.This PR also special-cases
new Dictionary<string, ...>(StringComparer.Ordinal)
andnew Dictionary<string, ...>(StringComparer.OrdinalIgnoreCase)
to get the same perf boost. No other comparers (such as the linguisticStringComparer
singletons) are special-cased.Perf results are below. Note that
StringComparison.Ordinal
andStringComparison.OrdinalIgnoreCase
see a significant perf boost. Other comparers remain within the range of noise.