-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Optimization: Reduce many array allocations in GetCustomAtttributes()'s inherited path #20779
Optimization: Reduce many array allocations in GetCustomAtttributes()'s inherited path #20779
Conversation
This is a follow-up to many allocations recognized in dotnet#20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes.
{ | ||
if (_emptyArray == null) | ||
{ | ||
MethodInfo method = typeof(Array).GetMethod(nameof(Array.Empty)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be using reflection to get to Array.Empty
. Getting to Array.Empty
this way is expensive and we are not really benefiting from caching done by Array.Empty
here since we have our own cache.
It should just cache the empty array created by CreateAttributeArrayHelper
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were worried about allocating a duplicate empty here, but I agree in the scheme of changes here that's a bit silly given we did so for every single call previously - simplified in latest:
return _emptyArray ?? (_emptyArray = Attribute.CreateAttributeArrayHelper(m_runtimeType, 0));
{ | ||
return useObjectArray ? Array.Empty<object>() : caType.GetEmptyArray(); | ||
} | ||
|
||
object[] typedResult = CreateAttributeArrayHelper(arrayType, result.Count); | ||
Array.Copy(result.ToArray(), 0, typedResult, 0, result.Count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing result.ToArray()
here that allocates a temporary array, it should be better to just go over the list and assign one element at a time. Another option would be to use a bulk copy but that's likely more expensive to setup here since there are a very few attributes in typical case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on both counts - simple loop copy added here, perf is much better.
@@ -1294,6 +1294,11 @@ internal static object[] GetCustomAttributes(RuntimeType type, RuntimeType caTyp | |||
type = type.BaseType as RuntimeType; | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can replace List<object>
with ListBuilder<object>
to save another allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I follow this note - ListBuilder<T>
is private inside RuntimeType
and expects a known capacity at start, do you want to consider changing either of those things? Or aiming at ValueListBuilder<T>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default ListBuilder<T>
constructor does not need capacity, and changing it to internal should be fine too.
ValueListBuilder
would work too, but probably not as well as ListBuilder<T>
in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it - I'm still figuring out what scope of changes is acceptable in which areas of CoreFX/CoreCLR, so I really appreciate the feedback! ListBuilder<T>
is exposed/used now, and I've greatly expanded the benchmarks above to illustrate the gains there. Thanks!
@@ -1462,6 +1462,7 @@ internal RuntimeType ReflectedType | |||
private static object s_methodInstantiationsLock; | |||
private string m_defaultMemberName; | |||
private object m_genericCache; // Generic cache for rare scenario specific data. It is used to cache Enum names and values. | |||
private object[] _emptyArray; // Object array cache for Attribute.GetCustomAttributes() pathological no-result case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this can reuse m_genericCache
to avoid adding more fields here.
…ibuteArrayHelper This exposes Attribute.CreateAttributeArrayHelper to internal and uses it directly on the RuntimeType caching for empty arrays. Though this allocated 1 additional array overall, it's simpler, faster to init, and still is an infinite win over the per-call allocations before this overall changesets.
This removes a .ToArray() for the sake of Array.Copy() where a simple for loop suffices and removes the allocation. Reversing the "empty" result checks is also just a bit cleaner here. This also expands the same fix to the MemberInfo path. Note: should DRY these up too (longstanding issue) - but let's do that in a separate commit for clarity.
This exposes RuntimeType.ListBuilder<T> for internal usage and replaces the List<T> allocation in GetCustomAttributes() paths to reduce allocations and increase performance in the inherited crawl paths (which is the default for many optional-parameter methods in layers above). Note: there is a subtle behavior depending on previous-null (not possible with a struct now) in AttributeUsageCheck() that I believe still behaves correctly, but could use another set of eyes and a full test suite run to confirm. object[] attributes was removed there simply because it wasn't used before - only cleaning up.
@@ -1743,7 +1759,7 @@ internal static bool IsAttributeDefined(RuntimeModule decoratedModule, int decor | |||
|
|||
#region Private Static Methods | |||
private static bool AttributeUsageCheck( | |||
RuntimeType attributeType, bool mustBeInheritable, object[] attributes, IList derivedAttributes) | |||
RuntimeType attributeType, bool mustBeInheritable, RuntimeType.ListBuilder<object> derivedAttributes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ListBuilder
is struct
. It should be passed around via ref
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed - looking at this locally. private static unsafe object[] GetCustomAttributes()
on 1496 needs refactor love though, as the old method is a bit brittle there - seeing about a refactor to make this cleaner and efficient.
} | ||
return typedResult; | ||
} | ||
// Return a cached, empty array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are other places where CreateAttributeArrayHelper
is called for zero count. It would be better to have the special casing for zero length to be in CreateAttributeArrayHelper
instead at every callsite.
CreateAttributeArrayHelper(Type elementType, int elementCount)
{
return (elementCount != 0) ?
Array.CreateInstance(elementType, elementCount) : elementType.GetEmptyArray();
}
And call Array.CreateInstance
in GetEmptyArray directly, without going back to CreateAttributeArrayHelper
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the return type of CreateAttributeArrayHelper
should be object[]
to make this work well. Changing it will also fix a corner case bug with the useObjectArray special casing. object[]
cannot be cast to Attribute[]
. Looks like we do not have a test that covers it since you can create non-standard attributes that are ValueTypes in IL only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that explains the value types ternaries in here - I was wondering how the hell value types were allowed - had the same thought, but Attribute[]
return prevented it before from being generically usable. Agree on centralizing - will push in a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I had to look at this twice - may be confusing to look at in text here: there are 2 instances of this helper. The one in CustomAttribute
is already object[]
, the one on Attribute
is Attribute[]
(and needs to be, since the return there is more specific all the way up).
This also reverts the CreateAttributeArrayHelper => internal change, since it's no longer needed.
…plification This fixes the struct passing duplication and tweaks how we're creating arrays a bit, centralizing the zero-element checks to cover all cases as well as simplify the per-method code to rely on the fact this is happening underneath.
@jkotas @stephentoub another round of fixes & tweaks in - numbers updated in the post, looking good but more is still on the table. I yanked in your latest changes here and cleaned up the cases a bit per-method - ready for a look when you have time. I'll poke a bit more at the remaining allocations here - there's still a lot of arrays to relay private methods results up and down but they're shared so much I'm not seeing a clear refactor path here yet (and such should probably be a different PR, unless you disagree). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you!
bool mustBeInheritable = false; | ||
bool useObjectArray = (caType == null || caType.IsValueType || caType.ContainsGenericParameters); | ||
Type arrayType = useObjectArray ? typeof(object) : caType; | ||
RuntimeType arrayType = useObjectArray ? typeof(object) as RuntimeType : caType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally think of as
being used when we think it's possible the cast may fail, and want a null instead of an exception. But this should never fail, so can we use (RuntimeType)typeof(object)
instead? It's also more consistent with what's used elsewhere in this method already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely - done!
bool mustBeInheritable = false; | ||
bool useObjectArray = (caType == null || caType.IsValueType || caType.ContainsGenericParameters); | ||
Type arrayType = useObjectArray ? typeof(object) : caType; | ||
RuntimeType arrayType = useObjectArray ? typeof(object) as RuntimeType : caType; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto... same in other similar places.
Just FYI: I'm perplexed on how to eliminate the overall type-as-parameter and that being the return type problem here, but I'm poking |
I have a few more changes to further reduce allocations - since this is reviewed I'll fire up another PR after unless it's wanted here. Those two further reduce allocations on hits, more so in the inherited crawl path:
|
I have taken a cursory look and these look great. Thank you!
I am not sure what is struct unfriendly in |
For anyone curious, follow-up posted as #20795 |
…'s inherited path (dotnet#20779) * Optimization: avoid 2 array allocations in inherited attribute misses This is a follow-up to many allocations recognized in dotnet#20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes. * Move RuntimeType empty array cache generation to Attribute.CreateAttributeArrayHelper This exposes Attribute.CreateAttributeArrayHelper to internal and uses it directly on the RuntimeType caching for empty arrays. Though this allocated 1 additional array overall, it's simpler, faster to init, and still is an infinite win over the per-call allocations before this overall changesets. * CustomAttributes: remove needless array copy in the inherited hit case This removes a .ToArray() for the sake of Array.Copy() where a simple for loop suffices and removes the allocation. Reversing the "empty" result checks is also just a bit cleaner here. This also expands the same fix to the MemberInfo path. Note: should DRY these up too (longstanding issue) - but let's do that in a separate commit for clarity. * GetCusomAttributes: use ListBuilder<object> for inheritance crawls This exposes RuntimeType.ListBuilder<T> for internal usage and replaces the List<T> allocation in GetCustomAttributes() paths to reduce allocations and increase performance in the inherited crawl paths (which is the default for many optional-parameter methods in layers above). Note: there is a subtle behavior depending on previous-null (not possible with a struct now) in AttributeUsageCheck() that I believe still behaves correctly, but could use another set of eyes and a full test suite run to confirm. object[] attributes was removed there simply because it wasn't used before - only cleaning up. * Attribute caching: use Array.CreateInstance() directly on RuntimeType This also reverts the CreateAttributeArrayHelper => internal change, since it's no longer needed. * Ref passing for RuntimeType.ListBuilder<object> & CustomAttribute simplification This fixes the struct passing duplication and tweaks how we're creating arrays a bit, centralizing the zero-element checks to cover all cases as well as simplify the per-method code to rely on the fact this is happening underneath.
This benchmarks specifically GetCustomAttributes and IsDefined (used in dotnet/coreclr#20779 and dotnet/coreclr#20795) Note: there are many code paths through the attribute pipeline which vary drastically in terms of performance and allocations - so there's what seems to be an excessive number of benchmarks here, but they are all pretty unique.
* Adds initial benchmarks for System.Reflection: attributes This benchmarks specifically GetCustomAttributes and IsDefined (used in dotnet/coreclr#20779 and dotnet/coreclr#20795) Note: there are many code paths through the attribute pipeline which vary drastically in terms of performance and allocations - so there's what seems to be an excessive number of benchmarks here, but they are all pretty unique. * Add new category to perflab\ReflectionPerf * Consolidate classes PR feedback: don't follow the Math conventions here...go with what it should be :)
…'s inherited path (dotnet/coreclr#20779) * Optimization: avoid 2 array allocations in inherited attribute misses This is a follow-up to many allocations recognized in dotnet/coreclr#20448. A lot of methods ultimately call through CustomAttributes.GetCustomAttributes(). In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List<object> and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to. However, in the common miss case as simple as: typeof(T).GetCustomAttributes(typeof(TAttr), true); ...and many other overloads, all the same path underneath... We can avoid the last 2 arrays in the miss case. We have the List<object> to go off of. If that's a zero-entry list, we can return an Array.Empty<TAttr>(). That's effectively what this change does. While converting the entire attribute pipeline to generics is problematic and has issue since some object[] return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately from Array.Empty<T> underneath) and return that for the miss case. There are far more wins to be had here, but they require more changes. * Move RuntimeType empty array cache generation to Attribute.CreateAttributeArrayHelper This exposes Attribute.CreateAttributeArrayHelper to internal and uses it directly on the RuntimeType caching for empty arrays. Though this allocated 1 additional array overall, it's simpler, faster to init, and still is an infinite win over the per-call allocations before this overall changesets. * CustomAttributes: remove needless array copy in the inherited hit case This removes a .ToArray() for the sake of Array.Copy() where a simple for loop suffices and removes the allocation. Reversing the "empty" result checks is also just a bit cleaner here. This also expands the same fix to the MemberInfo path. Note: should DRY these up too (longstanding issue) - but let's do that in a separate commit for clarity. * GetCusomAttributes: use ListBuilder<object> for inheritance crawls This exposes RuntimeType.ListBuilder<T> for internal usage and replaces the List<T> allocation in GetCustomAttributes() paths to reduce allocations and increase performance in the inherited crawl paths (which is the default for many optional-parameter methods in layers above). Note: there is a subtle behavior depending on previous-null (not possible with a struct now) in AttributeUsageCheck() that I believe still behaves correctly, but could use another set of eyes and a full test suite run to confirm. object[] attributes was removed there simply because it wasn't used before - only cleaning up. * Attribute caching: use Array.CreateInstance() directly on RuntimeType This also reverts the CreateAttributeArrayHelper => internal change, since it's no longer needed. * Ref passing for RuntimeType.ListBuilder<object> & CustomAttribute simplification This fixes the struct passing duplication and tweaks how we're creating arrays a bit, centralizing the zero-element checks to cover all cases as well as simplify the per-method code to rely on the fact this is happening underneath. Commit migrated from dotnet/coreclr@6d9fc60
This is a follow-up to many allocations recognized in #20448.
A lot of methods ultimately call through
CustomAttributes.GetCustomAttributes()
. In the inherited search path (default for most searches above), the inheritance path of a class is traversed, resulting in an array allocation per crawled type, a copy to the overall List and then after that - in current code - an array allocation from that list and a typed array allocation it's copied to.However, in the common miss case as simple as:
...and many other overloads, all the same path underneath...
We can avoid the last 2 arrays in the miss case. We have the
List<object>
to go off of. If that's a zero-entry list, we can return anArray.Empty<TAttr>()
. That's effectively what this change does.While converting the entire attribute pipeline to generics is problematic and has issue since some
object[]
return abstracts aren't sealed, we can at least somewhat trivially cache an array per attribute type (only one static, ultimately fromArray.Empty<T>
underneath) and return that for the miss case.There are far more wins to be had here, but they require more changes.
Benchmarks are in flux at the moment (see #20763), but the following benchmark:
Yields the following gains with this change:
Before
Cached Array & loop copy
Above with
ref
ListBuilder<T>
usageThe gains are somewhat trivial on a single call (2 arrays and 24 bytes), but this is a path called a lot at scale and hopefully we agree multiplier on these changes is huge.
cc @benaadams @stephentoub @adamsitnik @noahfalk