Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further reduce Linq usage in ComponentModel.Annotations #57897

Open
drieseng opened this issue Aug 22, 2021 · 9 comments
Open

Further reduce Linq usage in ComponentModel.Annotations #57897

drieseng opened this issue Aug 22, 2021 · 9 comments

Comments

@drieseng
Copy link
Contributor

drieseng commented Aug 22, 2021

In PR #56753 usage of Linq was reduced in System.ComponentModel.Annotations.
My first attempt to reduce this further (PR #57392) failed miserably, so I'd like to first indicate some other opportunities before I spend time implementing and benchmarking these.

The changes I'm proposing should:

  • improve performance
  • reduce allocations
  • make System.ComponentModel.Annotations more linker friendly by avoiding to bring in System.Linq.

Once these opportunities have been validated, I'll implement the changes and use benchmarks to show the effect these changes have. I understand of course that - even after initial validation - some changes may still be rejected.
I only hope that the probability will be a lot lower.

Here are the changes I'm proposing:

AssociatedMetadataTypeTypeDescriptor

  • We should be able to improve performance here by "manually" creating HashSets for mainTypeMemberNames and buddyTypeMembers. To be validated with a benchmark of course.
  • We can avoid creating a new array here and just pass the HashSet<string>. The code probably dates back from when String.Join(String, IEnumerable<String>) didn't exist.
  • We can avoid using Enumerable.FirstOrDefault<TSource>(this System.Collections.Generic.IEnumerable<TSource> source) here and just check if Type.GetMember(...) returned an array with one or more entries.
  • Use AttributeCollection.CopyTo(Array array, int index) here to obtain array of attributes.

CustomValidationAttribute

  • Use Type.GetMethod(String, BindingFlags) to retrieve the MethodInfo. This is both a performance optimalization and a minor bug fix. After this change we’ll throw the same exception as .NET Framework.
    I’ll of course add a corresponding unit test.

FileExtensionsAttribute

  • Use String.Join(String, IEnumerable<String>) here to avoid Linq.
  • Use a regular foreach with yield return here to avoid Linq.
  • Use a regular foreach with an InvariantCultureIgnoreCase comparison here to avoid Linq and avoid the extra allocation of String.ToLowerInvariant().

Validator

  • Update GetObjectPropertyValidationErrors(…) and GetPropertyValues(…) to return List<T> instead of respectively IEnumerable<ValidationError> and ICollection<KeyValuePair<ValidationContext, object?>>. This avoids interface dispatch and allow us to directly use the result here and here. This in turn eliminates allocation of an extra list (and the List<T>.AddRange(…) invocations).
  • Check if Count is greater than zero here instead of using Enumerable.Any().

ValidationAttributeStore

  • Lazily scan the attributes here to distinguish attributes of type ValidationAttributes and DisplayAttribute without use of Linq and change the type of StoreItem.ValidationAttributes from IEnumerable<ValidationAttribute> to List<ValidationAttribute>. I used the volatite/Interlock.CompareExchange "trick" to lazily get the attributes, and BDN reported a saving of 5 KB for a single TypeStoreItem (I temporily disabled caching of TypeStoreItem to run the benchmark). To be discussed and investigated in detail.
  • Immediately initialize the dictionary here with the correct capacity.

For those changes that are validated, please indicate which changes - if any - can be grouped in a single PR.

@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.ComponentModel.DataAnnotations untriaged New issue has not been triaged by the area owner labels Aug 22, 2021
@ghost
Copy link

ghost commented Aug 22, 2021

Tagging subscribers to this area: @ajcvickers, @bricelam, @roji
See info in area-owners.md if you want to be subscribed.

Issue Details

In PR #56753 usage of Linq was reduced in System.ComponentModel.Annotations.
My first attempt to reduce this further (PR #57392) failed miserably, so I'd like to first indicate some other opportunities before I spend time implementing and benchmarking these.

The changes I'm proposing should:

  • improve performance
  • reduce allocations
  • make System.ComponentModel.Annotations more linker friendly by avoiding to bring in System.Linq.

Once these opportunities have been validated, I'll implement the changes and use benchmarks to show the effect these changes have. I understand of course that - even after initial validation - some changes may still be rejected.
I only hope that the probability will be a lot lower.

Here are the changes I'm proposing:

AssociatedMetadataTypeTypeDescriptor

  • We should be able to improve performance here by "manually" creating HashSets for mainTypeMemberNames and buddyTypeMembers. To be validated with a benchmark of course.
  • We can avoid creating a new array here and just pass the HashSet. The code probably dates back from when String.Join(String, IEnumerable<String>) didn't exist.
  • We can avoid using Enumerable.FirstOrDefault<TSource>(this System.Collections.Generic.IEnumerable<TSource> source) here and just check if Type.GetMember(...) returned an array with one or more entries.
  • Use AttributeCollection.CopyTo(Array array, int index) here to obtain array of attributes.

CustomValidationAttribute

  • Use Type.GetMethod(String, BindingFlags) to retrieve the MethodInfo. This is both a performance optimalization and a minor bug fix. After this change we’ll throw the same exception as .NET Framework.
    I’ll of course add a corresponding unit test.

FileExtensionsAttribute

  • Use String.Join(String, IEnumerable<String>) here to avoid Linq.
  • Use a regular foreach with yield return here to avoid Linq.
  • Use a regular foreach with an InvariantCultureIgnoreCase comparison here to avoid Linq and avoid the extra allocation of String.ToLowerInvariant().

Validator

  • Update GetObjectPropertyValidationErrors(…) and GetPropertyValues(…) to return List instead of respectively IEnumerable and ICollection<KeyValuePair<ValidationContext, object?>>. This avoids interface dispatch and allow us to directly use the result here and here. This in turn eliminates allocation of an extra list (and the List<T>.AddRange(…) invocations).
  • Check if Count is greater than zero here instead of using Enumerable.Any().

ValidationAttributeStore

  • Lazily scan the attributes here to distinguish attributes of type ValidationAttributes and DisplayAttribute without use of Linq and we can also change the type of StoreItem.ValidationAttributes from IEnumerable to List. I used the volatite/Interlock.CompareExchange "trick" to lazily get the attributes, and BDN reported a saving of 5 KB for a single TypeStoreItem (I temporily disabled caching of TypeStoreItem to run the benchmark). To be discussed and investigated in detail.
  • Immediately initialize the dictionary here with the correct capacity.

For those changes that are accepted, please indicate which changes - if any - can be grouped in a single PR.

Author: drieseng
Assignees: -
Labels:

area-System.ComponentModel.DataAnnotations, untriaged

Milestone: -

@stephentoub
Copy link
Member

Thanks, @drieseng.

I'm not convinced throughput improvements to System.ComponentModel.Annotations matter, though I'm happy to be convinced otherwise with data. Does any of this code show on hot paths? Does it show up on startup paths where these changes will help reduce startup, maybe by reducing JIT'ing or something like that?

If there are places where the use of LINQ is gratuitous, and the same functionality can be implemented without LINQ and with minimal additional (or even less) code, I'm all for replacing it. I skimmed your list and some of your examples fall into this category.

With regards to linker friendliness, it's important to keep in mind that all it takes is one call anywhere in an app to a given method, and that method needs to then be kept. So if we delete all of our use of method M in order to enable M to be trimmed away but then the app uses M, it can't be trimmed away. Worse, if we delete our use of M by replacing it with a lot of custom code, not only do we now have more code to maintain, but if M does end up getting used by the app, we've actually increased code size, as now the app has to carry both M and our custom replacement for it.

@drieseng
Copy link
Contributor Author

I can't help but get the impression that my email address has the wrong domain name.
Why are changes in - for example - #56753 that different (except for the fact that no benchmark results were provided at all)?

@stephentoub
Copy link
Member

stephentoub commented Aug 23, 2021

Why are changes in - for example - #56753 that different

They're not. For example, it replaced use of Any() with use of Length > 0, and use of enumType.GetCustomAttributes(typeof(FlagsAttribute), false).Any(); with IsDefined, and use of Cast by just using the propertly typed variable, etc. Those are all good examples of "If there are places where the use of LINQ is gratuitous, and the same functionality can be implemented without LINQ and with minimal additional (or even less) code, I'm all for replacing it." You can also see @eerhardt's comment about the issue it was helping to fix.

I can't help but get the impression that my email address has the wrong domain name.

I don't know what that means...?

@drieseng
Copy link
Contributor Author

Changes from people with the microsoft domain are just more readily accepted. I can even understand this to some degree.

I was trying to give something back, but instead it appears I've been wasting mine and your time.

@Wraith2
Copy link
Contributor

Wraith2 commented Aug 23, 2021

Changes from people with the microsoft domain are just more readily accepted.

From what I've seen internal suggestions are held to the same standard but much of the discussion can be had in person instead of on github. Watch some of the api review meetings on youtube and you'll see that there's a lot of pushback internally and the quality bar is strictly upheld for everyone.

Some of your suggestions are good. I'd advise removing some of the smaller uses of linq in a PR and seeing what review feedback you get. Any negative reviews will have to explain their reasoning and you'll be able to use that feedback to either iterate or help you better understand the boundaries or what will be acceptable in future PR's.

@stephentoub
Copy link
Member

stephentoub commented Aug 23, 2021

I was trying to give something back, but instead it appears I've been wasting mine and your time.

I'm sorry that's the impression you're getting; that is not what I've been trying to convey. I explicitly wrote:
"If there are places where the use of LINQ is gratuitous, and the same functionality can be implemented without LINQ and with minimal additional (or even less) code, I'm all for replacing it. I skimmed your list and some of your examples fall into this category." If you'd like to submit PRs for these, we'd happily accept them.

For the broader cases, where there are tradeoffs to be made, there needs to be good reason to churn the code. If such reasons exist, we'd also happily accept such fixes, too, but it'll involve more discussion to understand the benefits vs cost.

@drieseng
Copy link
Contributor Author

@stephentoub Let me apologize for taking the feedback too negatively then. It's mostly the first paragraph that gave me the impression that even the smallest change had to be strongly motivated (more on this below). I did not, and do not, expect all the changes I identified to be validated (meaning, even considered for a PR).

Is it ok if I number each change, so you can indicate for which changes you'd accept a PR? I don't want to submit 10 PRs, and get that first paragraph as feedback for each of them. I created this issue to make sure I don't implement changes that don't even have a chance of being accepted.

I already mentioned this in the other PR: I'm definitely not a (heavy) user of System.ComponentModel.Annotations. I actually have close to zero benefit in improving it for myself (or even zero, unless through unintenional usage). I just though it was a low bar entry into contributing to .NET. If you expect another motivation other than the items I listed in the main description and altruism, then I'll have to disappoint you.

@stephentoub
Copy link
Member

Is it ok if I number each change

Sure

@ajcvickers ajcvickers removed the untriaged New issue has not been triaged by the area owner label Sep 9, 2021
@ajcvickers ajcvickers added this to the Future milestone Sep 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants