-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidate LINQ's internal IIListProvider/IPartition into base Iterator class #98969
Conversation
LINQ has an internal base Iterator class that's used when operators manually implement enumeration rather than having the compiler implement it with an iterator. That base class includes several abstract/virtual methods, including a Select and Where virtual method that have been present since the beginning of LINQ: those are used in a case of A().B(), where B is Select or Where and where A can then improve the processing of B by returning a customized implementation aware of some aspect of both A and B (e.g. the enumerable returned from .Where().Select() includes both the where and select functionality in that single object). Over the years, other specialization has been added to LINQ, in the form of additional internal interfaces: IIListProvider and IPartition. These interfaces similarly enable optimizing sequences A().B(), where B is other LINQ methods, e.g. IIListProvider enables optimizing ToArray/ToList/Count, and IPartition enables optimizing Skip/Take/First/Last/ElementAt. There was a complicated venn diagram of which types implemented which interfaces and base type. This PR merges IIListProvider/IPartition into the base Iterator class. Everything from IPartition is virtual, enabling derivations to specialize just a subset of the functionality, and deduplicating some implementations that were providing the same implementation instead of having it shared in a base. Code that was type testing for the interfaces now type tests for the base class, which means we can delete some type testing where both the interfaces and the base class were previously being tested for. We no longer have this strange split across multiple optimization-focused internal implementation details, and instead have everything consolidated in the one base class. This also means that all of the calls that were previously interface dispatch are now virtual dispatch.
Tagging subscribers to this area: @dotnet/area-system-linq Issue DetailsLINQ has an internal base Iterator class that's used when operators manually implement enumeration rather than having the compiler implement it with an iterator. That base class includes several abstract/virtual methods, including a Select and Where virtual method that have been present since the beginning of LINQ: those are used in a case of A().B(), where B is Select or Where and where A can then improve the processing of B by returning a customized implementation aware of some aspect of both A and B (e.g. the enumerable returned from .Where().Select() includes both the where and select functionality in that single object). Over the years, other specialization has been added to LINQ, in the form of additional internal interfaces: IIListProvider and IPartition. These interfaces similarly enable optimizing sequences A().B(), where B is other LINQ methods, e.g. IIListProvider enables optimizing ToArray/ToList/Count, and IPartition enables optimizing Skip/Take/First/Last/ElementAt. There was a complicated venn diagram of which types implemented which interfaces and base type. This PR merges IIListProvider/IPartition into the base Iterator class. Everything from IPartition is virtual, enabling derivations to specialize just a subset of the functionality, and deduplicating some implementations that were providing the same implementation instead of having it shared in a base. Code that was type testing for the interfaces now type tests for the base class, which means we can delete some type testing where both the interfaces and the base class were previously being tested for. We no longer have this strange split across multiple optimization-focused internal implementation details, and instead have everything consolidated in the one base class. This also means that all of the calls that were previously interface dispatch are now virtual dispatch. I renamed some types along the way to clarify what they represent.
|
src/libraries/System.Linq/src/System/Linq/DefaultIfEmpty.SpeedOpt.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
This regressed the size of the "Stage 2" app by 0.5%. It may not look like much, but AFAIK we already got rid of a bunch of LINQ use in ASP.NET because of the size and don't use it much. I experimentally tried to compile the Stage 2 app using the optimized for size version of LINQ we use on iDevices and WASM and the size dropped by 1.5% - i.e. the cost of the extra generics introduced here is proportionally a significant portion of the LINQ overhead in general. It looks like this: Here is the before/after MSTAT file (it's not just this PR, but official build before regression and after regression). Cc @eerhardt |
Superficially the issue seems to be that calling, say, |
Note, too, that the "optimized for size" version doesn't actually exclude all of the select variations. They were part of the original LINQ implementation, and whether on purpose or by accident didn't get ifdef'd out along with all of the other specializations. I expect that 1.5% would increase even more if the Where/Select virtuals on Iterator were put onto the same plan as all of the other abstracts/virtuals that are there to optimize various sequences of operations. Fundamentally, this is a tradeoff between size and throughput. Each of these specializations makes a very real throughput improvement. If we believe the size is more important, we can switch Native AOT over to a plan like that used by browser, omitting all of these throughput optimizations that bring in more code (though I'd prefer if finding a way to do so via a feature switch rather than having multiple builds.
Yes, this is why @eerhardt was replacing use of Select in #98109. But if we're going to write specialized code that is effectively just the simple version of Select in order to avoid pulling in all the various specializations, then we should think about just having Native AOT use the simple version of Select. |
I would expect it also affects startup because it will be more type loads. When we're doing perf changes, we rarely look at perf outside throughput. It's not a problem specific to this PR. I haven't filed this as an issue because I'm not sure we care enough. Making LINQ even bigger will build a stronger case for people to avoid it due to perf. I already recommend people to delete all the references to it if they care about perf and put a https://github.com/dotnet/corert/blob/master/src/Common/src/TypeSystem/Common/LinqPoison.cs in their project to stay clean. |
LINQ has an internal base Iterator class that's used when operators manually implement enumeration rather than having the compiler implement it with an iterator. That base class includes several abstract/virtual methods, including a Select and Where virtual method that have been present since the beginning of LINQ: those are used in a case of A().B(), where B is Select or Where and where A can then improve the processing of B by returning a customized implementation aware of some aspect of both A and B (e.g. the enumerable returned from .Where().Select() includes both the where and select functionality in that single object). Over the years, other specialization has been added to LINQ, in the form of additional internal interfaces: IIListProvider and IPartition. These interfaces similarly enable optimizing sequences A().B(), where B is other LINQ methods, e.g. IIListProvider enables optimizing ToArray/ToList/Count, and IPartition enables optimizing Skip/Take/First/Last/ElementAt. There was a complicated venn diagram of which types implemented which interfaces and base type.
This PR merges IIListProvider/IPartition into the base Iterator class. Everything from IPartition is virtual, enabling derivations to specialize just a subset of the functionality, and deduplicating some implementations that were providing the same implementation instead of having it shared in a base. Code that was type testing for the interfaces now type tests for the base class, which means we can delete some type testing where both the interfaces and the base class were previously being tested for. We no longer have this strange split across multiple optimization-focused internal implementation details, and instead have everything consolidated in the one base class. This also means that all of the calls that were previously interface dispatch are now virtual dispatch.
I renamed some types along the way to clarify what they represent.