Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update .Net source generators to use new SyntaxValueProvider.ForAttributeWithMetadataName api. #70754

Closed
CyrusNajmabadi opened this issue Jun 14, 2022 · 40 comments
Assignees
Labels
area-Meta blocked Issue/PR is blocked on something - see comments blocking-release source-generator Indicates an issue with a source generator feature tenet-performance Performance related issue
Milestone

Comments

@CyrusNajmabadi
Copy link
Member

CyrusNajmabadi commented Jun 14, 2022

Roslyn recently added a new API for greatly speeding up SGs that use the pattern of looking for nodes that have an attribute on it with a particular fully qualified name for the attribute. For example, the json serialization generator looks for this attribute:

private const string JsonSerializableAttributeFullName = "System.Text.Json.Serialization.JsonSerializableAttribute";

Currently, this is extremely expensive as on every edit, the generator must hit every attributed class in every file and semantically check if it has an attribute that binds to this fully qualified name. This may mean thousands of files bound, with expensive semantic models and binding checks performed, all for nodes+attributes completely unrelated to this SG.

The added API provides a new entry point to get the same functionality as previously possible, but with much better performance. Now, the generator says what attributes they care about (passing in teh fully qualified metadata name) and roslyn can smartly examine the source code and not even bother doing any binding or semantics when it would be pointless. Intuitively, you can think of it that when roslyn sees [FooBar] it goes "well, FooBar (and FooBarAttribute) could not ever bind to JsonSerializableAttribute, so it's pointless to even try". This massively reduces the amount of work done (>40x in roslyn itself) especially when the attribute is never, or almost never, used.

Once this api is available in a consumable roslyn package, we should update all the runtime's SGs to use it. That includes the generators for

  1. Regex. Done in Polyfill the incremental generator ForAttributeWithMetadataName from roslyn. #70911. This also added the main polyfill code.
  2. Json. Done in Polyfill the incremental generator ForAttributeWithMetadataName from roslyn (for JsonGenerator). #71653
  3. Logging. Done in Polyfill the incremental generator ForAttributeWithMetadataName from roslyn (for LoggingGenerator). #71651
  4. eventsource. Done in Polyfill the incremental generator ForAttributeWithMetadataName from roslyn (for EventSourceGeneration). #71662
  5. libraryimport. in progress: Polyfill the incremental generator ForAttributeWithMetadataName from roslyn (for LibraryImportGenerator). #71652
@CyrusNajmabadi CyrusNajmabadi added the tenet-performance Performance related issue label Jun 14, 2022
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Jun 14, 2022
@CyrusNajmabadi
Copy link
Member Author

Tagging @stephentoub @ViktorHofer for visibility. I'm happy to contribute the changes on the runtime side for this if that can help out.

@jkoritzinsky
Copy link
Member

We need to make sure that we not only keep compatibility for downlevel generators (so introduce another build variant of the out-of-band packaged generators that uses the new roslyn API version) and ensure that we don't break uplevel repos like aspnetcore by adopting a Roslyn version that is newer than what many of them use (as we turn on in-box generators by default, which causes build failures when the Roslyn version the upstream repo uses is too old).

Generally, we try to keep the version of Roslyn that the "new" generators build against to at newest the version that ships with the SDK that dotnet/arcade is using to ensure we won't cause compile errors upstack.

<!-- Compatibility with VS 17.X/.NET SDK 7.0.1xx -->
<!--
This version is a moving target until we ship.
It should never go ahead of the Roslyn version included in the SDK version in dotnet/arcade's global.json to avoid causing breaks in product construction.
-->
<MicrosoftCodeAnalysisVersion_4_X>4.2.0-2.final</MicrosoftCodeAnalysisVersion_4_X>

For generators that don't ship outside of dotnet/runtime, we can update them as soon as we have a NuGet package (so that includes the EventSource generator and the XUnitWrapperGenerator)

@CyrusNajmabadi
Copy link
Member Author

Note: teh major offenders i'm seeing in traces are Json and Logging. So it woudl be great to be able to update these asap and get these into a VS release that has the new updated API.

@jkoritzinsky
Copy link
Member

Since logging ships out-of-band, we could add another build dimension to it now against the newer Roslyn API version since we can put it into a different path in the NuGet package based on Roslyn version.

JSON ships in-box too, so we can't update the in-box version without breaking upstream repos, but we could update the package version for downlevel consumers immediately if we wanted to.

Basically we can update every generator project that isn't in the following list in some manner now (either by introducing new projects for the out-of-band generators or updating the in-repo ones directly), and then we can update the ones in the list after we know we won't break upstream repos.

<NetCoreAppLibraryGenerator>
LibraryImportGenerator;
System.Text.Json.SourceGeneration.Roslyn4.0;
System.Text.RegularExpressions.Generator;
</NetCoreAppLibraryGenerator>

@danmoseley
Copy link
Member

cc @joperezr who I believe is looking at source generator perf at the moment.

@danmoseley danmoseley added area-Meta source-generator Indicates an issue with a source generator feature labels Jun 15, 2022
@ghost
Copy link

ghost commented Jun 15, 2022

Tagging subscribers to this area: @dotnet/area-meta
See info in area-owners.md if you want to be subscribed.

Issue Details

Roslyn recently added a new API for greatly speeding up SGs that use the pattern of looking for nodes that have an attribute on it with a particular fully qualified name for the attribute. For example, the json serialization generator looks for this attribute:

private const string JsonSerializableAttributeFullName = "System.Text.Json.Serialization.JsonSerializableAttribute";

Currently, this is extremely expensive as on every edit, the generator must hit every attributed class in every file and semantically check if it has an attribute that binds to this fully qualified name. This may mean thousands of files bound, with expensive semantic models and binding checks performed, all for nodes+attributes completely unrelated to this SG.

The added API provides a new entry point to get the same functionality as previously possible, but with much better performance. Now, the generator says what attributes they care about (passing in teh fully qualified metadata name) and roslyn can smartly examine the source code and not even bother doing any binding or semantics when it would be pointless. Intuitively, you can think of it that when roslyn sees [FooBar] it goes "well, FooBar (and FooBarAttribute) could not ever bind to JsonSerializableAttribute, so it's pointless to even try". This massively reduces the amount of work done (>40x in roslyn itself) especially when the attribute is never, or almost never, used.

Once this api is available in a consumable roslyn package, we should update all the runtime's SGs to use it. That includes the generators for

  1. Json
  2. Regex
  3. Logging
  4. libraryimport
  5. eventsource
  6. Client.SourceGenerator (asp.net: https://github.com/dotnet/aspnetcore/tree/3ea008c80d5cc63de7f90ddfd6823b7b006251ff/src/SignalR/clients/csharp/Client.SourceGenerator/src)

@jaredpar has prototyped this for a couple here:
909709a
c22809d

The changes are straightforward and easy to adopt.

Author: CyrusNajmabadi
Assignees: -
Labels:

area-Meta, tenet-performance, untriaged, source-generator

Milestone: -

@CyrusNajmabadi
Copy link
Member Author

CyrusNajmabadi commented Jun 15, 2022

@stephentoub @jkoritzinsky

Basically we can update every generator project that isn't in the following list in some manner now (either by introducing new projects for the out-of-band generators or updating the in-repo ones directly), and then we can update the ones in the list after we know we won't break upstream repos.

There's a potential hybrid option we can take.

  1. use the new roslyn API in all the generators where it is safe to do so immediately.
  2. port most** of the implementation of hte new API over to the runtime repo for the generators that cannot move over immediately. This would allow us to get the perf benefit without having to wait a very long.

So what does most** mean? Well, currently the API depends on some internal functionality in order to support a very unlikely, but useful full correctness case. Specifically, the API will understand if you do the following:

File1:

global alias FooBarAttribute = System.Text.Json.Serialization.JsonSerializableAttribute;

File2:

[FooBar]
class Whatever { }

In other words, it knows that that global alias exists, and as such the json-generator should run on Whatever. If you were willing to say that that was a currently unsupported scenario, we could port over the entirety of the new API except teh part that supports global aliases. Practically speaking, i think all users would still be fine. It would be up to you if this functionality break would be worth the massive speedup we're seeing (over 40x in roslyn itself).

To me, given just how bad the perf problem is here, my preference is that we take the perf gain for all the generators, and we accept a highly unlikely functionality break. Then, when you can finally move all generators over to the new api we'll have teh best of both worlds. High performance, and complete correctness.

Any thoughts from your perspective?

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 15, 2022

Here are my thoughts on this:

  • We should only bump code analysis dependencies if they work with publicly released Visual Studio versions (including the preview channel).
  • For packages, we already include roslyn3.x source generators so that our packages work on older Visual Studio releases. I don't think we want to add another analyzer version as we already have the fall back version. It's impossible to multi-target based on different code analysis dependencies today in a single project, therefore you need to create a dedicated project per roslyn version to be targeted which adds up to the overall build.

Based on that I would propose the following:

  • We should upgrade non-shipping roslyn4.x source generators to leverage the new API when publicly released tooling supports them. My assumption is that this will be VS 17.4 but I could be mistaken EDIT: VS 17.3 Preview 3.
  • We should upgrade shipping roslyn4.x source generators to leverage the new API when dependent repositories upgraded to the newer code analysis version. If I'm not mistaken, the roslyn version that this depends on, will ship with .NET 7 Preview 6. Based on that, such changes would need to target .NET 7 RC1 or RC2.
  • It might make sense to change the roslyn4.0 folder in our packages in which the source generators reside to the actual roslyn version that we will depend on when shipping .NET 7. If I'm not mistaken, that would be roslyn4.3 with this change. Presumably the SDK falls back to use the roslyn3.11 targeting source generators if a Visual Studio version is used that doesn't support roslyn4.3.

cc @eerhardt @jaredpar

@eerhardt
Copy link
Member

It might make sense to change the roslyn4.0 folder in our packages in which the source generators reside to the actual roslyn version that we will depend on when shipping .NET 7. If I'm not mistaken, that would be roslyn4.4 with this change. Presumably the SDK falls back to use the roslyn3.11 targeting source generators if a Visual Studio version is used that doesn't support roslyn4.4.

The issue with this proposal is that projects building with the 6.0.x00 SDK will now see degraded performance when they use the 7.0.0 System.Text.Json nupkg. With the 6.0.x System.Text.Json nupkg, they will use the roslyn4.0 version, which has incremental build support. When the project upgrades to the 7.0.0 nupkg, there is no roslyn4.0 built version, so they will fall back to roslyn3.11 and now all the incremental support we took advantage of in .NET 6 is removed.

It is pretty easy to get into this situation AFAICT. You just need to be on a VS 2022 version that doesn't include .NET 7 - so let's say VS 17.0. You are using the 6.0.0 System.Text.Json nupkg, and NuGet Manager says a new version is out in the "Updates" tab, so you update to STJ 7.0.0.

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 15, 2022

Agreed, that's a problem and needs to be solved, thanks for pointing that out. What makes this problematic is that we currently don't have a lifecycle policy for source generators when they ship inside packages. I just recently had this discussion with @eerhardt.

Visual Studio 2019, which only works with the roslyn3.11 source generators is supported until April 2029. Based on that, we would need to build and package our roslyn3.11 source generators for the next 6-7 years.

Visual Studio 2022 17.0 is supported until July 2023, 17.2 until January 2024 and the support date for 17.3 isn't yet announced. Because of these support dates, we would need to keep our roslyn4.0 targeted source generators around until at least .NET 9.

It's very likely that we will again depend on newer roslyn features going forward and with that build and package even more versions of our source generators. Packages will grow unbounded and the main branch hosts all different versions of the roslyn source generators.

I don't think there's a good solution to mitigate the increasing build complexity, package growth and ever expanding matrix of source generators. One option that we might want to discuss is to couple our package support policy with toolset versions. Just theoretically speaking, we could mark the System.Text.Json 7.0 package to not support Visual Studio 2019 or older SDKs that don't have a specific roslyn version available. I don't suggest that we should to that but wanted mention it.

@CyrusNajmabadi
Copy link
Member Author

We should upgrade non-shipping roslyn4.x source generators to leverage the new API when publicly released tooling supports them. My assumption is that this will be VS 17.4 but I could be mistaken.

This would be a pretty bitter pill @ViktorHofer . That's likely 6 months out. Can you clarify why we could not get this in for the 17.3 line?

@CyrusNajmabadi
Copy link
Member Author

@ViktorHofer @eerhardt any thoughts on: #70754 (comment)

At least with the idea there, we could get performance benefits to users immediately, albeit with a potential (though hopefully unlikely) functionality regression. This would be a stop-gap for the immediate future until you could actually adopt this new API (which has all hte performance benefit, without any functionality regression).

I'm coming from this from a perspective of trying to address the extremely bad performance regressions we're seeing in VS, which we are already getting complaints on from users. Having to wait until 17.4 is something i'd really like to avoid.

@ViktorHofer
Copy link
Member

This would be a pretty bitter pill @ViktorHofer . That's likely 6 months out. Can you clarify why we could not get this in for the 17.3 line?

Your change was merged into main and https://github.com/dotnet/roslyn/commits/release/dev17.4 but not https://github.com/dotnet/roslyn/commits/release/dev17.3. Based on that, I don't think that your change will be part of the roslyn that ships with 17.3. @jaredpar is my assumption correct? Would it be possible and make sense to backport @CyrusNajmabadi's change into dev17.3?

@CyrusNajmabadi
Copy link
Member Author

pulling in @vatsalyaagrawal . I'm def a little confused. I was under the impression that our main branch was 17.3, not 17.4. @vatsalyaagrawal can you clarify?

@vatsalyaagrawal
Copy link

Yes, Roslyn main branch is currently pointing to 17.3.P3.

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 15, 2022

That's great to hear. That would mean that we can update our internal non-shipping source generators to make use of the new API when VS 17.3 P3 publicly releases.

@CyrusNajmabadi
Copy link
Member Author

@ViktorHofer Ok great. I think that's a much more reasonable time frame :)

@stephentoub
Copy link
Member

stephentoub commented Jun 15, 2022

That would mean that we can update our internal non-shipping source generators to make use of the new API when VS 17.3 P3 publicly releases.

Ok great. I think that's a much more reasonable time frame :)

To make sure there's no disconnect, @CyrusNajmabadi, I believe Viktor isn't talking about the two you actually care about today (json and logging). "internal non-shipping source generators" sounds to me like our generators that only support the building of runtime. @ViktorHofer, can you clarify?

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 15, 2022

Correct. Let me update my above proposal:

  1. We should upgrade non-shipping roslyn4.x source generators to leverage the new API when VS 17.3 Preview 3 publicly releases.
  2. We should update our inbox shipping roslyn4.x source generators to leverage the new API when dependent repositories upgraded to the newer code analysis version. If I'm not mistaken, the roslyn version that this depends on, will ship with .NET 7 Preview 6. Based on that, such changes would need to target .NET 7 Preview 7.
  3. We should add roslyn4.3 targeting source generators to our out-of-band shipping packages to leverage the new API. These would co-exist with the already shipping roslyn3.11 and roslyn4.0 source generators in the package. This can be done in combination with the non-shipping internal source generators right when VS 17.3 Preview 3 publicly releases.
  4. It might make sense to change the roslyn4.0 folder in our packages in which the source generators reside to the actual roslyn version that we will depend on when shipping .NET 7. If I'm not mistaken, that would be roslyn4.3 with this change. Presumably the SDK falls back to use the roslyn3.11 targeting source generators if a Visual Studio version is used that doesn't support roslyn4.3. @eerhardt pointed out above that this would regress the performance for customers that use VS 2022 < 17.3. .NET 8 is probably a better target to remove the roslyn4.0 source generators from our packages.

@CyrusNajmabadi
Copy link
Member Author

@ViktorHofer Can you clarify: in what VS release woudl customers get the important performance fixes for all .net generators? Right now, the perf pain is extremely significant, so we really need a plan/solution that can address this in the near term.

Thanks!

@stephentoub
Copy link
Member

stephentoub commented Jun 15, 2022

I didn't see an answer to @CyrusNajmabadi proposal. As I understand it, he's proposing adding code for a polyfill into runtime. Generators built against the latest sdk get all the benefits; generators built against older sdks get most of the benefits. If that's an accurate summary, that sounds good to me both from a perspective of helping all customers soon and from a perspective of keeping the diffs between our builds minimal. Is this feasible?

@CyrusNajmabadi
Copy link
Member Author

@stephentoub That is correct with the caveat that a currently supported case may temporarily not be supported. Specifically, the case where a user defines a global alias to the attribute the generator needs, and then references that aliases in the attributes they have on their types.

@ViktorHofer
Copy link
Member

@ViktorHofer Can you clarify: in what VS release woudl customers get the important performance fixes for all .net generators? Right now, the perf pain is extremely significant, so we really need a plan/solution that can address this in the near term.

If we follow my above proposal, that would be .NET 7 Preview 7.

I didn't see an answer to @CyrusNajmabadi proposal. As I understand it, he's proposing adding code for a polyfill into runtime. Generators built against the latest sdk get all the benefits; generators built against older sdks get most of the benefits. If that's an accurate summary, that sounds good to me both from a perspective of helping all customers soon and from a perspective of keeping the diffs between our builds minimal. Is this feasible?

If I understand @CyrusNajmabadi's proposal correctly, this would just be temporary until all source generators can target the new API and we would then remove the polyfill. Is that right? If so, would we able to do so in time for Preview 6 which I believe closes next week?

@stephentoub
Copy link
Member

If we follow my above proposal, that would be .NET 7 Preview 7.

For all customers? Above you wrote "These would co-exist with the already shipping roslyn3.11 and roslyn4.0 source generators in the package"...who would still be using those?

@CyrusNajmabadi
Copy link
Member Author

If we follow my above proposal, that would be .NET 7 Preview 7.

Sorry, i'm asking which VS release, not which .net release. My strong preference is that we have some perf fix in place for 17.3.

@stephentoub
Copy link
Member

Anyone using older versions of Visual Studio but the very latest packages:

What about .NET 6 apps targeting the .NET 6 SDK?

@CyrusNajmabadi
Copy link
Member Author

CyrusNajmabadi commented Jun 16, 2022

As an example: https://devdiv.visualstudio.com/DevDiv/_workitems/edit/1555650 is one of many customer reported cases of this. In this case basic IDE features are being delayed by 10s of seconds, and the servicehub process is churning through CPU and tons of ram. In this case 30+GB just from the jsongenerator and logging generator.

Note that these are features which we generally want to run in 100s of ms worst case. So we're several orders of magnitude impacted here. This also other OOP semantic operations (like diagnostics and whatnot).

@jkoritzinsky
Copy link
Member

Blocked for adopting the official API on dotnet/roslyn#63318

@jkoritzinsky jkoritzinsky added the blocked Issue/PR is blocked on something - see comments label Aug 10, 2022
@jeffhandley jeffhandley modified the milestones: 7.0.0, Future Aug 11, 2022
@joperezr
Copy link
Member

We do still intend to make this change for 7.0, even though it will likely miss RC1 given that we are currently blocked on that Roslyn bug getting fixed. Once it is, I will go ahead and make this change. Re-adjusting the milestone appropriately.

@joperezr joperezr modified the milestones: Future, 7.0.0 Aug 11, 2022
@danmoseley
Copy link
Member

Roslyn bug was fixed over a week ago. Are we still waiting on the flow?

@jkoritzinsky
Copy link
Member

If we update the generators to use the new Roslyn here, we need to update all consuming repositories as well.

I think the best solution will be to update the SDK in Arcade to use an SDK with the fix so we don’t need to manually upgrade the Roslyn version used in each repo.

@danmoseley
Copy link
Member

I think the best solution will be to update the SDK in Arcade to use an SDK with the fix so we don’t need to manually upgrade the Roslyn version used in each repo.

Who owns doing that -- is that you @jkoritzinsky , @joperezr or someone else?

@joperezr
Copy link
Member

I'm planning to take care of it this week.

@ghost ghost added the in-pr There is an active PR which will close this issue when it is merged label Aug 30, 2022
@CyrusNajmabadi
Copy link
Member Author

@joperezr Anything left to do with this item? Or can we close it out?

@joperezr
Copy link
Member

joperezr commented Sep 6, 2022

PR isn't merged yet 😄. PR should be ready to go but I'm still waiting on making sure that all upstack repos have ingested an arcade update that will bump their compiler and SDK versions so that they'll be able to ingest the new source generators.

@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Sep 10, 2022
@CyrusNajmabadi
Copy link
Member Author

@joperezr Anything left to do with this item? Or can we close it out?

@joperezr
Copy link
Member

joperezr commented Oct 6, 2022

This can be closed now.

@joperezr joperezr closed this as completed Oct 6, 2022
@CyrusNajmabadi
Copy link
Member Author

Thanks!

@ghost ghost locked as resolved and limited conversation to collaborators Nov 6, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-Meta blocked Issue/PR is blocked on something - see comments blocking-release source-generator Indicates an issue with a source generator feature tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

10 participants