Add source generators working doc
Showing 1 changed file with 291 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,291 @@ | ||
# Source Generators | ||
|
||
|
||
## Summary | ||
|
||
Source generators aim to enable _compile time metaprogramming_, that is, code that can be created | ||
at compile time and added to the compilation. Source generators will be able to read the contents of | ||
the compilation before running, as well as access any _additional files_, enabling generators to | ||
introspect both user C# code and generator specific files. | ||
|
||
> **Note**: This proposal is separate from the [previous generator design](generators.md) | ||
### High Level Design Goals | ||
|
||
- Generators produce one or more strings that represent C# source code to be added to the compilation. | ||
- Explicitly _additive_ only. Generators can add new code to a compilation but may **not** modify existing user code. | ||
- May access _additional files_, that is, non-C# source texts. | ||
- Run _unordered_: each generator sees the same input compilation, with no access to files created by other source generators. | ||
- A user specifies the generators to run via a list of assemblies, much like analyzers. | ||
|
||
## Implementation | ||
|
||
At the simplest level, source generators are an implementation of `Microsoft.CodeAnalysis.ISourceGenerator`: | ||
|
||
```csharp | ||
namespace Microsoft.CodeAnalysis | ||
{ | ||
public interface ISourceGenerator | ||
{ | ||
void Execute(SourceGeneratorContext context); | ||
} | ||
} | ||
``` | ||
|
||
Generator implementations are defined in external assemblies passed to the compiler | ||
using the same `-analyzer:` option used for diagnostic analyzers. | ||
|
||
An assembly can contain a mix of diagnostic analyzers and source generators. | ||
Since generators are loaded from external assemblies, a generator cannot be used to build | ||
the assembly in which it is defined. | ||
|
||
`ISourceGenerator` has a single `Execute` method that is called by the host (either the IDE | ||
or the command-line compiler). `Execute` is passed an instance of `SourceGeneratorContext` that provides access | ||
to the `Compilation` and allows the generator to alter it by adding source and reporting diagnostics. | ||
|
||
```csharp | ||
namespace Microsoft.CodeAnalysis | ||
{ | ||
public struct SourceGeneratorContext | ||
{ | ||
public Compilation Compilation { get; } | ||
|
||
// TODO: replace AnalyzerOptions with a differently named type that is otherwise identical. | ||
// The concern being that something added to one isn't necessarily applicable to the other. | ||
public AnalyzerOptions AnalyzerOptions { get; } | ||
|
||
public CancellationToken CancellationToken { get; } | ||
|
||
// TODO: we need to add a way of declaring diagnostic descriptors, presumably the same mechanism used by analyzers | ||
public void ReportDiagnostic(Diagnostic diagnostic) { throw new NotImplementedException(); } | ||
|
||
public void AddSource(string fileNameHint, SourceText sourceText) { throw new NotImplementedException(); } | ||
} | ||
} | ||
``` | ||
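
As a rough illustration of the shape above, here is a minimal sketch of a generator that adds a single class. Because the proposed API is not implemented yet, the sketch uses stand-in types (`FakeContext`, `IFakeSourceGenerator`) that are assumptions made for this example, and it passes plain strings where the real `AddSource` takes a `SourceText`. `HelloWorldGenerator` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

namespace GeneratorSketch
{
    // Stand-in for the proposed SourceGeneratorContext (an assumption for this sketch).
    // It only records what was added; the proposed type also exposes Compilation,
    // AnalyzerOptions and CancellationToken.
    public sealed class FakeContext
    {
        private readonly List<(string Hint, string Text)> _sources =
            new List<(string Hint, string Text)>();

        public IReadOnlyList<(string Hint, string Text)> Sources => _sources;

        // Mirrors AddSource(fileNameHint, sourceText), with string instead of SourceText.
        public void AddSource(string fileNameHint, string sourceText)
        {
            if (fileNameHint is null) throw new ArgumentNullException(nameof(fileNameHint));
            if (sourceText is null) throw new ArgumentNullException(nameof(sourceText));
            _sources.Add((fileNameHint, sourceText));
        }
    }

    // Stand-in for ISourceGenerator that takes the fake context above.
    public interface IFakeSourceGenerator
    {
        void Execute(FakeContext context);
    }

    // Hypothetical generator: purely additive, it contributes one new class.
    public sealed class HelloWorldGenerator : IFakeSourceGenerator
    {
        public void Execute(FakeContext context)
        {
            const string source = @"namespace Generated
{
    public static class HelloWorld
    {
        public static string Greeting => ""Hello from a generator"";
    }
}";
            context.AddSource("HelloWorld", source);
        }
    }
}
```

A host would create the context, call `Execute` on each generator, and then add every recorded source text to the compilation.
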
It is assumed that some generators will want to generate more than one `SourceText`, for example in a 1:1 mapping | ||
for additional files. The `fileNameHint` parameter of `AddSource` is intended to address this: | ||
|
||
1. If the generated files are emitted to disk, it is useful to be able to include some distinguishing text in their names. | ||
For example, if you have two `.resx` files named "Strings" and "Icons", generating files simply named | ||
`ResxGeneratedFile1.cs` and `ResxGeneratedFile2.cs` wouldn't be terribly useful; you'd want something like | ||
`ResxGeneratedFile-Strings.cs` and `ResxGeneratedFile-Icons.cs` instead. | ||
|
||
2. The IDE needs some concept of a "stable" identifier. Source generators create a couple of fun problems for the IDE: | ||
users will want to be able to set breakpoints in a generated file, for example. If a source generator outputs multiple | ||
files, we need to know which is which so that we know which file each breakpoint goes with. A source generator is of course | ||
allowed to stop emitting a file if its inputs change (if you delete a `.resx`, then the generated file associated with it | ||
will also go away), but a stable hint lets us keep track of the files that remain. | ||
|
||
This is called a "hint" because the compiler is implicitly allowed to control the final filename however it ultimately | ||
needs to, and if two source generators give the same "hint" it can still distinguish them with any sort of | ||
prefix/suffix as necessary. | ||
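
One way a host could honor hints while keeping names unique is sketched below. This is only an illustration under assumptions: `HintNameAllocator`, the numeric `-2`, `-3` suffix scheme, and the case-insensitive comparison are all invented for this example; the design above leaves the actual scheme to the compiler.

```csharp
using System;
using System.Collections.Generic;

namespace GeneratorSketch
{
    // Hypothetical host-side helper that turns AddSource hints into unique file names.
    public sealed class HintNameAllocator
    {
        // Names already handed out. Compared case-insensitively because the files
        // may end up on a case-insensitive file system.
        private readonly HashSet<string> _used =
            new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        public string Allocate(string fileNameHint)
        {
            if (fileNameHint is null) throw new ArgumentNullException(nameof(fileNameHint));

            // The first use of a hint keeps it unchanged.
            string candidate = fileNameHint + ".cs";
            int suffix = 1;

            // On a collision, append "-2", "-3", ... until the name is free.
            while (!_used.Add(candidate))
            {
                suffix++;
                candidate = fileNameHint + "-" + suffix + ".cs";
            }

            return candidate;
        }
    }
}
```

With this sketch, two calls with the hint `Strings` yield `Strings.cs` and `Strings-2.cs`, so the hint stays recognizable in both names.
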
|
||
### IDE Integration | ||
|
||
One of the more complicated aspects of supporting generators is enabling a high-fidelity | ||
experience in Visual Studio. For the purposes of determining code correctness, it is | ||
expected that all generators will have had to be run. Obviously, it is impractical to run | ||
every generator on every keystroke, and still maintain an acceptable level of performance | ||
within the IDE. | ||
|
||
#### Progressive complexity opt-in | ||
|
||
It is expected instead that source generators would work on an 'opt-in' approach to IDE | ||
enablement. | ||
|
||
By default, a generator implementing only `ISourceGenerator` would see no IDE integration | ||
and only be correct at build time. Based on conversations with 1st party customers, | ||
there are several cases where this would be enough. | ||
|
||
However, for scenarios such as code first gRPC, and in particular Razor and Blazor, | ||
the IDE will need to be able to generate code on-the-fly as those file types are | ||
edited and reflect the changes back to other files in the IDE in near real-time. | ||
|
||
The proposal is to have a set of advanced interfaces that can be optionally implemented, | ||
that would allow the IDE to query the generator to decide what needs to be run in the case | ||
of any particular edit. | ||
|
||
For example, an extension that would cause generation to run after saving a third party | ||
file might look something like: | ||
|
||
```csharp | ||
namespace Microsoft.CodeAnalysis | ||
{ | ||
public interface ITriggeredByAdditionalFileSavedGenerator : ISourceGenerator | ||
{ | ||
ImmutableArray<string> SupportedAdditionalFileExtensions { get; } | ||
} | ||
} | ||
``` | ||
It is expected that there will be various levels of opt-in that can be added to a generator | ||
in order to achieve the specific level of performance required of it. | ||
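
To make the opt-in idea concrete, the sketch below shows how an IDE host might decide which generators to re-run when an additional file is saved. Everything here is an assumption for illustration: the stand-in interfaces mirror the example interface above but are not the proposed API, and `SaveTrigger`, `BuildOnlyGenerator` and `ResxGenerator` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.IO;
using System.Linq;

namespace GeneratorSketch
{
    // Stand-in for ISourceGenerator (no members are needed for this sketch).
    public interface ISketchGenerator { }

    // Stand-in for the opt-in interface shown above.
    public interface ISketchTriggeredByAdditionalFileSaved : ISketchGenerator
    {
        ImmutableArray<string> SupportedAdditionalFileExtensions { get; }
    }

    // A generator that has not opted in: it only runs at build time.
    public sealed class BuildOnlyGenerator : ISketchGenerator { }

    // A generator that asks to be re-run whenever a .resx file is saved.
    public sealed class ResxGenerator : ISketchTriggeredByAdditionalFileSaved
    {
        public ImmutableArray<string> SupportedAdditionalFileExtensions { get; } =
            ImmutableArray.Create(".resx");
    }

    public static class SaveTrigger
    {
        // Returns only the generators that opted in and support the saved file's extension.
        public static IReadOnlyList<ISketchGenerator> GeneratorsToRun(
            IEnumerable<ISketchGenerator> generators, string savedFilePath)
        {
            string extension = Path.GetExtension(savedFilePath);

            return generators
                .OfType<ISketchTriggeredByAdditionalFileSaved>()
                .Where(g => g.SupportedAdditionalFileExtensions.Any(
                    e => string.Equals(e, extension, StringComparison.OrdinalIgnoreCase)))
                .Cast<ISketchGenerator>()
                .ToList();
        }
    }
}
```

Generators that implement only the base interface are never selected, which matches the "build time only by default" behavior described above.
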
|
||
What these exact APIs will look like remains an open question, and it's expected that we will | ||
need to prototype some real-world generators before knowing what their precise shape will be. | ||
|
||
### Output files | ||
|
||
It is desirable that the generated source texts be available for inspection after generation, | ||
either as part of creating a generator or seeing what code was generated by a third party | ||
generator. | ||
|
||
By default, generated texts will be persisted to a `GeneratedFiles/{GeneratorAssemblyName}` | ||
sub-folder within `CommandLineArguments.OutputDirectory`. The `fileNameHint` from | ||
`SourceGeneratorContext.AddSource` will be used to create a unique name, with appropriate | ||
collision renaming applied if required. For instance, on Windows a call to | ||
`AddSource("MyCode", ...);` from `MyGenerator.dll` for a C# project might be | ||
persisted as `obj/debug/GeneratedFiles/MyGenerator.dll/MyCode.cs`. | ||
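
The default location described above can be sketched as a simple path computation. `GeneratedFilePaths` is a hypothetical helper, and the `.cs` extension is assumed because the example is a C# project; collision renaming is left out here.

```csharp
using System.IO;

namespace GeneratorSketch
{
    // Hypothetical helper that composes the default output path for a generated file.
    public static class GeneratedFilePaths
    {
        public static string GetPath(
            string outputDirectory, string generatorAssemblyName, string fileNameHint)
        {
            // {OutputDirectory}/GeneratedFiles/{GeneratorAssemblyName}/{hint}.cs
            return Path.Combine(
                outputDirectory,
                "GeneratedFiles",
                generatorAssemblyName,
                fileNameHint + ".cs");
        }
    }
}
```

Calling `GetPath("obj/debug", "MyGenerator.dll", "MyCode")` produces the path from the example above, with the directory separators of the current platform.
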
|
||
File output is not required for the correct function of either command line or IDE based | ||
generation, and can be completely disabled, if required. The IDE will work on in-memory | ||
copies of the generated source texts (for 'Find all references', breakpoints etc.) and | ||
periodically flush any changes to disk. | ||
|
||
To support the use case where a user wishes to generate the source text, then commit | ||
the generated files to source control, we will allow changing the location of the | ||
generated files via an appropriate command line switch, and matching MSBuild property | ||
(naming still to be determined). | ||
|
||
In these cases it will be up to the user if they wish to generate over the files again | ||
in the future (in which case they would still be generated, but output to a | ||
source controlled location), or remove the generators and perform the action as a one | ||
time step. | ||
|
||
It is currently an open question how, for example, setting a breakpoint in | ||
a disk-based generated file will function. | ||
|
||
TK: how do we save PDBs/Source link etc? | ||
|
||
### Editing experiences for third party languages | ||
|
||
One of the interesting scenarios that source generators will enable is essentially | ||
the 'embedding' of C# within other languages (and vice versa). This is how Razor | ||
works today, and the Razor team maintains a significant language service investment | ||
in Visual Studio to enable it. | ||
|
||
A possible goal of this project would be to find a generic way to represent this: | ||
that would allow the Razor team to reduce their tooling investment, while allowing | ||
third parties the opportunity to enable the same sort of experiences | ||
(including 'Go to definition', 'Find all references' etc.) relatively cheaply. | ||
|
||
The current thinking is to have some form of 'side-channel' available to | ||
the generator. As the generator emits source text, it would indicate where | ||
in the original document this was generated from. This would allow the | ||
compiler API to track e.g. a generated `Symbol` as having an `OriginalDefinition` | ||
that represents a span of third party source text (such as a Razor tag in a | ||
`.cshtml` file). | ||
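
A minimal sketch of what such a side-channel could record is shown below. The types `TextRange` and `OriginMap` are invented for this illustration and are not part of the proposal; the idea is only that each generated span is paired with the span of third party source text it came from, so that tooling can navigate back.

```csharp
using System.Collections.Generic;

namespace GeneratorSketch
{
    // A span of text in a file, expressed as a start offset and a length.
    public readonly struct TextRange
    {
        public TextRange(string filePath, int start, int length)
        {
            FilePath = filePath;
            Start = start;
            Length = length;
        }

        public string FilePath { get; }
        public int Start { get; }
        public int Length { get; }

        // True when the position falls inside [Start, Start + Length).
        public bool Contains(int position) =>
            position >= Start && position < Start + Length;
    }

    // Hypothetical side-channel: maps spans of generated text back to their origin.
    public sealed class OriginMap
    {
        private readonly List<(TextRange Generated, TextRange Original)> _entries =
            new List<(TextRange Generated, TextRange Original)>();

        // Called by the generator as it emits each piece of source text.
        public void Record(TextRange generated, TextRange original) =>
            _entries.Add((generated, original));

        // Called by tooling, for example to implement 'Go to definition'.
        public bool TryGetOrigin(string generatedFilePath, int position, out TextRange original)
        {
            foreach (var entry in _entries)
            {
                if (entry.Generated.FilePath == generatedFilePath &&
                    entry.Generated.Contains(position))
                {
                    original = entry.Original;
                    return true;
                }
            }

            original = default;
            return false;
        }
    }
}
```

Because the mapping lives outside the generated text, it needs no special grammar in the emitted C#.
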
|
||
We discussed embedding this directly in the source text via `#pragma` but | ||
this would require language changes and limit the feature to a specific version | ||
of C#. Other considerations could be specially formed comments or `#if FALSE --` | ||
blocks. In general a 'side-channel' approach seems preferable to specially crafted | ||
grammar in the generated text. | ||
|
||
This is not necessarily a goal required for the success of Source Generators; | ||
Razor’s language service can be updated to work with source generators | ||
if a generic approach proves to be infeasible, but it is certainly something we want to consider | ||
as part of the work. | ||
|
||
### MSBuild Integration | ||
|
||
It is expected that generators will need some form of configuration system, and we intend to allow | ||
certain properties to flow through from MSBuild to facilitate this. | ||
|
||
> **Note**: This is still under design and open to change. | ||
|
||
### Performance targets | ||
|
||
Ultimately, the performance of the feature is going to be somewhat dependent on the performance of the | ||
generators written by customers. Progressive opt-in, and being build-time only by default, will allow the IDE | ||
to mitigate many of the potential performance problems posed by third party generators. However, there | ||
is still a risk that third-party generators will cause unacceptable performance problems for the IDE, | ||
and the design of the feature will need to keep this in mind. | ||
|
||
For 1st party generators, especially Razor and Blazor, we aim at a minimum to match the existing | ||
performance seen by users today. It is expected that even naïve generator-based implementations | ||
will perform significantly faster than the existing tooling, due to less communication overhead | ||
and duplicated work, but improving the speed of these experiences is not a primary goal of this project. | ||
|
||
### Language Changes | ||
|
||
This design does not currently propose altering the language; it is purely a compiler feature. | ||
The previous design for source generators introduced the `replace` and `original` keywords. | ||
This proposal removes these, as the source generated is purely additional and so there is no | ||
need for them. We expect that most scenarios are possible with the existing use of `partial` | ||
definitions; as a V1 we expect to ship in this state. If concrete scenarios are later shown | ||
that can’t be achieved with the V1 approach we would consider allowing modification as a V2. | ||
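
As an illustration of the additive model with `partial`, the sketch below shows a user-written type and a second declaration of the kind a generator might add. Both halves appear in one listing for brevity; in practice the second would arrive through `AddSource`. The `Customer` type and its members are hypothetical.

```csharp
namespace PartialSketch
{
    // User-written half: marked partial so that generated code can extend it.
    public partial class Customer
    {
        public string FirstName { get; set; } = "";
        public string LastName { get; set; } = "";
    }

    // Generator-written half (hypothetical output): it only adds members
    // and never edits the user's declaration.
    public partial class Customer
    {
        public override string ToString() => FirstName + " " + LastName;
    }
}
```

The compiler merges the two declarations into one type, so no `replace` or `original` keyword is needed for this scenario.
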
|
||
## Use cases | ||
|
||
We've identified several first and third party candidates that would benefit from source generators: | ||
|
||
- ASP.NET: Improve startup time | ||
- Blazor and Razor: Massively reduce tooling burden | ||
- Azure Functions: regex compilation during startup | ||
- Azure SDK | ||
- [gRPC](https://docs.microsoft.com/en-us/aspnet/core/grpc/?view=aspnetcore-3.1) | ||
- Resx file generation | ||
- [System.CommandLine](https://github.com/dotnet/command-line-api) | ||
- Serializers | ||
- [SWIG](http://www.swig.org/) | ||
|
||
|
||
## Discussion / Open Issues / TODOs: | ||
|
||
**Interface vs Class for ISourceGenerator**: | ||
|
||
We discussed whether this should be an interface or a class. Analyzers chose to have an abstract base class, | ||
but we weren't sure what we'd end up needing, since ultimately we only have one method on this. | ||
Keeping it an interface was also more natural, since we have other interfaces that | ||
extend this interface for optional light-up. | ||
|
||
**IDependsOnCompilationGenerator**: | ||
|
||
We did discuss whether there should be an IDependsOnCompilationGenerator to formally state | ||
that you actually use a compilation. After all, if you don't use the compilation | ||
then reasoning about your performance in the IDE is greatly simplified. However, every | ||
scenario we've had for reading additional files has also needed the compilation, | ||
so we simply weren't sure what that was going to bring. | ||
|
||
**Breakpoints in generated files**: | ||
|
||
Do we map this back to the in-memory file? | ||
|
||
**Should generators be push or pull**: | ||
|
||
Source generators are pull-based, analyzers are push-based (registration based). Should | ||
we use a push-based model for generators as well? | ||
|
||
- If we go down the push-based model, walking the tree should make sure to continue | ||
to produce events for as many nodes as possible, even with errors, as generators | ||
will often work in the presence of errors | ||
|
||
- The events that we use today for analyzers may require much more work to produce, | ||
since we expect analyzers to run during full compilation, while generators may | ||
not want to even construct the symbol table | ||
|
||
- The progressive-performance-opt-in model may work better in a push-based model, | ||
since you would only register for the things you care about | ||
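
For comparison with the pull-based `Execute` method, a push-based (registration-based) shape might look roughly like the sketch below. This is purely hypothetical: `RegistrationContext`, `IPushGenerator`, `RegisterForAdditionalFileChanged` and `LoggingGenerator` are names invented here to illustrate the idea of registering only for the events a generator cares about.

```csharp
using System;
using System.Collections.Generic;

namespace GeneratorSketch
{
    // Hypothetical registration surface offered by the host.
    public sealed class RegistrationContext
    {
        private readonly List<Action<string>> _additionalFileHandlers =
            new List<Action<string>>();

        // A generator subscribes only to the events it needs.
        public void RegisterForAdditionalFileChanged(Action<string> handler)
        {
            if (handler is null) throw new ArgumentNullException(nameof(handler));
            _additionalFileHandlers.Add(handler);
        }

        // Called by the host: pushes the event to every subscriber.
        public void RaiseAdditionalFileChanged(string path)
        {
            foreach (Action<string> handler in _additionalFileHandlers)
            {
                handler(path);
            }
        }
    }

    // Hypothetical push-based counterpart to ISourceGenerator.
    public interface IPushGenerator
    {
        void Initialize(RegistrationContext context);
    }

    // Example generator that just records which files it was told about.
    public sealed class LoggingGenerator : IPushGenerator
    {
        public List<string> SeenFiles { get; } = new List<string>();

        public void Initialize(RegistrationContext context) =>
            context.RegisterForAdditionalFileChanged(path => SeenFiles.Add(path));
    }
}
```

In this shape a generator that registers nothing costs the host nothing at edit time, which is one reason the progressive opt-in model may fit a push-based design.
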
|
||
**Should we share more with the analyzer type hierarchy?**: | ||
|
||
We would still need to differentiate analyzers from generators, since | ||
their diagnostics would be produced at different times (generator diagnostics only on | ||
the first compilation, analyzer diagnostics only on the second compilation). | ||
|
||
**Can we predict how often some of our sample customers (Razor?) will have to run the generators?**: | ||
|
||
They can't predict that right now, and the incorporation of timers into their | ||
current generation makes it very difficult to predict the consequences of | ||
only event-based generation | ||
|
||
**Do we have a priority list of the most important customers?**: | ||
|
||
No. We should work out a priority order so that we can prioritize features. | ||
|
||
**Security Review**: | ||
|
||
Do generators create any new security risks not already posed via analyzers and NuGet?