Features/source generators #40162 (Merged)
291 changes: 291 additions & 0 deletions docs/features/source-generators.md
# Source Generators


Comment (Member):

❔ Do we have a way to indicate at the start that this is just a proposal, and not actually a feature that is already implemented?

Comment (Member):

It's in the feature branch, I think that implies it?

Comment (Member):

When @sharwell initially made this comment the PR was targeting master. @chsienki then changed it to target the feature branch, partially in response to this. That's why the comment seems confusing.

Comment (Member):

At this point it should be clear this isn't a shipped feature, given that the entirety of the documentation is in a feature branch 😄

## Summary

Source generators aim to enable _compile time metaprogramming_, that is, code that can be created
at compile time and added to the compilation. Source generators will be able to read the contents of
the compilation before running, as well as access any _additional files_, enabling generators to
introspect both user C# code and generator specific files.

> **Note**: This proposal is separate from the [previous generator design](generators.md)
Comment (Member):

💡 It would be good to also update the other page to mention its relation to this one.


### High Level Design Goals

- Generators produce one or more strings that represent C# source code to be added to the compilation.
- Explicitly _additive_ only. Generators can add new code to a compilation but may **not** modify existing user code.
- May access _additional files_, that is, non-C# source texts.
- Run _un-ordered_; each generator will see the same input compilation, with no access to files created by other source generators.
- A user specifies the generators to run via a list of assemblies, much like analyzers.

## Implementation

At the simplest level, source generators are an implementation of `Microsoft.CodeAnalysis.ISourceGenerator`:

```csharp
namespace Microsoft.CodeAnalysis
{
    public interface ISourceGenerator
    {
        void Execute(SourceGeneratorContext context);
Comment (Member):

synchronous? non-cancellable? how does that design impact compilations?

Comment (@sharwell, Member, Dec 5, 2019):

> synchronous?

This code doesn't read from disk (files were already read) or write to disk (the compiler writes content to disk if necessary), so it's synchronous in the same way analyzers are synchronous.

> non-cancellable?

`context.CancellationToken`

Comment:

Generators very frequently need to read from the disk, particularly if their input is a mix of the semantic model and external files. Allowing for async may help performance?

Comment (Member):

@jeromelaban

Generators that read directly from disk are an anti-pattern that must be avoided. To properly fit into our build model, all inputs to the compiler, and by consequence to generators, must be visible to MSBuild. If they're not, then changes to those files will not result in compilation being re-done; rather, MSBuild will consider the compilation up to date and hence skip it entirely.

Source generators that want to read non-source files must do so through the `/additionalFiles` option. This is the same approach that we take for analyzers.

Comment:

@jaredpar Thanks! I do not understand the rationale behind the anti-pattern. Reading files from the disk and performing up-to-date checks are two separate tasks. Could you explain a bit more about it?

The additional files option is also very limiting in its current form. While it provides access to the files' content, it removes access to metadata provided by MSBuild (most notably Link, or others like the NuGet source), which can be very useful in some situations. AdditionalText is also a synchronous API, which may force generators to spawn tasks to read content when there are many files (think many hundreds) and do parallel work when needed.

Reading through the additional files doc, it does have the ability to tie a specific set of input files to a specific generator, which will force rerunning all generators. For a first version it may be good enough, or it could be done through MSBuild metadata.

One piece of information that could prove very useful for additional files is the diff with a previous run on the same compilation instance. Generators could use it to partially generate source or content, or to reuse previously generated content. Diffing of compilations may not be useful at all, as determining what's changed could be challenging.

Comment (Member):

> One piece of information that could prove very useful for additional files is the diff with a previous run on the same compilation instance.

That would require state tracking, which could easily lead to broken behavior.

> AdditionalText is also a synchronous API, which may force generators to spawn tasks to read content when there are many files (think many hundreds) and do parallel work when needed.

This is precisely the sort of perf cliff that terrifies me. VS performance has already nose-dived in recent releases for me, and we basically had to stop using analyzers over here because of the massive cost they added to a build. This just seems to be exacerbating the issue.

Comment (Member):

> Thanks! I do not understand the rationale behind the anti-pattern. Reading files from the disk and performing up-to-date checks are two separate tasks. Could you explain a bit more about it?

If you need to read in the contents of a file, then you need to specify that file as a dependency. Otherwise you simply will not be re-run when that file changes. Adding that file as a dependency can be done with .AdditionalFiles. This informs the entire system about the dependency so that it can properly be used when computing what actually needs to run/update during a build.

Comment (@jeromelaban, Dec 10, 2019):

> This is precisely the sort of perf cliff that terrifies me. VS performance has already nose-dived in recent releases for me, and we basically had to stop using analyzers over here because of the massive cost they added to a build. This just seems to be exacerbating the issue.

Performance is definitely a concern, and having generators run at every keystroke is a scenario that most likely won't scale. For instance, it is common for some complex generators to take 15+ seconds to run, though part of that is just creating the initial Compilation that Roslyn already has.

Running generators at build time, then refreshing IntelliSense afterwards, is a scenario that can be reasoned about pretty easily (this is what Uno.SourceGeneration does at this point), once explained.

Comment (Member):

> Reading files from the disk and performing up-to-date checks are two separate tasks. Could you explain a bit more about it?

The way MSBuild evaluates up-to-date checks is essentially to analyze the inputs to Targets, check to see if they're unchanged and, if so, not run the target. That means that for the compiler to properly participate in up-to-date checks, all of the files used by the compiler must be listed in the Targets. This applies to both the core compiler and any analyzer or source generator plugin.

The `/additionalFiles` parameter is how we allow analyzers, and now source generators, to participate in this process.

Further, at an API level the compiler must be able to represent an immutable snapshot of a compilation. This is a core part of our model, and that includes removing the state of the file system from the equation. Hence all file inputs to the compiler end up getting represented as `SourceText` instances, including the input to `/additionalFiles`.

If source generators, or analyzers, directly read from disk they're violating at least the immutable-snapshot property of the model and/or the MSBuild targets model.

    }
}
```

Generator implementations are defined in external assemblies passed to the compiler
using the same `-analyzer:` option used for diagnostic analyzers.

An assembly can contain a mix of diagnostic analyzers and source generators.
Since generators are loaded from external assemblies, a generator cannot be used to build
the assembly in which it is defined.

`ISourceGenerator` has a single `Execute` method that is called by the host (either the IDE
or the command-line compiler). `Execute` is passed an instance of `SourceGeneratorContext` that provides access
to the `Compilation` and allows the generator to alter it by adding source and reporting diagnostics.

```csharp
namespace Microsoft.CodeAnalysis
{
    public struct SourceGeneratorContext
Comment (@sharwell, Member, Dec 5, 2019):

💡 This section can be outdented one level since it uses a code fence (```) #Resolved

    {
        public Compilation Compilation { get; }

        // TODO: replace AnalyzerOptions with a differently named type that is otherwise identical.
        // The concern being that something added to one isn't necessarily applicable to the other.
        public AnalyzerOptions AnalyzerOptions { get; }

        public CancellationToken CancellationToken { get; }

        // TODO: we need to add a way of declaring diagnostic descriptors, presumably the same mechanism used by analyzers
        public void ReportDiagnostic(Diagnostic diagnostic) { throw new NotImplementedException(); }
Comment (@sharwell, Member, Dec 5, 2019):

❔ Do we need to be able to declare diagnostic descriptors? #Resolved

Comment (Contributor Author):

Yes, good catch. This actually came up in our early discussions and I forgot to add it here. I'll add a note that we need to do so.




        public void AddSource(string fileNameHint, SourceText sourceText) { throw new NotImplementedException(); }
    }
}
```
It is assumed that some generators will want to generate more than one `SourceText`, for example in a 1:1 mapping
for additional files. The `fileNameHint` parameter of `AddSource` is intended to address this:

1. If the generated files are emitted to disk, having some ability to include distinguishing text in the name might be useful.
For example, if you have two `.resx` files, generating the files with simply names of `ResxGeneratedFile1.cs` and
`ResxGeneratedFile2.cs` wouldn't be terribly useful -- you'd want it to be something like
`ResxGeneratedFile-Strings.cs` and `ResxGeneratedFile-Icons.cs` if you had two `.resx` files
named "Strings" and "Icons" respectively.

2. The IDE needs some concept of a "stable" identifier. Source generators create a couple of fun problems for the IDE:
users will want to be able to set breakpoints in a generated file, for example. If a source generator outputs multiple
files we need to know which is which so we can know which file the breakpoints go with. A source generator of course is
allowed to stop emitting a file if its inputs change (if you delete a `.resx`, then the generated file associated with it
will also go away), but this gives us some control here.

This was called a "hint" in that the compiler is implicitly allowed to control the filename however it ultimately
needs to, and if two source generators give the same "hint" it can still distinguish them with any sort of
prefix/suffix as necessary.
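
To make the proposed shape concrete, here is a sketch of a generator that produces one `SourceText` per `.resx` additional file, with a distinguishing `fileNameHint` as described above. It is written against the proposed (not yet implemented) API surface; the `ResxGenerator` and `Generated` names are invented for illustration, and the actual resx handling is elided.

```csharp
using System;
using System.IO;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Text;

namespace MyGenerators
{
    // Illustrative sketch only: emits a placeholder class for each .resx additional file.
    public class ResxGenerator : ISourceGenerator
    {
        public void Execute(SourceGeneratorContext context)
        {
            foreach (AdditionalText file in context.AnalyzerOptions.AdditionalFiles)
            {
                context.CancellationToken.ThrowIfCancellationRequested();

                if (!Path.GetExtension(file.Path).Equals(".resx", StringComparison.OrdinalIgnoreCase))
                    continue;

                // Content comes from the compiler-provided snapshot, never from a direct disk read.
                SourceText resxContent = file.GetText(context.CancellationToken);
                if (resxContent is null)
                    continue;

                string name = Path.GetFileNameWithoutExtension(file.Path);

                // Real resx parsing is omitted; a trivial class stands in for the generated code.
                string source =
                    "namespace Generated\n" +
                    "{\n" +
                    $"    internal static class {name}Resources\n" +
                    "    {\n" +
                    $"        public const int SourceLength = {resxContent.Length};\n" +
                    "    }\n" +
                    "}\n";

                // Produces hints such as "ResxGeneratedFile-Strings" and "ResxGeneratedFile-Icons".
                context.AddSource($"ResxGeneratedFile-{name}", SourceText.From(source, Encoding.UTF8));
            }
        }
    }
}
```

Note that the non-source inputs are consumed through the additional-files mechanism (as recommended in the review discussion above) rather than read directly from disk, and the `CancellationToken` from the context is observed before each file is processed.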

### IDE Integration

One of the more complicated aspects of supporting generators is enabling a high-fidelity
experience in Visual Studio. For the purposes of determining code correctness, all
generators will need to have been run. Obviously, it is impractical to run
every generator on every keystroke while still maintaining an acceptable level of performance
within the IDE.

#### Progressive complexity opt-in

It is expected instead that source generators would work on an 'opt-in' approach to IDE
enablement.

By default, a generator implementing only `ISourceGenerator` would see no IDE integration
and only be correct at build time. Based on conversations with 1st party customers,
there are several cases where this would be enough.
Comment on lines +98 to +100 (Member):

This does not instill confidence. I would prefer to have a default of "runs as a background operation after a source file is saved", similar to an asynchronous version of single file generators. If we want to improve performance, there are other good approaches we can use, such as:

  1. Filtering (e.g. by input file name or by an attribute applied within a file).
  2. Design-time generation of a reference API, separate from implementation generation. This could save time through reduced validation of inputs and avoiding generation of member implementations.

Comment (Contributor Author):

There are lots of use cases for generators where there is no user-observable output from the source generator (think statically compiled DI etc). We want to start with the absolute simplest model, and build up the complexity from there.



Comment:

@sharwell in most scenarios I've been able to deal with, actual generation time is generally insignificant, and the determination of the API to generate takes up most of the time for larger generators.


However, for scenarios such as code-first gRPC, and in particular Razor and Blazor,
the IDE will need to be able to generate code on-the-fly as those file types are
edited and reflect the changes back to other files in the IDE in near real-time.

The proposal is to have a set of advanced interfaces that can be optionally implemented,
allowing the IDE to query the generator to decide what needs to be run in the case
of any particular edit.

For example, an extension that would cause generation to run after saving a third-party
file might look something like:

```csharp
namespace Microsoft.CodeAnalysis
{
    public interface ITriggeredByAdditionalFileSavedGenerator : ISourceGenerator
    {
        ImmutableArray<string> SupportedAdditionalFileExtensions { get; }
    }
}
```
It is expected that there will be various levels of opt-in that can be added to a generator
in order to achieve the specific level of performance required of it.

What these exact APIs will look like remains an open question, and it's expected that we will
need to prototype some real-world generators before knowing what their precise shape will be.
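
To illustrate, a generator could opt in by implementing the strawman interface alongside `ISourceGenerator`. The `TriggeredResxGenerator` name below is invented, and the interface itself is only the sketch shown above:

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;

namespace MyGenerators
{
    // Hypothetical: a generator that asks the IDE to re-run it whenever a .resx file is saved.
    public class TriggeredResxGenerator : ITriggeredByAdditionalFileSavedGenerator
    {
        public ImmutableArray<string> SupportedAdditionalFileExtensions =>
            ImmutableArray.Create(".resx");

        public void Execute(SourceGeneratorContext context)
        {
            // Same body as the ResxGenerator sketch earlier in this document.
        }
    }
}
```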

### Output files

It is desirable that the generated source texts be available for inspection after generation,
either as part of creating a generator or to see what code was generated by a third-party
generator.

By default, generated texts will be persisted to a `GeneratedFiles/{GeneratorAssemblyName}`
sub-folder within `CommandLineArguments.OutputDirectory`. The `fileNameHint` from
Comment (@sharwell, Member, Dec 5, 2019):

❔ Is OutputDirectory going to be under obj or bin for default scenarios?

💡 It would help implementers relate to this description if it included an example of the output location. #Resolved

`SourceGeneratorContext.AddSource` will be used to create a unique name, with appropriate
collision renaming applied if required. For instance, on Windows a call to
`AddSource("MyCode", ...);` from `MyGenerator.dll` for a C# project might be
persisted as `obj/debug/GeneratedFiles/MyGenerator.dll/MyCode.cs`.

File output is not required for the correct function of either command-line or IDE-based
generation, and can be completely disabled if required. The IDE will work on in-memory
copies of the generated source texts (for 'Find all references', breakpoints etc.) and
periodically flush any changes to disk.

To support the use case where a user wishes to generate the source text, then commit

Comment:

This kind of scenario is very problematic, as the user generally does not control when the generators are run (when run from the IDE). This may create many sorts of "my modifications went away" scenarios that frustrate users. Having a non-negotiable scenario of "files are always re-generated on build" is easier to reason with.

the generated files to source control, we will allow changing the location of the
generated files via an appropriate command line switch, and matching MSBuild property
Comment (@agocke, Member, Dec 8, 2019):

Something to consider: if we can figure out the output names of the generator, and have a list of output items with the same name, we can decide whether or not they are up-to-date based on file stamps.

We might want to try to get this in V1, since it may be difficult to add after-the-fact. #Resolved

Comment (Contributor Author):

Added to the doc to track



(naming still to be determined).

In these cases it will be up to the user whether they wish to generate over the files again
in the future (in which case they would still be generated, but output to a
source-controlled location), or remove the generators and perform the action as a one-time
step.

It is currently an open question how, for example, the action of setting a breakpoint in
a disk-based generated file will function.

TK: how do we save PDBs/Source link etc?

### Editing experiences for third party languages

One of the interesting scenarios that source generators will enable is essentially
the 'embedding' of C# within other languages (and vice versa). This is how Razor
works today, and the Razor team maintains a significant language service investment
in Visual Studio to enable it.

A possible goal of this project would be to find a generic way to represent this,
which would allow the Razor team to reduce their tooling investment, while allowing
third parties the opportunity to enable the same sorts of experiences
(including 'Go to definition', 'Find all references' etc.) relatively cheaply.

The current thinking is to have some form of 'side-channel' available to
the generator. As the generator emits source text, it would indicate where
in the original document this was generated from. This would allow the
compiler API to track e.g. a generated `Symbol` as having an `OriginalDefinition`
that represents a span of third party source text (such as a Razor tag in a
`.cshtml` file).

We discussed embedding this directly in the source text via `#pragma` but
this would require language changes and limit the feature to a specific version
of C#. Other considerations could be specially formed comments or `#if FALSE --`
blocks. In general a 'side-channel' approach seems preferable to specially crafted
grammar in the generated text.

This is not necessarily a goal required for the success of Source Generators;
Razor’s language service can be updated to work with source generators
if it proves to be infeasible, but it is certainly something we want to consider
as part of the work.

### MSBuild Integration

It is expected that generators will need some form of configuration system, and we intend to allow

Comment:

This feature is critical. In the general case, an `IDictionary<string, IList<(string, IDictionary<string, string>)>>` (for ItemGroups and their metadata) and an `IDictionary<string, IDictionary<string, string>>` (for properties and metadata) would be the best way to propagate this information in such a way that it would not require generators to forcibly ship with an MSBuild props file that defines which properties they are interested in.

This is also the proper way to get input files from MSBuild (and not just rely on some arbitrary enumeration of files and folders based on the project's location, having to skip bin/obj files properly in user code).

Comment (Member):

@jeromelaban

Part of the issue here is the sheer amount of data that gets stored in MSBuild properties and item metadata. Forcing all of it into the compilation by default would decrease performance as well as impact our ability to avoid unnecessary rebuilds, as literally everything in MSBuild would be considered an input to the compilation.

Instead we've chosen this approach where analyzers / generators can programmatically determine the item metadata and properties that are interesting to them.

certain properties to flow through from MSBuild to facilitate this.
Comment on lines +193 to +194 (Member):

💡 It may be sufficient to pass in the .editorconfig representation from the compiler


> **Note**: This is still under design and open to change.


### Performance targets
Comment (Contributor):

Should we consider going out of process from the start, given the nature of generating very, very large sources, even if the generator was written well?

Comment (Member):

For the IDE we're looking at a minimal bring-up scenario to allow the compiler work to proceed. It's not intended to be the final shipping solution. This reduced their cost significantly and unblocked our ability to move the compiler forward. The expectation is that for shipping we'll move it out of proc, just as we do for analyzers.

Comment (Contributor):

Much yeet.


Ultimately, the performance of the feature is going to be somewhat dependent on the performance of the
generators written by customers. Progressive opt-in, and being build-time only by default, will allow the IDE
to mitigate many of the potential performance problems posed by third-party generators. However, there
is still a risk that third-party generators will cause unacceptable performance problems for the IDE,
and the design of the feature will need to keep this in mind.
Comment on lines +201 to +205 (Member):

This feature should be implemented under the assumption that third-party source generators have taken all reasonable measures to ensure their generators perform well when invoked. It is not acceptable to mitigate a non-existent problem by forcibly degrading the developer experience for someone using third-party generators, so this should be rephrased to simply say that third-party generator authors are responsible for ensuring the generators perform well. With that in mind:

  • We need to be able to identify cases where a third-party generator is actively causing performance problems (so users don't blame us for something we didn't cause).
  • We need to minimize the overhead of the source generator driver itself to ensure that third-party generator authors have the ability to address performance issues that arise

Comment:

In many cases, the performance of the generator is very tricky to define, as it is extremely dependent on the number of files to process (think XAML-to-C# generation, for instance). Being able to view very clearly which generator is taking up the generation time is a must-have.

Good behavior for generators, such as avoiding `.ToDisplayString()` to perform symbol lookups and comparisons, should be documented as best practices.


For 1st party generators, especially Razor and Blazor, we aim at a minimum to match the existing
performance seen by users today. It is expected that even naïve generator-based implementations
will perform significantly faster than the existing tooling, due to less communication overhead
and duplicated work, but improving the speed of these experiences is not a primary goal of this project.

### Language Changes

This design does not currently propose altering the language; it is purely a compiler feature.

Comment:

👏 I agree, even if it would have been a very nice feature to have. The ramifications of such changes are very complex to reason with.

The previous design for source generators introduced the `replace` and `original` keywords.
This proposal removes these, as the source generated is purely additional and so there is no
need for them. We expect that most scenarios are possible with the existing use of `partial`
definitions; as a V1 we expect to ship in this state. If concrete scenarios are later shown
that can’t be achieved with the V1 approach we would consider allowing modification as a V2.
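
As a sketch of what "additive only, plus `partial`" looks like in practice (the `Person` type here is invented for illustration), the user declares one half of a partial type and the generator supplies the other half, rather than rewriting the user's code:

```csharp
// User-written code, checked in to the repository.
public partial class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Generator-emitted code, added to the compilation via AddSource.
// The generator can only add this declaration; it cannot modify the user's file above.
public partial class Person
{
    public string FullName => $"{FirstName} {LastName}";
}
```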

## Use cases

We've identified several first- and third-party candidates that would benefit from source generators:

- ASP.Net: Improve startup time
- Blazor and Razor: Massively reduce tooling burden
- Azure Functions: regex compilation during startup
Comment (Contributor):

We (the Azure SDK team) would like to experiment with this feature too.

- Azure SDK
- [gRPC](https://docs.microsoft.com/en-us/aspnet/core/grpc/?view=aspnetcore-3.1)
- Resx file generation
- [System.CommandLine](https://github.com/dotnet/command-line-api)
- Serializers
Comment (@CyrusNajmabadi, Member, Dec 5, 2019):

how do these interact? it was stated above that there is no ordering, but it seems like it will be hard to remove that. i.e. how do 'serializers' and 'blazor/razor' interact? will one never be dependent on the other? #Resolved

Comment (Contributor Author):

That's the plan. Here serializers are really aimed at adding potential compile time specialization for generation. Razor/Blazor is for transforming cshtml. Razor/Blazor does use serialization but would just be faster if the generator for the serializer it's using is present.



- [SWIG](http://www.swig.org/)
Comment (Member):

💡 Compiler compilers, such as ANTLR

💡 Language conversion, such as generation of a CUDA or OpenCL program as a string from a managed function

Comment (Member):

Another example is the SharpGenTools project at https://github.com/SharpGenTools/SharpGenTools.



## Discussion / Open Issues / TODOs:
Comment (@jeromelaban, Dec 8, 2019):

The high level design goals mention that ordered execution is out of scope. A DependsOn / Before / After model is very important, as generators very frequently need to depend on the generated code of other generators. If it may help, here's how we implemented it in Uno.SourceGeneration; it uses literal strings to enable that, as we haven't found a better way to do it.


**Interface vs Class for ISourceGenerator**:

We discussed whether this should be an interface or a class. Analyzers chose to have an abstract base class,
but we weren't sure what we'd end up needing since ultimately we only had one method on this.
Keeping it an interface also was more natural since we have other interfaces that
implement this interface as well for optional light-up.

**IDependsOnCompilationGenerator**:

We did discuss whether there should be an `IDependsOnCompilationGenerator` to formally state

Comment:

Should this be an attribute?

that you actually use a compilation. After all, if you don't use the compilation
then we know your performance in the IDE is greatly simplified. However, every
scenario we've had for reading additional files has also needed the compilation,
so we simply weren't sure what that was going to bring.
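
If such a marker were added, it would presumably be little more than an empty interface (or, per the comment above, an attribute). A sketch of the interface flavour, purely to illustrate the idea:

```csharp
namespace Microsoft.CodeAnalysis
{
    // Hypothetical marker: declares that a generator actually reads the Compilation,
    // letting hosts avoid constructing one for generators that do not need it.
    public interface IDependsOnCompilationGenerator : ISourceGenerator
    {
    }
}
```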

**Breakpoints in generated files**:

Do we map this back to the in-memory file?

**Should generators be push or pull**:

Source generators are pull-based, analyzers are push-based (registration based). Should
we use a push-based model for generators as well?

- If we go down the push-based model, walking the tree should make sure to continue
to produce events for as many nodes as possible, even with errors, as generators
will often work in the presence of errors.

- The events that we use today for analyzers may require much more work to produce,
Comment (Member):

By "more work", specifically, doing the minimal amount of work needed to produce an event, i.e. creating a Compilation should not need to do any binding.

since we expect analyzers to run during full compilation, while generators may
not want to even construct the symbol table

- The progressive-performance-opt-in model may work better in a push-based model,
since you would only register for the things you care about

**Should we share more with the analyzer type hierarchy?**:

We would still need to differentiate analyzers from generators, since
they would be generated at different times (generator diagnostics only on
Comment (@agocke, Member, Dec 8, 2019):

Suggested change:

```diff
- they would be generated at different times (generator diagnostics only on
+ they would be run at different times (generator diagnostics only on
```

#Resolved

the first compilation, analyzer diagnostics only on the second compilation)

**Can we predict how often some of our sample customers (Razor?) will have to run the generators?**:

They can't predict that right now, and the incorporation of timers into their
current generation makes it very difficult to predict the consequences of
only event-based generation

**Do we have a priority list of the most important customers?**:

No, we should work out priority in order to prioritize features.

**Security Review**:

Do generators create any new security risks not already posed via analyzers and nuget?
Comment (@gafter, Member, Dec 6, 2019):

Certainly they do introduce new security risks. See my "insert a trojan horse virus into the program at compile-time" generator. It is open-source and as you can see by examining its source it does nothing nefarious. See https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf for its design document. #Resolved

Comment (Member):

How is this a new security risk though? Or more accurately how is the security risk of a source generator any different than that of analyzers? A misbehaving analyzer could use reflection tricks to do virtually anything a source generator could do. Certainly a bad actor wouldn't baulk at using reflection just because it wasn't supported.

Comment (Contributor Author):

As far as we understand, there is nothing a generator can do that an analyzer couldn't have achieved previously via private reflection. Ultimately we've been allowing arbitrary code execution from packages during compilation for some time (especially via nuget), we're just making it explicit now. This is a note that we need to chat with the security folks to ensure our assumptions are correct.



Comment (Member):

Analyzers come via package references and package references can do basically anything. They can install props, targets, random non-.NET binaries, play the he-man song on repeat, etc ... Even in the case we assume they're checked (they're not) analyzers are code running with full trust on the machine.

Comment (Member):

For context, this is the he-man song: http://aka.ms/davkean