Initial proposal for P/Invokes via Source Generators #33742

Merged
merged 9 commits into from
Mar 31, 2020
214 changes: 214 additions & 0 deletions docs/design/features/source-generator-pinvokes.md
@@ -0,0 +1,214 @@
# Source Generator P/Invokes

## Purpose

The CLR has a rich built-in marshaling mechanism for interoperability with native code that is handled at runtime. This system was designed to free .NET developers from having to author complex and potentially ABI-sensitive [type conversion code][typemarshal_link] when moving from a managed to an unmanaged environment. The built-in system works with both [P/Invoke][pinvoke_link] (i.e. `DllImportAttribute`) and [COM interop](https://docs.microsoft.com/dotnet/standard/native-interop/cominterop). The marshaling code the runtime generates is typically called an ["IL Stub"][il_stub_link] since the stub is generated by inserting IL instructions into a stream and then passing that stream to the JIT for compilation.

A consequence of this approach is that marshaling code is not immediately available post-link for AOT scenarios (e.g. [`crossgen`](../../workflow/building/coreclr/crossgen.md) and [`crossgen2`](crossgen2-compilation-structure-enhancements.md)). The immediate unavailability of this code has been mitigated by a complex mechanism that generates marshaling code during AOT compilation. The [IL Linker][ilinker_link] is another tool that struggles with runtime-generated code since it cannot determine all potentially used types without seeing what is generated.

The user experience of the built-in generation initially appears ideal, but there are several negative consequences that make the system costly in the long term:

* Bug fixes in the marshaling system require an update to the entire runtime.
* New types require enhancements to the marshaling system for efficient marshal behavior.
* [`ICustomMarshaler`](https://docs.microsoft.com/dotnet/api/system.runtime.interopservices.icustommarshaler) incurs a substantial performance penalty.
* Once a marshaling bug becomes expected behavior, the bug is difficult to fix. Users come to rely on shipped behavior, and because the marshaling system is built into the runtime there is no way to select between the previous and the new behavior.
    * Example involving COM marshaling: https://github.com/dotnet/coreclr/pull/23974.
* Debugging the auto-generated marshaling IL Stub is difficult for runtime developers and close to impossible for consumers of P/Invokes.

This is not to say the P/Invoke system should be completely redesigned. The current system is heavily used and its simplicity for consuming native assets is a benefit. Rather, this new mechanism is designed to provide a way for marshaling code to be generated by an external tool but work with existing `DllImportAttribute` practices in a way that isn't onerous for current .NET developers.

The [Roslyn Compiler](https://github.com/dotnet/roslyn) team is working on a [Source Generator feature][source_gen_link] that will allow the generation of additional source files that can be added to an assembly during the compilation process - the runtime generation of IL Stubs is an in-memory version of this scenario.

**Note** This proposal is targeted at addressing P/Invoke improvements but could be adapted to work with COM interop utilizing the new [`ComWrappers`][comwrappers_link] API.

### Requirements

* [Source generators][source_gen_link]
    * Branch: https://github.com/dotnet/roslyn/tree/features/source-generators

* Support for non-`void` return types in [`partial`](https://docs.microsoft.com/dotnet/csharp/language-reference/keywords/partial-method) methods.
    * https://github.com/dotnet/csharplang/issues/3301

## Design

The use of Source Generators focuses on integrating with existing `DllImportAttribute` practices from an invocation point of view (i.e. callsites should not need to be updated). The idea behind Source Generators is that code for some scenarios can be precomputed using user-declared types and logic, thus avoiding the need to generate code at runtime.

**Goals**

* Allow P/Invoke interop to evolve independently of the runtime.
* High performance: No reflection at runtime, compatible in an AOT scenario.

**Non-Goals**

* 100% parity with existing P/Invoke marshaling rules.
* Zero code change for developers.

### P/Invoke Walkthrough

The P/Invoke algorithm is presented below using a simple example.

``` CSharp
/* A */ [DllImportAttribute("Kernel32.dll")]
/* B */ extern static bool QueryPerformanceCounter(out long lpPerformanceCount);
...
long count;
/* C */ QueryPerformanceCounter(out count);
```

At (A) in the above code snippet, the runtime is told to look for an export named `QueryPerformanceCounter` (B) in the `Kernel32.dll` binary. There are many additional properties on the `DllImportAttribute` that help with export discovery and can influence the semantics of the generated IL Stub. Point (C) represents an invocation of the P/Invoke. Most of the work occurs at (C) at runtime, since (A) and (B) are merely declarations the compiler uses to embed the relevant details into assembly metadata that is read at runtime.

1) During invocation, the function declaration is determined to be an external call requiring marshaling. Given the defined properties in the `DllImportAttribute` instance as well as the metadata of the user-defined signature, an IL Stub is generated.

2) The runtime attempts to find a binary with the name supplied in `DllImportAttribute`.

    * Discovery of the target binary is complicated and can be influenced by the [`AssemblyLoadContext`](https://docs.microsoft.com/dotnet/api/system.runtime.loader.assemblyloadcontext) and [`NativeLibrary`](https://docs.microsoft.com/dotnet/api/system.runtime.interopservices.nativelibrary) classes.

3) Once the binary is found and loaded into the runtime, it is queried for the expected export name. The name of the attributed function is used by default but this is configurable by the [`DllImportAttribute.EntryPoint`](https://docs.microsoft.com/dotnet/api/system.runtime.interopservices.dllimportattribute.entrypoint) property.

    * This process is also influenced by additional `DllImportAttribute` properties as well as by the underlying platform. For example, the [Win32 API ANSI/UNICODE convention](https://docs.microsoft.com/windows/win32/intl/conventions-for-function-prototypes) on Windows is respected and an `A` or `W` suffix may be appended to the function name if it is not immediately found.

4) The IL Stub is called like any other .NET method.

5) The IL Stub marshals arguments as appropriate and invokes the export via the `calli` instruction.

6) Once the export returns control to the IL Stub the marshaling logic cleans up and ensures any returned data is marshaled back out to the calling function.
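The steps above can be approximated by hand with public APIs. The following is a minimal sketch, not the runtime's actual implementation: the class and delegate names are made up for illustration, library probing is reduced to a single `NativeLibrary.TryLoad` call, and the final call still goes through the built-in delegate marshaling rather than a hand-written stub. On platforms without `Kernel32.dll` the method simply returns `false`.

``` CSharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper; none of these names come from the proposal.
public static class ManualWalkthrough
{
    // Managed view of the native signature: BOOL QueryPerformanceCounter(LARGE_INTEGER*).
    [UnmanagedFunctionPointer(CallingConvention.Winapi)]
    private delegate int QueryPerformanceCounterFn(out long lpPerformanceCount);

    public static bool TryQueryPerformanceCounter(out long count)
    {
        count = 0;

        // Step 2: find and load the binary named in the attribute.
        // The handle is intentionally not freed in this sketch.
        if (!NativeLibrary.TryLoad("Kernel32.dll", out IntPtr library))
            return false;

        // Step 3: query the loaded binary for the export name.
        if (!NativeLibrary.TryGetExport(library, "QueryPerformanceCounter", out IntPtr export))
            return false;

        // Steps 4-6: wrap the export, call it, and convert the native BOOL
        // (an int) back to a managed bool for the caller.
        var call = Marshal.GetDelegateForFunctionPointer<QueryPerformanceCounterFn>(export);
        return call(out count) != 0;
    }
}
```

A caller would use it as `ManualWalkthrough.TryQueryPerformanceCounter(out long count)` and check the returned `bool` before reading `count`.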

### Source Generator Integration

An example of how the previous P/Invoke snippet could be transformed is below. This example uses the API proposed in this document. Source Generators are restricted from modifying user code, so that restriction is reflected in the design; mitigations for easing code adoption are presented later.

`Program.cs` (User written code)

``` CSharp
/* A */ [GeneratedDllImportAttribute("Kernel32.dll")]
/* B */ static partial bool QueryPerformanceCounter(out long lpPerformanceCount);
...
long count;
/* C */ QueryPerformanceCounter(out count);
```

Observe point (A), the new attribute. This attribute indicates to a Source Generator that the following declaration represents a native export that will be called via a source-generated stub.

During the source generation process the metadata in the `GeneratedDllImportAttribute` (A) would be used to generate a stub that invokes the desired native export. Also note that the method declaration (B) is marked `partial`. The Source Generator would then generate the source for this partial method. The invocation (C) is unchanged from its `DllImportAttribute` form.

`Stubs.g.cs`:

``` CSharp
/* D */ static partial bool QueryPerformanceCounter(out long lpPerformanceCount)
{
    unsafe
    {
        // Call the blittable P/Invoke (E), then copy the result to the out parameter.
        long result = 0;
        bool success = QueryPerformanceCounter(&result) != 0;
        lpPerformanceCount = result;
        return success;
    }
}

[DllImportAttribute("Kernel32.dll")]
/* E */ private static extern unsafe int QueryPerformanceCounter(long* lpPerformanceCount);
```

The Source Generator would generate the implementation of the partial method (D) in a separate source file (`Stubs.g.cs`). At point (E) a `DllImportAttribute` declaration is created based on the user's original declaration (A) for a private P/Invoke specifically for the generated code. The P/Invoke signature from the original declaration would be modified to contain only [blittable types][blittable_link] to ensure the JIT could inline the invocation. Finally, note that the user's original function signature would remain, to avoid impacting existing callsites.
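To see how the two halves fit together, here is a self-contained sketch that compiles without any generator: the "generated" half is written by hand. The `Sketch` namespace, the `NativeMethods` class, and the attribute's constructor and `Value` property are assumptions made for illustration; the proposal itself does not define them. Current C# compilers also require `partial` directly before the return type and an explicit accessibility modifier on partial methods that return a value, and the project must allow unsafe code.

``` CSharp
using System;
using System.Runtime.InteropServices;

namespace Sketch
{
    // Hypothetical stand-in for the proposed attribute. The constructor taking
    // the library name is an assumption that mirrors DllImportAttribute.
    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
    public sealed class GeneratedDllImportAttribute : Attribute
    {
        public GeneratedDllImportAttribute(string dllName) => Value = dllName;
        public string Value { get; }
    }

    // User-written half (Program.cs in the text above).
    public static partial class NativeMethods
    {
        [GeneratedDllImport("Kernel32.dll")]
        public static partial bool QueryPerformanceCounter(out long lpPerformanceCount);
    }

    // Half a generator would emit (Stubs.g.cs in the text above).
    public static partial class NativeMethods
    {
        public static partial bool QueryPerformanceCounter(out long lpPerformanceCount)
        {
            unsafe
            {
                // Marshal manually: the native BOOL comes back as an int.
                long result = 0;
                bool success = QueryPerformanceCounter(&result) != 0;
                lpPerformanceCount = result;
                return success;
            }
        }

        // Blittable-only signature, private to the generated code.
        [DllImport("Kernel32.dll")]
        private static extern unsafe int QueryPerformanceCounter(long* lpPerformanceCount);
    }
}
```

Callers write `Sketch.NativeMethods.QueryPerformanceCounter(out long count)` exactly as they would against a classic `DllImportAttribute` declaration; the call only succeeds where `Kernel32.dll` is available.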

In this system it is not defined how marshaling of specific types would be performed. The built-in runtime has complex rules for some types, and once shipped these rules become the de facto standard, often regardless of whether the behavior is a bug. The design here is not concerned with how the arguments go from a managed to an unmanaged environment. With the IL Stub generation extracted from the runtime, new type marshaling (e.g. `Span<T>`) could be introduced without requiring a corresponding update to the runtime itself. The `Span<T>` type is a good example of a type that at present has no support for marshaling, but with Source Generators, users could update to the latest generator and have support without changing the runtime.
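As a rough illustration of the `Span<T>` case, the sketch below shows one shape a generated stub could take: pin the span, then pass a pointer and a length to a blittable P/Invoke. This is an assumption about what a generator might emit, not something the proposal specifies. The `SpanStubs` class name is hypothetical, the target is the Win32 `GetSystemDirectoryW` export, and the project must allow unsafe code.

``` CSharp
using System;
using System.Runtime.InteropServices;

// Hypothetical hand-written stand-in for generator output.
public static class SpanStubs
{
    // A user declaration under the proposal might look like this (illustrative only):
    //   [GeneratedDllImport("Kernel32.dll", EntryPoint = "GetSystemDirectoryW")]
    //   public static partial uint GetSystemDirectory(Span<char> buffer);

    // Stub: pin the span so the GC cannot move it during the native call.
    public static unsafe uint GetSystemDirectory(Span<char> buffer)
    {
        fixed (char* pBuffer = buffer)
        {
            return GetSystemDirectoryW(pBuffer, (uint)buffer.Length);
        }
    }

    // Blittable-only signature: a raw pointer and an unsigned length.
    [DllImport("Kernel32.dll", ExactSpelling = true)]
    private static extern unsafe uint GetSystemDirectoryW(char* lpBuffer, uint uSize);
}
```

On Windows, `SpanStubs.GetSystemDirectory(new char[260])` returns the number of characters written when the buffer is large enough; on other platforms the call fails because `Kernel32.dll` cannot be loaded.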
Member

A few questions related to marshallers:

  1. In this example, isn't an IL Stub still being generated? Is there a way to avoid that stub, or have it pre-generated?
  2. Will stubs.g.cs be human writable? If so, how?
  3. How would a new marshaller, like Span, be introduced? How do we envision adding them to our project? (Nuget?)
  4. What is the diagnostics experience if I use a type that is not supported by the source generator? A build error? An exception emitted at runtime? (e.g. trying Span before such a marshaller is introduced)

Member Author

@jeffschwMSFT Here are my thoughts:

> 2. Will stubs.g.cs be human writable? If so, how?

Sure. That is already possible since users can convert their non-blittable P/Invoke signature into a blittable one and instead implement a wrapper that does all the marshaling manually. There would be little need to do any generation if the signature only contains blittable types - at least that seems to be the tenor of the current thought.

> 3. How would a new marshaller, like Span, be introduced? How do we envision adding them to our project? (Nuget?)

That would be a design consideration for the default Source Generator supplied by the .NET team. Whether that source generator is extensible or not is entirely up to us. Third parties that want their own P/Invoke source generator would need to define their own mechanism for extensible marshaling. For the default, we would add support as types become interesting. Span<T> would of course be in the default, as well as Nullable<T> and various others as the language continues to evolve, but we could add support and users could opt in without requiring a runtime update.

> 4. What is the diagnostics experience if I use a type that is not supported by the source generator? A build error? An exception emitted at runtime? (e.g. trying Span before such a marshaller is introduced)

That is an interesting question for the Source Generator team. @chsienki How is the Source Generator team thinking about propagation of source generator states? How does a Source Generator convey failures to the users?

Contributor

Generators can produce diagnostics (much like an analyzer does today), which are presented to the user via the command line / IDE etc.

Member Author

Thanks @chsienki. @jeffschwMSFT Seems like any Source Generator for P/Invokes would be able to alert a user with a nice message when it hits a type that can't be marshaled. This could point to the classic P/Invoke mechanism when the Source Generator can't support the scenario, and/or we could point users to a repo to file an issue or to guidance on how to work around the missing support manually.


### Adoption of Source Generator for existing code

In the current Source Generator design, modification of any user-written code is not permitted. This includes modification of any non-functional metadata (e.g. attributes). The above design therefore introduces a new attribute and signature for consumption of a native export. As a result, to consume Source Generators, users would need to update their source, and adoption could be stunted by this.

As a mitigation it would be possible to create a [Roslyn Analyzer and Code fix](https://github.com/dotnet/roslyn/wiki/Getting-Started-Writing-a-Custom-Analyzer-&-Code-Fix) to aid the developer in converting their `DllImportAttribute` marked functions to use `GeneratedDllImportAttribute`. The code fix would also need to add the `partial` keyword to the function and, potentially, to the enclosing class.

## Proposed API

Given the Source Generator restrictions and potential confusion about overloaded attribute usage, the new `GeneratedDllImportAttribute` attribute mirrors the existing `DllImportAttribute`.

``` CSharp
namespace System.Runtime.InteropServices
{
    /// <summary>
    /// Attribute used to indicate a Source Generator should create a function for marshaling
    /// arguments instead of relying on the CLR to generate an IL Stub at runtime.
    /// </summary>
    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
    public sealed class GeneratedDllImportAttribute : Attribute
    {
        /// <summary>
        /// Enables or disables best-fit mapping behavior when converting Unicode characters
        /// to ANSI characters.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.BestFitMapping"/>
        public bool BestFitMapping;

        /// <summary>
        /// Indicates the calling convention of an entry point.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.CallingConvention"/>
        public CallingConvention CallingConvention;

        /// <summary>
        /// Indicates how to marshal string parameters to the method and controls name mangling.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.CharSet"/>
        public CharSet CharSet;

        /// <summary>
        /// Indicates the name or ordinal of the DLL entry point to be called.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.EntryPoint"/>
        public string? EntryPoint;

        /// <summary>
        /// Controls whether the System.Runtime.InteropServices.DllImportAttribute.CharSet
        /// field causes the common language runtime to search an unmanaged DLL for entry-point
        /// names other than the one specified.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.ExactSpelling"/>
        public bool ExactSpelling;

        /// <summary>
        /// Indicates whether unmanaged methods that have HRESULT or retval return values
        /// are directly translated or whether HRESULT or retval return values are automatically
        /// converted to exceptions.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.PreserveSig"/>
        public bool PreserveSig;

        /// <summary>
        /// Indicates whether the callee calls the SetLastError Windows API function before
        /// returning from the attributed method.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.SetLastError"/>
        public bool SetLastError;

        /// <summary>
        /// Enables or disables the throwing of an exception on an unmappable Unicode character
        /// that is converted to an ANSI "?" character.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.ThrowOnUnmappableChar"/>
        public bool ThrowOnUnmappableChar;
    }
}
```
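Because the proposed fields mirror `DllImportAttribute`, an existing declaration shows what they look like in use. The sketch below uses only the existing `DllImportAttribute` against the Win32 `GetModuleHandleW` export; the `MirroredFields` class and `TryGetMainModule` helper are hypothetical names. Under the proposal, the same named arguments would presumably be written on `GeneratedDllImportAttribute` instead, though that is an assumption rather than something the API listing guarantees.

``` CSharp
#nullable enable
using System;
using System.Runtime.InteropServices;

// Hypothetical example type; only DllImportAttribute and Marshal are real APIs here.
public static class MirroredFields
{
    [DllImport("Kernel32.dll",
        EntryPoint = "GetModuleHandleW",          // explicit export name
        CharSet = CharSet.Unicode,                // marshal strings as UTF-16
        ExactSpelling = true,                     // do not probe A/W variants
        SetLastError = true,                      // capture the Win32 last error
        CallingConvention = CallingConvention.Winapi)]
    private static extern IntPtr GetModuleHandle(string? lpModuleName);

    // Returns false on non-Windows platforms without attempting the call.
    public static bool TryGetMainModule(out IntPtr handle, out int lastError)
    {
        handle = IntPtr.Zero;
        lastError = 0;
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            return false;

        // Passing null asks for the module used to create the calling process.
        handle = GetModuleHandle(null);
        if (handle == IntPtr.Zero)
            lastError = Marshal.GetLastWin32Error();
        return handle != IntPtr.Zero;
    }
}
```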

## Questions

* Can the above API be used to provide a reverse P/Invoke stub?

## References

[P/Invoke][pinvoke_link]

[Type Marshaling][typemarshal_link]

[IL Stubs description][il_stub_link]

<!-- Common links -->
[dotnet_link]: https://docs.microsoft.com/dotnet/core/tools/dotnet
[typemarshal_link]: https://docs.microsoft.com/dotnet/standard/native-interop/type-marshaling
[pinvoke_link]: https://docs.microsoft.com/dotnet/standard/native-interop/pinvoke
[comwrappers_link]: https://github.com/dotnet/runtime/issues/1845
[il_stub_link]: https://mattwarren.org/2019/09/26/Stubs-in-the-.NET-Runtime/
[source_gen_link]: https://github.com/dotnet/roslyn/blob/features/source-generators/docs/features/source-generators.md
[blittable_link]: https://docs.microsoft.com/dotnet/framework/interop/blittable-and-non-blittable-types
[ilinker_link]: https://github.com/mono/linker