Initial proposal for P/Invokes via Source Generators #33742

Merged
merged 9 commits into from
Mar 31, 2020
233 changes: 233 additions & 0 deletions docs/design/features/source-generator-pinvokes.md
@@ -0,0 +1,233 @@
# Source Generator P/Invokes

## Purpose

The CLR possesses a rich built-in marshaling mechanism for interoperability with native code that is handled at runtime. This system was designed to free .NET developers from having to author complex and potentially ABI sensitive [type conversion code][typemarshal_link] from a managed to an unmanaged environment. The built-in system works with both [P/Invoke][pinvoke_link] (i.e. `DllImportAttribute`) and [COM interop](https://docs.microsoft.com/dotnet/standard/native-interop/cominterop). The generated portion is typically called an ["IL Stub"][il_stub_link] since the stub is generated by inserting IL instructions into a stream and then passing that stream to the JIT for compilation.

A consequence of this approach is that marshaling code is not immediately available post-link for AOT scenarios (e.g. [`crossgen`](../../workflow/building/coreclr/crossgen.md) and [`crossgen2`](crossgen2-compilation-structure-enhancements.md)). The immediate unavailability of this code has been mitigated by a complex mechanism to have marshaling code generated at AOT time.

The user experience of the built-in generation initially appears ideal, but there are several negative consequences that make the system costly in the long term:

* Bug fixes in the marshaling system require an update to the entire runtime.
* New types require enhancements to the marshaling system for efficient marshal behavior.
* [`ICustomMarshaler`](https://docs.microsoft.com/dotnet/api/system.runtime.interopservices.icustommarshaler) incurs a substantial performance penalty.
* Once a marshaling bug becomes expected behavior, it is difficult to fix. Users come to rely on shipped behavior, and because the marshaling system is built into the runtime there is no way to choose between the previous and the new behavior.
    * Example involving COM marshaling: https://github.com/dotnet/coreclr/pull/23974.
* Debugging the auto-generated marshaling IL Stub is difficult for runtime developers and close to impossible for consumers of P/Invokes.

This is not to say the P/Invoke system should be completely redesigned. The current system is heavily used and its simplicity for consuming native assets is a benefit. Rather, this new mechanism is designed to provide a way for marshaling code to be generated by an external tool while integrating with `DllImportAttribute` in a way that isn't onerous for current .NET developers.

The [Roslyn Compiler](https://github.com/dotnet/roslyn) team is working on a [Source Generator feature][source_gen_link] that will allow the generation of additional source files that can be added to an assembly during the compilation process - the runtime generation of IL Stubs is an in-memory version of this scenario.

**Note** This proposal is targeted at addressing P/Invoke improvements but could be adapted to work with COM interop utilizing the new [`ComWrappers`][comwrappers_link] API.

### Requirements

* [Source generators][source_gen_link]
    * Branch: https://github.com/dotnet/roslyn/tree/features/source-generators

## Design

Using Source Generators is focused on integrating with existing uses of `DllImportAttribute` from an invocation point of view (i.e. callsites should not need to be updated). The idea behind Source Generators is that code for some scenarios can be precomputed using user declared types and logic, thus avoiding the need to generate code at runtime. The desire is then to provide a way that existing code can continue to function but can be modified in a way that allows it to leverage this new compiler feature.

### P/Invoke Walkthrough

The P/Invoke algorithm is presented below using a simple example.

``` CSharp
/* A */ [DllImportAttribute("Kernel32.dll")]
/* B */ extern static bool QueryPerformanceCounter(out long lpPerformanceCount);
...
long count;
/* C */ QueryPerformanceCounter(out count);
```

At (A) in the above code snippet, the runtime is told to look for an export named `QueryPerformanceCounter` (B) in the `Kernel32.dll` binary. There are many additional properties on the `DllImportAttribute` that help with export discovery and can influence the semantics of the generated IL Stub. Point (C) represents an invocation of the P/Invoke. Most of the work occurs at (C) at runtime, since (A) and (B) are merely declarations the compiler uses to embed the relevant details into assembly metadata that is read at runtime.

1) During invocation, the function declaration is determined to be an external call requiring marshaling. Given the defined properties in the `DllImportAttribute` instance as well as the metadata of the user-defined signature an IL Stub is generated.

    * An IL Stub is always generated when an assembly is compiled in `Debug`. In `Release` builds it is possible no IL Stub is generated and instead the JIT will elide the stub and inline the invocation. We will come back to this inlining below.

2) The runtime attempts to find a binary with the name supplied in `DllImportAttribute`.

    * Discovery of the target binary is complicated and can be influenced by the [`AssemblyLoadContext`](https://docs.microsoft.com/dotnet/api/system.runtime.loader.assemblyloadcontext) and [`NativeLibrary`](https://docs.microsoft.com/dotnet/api/system.runtime.interopservices.nativelibrary) classes.

3) Once the binary is found and loaded into the runtime, it is queried for the expected export name. The name of the attributed function is used by default but this is configurable by the [`DllImportAttribute.EntryPoint`](https://docs.microsoft.com/dotnet/api/system.runtime.interopservices.dllimportattribute.entrypoint) property.

    * This process is also influenced by additional `DllImportAttribute` properties as well as by the underlying platform. For example, the [Win32 API ANSI/UNICODE convention](https://docs.microsoft.com/windows/win32/intl/conventions-for-function-prototypes) on Windows is respected and an `A` or `W` suffix may be appended to the function name if it is not immediately found.

4) The IL Stub is called like any .NET method, but the address of the export is passed to the generated IL Stub via a 'hidden' argument.

5) The IL Stub then marshals arguments as appropriate and invokes the export via the `calli` instruction.

6) Once the export returns control to the IL Stub the marshaling logic cleans up and ensures any returned data is marshaled back out to the calling function.
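
To make steps 2 through 6 concrete, the following is a minimal hand-written sketch of roughly what the runtime does on the caller's behalf. It is not the runtime's implementation. `ManualPInvoke` is a hypothetical name, the sketch assumes Windows, C# 9 function pointers, and a project compiled with unsafe code enabled, and it skips the discovery rules and name probing described above.

``` CSharp
using System;
using System.Runtime.InteropServices;

static unsafe class ManualPInvoke
{
    // Hypothetical helper that mirrors steps 2-6 by hand (Windows only).
    public static bool QueryPerformanceCounter(out long lpPerformanceCount)
    {
        // Step 2: find and load the binary named in DllImportAttribute.
        IntPtr library = NativeLibrary.Load("Kernel32.dll");

        // Step 3: query the loaded binary for the expected export name.
        IntPtr export = NativeLibrary.GetExport(library, "QueryPerformanceCounter");

        // Steps 4 and 5: hand the export address to the call and pass only
        // blittable arguments. Invoking a function pointer compiles to calli.
        var target = (delegate* unmanaged<long*, int>)export;
        long count = 0;
        int result = target(&count);

        // Step 6: marshal the results back out to the managed caller.
        lpPerformanceCount = count;
        return result != 0;
    }
}
```

A caller would guard the call with `OperatingSystem.IsWindows()` and then use `ManualPInvoke.QueryPerformanceCounter(out long ticks)` exactly like the P/Invoke declared at (B).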

### Source Generator Integration

An example of how the previous P/Invoke snippet could be transformed is below. This example uses the API proposed in this document. Source Generators currently have a limitation that we will ignore for now - the inability to modify user-written code. Ignoring it lets us present the optimal user experience; the limitation is discussed later in case it becomes a hard requirement.

`Program.cs` (Pre-generated code)

``` CSharp
/* A */ [DllImportAttribute("Kernel32.dll")]
/* D */ [GenerateNativeImportAttribute]
/* B */ extern static bool QueryPerformanceCounter(out long lpPerformanceCount);
...
long count;
/* C */ QueryPerformanceCounter(out count);
```

Observe point (D), the new attribute. This attribute provides an indication to a Source Generator that the following declaration should be updated.

`Program.cs` (In-memory updated code)

``` CSharp
/* D */ [GenerateNativeImportAttribute]
/* B */ static partial bool QueryPerformanceCounter(out long lpPerformanceCount);
...
long count;
/* C */ QueryPerformanceCounter(out count);
```

During the source generation process the `DllImportAttribute` (A) would be removed. The `GenerateNativeImportAttribute` (D) would remain to provide a metadata indication that the function was auto-generated. Also note that the method declaration has been updated and marked `partial`. The Source Generator would then generate the source for this partial method. The invocation (C) remains unchanged.

`Stubs.g.cs`:

``` CSharp
/* E */ static partial bool QueryPerformanceCounter(out long lpPerformanceCount)
{
    unsafe
    {
        long result = 0;
        bool success = QueryPerformanceCounter(&result) != 0;
        lpPerformanceCount = result;
        return success;
    }
}

[DllImportAttribute("Kernel32.dll")]
/* F */ private static extern int QueryPerformanceCounter(long* lpPerformanceCount);
```

The Source Generator would generate the implementation of the partial method (E) in a separate translation unit (`Stubs.g.cs`). At point (F), the `DllImportAttribute` from the user's original declaration is copied onto a private P/Invoke used only by the generated code. The signature of that P/Invoke would be modified to contain only [blittable types][blittable_link] so that the JIT can inline the invocation. Finally, note that the user's original function signature would remain unchanged to avoid impacting existing callsites.
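
The reason the private P/Invoke at (F) returns `int` rather than `bool` can be seen by comparing managed and marshaled sizes. The sketch below is a rough heuristic only, not a full blittability test, and `BlittableProbe` is a hypothetical name. A type whose marshaled size differs from its managed size cannot be blittable: `bool` is one byte in managed code but is marshaled as a 4-byte value by default, whereas `long` is 8 bytes on both sides.

``` CSharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

static class BlittableProbe
{
    // Hypothetical heuristic: equal sizes are necessary, but not sufficient,
    // for a type to be blittable.
    public static bool HasSameSize<T>() where T : unmanaged
        => Unsafe.SizeOf<T>() == Marshal.SizeOf<T>();
}
```

Under this check `BlittableProbe.HasSameSize<long>()` holds while `BlittableProbe.HasSameSize<bool>()` does not, which matches the signature rewrite shown in `Stubs.g.cs`.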

In this system it is not defined how marshaling of specific types would be performed. The built-in runtime has complex rules for some types, and once shipped these rules become the de facto standard, often regardless of whether the behavior is a bug. The design here is not concerned with how the arguments go from a managed to an unmanaged environment.

### Source Generator constraint mitigations

In the current Source Generator design, modification of any user-written code is not permitted. This includes modification of any non-functional metadata (e.g. attributes). The above design relies on modifying user source, so it may be untenable as written. Let us define the following options and describe how the above design would be modified under each.

* In-memory source modification - This would yield the ideal design above.

* On disk source modification - Altering source on disk would require the `GenerateNativeImportAttribute` to contain all properties from `DllImportAttribute`. This is required since removing the `DllImportAttribute` would be impossible and thus would cause issues during build since the method must also be marked as `partial`.

* No source modification - This aligns with the current Source Generator constraints. It would be possible to create a [Roslyn Analyzer and Code fix](https://github.com/dotnet/roslyn/wiki/Getting-Started-Writing-a-Custom-Analyzer-&-Code-Fix) to aid the developer in converting their `DllImportAttribute` marked functions to use `GenerateNativeImportAttribute`. Furthermore, the function, and potentially the enclosing class, would need to be marked with the `partial` keyword.

## Proposed APIs

The ideal solution with support for minor user defined source updates using Source Generators:

``` CSharp
namespace System.Runtime.InteropServices
{
    /// <summary>
    /// Attribute used to indicate a Source Generator should create a function for marshaling
    /// arguments instead of relying on the CLR to generate an IL Stub at runtime.
    /// </summary>
    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
    public sealed class GenerateNativeImportAttribute : Attribute
    {
    }
}
```

An alternative solution that merges the existing `DllImportAttribute` with the `GenerateNativeImportAttribute`. This option would be used if user source cannot be modified in any manner.

``` CSharp
namespace System.Runtime.InteropServices
{
    /// <summary>
    /// Attribute used to indicate a Source Generator should create a function for marshaling
    /// arguments instead of relying on the CLR to generate an IL Stub at runtime.
    /// </summary>
    [AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
    public sealed class GenerateNativeImportAttribute : Attribute
    {
        /// <summary>
        /// Enables or disables best-fit mapping behavior when converting Unicode characters
        /// to ANSI characters.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.BestFitMapping"/>
        public bool BestFitMapping;

        /// <summary>
        /// Indicates the calling convention of an entry point.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.CallingConvention"/>
        public CallingConvention CallingConvention;

        /// <summary>
        /// Indicates how to marshal string parameters to the method and controls name mangling.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.CharSet"/>
        public CharSet CharSet;

        /// <summary>
        /// Indicates the name or ordinal of the DLL entry point to be called.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.EntryPoint"/>
        public string? EntryPoint;

        /// <summary>
        /// Controls whether the System.Runtime.InteropServices.DllImportAttribute.CharSet
        /// field causes the common language runtime to search an unmanaged DLL for entry-point
        /// names other than the one specified.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.ExactSpelling"/>
        public bool ExactSpelling;

        /// <summary>
        /// Indicates whether unmanaged methods that have HRESULT or retval return values
        /// are directly translated or whether HRESULT or retval return values are automatically
        /// converted to exceptions.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.PreserveSig"/>
        public bool PreserveSig;

        /// <summary>
        /// Indicates whether the callee calls the SetLastError Windows API function before
        /// returning from the attributed method.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.SetLastError"/>
        public bool SetLastError;

        /// <summary>
        /// Enables or disables the throwing of an exception on an unmappable Unicode character
        /// that is converted to an ANSI "?" character.
        /// </summary>
        /// <see cref="System.Runtime.InteropServices.DllImportAttribute.ThrowOnUnmappableChar"/>
        public bool ThrowOnUnmappableChar;
    }
}
```
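
As an illustration of how the merged attribute might be consumed when user source cannot be modified, here is a hedged sketch with both halves written by hand. Several parts are assumptions and not part of the proposal: the attribute is a trimmed stand-in declared in the global namespace, the constructor taking a library name is hypothetical (the listing above does not say how the library name is supplied), and `NativeMethods`, `TryGetTicks`, and `TryGetTicksNative` are invented names. The sketch also assumes C# 9 partial method rules and unsafe code being enabled.

``` CSharp
#nullable enable
using System;
using System.Runtime.InteropServices;

// Trimmed stand-in for the proposed attribute. The constructor is an assumption.
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false, Inherited = false)]
public sealed class GenerateNativeImportAttribute : Attribute
{
    public GenerateNativeImportAttribute(string libraryName) => LibraryName = libraryName;
    public string LibraryName { get; }
    public string? EntryPoint;
}

// User-written half: no DllImportAttribute, only the new attribute.
public static unsafe partial class NativeMethods
{
    [GenerateNativeImport("Kernel32.dll", EntryPoint = "QueryPerformanceCounter")]
    public static partial bool TryGetTicks(out long ticks);
}

// Half a Source Generator might emit, written by hand here.
public static unsafe partial class NativeMethods
{
    public static partial bool TryGetTicks(out long ticks)
    {
        long result = 0;
        bool success = TryGetTicksNative(&result) != 0;
        ticks = result;
        return success;
    }

    // Private P/Invoke with a blittable-only signature, built from the attribute data.
    [DllImport("Kernel32.dll", EntryPoint = "QueryPerformanceCounter")]
    private static extern int TryGetTicksNative(long* lpPerformanceCount);
}
```

Because the user-facing signature is an ordinary partial method, callsites such as `NativeMethods.TryGetTicks(out long ticks)` look the same as with a classic `DllImportAttribute` declaration. The call itself only succeeds on Windows.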

## Questions

* Can the above API be used to provide a reverse P/Invoke stub?

## References

[P/Invoke][pinvoke_link]

[Type Marshaling][typemarshal_link]

[IL Stubs description][il_stub_link]

<!-- Common links -->
[dotnet_link]: https://docs.microsoft.com/dotnet/core/tools/dotnet
[typemarshal_link]: https://docs.microsoft.com/dotnet/standard/native-interop/type-marshaling
[pinvoke_link]: https://docs.microsoft.com/dotnet/standard/native-interop/pinvoke
[comwrappers_link]: https://github.com/dotnet/runtime/issues/1845
[il_stub_link]: https://mattwarren.org/2019/09/26/Stubs-in-the-.NET-Runtime/
[source_gen_link]: https://github.com/dotnet/roslyn/blob/master/docs/features/generators.md
[blittable_link]: https://docs.microsoft.com/dotnet/framework/interop/blittable-and-non-blittable-types