Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite the ClangSharpPInvokeGenerator to allow better reuse and to be more explicit #34

Merged
merged 5 commits into from
Apr 26, 2019

Conversation

tannergooding
Copy link
Member

I imagine there will be some feedback and/or questions that need to be addressed, so feel free 😄


This rewrites the ClangSharpPInvokeGenerator code. For the most part, the rewrite tries to produce code that is nearly identical to the previous codegen.

An overview is:

  • The visitation logic is centralized in CursorVisitor and extension points BeginHandle and EndHandle are exposed, to make it easy for someone to handle or skip particular nodes and to execute their own code.
  • The file generation logic is centralized in CursorWriter. It takes a Func<CXCursor, bool> predicate that allows a user to indicate if they want to process or skip a given node. The visitation logic should account for nested types and for nodes that always need to be traversed
  • Both the traversal and generation logic try to be explicit as possible. This means there are a lot of switch statements and the fallback behavior is generally to log a message and break the debugger. This was done to make it very easy to know what has and hasn't been considered. It also makes it easier to break out individual logic or paths where needed.

The notable changes are:

  • The visitation logic now considers Location.IsInMainFile, this ultimately means the order is slightly different and that we don't visit #includes. This changes the output order slightly but also means we don't need to keep track of already visited nodes.
  • We return uint for uint rather than int
  • We no longer prefix @ to everything, and instead only prefix it to reserved C# keywords
  • We better follow typedefs and resolve the correct type to output. For example, we will now write out public CXCompletionString CompletionString rather than public IntPtr @CompletionString. The latter (IntPtr) will still be done if there is a double indirection.
  • We no longer explicitly declarate an enum as : int, since that is the default
  • We output IntPtr rather than ulong for size_t, to make the code produced more portable

Potential Future changes:

  • The writer should ideally be extended to allow writing a type per file, this makes it easier to parse and compare deltas
  • The writer should ideally be extended to allow writing "unsafe" code, rather than trying to create wrapper types or use managed types that require marshaling
    • Wrapper types don't always work, they sometimes break functionality based on the underlying calling convention or ABI
    • Modern .NET has a lot of helper types, like Span<T>, that make it much easier to work with unsafe types
    • Purely unsafe/blittable code has less overhead, since there is no marshaling required
  • Allow more user customization of the output. This would allow options like the above (single vs multi-file generation and safe vs unsafe generation) and others like AnyCPU vs Platform Specific generation or Windows vs Unix generation (for handling the long is IntPtr vs int problem)


switch (name)
{
case "size_t":
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully a better can be solved to handle this and other similar types... Potentially could check for both regular and -m32 mode and if it differs chooise IntPtr, that way it is more general....


namespace ClangSharpPInvokeGenerator
{
internal static class CXTypeExtensions
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the writer is extended to support user-configuration, these might need to either take additional parameters or become instance methods on the writer... The later might be better since they are more specific to the writer than to the actual interop types.


case CXTypeKind.CXType_Char_S:
{
return "MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(StringMarshaler))";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be LPUTF8Str if we said net48, netstandard2.1, and netcoreapp10 are the baseline


case CXCursorKind.CXCursor_ParmDecl:
{
if (cursor.Type.GetParmModifier(cursor).Equals("out"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is probably a better way to handle this combination...

}
}

public static string GetParmModifierForPointeeType(this CXType type, CXCursor cursor, CXType pointeeType)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could also support in and ref for "safe" code. Part of the problem, however, is that if attributes don't exist we are just guessing and will often get it wrong....

@@ -158,31 +158,38 @@ public static void Main(string[] args)
sw.WriteLine(" using System.Runtime.InteropServices;");
sw.WriteLine();

var structVisitor = new StructVisitor(sw);
var structDeclWriter = new CursorWriter(sw, indentation: 1, (cursor) => cursor.Kind == CXCursorKind.CXCursor_StructDecl);
Copy link
Member Author

@tannergooding tannergooding Apr 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could potentially expose a method that takes the predicate, rather than the constructor, so you can reuse the instance across multiple filters...

sw.WriteLine($" private const string libraryPath = \"{libraryPath}\";");
sw.WriteLine();

functionDeclWriter.PrefixStrip = prefixStrip;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally there would be a better way to pass this, and other options, in. Potentially a CursorWriterOptions type....

@tannergooding
Copy link
Member Author

I'll get a fix up that resolves the couple issues I called out in an hour or two.

@mjsabby
Copy link
Contributor

mjsabby commented Apr 25, 2019

@tannergooding Sounds good. Now that it's SDK style, I'm thinking we can make it all nice like other dotnet repos with Azure DevOps and analyzers! I did want to ask you about if we were to make these signed nuget packages what secret sauce the dotnet repos have. We can have that conversation offline if need be.

@tannergooding
Copy link
Member Author

Dependency on S.R.CompilerServices.Unsafe was removed.

Fixed the launchSettings.json to include a Unix profile and to not be machine dependent. It still makes an assumption about the location of LLVM and assumes it is running from the current output directory (which is how VS runs it for F5).

Now that it's SDK style, I'm thinking we can make it all nice like other dotnet repos with Azure DevOps and analyzers!

That would be nice. It could potentially be use dotnet/arcade for the infrastructure (which would give it everything other repos like Roslyn or CoreFX have availalbe) or it could go with something smaller and more targeted... NuGet signing I'm not as sure about, but I know the right people to talk to.

@mjsabby
Copy link
Contributor

mjsabby commented Apr 26, 2019

The F5 is important so that's good. I'm wondering if we could do a download of LLVM-* and have it be there as part of the build. It's cached for most scenarios, but it'll allow us the latest and greatest.

I'm now eyeing how we can improve LLVMSharp with this rewrite. I've been bogged down with issues in the generator and now at least somethings seem better that it could be 100% generated without modifications.

I did have a question. It is my understanding that a single IntPtr in a struct for marshalling can't be ABI incompatobe. But you said it is, are there examples?

@tannergooding
Copy link
Member Author

I'm wondering if we could do a download of LLVM-* and have it be there as part of the build. It's cached for most scenarios, but it'll allow us the latest and greatest.

It might be possible, but it's also a 350 MB tarball which might be more than some people expect (especially if they already have it locally). Probably worth some further thought to determine the right behavior...

It is my understanding that a single IntPtr in a struct for marshalling can't be ABI incompatobe. But you said it is, are there examples?

Struct wrappers are generally not a problem for most ABIs. However, __thiscall in particular can be problematic (there may be others as well, but I don't know of any specifics) since it defines different rules for returning a struct and returning a primitive. Full Framework and .NET Core prior to 3.0 had a bug that allowed this to "just work", but that has been fixed for netcoreapp3.0 and will now do the correct thing in P/Invoke scenarios (but preserves the old bug for the COM scenario).

@tannergooding
Copy link
Member Author

For reference on the __thiscall scenario in particular, you can see dotnet/coreclr#23974 and related PRs/issues

@tannergooding
Copy link
Member Author

I'm now eyeing how we can improve LLVMSharp with this rewrite. I've been bogged down with issues in the generator and now at least somethings seem better that it could be 100% generated without modifications.

That's Great!

I'm going to try out a few more libraries after this is merged and try to ensure more scenarios are covered as well as there are likely plenty of cursor types that still need to be handled.

@mjsabby
Copy link
Contributor

mjsabby commented Apr 26, 2019

Are you waiting for me to merge? Or is there is more to be done here?

@tannergooding
Copy link
Member Author

Nothing more to be done here. Was waiting to ensure nothing else came up from the conversation above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants