Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regenerate LLVMSharp using unsafe bindings. #105

Merged
merged 6 commits into from
Jun 12, 2019

Conversation

tannergooding
Copy link
Member

@tannergooding tannergooding commented Jun 11, 2019

This is the same as dotnet/ClangSharp#69.

This does, as a part of that, drop support for the LLVMSharp.API namespace.

I do think that having better bindings (that more closely mirror the LLVM C++ API) is very useful (I did something similar in ClangSharp here: https://github.com/microsoft/ClangSharp/tree/master/sources/ClangSharp.PInvokeGenerator/Types).

However, with the regeneration here, practically the entire LLVMSharp.API namespace needs to be rewritten as it doesn't support unsafe code and was duplicating most of the logic that exists in the "simple" wrappers.

Ideally, the LLVMSharp.API would be rewritten and it would, itself, effectively be a minimal wrapper over the wrapper types (which themselves just abstract away the unsafe code) and it would have a caching layer to avoid a bunch of short lived allocations when traversing the types.

@tannergooding
Copy link
Member Author

I'm fine with holding off on merging this until the LLVMSharp.API namespace can be rewritten. But I don't think I will get to it in the immediate, as I am still focusing on ClangSharp right now.

@TChatzigiannakis
Copy link
Contributor

I added the LLVMSharp.API to begin with, and to be honest I'll feel relieved if we can find a more stable solution for keeping it up to date.

@tannergooding
Copy link
Member Author

As the person who added it to start, are you comfortable with the APIs being removed for the time being and being replaced with a lighter-weight wrapper in some future PR?

I believe most of the functionality, modulo mirroring the C++ type hierarchy, is exposed in the lower-level wrapper types (e.g. LLVMModuleRef, LLVMBuilderRef, LLVMTypeRef, etc) now.

A future PR could then expose the higher-level class wrappers again, but this time calling into the low-level wrappers for the supported functionality. This would essentially give you three levels of use:

  • The raw LLVM API, which works directly with unsafe code
    • This API should have minimal overhead if consumed (no marshaling, etc)
  • A slightly abstracted API which exposes the methods via the appropriate wrapper types (such as LLVMBuilderRef rather than LLVMOpaqueBuilder*) and which takes "safe" managed types
    • This is needed, rather than exposing "safe" overloads for the raw API since the raw API takes pointers and we want to make it slightly more type safe for the pointer types.
    • This has some minimal overhead with regards to pinning or marshaling where appropriate
  • A high level API which tries to mirror the C++ type hierarchy. This provides even more safety and allows you to properly differentiate between a LLVMTypeRef that is a Function and one that is an Integer (for example)
    • This is really just a wrapper over the previous wrapper and takes care of the the type hierarchy + caching so that you don't have a lot of short-lived object allocations

@mjsabby
Copy link
Contributor

mjsabby commented Jun 12, 2019

@tannergooding If @TChatzigiannakis is good with this then we're good, because it hasn't got a lot of traction due to the infrequent (or none I think) releases of LLVMSharp in the intervening time.

If @TChatzigiannakis is not unhappy with this, then I'm certainly not going to stop you.

@TChatzigiannakis
Copy link
Contributor

@tannergooding @mjsabby I'm definitely comfortable with that, and your plan sounds very good to me.

Regarding the current LLVMSharp.API namespace, the reason I added it in the first place was that I didn't want to touch the generated code too much, but I still needed to:

  • Have more information when debugging (instead of seeing a handle) and being able to navigate using the properties.
  • Have an API that is more type safe.
  • Consume the library in a way that doesn't feel unnatural in C# (e.g. be able to use foreach or LINQ for some properties, instead of following a reference to the first object and then the next object, and so on).
  • Take advantage of the using construct to dispose the unmanaged structures.

However:

  • Because it was handwritten by consulting the LLVM source code, I unfortunately got a few things wrong (for example, the method that emits a malloc should have had a more general type).
  • The IDisposable abstraction is leaky (there is no guarantee that there will always be exactly one class instance owning a handle; it's up to the user to keep it that way, and proper usage is not documented anywhere).
  • The placement of some things was awkward.
  • A lot of boxing and unboxing takes place (although it's unclear to me how much performance matters for LLVMSharp, but still there's no reason why we shouldn't have something efficient).

So, from a user perspective, I would like to eventually have the "good" parts of it (either in the C++-like API or in the marshalled API), without the shortcomings that we currently have.

(And, of course, let me know if I can help in any way!)

@tannergooding
Copy link
Member Author

Thanks @TChatzigiannakis.

It's worth noting that for the simple-wrapper, a lot of the "niceties" you wanted will still be supported with the simple wrapper.

For example, the simple wrapper types implement IDisposable (so you can use it with using) and the "safe" APIs return Span<T> or T[] where possible (so you can use with foreach or, for the T[] case treat as IEnumerable).

(And, of course, let me know if I can help in any way!)

If you wanted to help with redoing the higher level wrapper, I'd be happy to help review/etc.

If you look at ClangSharp, you'll see I did the three tier system I talked about above (note to self: I should probably move the last tier into ClangSharp proper as well, rather than having it as internal to the PInvokeGenerator).

Namely, you have the raw API in sources/ClangSharp/Generated and then a minimal wrapper over that in sources/ClangSharp/Extensions (still not quite sure how the namespaces/folders should be split, but I don't think naming the folder Extensions is correct long term, and I'm not sure if they should live in the root namespace or a nested one).

The minimal wrapper uses structs, the "type-safe" C names (e.g. CXCursor rather than void*), and exposes most relevant raw APIs as instance or static methods. It also deals with some of the simple C# concepts, like GetHashCode, ToString, or Equals and implements IEquatable<T> and IDisposable where appropriate.

There is then the "one step higher" wrapper in sources/ClangSharp.PInvokeGenerator/Cursors which basically wraps the previous wrapper. It, however, uses classes and exposes types based on the CXCursorKind. It also tries to more closely mirror the C++ Clang API where possible (e.g. we have a Decl and NamedDecl type, even though there is no explicit Decl or NamedDecl cursor kind).

Given that this final tier is using classes, it also has some logic when creating instances to cache things (so we don't create a new instance every time you want to retrieve a property) and for splitting out methods/properties to only be exposed where they make sense (e.g. getting the ReturnType for a FieldDecl doesn't make sense, but it does make sense for a FunctionDecl).
-- There is also likely some cleanup to be done there, like changing the caching logic to not keep instances around forever/etc, but since it is centralized, that is easy enough to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants