How to avoid bindgen being out of sync with cc? #2962

Sympatron opened this issue Oct 25, 2024 · 3 comments

Comments

@Sympatron

Sympatron commented Oct 25, 2024

My understanding is that it is very common to wrap a C library in a Rust *-sys crate, using bindgen to generate the Rust bindings automatically and cc to compile the C code.

As far as I understand it, bindgen uses libclang to parse the C headers, while cc uses whatever compiler the user provides.

This can lead to problems, because the two toolchains need not agree on everything. While cross-compiling for thumbv6m-none-eabi on Windows, bindgen generated u32 for enums, whereas the compiler invoked by cc used u8 when possible (-fshort-enums).
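To make the mismatch concrete, here is a minimal Rust sketch of the two views of the same C enum. All type names are hypothetical, and the `repr` attributes only stand in for what each toolchain is reported to choose in this thread; this is not actual bindgen output.

```rust
use std::mem::size_of;

// How the bindings see the enum: libclang's int-sized default.
#[repr(u32)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum TestAsBindgen {
    Hello = 1,
}

// How the compiled C code sees it under -fshort-enums.
#[repr(u8)]
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum TestAsShortEnum {
    Hello = 1,
}

// Hypothetical structs embedding the enum, to show that the
// disagreement also shifts the offsets of later fields.
#[repr(C)]
#[allow(dead_code)]
struct WithBindgenEnum {
    kind: TestAsBindgen,
    flag: u8,
}

#[repr(C)]
#[allow(dead_code)]
struct WithShortEnum {
    kind: TestAsShortEnum,
    flag: u8,
}

/// Returns (size as bindgen assumed, size as the C code uses) for the enum.
#[allow(dead_code)]
fn enum_sizes() -> (usize, usize) {
    (size_of::<TestAsBindgen>(), size_of::<TestAsShortEnum>())
}

/// Returns the same pair for the structs that embed the enum.
#[allow(dead_code)]
fn struct_sizes() -> (usize, usize) {
    (size_of::<WithBindgenEnum>(), size_of::<WithShortEnum>())
}
```

Because the two sides disagree on both the size of the enum and the layout of any struct containing it, passing such values across the FFI boundary reads or writes the wrong bytes without any compile-time error.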

This is quite unfortunate and hard to catch. Is there a common workaround for this?

PS: If there is a better place for this issue, please tell me.

@pvdrz
Contributor

pvdrz commented Oct 25, 2024

The easiest option I can think of is to set the relevant environment variables for clang-sys and use the same paths when calling Build::compiler.

The docs for Build::compiler also say that the compiler is detected automatically from a number of environment variables, so setting CC (or similar) consistently with the clang-sys environment variables might work as well.
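A complementary idea, separate from the environment-variable suggestion above, is to compute one list of C flags in build.rs and hand it to both tools. The sketch below is std-only; `shared_c_flags` is a hypothetical helper, and the rule "bare-metal ARM targets get -fshort-enums" is an assumption that only holds if the C code really is built with short enums (as with the GCC behaviour described in this thread).

```rust
/// Hypothetical helper: C flags that both libclang (via bindgen) and the
/// C compiler (via cc) should receive, so they agree on enum layout.
#[allow(dead_code)]
fn shared_c_flags(target: &str) -> Vec<String> {
    let mut flags = Vec::new();
    // Assumption: on *-none-eabi targets the C side uses short enums,
    // so libclang is told explicitly to do the same.
    if target.contains("-none-eabi") {
        flags.push("-fshort-enums".to_string());
    }
    flags
}

// Intended use inside build.rs (shown as comments because it needs the
// bindgen and cc crates; "wrapper.h" is a placeholder header name):
//
//   let target = std::env::var("TARGET").unwrap();
//   let flags = shared_c_flags(&target);
//   let builder = bindgen::Builder::default()
//       .header("wrapper.h")
//       .clang_args(&flags);
//   let mut build = cc::Build::new();
//   for f in &flags {
//       build.flag(f.as_str());
//   }
```

Keeping the flag list in one place means a later change to the C build cannot silently diverge from what bindgen parsed.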

@geeklint

I also recently ran into incorrect generated bindings caused by enum size differences, again on a thumbX-none-eabi project. In my case I wasn't using cc; instead I was linking against code compiled separately, as a step outside the cargo build.

I had this idea for a "this would've prevented my mistake" feature: when bindgen encounters an enum, if it detects that the target platform is one where common C compilers disagree on the size of enums [1] and you haven't manually specified -fshort-enums or -fno-short-enums, it could issue a warning to double-check that the bindings are correct.

A warning seemed like the most useful UX here because, at least for my use case, bindgen couldn't know that I was planning to link with code compiled by gcc, but the warning would've prompted me to figure out that the bindings were wrong before I learned so the hard way.

Footnotes

  1. likely most critically: -none-eabi arm platforms
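The warning condition described above could be sketched roughly as follows. This is not bindgen code: `should_warn_about_enum_size` is a hypothetical function, and matching on the substring "-none-eabi" is an assumed stand-in for "a target where common C compilers disagree on enum size".

```rust
/// Hypothetical check: returns true when the target is one where enum size
/// is ambiguous and the user has not stated a choice explicitly.
#[allow(dead_code)]
fn should_warn_about_enum_size(target: &str, clang_args: &[&str]) -> bool {
    // Assumption: bare-metal ARM EABI targets are the ambiguous ones.
    let ambiguous_target = target.contains("-none-eabi");
    // An explicit flag in either direction counts as a deliberate choice.
    let explicit_choice = clang_args
        .iter()
        .any(|arg| *arg == "-fshort-enums" || *arg == "-fno-short-enums");
    ambiguous_target && !explicit_choice
}
```

With this shape, passing either flag silences the warning, so users who have already checked their toolchain are not nagged.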

@MaulingMonkey

MaulingMonkey commented Jan 19, 2025

> likely most critically: -none-eabi arm platforms

ARM defers enum ABI choice to platform ABI, but of course *-none-* has no platform ABI. As @geeklint discovered (see: godbolt & Rust GameDev Discord Discussion), GCC chooses a 1-byte repr, Clang chooses a 4-byte repr, for a simple:

```c
typedef enum { Hello = 1 } Test;
```

Solutions that have run through my mind include:

  1. Making bindgen (optionally?) generate a bindings.cpp to feed to cc, full of static_asserts validating alignment/size/signedness/???. Probably the best option for catching differences between bindgen and cc, although it doesn't necessarily do anything to fix caught issues. I've done similar for manually generated FFI and it's helped.

  2. For enums specifically, making a miserable pile of cc-driven types and using that to implement bindings.rs instead of core::ffi::c_int etc. - I implemented the first 90% of this as abienum, although the build.rs probably needs to export metadata to support #[repr(u32)] enum Test { ... } style enums. The second 90%, actually modifying bindgen to use something like abienum, is left as an exercise to the reader. The third 90% has yet to be identified. This also doesn't help if you don't specify ${CC} to tell cc to use GCC when linking prebuilt libs that were built with GCC.

  3. Trying to upstream core::ffi::c_enum_* somehow. This would require making rustc aware of ${CC} / ${CFLAGS} / ???, since compilers for the "same" target disagree on layout, which seems like something that would have a lot of pushback (although I could be wrong.)

  4. Taint generated enums such that they're improper_ctypes somehow, at least on unknown/none style platforms.
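As a rough illustration of the first option in the list, the sketch below generates C++ `static_assert` lines from the layouts the bindings assume. Everything here is hypothetical (`ExpectedLayout`, `static_asserts`, the message wording); bindgen does not currently emit such a file, and signedness checks are left out.

```rust
/// Layout that the generated bindings assume for one C type
/// (hypothetical struct, for illustration only).
#[allow(dead_code)]
struct ExpectedLayout<'a> {
    c_type: &'a str,
    size: usize,
    align: usize,
}

/// Renders one size assertion and one alignment assertion per type,
/// as C++11 source text to be compiled by the real C/C++ compiler.
#[allow(dead_code)]
fn static_asserts(layouts: &[ExpectedLayout]) -> String {
    let mut out = String::new();
    for l in layouts {
        out.push_str(&format!(
            "static_assert(sizeof({0}) == {1}, \"size of {0} differs from bindings\");\n",
            l.c_type, l.size
        ));
        out.push_str(&format!(
            "static_assert(alignof({0}) == {1}, \"alignment of {0} differs from bindings\");\n",
            l.c_type, l.align
        ));
    }
    out
}
```

A build script could write this text to a file after an `#include` of the wrapped header and compile it with cc; if the compiler disagrees with the bindings (for example a 1-byte `Test` where 4 bytes were assumed), the build fails instead of producing a silently wrong ABI.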
