Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extremely basic Vtable generation #2145

Merged
merged 6 commits into from
Jan 29, 2022
Merged

Conversation

DrChat
Copy link
Contributor

@DrChat DrChat commented Dec 24, 2021

This is an extremely naïve and basic implementation for generating C++ vtable bindings.
Vtables are only generated if the class with the virtual methods has no base classes, and the virtual tables themselves are laid out as done so with MSVC.

Below is an immediate todo list. I'd appreciate tips or pointers on how to address each :)

  • Address how to write tests for platform-specific outputs
  • Generate vtables differently depending on the target platform
    • Unnecessary at this early stage; both GCC and MSVC emit the same vtables.
  • Handle compiler-specific virtual destructor layouts
    • GCC uses two slots for a destructor (why?)
    • Destructor ordering relative to other methods is lost in translation
  • Address ergonomics of calling virtual functions for rust-bindgen users.
Sample Output

class PureVirtualIFace {
public:
    virtual void Foo() = 0;
    virtual void Bar(unsigned int) = 0;
};

class AnotherInterface {
public:
    virtual void Baz() = 0;
};

class Implementation : public PureVirtualIFace {
public:
    void Foo() override {}
    void Bar(unsigned int) override {}
};

class DoubleImpl : public PureVirtualIFace, public AnotherInterface {
public:
    void Foo() override {}
    void Bar(unsigned int) override {}

    void Baz() override {}
};
/* automatically generated by rust-bindgen 0.59.2 */

#[repr(C)]
pub struct PureVirtualIFace__bindgen_vtable {
    pub PureVirtualIFace_Foo: unsafe extern "C" fn(this: &mut PureVirtualIFace),
    pub PureVirtualIFace_Bar: unsafe extern "C" fn(
        this: &mut PureVirtualIFace,
        arg1: ::std::os::raw::c_uint,
    ),
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct PureVirtualIFace {
    pub vtable_: *const PureVirtualIFace__bindgen_vtable,
}
#[test]
fn bindgen_test_layout_PureVirtualIFace() {
    assert_eq!(
        ::std::mem::size_of::<PureVirtualIFace>(),
        8usize,
        concat!("Size of: ", stringify!(PureVirtualIFace))
    );
    assert_eq!(
        ::std::mem::align_of::<PureVirtualIFace>(),
        8usize,
        concat!("Alignment of ", stringify!(PureVirtualIFace))
    );
}
#[repr(C)]
pub struct AnotherInterface__bindgen_vtable {
    pub AnotherInterface_Baz: unsafe extern "C" fn(this: &mut AnotherInterface),
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct AnotherInterface {
    pub vtable_: *const AnotherInterface__bindgen_vtable,
}
#[test]
fn bindgen_test_layout_AnotherInterface() {
    assert_eq!(
        ::std::mem::size_of::<AnotherInterface>(),
        8usize,
        concat!("Size of: ", stringify!(AnotherInterface))
    );
    assert_eq!(
        ::std::mem::align_of::<AnotherInterface>(),
        8usize,
        concat!("Alignment of ", stringify!(AnotherInterface))
    );
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct Implementation {
    pub _base: PureVirtualIFace,
}
#[test]
fn bindgen_test_layout_Implementation() {
    assert_eq!(
        ::std::mem::size_of::<Implementation>(),
        8usize,
        concat!("Size of: ", stringify!(Implementation))
    );
    assert_eq!(
        ::std::mem::align_of::<Implementation>(),
        8usize,
        concat!("Alignment of ", stringify!(Implementation))
    );
}
#[repr(C)]
#[derive(Debug, Copy, Clone)]
pub struct DoubleImpl {
    pub _base: PureVirtualIFace,
    pub _base_1: AnotherInterface,
}
#[test]
fn bindgen_test_layout_DoubleImpl() {
    assert_eq!(
        ::std::mem::size_of::<DoubleImpl>(),
        16usize,
        concat!("Size of: ", stringify!(DoubleImpl))
    );
    assert_eq!(
        ::std::mem::align_of::<DoubleImpl>(),
        8usize,
        concat!("Alignment of ", stringify!(DoubleImpl))
    );
}

VTable layouts

MSVC:

class PureVirtualIFace	size(8):
	+---
 0	| {vfptr}
	+---

PureVirtualIFace::$vftable@:
	| &PureVirtualIFace_meta
	|  0
 0	| &PureVirtualIFace::Foo 
 1	| &PureVirtualIFace::Bar 

class AnotherInterface	size(8):
	+---
 0	| {vfptr}
	+---

AnotherInterface::$vftable@:
	| &AnotherInterface_meta
	|  0
 0	| &AnotherInterface::Baz 


class Implementation	size(8):
	+---
 0	| +--- (base class PureVirtualIFace)
 0	| | {vfptr}
	| +---
	+---

Implementation::$vftable@:
	| &Implementation_meta
	|  0
 0	| &Implementation::Foo 
 1	| &Implementation::Bar 


class DoubleImpl	size(16):
	+---
 0	| +--- (base class PureVirtualIFace)
 0	| | {vfptr}
	| +---
 8	| +--- (base class AnotherInterface)
 8	| | {vfptr}
	| +---
	+---

DoubleImpl::$vftable@PureVirtualIFace@:
	| &DoubleImpl_meta
	|  0
 0	| &DoubleImpl::Foo 
 1	| &DoubleImpl::Bar 

DoubleImpl::$vftable@AnotherInterface@:
	| -8
 0	| &DoubleImpl::Baz 

GCC:

Vtable for PureVirtualIFace
PureVirtualIFace::_ZTV16PureVirtualIFace: 4 entries
0     (int (*)(...))0
8     (int (*)(...))(& _ZTI16PureVirtualIFace)
16    (int (*)(...))__cxa_pure_virtual
24    (int (*)(...))__cxa_pure_virtual

Class PureVirtualIFace
   size=8 align=8
   base size=8 base align=8
PureVirtualIFace (0x0x7f12e99aa420) 0 nearly-empty
    vptr=((& PureVirtualIFace::_ZTV16PureVirtualIFace) + 16)

Vtable for AnotherInterface
AnotherInterface::_ZTV16AnotherInterface: 3 entries
0     (int (*)(...))0
8     (int (*)(...))(& _ZTI16AnotherInterface)
16    (int (*)(...))__cxa_pure_virtual

Class AnotherInterface
   size=8 align=8
   base size=8 base align=8
AnotherInterface (0x0x7f12e99aa480) 0 nearly-empty
    vptr=((& AnotherInterface::_ZTV16AnotherInterface) + 16)

Vtable for Implementation
Implementation::_ZTV14Implementation: 4 entries
0     (int (*)(...))0
8     (int (*)(...))(& _ZTI14Implementation)
16    (int (*)(...))Implementation::Foo
24    (int (*)(...))Implementation::Bar

Class Implementation
   size=8 align=8
   base size=8 base align=8
Implementation (0x0x7f12e98561a0) 0 nearly-empty
    vptr=((& Implementation::_ZTV14Implementation) + 16)
  PureVirtualIFace (0x0x7f12e99aa4e0) 0 nearly-empty
      primary-for Implementation (0x0x7f12e98561a0)

Vtable for DoubleImpl
DoubleImpl::_ZTV10DoubleImpl: 8 entries
0     (int (*)(...))0
8     (int (*)(...))(& _ZTI10DoubleImpl)
16    (int (*)(...))DoubleImpl::Foo
24    (int (*)(...))DoubleImpl::Bar
32    (int (*)(...))DoubleImpl::Baz
40    (int (*)(...))-8
48    (int (*)(...))(& _ZTI10DoubleImpl)
56    (int (*)(...))DoubleImpl::_ZThn8_N10DoubleImpl3BazEv

Class DoubleImpl
   size=16 align=8
   base size=16 base align=8
DoubleImpl (0x0x7f12e99bf000) 0
    vptr=((& DoubleImpl::_ZTV10DoubleImpl) + 16)
  PureVirtualIFace (0x0x7f12e99aa600) 0 nearly-empty
      primary-for DoubleImpl (0x0x7f12e99bf000)
  AnotherInterface (0x0x7f12e99aa660) 8 nearly-empty
      vptr=((& DoubleImpl::_ZTV10DoubleImpl) + 56)

For reference, you can check the Vtable layouts on GCC/Clang/MSVC with these compiler commands.
Note that for best results, the classes must be used at least once in code (to prevent DCE).

This PR addresses #27.

@emilio
Copy link
Contributor

emilio commented Dec 29, 2021

Thanks for working on this!

This is an extremely naïve and basic implementation for generating C++ vtable bindings. Vtables are only generated if the class with the virtual methods has no base classes, and the virtual tables themselves are laid out as done so with MSVC.

Below is an immediate todo list. I'd appreciate tips or pointers on how to address each :)

* [ ]  Address how to write tests for platform-specific outputs

Presumably you mean target-specific outputs right? If so that should work already by passing the right -target args to bindgen. We have some like headers/win32-thiscall_nightly.hpp that targets win32, etc.

* [ ]  Generate vtables differently depending on the target platform

The BindgenContext should have the right information already, see is_target_wasm32 or target_pointer_size, for example.

* [ ]  Address ergonomics of calling virtual functions for `rust-bindgen` users.

We could generate intrinsic methods that go to the vtable or something, but for now just having the virtual methods is a huge improvement.

@DrChat
Copy link
Contributor Author

DrChat commented Dec 29, 2021

Thanks for the tips!
Looking at the todo list some more, it appears that there are no platform-specific differences uncovered by this basic implementation (both GCC and MSVC generate the same virtual function tables).

I would say this is ready to go, however it seems that the call to function_item.process_before_codegen(...) is problematic.
From running the tests, it looks like calling this function will mark the respective C++ functions as "seen", which means they won't be generated again (such as if a function is both virtual and implemented).

In addition, it'd be nice to get a canonical name for these functions that do not include the classname, as the classname is already implicit for the vtable structure, e.g:

#[repr(C)]
pub struct C__bindgen_vtable {
    C_do_thing: fn(&mut self, arg1: ::std::os::raw::c_char),
    C_do_thing1: fn(&mut self, arg1: ::std::os::raw::c_int),
}
// FIXME: Is there a canonical name without the class prepended?
let function_name = function_item.canonical_name(ctx);

Any thoughts on how to resolve these problems?

@emilio
Copy link
Contributor

emilio commented Dec 30, 2021

Thanks for the tips! Looking at the todo list some more, it appears that there are no platform-specific differences uncovered by this basic implementation (both GCC and MSVC generate the same virtual function tables).

The only differences I'm aware of are related to overloaded functions. Leaving that as a TODO / TBD seems fair.

I would say this is ready to go, however it seems that the call to function_item.process_before_codegen(...) is problematic. From running the tests, it looks like calling this function will mark the respective C++ functions as "seen", which means they won't be generated again (such as if a function is both virtual and implemented).

If we're generating vtables at all, process_before_codegen shouldn't be involved whatsoever I'd think. If we skip generating a vtable entry (for any reason) that would break all the following vtable entries, right? They'd end up calling the wrong function or crashing in awful ways.

In addition, it'd be nice to get a canonical name for these functions that do not include the classname, as the classname is already implicit for the vtable structure, e.g:

#[repr(C)]
pub struct C__bindgen_vtable {
    C_do_thing: fn(&mut self, arg1: ::std::os::raw::c_char),
    C_do_thing1: fn(&mut self, arg1: ::std::os::raw::c_int),
}

From looking at that code, two nits:

  • Does fn(&mut self) really work? shouldn't it be fn(self: &mut C)? Maybe not a huge deal if we somehow make calling those convenient.
  • Should the struct member be public so it can actually be used?

Re. the function name, does Function::name() not work? I think that's pretty much what you want.

Another thing that we need to get right is probably virtual destructors... Any non-trivial use of virtual methods involves virtual destructors, so generating broken vtables for such classes would be pretty bad. It's ok to merge most of the code as-is behind a flag / default-off or something though, so we can iterate on it.

@DrChat
Copy link
Contributor Author

DrChat commented Dec 30, 2021

Does fn(&mut self) really work? shouldn't it be fn(self: &mut C)? Maybe not a huge deal if we somehow make calling those convenient.

Ah yes, good point! We'd want &mut C because it doesn't make sense to pass the VTable pointer (even if technically correct).

Should the struct member be public so it can actually be used?

Also true - let me fix that!

Re. the function name, does Function::name() not work? I think that's pretty much what you want.

It works except for overloaded functions, where multiple functions will be generated with the same name. canonical_name was convenient because it appended a number on its own, but it prepends the classname as well.

Another thing that we need to get right is probably virtual destructors... Any non-trivial use of virtual methods involves virtual destructors, so generating broken vtables for such classes would be pretty bad. It's ok to merge most of the code as-is behind a flag / default-off or something though, so we can iterate on it.

Yeah, I need to look at virtual destructors some more. It looks like they aren't being emitted for some reason.

@DrChat
Copy link
Contributor Author

DrChat commented Dec 30, 2021

It also appears that virtual destructor functions are not showing up in the self.methods list.

@DrChat DrChat marked this pull request as ready for review December 31, 2021 01:45
@DrChat
Copy link
Contributor Author

DrChat commented Dec 31, 2021

Alright, this should be ready to review now.
This code as-is will only generate vtables for the most basic cases, and the ergonomics of accessing the vtables have not been addressed.
Basic testing on both Windows and Linux with C++/Rust interop seems to be successful at this stage, though I'd expect this to break under any amount of stress testing.

@DrChat
Copy link
Contributor Author

DrChat commented Jan 2, 2022

rust-vtable.zip

To add, this is a simple project you can use to play around and validate the VTable generation as part of this PR.
You'll need to check out this PR and have it in ../rust-bindgen.

Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, sorry for the lag. One question is whether we should enable this by default or put it behind a runtime flag for now, since we expect various edge-casey and not-so-edge-casey things to be potentially broken, and as-is it's not really ergonomic to use.

I tend to think we should (even if we enable it for tests or what not). I'm happy to do the work if so though.

@emilio
Copy link
Contributor

emilio commented Jan 29, 2022

Let's get this merged, I'll add a flag in a follow-up. Thanks so much!

@emilio emilio merged commit be6d07a into rust-lang:master Jan 29, 2022
@DrChat
Copy link
Contributor Author

DrChat commented Jan 29, 2022

Awesome, thanks for the review!

And yeah, I agree that this should probably be gated behind a flag.
Without a doubt there will be bugs (man, C++ vtable logic is complicated now that I'm looking at it from the outside).

@Tazdevil971
Copy link

Tazdevil971 commented Feb 11, 2022

Great feature! Maybe I'm a bit too late, but what was the rationale behind using regular references instead of pointers for the this parameter?

At least in my use case, where vtables entries will be called directly from C++, this can easily cause mutable aliasing in multithreaded code, which is probably fine, but a bit too close to UB in my opinion.

I think having a simple pointer is safer, as library writers can pick what reference type suites it the most. For example I would probably keep it immutable, just to be safe.

(BTW, I've been using it for the past couple of days, and everything seems to be running perfectly out of the box)

Edit: Ok I've created a new issue detailing the problem, #2163

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants