Rewrite the core of the binding generator.
TL;DR: The binding generator is a mess right now. At first it was fun (in a
"this is challenging" sense) to improve on it, but this is not sustainable.

The truth is that the current architecture of the binding generator is a huge
pile of hacks, so these few days I've been working on rewriting it with a few
goals.

 1) Have the hacks as contained and identified as possible. They're sometimes
    needed because of how clang exposes the AST, but ideally those hacks are
    well identified and don't interact randomly with each other.

    As an example, when the current bindgen scans the parameters of a function
    that references a struct, it clones all the struct information; if the
    struct name then changes (because we mangle it), everything breaks.

 2) Support extending the bindgen output without having to deal with clang. The
    way I'm aiming to do this is separating completely the parsing stage from
    the code generation one, and providing a single id for each item the binding
    generator provides.

 3) No more random mutation of the internal representation from anywhere. That
    means no more Rc<RefCell<T>>, no more random circular references, no more
    borrow_state... nothing.

 4) No more deduplication of declarations before code generation.

    Current bindgen has a stage, called `tag_dup_decl`[1], that takes care of
    deduplicating declarations. That's completely buggy, and for C++ it's a
    complete mess, since we YOLO modify the world.

    I've managed to get rid of this by using the clang canonical declaration,
    and the definition, to avoid scanning any type/item twice.

 5) Code generation should not modify any internal data structure. It can look
    things up and traverse whatever it needs, but it must not mutate anything.

 6) Each item should have a canonical name and a single source of mangling
    logic, and that should be computed from the immutable state, at code
    generation.

    I've put some canonical_name logic in the code generation phase, but it's
    still not complete, and should change if I implement namespaces.
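
A minimal sketch of how goals 2 to 6 could fit together, under stated
assumptions: `ItemId`, `Item`, `ItemKind`, `Context`, `intern`, `resolve`,
`canonical_name` and `generate_function` are hypothetical names chosen for
illustration and are not taken from this commit. The string keys stand in for
whatever identifies a clang canonical declaration, and the `Struct_` prefix is
only an example mangling rule.

```rust
use std::collections::HashMap;

/// Hypothetical opaque id, handed out exactly once per parsed item.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ItemId(usize);

/// Tiny stand-in for the real item kinds; items refer to each other by id
/// instead of cloning or sharing mutable state.
#[derive(Debug)]
enum ItemKind {
    Struct,
    Function { params: Vec<ItemId> },
}

#[derive(Debug)]
struct Item {
    name: String,
    kind: ItemKind,
}

/// Everything the parsing stage produces. Code generation only sees `&Context`.
#[derive(Default)]
struct Context {
    items: Vec<Item>,
    /// Assumed key: some stable identifier of the canonical declaration.
    by_canonical: HashMap<String, ItemId>,
}

impl Context {
    /// Parsing stage: the only place that mutates. A declaration that was
    /// already seen returns its existing id, so nothing is scanned twice and
    /// no later deduplication pass is needed.
    fn intern(&mut self, canonical_key: &str, item: Item) -> ItemId {
        if let Some(&id) = self.by_canonical.get(canonical_key) {
            return id;
        }
        let id = ItemId(self.items.len());
        self.items.push(item);
        self.by_canonical.insert(canonical_key.to_owned(), id);
        id
    }

    fn resolve(&self, id: ItemId) -> &Item {
        &self.items[id.0]
    }
}

/// Single source of mangling logic, computed from immutable state.
fn canonical_name(ctx: &Context, id: ItemId) -> String {
    let item = ctx.resolve(id);
    match item.kind {
        ItemKind::Struct => format!("Struct_{}", item.name),
        ItemKind::Function { .. } => item.name.clone(),
    }
}

/// Code generation: looks things up by id, never mutates the context.
fn generate_function(ctx: &Context, id: ItemId) -> Option<String> {
    let item = ctx.resolve(id);
    match item.kind {
        ItemKind::Function { ref params } => {
            let args: Vec<String> = params
                .iter()
                .enumerate()
                .map(|(i, &p)| format!("arg{}: *mut {}", i, canonical_name(ctx, p)))
                .collect();
            Some(format!("pub fn {}({});", canonical_name(ctx, id), args.join(", ")))
        }
        ItemKind::Struct => None,
    }
}
```

Because a function stores only the id of the struct it references, changing
the mangling rule in `canonical_name` affects every use site at once, which is
the failure mode described under goal 1.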

Improvements pending until this can land:

 1) Add support for missing core stuff, mainly generating functions (note that
    we parse the signatures for types correctly though), bitfields, generating
    C++ methods.

 2) Add support for the necessary features that were added to work around some
    C++ pitfalls, like opaque types, etc...

 3) Add support for the sugar that Manish added recently.

 4) Optionally (and I guess this can land without it, because basically nobody
    uses it since it's so buggy), bring back namespace support.

These are not completely trivial, but I think I can do them quite easily with
the current architecture.

I'm putting the current state of affairs here as a request for comments... Any
thoughts? Note that there are still a few smells I want to eventually
re-redesign, like the ParseError::Recurse thing, but until that happens I'm
way happier with this kind of architecture.

I'm keeping the old `parser.rs` and `gen.rs` in tree just for reference while I
code, but they will go away.

[1]: https://github.com/Yamakaky/rust-bindgen/blob/master/src/gen.rs#L448
emilio committed Sep 16, 2016
1 parent bbd6b2c commit cfdf15f
Showing 142 changed files with 8,115 additions and 6,967 deletions.
24 changes: 15 additions & 9 deletions Cargo.toml
@@ -1,14 +1,18 @@
 [package]
-authors = ["Jyun-Yan You <jyyou.tw@gmail.com>"]
+authors = [
+  "Jyun-Yan You <jyyou.tw@gmail.com>",
+  "Emilio Cobos Álvarez <ecoal95@gmail.com>",
+  "The Servo project developers",
+]
 build = "build.rs"
 description = "A binding generator for Rust"
-homepage = "https://github.com/crabtw/rust-bindgen"
+homepage = "https://github.com/servo/rust-bindgen"
 keywords = ["bindings", "ffi", "code-generation"]
 license = "BSD-3-Clause"
 name = "bindgen"
 readme = "README.md"
-repository = "https://github.com/crabtw/rust-bindgen"
-version = "0.16.0"
+repository = "https://github.com/servo/rust-bindgen"
+version = "0.17.0"

 [[bin]]
 doc = false
@@ -20,22 +24,24 @@ quasi_codegen = "0.15"
 [dependencies]
 clang-sys = "0.8.0"
 docopt = "0.6.82"
-libc = "0.2.*"
-log = "0.3.*"
+libc = "0.2"
+log = "0.3"
+env_logger = "0.3"
 rustc-serialize = "0.3.19"
-syntex_syntax = "0.38"
+syntex_syntax = "0.43"
+regex = "0.1"

 [dependencies.aster]
 features = ["with-syntex"]
-version = "0.21.1"
+version = "0.26"

 [dependencies.clippy]
 optional = true
 version = "*"

 [dependencies.quasi]
 features = ["with-syntex"]
-version = "0.15"
+version = "0.19"

 [features]
 llvm_stable = []
7 changes: 4 additions & 3 deletions build.rs
@@ -5,11 +5,12 @@ mod codegen {

     pub fn main() {
         let out_dir = env::var_os("OUT_DIR").unwrap();
-        let src = Path::new("src/gen.rs");
-        let dst = Path::new(&out_dir).join("gen.rs");
+        let src = Path::new("src/codegen/mod.rs");
+        let dst = Path::new(&out_dir).join("codegen.rs");

         quasi_codegen::expand(&src, &dst).unwrap();
-        println!("cargo:rerun-if-changed=src/gen.rs");
+        println!("cargo:rerun-if-changed=src/codegen/mod.rs");
+        println!("cargo:rerun-if-changed=src/codegen/helpers.rs");
     }
 }

83 changes: 43 additions & 40 deletions src/bin/bindgen.rs
@@ -2,32 +2,20 @@
 #![crate_type = "bin"]

 extern crate bindgen;
+extern crate env_logger;
 #[macro_use]
 extern crate docopt;
 #[macro_use]
 extern crate log;
 extern crate clang_sys;
 extern crate rustc_serialize;

-use bindgen::{Bindings, BindgenOptions, LinkType, Logger};
+use bindgen::{Bindings, BindgenOptions, LinkType};
 use std::io;
 use std::path;
 use std::env;
 use std::default::Default;
 use std::fs;
-use std::process::exit;

-struct StdLogger;
-
-impl Logger for StdLogger {
-    fn error(&self, msg: &str) {
-        println!("{}", msg);
-    }
-
-    fn warn(&self, msg: &str) {
-        println!("{}", msg);
-    }
-}
-
 const USAGE: &'static str = "
 Usage:
@@ -40,6 +28,9 @@ Usage:
 [--dtor-attr=<attr>...] \
 [--opaque-type=<type>...] \
 [--blacklist-type=<type>...] \
+[--whitelist-type=<type>...] \
+[--whitelist-function=<name>...] \
+[--whitelist-var=<name>...] \
 <input-header> \
 [-- <clang-args>...]
@@ -95,15 +86,25 @@ Options:
 ulonglong
 slonglong
---raw-line=<raw> TODO
---dtor-attr=<attr> TODO
---no-class-constants TODO
---no-unstable-rust TODO
---no-namespaced-constants TODO
---no-bitfield-methods TODO
---ignore-methods TODO
---opaque-type=<type> TODO
---blacklist-type=<type> TODO
+--raw-line=<raw> Add a raw line at the beginning of the output.
+--dtor-attr=<attr> Attributes to add to structures with destructor.
+--no-class-constants Avoid generating class constants.
+--no-unstable-rust Avoid generating unstable rust.
+--no-namespaced-constants Avoid generating constants right under namespaces.
+--no-bitfield-methods Avoid generating methods for bitfield access.
+--ignore-methods Avoid generating all kind of methods.
+--opaque-type=<type> Mark a type as opaque.
+--blacklist-type=<type> Mark a type as hidden.
+--whitelist-type=<type> Whitelist the type. If this set or any other
+of the whitelisting sets is not empty, then
+all the non-whitelisted types (or dependant)
+won't be generated.
+--whitelist-function=<regex> Whitelist all the free-standing functions
+matching <regex>. Same behavior on emptyness
+than the type whitelisting.
+--whitelist-var=<regex> Whitelist all the free-standing variables
+matching <regex>. Same behavior on emptyness
+than the type whitelisting.
 <clang-args> Options other than stated above are passed
 directly through to clang.
@@ -134,6 +135,9 @@ struct Args {
     flag_ignore_methods: bool,
     flag_opaque_type: Vec<String>,
     flag_blacklist_type: Vec<String>,
+    flag_whitelist_type: Vec<String>,
+    flag_whitelist_function: Vec<String>,
+    flag_whitelist_var: Vec<String>,
     arg_clang_args: Vec<String>,
 }

@@ -182,7 +186,10 @@ impl Into<ParseResult<(BindgenOptions, Box<io::Write>)>> for Args {
         options.gen_bitfield_methods = !self.flag_no_bitfield_methods;
         options.ignore_methods = self.flag_ignore_methods;
         options.opaque_types.extend(self.flag_opaque_type.drain(..));
-        options.blacklist_type.extend(self.flag_blacklist_type.drain(..));
+        options.hidden_types.extend(self.flag_blacklist_type.drain(..));
+        options.whitelisted_types.extend(self.flag_whitelist_type.drain(..));
+        options.whitelisted_functions.extend(self.flag_whitelist_function.drain(..));
+        options.whitelisted_vars.extend(self.flag_whitelist_var.drain(..));
         options.clang_args.extend(self.arg_clang_args.drain(..));
         options.clang_args.push(self.arg_input_header);

@@ -191,6 +198,13 @@ impl Into<ParseResult<(BindgenOptions, Box<io::Write>)>> for Args {
 }

 pub fn main() {
+    log::set_logger(|max_log_level| {
+        use env_logger::Logger;
+        let env_logger = Logger::new();
+        max_log_level.set(env_logger.filter());
+        Box::new(env_logger)
+    }).expect("Failed to set logger.");
+
     let mut bind_args: Vec<_> = env::args().collect();

     if let Some(clang) = clang_sys::support::Clang::find(None) {
@@ -217,24 +231,13 @@ pub fn main() {
         .and_then(|d| d.argv(bind_args.iter()).decode())
         .unwrap_or_else(|e| e.exit());

-    let logger = StdLogger;
     let result: ParseResult<_> = args.into();
     let (options, out) = result.unwrap_or_else(|msg| {
-        logger.error(&msg);
-        exit(-1);
+        panic!("Failed to generate_bindings: {:?}", msg);
     });

-    match Bindings::generate(&options, Some(&logger as &Logger), None) {
-        Ok(bindings) => match bindings.write(out) {
-            Ok(()) => (),
-            Err(e) => {
-                logger.error(&format!("Unable to write bindings to file. {}", e));
-                exit(-1);
-            }
-        },
-        Err(()) => {
-            logger.error("Failed to generate bindings".into());
-            exit(-1);
-        }
-    }
+    let bindings = Bindings::generate(options, None)
+        .expect("Unable to generate bindings");
+    bindings.write(out)
+        .expect("Unable to write bindings to file.");
 }
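
The whitelisting rule in the new `--whitelist-*` help text (an empty set of
whitelists means "generate everything"; a non-empty one restricts output to
whitelisted items) could look roughly like the sketch below. This is one
reading of the help text, not the code in this commit: `Whitelists`, `Kind`,
`is_active` and `should_generate` are hypothetical names, matching is done on
exact names instead of regexes to keep the example dependency-free, and the
"(or dependant)" part, where types reachable from whitelisted items are also
kept, is not modelled.

```rust
use std::collections::HashSet;

/// Hypothetical classification of a top-level item.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Type,
    Function,
    Var,
}

/// Hypothetical holder for the three whitelist sets. The real options match
/// functions and variables against regexes; exact names are a simplification.
#[derive(Default)]
struct Whitelists {
    types: HashSet<String>,
    functions: HashSet<String>,
    vars: HashSet<String>,
}

impl Whitelists {
    /// Whitelisting is in effect as soon as any of the sets is non-empty.
    fn is_active(&self) -> bool {
        !self.types.is_empty() || !self.functions.is_empty() || !self.vars.is_empty()
    }

    /// With no whitelist at all, everything is generated. Otherwise only items
    /// named in the set for their kind are generated.
    fn should_generate(&self, kind: Kind, name: &str) -> bool {
        if !self.is_active() {
            return true;
        }
        let set = match kind {
            Kind::Type => &self.types,
            Kind::Function => &self.functions,
            Kind::Var => &self.vars,
        };
        set.contains(name)
    }
}
```

Under this reading, passing only `--whitelist-type=Foo` also suppresses every
free-standing function and variable, since whitelisting becomes active for all
three kinds at once.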