Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crate_universe: Rewrite generator in starlark #1287

Open
uhthomas opened this issue Apr 22, 2022 · 10 comments · May be fixed by #1425
Open

crate_universe: Rewrite generator in starlark #1287

uhthomas opened this issue Apr 22, 2022 · 10 comments · May be fixed by #1425

Comments

@uhthomas
Copy link
Contributor

This is just an idea, I'd really appreciate some feedback and ideas.

Is it conceivably possible to rewrite the generator in pure starlark? This would remove the bootstrapping problem of distributing generator binaries or hackily building the generator with the native cargo toolchain.

def cargo_bazel_bootstrap(name = "cargo_bazel_bootstrap", rust_version = rust_common.default_version):

We're using a custom generator to work around some issues, but this makes the analysis phase almost a minute slower as repository rules are not remotely cacheable.

Finished release [optimized] target(s) in 38.77s

Failing a full rewrite, some more ideas:

  • Rewrite only some functionality in starlark.
  • Move some functionality out of the repository rule and into just a normal rule. This could be achieved with a macro. The normal rule could then check if repinning is required, whilst allowing the repository rule to generate the build files it needs.

Thanks for taking the time to read this :) Please don't hesitate to ask any questions.

@UebelAndre
Copy link
Collaborator

I'm not entirely sure the generator can be entirely written in starlark but the renderer side of it very likely could be and think that'd be great!

We're using a custom generator to work around some issues, but this makes the analysis phase almost a minute slower as repository rules are not remotely cacheable.

Finished release [optimized] target(s) in 38.77s

To clarify, are you building a new generator in repository rules? crate_universe as never designed for users to be compiling the generator in repository rules. Instead, the expectation is that users who want to compile their own binaries can do so and upload them to some static asset host (s3, artifactory) where devs and CI would pull them down for use in their builds.

That said though, if the renderer were to be implemented in starlark, then there would be no need to download or build a generator in most cases since if you have a trusted lockfile, the only computation left to do is write BUILD files from pre-defined templates which is something doable in most lanugages.

@uhthomas
Copy link
Contributor Author

Sounds great, thanks! I have some pretty good ideas on how to tackle this and I think the implementation could be quite nice.

I was thinking about the generator a bit and don't see why that couldn't also happen in Starlark. The cargo binary will happy output JSON which makes it easy to work with.

@uhthomas
Copy link
Contributor Author

I've made a prototype implementation and it looks pretty good so far. It has shown some weirdness with the format of the lockfile, but has been easy to work with otherwise.

The blocker at the moment is figuring out how to handle selects and cfg expressions (conditional compilation, syntax reference). Example:

cfg(target_family = "unix")

The lockfile isn't quite as data-oriented as it initially seems. It expects tools which work with the lockfile to be able to parse and evaluate these expressions.

Does it seem reasonable to suggest a slight change to the lockfile? I don't see any drawbacks for an explicit list of targets as opposed to a cfg expression.

@UebelAndre

@UebelAndre
Copy link
Collaborator

Does it seem reasonable to suggest a slight change to the lockfile? I don't see any drawbacks for an explicit list of targets as opposed to a cfg expression.

Can you provide an example of the specific change you're thinking of?

@uhthomas
Copy link
Contributor Author

Yep, of course.

So here's an excerpt from the JSON lockfile (represented as CUE for brevity):

"sha2 0.10.2": common_attrs: deps: selects: {
	"cfg(any(target_arch = \"aarch64\", target_arch = \"x86_64\", target_arch = \"x86\"))": [{
		id:     "cpufeatures 0.2.2"
		target: "cpufeatures"
	}]
}

Currently the "selects" deps are a map of cfg expressions to deps. I would propose that instead it should be a list of objects which include a list of platform names and deps. Such:

"sha2 0.10.2": common_attrs: deps: selects: [{
	platforms: [
		"aarch64-apple-darwin",
		"aarch64-apple-ios",
		"aarch64-linux-android",
		"aarch64-unknown-linux-gnu",
		"i686-apple-darwin",
		"i686-linux-android",
		"i686-pc-windows-msvc",
		"i686-unknown-freebsd",
		"i686-unknown-linux-gnu",
		"x86_64-apple-darwin",
		"x86_64-apple-ios",
		"x86_64-linux-android",
		"x86_64-pc-windows-msvc",
		"x86_64-unknown-freebsd",
		"x86_64-unknown-linux-gnu",
	]
	deps: [{
		id:     "cpufeatures 0.2.2"
		target: "cpufeatures"
	}]
}

@UebelAndre
Copy link
Collaborator

That'd probably be fine but seems like it'd be noisy. The reason I wrote the cfg string is because it was shorter and slightly more meaningful. The lockfile includes a mapping of those strings to a list of platforms so they only need to be defined once. Would it still work to use that?

"conditions": {
    "cfg(any(target_arch = \"aarch64\", target_arch = \"x86_64\", target_arch = \"x86\"))": [
        "aarch64-apple-darwin",
        "aarch64-apple-ios",
        "aarch64-linux-android",
        "aarch64-unknown-linux-gnu",
        "i686-apple-darwin",
        "i686-linux-android",
        "i686-pc-windows-msvc",
        "i686-unknown-freebsd",
        "i686-unknown-linux-gnu",
        "x86_64-apple-darwin",
        "x86_64-apple-ios",
        "x86_64-linux-android",
        "x86_64-pc-windows-msvc",
        "x86_64-unknown-freebsd",
        "x86_64-unknown-linux-gnu"
    ]
}

@uhthomas
Copy link
Contributor Author

uhthomas commented Jun 12, 2022

I completely overlooked the conditions in the lockfile, thanks! I think that will work fine.

Is there a reference which explains the format of the lockfile? Something which isn't source code? Might be helpful for future readers.


re: it'd be noisy

I agree, though the lockfile is already very large.

❯ cat Cargo.Bazel.lock | wc -l
10719

@uhthomas
Copy link
Contributor Author

So, I got it all working! :)

Just need to clean some stuff up. Hopefully will open a PR for this within the next few days.

@uhthomas
Copy link
Contributor Author

uhthomas commented Jun 13, 2022

If you're interested, this is the dev branch: main...uhthomas:1287.

There are a few more things to do, but it's possible to build and run the cargo-bazel binary with these changes which is a huge milestone.

❯ bazel run @rules_rust//crate_universe:bin
(11:08:42) INFO: Invocation ID: 94c7a23a-a6fa-4f96-8d8a-3639c3d5fa1a
(11:08:42) INFO: Current date is 2022-06-13
(11:08:42) INFO: Analyzed target @rules_rust//crate_universe:bin (0 packages loaded, 0 targets configured).
(11:08:42) INFO: Found 1 target...
Target @rules_rust//crate_universe:cargo_bazel_bin up-to-date:
  bazel-bin/external/rules_rust/crate_universe/cargo_bazel_bin
(11:08:42) INFO: Elapsed time: 0.226s, Critical Path: 0.01s
(11:08:42) INFO: 1 process: 1 internal.
(11:08:42) INFO: Build completed successfully, 1 total action
(11:08:42) INFO: Running command line: bazel-bin/external/rules_rust/crate_universe/cargo_bazel_b(11:08:42) INFO: Build completed successfully, 1 total action
cargo-bazel 0.2.2

USAGE:
    cargo_bazel_bin <SUBCOMMAND>

OPTIONS:
    -h, --help       Print help information
    -V, --version    Print version information

SUBCOMMANDS:
    generate    Generate Bazel Build files from a Cargo manifest
    help        Print this message or the help of the given subcommand(s)
    query       Query workspace info to determine whether or not a repin is needed
    splice      Splice together disjoint Cargo and Bazel info into a single Cargo workspace
                    manifest
    vendor      Vendor BUILD files to the workspace with either repository definitions or `cargo
                    vendor` generated sources

@uhthomas
Copy link
Contributor Author

uhthomas commented Jun 17, 2022

@UebelAndre Would you mind checking out my branch and trying it out? It's not feature complete, though rules_rust builds with it (bazel builld //...) and we're also now using it internally. It seems to work well and bazel run @crate_universe_crate_index//:repin is blissful in comparison to the bazel sync stuff.

The vendoring of BUILD files had to be removed, apologies if this is a problem. The way the manifest is interpreted by starlark makes it infeasible.

However bazel query --output=build @crate_universe_crate_index//... or bazel query --output=build @crate_universe_crate_index__toml-0.5.9//... does show the expanded macros.

Looking forward to hearing your experience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants