Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

process_wrapper: add support for terminating rustc after it emits rmeta. #1207

Merged
merged 1 commit into from
Apr 19, 2022

Conversation

gigaroby
Copy link
Contributor

@gigaroby gigaroby commented Mar 21, 2022

This is a key component of supporting pipelining in rules_rust. The
bazel rules will (in a follow-up PR) configure rustc to run either to
completion for rules whose dependencies require the full rlib files or
until they emit the rmeta files if dependencies only require that.

This is safe to commit as there are no changes to user-visible behavior
until the new flags are used.

@gigaroby gigaroby marked this pull request as draft March 21, 2022 14:21
@gigaroby gigaroby force-pushed the pipeline branch 6 times, most recently from d2423a5 to f824879 Compare March 23, 2022 12:43
@gigaroby gigaroby changed the title Draft pipeline support. Not sure how to implement the bazel part yet. process_wrapper: add support for terminating rustc after it emits rmeta. Mar 23, 2022
@gigaroby gigaroby force-pushed the pipeline branch 16 times, most recently from 1d273f2 to 1225f0b Compare March 25, 2022 12:57
@gigaroby gigaroby marked this pull request as ready for review March 25, 2022 13:19
@gigaroby gigaroby force-pushed the pipeline branch 2 times, most recently from 1ffd558 to 531eeb7 Compare March 28, 2022 12:41
util/process_wrapper/main.rs Outdated Show resolved Hide resolved
util/process_wrapper/main.rs Outdated Show resolved Hide resolved
util/process_wrapper/main.rs Outdated Show resolved Hide resolved
util/process_wrapper/rustc.rs Outdated Show resolved Hide resolved
util/process_wrapper/rustc.rs Outdated Show resolved Hide resolved
let parsed: JsonValue = line
.parse()
.expect("process wrapper error: expected json messages in pipeline mode");
if let Some(emit) = get_key(parsed.clone(), "emit") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to clone? Seems we can move parsed.

Nit/discussion - from get_key I'd expect that we're getting a reference to a key while somebody else keeps ownership of the map. What we are actually doing is consuming the map and returning the value. Sometimes in Rust APIs these "conversion" methods are expressed as to_<foo>(). So maybe to_key is more intention revealing? Maybe take_key/collect_key are better? WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were two places where parsed might have been moved. Modified get_key so it takes a reference to the json object and instead clones the string value out of it. WDYT?

@gigaroby gigaroby force-pushed the pipeline branch 2 times, most recently from 0343ba7 to d135600 Compare April 1, 2022 13:10
@scentini scentini self-requested a review April 1, 2022 13:56
util/process_wrapper/output.rs Outdated Show resolved Hide resolved
@hlopko
Copy link
Member

hlopko commented Apr 5, 2022

@UebelAndre do you also want to take a look?

Copy link
Collaborator

@UebelAndre UebelAndre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly have nits but could you also explain "pipelining" to me? What's the workflow that this unblocks?

test/process_wrapper/rustc_quit_on_rmeta.rs Outdated Show resolved Hide resolved
test/process_wrapper/rustc_quit_on_rmeta.rs Outdated Show resolved Hide resolved
test/process_wrapper/fake_rustc.rs Outdated Show resolved Hide resolved
util/process_wrapper/rustc.rs Show resolved Hide resolved
rust/repositories.bzl Outdated Show resolved Hide resolved
@gigaroby
Copy link
Contributor Author

gigaroby commented Apr 5, 2022

@UebelAndre thanks for review :)

I mostly have nits but could you also explain "pipelining" to me? What's the workflow that this unblocks?

Sure :) Pipelining (in Cargo terms) allows Cargo to unblock a lib -> lib dependency by asking the compiler to emit a metadata file during compilation. When rustc is asked to provide a metadata file, it will produce two files for any given library a .rmeta (the metadata) which is produced quickly and a .rlib (the full object). As soon as Cargo sees the rmeta file, it unblocks all the library targets that depend on that, increasing parallelism. Bazel is (currently) unable to unblock dependencies mid-action so we need two different invocations of the compiler. The first invocation will be killed as soon as the metadata file is produced while the second will continue to completion. The idea is that for rlib -> rlib dependencies we will only run the lightweight metadata invocation, then unblock the dependencies, then compile the full rlib at link time (or whatever other action depends on the full file). You can also read the original rationale for introducing this in Cargo at https://internals.rust-lang.org/t/evaluating-pipelined-rustc-compilation/10199.

I hope this answers your question.

@gigaroby gigaroby force-pushed the pipeline branch 3 times, most recently from 5ff23fa to d30adb5 Compare April 9, 2022 09:40
@dfreese
Copy link
Collaborator

dfreese commented Apr 10, 2022

Ignorant question: if pipelining requires two runs of cargo, instead of building up a way to terminate rustc, could --emit=metadata be the only emit value passed in the first invocation? What would be the technical rationale for actually needing to kill the process?

Related, it seems like having that as always being the first step would cause the build to fail fast, probably solving #428, more-or-less.

@gigaroby
Copy link
Contributor Author

Ignorant question: if pipelining requires two runs of cargo, instead of building up a way to terminate rustc, could --emit=metadata be the only emit value passed in the first invocation? What would be the technical rationale for actually needing to kill the process?

Sadly no. As I have found out while working on the bazel side of this (which is still WIP), the CLI invocations of the two compiler instances need to be basically identical for this to work. Changing any of the flags (including stuff unrelated to codegen, like --error-format) causes rustc to not accept the generated .rmeta file once we get to the linking stage where we need the .rlib.

Related, it seems like having that as always being the first step would cause the build to fail fast, probably solving #428, more-or-less.

This is still going to be behind a flag and disabled by default at least for the time being.

@gigaroby
Copy link
Contributor Author

@UebelAndre are you happy with this change now?

@dfreese
Copy link
Collaborator

dfreese commented Apr 11, 2022

Sadly no. As I have found out while working on the bazel side of this (which is still WIP), the CLI invocations of the two compiler instances need to be basically identical for this to work. Changing any of the flags (including stuff unrelated to codegen, like --error-format) causes rustc to not accept the generated .rmeta file once we get to the linking stage where we need the .rlib.

Sad day. Has that been flagged to the rust team? That seems like something rustc should be able to support.

@gigaroby
Copy link
Contributor Author

Sad day. Has that been flagged to the rust team? That seems like something rustc should be able to support.

Well, not really. What rustc does is to emit metadata and object in the same invocation of the compiler. The double invocation is just because of a limitation in bazel (which can't emit output out of a rule that's still executing).

@hlopko
Copy link
Member

hlopko commented Apr 11, 2022

In addition to command line flags changing the content of the rlib as @gigaroby explained, there's also a second issue/feature :) --emit=metadata and --emit=metadata,link produces different .rmeta files. The first produces what you know as cargo check outputs. Those .rmeta files cannot be used for downstream compilations producing object files since they don't contain all the necessary information, they can only be used to emit most (but strictly speaking not all) compilation errors. If you want to use the .rmeta files for compilation, you have to use what cargo pipelining does, which is --emit=metadata,link (other combinations work too, like --emit=metadata,mir, but then you're hitting the first issue). Unsurprisingly, .rmeta file is generated faster with --emit=metadata than with --emit=metadata,link+process termination.

So regarding #428, we could use the metadata files from pipelining for an equivalent of cargo check, but those actions will still be slower than using plain --emit=metadata.

Copy link
Collaborator

@UebelAndre UebelAndre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I have one more question and my final set of nits. Generally looks good though, great work and thanks!

util/process_wrapper/rustc.rs Show resolved Hide resolved
util/process_wrapper/output.rs Show resolved Hide resolved
util/process_wrapper/output.rs Outdated Show resolved Hide resolved
test/process_wrapper/fake_rustc.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@UebelAndre UebelAndre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting up with all my boring requests! Looks good to me 😄

@jsgf
Copy link

jsgf commented Apr 17, 2022

BTW, I've successfully implemented Rust pipelining in Buck rules. The key is to not use --emit metadata. More details: #228 (comment)

This is a key component of supporting pipelining in rules_rust. The
bazel rules will (in a follow-up PR) configure rustc to run either to
completion for rules whose dependencies require the full rlib files or
until they emit the rmeta files if dependencies only require that.

This is safe to commit as there are no changes to user-visible behavior
until the new flags are used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants