Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generating the coverage map #74091

Merged
merged 1 commit into from
Jul 19, 2020
Merged

Conversation

richkadel
Copy link
Contributor

@richkadel richkadel commented Jul 6, 2020

@tmandry @wesleywiser

rustc now generates the coverage map and can support (limited)
coverage report generation, at the function level.

Example commands to generate a coverage report:

$ BUILD=$HOME/rust/build/x86_64-unknown-linux-gnu
$ $BUILD/stage1/bin/rustc -Zinstrument-coverage \
$HOME/rust/src/test/run-make-fulldeps/instrument-coverage/main.rs
$ LLVM_PROFILE_FILE="main.profraw" ./main
called
$ $BUILD/llvm/bin/llvm-profdata merge -sparse main.profraw -o main.profdata
$ $BUILD/llvm/bin/llvm-cov show --instr-profile=main.profdata main

rust coverage report only 20200706

r? @wesleywiser

Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: #34701 - Implement support for LLVMs code coverage instrumentation

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 6, 2020
@richkadel richkadel force-pushed the llvm-coverage-map-gen-4 branch from 79b365d to 9c4930b Compare July 7, 2020 02:48
@richkadel
Copy link
Contributor Author

@tmandry @wesleywiser - I did my own detailed review of the output and tests and made a few important, small improvements to this PR.

I don't expect to make any more changes until I get feedback. Thanks!

@richkadel richkadel force-pushed the llvm-coverage-map-gen-4 branch from 9c4930b to a0b3457 Compare July 7, 2020 21:21
@richkadel
Copy link
Contributor Author

I noticed a mistake in the virtual_file_mapping creation and fixed it.

FYI, I also made a temporary change to src/ci/azure-pipelines/try.yml to enable testing on macos and windows in bors try tests. I'd like to try this PR to make sure there aren't any platform-specific issues before someone tries it in a rollup.

Thanks!

@wesleywiser
Copy link
Member

Sorry for the delay, it's been a busy few days for me. I'm starting the review so let's go ahead and do

@bors try

@bors
Copy link
Contributor

bors commented Jul 8, 2020

⌛ Trying commit 08b006f81ef8377ae1b121f3dfbef22e141f5201 with merge 31e6847482c04e9b90e2b2a8abf75c59af671f5f...

@richkadel
Copy link
Contributor Author

@wesleywiser @tmandry - FYI, until now, I have only tested on very small, simple programs.

Just today I started trying -Zinstrument-coverage on larger crates, and there are some problems. Not really surprising.

I propose landing the current version--with review comments resolved, but otherwise, as-is--since we already recognize this is not "complete" yet.

But (your call), if you prefer, I can make more changes to this PR to get the function-level coverage working on more complex crates.

Just so you know, here are two problems I've found so far, when attempting to build the json5format crate:

  1. When I try to instrument a complete package, including dependencies (via RUSTFLAGS=-Zinstrument-coverage cargo build), and need to call the query tcx.coverageinfo(def_id) from codegen_intrinsic_call(), injecting the LLVM instrprof.increment intrinsic fails for dependent crates because tcx.hir().get_if_local(def_id) returns None (i.e., the def_id is not "local" in the current type context (not found in it's hir::map::Map). I'm not sure how I should resolve this yet. (I'm also not sure, yet, if this is failing on all dependent crate functions, or possibly for functions pulled in from one crate to another -- wild guess based on the first crate that failed like this.)

    FYI, I worked around this problem (so far, at least) by limiting the instrumentation to the main crate (not dependencies) only, using cargo rust instead.

  2. I'm still not computing the virtual_file_mapping array correctly when building coverage data for a module with functions in more than one file. This shouldn't be hard to resolve. I just misinterpreted the format spec (which is a little vague on this point, and the examples only have one file).

And just so you are aware, I know the CounterExpression indexes are wrong, but this is by design. The indexes for CounterExpressions need to be recomputed (which also affects the expression operands themselves). This should be a very simple translation, and I know how I plan to do this, but I want to validate the solution before checking it in, but I don't (yet) produce any actual Counter Expressions expressions to validate against.

@richkadel
Copy link
Contributor Author

I propose landing the current version--with review comments resolved, but otherwise, as-is--since we already recognize this is not "complete" yet.

But (your call), if you prefer, I can make more changes to this PR to get the function-level coverage working on more complex crates.

Oh, and if I do land this PR "as-is", my next PR would resolve the known issues with larger crates.

Copy link
Member

@tmandry tmandry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

cc @petrhosek who I think would be interested in taking a look.

src/librustc_codegen_llvm/coverageinfo/mapgen.rs Outdated Show resolved Hide resolved
src/librustc_codegen_llvm/coverageinfo/mapgen.rs Outdated Show resolved Hide resolved
src/librustc_codegen_llvm/coverageinfo/mapgen.rs Outdated Show resolved Hide resolved
src/librustc_codegen_ssa/traits/statics.rs Show resolved Hide resolved
src/librustc_session/config.rs Outdated Show resolved Hide resolved
src/rustllvm/CoverageMappingWrapper.cpp Show resolved Hide resolved
src/rustllvm/CoverageMappingWrapper.cpp Outdated Show resolved Hide resolved
src/test/run-make-fulldeps/instrument-coverage/Makefile Outdated Show resolved Hide resolved
src/librustc_codegen_llvm/coverageinfo/mapgen.rs Outdated Show resolved Hide resolved
@tmandry
Copy link
Member

tmandry commented Jul 8, 2020

  1. When I try to instrument a complete package, including dependencies (via RUSTFLAGS=-Zinstrument-coverage cargo build), and need to call the query tcx.coverageinfo(def_id) from codegen_intrinsic_call(), injecting the LLVM instrprof.increment intrinsic fails for dependent crates because tcx.hir().get_if_local(def_id) returns None (i.e., the def_id is not "local" in the current type context (not found in it's hir::map::Map). I'm not sure how I should resolve this yet. (I'm also not sure, yet, if this is failing on all dependent crate functions, or possibly for functions pulled in from one crate to another -- wild guess based on the first crate that failed like this.)

Yeah, computing the hash on HIR will fail when codegenning functions from another crate (either #[inline] or generic functions being monomorphized by the current crate). All that's available for those functions is their signature and MIR, roughly.

We'll have to find another way to compute a somewhat-stable hash, or include it in crate metadata.

In terms of whether to do it now, it seems like an issue from a past PR and this is an experimental feature, so I'd say it's okay to leave as-is and fix in a follow up PR.

@bors
Copy link
Contributor

bors commented Jul 8, 2020

💔 Test failed - checks-azure

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 8, 2020
@richkadel
Copy link
Contributor Author

💔 Test failed - checks-azure

Good thing we "try"d !

MacOS and Windows both failed, but earlier than I expected. I'll need to resolve these issues and we'll need to "try" again I think.

@richkadel
Copy link
Contributor Author

Yeah, computing the hash on HIR will fail when codegenning functions from another crate (either #[inline] or generic functions being monomorphized by the current crate). All that's available for those functions is their signature and MIR, roughly.

In that case, I should be able to work around that limitation by adding the hash to the Rust coverage intrinsics injected into the MIR.

In terms of whether to do it now, it seems like an issue from a past PR and this is an experimental feature, so I'd say it's okay to leave as-is and fix in a follow up PR.

SGTM, if ok with Wesley too. Thanks!

@@ -80,3 +80,36 @@ jobs:
# INITIAL_RUST_CONFIGURE_ARGS: --build=x86_64-pc-windows-msvc --enable-extended --enable-profiler
# SCRIPT: python x.py dist
# DEPLOY_ALT: 1

# TEMPORARILY ENABLE MAC AND WINDOWS TESTS ON BORS TRY.
# FIXME(richkadel): remove this before merging the PR
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting this so we don't forget it.

src/test/run-make-fulldeps/instrument-coverage/Makefile Outdated Show resolved Hide resolved
src/rustllvm/CoverageMappingWrapper.cpp Show resolved Hide resolved
pub fn coverage_loc(&self, source_map: &SourceMap) -> CoverageLoc {
let (file, start_line, start_col) =
lookup_file_line_col(source_map, BytePos::from_u32(self.coverage_span.start_byte_pos));
let (_, end_line, end_col) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe assert!() or debug_assert!() that these files are the same?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wesleywiser Yes, thanks! So I added this check, and surprisingly saw some cases (during ad hoc testing) where the assert failed. I was getting different filenames for the start and end of the byte_pos range!

But I've also seen some coverage results that show source ranges inside comments (not rustdoc code comments), so something is off sometimes.

It's almost surprising that a lot of things work correctly. So I still need to do more investigation.

For now, to avoid crashing, I changed this function to return an Option. Coverage mappings that have this problem get None from coverage_loc(), and I just skip them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@richkadel I wonder if any of those are related to macros? If you look at the MIR for a simple hello world (playground), you can see that some of the spans come from the playground file (src/lib.rs) while many of the others come from the expansion of the println!() macro and point into the std library (/rustc/{git commit}/src/libstd/macros.rs).

src/librustc_mir/transform/instrument_coverage.rs Outdated Show resolved Hide resolved
src/librustc_mir/transform/instrument_coverage.rs Outdated Show resolved Hide resolved
src/librustc_session/options.rs Show resolved Hide resolved
@wesleywiser
Copy link
Member

In terms of whether to do it now, it seems like an issue from a past PR and this is an experimental feature, so I'd say it's okay to leave as-is and fix in a follow up PR.

SGTM, if ok with Wesley too. Thanks!

That seems fine to me as well. There's no expectation that unstable features are fully implemented or bug-free during development.

@richkadel richkadel force-pushed the llvm-coverage-map-gen-4 branch 4 times, most recently from dccafde to c379fc8 Compare July 13, 2020 18:50
@richkadel
Copy link
Contributor Author

@wesleywiser @tmandry - Assuming the PR build succeeds, can we start another @bors try?

I think the error on Windows is resolved. I'm confused by the MacOS error, but with other changes made, I'd like to test it again.

Thanks!

@richkadel
Copy link
Contributor Author

FYI, the normal PR build is failing with:

Command failed. Attempt 5/5:
error: the lock file /checkout/Cargo.lock needs to be updated but --locked was passed to prevent this
If you want to try to generate the lock file without accessing the network, use the --offline flag.
failed to run: /checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo build --manifest-path /checkout/src/bootstrap/Cargo.toml --locked
Build completed unsuccessfully in 0:00:00
make: *** [prepare] Error 1
Makefile:60: recipe for target 'prepare' failed

I did add a new tool, rust-demangler, to the main Cargo.toml, which caused Cargo.lock to be updated. If that's the cause, what do I need to do?

@richkadel richkadel force-pushed the llvm-coverage-map-gen-4 branch from c379fc8 to 8926430 Compare July 13, 2020 19:36
@richkadel
Copy link
Contributor Author

FYI, the normal PR build is failing with:

Command failed. Attempt 5/5:
error: the lock file /checkout/Cargo.lock needs to be updated but --locked was passed to prevent this

Never mind! This is resolved.

@richkadel
Copy link
Contributor Author

@wesleywiser @tmandry - I think I've addressed all of your comments with the latest changes. I'm going to go back through them though and reply or resolve.

Note that when I started working on your feedback last week, I already anticipated some restructuring. Instead of submitting changes just to resolve the feedback alone, I incorporated your feedback into the restructured code.

So there are some more differences from what you reviewed last time, but all feedback was taken into account.

I was able to generate coverage on the entire json5format crate, which was awesome. There are some SourceMap glitches I think (user-error I'm sure) but a lot of it works well, showing coverage counts across functions, closures, and multiple instances of generic functions (woohoo!).

I also implemented support for name demangling, so llvm-cov show results are quite readable!

Here's an example from the revised test:

https://github.com/rust-lang/rust/pull/74091/files#diff-b91fcde238799d1eb0c57d2c4651ccc0

@richkadel
Copy link
Contributor Author

  1. When I try to instrument a complete package, including dependencies (via RUSTFLAGS=-Zinstrument-coverage cargo build), and need to call the query tcx.coverageinfo(def_id) from codegen_intrinsic_call(), injecting the LLVM instrprof.increment intrinsic fails for dependent crates because tcx.hir().get_if_local(def_id) returns None (i.e., the def_id is not "local" in the current type context (not found in it's hir::map::Map). I'm not sure how I should resolve this yet. (I'm also not sure, yet, if this is failing on all dependent crate functions, or possibly for functions pulled in from one crate to another -- wild guess based on the first crate that failed like this.)

Yeah, computing the hash on HIR will fail when codegenning functions from another crate (either #[inline] or generic functions being monomorphized by the current crate). All that's available for those functions is their signature and MIR, roughly.

As previously discussed, I resolved this by adding the function source hash to the count_code_region() intrinsic calls. (There will always be at least 1 of these.) That allows me to push the hash value into the coverage map in codegen, and I have it when generating the LLVM-IR. This "works", but there may be other issues with mapping coverage to source references from external crates (as I noted in another comment earlier today). So I may have to revisit this if it's only the tip of the iceberg.

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 18, 2020
@mati865
Copy link
Contributor

mati865 commented Jul 18, 2020

Looks spurious:

Details
failures:

---- [ui] rustdoc-ui\failed-doctest-output.rs stdout ----
\rustdoc-ui\failed-doctest-output.rs

-	---- $DIR/failed-doctest-output.rs - OtherStruct (line 21) stdout ----
-	error[E0425]: cannot find value `no` in this scope
-	  --> $DIR/failed-doctest-output.rs:22:1
-	   |
-	LL | no
-	   | ^^ not found in this scope
-	
-	error: aborting due to previous error
-	
-	For more information about this error, try `rustc --explain E0425`.
-	Couldn't compile the test.
19	---- $DIR/failed-doctest-output.rs - SomeStruct (line 11) stdout ----
20	Test executable failed (exit code 101).
21	

30	note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
31	
32	
+	---- $DIR/failed-doctest-output.rs - OtherStruct (line 21) stdout ----
+	error[E0425]: cannot find value `no` in this scope
+	  --> $DIR/failed-doctest-output.rs:22:1
+	   |
+	LL | no
+	   | ^^ not found in this scope
+	
+	error: aborting due to previous error
+	
+	For more information about this error, try `rustc --explain E0425`.
+	Couldn't compile the test.
33	
34	failures:
35	    $DIR/failed-doctest-output.rs - OtherStruct (line 21)

@wesleywiser
Copy link
Member

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 18, 2020
@bors
Copy link
Contributor

bors commented Jul 18, 2020

⌛ Testing commit a6f8b8a with merge 28487bc5bc0433159b337420f8f7f1a7b3571703...

@bors
Copy link
Contributor

bors commented Jul 18, 2020

💥 Test timed out

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Jul 18, 2020
@richkadel
Copy link
Contributor Author

Ugh. I checked the Pipeline UI just a couple of minutes before this timeout, and it was still going strong, just slow to complete the remaining tests on Windows.

It was chugging along on test/ui at the time.

Timing out like this after 4 hours without an actual problem was a waste of resources.

Any way to extend the timeout?

@richkadel
Copy link
Contributor Author

Actually, it looks like the build was proactively cancelled by someone.

I agree the job was taking exceptionally long, but I assume it is due to an infrastructure issue, not the PR. Sad to see it get that far without getting the chance to complete.

@wesleywiser @tmandry - Any reason to think that the PR has a problem? (It did complete the bors try (3 days ago), in a normal amount of time.)

If not, can we queue it up again?

Thanks

@wesleywiser
Copy link
Member

Intermittent timeouts are relatively common so it's probably just that. I'm not at a computer to verify but we can give it another try on the assumption that's the issue.

@bors rety

@wesleywiser
Copy link
Member

@bors retry

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 19, 2020
@richkadel
Copy link
Contributor Author

Thanks @wesleywiser !

@bors
Copy link
Contributor

bors commented Jul 19, 2020

⌛ Testing commit a6f8b8a with merge 47ea6d9...

@bors
Copy link
Contributor

bors commented Jul 19, 2020

☀️ Test successful - checks-actions, checks-azure
Approved by: tmandry
Pushing 47ea6d9 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 19, 2020
@bors bors merged commit 47ea6d9 into rust-lang:master Jul 19, 2020
@eggyal
Copy link
Contributor

eggyal commented Jul 21, 2020

This is absolutely fantastic—huge thank you @richkadel, and everyone who's worked on this so far.

One thing that I've noticed on a large library crate, when building the test harness with -Zinstrument-coverage -Clink-dead-code, is that there are a number of functions for which there is no coverage data at all—presumably because, despite the provided codegen option, they are dead code that's being stripped somewhere along the way. Could the presently known limitations with -Zinstrument-coverage lead to such behaviour, or is this likely a bug (either here or in -Clink-dead-code)—in which case I'll try to produce an MCVE?

@mati865
Copy link
Contributor

mati865 commented Jul 21, 2020

@eggyal could you make sure LTO is not enabled?

@eggyal
Copy link
Contributor

eggyal commented Jul 21, 2020

@mati865 It's not. Have been investigating since my previous comment and it appears to be due to generics/monomorphisation. For example:

pub struct Foo<T>(T);

impl<T> Foo<T> {
    pub fn foo(&self) {
        // coverage exists (although regions look a bit buggy to me?)
    }

    pub fn bar(&self) {
        // no coverage
    }
}

#[cfg(test)]
mod tests {
    #[test]
    fn test() {
        let foo = super::Foo(());
        foo.foo();
    }
}

I guess Foo::<()>::bar is not actually generated unless it's called?

@wesleywiser
Copy link
Member

@eggyal Thanks for the repro! This feature is still very much "in progress" so lots of stuff doesn't work correctly. Even still, I think it would be helpful to open bug reports as you find things so that we can track what issues have been resolved and which are still open.

@richkadel
Copy link
Contributor Author

@mati865

I guess Foo::<()>::bar is not actually generated unless it's called?

Yes, I believe that's true. Generics are still just source templates, and don't actually become executable code until instantiated, when used. So the impl block is not instantiated as a whole. The parameterized impl block essentially represents a collection of template functions, inheriting the impls type parameters. Each generic function can be instantiated independently (or not at all, in the case of bar() in your example).

@mati865
Copy link
Contributor

mati865 commented Jul 21, 2020

You probably meant to cc @eggyal

@cuviper cuviper added this to the 1.47.0 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants