-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Invalid LLVM coverage data produced when compiled with -C opt-level=1
#82144
Comments
Another thing I just noticed is that the problematic functions (ie |
@catenacyber - Thanks for filing this issue. I was able to build and test your example, and I can reproduce the problem. Just to be clear, this is not exactly a bug. Due to certain requirements of LLVM's InstrProf coverage, some compiler (and/or linker) optimizations are not compatible. But coverage instrumentation is not expected to be used in production software, so limiting coverage instrumentation to non-optimized code is acceptable. I may be able to issue a compiler warning when a But I'm concerned that this issue does not really get to the real problem. As I understand it from your last couple of postings on google/oss-fuzz#4697, you still have not figured out why your program is failing when coverage is enabled and you have set Can you provide a minimal example and steps to reproduce the problem where you have set |
Thanks @richkadel I also think that I will work on minimizing the example and steps to reproduce the problem with |
@wesleywiser @tmandry - I've been thinking about how to reduce friction for cases like the one that prompted this issue (google/oss-fuzz#4697). Looking at that pull request, and seeing the challenges that @catenacyber has had, trying to disable optimizations, I think this must be a case where another tool is wrapping invocations of the Rust compiler, and forcing certain flags on it. It seems (in this case) possible to add We know that I think we should flag this combination as a compiler error (not a warning), both in command line option processing, and at any point downstream that an incompatible option might be enabled (for example, I believe that when LTO is enabled, the compiler also enables This way, the problem @catenacyber is having would manifest itself at the point of actually trying to compile the program, which should help @catenacyber debug the root cause. I don't think the error would ever be a "bad" thing, because the developer can avoid the error by not adding In both cases, this approach ensures the developer is in control. The other option would be to automatically disable optimizations when WDYT? |
I now think I have indeed 2 problems. The second one, I can reproduce with suricata
Then running llvm-cov, I get So, I will investigate further the second problem (I did not manage to have a reproducer with Rust only yet) and create another issue about it |
Are you trying to generate coverage from both C++ and Rust in the same binary? This has not been tried. I think it should work if you are using LLVM 11. I'm not sure about LLVM 12. Clang MUST generate Coverage Mapping Format version 4. Note that there are multiple new version in the works, so generating both Rust and C/C++ coverage in the same binary is going to be difficult to keep working as versions progress. You may want to consider instrumenting a binary with only Rust coverage, and a separate binary with only C/C++. This recommendation is new, based on conversations I just had yesterday (Feb 22), but it is probably good advice. You are breaking new ground by attempting to generate coverage from the same binary from two different languages. Note, your C/C++ flags also have coverage options and optimization options (like |
It sets
And it works for my ecc-diff-fuzzer (where there is only a little part of Rust)
I will try that again
Version 5 seems only an extension, so it should not be incompatible, or am I missing something ?
It looks like when I try to instrument only rust for coverage (ie setting RUSTFLAGS, but not CFLAGS), running my compiled binary does not produce any default.profraw file TL;DR |
Still getting |
I need to correct at least one thing I said here. I spoke to a colleague that is more familiar with Clang and LLVM, and was told that Clang's implementation of LLVM InstrProf coverage is compatible with at least some of Clang's optimization levels. (Clang -O1 and -O2 at least, maybe higher.) It's not clear to me if Clang's -O1 is exactly the same as Rust's -O1. (I doubt they are required to be identical, but I'm not sure how similar they are, particularly when it comes to correlating optimization levels to LLVM passes and options.) But it's reasonable to believe that it's possible for the Rust Compiler to support some optimization levels with coverage instrumentation at the same time. The sample in this issue shows that is not the case, or at least, not always the case. The current workaround is to disable optimizations when instrumenting for coverage. In most cases, coverage testing is not used in production, so some performance degradation is not going to be a major sacrifice (generally). We can leave this issue in place, in case someone has ideas on how to expand support for coverage to work with some optimization levels. (Just setting expectations though: this is not currently a priority feature request, and AFAIK, no one is actively working on it.) Here is something I did discover though. I compiled the sample, with and without $ grep 'pgocount = load' $(find target -name '*.ll') | wc -l
199
$ grep 'pgocount = load' $(find target_with_O1 -name '*.ll') | wc -l
169 30 counters are no longer present in LLVM IR. This lines up with what @catenacyber saw as well. Something, either in Rust or due to Rust-specific LLVM passes or options, is removing counters and are probably corrupting the coverage counts expected in the coverage map. |
I think having an external tool that invokes rustc for you, providing the correct flags and then potentially runs tests and collects the coverage data for you would be great! |
FYI, linking instrumented Rust and C/C++ in the same binary is likely to fail (or it could work now but break in the future.) The reason is that the separate compilers each expect their own version of the runtime instrumentation library, but only one can be loaded in the binary. There are no compatibility guarantees between versions. |
@wesleywiser @tmandry - Before I put together a PR, I'd like to make sure you agree with it and will approve it. tl;dr - I believe issuing an error when As I see it, we have three options:
We could also add one more flag/option to override the error, so if someone really wants to try combining them, they still can, but they must proactively enable it. Either way, the error means there are no surprises, and it won't be possible to build a program that might "just happen to work" with the combination one day, but fail to compile after some small change in their program that triggers the coverage failure. |
Thanks @tmandry |
@catenacyber - Just to be clear, @tmandry is explaining a case that is known to fail, but the explanation does not state what is known to work. Until we have evidence that someone has tried generating coverage reports from both Rust and C++, we don't know for certain if there is a configuration that will work. But known constraints are:
I don't know if there are other dependencies, but given the issues with optimization levels (discussed in this issue) and the observation that optimization levels (and other compiler options, probably) can change LLVM IR in ways that may affect coverage, there may also be limitations on what compiler flags are allowed, for both Rust and Clang. If all of these things are done right (and if I haven't missed anything), then it might work; but it would be cool to see a non-trivial example that does work, to know for sure. Sadly, even if it works today, I know there are multiple new versions of Coverage Mapping Format in the pipeline that Clang will probably support, and it may be non-trivial to add them to Rust as well. It will be difficult to keep Rust in sync with both the LLVM version and the coverage map version, and still take advantage of new releases of the Rust compiler. That's why I'm starting to believe the best practice may have to be to generate coverage reports for each language separately, from the same binary; which means running the program or tests twice. |
I spoke to both @wesleywiser and @tmandry offline, and both agreed with this plan (including the additional flag to override the error, which will be useful for rustc developers in diagnosing this kind of issue). And I see @catenacyber also gave a thumbs up. I will move forward with a PR for this. I plan to split out a separate issue that summarizes the feature request to support some The compiler error message will reference that issue, so when someone encounters the error, they will know why the flag combination is not supported, and what can be done, if desired. When the PR to add the error lands, we'll close this issue (#82144), and the feature request will remain open for now. |
That works for my ecc-diff-fuzzer project (which is smaller) With Suricata, I currently get llvm-cov Any more insight on how to debug this ?
I do not manage to get Rust-only coverage from my binary (ie it does not produce default.profraw file) when I set my RUSTFLAGS but no CFLAGS/CXXFLAGS |
This bug does not show up with So, as stated in google/oss-fuzz#4697 (comment) So a regression took place between 5be3f9f and 3ff10e7 for coverage of optimized code cf this report Furthermore, since this comment, Suricata added a new dependency with the brotli crate which confuses coverage, but this is another bug report #82875 |
Thanks for recognizing that this was a regression and for bisecting to find the exact date of the change! #79109 landed on 12-3-2020. As you said, that was a big feature commit to complete almost all of the planned capabilities. But I think I can narrow down the likely problem code once I get into it. I should be able to add a new test or two, and try disabling change(s) implemented in this commit to see if the tests work without some feature. Then I'll need to devise a better solution to whatever feature had to be disabled to make it work. I don't have the time to do this right now, so I'm glad you found a workaround for Suricata! But I will try to find the time in the next few weeks. |
Note to self, notable changes in that PR:
|
I tried a quick patch to remove |
OK, good to know. |
…=tmandry coverage bug fixes and optimization support Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to address multiple, somewhat related issues. Fixed a significant flaw in prior coverage solution: Every counter generated a new counter variable, but there should have only been one counter variable per function. This appears to have bloated .profraw files significantly. (For a small program, it increased the size by about 40%. I have not tested large programs, but there is anecdotal evidence that profraw files were way too large. This is a good fix, regardless, but hopefully it also addresses related issues. Fixes: rust-lang#82144 Invalid LLVM coverage data produced when compiled with -C opt-level=1 Existing tests now work up to at least `opt-level=3`. This required a detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR when compiled with coverage, and a lot of trial and error with codegen adjustments. The biggest hurdle was figuring out how to continue to support coverage results for unused functions and generics. Rust's coverage results have three advantages over Clang's coverage results: 1. Rust's coverage map does not include any overlapping code regions, making coverage counting unambiguous. 2. Rust generates coverage results (showing zero counts) for all unused functions, including generics. (Clang does not generate coverage for uninstantiated template functions.) 3. Rust's unused functions produce minimal stubbed functions in LLVM IR, sufficient for including in the coverage results; while Clang must generate the complete LLVM IR for each unused function, even though it will never be called. This PR removes the previous hack of attempting to inject coverage into some other existing function instance, and generates dedicated instances for each unused function. This change, and a few other adjustments (similar to what is required for `-C link-dead-code`, but with lower impact), makes it possible to support LLVM optimizations. Fixes: rust-lang#79651 Coverage report: "Unexecuted instantiation:..." for a generic function from multiple crates Fixed by removing the aforementioned hack. Some "Unexecuted instantiation" notices are unavoidable, as explained in the `used_crate.rs` test, but `-Zinstrument-coverage` has new options to back off support for either unused generics, or all unused functions, which avoids the notice, at the cost of less coverage of unused functions. Fixes: rust-lang#82875 Invalid LLVM coverage data produced with crate brotli_decompressor Fixed by disabling the LLVM function attribute that forces inlining, if `-Z instrument-coverage` is enabled. This attribute is applied to Rust functions with `#[inline(always)], and in some cases, the forced inlining breaks coverage instrumentation and reports. FYI: `@wesleywiser` r? `@tmandry`
Thanks very much @richkadel |
No, I didn't test that. Fingers crossed, but I suspect it will work now. |
now that rust-lang/rust#82144 got fixed
Now that rust-lang/rust#82144 got fixed
* Adds structure-aware target for suricata * Remove rustc wrapper for suricata now that rust-lang/rust#82144 got fixed * Remove suricata rust workarounds for coverage Now that rust-lang/rust#82144 got fixed
@catenacyber - Did you propose or submit a patch to upstream LLVM to change how it handles missing functions (to make it a warning but not an error)? Reading the comments on various PRs and Issues, it looks like you did not follow through with this, but I'm considering making that change to I'm still seeing this error triggered, very infrequently, and I'm not sure under what circumstances. The current implementation is (IMO) unnecessarily brittle and disruptive (throwing out the entire coverage report when it's only missing one function name). If you have a patch in progress, can you send me the LLVM link? Or if not, do you still have the code for a proposed patch? I noticed that if you return a static string, and there is only one missing function, the coverage results still show up, so you basically get a valid coverage report anyways. But with a static string, if there are more than one missing function, only one has coverage, and the rest appear uncovered, because multiple functions share the same fallback name. So a solutions would be to generate a new string for each missing function (which requires allocating them somewhere). Definitely fixable, but not as simple as returning a simple static string. If you already implemented a fix like that, I'd love to move that forward. If not, I will start a new patch. Let me know. Thanks! |
I did not
You are welcome to do it.
Indeed, maybe the coverage about the whole static library/rust crate is wrong, but the other ones look correct. |
Thank you! I just wanted to make sure I wasn't duplicating effort. Thanks so much for your time and effort helping to find this root cause. It meant a lot! |
I tried this code:
compiled with
RUSTFLAGS="-Zinstrument-coverage -C opt-level=1"
and Cargo.toml having
I expected to see this happen:
Running
llvm-profdata merge -j=1 -sparse
, then llvm-cov should give me a coverage reportInstead, this happened:
llvm-cov fails with `Failed to load coverage: Malformed instrumentation profile data
When building with
-C opt-level=0
, the buggy behavior does not show up with this reproducerMeta
rustc --version --verbose
:Backtrace
There is no backtrace here as the failure happens with llvm-cov
I used the following clang
Using this patch in clang
I get more information with
./bin/llvm-cov show -name-regex="lol" -instr-profile=dump.profdata -object=/path/to/rustbinary
It shows the pretendedly unnamed function code which can be found here https://github.com/rust-num/num-integer/blob/master/src/roots.rs#L174
cc @richkadel
Original discussion comes from google/oss-fuzz#4697
The text was updated successfully, but these errors were encountered: