MIR InstrumentCoverage - Can spans for `TerminatorKind::Goto` be improved to avoid special cases? #78542

richkadel · 2020-10-29T21:37:09Z

Rust's LLVM InstrProf-based source code coverage implementation instruments Rust code via the MIR pass InstrumentCoverage. Most criteria for identifying coverage regions and counter locations are very general, based on Control Flow Graph (CFG) analysis of the MIR, and a fairly straightforward mapping of MIR Statements and Terminators to their source code regions (Spans).

TerminatorKind::Gotos are an exception, requiring special handling.

This issue is created to highlight some of the unique requirements and issues addressed in the current coverage implementation, in case someone has ideas for improving things, to reduce the reliance on the Goto-specific logic, either by improving InstrumentCoverage if something was overlooked, or improving the Goto representation (such as refining its Span representation, or providing additional context that InstrumentCoverage might leverage).

Current State

One of the first steps in the InstrumentCoverage process is to extract relevant code Spans from the MIR Statements and Terminators. (These Spans are later combined into sets of sequential statements and with contiguous source code regions that can be counted via a single counter; i.e., if any statement in the set was executed, all statements in the same set would also have been executed.)

bcb_to_initial_coverage_spans() iterates through the BasicBlocks of the CoverageGraph (a subset of the MIR, essentially skipping panic/unwind paths), and their Statements and Terminators. Some Statements and Terminators are relevant to Coverage, and others are not. The Statement and Terminator filtering is handled by filtered_statement_span() and filtered_terminator_span(), respectively.

In almost all cases, if not filtered out, the initial coverage Span contributed by either a Statement or a Terminator is the source_info.span (within the function body) of the Statement or Terminator; because, in most cases, the source code span carried forward from the parsed source to its MIR representation is a fairly accurate mapping from intent to execution.

For example, filtered_terminator_span() uses the entire source_info.span for the following TerminatorKinds:

fn filtered_terminator_span(terminator: &'a Terminator<'tcx>, body_span: Span) -> Option<Span> {
    match terminator.kind {
...
        // Retain spans from all other terminators
        TerminatorKind::Resume
        | TerminatorKind::Abort
        | TerminatorKind::Return
        | TerminatorKind::Call { .. }
        | TerminatorKind::Yield { .. }
        | TerminatorKind::GeneratorDrop
        | TerminatorKind::FalseUnwind { .. }
        | TerminatorKind::InlineAsm { .. } => {
            Some(function_source_span(terminator.source_info.span, body_span))
        }

All other TerminatorKinds are filtered out, except for Goto.

Goto terminators play an important role in the control flow, so they cannot be filtered out, but their source_info.span typically includes the Spans of the statements that precede it, making the Span redundant, in most cases.

One example: `Goto`s are often the targets of `SwitchInt` branches, and certain important optimizations to replace some `Counter`s with `Expression`s require a separate `BasicCoverageBlock` for each branch, to support the `Counter`, when needed.

Since a Goto-based CoverageSpan still needs a span to indicate if a region of actual source code was executed or not, the span returned from filtered_terminator_span(), for Gotos, is an empty span, positioned at the Goto span's last byte position:

TerminatorKind::Goto { .. } => {
    Some(function_source_span(terminator.source_info.span.shrink_to_hi(), body_span))
}

This byte position can--most often--be leveraged to contribute to a CoverageSpan for certain execution branches.

For example, an if block without an else shows the block was executed if the condition was true, but there would be no way to indicate coverage (or lack thereof) of the false branch without using the associated Gotos hi() byte position (which is expanded by one character to the left, for a non-empty CoverageSpan.

However, in other cases, a visible CoverageSpan is not wanted, but the Goto
block must still be counted (for example, to contribute its count to an Expression
that reports the execution count for some other block). In these cases, the code region
is set to None.

This decision (whether to include a one-character coverage span for a Goto or to count a Goto block without a code region) is handled in inject_coverage_span_counters(), beginning with the call to is_code_region_redundant(), which encapsulates the decision on how to handle these special cases.

At the time of this writing, the decision criteria is only looking for Goto terminators with spans that end at the last byte position in the file, because these Goto spans--if present--are redundant with the spans from every function's final Return terminator. When they are present, they can cause the function's last line to appear to have been executed twice, when it was only executed once.

The text was updated successfully, but these errors were encountered:

nagisa · 2020-10-29T21:52:01Z

All other TerminatorKinds are filtered out, except for Goto.

I'm having trouble following this. Are you saying that we do not record coverage information on spans attached to those terminators (and for Goto we have special cases to record some information from them) or is it vice versa (which is what the code seems to suggest to me)?

Includes a change to the implicit else span in ast_lowering, so coverage of the implicit else no longer spans the `then` block. Adds coverage for unused closures and async function bodies. Fixes: rust-lang#78542

richkadel · 2020-11-29T16:38:07Z

@nagisa Sorry for the long delay. I missed your original comment. This issue will be closed if/when the changes I just made in PR #79109 merges.

But to answer your question, the TerminatorKinds not included in Coverage had nothing to contribute to coverage. The actual code spans are covered by other TerminatorKinds.

I found a way to remove the TerminatorKind::Goto completely, with less special case code, and still get even better coverage than I had with it. (The special case caused more problems than it solved.)

richkadel mentioned this issue Oct 29, 2020

Working expression optimization, and some improvements to branch-level source coverage #78267

Merged

camelid added the A-mir Area: Mid-level IR (MIR) - https://blog.rust-lang.org/2016/04/19/MIR.html label Oct 29, 2020

richkadel mentioned this issue Nov 17, 2020

Tracking Issue for LLVM InstrProf code coverage (current flag: -Zinstrument-coverage) #79121

Closed

3 tasks

bors closed this as completed in def932c Dec 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MIR InstrumentCoverage - Can spans for `TerminatorKind::Goto` be improved to avoid special cases? #78542

MIR InstrumentCoverage - Can spans for `TerminatorKind::Goto` be improved to avoid special cases? #78542

richkadel commented Oct 29, 2020

nagisa commented Oct 29, 2020

richkadel commented Nov 29, 2020

MIR InstrumentCoverage - Can spans for TerminatorKind::Goto be improved to avoid special cases? #78542

MIR InstrumentCoverage - Can spans for TerminatorKind::Goto be improved to avoid special cases? #78542

Comments

richkadel commented Oct 29, 2020

Current State

nagisa commented Oct 29, 2020

richkadel commented Nov 29, 2020

MIR InstrumentCoverage - Can spans for `TerminatorKind::Goto` be improved to avoid special cases? #78542

MIR InstrumentCoverage - Can spans for `TerminatorKind::Goto` be improved to avoid special cases? #78542