Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bazel branch coverage ignores Jacoco filtering #12696

Closed
gergelyfabian opened this issue Dec 14, 2020 · 6 comments
Closed

Bazel branch coverage ignores Jacoco filtering #12696

gergelyfabian opened this issue Dec 14, 2020 · 6 comments
Assignees
Labels
coverage P2 We'll consider working on this in future. (Assignee optional) team-Rules-CPP Issues for C++ rules

Comments

@gergelyfabian
Copy link

Description of the problem / feature request:

Bazel counts branch coverage also for code that would be filtered out by Jacoco.
Function and line coverage (when looking at reports generated by genhtml) seem to be respecting Jacoco filtering.

Jacoco is constantly improving filtering (https://www.jacoco.org/jacoco/trunk/doc/changes.html), but these changes (e.g. filtering out generated bytecode for Java and Kotlin) aren't then taken by Bazel branch coverage.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

  1. Check out https://github.com/gergelyfabian/bazel-scala-example/tree/jacoco_coverage_fix.
  2. In bazel repo: bazel build src/java_tools/junitrunner/java/com/google/testing/coverage:JacocoCoverage_jarjar_deploy.jar
  3. Copy bazel's bazel-bin/src/java_tools/junitrunner/java/com/google/testing/coverage/JacocoCoverage_jarjar_deploy.jar to bazel-scala-example's tools/
  4. Run tools/coverage.sh
  5. You'll see in the coverage report branch coverage for lines that are not covered by line coverage (example-lib/src/main/scala/mypackage/Foo.scala):
      23         [ +  + ]:            :       d <- if (foo.a < 10) {
      24                 :            :         List(4)
      25                 :            :       } else {
      26                 :            :         List(14)
      27                 :            :       }
  1. Branch coverage is 5/30 for example-lib/src/main/scala/mypackage/Foo.scala.

Note: These lines are not covered because of a bug in Jacoco 0.8.3 (that Bazel currently uses), it was fixed in Jacoco 0.8.5.

When in contrast I follow the instructions to upgrade Jacoco in #11674 (comment) (to Jacoco 0.8.3 with the backported Scala 2.12 lambda fix) and then run the above steps the line coverage changes (the above lines 23-27 are now covered) and the branch coverage stays the same for example-lib/src/main/scala/mypackage/Foo.scala (5/30).

Compare with plain Jacoco:

  1. Download http://search.maven.org/remotecontent?filepath=org/jacoco/jacoco/0.8.5/jacoco-0.8.5.zip and http://search.maven.org/remotecontent?filepath=org/jacoco/jacoco/0.8.3/jacoco-0.8.3.zip.
  2. Unpack them into /tmp/jacoco_0.8.3 and /tmp/jacoco_0.8.5.
  3. Install Scala 2.12.12 into ~/opt/scala-2.12.12
  4. Copy bazel-scala-example/example-lib/src/main/scala/mypackage/Foo.scala to /tmp/src/mypackage/Foo.scala
  5. cd /tmp
  6. ~/opt/scala-2.12.12/bin/scalac src/mypackage/Foo.scala -d classes
  7. cd /tmp/jacoco_0.8.3
  8. java -javaagent:lib/jacocoagent.jar -cp ~/opt/scala-2.12.12/lib/scala-library.jar:../classes mypackage.Foo
  9. java -jar lib/jacococli.jar report jacoco.exec --classfiles ../classes --sourcefiles ../src --html report
  10. cd /tmp/jacoco_0.8.5
  11. java -javaagent:lib/jacocoagent.jar -cp ~/opt/scala-2.12.12/lib/scala-library.jar:../classes mypackage.Foo
  12. java -jar lib/jacococli.jar report jacoco.exec --classfiles ../classes --sourcefiles ../src --html report
  13. Compare the reports for Jacoco 0.8.3 and 0.8.5. You'll see that branch coverage is changing and that for 0.8.3 lines that have been filtered aren't counted as branches hit or total.

What operating system are you running Bazel on?

Ubuntu 20.04.

What's the output of bazel info release?

release 3.7.1

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

Replace this line with your answer.

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

https://github.com/gergelyfabian/bazel-scala-example
8a9b58a48b86c456a0593427fbe217d6f193b404
aa8039f593df6628ef749f0c7bcb462976d45419

Have you found anything relevant by searching the web?

No.

Any other information, logs, or outputs that you want to share?

genhtml output segment for Foo.scala with Bazel 3.7.1 default Jacoco (0.8.3):

       5                 :          1 : object Foo {
       6                 :          1 :   val message = "hello world"
       7                 :          1 :   val message1 = 1
       8                 :            : 
       9                 :            :   def testLambdas(a: Int) = {
      10                 :          1 :     println(message1)
      11                 :          1 :     val foo = Foo(a, message, Set.empty[String])
      12         [ +  - ]:          1 :     foo match {
      13         [ +  + ]:          1 :       case Foo(a, _, _) if a > 10 =>
      14                 :          1 :         println("a is bigger than 10")
      15                 :            :       case _ =>
      16                 :          1 :         println("a is not bigger than 10")
      17                 :            :     }
      18                 :          1 :     println(foo)
      19                 :            :     for {
      20                 :          1 :       a <- List(foo)
      21                 :            :       b <- List(2)
      22                 :            :       c <- List(3)
      23         [ +  + ]:            :       d <- if (foo.a < 10) {
      24                 :            :         List(4)
      25                 :            :       } else {
      26                 :            :         List(14)
      27                 :            :       }
      28                 :            :     } yield a.a + b + c + d
      29                 :            :   }
      30                 :            : 
      31                 :            :   def main(args: Array[String]) {
      32                 :          1 :     testLambdas(1)
      33                 :          1 :     testLambdas(11)
      34                 :            :   }
      35                 :          1 : }

With Jacoco upgraded to 0.8.3 with Scala 2.12 lambda fix patch:

       5                 :          1 : object Foo {
       6                 :          1 :   val message = "hello world"
       7                 :          1 :   val message1 = 1
       8                 :            : 
       9                 :            :   def testLambdas(a: Int) = {
      10                 :          1 :     println(message1)
      11                 :          1 :     val foo = Foo(a, message, Set.empty[String])
      12         [ +  - ]:          1 :     foo match {
      13         [ +  + ]:          1 :       case Foo(a, _, _) if a > 10 =>
      14                 :          1 :         println("a is bigger than 10")
      15                 :            :       case _ =>
      16                 :          1 :         println("a is not bigger than 10")
      17                 :            :     }
      18                 :          1 :     println(foo)
      19                 :            :     for {
      20                 :          1 :       a <- List(foo)
      21                 :          1 :       b <- List(2)
      22                 :          1 :       c <- List(3)
      23         [ +  + ]:          1 :       d <- if (foo.a < 10) {
      24                 :          1 :         List(4)
      25                 :            :       } else {
      26                 :          1 :         List(14)
      27                 :            :       }
      28                 :          1 :     } yield a.a + b + c + d
      29                 :            :   }
      30                 :            : 
      31                 :            :   def main(args: Array[String]) {
      32                 :          1 :     testLambdas(1)
      33                 :          1 :     testLambdas(11)
      34                 :            :   }
      35                 :          1 : }
@gergelyfabian
Copy link
Author

gergelyfabian commented Dec 14, 2020

From what I have seen from Jacoco's and Bazel's source:

In Jacoco's ClassAnalyzer (https://github.com/jacoco/jacoco/blob/6bcce6972b8939d7925c4b4d3df785d9a7b00007/org.jacoco.core/src/org/jacoco/core/internal/analysis/ClassAnalyzer.java#L94) addMethodCoverage (https://github.com/jacoco/jacoco/blob/6bcce6972b8939d7925c4b4d3df785d9a7b00007/org.jacoco.core/src/org/jacoco/core/internal/analysis/ClassAnalyzer.java#L114) is called, that does the filtering.

Jacoco uses ClassProbesAdapter with ClassAnalyzer in https://github.com/jacoco/jacoco/blob/6bcce6972b8939d7925c4b4d3df785d9a7b00007/org.jacoco.core/src/org/jacoco/core/analysis/Analyzer.java#L102.

Bazel in contrast uses its own ClassProbesMapper (

public class ClassProbesMapper extends ClassProbesVisitor {
) in BranchDetailAnalyzer (
final ClassProbesAdapter adapter = new ClassProbesAdapter(mapper, false);
) with Jacoco's ClassProbesAdapter.

From what I see the reason could be, that Jacoco's implementation that uses filtering is not re-used by Bazel. I don't yet see where it should be added and would be grateful for any guidance (and happy to contribute). I also don't understand how it's possible that filtering works for line coverage and doesn't work for branch coverage.

@gergelyfabian
Copy link
Author

gergelyfabian commented Dec 16, 2020

I think I found the reason.

final IBundleCoverage bundleCoverage = analyzeStructure();
final Map<String, BranchCoverageDetail> branchDetails = analyzeBranch();
createReport(bundleCoverage, branchDetails);

Bazel creates coverage data with Jacoco for lines (and functions), but uses it's own logic for branch coverage (that is in fact older than Jacoco's filtering functionality: 2016 vs. 2017).

That explains why line and function coverage respects Jacoco filtering, while branch coverage not.

Data is then processed here:

private void processSourceFile(PrintWriter writer, String sourceFile) {
writer.printf("SF:%s\n", sourceFile);
ISourceFileCoverage srcCoverage = sourceToFileCoverage.get(sourceFile);
if (srcCoverage != null) {
// List methods, including methods from nested classes, in FN/FNDA pairs
for (IClassCoverage clsCoverage : sourceToClassCoverage.get(sourceFile).values()) {
for (IMethodCoverage mthCoverage : clsCoverage.getMethods()) {
String name = constructFunctionName(mthCoverage, clsCoverage.getName());
writer.printf("FN:%d,%s\n", mthCoverage.getFirstLine(), name);
writer.printf("FNDA:%d,%s\n", mthCoverage.getMethodCounter().getCoveredCount(), name);
}
}
for (IClassCoverage clsCoverage : sourceToClassCoverage.get(sourceFile).values()) {
BranchCoverageDetail detail = branchCoverageDetail.get(clsCoverage.getName());
if (detail != null) {
for (int line : detail.linesWithBranches()) {
int numBranches = detail.getBranches(line);
boolean executed = detail.getExecutedBit(line);
if (executed) {
for (int branchIdx = 0; branchIdx < numBranches; branchIdx++) {
if (detail.getTakenBit(line, branchIdx)) {
writer.printf("BA:%d,%d\n", line, 2); // executed, taken
} else {
writer.printf("BA:%d,%d\n", line, 1); // executed, not taken
}
}
} else {
for (int branchIdx = 0; branchIdx < numBranches; branchIdx++) {
writer.printf("BA:%d,%d\n", line, 0); // not executed
}
}
}
}
}
// List of DA entries matching source lines
int firstLine = srcCoverage.getFirstLine();
int lastLine = srcCoverage.getLastLine();
for (int line = firstLine; line <= lastLine; line++) {
ICounter instructionCounter = srcCoverage.getLine(line).getInstructionCounter();
if (instructionCounter.getTotalCount() != 0) {
writer.printf("DA:%d,%d\n", line, instructionCounter.getCoveredCount());
}
}
}
writer.println("end_of_record");
}

Branch data is processed at: 145-166.

We can use srcCoverage.getLine(line).getBranchCounter() inside for (int line : detail.linesWithBranches()) to check the branch data generated by Jacoco. One could think about replacing Bazel's own branch coverage data parsing, but it's not necessary to fix this issue. It's enough to check Jacoco's branch counter for total count, and if it's 0, then it means that the branch has been filtered out by Jacoco. Then do not emit branch data from Bazel.

For my case this change fixes the issue.

@gergelyfabian
Copy link
Author

gergelyfabian commented Dec 18, 2020

After thinking about it once more, emitting branch coverage data should be done basing on what Jacoco has provided us (not just checking filtering), as it may happen that Jacoco filters out just part of the branches for a single line.

Updated PR #12710 according to this.

@aiuto aiuto added team-Rules-CPP Issues for C++ rules untriaged labels Jan 7, 2021
@c-mita
Copy link
Member

c-mita commented Jan 11, 2021

Jacoco's branch coverage output reports only the total number of branches in a line, and the number then taken. It does not give any details to individual branches in a line.

Since Bazel attempts to merge coverage reports, we need to be able to identify when individual branches so they can be merged successfully to give a meaningful report.
e.g. a line with 5 branches and two coverage reports that show three taken in one and the other two others in a second, should show all branches taken in the merged report. Jacoco's built in Analyzer class does not facilitate this.

Bazel uses its own BranchDetailAnalyzer to run a second analysis pass, extracting just the branch probe details. Assuming deterministic probe placement, this allows us to map execution information to individual branches across coverage reports.
However, because we are using our Analyzer class, we do not get the internal filtering Jacoco does.

The only "quick" solution I can think of is to omit branches on a line if Jacoco's analyzer says there are zero branches in total, otherwise report all branches.
This only works if all branches on a line have been filtered. I don't really want to do this; it doesn't really solve the problem and can lead to "surprising" results.

Since BranchDetailAnalyzer already uses internal Jacoco classes (e.g. ClassProvesAdapter), it might be able to apply the same filters Jacoco does (see Filters:all). Doing this would also require our own IFilterOutput implementation.
These are internal classes and interfaces; going down this road could make updating the Jacoco version far more difficult if these change, making this a much more expensive approach and I don't believe this is worth it.

@c-mita c-mita added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) and removed untriaged labels Jan 11, 2021
@gergelyfabian
Copy link
Author

The only "quick" solution I can think of is to omit branches on a line if Jacoco's analyzer says there are zero branches in total, otherwise report all branches.
This only works if all branches on a line have been filtered. I don't really want to do this; it doesn't really solve the problem and can lead to "surprising" results.

I also considered this solution in #12710 and dropped it for the same reason.

Since BranchDetailAnalyzer already uses internal Jacoco classes (e.g. ClassProvesAdapter), it might be able to apply the same filters Jacoco does (see Filters:all). Doing this would also require our own IFilterOutput implementation.
These are internal classes and interfaces; going down this road could make updating the Jacoco version far more difficult if these change, making this a much more expensive approach and I don't believe this is worth it.

I think that Bazel is already quite much dependent on Jacoco, e.g. updating to Jacoco 0.8.5 (or higher) will need some changes in Bazel itself. So if Bazel is trying to provide an alternate implementation for branch coverage, and already uses internal Jacoco classes like ClassProvesAdapter, then I think it should also provide its own IFilterOutput implementation and do filtering appropriately. I'd prefer this solution than ignoring filtering at all. If Bazel ignores Jacoco filtering, then coverage data will diverge more and more from an original Jacoco report, as Jacoco adds more and more filtering. Kotlin is one good example, but even for Java there is some filtering logic already. My personal motivation is Scala, where I'm using a fork so far to add filtering for Scala code (a lot needed).

@c-mita c-mita added P2 We'll consider working on this in future. (Assignee optional) and removed P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) labels Aug 17, 2021
@c-mita c-mita self-assigned this Aug 17, 2021
@c-mita
Copy link
Member

c-mita commented Aug 23, 2021

I've done some more investigation into this and have made promising progress.

  • Jacoco's filters operate on AbstractInsnNode objects (https://asm.ow2.io/javadoc/org/objectweb/asm/tree/AbstractInsnNode.html). Our branch analysis doesn't deal with these, only interesting itself with Jacoco's internal Instruction objects.
  • We can track these objects in the same way Jacoco's regular analysis does as we walk the ASM tree, map them to Instruction objects, pass them to the Filter set, and record which are to be ignored.
  • When we process the results to generate branch informations from the Instruction list, we skip over any who's corresponding AbstractInsnNode is ignored.

I don't grok what's going on MethodProbesMapper::visitEnd, but I think skipping over Instructions that the filters have determined to be "ignored" is safe.

Note that Filters have three potential output calls to IFilterOutput:

  • IFilterOutput::ignore
  • IFilterOutput::merge
  • IFilterOutput::replaceBranches

It isn't clear to me yet what needs to be done to support more than just ignore. However, this would cover most cases as the others are used by very few filters.

I have a WIP branch here: https://github.com/c-mita/bazel/tree/jacoco_filter

Non-Scala/Kotlin demonstrations of the problem of extraneous branches can be demonstrated in Java with string switches:

public int foo(String myString) {
  switch (myString) {
    case "AA":
      return 1;
    case "BB":
      return 2;
    default:
      return -1;
  }
}

The "expected" number of branches here is 3, however the java compiler will emit more than double this number. Jacoco normally filters these extra branches.

@c-mita c-mita linked a pull request Aug 24, 2021 that will close this issue
bazel-io pushed a commit that referenced this issue Sep 7, 2021
Jacoco applies a set of filters during its coverage analysis to remove
various compiler constructs that could lead to surprising outputs in
coverage results, leaving behind coverage data that closer fits the
code given to the compiler.

A simple java example is the function:

```
public int foo(String input) {
  switch (input) {
    case "A":
      return 1;
    case "B":
      return 2;
    case "C":
      return 3;
    default:
      return -1
  }
}
```

This would generate at least double the number of expected branches as
the compiler generates a second set of branches to operate on hashCode.
Jacoco would normally ignore the first set of branches, reducing the
number to the expected 4.

Because Bazel applies its own branch analysis step, Bazel's branch
coverage output does not benefit from this filtering.

This change adapts the custom analyzer to record the necessary
information during the ASM Tree walk in the same manner as Jacoco's
regular analyzer in order to apply the filters.

The filters have three output operations:

 * ignore - ignore a range of instructions
 * merge - merges the coverage information of two instructions
 * replaceBranches - retarget the output branches of an instruction

The first of these, ignore, is by far the simplest to implement and
covers the vast majority of use cases in Java.

By mapping the AbstractInsnNode objects recorded during the ASM Tree
walk to the Instruction objects used during branch analysis, we can
simply skip over Instruction objects that have had their associated
AbstractInsnNode object ignored by a filter.

The remaining operations, merge and replaceBranches, are left
unimplemeted for now; these require more care to handle, are harder
to test, and affect only a minority of cases, specifically:

 * merge is only used to handle duplication of finally blocks
 * replaceBranches is used by Kotlin and Eclipse compiler filters

Part of #12696

Closes #13896.

PiperOrigin-RevId: 395235432
bazel-io pushed a commit that referenced this issue Sep 14, 2021
Instructions that are indicated to be merged must have their corresponding
coverage expressions (`CovExp`) combined in a pairwise fashion.

e.g. A pair of branch expressions that should be merged:
`BranchExp(a, b)`, `BranchExp(c, d)`

Becomes:

`BranchExp(a | c, b | d)`

This is distinct from the previous merge operation on BranchExp which
simply joined the branches of the two objects (this has been renamed to
concatenate).

Part of #12696.

Closes #13972.

PiperOrigin-RevId: 396573429
c-mita added a commit to c-mita/bazel that referenced this issue Mar 22, 2022
These were added during bazelbuild#12696.

The integration tests in question depend on the pre-built version of
java_tools and so would have started to fail when the tests were updated
for the new functionality added.

Now java_tools has been updated (several times), it is safe to allow
automated running of them once again.
bazel-io pushed a commit that referenced this issue Mar 23, 2022
These were added during #12696.

The integration tests in question depend on the pre-built version of
java_tools and so would have started to fail when the tests were updated
for the new functionality added.

Now java_tools has been updated (several times), it is safe to allow
automated running of them once again.

Closes #15100.

PiperOrigin-RevId: 436702004
gergelyfabian added a commit to gergelyfabian/rules_scala that referenced this issue Jun 1, 2022
Upgraded jacoco package name for Bazel 5.0+.

Upgraded jacoco from 0.8.3 to 0.8.6.

Jacoco upgrade to 0.8.6 was made in Bazel in:
bazelbuild/bazel#11674
bazelbuild/bazel@cb7c1a2

Removed options for workarounds where those have been fixed in the meantime:

Bazel's handling for branch coverage was fixed in:
bazelbuild/bazel#12696
gergelyfabian added a commit to gergelyfabian/rules_scala that referenced this issue Jun 1, 2022
Upgraded jacoco package name for Bazel 5.0+.

Upgraded jacoco from 0.8.3 to 0.8.6.

Jacoco upgrade to 0.8.6 was made in Bazel in:
bazelbuild/bazel#11674
bazelbuild/bazel@cb7c1a2

Removed options for workarounds where those have been fixed in the meantime:

Bazel's handling for branch coverage was fixed in:
bazelbuild/bazel#12696

Instead of changing the existing script for building Jacoco, added a new
one that can be used for Bazel 5.0.
simuons pushed a commit to bazelbuild/rules_scala that referenced this issue Jun 6, 2022
* JacocoRunner script: update for Bazel 5.0+

Upgraded jacoco package name for Bazel 5.0+.

Upgraded jacoco from 0.8.3 to 0.8.6.

Jacoco upgrade to 0.8.6 was made in Bazel in:
bazelbuild/bazel#11674
bazelbuild/bazel@cb7c1a2

Removed options for workarounds where those have been fixed in the meantime:

Bazel's handling for branch coverage was fixed in:
bazelbuild/bazel#12696

Instead of changing the existing script for building Jacoco, added a new
one that can be used for Bazel 5.0.

* build_jacocorunner_bazel_5.0+: update bazel tag to avoid error for Java 17

bazelbuild/bazel#15081 fixed a bug when using
coverage with Scala targets on Java 17.
Use the tag 6.0.0-pre.20220520.1 where this was fixed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
coverage P2 We'll consider working on this in future. (Assignee optional) team-Rules-CPP Issues for C++ rules
Projects
None yet
3 participants