Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-17479 Relocate junit XML files [2/n] #17098

Merged
merged 6 commits into from
Sep 5, 2024

Conversation

mumrah
Copy link
Contributor

@mumrah mumrah commented Sep 4, 2024

In #17066, we fixed caching for ":jar" and ":test" tasks. However, now in PRs, the test results will be restored as part of the Gradle cache resolution. This means tasks which show as FROM-CACHE and are not run will still have test results in their build directory. To avoid incorrectly reporting these results in the job summary, this patch uses a doLast task handler to relocate JUnit XML files into a new directory.

Another option would be to move the test summary logic from junit.py into a Gradle task that can properly hook into the Gradle lifecycle. However, this approach seems to work and is less complex.

@mumrah mumrah added the build Gradle build or GitHub Actions label Sep 5, 2024
@mumrah mumrah force-pushed the gh-KAFKA-17479-relocate-xml-files branch from 578b7c2 to 841c5ad Compare September 5, 2024 14:37
@mumrah
Copy link
Contributor Author

mumrah commented Sep 5, 2024

I expected this PR to skip tests due to the caching. However, since this patch adds a doLast action to all the :test tasks, the build graph is modified which affects cache keys. I verified this locally be comparing the cache keys for a :createVersionFile task between trunk and this PR (with and without the doLast change).

So, unfortunately, splitting this into two PRs was kind of pointless :)

@mumrah mumrah requested a review from chia7712 September 5, 2024 16:20
Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the build graph is modified which affects cache keys

I open PR: #17103 to test the cache keys, and the cache key is the same but gradle still assume "input changes".

Thu, 05 Sep 2024 15:43:38 GMT > Task :clients:compileTestJava
Thu, 05 Sep 2024 15:43:38 GMT Build cache key for task ':clients:compileTestJava' is 8244edcbc5934a166d9cba86d98c5b5a
Thu, 05 Sep 2024 15:43:38 GMT Task ':clients:compileTestJava' is not up-to-date because:
Thu, 05 Sep 2024 15:43:38 GMT   No history is available.
Thu, 05 Sep 2024 15:43:38 GMT The input changes require a full rebuild for incremental task ':clients:compileTestJava'.

the key 8244edcbc5934a166d9cba86d98c5b5a should be cached by last run of CI (https://ge.apache.org/s/t3wfrt2toiolg/timeline?details=jlwllx4wxhpp4&expanded=WyIxIl0&page=2)

I don't catch the root cause. @mumrah Do you have any idea?

.github/workflows/build.yml Outdated Show resolved Hide resolved
@mumrah
Copy link
Contributor Author

mumrah commented Sep 5, 2024

@chia7712 Looking at https://github.com/apache/kafka/actions/runs/10723681373/job/29737573604?pr=17103

we see a cache miss of :clients:compileTestJava

> Task :clients:compileTestJava
Build cache key for task ':clients:compileTestJava' is 8244edcbc5934a166d9cba86d98c5b5a
Task ':clients:compileTestJava' is not up-to-date because:
  No history is available.
The input changes require a full rebuild for incremental task ':clients:compileTestJava'.
Compilation mode: in-process compilation
Full recompilation is required because no incremental change information is available. This is usually caused by clean builds or changing compiler arguments.
Compiling with toolchain '/usr/lib/jvm/temurin-17-jdk-amd64'.
Compiling with JDK Java compiler API.

if we look at the setup-gradle output, we can find the build cache that was restored

https://github.com/apache/kafka/actions/runs/10723681373/job/29737573604?pr=17103#step:3:80

  Restored Gradle User Home from cache key: gradle-home-v1|Linux-X64|test[188616818c9a3165053ef8704c27b28e]-887b947d8a56cf9d9ef46756657e96f41d669555

the last bit of the cache key is the SHA which produced that cache entry, so 887b947. If your PR is based off trunk HEAD at the moment, that means you probably have these two commits:

which would explain the cache miss.

These kinds of cache misses are expected since the cache will lag behind trunk slightly. If this is causing a lot of cache misses, then I think we could mitigate it by using a floating git tag that points to the latest SHA of the build cache. Then PRs could branch off that tag rather than HEAD.

@mumrah mumrah requested a review from chia7712 September 5, 2024 16:54
Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM and thanks for detailed explanation!

@mumrah mumrah merged commit 84aa5d7 into trunk Sep 5, 2024
4 of 7 checks passed
@mumrah mumrah deleted the gh-KAFKA-17479-relocate-xml-files branch September 5, 2024 17:50
mumrah added a commit that referenced this pull request Sep 6, 2024
The ignoreFailures property was removed in #17066 to prevent test failures from being cached. However, this breaks the JUnit report and makes the github workflow less user friendly.

The problem is that we are copying the junit test report files into a new directory (added in #17098) in a Gradle doLast closure. If we don't run with ignoreFailures=true, then this closure will not run and the test failures won't be processed by junit.py.

This patch adds logic to ensure the doLast closure of :test is always run. The user provided -PignoreFailures is still honored for the test tasks so local developer workflows should not be disturbed.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Gradle build or GitHub Actions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants