Run some tests less frequently #9487
Conversation
https://app.launchableinc.com/organizations/jenkins/workspaces/jenkins/insights/unhealthy-tests shows that these tests have never failed in the 12 months that they have been monitored by Launchable. Since they have never failed, it seems safe to run them less frequently. This implementation of "less frequently" runs the tests 1 of 15 times on the master branch, assuming we average 2 builds per day throughout the week with 3 configurations in every build. 14 jobs per week at 3 configurations per job should run the tests on the ci.jenkins.io master branch at least 3 times each week.
Tests are only skipped if the JENKINS_URL environment variable contains the string "ci.jenkins.io"; developer execution of the tests is not changed. If a branch other than master is being tested, the tests run 1 of 3 times. That will usually run each test at least once on each ci.jenkins.io pull request job and on each ci.jenkins.io LTS job, because those jobs test 3 configurations.
Considering how reliable the tests are, we might consider running them even less, but this is a start. The Launchable summary suggests that this will save us as much as 3000 CPU minutes per week.
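A sketch of what such a gate might look like. The names runTestSometimes and NEVER_FAILING_TEST appear in this pull request's diff, but the body below is a reconstruction from the description above, not the actual implementation; the explicit parameters are added here only to make the selection logic easy to test.

```java
import java.util.concurrent.ThreadLocalRandom;

public class TestFrequencySketch {

    static final String NEVER_FAILING_TEST = "never failed in 12 months of monitoring";

    /* Reconstruction from the PR description, not the PR's actual code.
     * Returns true when the test should run. */
    static boolean runTestSometimes(String jenkinsUrl, String gitBranch, long sample) {
        if (jenkinsUrl == null || !jenkinsUrl.contains("ci.jenkins.io")) {
            return true; // developers and other CI servers always run the test
        }
        boolean master = "master".equals(gitBranch) || "origin/master".equals(gitBranch);
        int modulus = master ? 15 : 3; // 1 of 15 on master, 1 of 3 on other branches
        return sample % modulus == 0;
    }

    /* Entry point as it would be called from a test body, passing real
     * environment values and a random sample. */
    static boolean runTestSometimes(String testLabel) {
        return runTestSometimes(System.getenv("JENKINS_URL"),
                System.getenv("GIT_BRANCH"),
                ThreadLocalRandom.current().nextLong(Long.MAX_VALUE));
    }
}
```

A test body would then begin with assumeTrue(runTestSometimes(NEVER_FAILING_TEST)), so a skipped run is reported as a JUnit assumption violation rather than a failure.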
Better to limit tests on all CI servers instead of only on certain CI servers.
Constant value tends to skip every test in a run
Remember when a Surefire(?) bug caused an XSS test to show success instead of failure due to improper handling of parameterized tests? I'd rather not risk a repeat.
@@ -178,6 +181,7 @@ public class DefaultCrumbIssuerTest {
     @Issue("SECURITY-626")
     @WithTimeout(300)
     public void crumbOnlyValidForOneSession() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
Thanks for the comment and thanks for the pointer to a specific case where a technique like this caused serious problems. It is good for me to know that this is not a workable technique.
Is there any technique you can envision that would be acceptable to reduce the frequency of running the tests? I'm fine if there is not, but am curious if you can see any path that would allow us to run these tests less frequently while still running them sometimes.
Is there any technique you can envision that would be acceptable to reduce the frequency of running the tests?
Skip on master specifically when running on ci.j.io? PRs still run them (even if unrelated), and cert.ci still runs.
Skip on master specifically when running on ci.j.io? PRs still run them (even if unrelated), and cert.ci still runs.
Thanks for the clarification. I'll implement that. That has the benefit that release.ci.jenkins.io
will also still run those tests and provide one more safeguard that a regression has not been introduced.
@@ -82,6 +86,7 @@ public void encryptedValueStaysTheSameAfterRoundtrip() throws Exception {
     @Issue("SECURITY-304")
     @LocalData
     public void canReadPreSec304Secrets() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
I mistakenly thought that the "Please no" applied to all the changes in the pull request, but now that I read more carefully, I see that it applies to 4 tests that are specifically validating security fixes. I'll remove the changes to tests related to security. An undetected security regression is much, much more expensive than running the tests each time.
Thanks again for the review.
Right, I specifically request that security-related tests be exempted. Regular functional regressions being caught with 90+% likelihood seems mostly fine if the savings here are significant enough.
@@ -55,6 +58,7 @@ public class SecretCompatTest {
     @Test
     @Issue("SECURITY-304")
     public void encryptedValueStaysTheSameAfterRoundtrip() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
@@ -44,6 +47,7 @@ public class PasswordParameterDefinitionTest {
     @Rule public JenkinsRule j = new JenkinsRule();

     @Test public void defaultValueKeptSecret() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
@@ -576,6 +579,7 @@ public boolean delete() {
     @Test
     @Issue("SECURITY-904")
     public void symlink_outsideWorkspace_areNotAllowed() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
Please take a moment and address the merge conflicts of your pull request. Thanks!
Continues to run the tests on pull request branches, in developer environments, and on other CI servers like release.ci.jenkins.io and cert.ci.jenkins.io. Runs the tests approximately 1 of 8 times on the ci.jenkins.io master branch as well. In the last 4 weeks, we've had 7.5 builds of the master branch completed each week. Running the tests 1 of 8 times should provide about 3 runs per week, since we run 3 configurations (Windows with Java 17, Linux with Java 17, and Linux with Java 21).
if (GIT_BRANCH.equals("master") || GIT_BRANCH.equals("origin/master")) {
    /* Run 1 of 8 times on the master branch, typically once a week or more */
    /* The master branch builds complete 8 times per week (on average), with each build running 3 configurations */
    return (System.currentTimeMillis() % 8) == 0;
How does this assure 1 in 8 times?
Wouldn't it be better to key off the build number rather than wall-clock time, which is effectively random and might never be divisible by 8?
how does this assure 1 in 8 times?
It doesn't assure 1 in 8 times, but it does approximate it.
When a change is pushed before the build for the previous change has completed, the earlier build is aborted. A build-number scheme could lose its one scheduled run to such an abort.
By using the system clock and its expected relatively even distribution of values at the moment a test is about to run, roughly 1 time out of 8 the test will be selected to run. I think that is better than relying on the build number, because the selection does not depend on whether an earlier build was aborted due to a later merge.
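The "approximates it" claim can be made concrete with a small count (a hypothetical helper, not part of the PR): in any window of consecutive millisecond timestamps whose length is a multiple of 8, exactly one in eight values satisfies (t % 8) == 0. So if test start times are spread roughly evenly across milliseconds, the long-run selection rate is 1 in 8, although unlike a build-number counter nothing prevents an unlucky streak of consecutive skips.

```java
public class ClockModuloSketch {

    /* Counts how many timestamps in [startMillis, startMillis + samples)
     * would be selected by the (t % 8) == 0 check used in this PR. */
    static long countSelected(long startMillis, int samples) {
        long selected = 0;
        for (long t = startMillis; t < startMillis + samples; t++) {
            if (t % 8 == 0) {
                selected++;
            }
        }
        return selected;
    }
}
```

For any start value, an 8000-millisecond window contains exactly 1000 selected timestamps, which is the 1-in-8 rate.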
I'm not sure, but it might be a good idea to use JUnit 5. Another way would be to implement an interface for that, and then use it in your test method.
Baeldung has a nice tutorial for that: https://www.baeldung.com/junit-5-conditional-test-execution#custom-annotations
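For reference, the JUnit 5 approach suggested here could look roughly like this sketch. The annotation and class names are illustrative, not from this pull request, and the code requires the junit-jupiter-api dependency; the 1-in-8 clock check is carried over from the diff above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.extension.ExtensionContext;

/* Illustrative names. A test method annotated with @RunSometimes is
 * evaluated by RunSometimesCondition before each execution. */
@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@ExtendWith(RunSometimesCondition.class)
@interface RunSometimes {
}

class RunSometimesCondition implements ExecutionCondition {
    @Override
    public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext context) {
        String url = System.getenv("JENKINS_URL");
        if (url == null || !url.contains("ci.jenkins.io")) {
            // Developers and other CI servers always run the test
            return ConditionEvaluationResult.enabled("not running on ci.jenkins.io");
        }
        return System.currentTimeMillis() % 8 == 0
                ? ConditionEvaluationResult.enabled("selected for this run")
                : ConditionEvaluationResult.disabled("skipped on ci.jenkins.io this run");
    }
}
```

With this wiring, a test method gets @RunSometimes instead of an assumeTrue call in its body, and JUnit reports the skip as a disabled test rather than an aborted assumption.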
I'm not ready to convert those test files from JUnit 4 to JUnit 5. That seems like more work than I'm ready to do, and a much larger impact on the test files than adding an assumeTrue call.
Rather than pursue this relatively small cost savings, I'll likely spend my effort on larger cost savings, like switching off the JDK 17 testing on Linux. Closing this pull request. We can always reopen it in the future if the idea is interesting.
Run some tests less frequently
The Launchable report shows that these tests have never failed in the 12 months that they have been monitored by Launchable. Since they have never failed, it seems safe to run them less frequently. This implementation of "less frequently" will run the tests 1 of 8 times on the master branch with the assumption that we average 2 builds per day throughout the week, with 3 configurations in every build. 14 jobs per week and 3 configurations per job should run the tests on the ci.jenkins.io master branch at least 3 times each week.
Tests are only skipped if the CI environment variable is "true". Developer execution of the tests is not changed.
Considering how reliable the tests are, we might consider running them even less, but this is a start.
Testing done
Confirmed that the test frequency approximates 1 in 3 when working on a development branch with the JENKINS_URL environment variable set to https://ci.jenkins.io
Confirmed that the tests are always run in the developer environment (no JENKINS_URL environment variable).
Proposed changelog entries
N/A
Proposed upgrade guidelines
N/A
Submitter checklist
Desired reviewers
N/A
Before the changes are marked as ready-for-merge:
Maintainer checklist