Run some tests less frequently #9487
Conversation
https://app.launchableinc.com/organizations/jenkins/workspaces/jenkins/insights/unhealthy-tests shows that these tests have never failed in the 12 months that they have been monitored by Launchable. Since they have never failed, it seems safe to run them less frequently. This implementation of "less frequently" runs the tests 1 of 15 times on the master branch, assuming we average 2 builds per day throughout the week with 3 configurations in every build. 14 jobs per week at 3 configurations per job should run the tests on the ci.jenkins.io master branch at least 3 times each week.
Tests are only skipped if the JENKINS_URL environment variable contains the string "ci.jenkins.io"; developer execution of the tests is not changed. If a branch other than master is being tested, the tests run 1 of 3 times. That will usually run each test at least once on each ci.jenkins.io pull request job and on each ci.jenkins.io LTS job, because those jobs test 3 configurations.
Considering how reliable the tests are, we might consider running them even less, but this is a start. The Launchable summary suggests that this will save us as much as 3000 CPU minutes per week.
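A sketch of what such a gate might look like. The names runTestSometimes and NEVER_FAILING_TEST appear in this pull request's diff, but the body below is a reconstruction from the description above, not the actual implementation; the explicit parameters are added here only to make the selection logic easy to test.

```java
import java.util.concurrent.ThreadLocalRandom;

public class TestFrequencySketch {

    static final String NEVER_FAILING_TEST = "never failed in 12 months of monitoring";

    /* Reconstruction from the PR description, not the PR's actual code.
     * Returns true when the test should run. */
    static boolean runTestSometimes(String jenkinsUrl, String gitBranch, long sample) {
        if (jenkinsUrl == null || !jenkinsUrl.contains("ci.jenkins.io")) {
            return true; // developers and other CI servers always run the test
        }
        boolean master = "master".equals(gitBranch) || "origin/master".equals(gitBranch);
        int modulus = master ? 15 : 3; // 1 of 15 on master, 1 of 3 on other branches
        return sample % modulus == 0;
    }

    /* Entry point as it would be called from a test body, passing real
     * environment values and a random sample. */
    static boolean runTestSometimes(String testLabel) {
        return runTestSometimes(System.getenv("JENKINS_URL"),
                System.getenv("GIT_BRANCH"),
                ThreadLocalRandom.current().nextLong(Long.MAX_VALUE));
    }
}
```

A test body would then begin with assumeTrue(runTestSometimes(NEVER_FAILING_TEST)), so a skipped run is reported as a JUnit assumption violation rather than a failure.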
Better to limit tests on all CI servers instead of only on certain CI servers.
Constant value tends to skip every test in a run
Remember when a Surefire(?) bug caused an XSS test to show success instead of failure due to improper handling of parameterized tests? I'd rather not risk a repeat.
@@ -178,6 +181,7 @@ public class DefaultCrumbIssuerTest {
     @Issue("SECURITY-626")
     @WithTimeout(300)
     public void crumbOnlyValidForOneSession() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
Thanks for the comment and thanks for the pointer to a specific case where a technique like this caused serious problems. It is good for me to know that this is not a workable technique.
Is there any technique you can envision that would be acceptable to reduce the frequency of running the tests? I'm fine if there is not, but am curious if you can see any path that would allow us to run these tests less frequently while still running them sometimes.
Is there any technique you can envision that would be acceptable to reduce the frequency of running the tests?
Skip on master specifically when running on ci.j.io? PRs still run them (even if unrelated), and cert.ci still runs.
Skip on master specifically when running on ci.j.io? PRs still run them (even if unrelated), and cert.ci still runs.
Thanks for the clarification. I'll implement that. That has the benefit that release.ci.jenkins.io
will also still run those tests and provide one more safeguard that a regression has not been introduced.
@@ -82,6 +86,7 @@ public void encryptedValueStaysTheSameAfterRoundtrip() throws Exception {
     @Issue("SECURITY-304")
     @LocalData
     public void canReadPreSec304Secrets() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
I mistakenly thought that the "Please no" applied to all the changes in the pull request, but now that I read more carefully, I see that it applies to 4 tests that are specifically validating security fixes. I'll remove the changes to tests related to security. An undetected security regression is much, much more expensive than running the tests each time.
Thanks again for the review.
Right, I specifically request that security-related tests be exempted. Regular functional regressions being caught with 90+% likelihood seems mostly fine if the savings here are significant enough.
@@ -55,6 +58,7 @@ public class SecretCompatTest {
     @Test
     @Issue("SECURITY-304")
     public void encryptedValueStaysTheSameAfterRoundtrip() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
@@ -44,6 +47,7 @@ public class PasswordParameterDefinitionTest {
     @Rule public JenkinsRule j = new JenkinsRule();

     @Test public void defaultValueKeptSecret() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
@@ -576,6 +579,7 @@ public boolean delete() {
     @Test
     @Issue("SECURITY-904")
     public void symlink_outsideWorkspace_areNotAllowed() throws Exception {
+        assumeTrue(runTestSometimes(NEVER_FAILING_TEST));
Please no.
Please take a moment and address the merge conflicts of your pull request. Thanks!
Continues to run the tests on pull request branches, in developer environments, and on other CI servers like release.ci.jenkins.io and cert.ci.jenkins.io. Runs the tests approximately 1 of 8 times on the ci.jenkins.io master branch as well. In the last 4 weeks, we've had 7.5 builds of the master branch completed each week. Running the tests 1 of 8 times should provide about 3 runs per week, since we run 3 configurations (Windows with Java 17, Linux with Java 17, and Linux with Java 21).
if (GIT_BRANCH.equals("master") || GIT_BRANCH.equals("origin/master")) {
    /* Run 1 of 8 times on the master branch, typically once a week or more */
    /* The master branch builds complete 8 times per week (on average), with each build running 3 configurations */
    return (System.currentTimeMillis() % 8) == 0;
How does this assure 1 in 8 times?
Wouldn't it be better to key off the build number rather than wall-clock time, which is effectively random and might never be divisible by 8?
how does this assure 1 in 8 times?
It doesn't assure 1 in 8 times, but it does approximate it.
When a change is pushed before the build for the previous change has completed, the earlier build is aborted. A build-number scheme could lose its one scheduled run to such an abort.
By using the system clock and its expected relatively even distribution of values at the moment a test is about to run, roughly 1 time out of 8 the test will be selected to run. I think that is better than relying on the build number, because the selection does not depend on whether an earlier build was aborted due to a later merge.
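The "approximates it" claim can be made concrete with a small count (a hypothetical helper, not part of the PR): in any window of consecutive millisecond timestamps whose length is a multiple of 8, exactly one in eight values satisfies (t % 8) == 0. So if test start times are spread roughly evenly across milliseconds, the long-run selection rate is 1 in 8, although unlike a build-number counter nothing prevents an unlucky streak of consecutive skips.

```java
public class ClockModuloSketch {

    /* Counts how many timestamps in [startMillis, startMillis + samples)
     * would be selected by the (t % 8) == 0 check used in this PR. */
    static long countSelected(long startMillis, int samples) {
        long selected = 0;
        for (long t = startMillis; t < startMillis + samples; t++) {
            if (t % 8 == 0) {
                selected++;
            }
        }
        return selected;
    }
}
```

For any start value, an 8000-millisecond window contains exactly 1000 selected timestamps, which is the 1-in-8 rate.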
I'm not sure, but it might be a good idea to use JUnit 5. Another way would be to implement an interface for that, and then use it in your test method.
Baeldung has a nice tutorial for that: https://www.baeldung.com/junit-5-conditional-test-execution#custom-annotations
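For reference, the JUnit 5 approach suggested here could look roughly like this sketch. The annotation and class names are illustrative, not from this pull request, and the code requires the junit-jupiter-api dependency; the 1-in-8 clock check is carried over from the diff above.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.junit.jupiter.api.extension.ConditionEvaluationResult;
import org.junit.jupiter.api.extension.ExecutionCondition;
import org.junit.jupiter.api.extension.ExtendWith;
import org.junit.jupiter.api.extension.ExtensionContext;

/* Illustrative names. A test method annotated with @RunSometimes is
 * evaluated by RunSometimesCondition before each execution. */
@Target({ElementType.TYPE, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@ExtendWith(RunSometimesCondition.class)
@interface RunSometimes {
}

class RunSometimesCondition implements ExecutionCondition {
    @Override
    public ConditionEvaluationResult evaluateExecutionCondition(ExtensionContext context) {
        String url = System.getenv("JENKINS_URL");
        if (url == null || !url.contains("ci.jenkins.io")) {
            // Developers and other CI servers always run the test
            return ConditionEvaluationResult.enabled("not running on ci.jenkins.io");
        }
        return System.currentTimeMillis() % 8 == 0
                ? ConditionEvaluationResult.enabled("selected for this run")
                : ConditionEvaluationResult.disabled("skipped on ci.jenkins.io this run");
    }
}
```

With this wiring, a test method gets @RunSometimes instead of an assumeTrue call in its body, and JUnit reports the skip as a disabled test rather than an aborted assumption.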
I'm not ready to convert those test files from JUnit 4 to JUnit 5. That seems like more work than I'm ready to do, and a much larger impact on the test files than adding an assumeTrue call.
Rather than pursue this relatively small cost savings, I'll likely spend my effort on larger cost savings, like switching off the JDK 17 testing on Linux. Closing this pull request. We can always reopen it in the future if the idea is interesting.
Run some tests less frequently
The Launchable report shows that these tests have never failed in the 12 months that they have been monitored by Launchable. Since they have never failed, it seems safe to run them less frequently. This implementation of "less frequently" will run the tests 1 of 8 times on the master branch with the assumption that we average 2 builds per day throughout the week, with 3 configurations in every build. 14 jobs per week and 3 configurations per job should run the tests on the ci.jenkins.io master branch at least 3 times each week.
Tests are only skipped if the CI environment variable is "true". Developer execution of the tests is not changed.
Considering how reliable the tests are, we might consider running them even less, but this is a start.
Testing done
Confirmed that the test frequency approximates 1 in 3 when working on a development branch with the JENKINS_URL environment variable set to https://ci.jenkins.io
Confirmed that the tests are always run in the developer environment (no JENKINS_URL environment variable).
Proposed changelog entries
N/A
Proposed upgrade guidelines
N/A
Submitter checklist
Desired reviewers
N/A
Before the changes are marked as ready-for-merge:
Maintainer checklist