Fix intermittent issue on OciMetricsSupportTest #6177

Conversation

klustria
Member

@klustria klustria commented Feb 13, 2023

RCA is described here: #6112 (comment)

Changes include the following:

  1. In OciMetricsSupportTest.testEndpoint, extend the validation window to 10 seconds when checking that the metric endpoint has been restored. A race condition intermittently occurs where the validation happens before the endpoint is restored.
  2. Define every countDownLatch locally in its test method rather than as a static variable. The shared static latch caused a chain reaction: when one test failed, later tests failed too because they shared the same countDownLatch.
  3. Always verify that countDownLatch.await() completed; otherwise, assert a failure.
  4. Remove the use of a fixed port when starting a WebServer.
  5. Reset postingEndPoint to its original value before each test, so @RepeatedTest can be used in the future for debugging purposes.
  6. Apply Helidon Code Style to both OciMetricsSupportTest and OciMetricsCdiExtensionTest. This includes making the test classes and methods package-local rather than public, and reordering fields based on whether they are static, final, etc.
  7. Note that OciMetricsCdiExtensionTest only has Code Style changes and the removal of a delay method that was never used, so the logic in that test class is the same as before. Only OciMetricsSupportTest contains significant changes to resolve the reported issue.
  8. Fail OciMetricsCdiExtensionTest if the validation for enabled OCI Metrics times out on countDownLatch.await().
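
For illustration only, items 1 to 4 can be sketched with plain JDK classes. This is not the code from this PR: the class `LatchSketch` and its helpers `awaitPosted`, `waitUntil`, and `ephemeralPort` are hypothetical names, and `ServerSocket(0)` merely stands in for starting a server without a fixed port.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch; names are assumptions, not taken from the PR.
final class LatchSketch {

    // Items 2 and 3: the latch is local to the call, and the boolean result
    // of await() is returned so the caller can assert on it instead of
    // silently continuing after a timeout.
    static boolean awaitPosted(long timeoutSeconds) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1); // local, not static
        Thread worker = new Thread(latch::countDown); // stands in for the async post
        worker.start();
        return latch.await(timeoutSeconds, TimeUnit.SECONDS);
    }

    // Item 1: poll a condition until a deadline instead of checking it once,
    // so a slow restore does not fail the validation.
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (condition.getAsBoolean()) {
                return true;
            }
            Thread.sleep(50);
        }
        return condition.getAsBoolean(); // one last check at the deadline
    }

    // Item 4: port 0 asks the operating system for any free port.
    static int ephemeralPort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws Exception {
        if (!awaitPosted(5)) {
            throw new AssertionError("latch did not complete in time");
        }
        long start = System.currentTimeMillis();
        if (!waitUntil(() -> System.currentTimeMillis() - start >= 100, 10_000)) {
            throw new AssertionError("condition was not met within 10 seconds");
        }
        if (ephemeralPort() <= 0) {
            throw new AssertionError("expected an OS-assigned port");
        }
    }
}
```

In a JUnit test, the boolean results would typically be passed to an assertion, so a timeout fails the test that owns the latch rather than a later one.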

@oracle-contributor-agreement oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) Feb 13, 2023
@klustria klustria changed the title from "6112 oci metrics support test intermittent failure 3.x" to "Fix intermittent issue on OciMetricsSupportTest" Feb 13, 2023
@klustria klustria self-assigned this Feb 13, 2023
@klustria klustria merged commit fd7ada4 into helidon-io:helidon-3.x Feb 14, 2023