
Increase timeout for all Pods to be eventually ready in E2E tests #2348

Merged
merged 1 commit into from
Jan 7, 2020

Conversation

sebgl (Contributor) commented Jan 6, 2020

Some of our E2E tests fail because the CheckExpectedPodsEventuallyReady
check reaches its 5min timeout.
Even under good conditions, some rolling upgrades take more than
3 minutes to be fully applied (e.g.
TestMutationNodeSetReplacementWithChangeBudget).

Depending on external factors (slow Pod scheduling, slow
PersistentVolume binding, etc.), we can easily hit the fixed 5min
timeout.

I propose we increase the timeout to 15min for this particular check.
This is an arbitrary value (unfortunately), but I think we're OK with
it given the eventually consistent nature of Kubernetes Pod scheduling.

We could make the test smarter (keep waiting as long as we observe some
progress), but we'd still have to pick arbitrary timeout values anyway,
so let's keep things simple.

Fixes #2263.
Relates #2134 (comment).

@sebgl sebgl merged commit 40bfb03 into elastic:master Jan 7, 2020
sebgl added a commit to sebgl/cloud-on-k8s that referenced this pull request Jan 9, 2020
@sebgl sebgl mentioned this pull request Jan 9, 2020
sebgl added a commit that referenced this pull request Jan 9, 2020
* Increase timeout for all Pods to be eventually ready (#2348)


* Use a 15min RollingUpgradeTimeout for keystore checks in E2E tests (#2388)

* Use the RollingUpgradeTimeout for keystore checks in E2E tests

Since we added a 30-second preStop wait, rolling upgrades take longer
than before. We recently increased the rolling upgrade timeout to
15 minutes, but did not do so for the keystore rolling upgrade test,
which is written differently.

* fix comment
mjmbischoff pushed a commit to mjmbischoff/cloud-on-k8s that referenced this pull request Jan 13, 2020

Successfully merging this pull request may close these issues.

TestMutationNodeSetReplacementWithChangeBudget is flaky