
Increase timeout for all Pods to be eventually ready in E2E tests #2348

Merged
merged 1 commit into from
Jan 7, 2020

Conversation

sebgl (Contributor) commented Jan 6, 2020

Some of our E2E tests fail because the CheckExpectedPodsEventuallyReady
check reaches its 5min timeout.
Even under good conditions, some rolling upgrades take more than
3 minutes to be fully applied (e.g.
TestMutationNodeSetReplacementWithChangeBudget).

Depending on external factors (slow Pod scheduling, slow
PersistentVolume binding, etc.), we can easily hit the fixed 5min
timeout.

I propose we increase the timeout to 15min for this particular check.
This is an arbitrary value (unfortunately), but I think we're OK with
it given the eventually consistent nature of Kubernetes Pod scheduling.

We could make the test smarter (keep waiting as long as we observe some
progress), but we'd still have to pick arbitrary timeout values anyway,
so let's keep things simple.

Fixes #2263.
Relates #2134 (comment).

@sebgl sebgl merged commit 40bfb03 into elastic:master Jan 7, 2020
sebgl added a commit to sebgl/cloud-on-k8s that referenced this pull request Jan 9, 2020
@sebgl sebgl mentioned this pull request Jan 9, 2020
sebgl added a commit that referenced this pull request Jan 9, 2020
* Increase timeout for all Pods to be eventually ready (#2348)


* Use a 15min RollingUpgradeTimeout for keystore checks in E2E tests (#2388)

* Use the RollingUpgradeTimeout for keystore checks in E2E tests

Since we added a 30-second preStop wait, rolling upgrades take longer
than before. We recently increased the rolling upgrade timeout to
15 minutes, but did not do so for the keystore rolling upgrade test,
which is written differently.

* fix comment
mjmbischoff pushed a commit to mjmbischoff/cloud-on-k8s that referenced this pull request Jan 13, 2020

Successfully merging this pull request may close these issues.

TestMutationNodeSetReplacementWithChangeBudget is flaky