Test operator behavior after failover #2012

Closed
david-kow opened this issue Oct 17, 2019 · 3 comments · Fixed by #3706
Labels
>test Related to unit/integration/e2e tests

Comments

@david-kow
Contributor

As a stateless service, the operator is expected to be resilient to downtime, but we don't currently test for it. We should at least have a sanity test that checks the operator still behaves correctly when it is restarted at random. This is somewhat related to general chaos testing.

@david-kow added the loe:medium and >test (Related to unit/integration/e2e tests) labels on Oct 17, 2019
@pebrc removed the loe:medium label on Apr 27, 2020
@anyasabo
Contributor

anyasabo commented May 7, 2020

Related: #709

@david-kow
Contributor Author

We should create a separate test pipeline that randomly deletes the operator pod while the entire E2E test suite is running. This keeps noise out of the existing pipelines and gives us good coverage of different test scenarios. We should also log and preserve the timestamp of each operator pod deletion so that any issues can be investigated afterwards.
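
A rough sketch of what such a chaos loop could look like, purely for illustration and not the implementation that ended up in #3706. The function name `run_operator_chaos`, its parameters, and the interval defaults are all hypothetical; the pod deletion itself is passed in as a callable so the loop does not assume any particular Kubernetes client.

```python
import random
import time
from datetime import datetime, timezone
from typing import Callable, List, Optional


def run_operator_chaos(
    delete_operator_pod: Callable[[], None],
    suite_is_running: Callable[[], bool],
    min_interval_s: float = 60.0,   # assumed default, not taken from the issue
    max_interval_s: float = 300.0,  # assumed default, not taken from the issue
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Delete the operator pod at random intervals while the E2E suite runs.

    Returns the ISO-8601 UTC timestamp of every deletion so that test
    failures can later be correlated with operator restarts.
    """
    rng = rng or random.Random()
    deletions: List[str] = []
    while suite_is_running():
        # Wait a random amount of time so restarts hit different test phases.
        sleep(rng.uniform(min_interval_s, max_interval_s))
        if not suite_is_running():
            break  # the suite finished while we were waiting
        timestamp = datetime.now(timezone.utc).isoformat()
        delete_operator_pod()
        deletions.append(timestamp)
        # Log each deletion; a real pipeline would also archive this output.
        print(f"{timestamp} operator pod deleted")
    return deletions
```

In a real pipeline, `delete_operator_pod` would wrap whatever mechanism the test harness uses to remove the pod, and the returned timestamps would be stored alongside the test artifacts.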

@barkbay
Contributor

barkbay commented Sep 24, 2020

Fixed by #3706

@barkbay barkbay closed this as completed Sep 24, 2020