tests/e2e: Libvirt Env tests are unstable #1831
This is getting worse and we are hitting it multiple times on each PR now. I've tried running this test locally and in about 8 re-runs it worked every time, so I'm not sure of the cause of the failure. In the short term I think we need to skip it in the CI to stop it blocking PRs.
The TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly test is failing semi-regularly on the CI, but seems to run okay locally, so skip it until we have a chance to debug. See confidential-containers#1831 Signed-off-by: stevenhorsman <steven@uk.ibm.com>
The TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly and TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageAndDeployment tests are failing semi-regularly on the CI, but seem to run okay locally, so skip them until we have a chance to debug. See confidential-containers#1831 Signed-off-by: stevenhorsman <steven@uk.ibm.com>
It is possible that this is related to the image-pull changes, as Chengyu is touching the config merge code in kata-containers/kata-containers#9695, so after this, we should try re-testing this.
TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly
test unstable
Hmm, this is suspicious: now that the e2e tests related to env are skipped, I've seen:
start failing, so maybe it's related to something before now being cleaned up, or the workdir has the same issue?
This has failed the last three nightlies, so I will raise a PR to skip this for now.
The TestLibvirtCreatePeerPodAndCheckWorkDirLogs test has failed on a few PRs and the last three nightly test runs, so skip it until we have a chance to debug. See confidential-containers#1831 Signed-off-by: stevenhorsman <steven@uk.ibm.com>
@stevenhorsman yesterday I ran TestLibvirtCreatePeerPodAndCheckWorkDirLogs a couple of times locally with the hope of reproducing the error but it always passed! Then I started working on a golang equivalent of
Yeah - I have this experience with the other tests too. My hope is that a new version of the kata-agent and image-rs might have addressed some of these, so I will re-test after they've been bumped.
In #2183 I've tried re-enabling all the tests and it seems that only
Tracking of unstable tests:
Based on the test analysis done for 18 days of nightly tests in confidential-containers#1831 (comment), the only containerd test failures we saw were:
- `TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly` - three times
- `TestLibvirtCreatePeerPodAndCheckWorkDirLogs` - four times
- `TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly` - twice

Although the chance of failure for each of these tests is < 25%, we want to reduce the re-runs required, so if we skip these we should have more stable CI tests. It should also be noted that most of the failures were seen on the packer built images. This is probably just chance, but might indicate that the peer pod boot speed is related, and we should re-evaluate once we can remove the packer podvm images.

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
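To make the "< 25%" figure concrete, here is a rough sketch of the arithmetic in Go. It assumes exactly one run of each test per nightly (18 runs in total) and that the three tests fail independently of each other. Neither assumption is confirmed in this thread, and if each nightly ran several configurations the per-run rates would be lower. The function names are illustrative only.

```go
package main

import "fmt"

// failureRate returns the observed fraction of failing runs.
func failureRate(failures, runs int) float64 {
	return float64(failures) / float64(runs)
}

// anyFailure returns the probability that at least one test fails in a
// single run, ASSUMING the tests fail independently of each other.
func anyFailure(rates []float64) float64 {
	allPass := 1.0
	for _, r := range rates {
		allPass *= 1 - r
	}
	return 1 - allPass
}

func main() {
	// Assumption: one run per nightly over the 18 days analysed.
	const nightlyRuns = 18

	// Failure counts as reported in the commit message above.
	tests := []struct {
		name     string
		failures int
	}{
		{"TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithImageOnly", 3},
		{"TestLibvirtCreatePeerPodAndCheckWorkDirLogs", 4},
		{"TestLibvirtCreatePeerPodAndCheckEnvVariableLogsWithDeploymentOnly", 2},
	}

	rates := make([]float64, 0, len(tests))
	for _, tc := range tests {
		r := failureRate(tc.failures, nightlyRuns)
		rates = append(rates, r)
		fmt.Printf("%-70s %.1f%%\n", tc.name, 100*r)
	}
	fmt.Printf("at least one of the three fails: %.1f%%\n", 100*anyFailure(rates))
}
```

Under these assumptions each individual rate stays below 25%, but the chance that at least one of the three fails in a given run is considerably higher than any single rate, which is consistent with the motivation for skipping all three.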
We see occasional failures (anecdotally <20% of the time) on the libvirt nightly CI. So far they have always passed on re-run, but we've now seen one on a PR test, so it's becoming more of an obstacle and we should investigate it when we get the chance.