flaking unit test in TestReconcileMachinePoolMachines #11070

Closed
cahillsf opened this issue Aug 19, 2024 · 11 comments · Fixed by #11124
Assignees
Labels
area/machinepool Issues or PRs related to machinepools help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/flake Categorizes issue or PR as related to a flaky test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@cahillsf
Member

Which jobs are flaking?

these failures are apparent in periodic-cluster-api-test-mink8s-main and periodic-cluster-api-test-main

Which tests are flaking?

TestReconcileMachinePoolMachines/Reconcile_MachinePool_Machines/Should_create_two_machines_if_two_infra_machines_exist

Since when has it been flaking?

At least since 2024-07-06: https://storage.googleapis.com/k8s-triage/index.html?date=2024-07-20&text=TestReconcileMachinePoolMachines%2FReconcile_MachinePool_Machines%2FShould_create_two_machines_if_two_infra_machines_exist&job=.*cluster-api.*(test%7Ce2e)-(mink8s-)*main&xjob=.*-provider-.*

Testgrid link

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/periodic-cluster-api-test-mink8s-main/1824877164462346240

Reason for failure (if possible)

No response

Anything else we need to know?

No response

Label(s) to be applied

/kind flake
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

@k8s-ci-robot k8s-ci-robot added kind/flake Categorizes issue or PR as related to a flaky test. needs-priority Indicates an issue lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 19, 2024
@cahillsf
Member Author

/area machinepool

@k8s-ci-robot k8s-ci-robot added the area/machinepool Issues or PRs related to machinepools label Aug 19, 2024
@sbueringer
Member

Yup. I saw a bunch of flakes around MachinePool unit tests as well

/triage accepted

/help

@k8s-ci-robot
Contributor

@sbueringer:
This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

  • Why are we solving this issue?
  • To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
  • Does this issue have zero to low barrier of entry?
  • How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 21, 2024
@sbueringer sbueringer added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Aug 21, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates an issue lacks a `priority/foo` label and requires one. label Aug 21, 2024
@cahillsf
Member Author

cahillsf commented Sep 1, 2024

/assign cahillsf

I cannot reproduce this issue locally. I have opened a draft PR that switches this unit test to what seem to be the preferred methods; see the PR for details. Hopefully this will improve the stability of the test.

@sbueringer
Member

Would be great if some folks familiar with Machine Pools / MachinePool Machines can review #11124

(cc @Jont828 @willie-yao)

@sbueringer
Member

sbueringer commented Sep 2, 2024

/reopen

I assume we want to keep this issue open for now as we're not sure if the PR will fix all flakes

@k8s-ci-robot
Contributor

@sbueringer: Reopened this issue.


@k8s-ci-robot k8s-ci-robot reopened this Sep 2, 2024
@cahillsf
Member Author

cahillsf commented Sep 2, 2024

/reopen

I assume we want to keep this issue open for now as we're not sure if the PR will fix all flakes

Yep sounds good, will track the test and revisit


edit: adding k8s-triage link https://storage.googleapis.com/k8s-triage/index.html?text=TestReconcileMachinePoolMachines&job=.*cluster-api-(test%7Ce2e)-(mink8s-)*main

@cahillsf
Member Author

Revisiting this: the test hasn't flaked since 9/1, which was before the attempted fix was merged: https://storage.googleapis.com/k8s-triage/index.html?date=2024-09-15&text=TestReconcileMachinePoolMachines&job=.*cluster-api-(test%7Ce2e)-(mink8s-)*main

If we update the date to today, the failures fall outside the default lookback window: https://storage.googleapis.com/k8s-triage/index.html?date=2024-09-18&text=TestReconcileMachinePoolMachines&job=.*cluster-api-(test%7Ce2e)-(mink8s-)*main

not sure how long we want to wait before closing out this issue @sbueringer ?
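
For illustration, the lookback reasoning in the comment above can be sketched as simple date arithmetic. This is only a sketch: the 14-day window length and the inclusive boundaries are assumptions inferred from the dates in this thread (a 9/1 failure visible with `date=2024-09-15` but not with `date=2024-09-18`), not confirmed k8s-triage behavior, and `in_lookback` is a hypothetical helper name.

```python
from datetime import date, timedelta

# Assumption: the default lookback window is 14 days, inferred from the
# dates quoted in this thread; it is not taken from k8s-triage documentation.
LOOKBACK_DAYS = 14


def in_lookback(failure: date, query: date, days: int = LOOKBACK_DAYS) -> bool:
    """Return True if `failure` lies within `days` days up to and including `query`.

    Both ends of the window are treated as inclusive (also an assumption).
    """
    window_start = query - timedelta(days=days)
    return window_start <= failure <= query


last_flake = date(2024, 9, 1)

# With date=2024-09-15 the window starts on 2024-09-01, so the flake is shown.
print(in_lookback(last_flake, date(2024, 9, 15)))

# With date=2024-09-18 the window starts on 2024-09-04, so the flake drops out.
print(in_lookback(last_flake, date(2024, 9, 18)))
```

Under these assumptions, the first call reports the 9/1 failure as inside the window and the second reports it as outside, which matches what the two triage links show.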

@sbueringer
Member

I think we can close the issue, the flake was pretty frequent before, so I think we have enough data to be sure it's fixed.

Thx for fixing this flake!

/close

@k8s-ci-robot
Contributor

@sbueringer: Closing this issue.


3 participants