Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: race condition for ingress resource status #1931

Merged
merged 1 commit into from
Oct 19, 2021

Conversation

shaneutt
Copy link
Contributor

@shaneutt shaneutt commented Oct 19, 2021

What this PR does / why we need it:

A race condition was found when using a LoadBalancer type service for the Kong proxy which in some deployments could trigger a permanent failure to resolve Ingress resource statuses. This patch ensures that we don't expect the LoadBalancer to be immediately ready when the controller manager first starts, and waits to start the status update goroutine until it's resolved. Informational logging is also provided to help indicate if the controller manager is "stuck" because the LoadBalancer is not being provisioned, instead of the previous behavior where this would just fail silently.

Which issue this PR fixes

Fixes #1925

Special notes for your reviewer:

I could only trigger this issue on GKE because the LoadBalancer provisioning can be quite slow there, this fix has been tested and verified in that environment manually. I'm looking into why our GKE tests missed this problem, but that's more about protecting against future regressions so I don't think that should hold up the patch given the light touch and laser focus of this patch, and the manual testing and verification.

It is my opinion that we might want to consider an alternative status update implementation for future releases, but for this PR I simply stuck with a light touch and a fix for the immediate problem at hand. I previously reported on the potential issues with our current implementation in #1492 and that may be considered a follow up.

PR Readiness Checklist:

@shaneutt shaneutt force-pushed the shaneutt/fix-ingress-status-update-timing-issues branch from 9248656 to 96d7e90 Compare October 19, 2021 16:19
@shaneutt shaneutt self-assigned this Oct 19, 2021
@shaneutt shaneutt added bug Something isn't working priority/high blocked labels Oct 19, 2021
Base automatically changed from shaneutt/fix-status-updates to main October 19, 2021 16:39
@shaneutt shaneutt linked an issue Oct 19, 2021 that may be closed by this pull request
1 task
@shaneutt shaneutt force-pushed the shaneutt/fix-ingress-status-update-timing-issues branch from 96d7e90 to 1211c7c Compare October 19, 2021 16:49
@shaneutt shaneutt temporarily deployed to Configure ci October 19, 2021 16:49 Inactive
@shaneutt shaneutt temporarily deployed to Configure ci October 19, 2021 16:49 Inactive
@shaneutt shaneutt mentioned this pull request Oct 19, 2021
1 task
@shaneutt shaneutt marked this pull request as ready for review October 19, 2021 16:58
@shaneutt shaneutt requested a review from a team as a code owner October 19, 2021 16:58
@shaneutt shaneutt removed the blocked label Oct 19, 2021
@shaneutt shaneutt temporarily deployed to Configure ci October 19, 2021 16:58 Inactive
@shaneutt shaneutt enabled auto-merge (squash) October 19, 2021 17:20
@shaneutt shaneutt merged commit ee7386f into main Oct 19, 2021
@shaneutt shaneutt deleted the shaneutt/fix-ingress-status-update-timing-issues branch October 19, 2021 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ingress Does not Get Assigned Address
2 participants