
React on pod events for readiness gates #1214

Merged

Conversation


@bpineau bpineau commented Apr 6, 2020

We rely on endpoints events to re-trigger reconciliations during
rollouts, and we look at the pod's container statuses (e.g., are all of the
pod's containers ContainersReady?) to decide whether to act on or swallow those events.

The two (pod changes, endpoints changes) can be out of sync. For instance: a starting
pod whose containers aren't all ContainersReady might have its addresses
registered in an endpoints subset's NotReadyAddresses, kicking off a reconcile event
which won't propagate to the target groups (or update conditions) since the pod
is evaluated as not ready. Further pod changes won't trigger an endpoints change
(due to readiness gates, the pod's address will stay in NotReadyAddresses
until we do something). This is probably what was seen in #1205.

In order to react reliably to pod changes, we have to hook in a pod watch.
Doing so is slightly expensive, as we have to map pod -> [service ->]
endpoint -> ingress on pod events, though we limit the search to the pod's
namespace (services can only reference pods from their own namespace, and
ingresses can only reference services from theirs).

We might optimize this by adding a pod indexer (if supported by controller-runtime).

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 6, 2020
@k8s-ci-robot
Contributor

Hi @bpineau. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 6, 2020
@nirnanaaa nirnanaaa left a comment

minor improvement

internal/ingress/controller/handlers/pod.go (two outdated review threads, resolved)
			continue
		}

		if h.isPodInEndpoint(pod, endpoint) {
Contributor

Would it make sense to only reconcile the ingress if the ContainersReady condition is true to avoid unnecessary reconciliation iterations (and thus many AWS API calls) on frequent pod status updates during pod startup?


@devkid the point of reconciliations, IMO, is that you don't have to rely on state comparison to make a decision.

In our tests we haven't had any issues with AWS API limits yet.

Also, we shouldn't rely on the fact that right now ContainersReady is the only important and driving factor for registering pods in the endpoints table. Upstream k8s could change or extend that logic at any time, and that would break the implementation.

Author

I agree with @nirnanaaa; that would specialise the watch by embedding expectations about implementation details of an unrelated part of the code base (which might change), in effect setting a trap for future maintainers.

Collaborator

@M00nF1sh M00nF1sh Apr 9, 2020

@nirnanaaa
I think we need the ContainersReady check; we have to make that check to decide whether to register the pod to the ALB anyway. Under the current architecture, the reconcile will call AWS APIs to check every ALB setting before reconciling target groups, which is inefficient and causes trouble if there are many ingresses.

I think it's better to make the changes below:

  1. Add a common function in the backends package, e.g. "IsPodSuitableAsIPTarget", which checks ContainersReady.
  2. Change https://github.com/kubernetes-sigs/aws-alb-ingress-controller/blob/master/internal/ingress/backend/endpoint.go#L186 to call the function above to decide whether a pod in NotReadyAddresses should be registered.
  3. Only enqueue the Ingress for a pod Update event when (IsPodSuitableAsIPTarget(podNew) != IsPodSuitableAsIPTarget(podOld)) && PodInEndpoint(podNew).


@M00nF1sh does that really justify adding all the complex logic to store oldRef, handle the case where no old ref is found in the store, etc.? I don't think the comparison would be a huge deal beyond duplicating logic.

Author

@M00nF1sh pushed a commit doing exactly that

Collaborator

Thanks, ready to go :D. Just one nit: we need a "break" after queue.Add :D

Author

Thanks, added the break (it optimizes the loop but precludes pods being part of several readiness-gate-backed ingresses, which I guess was a shady corner case wrt readiness gate semantics anyway?).

Contributor

@bpineau The break would only quit the loop that iterates over the backends, i.e. all other ingresses would still be processed. Enqueuing the ingress for reconciliation will reconcile all target groups anyhow.

Author

@devkid you're right

@bpineau bpineau force-pushed the missed-readiness-gates-events branch from e5d0d54 to 2872d85 Compare April 7, 2020 10:53
@nirnanaaa

@M00nF1sh or @bigkraig could you please have a look - would be much appreciated.

@bpineau bpineau force-pushed the missed-readiness-gates-events branch from 12f1b2b to a13750a Compare April 9, 2020 20:47
@M00nF1sh
Collaborator

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 10, 2020
@bpineau bpineau force-pushed the missed-readiness-gates-events branch from a13750a to 1f8d7be Compare April 10, 2020 06:58
func IsPodSuitableAsIPTarget(pod *corev1.Pod) bool {
	for _, condition := range pod.Status.Conditions {
		if condition.Type == api.ContainersReady {
			if condition.Status == api.ConditionTrue {
Contributor

This should be return condition.Status == api.ConditionTrue to quit as soon as the ContainersReady condition is found.

Addressing PR #1214 comments: bubbling up all (ingress-backing)
pod events to the main reconcile loop might trigger an excessive
volume of AWS API calls. Let's keep the pod watch noise to the minimum
required for readiness gates: catching changes in `ContainersReady`.
@bpineau bpineau force-pushed the missed-readiness-gates-events branch from 1f8d7be to 2cf2c2a Compare April 10, 2020 14:45
@M00nF1sh
Collaborator

really awesome work :D, thanks sooo much!!
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 10, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bpineau, M00nF1sh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 10, 2020
@k8s-ci-robot k8s-ci-robot merged commit 25d2b0e into kubernetes-sigs:master Apr 10, 2020
alebedev87 pushed a commit to alebedev87/aws-load-balancer-controller that referenced this pull request Oct 26, 2023
alebedev87 pushed a commit to alebedev87/aws-load-balancer-controller that referenced this pull request Oct 26, 2023
…-gates-events

React on pod events for readiness gates