Make Broker availability based on the Endpoints not Deployments. #2766

mattmoor · 2020-03-17T17:31:52Z

🧽 Update or clean up current behavior

Today the fact that the Broker lifecycle stuff propagates Deployment status for filter/ingress leaks implementation details into the API types. An example of where this leakage is bas would be if I were to run a multi-tenant broker (#2760) as something other than a Deployment.

This change corrects one small part of this leaky interface by shifting from Deployment to Endpoints as the canonical resource for establishing readiness.

This is also slightly more correct as it assesses not just whether the Deployment is ready, but whether that information has propagated to the Service / Endpoints.

/hold
/assign @vaikas @lionelvillard @grantr

Today the fact that the Broker lifecycle stuff propagates *Deployment* status for filter/ingress leaks implementation details into the API types. An example of where this leakage is bas would be if I were to run a multi-tenant broker (knative#2760) as something other than a Deployment. This change corrects one small part of this leaky interface by shifting from Deployment to Endpoints as the canonical resource for establishing readiness. This is also slightly more correct as it assesses not just whether the Deployment is ready, but whether that information has propagated to the Service / Endpoints.

knative-metrics-robot · 2020-03-17T17:36:21Z

The following is the coverage report on the affected files.
Say /test pull-knative-eventing-go-coverage to re-run this coverage report

File	Old Coverage	New Coverage	Delta
pkg/reconciler/broker/broker.go	80.5%	81.2%	0.6
pkg/reconciler/broker/controller.go	87.5%	88.5%	1.0

n3wscott · 2020-03-17T17:40:44Z

/lgtm
/approve

grantr · 2020-03-17T17:55:48Z

Today the fact that the Broker lifecycle stuff propagates Deployment status for filter/ingress leaks implementation details into the API types.

Is this an API purity concern or a user concern? Can you give more details on the user impact of leaking the implementation detail? The Go type hasn't changed here, so I'm not sure which API type you're referring to.

Does the switch to using endpoint availability from deployment availability change the meaning of Broker readiness? I.e. from a user perspective, if the Broker is ready, can I make the same assumptions about its state as before? If not, is the new set of assumptions more accurate or better for the user?

mattmoor · 2020-03-17T18:09:29Z

/retest

(this test seems to be flaky on testgrid)

The Go type hasn't changed here, so I'm not sure which API type you're referring to.

Many levels of API 🐢 . Specifically I'm talking about the fact that in a class-based world the API exposed for folks to manipulate Broker status is perhaps overly fitted to our current implementation. Frankly even the notion of distinct ingress and filters is somewhat leaky of implementation details.

Does the switch to using endpoint availability from deployment availability change the meaning of Broker readiness?

In practice, this should be effectively the same. In theory, this is more correct as I would interpret readiness to mean that if I POST to the .status.address I don't get 500s; in that sense, it's more important that the Endpoints have gotten the ready pods than the Deployment.

Happy to discuss more in whatever medium makes the most sense, this is just something I noticed looking over #2760 and wanted to lend a hand.

vaikas · 2020-03-17T18:15:37Z

As far as the Conditions, the user should only rely on the status.Ready as whether the Broker is good to go or not,

Overall, the Broker Spec does not talk about any of these, nor should it.
https://github.com/knative/eventing/blob/master/docs/spec/broker.md

And from that spec:

Broker objects SHOULD include a Ready condition in their status.

The Broker SHOULD indicate Ready=True when its ingress is available to receive events.

While a Broker is Ready, it SHOULD be a valid Addressable and its status.address.url field SHOULD indicate the address of its ingress.

So this is definitely a more better change.

mattmoor · 2020-03-17T18:24:20Z

Well, what I said about the sub-conditions applies indirectly to how it manifests in Ready too.

@grantr do you have [other] concerns you'd like to discuss?

grantr · 2020-03-17T18:24:51Z

in a class-based world the API exposed for folks to manipulate Broker status

Ok, IIUC you're talking about the Go API (for contributors), not the content of the object's status (for users).

it's more important that the Endpoints have gotten the ready pods than the Deployment.

I'm not super familiar with the relationship between Service and Endpoints. Does a pod ip only make it into Endpoints if it's confirmed ready via liveness/readiness?

Concretely it seems like you're saying that the sequence is normally

Deployment available, then Deployments pods show up in Endpoints

And never

Deployment pods show up in Endpoints, then Deployment available

It's this second one that would be a behavior change, because it moves the readiness earlier than before. If, after this change, the user can be more sure that a Ready Broker is available to receive requests, then 👍.

vaikas · 2020-03-17T18:47:15Z

/lgtm
/approve

knative-prow-robot · 2020-03-17T18:47:26Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mattmoor, n3wscott, vaikas

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [n3wscott,vaikas]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

mattmoor · 2020-03-17T20:27:20Z

Does a pod ip only make it into Endpoints if it's confirmed ready via liveness/readiness?

There are two sections notReadyAddresses and addresses. Pods only land in addresses once they have passed their readiness probe (and optional readiness gates, if folks are using that).

Deployment pods show up in Endpoints, then Deployment available

Just to be super nit picky / pedantic the flow of readiness is:

Pods -> ReplicaSet -> Deployment
Pods -> Service/Endpoints

It is possible that an Endpoints is ready before the Deployment has reported ready, but this is fine because the Deployment isn't actually what matters.

It is possible that a Deployment is ready before the Endpoints has ready addresses, which is a potential bug with what is at HEAD today.

If, after this change, the user can be more sure that a Ready Broker is available to receive requests, then 👍.

Yeah that's exactly what I'm saying. In practice this is likely a small change, but in theory a race exists today.

mattmoor · 2020-03-18T14:40:55Z

This has been out for a while, and I believe we've resolved concerns. Going to land to unblock @vaikas's PR.

/unhold

vaikas · 2020-03-18T15:25:00Z

/test pull-knative-eventing-integration-tests

knative-prow-robot assigned grantr and lionelvillard Mar 17, 2020

knative-prow-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 17, 2020

knative-prow-robot assigned vaikas Mar 17, 2020

googlebot added the cla: yes Indicates the PR's author has signed the CLA. label Mar 17, 2020

knative-prow-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Mar 17, 2020

knative-prow-robot requested review from Harwayne and n3wscott March 17, 2020 17:32

knative-prow-robot assigned n3wscott Mar 17, 2020

knative-prow-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 17, 2020

mattmoor mentioned this pull request Mar 17, 2020

first cut at multi-tenant broker #2760

Merged

knative-prow-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 18, 2020

knative-prow-robot merged commit 77dd5b2 into knative:master Mar 18, 2020

tzununbekov mentioned this pull request Oct 28, 2021

IMC dispatcher Endpoints status propagation #5854

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make Broker availability based on the Endpoints not Deployments. #2766

Make Broker availability based on the Endpoints not Deployments. #2766

mattmoor commented Mar 17, 2020

knative-metrics-robot commented Mar 17, 2020

n3wscott commented Mar 17, 2020

grantr commented Mar 17, 2020

mattmoor commented Mar 17, 2020

vaikas commented Mar 17, 2020

mattmoor commented Mar 17, 2020

grantr commented Mar 17, 2020

vaikas commented Mar 17, 2020

knative-prow-robot commented Mar 17, 2020

mattmoor commented Mar 17, 2020

mattmoor commented Mar 18, 2020

vaikas commented Mar 18, 2020

Make Broker availability based on the Endpoints not Deployments. #2766

Make Broker availability based on the Endpoints not Deployments. #2766

Conversation

mattmoor commented Mar 17, 2020

knative-metrics-robot commented Mar 17, 2020

n3wscott commented Mar 17, 2020

grantr commented Mar 17, 2020

mattmoor commented Mar 17, 2020

vaikas commented Mar 17, 2020

mattmoor commented Mar 17, 2020

grantr commented Mar 17, 2020

vaikas commented Mar 17, 2020

knative-prow-robot commented Mar 17, 2020

mattmoor commented Mar 17, 2020

mattmoor commented Mar 18, 2020

vaikas commented Mar 18, 2020