-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update (rewrite) overview, motivation, and data plane #24
Conversation
1e38d41
to
2d25faf
Compare
2d25faf
to
8ef8f61
Compare
specs/eventing/data-plane.md
Outdated
If a recipient chooses to reply to a sender with a `200` response code and a | ||
reply event in the absence of a `Prefer: reply` header, the sender SHOULD treat | ||
the event as accepted, and MAY log an error about the unexpected payload. The | ||
sender MUST NOT process the reply event if it did not advertise the `Prefer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this section brings the Prefer: reply
to a mandatory and I thought before it was more of a "you can provide this, but the sink is free to ignore it, but if you say "Prefer: " and get a reply, you are free to drop the event without warning.
This language makes the header mandatory, and it is not used anywhere yet, so it would be an API change for eventing at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this attempts to make the expected behavior for e.g. a receiving function explicit.
I'm willing to change this, but the current formulation implemented in eventing makes it difficult for a function to know if the reply might be dropped, which seems to go against the goal of avoiding message loss.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I didn't change this yet, because I'd like to discuss whether it makes more sense to change the contract or the implementation.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, here's some more background on this.
knative/eventing#4560
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the background, and the concern that adding MUST
language around this header would make previous versions of Knative incompatible with the 1.0 spec.
knative/eventing#4560 (comment)
I'm wondering if we have a requirement that versions of Knative prior to the 1.0 spec MUST
be compliant? If so, what is our boundary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this functionality is not written in the reference implementation, I would be inclined to not make it a MUST
.
The original PR was using SHOULD
and MAY
, no MUST NOT
nor MUST
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original PR did not include any MUST clauses.
Unfortunately, I have a conflict with the eventing meeting today (interview); is there a way you could see this becoming a "MUST", or do you think it always needs to be a MAY/SHOULD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it as always a MAY/SHOULD for 1.0
Relates to knative/eventing#4595 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
specs/eventing/data-plane.md
Outdated
If a recipient chooses to reply to a sender with a `200` response code and a | ||
reply event in the absence of a `Prefer: reply` header, the sender SHOULD treat | ||
the event as accepted, and MAY log an error about the unexpected payload. The | ||
sender MUST NOT process the reply event if it did not advertise the `Prefer: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, here's some more background on this.
knative/eventing#4560
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback! I think I've address most of the issues, and I re-ordered the overview to put the interfaces first.
Just wanted to say THANK YOU for doing the hard to write this up @evankanderson |
/assign @duglin |
specs/eventing/data-plane.md
Outdated
The current version of this document does not describe protocol negotiation or | ||
the ability to upgrade an HTTP 1.1 event delivery into a more efficient protocol | ||
such as GRPC, AMQP, or the like. It is expected that a future compatible version | ||
of this specification might describe a protocol negotiation mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should remove this sentence, It doesn't make sense to me to describe the spec future inside the spec itself. Maybe we can have such claims somewhere else, but not in the spec.
Maybe just keep:
The current version of this document does not describe protocol negotiation or the ability to upgrade an HTTP 1.1 event delivery into another protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this so that readers of the specification would be aware when building compatible implementations. Since we don't have a separate version number, it's possible that an implementation built against this specification might be run against a future specification which would offer extra fields. The (older) implementation should not reject the event because of additional fields that it doesn't understand.
(For example, if the Upgrade
header is used, the older implementation should just ignore the protocol upgrade offer.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean, but what you're talking about is a concern of the future spec, not of the existing one: when we're going to add such mechanisms, we need to make sure they'll be backwards compatible, in the sense that old implementations should still work properly.
On the other hand, this sentence sounds wrong, because we're saying in the spec what the future of the spec might look like, which is just going to confuse the reader that needs to implement the spec today. And also, what if we're never going to keep this promise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important for implementers to realize that there may be future-compatible extensions. How about the following language:
The current version of this document does not describe protocol negotiation or any delivery mechanism other than HTTP 1.1. Future versions may define protocol negotiation to optimize delivery; compliant implementations SHOULD aim to interoperate by ignoring unrecognized negotiation options (such as HTTP
Upgrade
headers).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wording you proposed is definitely better, but it still doesn't sound right to me writing such "future proof" sentence inside the spec. I prefer just keeping the first sentence of this new wording, but it's ok if you want to have all the text block.
For me from Future versions onward should go in a doc like a primer: #24 (comment)
Senders MAY probe the recipient with an | ||
[HTTP OPTIONS request](https://tools.ietf.org/html/rfc7231#section-4.3.7); if | ||
implemented, the recipient MUST indicate support for the POST verb using the | ||
[`Allow` header](https://tools.ietf.org/html/rfc7231#section-7.4.1). Senders | ||
which receive an error when probing with HTTP OPTIONS SHOULD proceed using the | ||
HTTP POST mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually implemented by somebody? At least in our ref implementation, it's not implemented at all. Can you remove it now and maybe we add it later?
The reason I ask is that there is another CE spec, called Webhook, which I believe overlaps with this sentence knative/eventing#3092
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, I found it in one of the existing eventing docs. This seems to be compatible with https://github.com/cloudevents/spec/blob/master/http-webhook.md.
The CE WebHook specification provides the following for webhook authorization:
Any system that allows registration of and delivery of notifications to arbitrary HTTP endpoints ...
- Sender sends an HTTP OPTIONS request with
Webhook-Request-Origin
(which should match theOrigin
header sent on each event delivery) and optionallyWebhook-Request-Callback
andWebhook-Request-Rate
.- Responder replies with
Allow: POST
andWebHook-Allowed-Origin
and preferablyWebHook-Allowed-Rate
(required ifWebHook-Request-Rate
is set). If the responder needs to do some manual setup andWebhook-Request-Callback
is provided, the responder can respond without headers and then use the callback URL later to grant access.
Since this is "sender MAY" and "if implemented, recipient MUST respond with Allow: POST
, I think that lines up. The "receive an error when probing ... SHOULD" allows for the webhook authorization behavior, but doesn't require it to be implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this is implemented, neither supported in any way right now... So I prefer to remove it now and add it later, when we'll have discussed and implemented it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. If the spec is now rewritten in a way mandating certain implementation work, that's not there, is IMO not flying well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting; it appears that the current Brokers are not compliant with https://datatracker.ietf.org/doc/html/rfc7231#section-7.4.1
The actual set of allowed methods is defined by the origin server at
the time of each request. An origin server MUST generate an Allow
field in a 405 (Method Not Allowed) response and MAY do so in any
other response. An empty Allow field value indicates that the
resource allows no methods, which might occur in a 405 response if
the resource has been temporarily disabled by configuration.
curl -vv -X OPTIONS -H "Host: broker-ingress.knative-eventing.svc.cluster.local" http://127.0.0.1:8090/default/example-broker
* Trying 127.0.0.1:8090...
* Connected to 127.0.0.1 (127.0.0.1) port 8090 (#0)
> OPTIONS /default/example-broker HTTP/1.1
> Host: broker-ingress.knative-eventing.svc.cluster.local
> User-Agent: curl/7.71.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 405 Method Not Allowed
< Date: Thu, 20 May 2021 18:29:47 GMT
< Content-Length: 0
<
In any case, the current Broker implementation matches this clause: my sender sent an HTTP OPTIONS, and the recipient did not support HTTP OPTIONS, so I proceed with POST delivery.
specs/eventing/data-plane.md
Outdated
be retried? What about `405` (Method Not Allowed), 413 (Payload Too Large), 414 | ||
(URI Too Long), 426 (Upgrade Required), 431 (Header Fields Too Large), 451 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this: https://github.com/knative/eventing/blob/main/pkg/kncloudevents/message_sender.go#L192-L202
as a good section on 4xx
errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working off of the following:
https://github.com/knative/specs/blob/main/specs/eventing/data-plane.md#http-support
Which specified 200/202, 404, and 400 (unparseable CloudEvent). This is somewhat contradicted by:
https://github.com/knative/specs/blob/main/specs/eventing/data-plane.md#error-handling
If Sink is not returning HTTP success header (200 or 202) then the event may be sent again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to default to 4xx = no retry, should any of following codes be retried?
- 401 (Unauthorized)
- 403 (Forbidden)
- 404 (Not Found)
- 405 (Method Not Allowed)
- 408 (Request Timeout)
- 412 (Precondition Failed)
- 413 (Payload Too Large)
- 414 (URI Too Long)
- 421 (Misdirected Request)
- 426 (Upgrade Required)
- 431 (Request Header Fields Too Large)
Currently, the spec specifies:
If an HTTP request's URL does not correspond to an existing endpoint, then the Sink MUST respond with
404 Not Found
.
I interpreted this to mean that 404s were retriable, but the current spec says that every error "may be sent again" (non-spec language), which doesn't really line up with the code implementation.
I've adjusted this to only retry the three codes mentioned and all other 400s are non-retriable, though the 404 case seems strange to me, and I'd love to get more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure on all the "Too Long" things,
Also not sure on 426
, as that indicates the current protocol is refused. I doubt we do upgrade...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews!
I've updated things again; one semantic change:
- 4xx responses default to not-retriable except for 404, 409, and 429.
specs/eventing/data-plane.md
Outdated
The current version of this document does not describe protocol negotiation or | ||
the ability to upgrade an HTTP 1.1 event delivery into a more efficient protocol | ||
such as GRPC, AMQP, or the like. It is expected that a future compatible version | ||
of this specification might describe a protocol negotiation mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this so that readers of the specification would be aware when building compatible implementations. Since we don't have a separate version number, it's possible that an implementation built against this specification might be run against a future specification which would offer extra fields. The (older) implementation should not reject the event because of additional fields that it doesn't understand.
(For example, if the Upgrade
header is used, the older implementation should just ignore the protocol upgrade offer.)
Senders MAY probe the recipient with an | ||
[HTTP OPTIONS request](https://tools.ietf.org/html/rfc7231#section-4.3.7); if | ||
implemented, the recipient MUST indicate support for the POST verb using the | ||
[`Allow` header](https://tools.ietf.org/html/rfc7231#section-7.4.1). Senders | ||
which receive an error when probing with HTTP OPTIONS SHOULD proceed using the | ||
HTTP POST mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, I found it in one of the existing eventing docs. This seems to be compatible with https://github.com/cloudevents/spec/blob/master/http-webhook.md.
The CE WebHook specification provides the following for webhook authorization:
Any system that allows registration of and delivery of notifications to arbitrary HTTP endpoints ...
- Sender sends an HTTP OPTIONS request with
Webhook-Request-Origin
(which should match theOrigin
header sent on each event delivery) and optionallyWebhook-Request-Callback
andWebhook-Request-Rate
.- Responder replies with
Allow: POST
andWebHook-Allowed-Origin
and preferablyWebHook-Allowed-Rate
(required ifWebHook-Request-Rate
is set). If the responder needs to do some manual setup andWebhook-Request-Callback
is provided, the responder can respond without headers and then use the callback URL later to grant access.
Since this is "sender MAY" and "if implemented, recipient MUST respond with Allow: POST
, I think that lines up. The "receive an error when probing ... SHOULD" allows for the webhook authorization behavior, but doesn't require it to be implemented.
b7d30e5
to
07fcbde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorporated feedback.
My intent is not to add new requirements except the Prefer: Reply
header, which I'm willing to send a PR for next week. I think event recipients knowing whether or not a reply event is going to the bit-bucket is important enough to implement for the next release.
It would also be nice for Brokers and Channels to be RFC 7231 (HTTP/1.1) compliant with respect to the Allow
header, but I think the spec language doesn't require that except indirectly.
specs/eventing/data-plane.md
Outdated
The current version of this document does not describe protocol negotiation or | ||
the ability to upgrade an HTTP 1.1 event delivery into a more efficient protocol | ||
such as GRPC, AMQP, or the like. It is expected that a future compatible version | ||
of this specification might describe a protocol negotiation mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's important for implementers to realize that there may be future-compatible extensions. How about the following language:
The current version of this document does not describe protocol negotiation or any delivery mechanism other than HTTP 1.1. Future versions may define protocol negotiation to optimize delivery; compliant implementations SHOULD aim to interoperate by ignoring unrecognized negotiation options (such as HTTP
Upgrade
headers).
Senders MAY probe the recipient with an | ||
[HTTP OPTIONS request](https://tools.ietf.org/html/rfc7231#section-4.3.7); if | ||
implemented, the recipient MUST indicate support for the POST verb using the | ||
[`Allow` header](https://tools.ietf.org/html/rfc7231#section-7.4.1). Senders | ||
which receive an error when probing with HTTP OPTIONS SHOULD proceed using the | ||
HTTP POST mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting; it appears that the current Brokers are not compliant with https://datatracker.ietf.org/doc/html/rfc7231#section-7.4.1
The actual set of allowed methods is defined by the origin server at
the time of each request. An origin server MUST generate an Allow
field in a 405 (Method Not Allowed) response and MAY do so in any
other response. An empty Allow field value indicates that the
resource allows no methods, which might occur in a 405 response if
the resource has been temporarily disabled by configuration.
curl -vv -X OPTIONS -H "Host: broker-ingress.knative-eventing.svc.cluster.local" http://127.0.0.1:8090/default/example-broker
* Trying 127.0.0.1:8090...
* Connected to 127.0.0.1 (127.0.0.1) port 8090 (#0)
> OPTIONS /default/example-broker HTTP/1.1
> Host: broker-ingress.knative-eventing.svc.cluster.local
> User-Agent: curl/7.71.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 405 Method Not Allowed
< Date: Thu, 20 May 2021 18:29:47 GMT
< Content-Length: 0
<
In any case, the current Broker implementation matches this clause: my sender sent an HTTP OPTIONS, and the recipient did not support HTTP OPTIONS, so I proceed with POST delivery.
specs/eventing/data-plane.md
Outdated
in return to the sender. Event recipients MUST use the | ||
[`source` and `id` attributes](https://github.com/cloudevents/spec/blob/v1.0.1/spec.md#required-attributes) | ||
to determine duplicate events (see [observability](#observability) for an | ||
example case where other event attributes may vary from one delivery attempt to | ||
another). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What Ville said -- CloudEvents requires that each source
ensure that id
is globally-for-all-time unique for events emitted by that source.
If a destination system such as a Knative Service can't handle the events idempotently, then it may need something like a lookaside cache (e.g. Redis) where it can track the events it has already processed. This isn't built into Knative serving, and is a state store that's outside of Knative. That's hardly unique to Knative; AWS Lambda has similar requirements and solutions, 2, 3.
I'm happy to rewrite this section to make the two points clearer -- would the following help?
As specified in the CloudEvents specification, event recipients MUST use the
source
andid
attributes to determine duplicate events if needed. This specification does not describe state requirements for clients which need to detect duplicate events. (See observability for anexample case where other event attributes may vary from one delivery attempt toanother)
specs/eventing/data-plane.md
Outdated
[`traceparent` and `tracestate` distributed tracing attributes](https://github.com/cloudevents/spec/blob/v1.0/extensions/distributed-tracing.md) | ||
may be modified in this way for each delivery attempt of the same event. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I pointed out that was a mess a while ago; I can also link directly to the w3c in case a future version of the CE spec removes the tracecontext content.
specs/eventing/data-plane.md
Outdated
in return to the sender. Event recipients MUST use the | ||
[`source` and `id` attributes](https://github.com/cloudevents/spec/blob/v1.0.1/spec.md#required-attributes) | ||
to determine duplicate events (see [observability](#observability) for an | ||
example case where other event attributes may vary from one delivery attempt to | ||
another). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also added a note that idempotency was a reasonable way to deal with duplicate events.
specs/eventing/data-plane.md
Outdated
| other `2xx` | (Unspecified) | No\* | No\* | Yes\* | | ||
| other `3xx` | (Unspecified) | No\* | No\* | Yes\* | | ||
| `400` | Unparsable event | No | No | Yes | | ||
| `404` | Endpoint does not exist | Yes | No | Yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does a Yes in the Retry column means that the sender MUST retry or that it MAY retry if it wants. I'm not sure I would want to mandate a Retry on a 404.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or on any error TBH
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a SHOULD follow the sender's defined retry behavior, so this is a "MAY".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L134 specifies this as a "SHOULD":
Where possible, event senders SHOULD re-attempt delivery of events where the HTTP request failed or returned a retryable status code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still uncomfortable with a SHOULD for a 404 and 5xx errors. I can buy an argument that 429 might be retried, but the others (and probably 409) don't give me any reason to believe that a subsequent replay of the event would result in any different result. Maybe 404 if you believe the sink is still coming up I guess, but all of the others are just asking to be DDOSs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about a MAY if the sender has reason to believe that a retry might yield different results?
[observability](#observability) for an example case where other event attributes | ||
may vary from one delivery attempt to another). | ||
|
||
Where possible, event senders SHOULD re-attempt delivery of events where the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think SHOULD may be too strong here. In serving we specifically stopped the networking layer from retrying on 5xx errors because it ended up spamming the ksvc when the ksvc was correctly returning a 503. Even with a MAY I'd still like to see some text that talks about how the sender needs to be selective about it to avoid DDOSing the sink.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the mention of congestion control in the next sentence sufficient?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good thing to mention, but it still pushes people towards resending... just slower. I keep having flashbacks to wasting my time debugging a ksvc trying to figure out why I was seeing 3 times the errors in its logs... and it was because of the network retries. I think it's sufficient to talk about how it's ok to retry if the sender thinks a new result will happen, but not just because it failed with a 5xx
specs/eventing/data-plane.md
Outdated
event senders implement some form of congestion control (such as exponential | ||
backoff) when managing retry timing. This specification does not document any | ||
specific congestion control algorithm or parameters. | ||
[Brokers](./overview.md#broker) and [Channels](./overview.md#channel) MUST |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like the previous comment, MUST may be too strong here. I don't want my broker/channel to spam my ksvc if the error returned by the ksvc is an error and it's not going to change no matter how many times it is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we need to lean on congestion control a bit here to avoid overwhelming the destination, possibly including discarding some messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added:
Congestion control MAY cause event delivery to fail or retry attempts to be skipped
(Delivery failure would trigger dead-letter behavior, see #25 for details of that.)
specs/eventing/data-plane.md
Outdated
|
||
In some applications, an event receiver might emit an event in reaction to a | ||
received event. Components MAY to choose to support this pattern by accepting an | ||
encoded CloudEvent in the HTTP response. The sender SHOULD NOT assume that a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the SHOULD NOT here is very interesting when paired with the "might emit an event in reaction to a received event". The sentence above sure implies a relationship to me. So, if we're explicitly saying that it could be a totally unrelated event how can components like a Sequence provide any guarantees that it'll work properly? Can you explain why this SHOULD NOT is needed? Seems like we're opening the door for people to ask "ok, if they can be unrelated then how can I indicate that it IS related? Or how do I send a related response later?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I copied this from here:
https://github.com/knative/specs/blob/main/specs/eventing/data-plane.md#http-support
If Sink is Callable it MAY respond with 200 OK and a single event in the HTTP response. A returned event is not required to be related to the received event. The Callable should return a successful response if the event was processed successfully. If there is no event to send back then Callable Sink MUST respond with 2xx HTTP and with empty body.
TBH, I'm not sure why that's here, and I'd be happy to remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@n3wscott @slinkydeveloper @vaikas do you know why this is here? W/o a clear reason for why it could be a random event, I'd prefer to remove this and then leave the implication be that it's related. That leave it open for us to "clarify" it later by adding text to say that it can be random or mandate that it be related once we get more feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 with removing it. I don't know why that sentence is here.
specs/eventing/data-plane.md
Outdated
reply event in the absence of a `Prefer: reply` header, the sender SHOULD treat | ||
the event as accepted, and MAY log an error about the unexpected payload. The | ||
sender MUST NOT process the reply event if it did not advertise the | ||
`Prefer: reply` capability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we can mandate what a sender does w.r.t. the reply at all since we can't verify it. Or put another way, we can only control the interactions between the sender and receiver, not the sender's actions after that interaction.
Likewise, it feels wrong to push people to treat it as an error to get a response w/o the Prefer header. There are going to be lots of sinks that have no idea about this header and to encourage the sender to treat them all as errors doesn't feel right. Especially when they can't actually do anything about it. Granted it's a MAY but it feels like we're encouraging it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a header that the sender controls. I've attempted to clarify. This is saying "if a sender does not set Prefer: reply
in the request and receives a response, it should not process the response". I'm not sure how a sender would reach that situation by accident (except that our existing Channels / Broker didn't implement the extension we described, but I'm going to PR that shortly).
This also provides a sink the ability to determine whether they are on the subscriber
or reply
leg of a Subscription; subscriber
should advertise Prefer: reply
, while reply
doesn't have a place to handle the reply and shouldn't send that header.
In terms of determining the MUST for Brokers and Channels, we can actually determine what it does with the response message, because processing the event means re-enqueing it in the Broker or sending it to the reply
Addressable. We can have the subscriber
target refuse to send an event (or have it log an error or what-not) if the delivered event doesn't have the expected Prefer: reply
header.
1. Services can be connected to create new applications | ||
- without modifying producer or consumer. | ||
1. Services can be connected to create new applications: | ||
- without modifying producer or consumer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's use common terms... introducing "consume" in this doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what you're suggesting here; in the data plane doc, receiver is being used specifically to refer to the "origin server" side of HTTP, whereas this is intended to speak about the logical connections and flow of events. (All of motivation should use the produce/consume terminology, I think.)
|
||
1. Services can be connected to create new applications | ||
- without modifying producer or consumer. | ||
1. Services can be connected to create new applications: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what this one is trying to say. Do you mean we can add new sinks into the eventing infra/mesh w/o impacting existing producers and consumers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, this is saying that new event sources and sinks can be connected to the topology without needing to change any code in the producers or consumers.
Compare this with explicitly configuring exchanges in AMQP or topics in Kafka, where consuming from two exchanges or topics would required updating the code to create two subscriptions or consumer groups.
- [Resource type overview](overview.md) | ||
- [Interface contracts](interfaces.md) | ||
- [Object model specification](spec.md) | ||
Knative Eventing also defines patterns to simplify the construction and usage of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have an example of what you had in mind with this last sentence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Destination/Addressable and the high availability of Channels and Brokers are the two benefits to the event producer side. For the event consumer side, I think the reply pattern and retry policies along with CloudEvents-over-HTTP are the main contributors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm willing to cut this sentence if it doesn't feel like it belongs, though.
specs/eventing/overview.md
Outdated
|
||
### Addressable | ||
|
||
**Addressable** resources expose a resource address (HTTP URL) in their `status` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/address/address
/ to make it clear that's the name of the property. And we should show an example to make it even more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
specs/eventing/overview.md
Outdated
|
||
### Destination | ||
|
||
**Destination** is an interface (resource fragment) which is used consistently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you call it an "interface", but Addressable is a "resource"... it might be better to use consistent terms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a discussion with @slinkydeveloper about this:
I think Addressable is a contract, because you can access an object (via the K8s apiserver), and process it based on unpacking the partial schema.
You can't do this for a Destination, because Destination is not rooted at the base of the object. Instead, you'll find Destinations in several different places in spec, but they all work the same. I wanted to distinguish between duck typing and "common fragments like ObjectReference", because common fragments don't have the same ability to work across variant Kinds.
Destination is a different type of thing than Addressable (which is a duck type). This is more akin to a K8s ObjectReference -- a common type used in many places, but which can't be operated on abstractly in the same way as a duck type.
specs/eventing/overview.md
Outdated
simple URL or relative to an **Addressable** object reference; it also supports | ||
a Kubernetes Service object reference (as a special case). An absolute URL in a | ||
Destination may be used to reference cluster-external resources such as a | ||
virtual machine or SaaS service. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we show exactly what the resource definitions look like and make them normative? This applies to all resources define here. Having just textual descriptions leaves too much to the imagination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#addressable-resolution, https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#addressable-v1, and https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#duckv1destination should bring you joy, then.
|
||
- Topology-based event routing ([`messaging.knative.dev`](#messaging)) | ||
|
||
- Content-based event routing ([`eventing.knative.dev`](#eventing)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Knowing the history here I understand why you wrote this, but w/o more context (or explaining these concepts and how they related to one another) this is hard for a newbie to follow or understand why you're mentioning this. These two terms are never used again. I'd suggest we either remove this section and just talk about the individual resources below (including how they related to each other), or go into a lot more detail about the meta-model so people understand it better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at #25 (which includes this PR), you'll see that these are referenced there: https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#content-based-routing. I broke the PR up so people could get started and didn't need to digest all 1500 lines at once.
The Knative Eventing API provides primitives for two common event-processing | ||
patterns: | ||
|
||
- Topology-based event routing ([`messaging.knative.dev`](#messaging)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these *.knative.dev
namespaces significant from a spec perspective?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because the Kubernetes apiserver will serve these from:
/apis/messaging.knative.dev/v1/namespaces/{namespace}/channels/{name}
for Channels, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok - then I think we need to add some RFC keywords in here to mandate that these be used to ensure interop.
specs/eventing/overview.md
Outdated
may reference additional implementation-specific configuration options via a | ||
reference to an external object (either a kubernetes built-in like ConfigMap or | ||
a custom object); the format of the external objects is intentionally not | ||
standardized. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's nothing normative in here so what are people meant to do with this section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to be something like the CE primer, i.e. to provide background on the intent. Compare with https://github.com/knative/specs/blob/main/specs/serving/overview.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duglin I'm going to resolve comments where I made a change that mostly lines up with your suggestions, and ask you to ack or request further changes on the remaining comments that aren't resolved.
Thanks for the review! It definitely helped make this part clearer and stronger.
specs/eventing/data-plane.md
Outdated
|
||
A **Sink** MUST be able to handle duplicate events. | ||
- Knative Eventing assumes a sender-driven (push) event delivery system. That | ||
is, each event processor is actively responsible for an event until it is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, it actually means:
Each sender is responsible for an event until it receives an acknowledgement from the receiver. Each receiver MUST NOT send an acknowledgement for an event until it is ready to take responsibility for the event.
That's a little longer than "each event processor is actively responsible for an event", but maybe it's clearer?
specs/eventing/data-plane.md
Outdated
A **Sink** MUST be able to handle duplicate events. | ||
- Knative Eventing assumes a sender-driven (push) event delivery system. That | ||
is, each event processor is actively responsible for an event until it is | ||
handled (or affirmatively delivered to all following recipients). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my previous comment. It's like "hot potato" -- you're responsible for not dropping the potato until you hand it to someone else.
specs/eventing/data-plane.md
Outdated
[version 0.3](https://github.com/cloudevents/spec/blob/v0.3/http-transport-binding.md) | ||
with restrictions and extensions specified below. | ||
- Knative Eventing aims to make writing | ||
[event sources](./overview.md#event-source) and event-processing software |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are a few different roles:
- Event Sender - the party initiating the HTTP POST in this document
- Event Receiver - the party receiving the HTTP POST in this document
- Event Processor - a component which may be an "event sender", "event receiver", or both
- Source - an event processor which is only ever an "event sender"
- Sink - an event processor which is only ever an "event receiver"
Would a glossary of these terms near the top help?
I also notice that this says "make writing event sources and ... easier to write". I took out the first "writing"
- [Resource type overview](overview.md) | ||
- [Interface contracts](interfaces.md) | ||
- [Object model specification](spec.md) | ||
Knative Eventing also defines patterns to simplify the construction and usage of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm willing to cut this sentence if it doesn't feel like it belongs, though.
|
||
- Topology-based event routing ([`messaging.knative.dev`](#messaging)) | ||
|
||
- Content-based event routing ([`eventing.knative.dev`](#eventing)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you look at #25 (which includes this PR), you'll see that these are referenced there: https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#content-based-routing. I broke the PR up so people could get started and didn't need to digest all 1500 lines at once.
specs/eventing/overview.md
Outdated
|
||
### Addressable | ||
|
||
**Addressable** resources expose a resource address (HTTP URL) in their `status` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
specs/eventing/overview.md
Outdated
|
||
### Destination | ||
|
||
**Destination** is an interface (resource fragment) which is used consistently |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a discussion with @slinkydeveloper about this:
I think Addressable is a contract, because you can access an object (via the K8s apiserver), and process it based on unpacking the partial schema.
You can't do this for a Destination, because Destination is not rooted at the base of the object. Instead, you'll find Destinations in several different places in spec, but they all work the same. I wanted to distinguish between duck typing and "common fragments like ObjectReference", because common fragments don't have the same ability to work across variant Kinds.
Destination is a different type of thing than Addressable (which is a duck type). This is more akin to a K8s ObjectReference -- a common type used in many places, but which can't be operated on abstractly in the same way as a duck type.
specs/eventing/overview.md
Outdated
simple URL or relative to an **Addressable** object reference; it also supports | ||
a Kubernetes Service object reference (as a special case). An absolute URL in a | ||
Destination may be used to reference cluster-external resources such as a | ||
virtual machine or SaaS service. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#addressable-resolution, https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#addressable-v1, and https://github.com/knative/specs/blob/99bb2e0cb0e33f36dfd3a4475ebf08668fbca2f3/specs/eventing/control-plane.md#duckv1destination should bring you joy, then.
| `400` | Unparsable event | No | No | Yes | | ||
| `404` | Endpoint does not exist | Yes | No | Yes | | ||
| `409` | Conflict / Processing in progress | Yes | No | Yes | | ||
| `429` | Too Many Requests / Overloaded | Yes | No | Yes | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec should explicitly reference https://datatracker.ietf.org/doc/html/rfc7231#section-7.1.3 as stated here: knative-extensions/eventing-kafka#666 (comment)
The CloudEvent Webhook spec says:
... The sender MUST observe the value of the Retry-After header and refrain from sending further requests until the indicated time.
With a proper reference to 7231 we should be good, right?
/close We're going to merge #25 and close this. |
@evankanderson: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Signed-off-by: Pierangelo Di Pilato <pierangelodipilato@gmail.com>
I'm afraid these ended up being mostly-complete rewrites.
I plan to merge
broker.md
,channel.md
,spec.md
, andinterfaces.md
into acontrol-plane.md
document, which will address both lifecycle and resource interactions as well as event routing semantics.My current plan is probably to remove
helper.md
andsources.md
, as the former seems like implementation-specific guidance, and the latter feels like it might be better in a 1.1 document along with the Discovery API for sources. I'm happy to be convinced otherwise, but conformance testing for sources feels somewhat uncharted.