A36: xDS-Enabled Servers #214

ejona86 · 2020-11-20T21:00:15Z

See "open issues" at the end of the doc. At the time of creation, Go and C are much more tentative than Java.

ejona86 · 2020-11-20T21:15:58Z

easwars · 2020-11-20T22:33:32Z

A36-xds-for-servers.md

+
+The xDS-returned Listener must have an [`address`][Listener.address] that
+matches the listening address provided. The Listener's `address` would be a TCP
+`SocketAddress` with matching `address` and `port_value`. The XdsClient must NAK


s/NAK/NACK maybe?

Yes. Two spellings for the same thing.

easwars · 2020-11-20T22:38:15Z

A36-xds-for-servers.md

+`example/resource?udpa.resource.listening_address=[::]:80`. Note that the `[]`
+is _not_ percent encoded, as this string is not a URI.
+
+The xDS-returned Listener must have an [`address`][Listener.address] that


In Go, the piece of code which processes the response from the management server and decides whether to ack or nack, does not know about the set of resources being watched (that happens at a higher layer). This means that when we get a Listener resource, we do not have a way to make sure that the address and port_value in the SocketAddress match the one specified in the listening_address field. What we can do is to make sure that it matches with the name field (because eventually the name in the response has to match the one in the request).

I wanted to bring this us up to see if there will be any inconsistency in the handling of this across the languages.

There is no listening_address field. It is just encoded in the resource name in the request. So when you match the name in the response to the name in the request you are indirectly verifying the value of the listening_address. In that case it should not be difficult to match that part with the [address][Listener.address] value either.

Repeating my earlier comment from reviewable:

Hmm, good question. Maybe the XdsClient can verify that the Listener resource has its address field set to the same address as in the resource name. Although that might be difficult if we use the template approach I suggest above instead of hard-coding "xds.resource.listener_address=" in our implementation.

We have now switched to the template approach. Given that, I don't think we can make any assumptions about the name field.

I would like to do this validation, but I don't see a reasonable way to do it without some fairly ugly layering violations. Unless one of you can think of another way to do this, I think we may need to live without this check. :(

I can think of the following layered approach for validating this:

the xDSClient validates the returned Listener by matching the resource name against the requested resource name (trivial common check for all resources?)

the actual watcher that requested this Listener resource then validates that the returned Listener has the address (Listener.address) matching the expected value. This might be too late to NACK to the control plane but should also be rare (a bug in the control plane).

easwars · 2020-11-20T22:43:14Z

A36-xds-for-servers.md

+
+### Go
+
+Create an `xds.Server` struct that would internally contain a `grpc.Server`. It


s/xds.Server/xds.GRPCServer

https://github.com/grpc/grpc-go/blob/8736dcd05b735913226c0c8bf787db48a13a7129/xds/server.go#L86

easwars · 2020-11-20T22:47:31Z

A36-xds-for-servers.md

+Create an `xds.Server` struct that would internally contain a `grpc.Server`. It
+would inject its own `StreamServerInterceptor` and `UnaryServerInterceptor`s
+into the `grpc.Server`. The interceptors would be controlled by an
+`atomic.Value` that would update with the configuration. When


Could we skip this line about using an atomic to update the interceptors? It seems very implementation specific and could change anytime. Currently, we do have an interceptor which does nothing, and is not changed based on any configuration received. When we start supporting changing of the interceptors based on received configuration, we could use an atomic or a mutex.

I've added in the option for using a mutex. Calling out how the interceptor will roughly work is important to show the design actually works.

easwars · 2020-11-20T22:50:55Z

A36-xds-for-servers.md

+TODO: It looks like there is a lifetime issue for the XdsClient, as RPCs can
+continue after the listener is closed and Serve() returns.
+
+Because service registration is done on a method in the generated code, and


This is no longer true. We have added a ServiceRegistrar interface to the grpc package which contains a single method RegisterService, and this is implemented by both grpc.Server and xds.GRPCServer. The protoc-gen-go has also been updated to emit code where the registration function accepts this interface instead of a concrete type. So, users should be able to directly pass xds.GRPCServer to these registration functions, and the underlying grpc.Server is no longer exported.

https://github.com/grpc/grpc-go/blob/8736dcd05b735913226c0c8bf787db48a13a7129/server.go#L558

How will the code snippet below change for xDS? Was a new method introduced to the generated code?

s := xds.NewGRPCServer() pb.RegisterGreeterServer(s, &server{})

easwars · 2020-11-20T22:58:15Z

A36-xds-for-servers.md

+```
+
+
+To allow passing configuration, the `xds.Server` will be responsible for


This section needs updating as well. The current plan can be summarized as follows:

An API to create a server-side XdsCredentials will be exposed. This will be similar to the NewClientCredentials, and would be called NewServerCredentials.

The xds.GRPCServer will look through the ServerOptions list passed to it and will use a type assertion to determine whether or not the user has specified XdsCredentials.

If the user has specified XdsCredentials, any information required by the credentials implementation will be made available by exposing a function on a wrapped net.Conn passed to the ServerHandshake method.

The xds.GRPCServer will be responsible for wrapping the raw net.Conn corresponding to the incoming connection before invoking the ServerHandshake method.

sanjaypujare · 2020-11-21T23:54:37Z

A36-xds-for-servers.md

+## Abstract
+
+Bring xDS-based configuration to Servers. xDS has many features and this gRFC
+will not provide support for any of them in particular. However, it will be


s/it will be provide/it will provide/

sanjaypujare · 2020-11-22T00:15:24Z

A36-xds-for-servers.md

+configure the API, gRPC should log the errors at a default-visible log level. If
+the xds bootstrap is missing or invalid, implementations would ideally fail
+server startup, but a permanent hang is also acceptable. "Initial configuration"
+does not have to include server credentials.


Not clear what this means. Does it mean the initial config received from the control plane may not include TLS configuration but a later one may? Or by "server credentials" you mean the credentials needed to talk to the control plane? In this context "server" is ambiguous because it could mean the xDS control plane or the gRPC xDS-server under question.

markdroth

Thanks for writing this up, Eric!

Reviewed 1 of 1 files at r1.
Reviewable status: all files reviewed, 20 unresolved discussions (waiting on @easwars and @ejona86)

A36-xds-for-servers.md, line 57 at r1 (raw file):

of the existing API.

Credential configuration will be managed separately by providing a

This paragraph seems like it might belong in the security gRFC instead of this one. Or maybe leave it here but add something like "see gRFC A29 for details"?

A36-xds-for-servers.md, line 65 at r1 (raw file):

When starting the XdsServer, it will start communication with xDS to receive
initial configuration. `bind()` and `listen()` can happen immediately, but

I think it's worth pointing out that calling listen() before the xDS resource is obtained is actually worse for clients, because even without calling accept(), it will still cause timeout delays before pick_first can fail over to the next address in the list. So I think implementations should prefer to defer calling listen() until after the xDS resource is obtained. That's what we're planning to do in C-core.

Also, the language here seems to imply that bind() should be done immediately even if listen() is delayed. Is there actually any reason for that? Based on what @yashykt and I are looking at in C-core, it seems like the only case where it would matter that bind() happens before we get the xDS resource is a server using an ephemeral port. And that seems like such a niche case that I think I'd be okay saying that we just don't support it for the xDS-enabled server API. (To be clear, it does seem better to call bind() earlier if possible, so that we can support ephemeral ports, but I think it's a nice-to-have, not a hard requirement. In C-core, it looks like that may be hard to de-couple, so at least initially we will probably start by deferring both bind() and listen() until after we have the xDS resource. We can revisit that later if someone actually wants to use ephemeral ports with xDS.)

A36-xds-for-servers.md, line 80 at r1 (raw file):

```json
{
  "grpc_server_resource_name_id": "example/resource",

Given our recent discussion about what we're going to want to do for new-style resource names, how about doing something like this:

  // A template for the name of the Listener resource to subscribe to
  // for a gRPC server.
  // The token "%s", if present in this string, will be replaced with
  // the IP and port on which the server is listening.
  "server_listener_resource_name_template": "grpc/server?xds.resource.listening_address=%s",

This seems more flexible for deployments, and it has the nice property that we'd avoid hard-coding the "xds.resource.listening_address" part.

A36-xds-for-servers.md, line 90 at r1 (raw file):

`envoy.config.listener.v3.Listener` resource. XdsClient will concatenate the
value of `grpc_server_resource_name_id` from the bootstrap,
`?udpa.resource.listening_address=`, and the listening address to construct the

s/udpa/xds/

A36-xds-for-servers.md, line 93 at r1 (raw file):

resource name. For example, with an address of `[::]:80` and the example
bootstrap field above, the resource name would be
`example/resource?udpa.resource.listening_address=[::]:80`. Note that the `[]`

s/udpa/xds/

A36-xds-for-servers.md, line 96 at r1 (raw file):

Previously, sanjaypujare wrote…

There is no listening_address field. It is just encoded in the resource name in the request. So when you match the name in the response to the name in the request you are indirectly verifying the value of the listening_address. In that case it should not be difficult to match that part with the [address][Listener.address] value either.

Hmm, good question. Maybe the XdsClient can verify that the Listener resource has its address field set to the same address as in the resource name. Although that might be difficult if we use the template approach I suggest above instead of hard-coding "xds.resource.listener_address=" in our implementation.

A36-xds-for-servers.md, line 120 at r1 (raw file):

    EXTERNAL = 2;
  }
  google.protobuf.UInt32Value destination_port = 8; // Always fail match

Do we fail to match even if the destination port actually matches the port we're listening on?

A36-xds-for-servers.md, line 125 at r1 (raw file):

  repeated core.v3.CidrRange source_prefix_ranges = 6;
  repeated uint32 source_ports = 7;
  repeated string server_names = 11; // May always fail match if TLS unsupported

The comment here is a little confusing. What's the intended behavior here? (It's not actually entirely clear to me what this field actually does in Envoy.)

A36-xds-for-servers.md, line 126 at r1 (raw file):

  repeated uint32 source_ports = 7;
  repeated string server_names = 11; // May always fail match if TLS unsupported
  string transport_protocol = 9; // Always fail match

Why would this always fail? It looks like the supported values are "raw_buffer" and "tls", to select based on whether the connection is encrypted or not.

A36-xds-for-servers.md, line 127 at r1 (raw file):

  repeated string server_names = 11; // May always fail match if TLS unsupported
  string transport_protocol = 9; // Always fail match
  repeated string application_protocols = 10; // Always fail match

Always fail even if this says "h2"?

A36-xds-for-servers.md, line 131 at r1 (raw file):

All fields are "supported," however, we know that some features are unsupported.

I don't quite understand the meaning of this paragraph.

Should we reject a config that specifies filters, or do we just ignore the filters for now? If we just ignore them for now, then what happens when we later add support for filters? Will we be in a situation where a config that did previously work will then be NACKed?

In particular, I'm wondering about the HTTP connection manager (HCM) filter, which should always be the last filter in the chain. Should we verify that that's there, since we know that we're going to need some functionality from the HCM config later?

A36-xds-for-servers.md, line 141 at r1 (raw file):

no longer in use; they are a resource and must not be leaked.

This overall "wrapping" server API has many language-specific ramifications. We

I wonder if we should actually split the individual language sections out into their own gRFCs with L numbers? We're ultimately going to need a separate section for every wrapped language.

A36-xds-for-servers.md, line 284 at r1 (raw file):

impact is hoped to be low as servers will generally receive their initial
configuration quickly (on the order of a second) and most clients will likely be
using a variant of round-robin. We'd also encourge the xDS control plane to

Why do we encourage the xDS server to avoid handing out addresses of not-yet-started servers? I don't see the rationale for that.

yashykt

Reviewable status: 0 of 1 files reviewed, 21 unresolved discussions (waiting on @easwars, @ejona86, and @markdroth)

A36-xds-for-servers.md, line 101 at r1 (raw file):

the resource if the address does not match.

Although `Filter`s will not be observed initially, the `FilterChain` contains

Is this paragraph meant to be a placeholder for future changes? i.e. do we expect the types that we use to change in the future?

A36-xds-for-servers.md, line 103 at r1 (raw file):

Although `Filter`s will not be observed initially, the `FilterChain` contains
data that may be used. When looking for a FilterChain, the standard matching
logic must be used. Each `filter_chain_match` of the repeated

FYI currently, implementation-wise C Core and Go are accepting only a single filter_chain entry, so there is no matching logic in place at the moment. We aren't looking at default_filter_chain either. The only check that we are expecting to perform from the single entry's filter_chain_match is to check whether the application_protocol mentions "h2".

markdroth

Reviewed 1 of 1 files at r2.
Reviewable status: all files reviewed, 21 unresolved discussions (waiting on @easwars and @ejona86)

A36-xds-for-servers.md, line 80 at r1 (raw file):

Previously, markdroth (Mark D. Roth) wrote…

Given our recent discussion about what we're going to want to do for new-style resource names, how about doing something like this:
  // A template for the name of the Listener resource to subscribe to
  // for a gRPC server.
  // The token "%s", if present in this string, will be replaced with
  // the IP and port on which the server is listening.
  "server_listener_resource_name_template": "grpc/server?xds.resource.listening_address=%s",
This seems more flexible for deployments, and it has the nice property that we'd avoid hard-coding the "xds.resource.listening_address" part.

Although actually, if we do use the template here, then the parameter name should probably not start with "xds.", since that namespace is reserved for parameters added by the client implementation, not those added by the user.

ejona86 · 2020-12-22T01:06:23Z

Responses to @markdroth

So I think implementations should prefer to defer calling listen() until after the xDS resource is obtained.

The very next sentence stated "listen() should be delayed if possible". The rationale section describes the issues listening causes.

Given our other discission, I think this may change to "either don't call listen(), or call listen() and do accept()+close()".

Also, the language here seems to imply that bind() should be done immediately even if listen() is delayed.

It sayed "can". Changed to "may." Java does not bind before receiving the xDS Listener, so that definitely was not the intention. And yes, ephemeral ports won't have a port chosen at that point.

Given our recent discussion about what we're going to want to do for new-style resource names, how about doing something like this: (server_listener_resource_name_template snippet)

Yes, we should change it. Although I'll note that this will break experimental Java users today. But I think we can just add a new experimental flag to the bootstrap and leave the current flag's behavior unchanged. When we remove the grpc_server_resource_name_id flag that's when those clients will break, but they should have migrated to a new experimental flag or the default-on behavior once it is available.

Although actually, if we do use the template here, then the parameter name should probably not start with "xds.", since that namespace is reserved for parameters added by the client implementation, not those added by the user.

It isn't a new-style name so it was already wild-west. But, yeah, good point. I guess with this templating thing we don't need a well-defined name and can just let every server define their own, since now it isn't a protocol between the client and server as it was when this was implicit. CC @sanjaypujare

envoy.config.listener.v3.Listener resource. XdsClient will concatenate the
value of grpc_server_resource_name_id from the bootstrap,
?udpa.resource.listening_address=, and the listening address to construct the

s/udpa/xds/

Looks like Java, and thus probably TD, already used 'udpa'. That can be changed, but "we had all agreed" already to the old name. We can either leave it as-is for the old-style name and improve things for new-style names or change it now even though "it was already decided and agreed on and implemented." I understand how this happened though; this is due to a more clear future direction since we fleshed out a federation design. CC @sanjaypujare

google.protobuf.UInt32Value destination_port = 8; // Always fail match

Do we fail to match even if the destination port actually matches the port we're listening on?

Yes. The Listener's address is what is important for the local port. destination_port is only applicable for use_original_dst. It's documentation: "Optional destination port to consider when use_original_dst is set on the listener in determining a filter chain match."

repeated string server_names = 11; // May always fail match if TLS unsupported

The comment here is a little confusing. What's the intended behavior here? (It's not actually entirely clear to me what this field actually does in Envoy.)

Changed to "Always fail match for non-TLS connections". I think I had intended to let this to have "always fail" logic before XdsServerCredentias is implemented, but I think that actually doesn't help much. This uses SNI to determine what host the client thinks it is connecting to.

string transport_protocol = 9; // Always fail match

Why would this always fail? It looks like the supported values are "raw_buffer" and "tls", to select based on whether the connection is encrypted or not.

It changes based on envoy.filters.listener.tls_inspector; aka connection sniffing. I seriously considered only matching the value "raw_buffer", but the entire thing makes me uneasy as it is really easy to accidentally use wrong.

repeated string application_protocols = 10; // Always fail match

Always fail even if this says "h2"?

Yep. This is more tls_inspector goo; it is only set by that filter (at least the docs say so).

All fields are "supported," however, we know that some features are unsupported.

I don't quite understand the meaning of this paragraph.

Should we reject a config that specifies filters, or do we just ignore the filters for now?

Filters will be ignored initially, just like on client-side. That was at defined at "Although Filters will not be observed initially..."

But this discussion of "fields" only applies to FilterChainMatch.

You, Doug, and me had discussed checking filters and decided to just mirror client-side. Any issues that arise will be no worse than we will already suffer.

impact is hoped to be low as servers will generally receive their initial
configuration quickly (on the order of a second) and most clients will likely be
using a variant of round-robin. We'd also encourge the xDS control plane to

Why do we encourage the xDS server to avoid handing out addresses of not-yet-started servers? I don't see the rationale for that.

In the context of EDS, it is best not to send clients to unhealthy/not ready backends.

ejona86 · 2020-12-22T01:06:46Z

Responses to @yashykt

Although Filters will not be observed initially, the FilterChain contains

Is this paragraph meant to be a placeholder for future changes? i.e. do we expect the types that we use to change in the future?

At some point in the future we expect to support filters. But for this gRFC it will not change; adding filter support will be some later effort.

FYI currently, implementation-wise C Core and Go are accepting only a single filter_chain entry, so there is no matching logic in place at the moment. We aren't looking at default_filter_chain either. The only check that we are expecting to perform from the single entry's filter_chain_match is to check whether the application_protocol mentions "h2".

That's fine since that code is not enabled by default. But every part of that behavior would need to be changed/enhanced before GA.

ejona86 · 2020-12-11T20:53:45Z

A36-xds-for-servers.md

+
+The xDS-returned Listener must have an [`address`][Listener.address] that
+matches the listening address provided. The Listener's `address` would be a TCP
+`SocketAddress` with matching `address` and `port_value`. The XdsClient must NAK


Yes. Two spellings for the same thing.

ejona86 · 2020-12-11T20:55:15Z

A36-xds-for-servers.md

+Create an `xds.Server` struct that would internally contain a `grpc.Server`. It
+would inject its own `StreamServerInterceptor` and `UnaryServerInterceptor`s
+into the `grpc.Server`. The interceptors would be controlled by an
+`atomic.Value` that would update with the configuration. When


I've added in the option for using a mutex. Calling out how the interceptor will roughly work is important to show the design actually works.

ejona86 · 2020-12-11T21:00:15Z

A36-xds-for-servers.md

+TODO: It looks like there is a lifetime issue for the XdsClient, as RPCs can
+continue after the listener is closed and Serve() returns.
+
+Because service registration is done on a method in the generated code, and


How will the code snippet below change for xDS? Was a new method introduced to the generated code?

s := xds.NewGRPCServer() pb.RegisterGreeterServer(s, &server{})

A36-xds-for-servers.md

markdroth

This looks really good!

Just a couple of comments remaining. One of them is mostly cosmetic (being a bit clearer in terms of how we describe the constraints around bind() and listen()). The other is actually a content change, unless we can think of a better way to do the address check.

Please let me know if you have any questions. Thanks!

A36-xds-for-servers.md

markdroth · 2021-01-05T20:29:06Z

A36-xds-for-servers.md

+`example/resource?udpa.resource.listening_address=[::]:80`. Note that the `[]`
+is _not_ percent encoded, as this string is not a URI.
+
+The xDS-returned Listener must have an [`address`][Listener.address] that


Repeating my earlier comment from reviewable:

Hmm, good question. Maybe the XdsClient can verify that the Listener resource has its address field set to the same address as in the resource name. Although that might be difficult if we use the template approach I suggest above instead of hard-coding "xds.resource.listener_address=" in our implementation.

We have now switched to the template approach. Given that, I don't think we can make any assumptions about the name field.

I would like to do this validation, but I don't see a reasonable way to do it without some fairly ugly layering violations. Unless one of you can think of another way to do this, I think we may need to live without this check. :(

A36-xds-for-servers.md

dfawley · 2021-01-06T17:48:39Z

A36-xds-for-servers.md

+stall until it succeeds. Each language will provide an XdsServer-specific API to
+inform the application of initial communication failures. If the user does not


Does Java/C have (or plan to have) an API to expose this? I don't think we designed anything like this for Go (yet?). @easwars @yashykt

See XdsIniitListener for Java.

C does not have one yet, but it is in the works.

A36-xds-for-servers.md

Waiting for xds config is not just a one-shot affair at server startup, since xds can delete resources after started. That caused a pretty strong rework to the behavior, including ditching listen()-but-don't-accept(). We still need to validate that accept()+close() won't cause poor behavior for clients such as Envoy, but the options available are limited.

sanjaypujare · 2021-02-06T00:05:39Z

A36-xds-for-servers.md

+is restored. However, XdsServer must accept configuration changes provided by
+the xDS server, including resource deletions. If that causes the XdsServer to
+lack essential configuration (e.g., the Listener was deleted) the server would
+need to enter "not serving" mode. "Not serving" requires existing connections to


How about the "Not serving" to "Serving" transition? Specifically:

this transition will automatically start processing RPCs without any intervention by the application, correct? i.e. it is as though start() was called on the XdsServer object (but without the application having to call it)

this will require recreating the internal/delegate Server object since the delegate does not support restarting after shutting down. Is that correct?

It will be good to add some clarification for this.

this transition will automatically start processing RPCs without any intervention by the application, correct? i.e. it is as though start() was called on the XdsServer object (but without the application having to call it)

Yes. The application had already called start() as is expecting the server to be running. It was the library that "called stop()" so it is its responsibility to resume and restore the expected application state when able.

this will require recreating the internal/delegate Server object since the delegate does not support restarting after shutting down. Is that correct?

Correct. This will vary with the language-specific API and the approach taken to become "not serving." But assuming XdsServer calls server.stop() on the wrapped server (the internal one), then it'd need to create a new one to replace it when calling start().

I didn't go into more detail here because that is language and implementation-specific. I did mention in the Java section that we may need to rebuild the server, and that was to allow for things here. But it isn't required to re-create the server; I left open the option for you to use a ProtocolNegotiator to immediately close() connections. Such an approach wouldn't have needed to re-create the server.

A36-xds-for-servers.md

markdroth

This looks really good! I think the only things we need are the changes we've discussed about L4 filters, HTTP filters, and the RouteConfiguration.

Thanks again for driving this!

A36-xds-for-servers.md

easwars · 2021-02-19T00:33:42Z

A36-xds-for-servers.md

+message FilterChainMatch {
+  enum ConnectionSourceType {
+    ANY = 0;
+    SAME_IP_OR_LOOPBACK = 1;


For the SAME_IP_OR_LOOPBACK, one can easily verify if the source IP of the incoming connection is from a loopback address. But do we also have to support the case where the host supports multiple addresses and we need to make sure that the source IP of the incoming connection is one of them? Also, does anything change based on whether the server is bound to an actual local address or any address?

No, SAME_IP_OR_LOOPBACK means the source IP and dest IP are the same (or localhost); it doesn't support cases where a machine has multiple IPs and the IP selection is different between source and dest.

Nothing changes by the wildcard. It should be straight-forward to implement.

A36-xds-for-servers.md

sanjaypujare · 2021-02-22T01:17:42Z

A36-xds-for-servers.md

+automatically (e.g., retry every minute).
+
+Communication failures do not impact the XdsServer's xDS configuration; the
+XdsServer should continue using the most recent configuration until connectivity


Is that true also for the very first time config is needed during start() (in Java) ? What's the expected behavior when the listener resource is missing at startup time? Should the server's start() fail with the server going into permanent failure state? Or should it go into "Not serving" state and loop until the listener resource is available?

This is basically saying "if an I/O error occurs, continue doing what you were doing before." And that applies to startup, because at startup we are "not serving." No, the server's start() should not fail. That is covered by the "XdsServer's start must not fail due to transient xDS issues, like missing xDS configuration from the xDS server" language below. We absolutely must avoid a permanent failure state triggered by transient I/O failures. We'd stay "not serving" until something improves.

A36-xds-for-servers.md

ejona86 · 2021-02-25T01:25:59Z

I pushed an important filter handling addition; I've mentioned to some of you that was coming. It is yet something else to take care of, but I hope isn't that bad because some client-side work that's already been happening should take care of the harder part.

I made a smattering of fixes for nits. I'm aware some comments still need to be addressed/replied.

ejona86 · 2021-02-25T01:27:36Z

If you see my "Make explicit that xds v3 is optional" commit message. Rest assured, it should have said v2. It happens that v3 isn't optional yet :-)

ejona86

I think everything except the question of validating use_original_dst is resolved. use_original_dst will be a trivial change if we choose to make it, so I think this is very close and we should be looking at the final-ish round of reviews.

ejona86 · 2021-02-25T20:53:07Z

A36-xds-for-servers.md

+automatically (e.g., retry every minute).
+
+Communication failures do not impact the XdsServer's xDS configuration; the
+XdsServer should continue using the most recent configuration until connectivity


This is basically saying "if an I/O error occurs, continue doing what you were doing before." And that applies to startup, because at startup we are "not serving." No, the server's start() should not fail. That is covered by the "XdsServer's start must not fail due to transient xDS issues, like missing xDS configuration from the xDS server" language below. We absolutely must avoid a permanent failure state triggered by transient I/O failures. We'd stay "not serving" until something improves.

ejona86 · 2021-02-25T21:02:45Z

A36-xds-for-servers.md

+is restored. However, XdsServer must accept configuration changes provided by
+the xDS server, including resource deletions. If that causes the XdsServer to
+lack essential configuration (e.g., the Listener was deleted) the server would
+need to enter "not serving" mode. "Not serving" requires existing connections to


this transition will automatically start processing RPCs without any intervention by the application, correct? i.e. it is as though start() was called on the XdsServer object (but without the application having to call it)

Yes. The application had already called start() as is expecting the server to be running. It was the library that "called stop()" so it is its responsibility to resume and restore the expected application state when able.

this will require recreating the internal/delegate Server object since the delegate does not support restarting after shutting down. Is that correct?

Correct. This will vary with the language-specific API and the approach taken to become "not serving." But assuming XdsServer calls server.stop() on the wrapped server (the internal one), then it'd need to create a new one to replace it when calling start().

I didn't go into more detail here because that is language and implementation-specific. I did mention in the Java section that we may need to rebuild the server, and that was to allow for things here. But it isn't required to re-create the server; I left open the option for you to use a ProtocolNegotiator to immediately close() connections. Such an approach wouldn't have needed to re-create the server.

A36-xds-for-servers.md

markdroth

This looks fantastic! Thanks for nailing down all of these details!

A36-xds-for-servers.md

A36: xDS-Enabled Servers

b825f33

ejona86 force-pushed the xds-for-servers branch from 183c8f9 to b825f33 Compare November 20, 2020 21:29

easwars reviewed Nov 20, 2020

View reviewed changes

sanjaypujare reviewed Nov 21, 2020

View reviewed changes

sanjaypujare reviewed Nov 22, 2020

View reviewed changes

markdroth requested changes Dec 11, 2020

View reviewed changes

Mostly editorial changes and small tweaks

6eee828

yashykt reviewed Dec 11, 2020

View reviewed changes

markdroth requested changes Dec 11, 2020

View reviewed changes

Language tweaks

56ddd35

ejona86 commented Dec 22, 2020

View reviewed changes

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

sanjaypujare reviewed Dec 22, 2020

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

ejona86 added 2 commits December 22, 2020 17:29

Rework how listener_filters are handled and the impact to Matchers

dabf9aa

Swap to server_listener_resource_name_template

9b883c5

markdroth reviewed Jan 5, 2021

View reviewed changes

dfawley reviewed Jan 6, 2021

View reviewed changes

sanjaypujare reviewed Jan 7, 2021

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

srini100 mentioned this pull request Jan 14, 2021

xds: allow usage without specific import of xds package grpc/grpc-go#4123

Closed

ejona86 added 2 commits January 15, 2021 12:29

Explicitly define how to shut down existing connections

6fc8390

sanjaypujare reviewed Feb 6, 2021

View reviewed changes

sanjaypujare reviewed Feb 8, 2021

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

sanjaypujare mentioned this pull request Feb 8, 2021

xds: Implement XdsServingStatusListener as per the new xDS server gRFC grpc/grpc-java#7876

Merged

ejona86 commented Feb 11, 2021

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

markdroth reviewed Feb 11, 2021

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

A36-xds-for-servers.md Show resolved Hide resolved

A36-xds-for-servers.md Show resolved Hide resolved

A36-xds-for-servers.md Show resolved Hide resolved

easwars reviewed Feb 18, 2021

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

easwars reviewed Feb 19, 2021

View reviewed changes

sanjaypujare mentioned this pull request Feb 19, 2021

xds: implement the new filterChainMatch algorithm grpc/grpc-java#7910

Merged

sanjaypujare reviewed Feb 22, 2021

View reviewed changes

easwars reviewed Feb 24, 2021

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

easwars reviewed Feb 24, 2021

View reviewed changes

A36-xds-for-servers.md Show resolved Hide resolved

ejona86 added 5 commits February 24, 2021 16:58

Require filter handling and validation

6eff58a

Make explicit that xds v3 is optional

9b301cc

s/addition/additional/;s/second time/more than once/

35286d5

If no chain matches, close connection

5c4d52d

s/is missing or unknown/does not exist or has not yet been received/

c8bc62f

ejona86 commented Feb 25, 2021

View reviewed changes

Validate use_original_dst != true

d0098ea

ejona86 requested a review from markdroth February 25, 2021 22:37

markdroth approved these changes Feb 26, 2021

View reviewed changes

A36-xds-for-servers.md Outdated Show resolved Hide resolved

ejona86 added 2 commits February 26, 2021 13:03

Mark as ready for implementation; sync last updated date

2ef6bc2

Replace broken link, since this is being merged before A29

e60b6df

ejona86 merged commit 71f4dea into grpc:master Feb 26, 2021

ejona86 deleted the xds-for-servers branch February 26, 2021 21:06

ejona86 mentioned this pull request May 4, 2021

Add xds-enabled server creation details for Core and C++ #234

Merged


		### Go

		Create an `xds.Server` struct that would internally contain a `grpc.Server`. It

		```


		To allow passing configuration, the `xds.Server` will be responsible for

		stall until it succeeds. Each language will provide an XdsServer-specific API to
		inform the application of initial communication failures. If the user does not

A36: xDS-Enabled Servers #214

A36: xDS-Enabled Servers #214

Conversation

ejona86 commented Nov 20, 2020

ejona86 commented Nov 20, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sanjaypujare Jan 7, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

markdroth left a comment

Choose a reason for hiding this comment

yashykt left a comment

Choose a reason for hiding this comment

markdroth left a comment

Choose a reason for hiding this comment

ejona86 commented Dec 22, 2020 • edited Loading

ejona86 commented Dec 22, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

markdroth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

markdroth left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sanjaypujare Feb 22, 2021 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ejona86 commented Feb 25, 2021 • edited Loading

ejona86 commented Feb 25, 2021

ejona86 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

markdroth left a comment

Choose a reason for hiding this comment

sanjaypujare Jan 7, 2021 •

edited

Loading

ejona86 commented Dec 22, 2020 •

edited

Loading

sanjaypujare Feb 22, 2021 •

edited

Loading

ejona86 commented Feb 25, 2021 •

edited

Loading