From fc5070cc41e53cc65d4036ec52ed7e0ae5032f7a Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Wed, 21 Jul 2021 09:57:39 -0700 Subject: [PATCH 01/14] xRFC TP2: Dynamically Generated Cacheable xDS Resources Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 1064 +++++++++++++++++ 1 file changed, 1064 insertions(+) create mode 100644 proposals/TP2-dynamically-generated-cacheable-xds-resources.md diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md new file mode 100644 index 00000000..54dede49 --- /dev/null +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -0,0 +1,1064 @@ +TP2: Dynamically Generated Cacheable xDS Resources +---- +* Author(s): markdroth, htuch +* Approver: htuch +* Implemented in: +* Last updated: 2021-08-03 + +## Abstract + +This xRFC proposes a new mechanism to allow xDS servers to +dynamically generate the contents of xDS resources for individual +clients while at the same time preserving cacheability. Unlike the +context parameter mechanism that is part of the new xDS naming scheme (see +[xRFC TP1](https://github.com/cncf/xds/pull/6)), the mechanism described in +this proposal is visible only to the transport protocol layer, not to the +data model layer. This means that if a resource has a parameter that +affects its contents, that parameter is not part of the resource's name, +which means that any other resources that refer to the resource do not +need to encode the parameter. Therefore, use of these parameters is +not viral, thus making the mechanism much easier to use. + +## Background + +There are many use-cases where a control plane may need to dynamically +generate the contents of xDS resources to tailor the resources for +individual clients. 
Here are some examples: + +- **xDS minor/patch version negotiation.** In this case, each client + supports a given minor and patch version, and the server may choose to + use newer API features when talking to clients that support newer versions + of the API. (See https://github.com/envoyproxy/envoy/issues/8416 for + details.) +- **Sharding Cluster resources for scalability.** In this use-case, there + are a really large number of clusters, too many for any one client to + handle. The clusters are divided into shards, and each client is given + a dynamically changing assignment of which shards to load. To support + this, there needs to be a different variant of the `ClusterCollection` + resource for each combination of shards that may be assigned to a given + client. The shard assignments are generally determined dynamically on + the client but may change at any time. +- **Sharding endpoints for scalability.** This is similar to the previous + case, except that there is a single cluster with a large number of + endpoints. The goal is that the xDS server will send different subsets + of endpoints to different clients, thus avoiding unwanted connections + when there are large numbers of both servers and clients. (At Google, + this is referred to as "subsetting", but it's a different feature than + the one that Envoy uses that term for.) In this case, it is desirable + for the xDS server to determine the subset of endpoints to assign to + each client. +- **Selecting which cluster to send a client to based on an ACL.** In this + use-case, there are two different network paths that can be used to + access the endpoints: one goes directly to the endpoints, with + client-side load balancing, and the other goes via a reverse proxy. + The path that goes directly to the endpoints is faster but is + access-restricted. The xDS server needs to check an ACL to determine + whether a given client is authorized to directly access the endpoints. 
+ If the client is authorized, it will be sent a `RouteConfiguration` + pointing to the cluster for those endpoints; otherwise, it will be sent + a different variant of the `RouteConfiguration` that points to a cluster + containing the reverse proxy endpoint. +- **Dynamic route selection.** Every client sends a set of dynamic + selection parameters (today, conveyed as node metadata). The server + has a list of routes to configure, but individual routes in the list + may be included or excluded based on the client's dynamic selection + parameters. Thus, the server needs to generate a slightly different + version of the `RouteConfiguration` for clients based on the parameters + they send. (See + https://cloud.google.com/traffic-director/docs/configure-advanced-traffic-management#config-filtering-metadata + for an example.) + +The new xDS naming scheme described in [xRFC +TP1](https://github.com/cncf/xds/pull/6) provides a mechanism called +context parameters, which is intended to move all parameters that affect +resource contents into the resource name, thus adding cacheability to the +xDS ecosystem. However, this approach means that these parameters +become part of the resource graph on an individual client, which causes +a number of problems: +- Dynamic context parameters are viral, spreading from a given resource + to all earlier resources in the resource graph. For example, if + multiple variants of an EDS resource are needed, there need to be two + different instances of the resource with different names, + distinguished by a context parameter. But because the contents of the + CDS resource include the name of the corresponding EDS resource name, + that means that we also need two different versions of the CDS + resource, also distinguished by the same context parameter. And then + we need two different versions of the RDS resource, since that needs + to refer to the CDS resource. 
And then two different versions of the + LDS resource, which refers to the RDS resource. This causes a + combinatorial explosion in the number of resources needed, and it adds + complexity to xDS servers, which need to construct the right variants + of every resource and make sure that they refer to each other using + the right names. +- In the new xDS naming scheme, context parameters are exact-match-only. + This means that if a control plane wants to provide the same resource + both with and without a given parameter, it needs to publish two + versions of the resource, each with a different name, even though the + contents are the same, which can also cause unnecessarily poor cache + performance. For example, in the "dynamic route selection" use-case, + let's say that every client uses two different dynamic selection + parameters, `env` (which can have one of the values `prod`, `canary`, or + `test`) and `version` (which can have one of the values `v1`, `v2`, or + `v3`). Now let's say that there is a `RouteConfiguration` with one route + that should be selected via the parameter `env=prod` and another route that + should be selected via the parameter `version=v1`. This means that there + are only four variants of the `RouteConfiguration` resource (`{env!=prod, + version!=v1}`, `{env=prod, version!=v1}`, `{env!=prod, version=v1}`, and + `{env=prod, version=v1}`). However, the exact-match semantics means + that there will have to be nine different versions of this resource, + one for each combination of values of the two parameters. + +### Related Proposals: +* [xRFC TP1: new xDS naming scheme](https://github.com/cncf/xds/pull/6) + +## Proposal + +This document proposes an alternative approach. We start with the +observation that resource names are used in two places: + +- The **transport protocol** layer, which needs to identify the right + resource contents to send for a given resource name, often obtaining + those resource contents from a cache. 
+- The **resource graph** used on an individual client, where there is a + set of data model resources that refer to each other by name. For + example, a `RouteConfiguration` refers to individual `Cluster` resources + by name. + +The use-cases for dynamic resource selection share one important property +that we can take advantage of. When multiple variants of a given resource +exist, any given client will only ever use one of those variants at a +given time. That means that the parameters that affect which variant +of the resource is used are required by the transport protocol, but +they are not required by the client's data model. (For example, in the +"sharding endpoints for scalability" use-case, different clients may see +different variants of the EDS resource, but once a given client has the +right variant, it will be unique on that client, which means that the +CDS resource does not need to refer to different EDS resource names on +different clients.) + +It should be noted that caching xDS proxies, unlike "leaf" clients, will +need to track multiple variants of each resource, since a given caching +proxy may be serving clients that need different variants of a given +resource. However, since caching xDS proxies deal with resources only +at the transport protocol layer, the resource graph layer is +essentially irrelevant in that case. + +### Dynamic Parameters + +With the above property in mind, this document proposes the following +data structures: +- **Dynamic parameters**, which are a set of key/value pairs that are part + of the cache key for an xDS resource (in addition to the resource name + itself). This provides a mechanism to represent multiple variants of a + given resource in a cacheable way. These parameters are used to identify + the specified resource in the transport protocol, but they are not part of + the resource name and therefore do not appear as part of the resource graph.
+- **Dynamic parameter constraints**, which are a set of criteria that + can be used to determine whether a set of dynamic parameters matches + the constraints. When a client subscribes to a resource, it may + specify a set of dynamic parameter constraints, which will be used to + select which variant of the resource will be returned by the server. + In response to a given subscription request from the client containing + a set of dynamic parameter constraints, the server will send a + resource whose dynamic parameters match the dynamic parameter + constraints in the request. The client will use the dynamic + parameters on the resource to determine which of its subscriptions the + resource is associated with. + +Dynamic parameters, unlike context parameters, will not be +exact-match-only. Dynamic parameter constraints will be able to represent +various types of flexible matching, such as range-based matching (which +will be used for the "xDS minor/patch version negotiation" use-case). +This flexible matching semantic means that there are some cases where +ambiguity can occur; we define a set of best practices below to prevent +these cases from occurring in practice. + +#### Matching Ambiguity + +Flexible matching means that there may be ambiguities when determining +which resources match which subscriptions. This section defines the matching +behavior and a set of best practices for deployments to follow to avoid this +kind of ambiguity. + +To illustrate where this comes up in practice, it is useful to consider +what happens in transition scenarios, where a deployment initially +groups its clients on a single key but then wants to add a second key. +The second key needs to be added both in the constraints on the server +side and in the clients' configurations, but those two changes cannot +occur atomically. + +For example, let's say that the clients are currently categorized by the +parameter `env`, whose value is either `prod` or `test`. 
The resource +variants on the server will therefore have the following sets of dynamic +parameters: +- `{env=prod}` +- `{env=test}` + +Clients will send one of the following two sets of dynamic parameter +constraints, depending on whether they are `prod` or `test` clients: + +```textproto +// For {env=prod} +{key_constraints:[ + {key:"env" value:{ + constraints:[{value:"prod"}] + }} +]} + +// For {env=test} +{key_constraints:[ + {key:"env" value:{ + constraints:[{value:"test"}] + }} +]} +``` + +Now the deployment wants to add an additional key called `version`, +whose value will be either `v1` or `v2`, so that it can further subdivide +its clients' configs. + +If the new key is added on the clients first, then the clients will +start subscribing with dynamic parameter constraints like the following: + +```textproto +// For {env=prod, version=v1} +{key_constraints:[ + {key:"env" value:{ + constraints:[{value:"prod"}] + }}, + {key:"version" value:{ + constraints:[{value:"v1"}] + }} +]} +``` + +The server or cache has to match that set of constraints against the +existing sets of dynamic parameters, which do not specify the `version` +key at all. + +Conversely, if the new key is added on the server side first, then the +server will have resource variants with parameters like this: +- `{env=prod, version=v1}` +- `{env=prod, version=v2}` +- `{env=test, version=v1}` +- `{env=test, version=v2}` + +But at this point, the clients are continuing to subscribe without +constraints on this new key. So the server or cache needs to figure out +(e.g.) which of the first two variants to use for constraints +that require `env` to be `prod` but do not specify `version`. + +We address this transition scenario by allowing the set of constraints +for a given key to match any resource variant that does not specify that +key at all.
This allows constraints for new keys to be added on clients +before the corresponding keys are added on the resources on the server, but +it does introduce some additional ambiguity into the matching. For example, +let's say that the server has the following two variants of a resource: +- `{env=prod}` +- `{env=prod, version=v1}` + +Now consider what happens if a client subscribes with the following +constraints: + +```textproto +{key_constraints:[ + {key:"env" value:{ + constraints:[{value:"prod"}] + }}, + {key:"version" value:{ + constraints:[{value:"v1"}] + }} +]} +``` + +These constraints can match either of the above variants of the resource. +This situation can be avoided by establishing a best practice that all +variants of a given resource must have the same set of keys. + +There is still a possible ambiguity that can occur if a server adds +multiple variants of a new key that clients are not yet sending. +For example, let's say that the server has the following two variants +of a resource: +- `{env=prod, version=v1}` +- `{env=prod, version=v2}` + +Consider what happens if a client subscribes with the following constraints: + +```textproto +{key_constraints:[ + {key:"env" value:{ + constraints:[{value:"prod"}] + }} +]} +``` + +These constraints can match either variant of the above resource. +This can be avoided by establishing a best practice of not adding multiple +variants of a new parameter until clients are sending the new parameter. +However, if this does happen, the cache implementation is free to pick +one of the variants at random. + +So, the expected order of changes for this kind of transition would be: +1. Change clients to start sending a constraint for `version=v1`. +2. Add the dynamic parameter `version=v1` to all existing resources. +3. Create new variants of each resource with `version=v2`. +4. Change the desired set of clients to send a constraint for + `version=v2` instead of `version=v1`. 
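
The matching rule and the two ambiguous cases described above can be sketched in a few lines of Python. This is a simplified illustration rather than part of the proposal: it models only exact-value constraints, and the names `matches` and `matching_variants` are hypothetical.

```python
from typing import Dict, List

# Simplified model (an assumption for illustration): constraints map each key
# to a list of exact values, all of which the variant's value must equal.
Constraints = Dict[str, List[str]]
Params = Dict[str, str]


def matches(constraints: Constraints, params: Params) -> bool:
    """Return True if a variant's dynamic parameters satisfy the constraints."""
    for key, required_values in constraints.items():
        if key not in params:
            # A constrained key that is absent from the variant always matches.
            continue
        if any(params[key] != value for value in required_values):
            return False
    # Keys present on the variant but not constrained are ignored.
    return True


def matching_variants(constraints: Constraints,
                      variants: List[Params]) -> List[Params]:
    """Return every variant that the constraints could select."""
    return [v for v in variants if matches(constraints, v)]


# Case 1: variants with different key sets, client constrains both keys.
mixed_keys = [{"env": "prod"}, {"env": "prod", "version": "v1"}]
both = {"env": ["prod"], "version": ["v1"]}
print(len(matching_variants(both, mixed_keys)))  # more than one match: ambiguous

# Case 2: server has two values of a key that the client does not send yet.
new_key = [{"env": "prod", "version": "v1"}, {"env": "prod", "version": "v2"}]
env_only = {"env": ["prod"]}
print(len(matching_variants(env_only, new_key)))  # more than one match: ambiguous
```

In both cases more than one variant matches, which is why the transition order listed above changes only one side at a time.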
+ +##### Alternatives Considered + +We could avoid much of the matching ambiguity described above by saying that +a set of constraints must specify all keys present on the resource in order +to match. However, this would mean that if the client starts subscribing +with a constraint for the new key before the corresponding key is added on +the resources on the server, then it will fail to match the existing resources. +In other words, the process would be: + +1. Add a variant of all resources on the server side with `version=v1` + (in addition to all existing dynamic parameters). +2. Change clients to start sending constraints with the new key. +3. When all clients are updated, remove the resource variants that do + *not* have the new key. + +This will effectively require adding new keys on the server side first, +which seems like a large burden on users. It also seems fairly tricky +for most users to get the exactly correct set of dynamic parameters on +each resource variant, and if they fail to do it right, they will break +their existing configuration. + +We also considered having the client add the new constraint but mark it +as optional using an `is_optional` field. That way, it would match +resources both before and after the new key is added on the server. +However, the `is_optional` field would introduce another type of ambiguity +in matching. Specifically, let's say that the server has the following +two variants of the resource: +- `{env=prod}` +- `{env=prod, version=v1}` + +Now a client subscribes with the following set of constraints: + +```textproto +{key_constraints:[ + {key:"env" value:{ + constraints:[{value:"prod"}] + }}, + {key:"version" value:{ + constraints:[{value:"v1"}] + is_optional: true + }} +]} +``` + +These constraints can match either of the above variants of the resource. 
+For authoritative servers, this could be addressed by establishing +a best practice of not having two variants of a resource that differ +only by keys that the client will send as optional. However, this +requires coordination between client and server, and requires +machinery on the client to determine when to set the `is_optional` bit. + +Ultimately, although this approach is more semantically precise, it is +also considered too rigid and difficult for users to work with. + +#### Matching Behavior and Best Practices + +We advise deployments to avoid ambiguity through the following best practices: +- Whenever there are multiple variants of a resource, all variants must + list the same set of keys. This allows the server to ignore constraints + on keys sent by the client that do not affect the choice of variant + without causing ambiguity in cache misses. +- Servers should not create multiple variants of a parameter that is not yet + being sent by clients. If they do, clients that do not send that parameter + will get one of the variants at random. +- There must be a variant of the resource for every value of a key that is + going to be present. For example, if clients will send constraints on the + `env` key requiring the value to be one of `prod`, `test`, or `qa`, then + you must have each of those three variants of the resource. + - Note that servers that can make use of the mechanism described under + [Server-Specified Constraints](#server-specified-constraints) below + may be able to optimize this in some cases. See the "Dynamic Route + Selection" example below for details. +- For cases where a constraint may match multiple values (e.g., a + range constraint), the largest possible matching value is preferred. + This means that caches (both on clients and on caching xDS proxies) + must attempt to fetch a larger value even if they already have a smaller + matching value already present in the cache. 
For example, let's say + that a cache contains a variant of a resource with the parameter + `{shard=3}` and a client subscribes with the following constraints: + ```textproto + {key_constraints:[ + {key:"shard" value:{ + constraints:[ + {integer_range_list:[ + {range:{min_value:0 max_value:5}} + ]} + ] + }} + ]} + ``` + In this case, the cache must attempt to fetch a resource from the + authoritative server with that constraint before falling back to + using the one it already has cached, because the preferred value is + `{shard=5}`, not `{shard=3}`. + - Note: This is not an issue for glob collections, because in that case + all matching variants of the resource will be used. + +#### API Changes + +The API changes necessary to implement this proposal are in +https://github.com/envoyproxy/envoy/pull/17192. + +Dynamic parameter constraints will be represented as follows: + +```proto +// A set of dynamic parameter constraints used to select the variant of +// a given resource desired by a client. Clients send a set of +// constraints with each subscription request, and servers respond by +// sending a resource with a matching set of dynamic parameters. +message DynamicParameterConstraints { + // Constraints for a given key. + message KeyConstraints { + message Constraint { + // A list of one or more integer ranges. + // A value is considered to match if it falls in any of the ranges. + message IntegerRangeList { + // At least one of *min_value* or *max_value* must be set. + message Range { + // If specified, value may not be less than this. + uint64 min_value = 1; + + // If specified, value may not be greater than this. + uint64 max_value = 2; + } + + repeated Range range = 1; + } + + oneof constraint { + // The key must have this specific value. + string value = 1; + + // The key's value must be integers and within one of the ranges in this list. + IntegerRangeList integer_range_list = 2; + } + } + + // A list of one or more constraints on the value of the key. 
+ // All constraints must be met. + repeated Constraint constraints = 2; + } + + // One entry per key. + // Note that if a key has a constraint here, it will place restrictions + // on the key's value if the key is present on a variant of the resource. + // However, if a key has a constraint here but is not present on the + // resource, it will match, regardless of what the constraint says. + map<string, KeyConstraints> key_constraints = 1; +} +``` + +The following message will be added to represent a subscription to a +resource by name with associated dynamic parameter constraints: + +```proto +// A specification of a resource used when subscribing or unsubscribing. +message ResourceLocator { + // The resource name to subscribe to. + string name = 1; + + // A set of constraints used to match against the dynamic parameters on the resource. This + // allows clients to select between multiple variants of the same resource. + DynamicParameterConstraints dynamic_parameter_constraints = 2; +} +``` + +The following new field will be added to `DiscoveryRequest`, to allow clients +to specify constraints when subscribing to a resource: + +```proto + // Alternative to resource_names field that allows specifying cache + // keys along with each resource name. If this is populated in the + // first request for a resource type on a stream, resource_names is ignored + // for all subsequent requests for that resource type on that stream. + // Clients that populate this field must be able to handle responses + // from the server where resources are wrapped in a Resource message. + repeated ResourceLocator resource_locators = 7; +``` + +Similarly, the following fields will be added to `DeltaDiscoveryRequest`: + +```proto + // Alternative to resource_names_subscribe field that allows specifying cache + // keys along with each resource name.
If this is populated in the + // first request for a resource type on a stream, resource_names_subscribe + // and resource_names_unsubscribe are ignored for all subsequent requests + // for that resource type on that stream. + repeated ResourceLocator resource_locators_subscribe = 8; + + // Alternative to resource_names_unsubscribe field that allows specifying cache + // keys along with each resource name. If resource_locators_subscribe is + // populated in the first request for a resource type on a stream, + // this field is used instead of resource_names_unsubscribe for all + // subsequent requests for that resource type on that stream. + repeated ResourceLocator resource_locators_unsubscribe = 9; +``` + +The following field will be added to the `Resource` message, to allow the +server to return the dynamic parameters associated with each resource: + +```proto + // Dynamic parameters associated with this resource. To be used by client-side caches + // (including xDS proxies) when matching subscribed resource locators. + map<string, string> dynamic_parameters = 8; +``` + +### Server-Specified Constraints + +In the "sharding endpoints" and "selecting cluster based on ACL" use-cases, +the constraints need to be dynamically determined by the xDS server, not by +the client. To support this, we introduce a new xDS resource type called +`DynamicParameterConstraintsMap`, which looks like this: + +```proto +package envoy.config.dynamic_parameters.v3; + +message DynamicParameterConstraintsMap { + // Key is resource type name (e.g., "envoy.config.cluster.v3.Cluster"). + map<string, DynamicParameterConstraints> resource_type_constraints = + 1; +} +``` + +This resource allows the management server to provide the client with a +set of dynamic parameter constraints to be used for each resource type.
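
As a rough sketch of how a client might apply such a map, the following Python models the map as a plain dictionary keyed by resource type name and attaches the per-type constraints to each subscription. The dictionary layout, the `env` key, the resource names, and the helper `build_locators` are illustrative assumptions that mirror the field names above; they are not a generated protobuf API.

```python
from typing import Any, Dict, List

# Hypothetical in-memory form of a DynamicParameterConstraintsMap resource.
constraints_map: Dict[str, Any] = {
    "resource_type_constraints": {
        "envoy.config.cluster.v3.Cluster": {
            "key_constraints": {
                "env": {"constraints": [{"value": "prod"}]},
            },
        },
    },
}


def build_locators(resource_type: str, names: List[str],
                   dp_map: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build ResourceLocator-like dicts for one subscription request."""
    per_type = dp_map.get("resource_type_constraints", {})
    constraints = per_type.get(resource_type)
    locators: List[Dict[str, Any]] = []
    for name in names:
        locator: Dict[str, Any] = {"name": name}
        if constraints is not None:
            # Attach constraints only when the server specified some for this type.
            locator["dynamic_parameter_constraints"] = constraints
        locators.append(locator)
    return locators


cluster_locators = build_locators(
    "envoy.config.cluster.v3.Cluster", ["cluster_a", "cluster_b"], constraints_map)
route_locators = build_locators(
    "envoy.config.route.v3.RouteConfiguration", ["route_a"], constraints_map)
```

Here the `Cluster` subscriptions carry the server-specified constraints, while the `RouteConfiguration` subscription is sent with the name only because the map has no entry for that type.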
+ +The client will obtain this resource from the server either via ADS or +via a new xDS API called Dynamic Parameter Discovery Service (DPDS): + +```proto +package envoy.service.dynamic_parameters.v3; + +service DynamicParametersDiscoveryService { + option (envoy.annotations.resource).type = "envoy.config.dynamic_parameters.v3.DynamicParameters"; + + rpc StreamDynamicParameterConstraints(stream discovery.v3.DiscoveryRequest) + returns (stream discovery.v3.DiscoveryResponse) { + } + + rpc DeltaDynamicParameterConstraints(stream discovery.v3.DeltaDiscoveryRequest) + returns (stream discovery.v3.DeltaDiscoveryResponse) { + } + + rpc FetchDynamicParameterConstraints(discovery.v3.DiscoveryRequest) + returns (discovery.v3.DiscoveryResponse) { + option (google.api.http).post = "/v3/discovery:dynamic_parameters"; + option (google.api.http).body = "*"; + } +} +``` + +Use of this resource type is optional and will be configured locally on +the client (e.g., in the bootstrap file). The client's configuration +will tell it the name of the DPDS resource to subscribe to and what server +to obtain it from. When the new xdstp: naming scheme is in use, the client +should be able to configure a different DPDS resource to use for each +authority. + +If configured, the client will subscribe to the DPDS resource before +subscribing to any other type of resource. It will then use the +constraints from the DPDS resource when subscribing to all other types +of resources. + +If the client cannot obtain the configured DPDS resource, it will ignore +the failure and request the remaining resources with no additional +constraints. Control planes can decide how to handle that request; if +they want the client to fail without the DPDS resource, they can simply +not return any resource for the name without the expected constraints. + +FIXME: is the above still right? Control plane can't really do that based on the matching rules defined above. 
+ +Just like any other xDS resource, a DPDS resource can be updated by the +control plane at any time. When that happens, the constraints to be +used for a given resource type change, which will cause the client to +unsubscribe from all resources of that type using the old constraints and +then resubscribe to all resources using the new constraints. Note that +this is an eventually consistent model, but the appropriate use of ADS +or distributed coordination can provide stronger consistency. + +#### DPDS Example + +For example, let's say that the client is configured such that it will +use the DPDS resource +`xdstp://xds.example.com/envoy.config.context_params.v3.DynamicContextParameters/my_context_params` +for authority `xds.example.com`. The client is asked to subscribe to the LDS +resource +`xdstp://xds.example.com/envoy.config.listener.v3.Listener/my_listener`. +The client notices that it has a DPDS resource configured for the +authority of this resource (`xds.example.com`), so it will first subscribe +to the DPDS resource. Let's say that it gets back the following +response: + +```textproto +{resource_type_constraints:[ + {key:"envoy.config.listener.v3.Listener" value:{ + key_constraints:[ + {key:"listener_type" value:{ + constraints:[{value:"direct"}] + }} + ] + }} +]} +``` + +The client would then use the following constraints when subscribing to +the LDS resource: + +```textproto +{key_constraints:[ + {key:"listener_type" value:{ + constraints:[{value:"direct"}] + }} +]} +``` + +Now let's say that the client later gets an update of the DPDS resource +with the following contents: + +```textproto +{resource_type_constraints:[ + {key:"envoy.config.listener.v3.Listener" value:{ + key_constraints:[ + {key:"listener_type" value:{ + constraints:[{value:"via_proxy"}] + }} + ] + }} +]} +``` + +This changes the constraints to be used for LDS resources.
+The client would then send a new request that unsubscribes from the +LDS resource with the old constraints and subscribes to the same resource +with the new constraints. + +#### Non-Cacheability of DPDS + +Because this mechanism is intended to be used in cases where the server +needs to determine the constraints to be used by the client based on +information not included in the resource locator (e.g., node information, +client IP, or client credentials), the DPDS resource itself is always +non-cacheable. Servers must always set the [`Resource.do_not_cache` +field](https://github.com/envoyproxy/envoy/blob/371099f4f52f94e60f558561e29ce8852e1091da/api/envoy/service/discovery/v3/discovery.proto#L245) +when sending this resource. Clients that use a caching xDS proxy for +most of their resources will need to obtain this resource directly +from the authoritative server; when using the new xdstp: naming scheme, +this can be done by using a different authority in the DPDS resource name. + +#### Possible DPDS Implementation + +One possible way for clients to implement this is by adding a transparent +layer between the transport protocol layer and the data model layer. + +Let's say that the transport protocol is handled by an XdsClient object +that handles interaction with the xDS server(s) and takes care of all of +the client-side caching. The XdsClient object has an API that allows +data model code to register a watcher for a particular resource name, +and the XdsClient will invoke a method on the watcher whenever the +resource is updated. + +The DPDS functionality can be added as a transparent "wrapper" of the +XdsClient object: +- When a watch is started on the wrapper object for (e.g.) a Listener + resource, if use of DPDS is not configured or the resource name is an + old-style resource name, the watch will just be passed down to the real + XdsClient without modification.
But if DPDS is in use, then the wrapper + will use the real XdsClient to start a watch for the DPDS resource. +- When the DPDS resource is returned, the wrapper will use it to determine + what constraints to use when subscribing to the Listener resource, at + which point it will start a watch for the Listener resource on the real + XdsClient using those constraints. Any updates returned by the Listener + watcher on the real XdsClient will be passed through to the watcher given + to the wrapper by the data model code. +- Whenever the DPDS resource gets updated, if the constraints for LDS + resources have changed, the wrapper will stop the watch for the Listener + resource on the real XdsClient that was using the old constraints and + start a new watch using the new constraints. This change will be + transparent to the data model code that started the watch on the + XdsClient wrapper. + +#### Envoy-Specific Details + +(This section applies only to Envoy, not to other xDS clients like gRPC.) + +In Envoy, RTDS is used before DPDS, so the DPDS resource cannot be used to +specify constraints for RTDS resources. + +The constraints from the DPDS resource will be used to choose the CDS +resource, which means that DPDS resources will be fetched before CDS +resources. This means that the DPDS resource itself cannot be fetched +from a cluster obtained via CDS; it must either use a static cluster or +a Google gRPC ApiConfigSource. + +### Migrating From Node Metadata + +Today, the equivalent of dynamic parameter constraints is node metadata, +which can be used by servers to determine the set of resources to send +for LDS and CDS wildcard subscriptions or to determine the contents of +other resources (e.g., to select individual routes to be included in an +RDS resource). For transition purposes, this mechanism could continue +to be supported in one of two ways: +1. Direct translation of node metadata to exact-match constraints. 
For + example, if the node metadata contains the entry `env=prod`, this + would be translated to a constraint `{key_constraints:[{key:"env" + value:{constraints:[{value:"prod"}]}}]}`. +2. Use the mechanism described under [Server-Specified + Constraints](#server-specified-constraints) above to convert from node + metadata to dynamic parameter constraints. (Note that this mechanism + requires direct access to the authoritative server, because the + `DynamicParameterConstraintsMap` resource is not cacheable.) + +Any given xDS client may support either or both of these mechanisms. + +### Examples + +This section shows how the mechanism described in this proposal can be +used to address each of the use-cases identified in the "Background" +section above. + +#### xDS Minor/Patch Version Negotiation + +The client will send the following dynamic parameter constraints, which +indicate the range of versions that it supports: + +```textproto +{key_constraints:[ + {key:"xds.version.minor" value:{ + constraints:[ + {integer_range_list:[ + {range:{min_value:0 max_value:5}} + ]} + ] + }} +]} +``` + +Let's say that a server has a resource that wants to use a new feature +introduced in version 3.0.5 for clients that support that version. It will +provide two versions of that resource: +- For clients at version 3.0.5 or higher, a resource with keys + `{"xds.version.patch"=5}`. +- For clients at version 3.0.4 or lower, there will need to be at least one + variant of the resource for every possible version range that any client + may request, all with the exact same content. For example: + - `{"xds.version.patch"=4}` + - `{"xds.version.patch"=3}` + - `{"xds.version.patch"=2}` + - `{"xds.version.patch"=1}` + - `{"xds.version.patch"=0}` + +#### Sharding Clusters + +In this use-case, the client will have a set of shard ranges determined +by some client-side code, resulting in a pair of dynamic parameter +constraints, one for SRDS and another for CDS. 
For example, let's +say that a client should use shard ranges [4-6], [11-15], and [46-90]. +The dynamic parameter constraints for SRDS and CDS would be: + +```textproto +# For SRDS (single resource). +{key_constraints:[ + {key:"shards" value:{ + constraints:[ + {value:"[4-6],[11-15],[46-90]"} + ] + }} +]} + +# For CDS (glob collection). +{key_constraints:[ + {key:"shards" value:{ + constraints:[ + {integer_range_list:[ + {range:{min_value:4 max_value:6}}, + {range:{min_value:11 max_value:15}}, + {range:{min_value:46 max_value:90}} + ]} + ] + }} +]} +``` + +The resulting SRDS resource will tell the client what RDS resources +to fetch. The server can either generate different resource names for +each variant of the RDS resource, or it can choose to apply the same +constraints to RDS as it uses for SRDS. + +#### Sharding Endpoints + +The client will initially subscribe to the `DynamicParameterConstraintsMap` +resource to get the dynamic parameter constraints to use for each resource +type. The server will send back the following DPDS resource: + +```textproto +{resource_type_constraints:[ + {key:"envoy.config.endpoint.v3.ClusterLoadAssignment" value:{ + key_constraints:[ + {key:"subset_id" value:{ + constraints:[{value:"123"}] + }} + ] + }} +]} +``` + +This tells the client to use the following dynamic parameter constraints +when subscribing to EDS resources: + +```textproto +{key_constraints:[ + {key:"subset_id" value:{ + constraints:[{value:"123"}] + }} +]} +``` + +The server can provide a different variant of the EDS resources for each +client, each with different dynamic parameter constraints (e.g., one client +would be told to use `subset_id` 123, while another client might be told to +use `subset_id` 456). + +#### Selecting Cluster Based on ACL + +The client will initially subscribe to the `DynamicParameterConstraintsMap` +resource to get the dynamic parameters to use for each resource type.
The +server will send back the following DPDS resource: + +```textproto +{resource_type_constraints:[ + {key: "envoy.config.route.v3.RouteConfiguration" value:{ + key_constraints:[ + {key:"use_proxy" value:{ + constraints:[{value:"true"}] # or "false", depending on the client + }} + ] + }} +]} +``` + +This tells the client to send a constraint setting the `use_proxy` +parameter to either true or false when subscribing to the RDS resource. + +#### Dynamic Route Selection + +Let's say that every client uses two different dynamic selection +parameters, `env` (which can have one of the values `prod`, `canary`, or +`test`) and `version` (which can have one of the values `v1`, `v2`, or `v3`). +Now let's say that there is a RouteConfiguration with one route that should +be selected via the parameter `env=prod` and another route that should be +selected via the parameter `version=v1`. Normally, the server will need to +actually provide the cross-product of these parameter values, so there +will be 9 different variants of the resource, even though there are only +4 unique contents for the resource:
+
+| Dynamic Parameters on Resource | Resource Contents |
+| --- | --- |
+| `{env=canary,version=v2}`, `{env=test,version=v2}`, `{env=canary,version=v3}`, `{env=test,version=v3}` | does not include the route for `env=prod`; does not include the route for `version=v1` |
+| `{env=prod,version=v2}`, `{env=prod,version=v3}` | does include the route for `env=prod`; does not include the route for `version=v1` |
+| `{env=canary,version=v1}`, `{env=test,version=v1}` | does not include the route for `env=prod`; does include the route for `version=v1` |
+| `{env=prod,version=v1}` | does include the route for `env=prod`; does include the route for `version=v1` |
+ +Note that a server that does not need to operate with caching xDS proxies +could optimize this by using the mechanism described in [Server-Specified +Constraints](#server-specified-constraints) above. Specifically, it could use +the DPDS resource to set constraints to minimize the number of variants: + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+| Node Metadata | Dynamic Parameter Constraints from DPDS | Dynamic Parameters on Resource |
+| --- | --- | --- |
+| `{env=canary,version=v2}`, `{env=test,version=v2}`, `{env=canary,version=v3}`, `{env=test,version=v3}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}] invert:true}}, {key:"version" value:{constraints:[{value:"v1"}] invert:true}}]}` | `{env=NOT_prod,version=NOT_v1}` |
+| `{env=prod,version=v2}`, `{env=prod,version=v3}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}]}}, {key:"version" value:{constraints:[{value:"v1"}] invert:true}}]}` | `{env=prod,version=NOT_v1}` |
+| `{env=canary,version=v1}`, `{env=test,version=v1}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}] invert:true}}, {key:"version" value:{constraints:[{value:"v1"}]}}]}` | `{env=NOT_prod,version=v1}` |
+| `{env=prod,version=v1}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}]}}, {key:"version" value:{constraints:[{value:"v1"}]}}]}` | `{env=prod,version=v1}` |
+ + ## Rationale + +We considered extending the context parameter mechanism from [xRFC +TP1](https://github.com/cncf/xds/pull/6) to support flexible matching +semantics, rather than its current exact-match semantics. However, that +approach had some down-sides: +- It would not have solved the virality problem described in the "Background" + section above. +- It would have made the new xDS naming scheme a prerequisite for using + the dynamic resource selection mechanism. (The mechanism described in + this doc is completely independent of the new xDS naming scheme; it can + be used with the legacy xDS naming scheme as well.) + +## Implementation + +TBD (Will probably be implemented in gRPC before Envoy) + +## Open issues (if applicable) + +N/A From a2396cef5814949da94250c93cdafd9d0ce4a5ca Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Thu, 5 Aug 2021 09:01:04 -0700 Subject: [PATCH 02/14] fix links to TP1 now that it's been merged Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 23 +++++++++---------- 1 file changed, 11 insertions(+), 12 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 54dede49..e1c55060 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -11,7 +11,7 @@ This xRFC proposes a new mechanism to allow xDS servers to dynamically generate the contents of xDS resources for individual clients while at the same time preserving cacheability. Unlike the context parameter mechanism that is part of the new xDS naming scheme (see -[xRFC TP1](https://github.com/cncf/xds/pull/6)), the mechanism described in +[xRFC TP1](TP1-xds-transport-next.md)), the mechanism described in this proposal is visible only to the transport protocol layer, not to the data model layer.
This means that if a resource has a parameter that affects its contents, that parameter is not part of the resource's name, @@ -68,13 +68,12 @@ individual clients. Here are some examples: https://cloud.google.com/traffic-director/docs/configure-advanced-traffic-management#config-filtering-metadata for an example.) -The new xDS naming scheme described in [xRFC -TP1](https://github.com/cncf/xds/pull/6) provides a mechanism called -context parameters, which is intended to move all parameters that affect -resource contents into the resource name, thus adding cacheability to the -xDS ecosystem. However, this approach means that these parameters -become part of the resource graph on an individual client, which causes -a number of problems: +The new xDS naming scheme described in [xRFC TP1](TP1-xds-transport-next.md) +provides a mechanism called context parameters, which is intended to move all +parameters that affect resource contents into the resource name, thus adding +cacheability to the xDS ecosystem. However, this approach means that these +parameters become part of the resource graph on an individual client, which +causes a number of problems: - Dynamic context parameters are viral, spreading from a given resource to all earlier resources in the resource graph. For example, if multiple variants of an EDS resource are needed, there need to be two @@ -109,7 +108,7 @@ a number of problems: one for each combination of values of the two parameters. ### Related Proposals: -* [xRFC TP1: new xDS naming scheme](https://github.com/cncf/xds/pull/6) +* [xRFC TP1: new xDS naming scheme](TP1-xds-transport-next.md) ## Proposal @@ -1045,9 +1044,9 @@ the DPDS resource to set constraints to minimize the number of variants: ## Rationale We considered extending the context parameter mechanism from [xRFC -TP1](https://github.com/cncf/xds/pull/6) to support flexible matching -semantics, rather than its current exact-match semantics.
However, that -approach had some down-sides: +TP1](TP1-xds-transport-next.md) to support flexible matching semantics, +rather than its current exact-match semantics. However, that approach had +some down-sides: - It would not have solved the virality problem described in the "Background" section above. - It would have made the new xDS naming scheme a prerequisite for using From 956e1b21707b747a984dcbad4284003fa0a83117 Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Thu, 5 Aug 2021 09:03:57 -0700 Subject: [PATCH 03/14] fix description of behavior when DPDS resource is not present Signed-off-by: Mark D. Roth --- ...dynamically-generated-cacheable-xds-resources.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index e1c55060..48549e2b 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -569,11 +569,14 @@ of resources. If the client cannot obtain the configured DPDS resource, it will ignore the failure and request the remaining resources with no additional -constraints. Control planes can decide how to handle that request; if -they want the client to fail without the DPDS resource, they can simply -not return any resource for the name without the expected constraints. - -FIXME: is the above still right? Control plane can't really do that based on the matching rules defined above. +constraints. This will likely result in the client sending a request that +does not include constraints for one of the parameters that is used to +distinguish different variants of the resource, and as mentioned in +the [Matching Behavior and Best +Practices](#matching-behavior-and-best-practices) section above, the +control plane is free to return any variant of the resource in that +case.
However, note that the authoritative server cannot control what +choice is made by caching xDS proxies. Just like any other xDS resource, a DPDS resource can be updated by the control plane at any time. When that happens, the constraints to be From d366cd69722fe9d5296fa20220ad64637be0e62a Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Wed, 11 Aug 2021 08:56:06 -0700 Subject: [PATCH 04/14] remove server-specified constraints Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 391 ++---------------- 1 file changed, 24 insertions(+), 367 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 48549e2b..b9e85622 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -3,7 +3,7 @@ TP2: Dynamically Generated Cacheable xDS Resources * Author(s): markdroth, htuch * Approver: htuch * Implemented in: -* Last updated: 2021-08-03 +* Last updated: 2021-08-11 ## Abstract @@ -372,10 +372,6 @@ We advise deployments to avoid ambiguity through the following best practices: going to be present. For example, if clients will send constraints on the `env` key requiring the value to be one of `prod`, `test`, or `qa`, then you must have each of those three variants of the resource. - - Note that servers that can make use of the mechanism described under - [Server-Specified Constraints](#server-specified-constraints) below - may be able to optimize this in some cases. See the "Dynamic Route - Selection" example below for details. - For cases where a constraint may match multiple values (e.g., a range constraint), the largest possible matching value is preferred. 
This means that caches (both on clients and on caching xDS proxies) @@ -510,213 +506,17 @@ server to return the dynamic parameters associated with each resource: map dynamic_parameters = 8; ``` -### Server-Specified Constraints - -In the "sharding endpoints" and "selecting cluster based on ACL" use-cases, -the constraints need to be dynamically determined by the xDS server, not by -the client. To support this, we introduce a new xDS resource type called -`DynamicParametersConstraintsMap`, which looks like this: - -```proto -package envoy.config.dynamic_parameters.v3; - -message DynamicParameterConstraintsMap { - // Key is resource type name (e.g., "envoy.config.cluster.v3.Cluster"). - map resource_type_constraints = - 1; -} -``` - -This resource allows the management server to provide the client with a -set of dynamic parameter constraints to be used for each resource type. - -The client will obtain this resource from the server either via ADS or -via a new xDS API called Dynamic Parameter Discovery Service (DPDS): - -```proto -package envoy.service.dynamic_parameters.v3; - -service DynamicParametersDiscoveryService { - option (envoy.annotations.resource).type = "envoy.config.dynamic_parameters.v3.DynamicParameters"; - - rpc StreamDynamicParameterConstraints(stream discovery.v3.DiscoveryRequest) - returns (stream discovery.v3.DiscoveryResponse) { - } - - rpc DeltaDynamicParameterConstraints(stream discovery.v3.DeltaDiscoveryRequest) - returns (stream discovery.v3.DeltaDiscoveryResponse) { - } - - rpc FetchDynamicParameterConstraints(discovery.v3.DiscoveryRequest) - returns (discovery.v3.DiscoveryResponse) { - option (google.api.http).post = "/v3/discovery:dynamic_parameters"; - option (google.api.http).body = "*"; - } -} -``` - -Use of this resource type is optional and will be configured locally on -the client (e.g., in the bootstrap file). The client's configuration -will tell it the name of the DPDS resource to subscribe to and what server -to obtain it from. 
When the new xdstp: naming scheme is in use, the client -should be able to configure a different DPDS resource to use for each -authority. - -If configured, the client will subscribe to the DPDS resource before -subscribing to any other type of resource. It will then use the -constraints from the DPDS resource when subscribing to all other types -of resources. - -If the client cannot obtain the configured DPDS resource, it will ignore -the failure and request the remaining resources with no additional -constraints. This will likely result in the client sending a request that -does not include constraints for one of the parameters that is used to -distinguish different variants of the resource, and as mentioned in -the [Matching Behavior and Best -Practices](#matching-behavior-and-best-practices) section above, the -control plane is free to return any variant of the resource in that -case. However, note that the authoritative server cannot control what -choice is made by caching xDS proxies. - -Just like any other xDS resource, a DPDS resource can be updated by the -control plane at any time. When that happens, the constraints to be -used for a given resource type change, which will cause the client to -unsubscribe from all resources of that type using the old constraints and -then resubscribe to all resources using the new constraints. Note that -this is an eventually consistent model, but the appropriate use of ADS -or distributed coordination can provide stronger consistency. - -#### DPDS Example - -For example, let's say that the client is configured such that it will -use the DPDS resource -`xdstp://xds.example.com/envoy.config.context_params.v3.DynamicContextParameters/my_context_params` -for authority `xds.example.com`. The client is asked to subscribe to the LDS -resource -`xdstp://xds.example.com/envoy.config.listener.v3.Listener/my_listener`. 
-The client notices that it has a DPDS resource configured for the -authority of this resource (xds.example.com), so it will first subscribe -to the DPDS resource. Let's say that it gets back the following -response: - -```textproto -{resource_type_constraints:[ - {key:"envoy.config.listener.v3.Listener" value:{ - key_constraints:[ - {key:"listener_type" value:{ - constraints:[{value:"direct"}] - }} - ] - }} -]} -``` - -The client would then use the following constraints when subscribing to -the LDS resource: - -```textproto -{key_constraints:[ - {key:"listener_type" value:{ - constraints:[{value:"direct"}] - }} -]} -``` - -Now let's say that the client later gets an update of the DPDS resource -with the following contents: - -```textproto -{resource_type_constraints:[ - {key:"envoy.config.listener.v3.Listener" value:{ - key_constraints:[ - {key:"listener_type" value:{ - constraints:[{value:"via_proxy"}] - }} - ] - }} -]} -``` - -This changes the constraints to be used for LDS resources. -The client would then send a new request that unsubscribes from the -LDS resource with the old constraints and subscribes to the same resource -with the new constraints. - -#### Non-Cacheability of DPDS - -Because this mechanism is intended to be used in cases where the server -needs to determine the constraints to be used by the client based on -information not included in the resource locator (e.g., node information, -client IP, or client credentials), the DPDS resource itself is always -non-cacheable. Servers must always set the [`Resource.do_not_cache` -field](https://github.com/envoyproxy/envoy/blob/371099f4f52f94e60f558561e29ce8852e1091da/api/envoy/service/discovery/v3/discovery.proto#L245) -when sending this resource. Clients that use a caching xDS proxy for -most of their resources will need to obtain this resource directly -from the authoritative server; when using the new xdstp: naming scheme, -this can be done by using a different authority in the DPDS resource name.
- -#### Possible DPDS Implementation - -One possible way for clients to implement this is by adding a transparent -layer between the transport protocol layer and the data model layer. - -Let's say that the transport protocol is handled by an XdsClient object -that handles interaction with the xDS server(s) and takes care of all of -the client-side caching. The XdsClient object has an API that allows -data model code to register a watcher for a particular resource name, -and the XdsClient will invoke a method on the watcher whenever the -resource is updated. - -The DPDS functionality can be added as a transparent "wrapper" of the -XdsClient object: -- When a watch is started on the wrapper object for (e.g.) a Listener - resource, if use of DPDS is not configured or the resource name is an - old-style resource name, the watch will just be passed down to the real - XdsClient without modification. But if DPDS is in use, then the wrapper - will use the real XdsClient to start a watch for the DPDS resource. -- When the DPDS resource is returned, the wrapper will use it to determine - what constraints to use when subscribing to the Listener resource, at - which point it will start a watch for the Listener resource on the real - XdsClient using those constraints. Any updates returned by the Listener - watcher on the real XdsClient will be passed through to the watcher given - to the wrapper by the data model code. -- Whenever the DPDS resource gets updated, if the constraints for LDS - resources have changed, the wrapper will stop the watch for the Listener - resource on the real XdsClient that was using the old constraints and - start a new watch using the new constraints. This change will be - transparent to the data model code that started the watch on the - XdsClient wrapper. - -#### Envoy-Specific Details - -(This section applies only to Envoy, not to other xDS clients like gRPC.) 
- -In Envoy, RTDS is used before DPDS, so the DPDS resource cannot be used to -specify constraints for RTDS resources. - -The constraints from the DPDS resource will be used to choose the CDS -resource, which means that DPDS resources will be fetched before CDS -resources. This means that the DPDS resource itself cannot be fetched -from a cluster obtained via CDS; it must either use a static cluster or -a Google gRPC ApiConfigSource. - ### Migrating From Node Metadata Today, the equivalent of dynamic parameter constraints is node metadata, which can be used by servers to determine the set of resources to send for LDS and CDS wildcard subscriptions or to determine the contents of other resources (e.g., to select individual routes to be included in an -RDS resource). For transition purposes, this mechanism could continue -to be supported in one of two ways: -1. Direct translation of node metadata to exact-match constraints. For - example, if the node metadata contains the entry `env=prod`, this - would be translated to a constraint `{key_constraints:[{key:"env" - value:{constraints:[{value:"prod"}]}}]}`. -2. Use the mechanism described under [Server-Specified - Constraints](#server-specified-constraints) above to convert from node - metadata to dynamic parameter constraints. (Note that this mechanism - requires direct access to the authoritative server, because the - `DynamicParameterConstraintsMap` resource is not cacheable.) +RDS resource). For transition purposes, this mechanism can continue +to be supported by the client performing direct translation of node +metadata to exact-match constraints. For example, if the node metadata +contains the entry `env=prod`, this would be translated to a constraint +`{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}]}}]}`. Any given xDS client may support either or both of these mechanisms. @@ -796,58 +596,29 @@ constraints to RDS as it uses for SRDS. 
#### Sharding Endpoints -The client will initially subscribe to the `DynamicParameterConstraintsMap` -resource to get the dynamic parameter constraints to use for each resource -type. The server will send back the following DPDS resource: - -```textproto -{resource_type_constraints:[ - {key:"envoy.config.endpoint.v3.ClusterLoadAssignment" value:{ - key_constraints:[ - {key:"subset_id" value:{ - constraints:[{value:"123"}] - }} - ] - }} -]} -``` +For this use-case, dynamic parameters cannot be used, because the server +would need to determine which dynamic parameter constraints a given +client would use, and that decision is inherently non-cacheable. As a +result, we will not address this use-case via dynamic parameters; +instead, the server will supply +[non-cacheable](https://github.com/envoyproxy/envoy/blob/5e80b8255d267dbd7b128244605e93f9541ccaa5/api/envoy/service/discovery/v3/discovery.proto#L245) +EDS resources. -This tells the client to use the following dynamic parameter constraints -when subscribing to EDS resources: - -```textproto -{key_constraints:[ - {key:"subset_id" value:{ - constraints:[{value:"123"}] - }} -]} -``` - -The server can provide a different variant of the EDS resources for each -client, each with different dynamic parameter constraints (e.g., one client -would be told to use `subset_id` 123, while another client might be told to -use `subset_id` 456). +In the future, we may consider alternative designs that will better +address this use-case. #### Selecting Cluster Based on ACL -The client will initially subscribe to the `DynamicParameterConstraintsMap` -resource to get the dynamic parameters to use for each resource type. The -server will send back the following DPDS resource: +For this use-case, dynamic parameters cannot be used, because the server +would need to determine which dynamic parameter constraints a given +client would use, and that decision is inherently non-cacheable.
As a +result, we will not address this use-case via dynamic parameters; +instead, the server will supply +[non-cacheable](https://github.com/envoyproxy/envoy/blob/5e80b8255d267dbd7b128244605e93f9541ccaa5/api/envoy/service/discovery/v3/discovery.proto#L245) +RDS resources. -```textproto -{resource_type_constraints:[ - {key: "envoy.config.route.v3.RouteConfiguration" value:{ - key_constraints:[ - {key:"use_proxy" value:{ - constraints:[{value:"true"}] # or "false", depending on the client - }} - ] - }} -]} -``` - -This tells the client to send a constraint setting the `use_proxy` -parameter to either true or false when subscribing to the RDS resource. +In the future, we may consider alternative designs that will better +address this use-case. #### Dynamic Route Selection @@ -930,120 +701,6 @@ will be 9 different variants of the resource, even though there are only -Note that a server that does not need to operate with caching xDS proxies -could optimize this by using the mechanism described in [Server-Specified -Constraints](#server-specified-constraints) above. Specifically, it could use -the DPDS resource to set constraints to minimize the number of variants:
-
-| Node Metadata | Dynamic Parameter Constraints from DPDS | Dynamic Parameters on Resource |
-| --- | --- | --- |
-| `{env=canary,version=v2}`, `{env=test,version=v2}`, `{env=canary,version=v3}`, `{env=test,version=v3}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}] invert:true}}, {key:"version" value:{constraints:[{value:"v1"}] invert:true}}]}` | `{env=NOT_prod,version=NOT_v1}` |
-| `{env=prod,version=v2}`, `{env=prod,version=v3}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}]}}, {key:"version" value:{constraints:[{value:"v1"}] invert:true}}]}` | `{env=prod,version=NOT_v1}` |
-| `{env=canary,version=v1}`, `{env=test,version=v1}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}] invert:true}}, {key:"version" value:{constraints:[{value:"v1"}]}}]}` | `{env=NOT_prod,version=v1}` |
-| `{env=prod,version=v1}` | `{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}]}}, {key:"version" value:{constraints:[{value:"v1"}]}}]}` | `{env=prod,version=v1}` |
- ## Rationale We considered extending the context parameter mechanism from [xRFC From 6ba6067cfd8c7c3c43bfb09a18b186dc0157b3ae Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Wed, 11 Aug 2021 09:14:20 -0700 Subject: [PATCH 05/14] address @htuch comments Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 45 +++++++++---------- 1 file changed, 22 insertions(+), 23 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index b9e85622..077aafdc 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -42,11 +42,9 @@ individual clients. Here are some examples: case, except that there is a single cluster with a large number of endpoints. The goal is that the xDS server will send different subsets of endpoints to different clients, thus avoiding unwanted connections - when there are large numbers of both servers and clients. (At Google, - this is referred to as "subsetting", but it's a different feature than - the one that Envoy uses that term for.) In this case, it is desirable - for the xDS server to determine the subset of endpoints to assign to - each client. + when there are large numbers of both servers and clients. In this case, + it is desirable for the xDS server to determine the subset of endpoints + to assign to each client (or possibly equivalent class of clients). - **Selecting which cluster to send a client to based on an ACL.** In this use-case, there are two different network paths that can be used to access the endpoints: one goes directly to the endpoints, with @@ -133,7 +131,7 @@ they are not required by the client's data model. 
(For example, in the different variants of the EDS resource, but once a given client has the right variant, it will be unique on that client, which means that the CDS resource does not need to refer to different EDS resource names on -different client.) +a different client.) It should be noted that caching xDS proxies, unlike "leaf" clients, will need to track multiple variants of each resource, since a given caching @@ -172,20 +170,6 @@ This flexible matching semantic means that there are some cases where ambiguity can occur; we define a set of best practices below to prevent these cases from occurring in practice. -#### Matching Ambiguity - -Flexible matching means that there may be ambiguities when determining -which resources match which subscriptions. This section defines the matching -behavior and a set of best practices for deployments to follow to avoid this -kind of ambiguity. - -To illustrate where this comes up in practice, it is useful to consider -what happens in transition scenarios, where a deployment initially -groups its clients on a single key but then wants to add a second key. -The second key needs to be added both in the constraints on the server -side and in the clients' configurations, but those two changes cannot -occur atomically. - For example, let's say that the clients are currently categorized by the parameter `env`, whose value is either `prod` or `test`. The resource variants on the server will therefore have the following sets of dynamic @@ -212,9 +196,24 @@ constraints, depending on whether they are `prod` or `test` clients: ]} ``` -Now the deployment wants to add an additional key called `version`, -whose value will be either `v1` or `v2`, so that it can further subdivide -its clients' configs. +#### Matching Ambiguity + +Flexible matching means that there may be ambiguities when determining +which resources match which subscriptions.
This section defines the matching +behavior and a set of best practices for deployments to follow to avoid this +kind of ambiguity. + +To illustrate where this comes up in practice, it is useful to consider +what happens in transition scenarios, where a deployment initially +groups its clients on a single key but then wants to add a second key. +The second key needs to be added both in the constraints on the server +side and in the clients' configurations, but those two changes cannot +occur atomically. + +Consider the above example where the clients are already divided into +`env=prod` and `env=test`. Let's say that now the deployment wants to add +an additional key called `version`, whose value will be either `v1` or `v2`, +so that it can further subdivide its clients' configs. If the new key is added on the clients first, then the clients will start subscribing with dynamic parameters constraints like the following: From 8dabdddd089be9af2d53e3161520c5d9d078bf5a Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Wed, 11 Aug 2021 13:38:43 -0700 Subject: [PATCH 06/14] remove node context params from xRFC TP1, since that should use dynamic params instead Signed-off-by: Mark D. Roth --- proposals/TP1-xds-transport-next.md | 23 ++++------------------- 1 file changed, 4 insertions(+), 19 deletions(-) diff --git a/proposals/TP1-xds-transport-next.md b/proposals/TP1-xds-transport-next.md index 0cacb2f8..b2bcec64 100644 --- a/proposals/TP1-xds-transport-next.md +++ b/proposals/TP1-xds-transport-next.md @@ -276,21 +276,6 @@ Context parameters in URNs presented by the client to the server will be composed from the following sources. Using an example of a URL `xdstp://some-authority/some.type/foo?bar=baz`: -* Static `Node`-derived context parameters. These are prefixed with - `xds.node.`. The set of `Node`-derived context parameters is specified in the - bootstrap on a per-resource type basis. The key reflects the `Node` proto3 - field structure, e.g. 
`xds.node.locality.sub_zone=some_sub_zone`, - `xds.node.metadata.bar="a"`, `xds.node.user_agent_version=1.2.3`. - * Generally, values in the `Node` are converted from their proto3 value - to JSON following the [canonical - transformation](https://developers.google.com/protocol-buffers/docs/proto3#json). - * Both `xds.node.metadata.X` and - `xds.node.user_agent_build_version.metadata.X` permit directly referencing - a top-level metadata field `X` in a context parameter key. This does not - apply to nested fields in the metadata, e.g. `xds.node.metadata.bar`. - * `xds.node.user_agent_build_version.version` yields a string value composed - of `major.minor.patch` values, e.g. `"1.2.3"`. - * Context parameters from the URL, in the above example `bar=baz`. These must not be in the `xds.*` namespace. @@ -302,7 +287,7 @@ composed from the following sources. Using an example of a URL `xds.resource.` prefixed. An example computed URN following the above example is -`xdstp://some-authority/some.type/foo?xds.node.metadata.foo=bar&xds.shard_id=1234&bar=baz&xds.client_feature.lb.least_loaded=false&xds.resource.vip=96.54.3.1`. +`xdstp://some-authority/some.type/foo?bar=baz&xds.resource.vip=96.54.3.1`. This proposal reserves all prefixes beginning with a non-alphanumeric character for context parameter values in future URI context parameter enhancements. 
@@ -686,7 +671,7 @@ Client LDS `DeltaDiscoveryRequest` sent to xDS relay proxy (note the use of clie ``` resource_names_subscribe: -- xdstp://some-authority/envoy.config.listeners.v3.Listener/my-listeners/*?xds.node_type=ingress&xds.client_features.envoy.config.no_bind_to_port=true +- xdstp://some-authority/envoy.config.listeners.v3.Listener/my-listeners/*?node_type=ingress ``` xDS management server `DeltaDiscoveryResponse` sent to the client: @@ -695,12 +680,12 @@ xDS management server `DeltaDiscoveryResponse` sent to the client: ``` resources: - version: 1 - name: xdstp://some-authority/envoy.config.listeners.v3.Listener/my-listeners/foo?xds.node_type=ingress&xds.client_features.envoy.config.no_bind_to_port=true + name: xdstp://some-authority/envoy.config.listeners.v3.Listener/my-listeners/foo?node_type=ingress resource: "@type": type.googleapis.com/envoy.config.listeners.v3.Listener … foo's Listener payload … - version: 42 - name: xdstp://some-authority/envoy.config.listeners.v3.Listener/my-listeners/bar?xds.node_type=ingress&xds.client_features.envoy.config.no_bind_to_port=true + name: xdstp://some-authority/envoy.config.listeners.v3.Listener/my-listeners/bar?node_type=ingress resource: "@type": type.googleapis.com/xds.core.v3.ResourceLocator From 9308fe716026d166668bca846251561ae594df52 Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Thu, 14 Oct 2021 23:03:58 +0000 Subject: [PATCH 07/14] change design to send params from client and put constraints on resources Signed-off-by: Mark D. 
Roth --- ...cally-generated-cacheable-xds-resources.md | 791 ++++++++---------- 1 file changed, 371 insertions(+), 420 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 077aafdc..3021e9d8 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -3,7 +3,7 @@ TP2: Dynamically Generated Cacheable xDS Resources * Author(s): markdroth, htuch * Approver: htuch * Implemented in: -* Last updated: 2021-08-11 +* Last updated: 2021-10-14 ## Abstract @@ -21,50 +21,16 @@ not viral, thus making the mechanism much easier to use. ## Background -There are many use-cases where a control plane may need to dynamically -generate the contents of xDS resources to tailor the resources for -individual clients. Here are some examples: - -- **xDS minor/patch version negotiation.** In this case, each client - supports a given minor and patch version, and the server may choose to - use newer API features when talking to clients that support newer versions - of the API. (See https://github.com/envoyproxy/envoy/issues/8416 for - details.) -- **Sharding Cluster resources for scalability.** In this use-case, there - are a really large number of clusters, too many for any one client to - handle. The clusters are divided into shards, and each client is given - a dynamically changing assignment of which shards to load. To support - this, there needs to be a different variant of the `ClusterCollection` - resource for each combination of shards that may be assigned to a given - client. The shard assignments are generally determined dynamically on - the client but may change at any time. -- **Sharding endpoints for scalability.** This is similar to the previous - case, except that there is a single cluster with a large number of - endpoints. 
The goal is that the xDS server will send different subsets - of endpoints to different clients, thus avoiding unwanted connections - when there are large numbers of both servers and clients. In this case, - it is desirable for the xDS server to determine the subset of endpoints - to assign to each client (or possibly equivalent class of clients). -- **Selecting which cluster to send a client to based on an ACL.** In this - use-case, there are two different network paths that can be used to - access the endpoints: one goes directly to the endpoints, with - client-side load balancing, and the other goes via a reverse proxy. - The path that goes directly to the endpoints is faster but is - access-restricted. The xDS server needs to check an ACL to determine - whether a given client is authorized to directly access the endpoints. - If the client is authorized, it will be sent a `RouteConfiguration` - pointing to the cluster for those endpoints; otherwise, it will be sent - a different variant of the `RouteConfiguration` that points to a cluster - containing the reverse proxy endpoint. -- **Dynamic route selection.** Every client sends a set of dynamic - selection parameters (today, conveyed as node metadata). The server - has a list of routes to configure, but individual routes in the list - may be included or excluded based on the client's dynamic selection - parameters. Thus, the server needs to generate a slightly different - version of the `RouteConfiguration` for clients based on the parameters - they send. (See - https://cloud.google.com/traffic-director/docs/configure-advanced-traffic-management#config-filtering-metadata - for an example.) +There are many use-cases where a control plane may need to +dynamically generate the contents of xDS resources to tailor the +resources for individual clients. 
One common case is where the +server has a list of routes to configure, but individual routes in +the list may be included or excluded based on the client's dynamic +selection parameters (today, conveyed as node metadata). Thus, +the server needs to generate a slightly different version of the +`RouteConfiguration` for clients based on the parameters they send. (See +https://cloud.google.com/traffic-director/docs/configure-advanced-traffic-management#config-filtering-metadata +for an example.) The new xDS naming scheme described in [xRFC TP1](TP1-xds-transport-next.md) provides a mechanism called context parameters, which is intended to move all @@ -144,64 +110,127 @@ essentially irrelevant in that case. With the above property in mind, this document proposes the following data structures: -- **Dynamic parameters**, which are a set of key/value pairs that are part - of the cache key for an xDS resource (in addition to the resource name - itself). This provides a mechanism to represent multiple variants of a - given resource in a cacheable way. These parameters are used to identify - the specified resource in the transport protocol, but they are not part of - the resource name and therefore do not appear as part of the resource graph. +- **Dynamic parameters**, which are a set of key/value pairs sent by the + client when subscribing to a resource. - **Dynamic parameter constraints**, which are a set of criteria that can be used to determine whether a set of dynamic parameters matches - the constraints. When a client subscribes to a resource, it may - specify a set of dynamic parameter constraints, which will be used to - select which variant of the resource will be returned by the server. - In response to a given subscription request from the client containing - a set of dynamic parameter constraints, the server will send a - resource whose dynamic parameters match the dynamic parameter - constraints in the request. 
The client will use the dynamic - parameters on the resource to determine which of its subscriptions the - resource is associated with. + the constraints. These constraints are part of the cache key for an + xDS resource (in addition to the resource name itself) on xDS servers, + xDS clients, and xDS caching proxies. This provides a mechanism to + represent multiple variants of a given resource in a cacheable way. -Dynamic parameters, unlike context parameters, will not be -exact-match-only. Dynamic parameter constraints will be able to represent -various types of flexible matching, such as range-based matching (which -will be used for the "xDS minor/patch version negotiation" use-case). -This flexible matching semantic means that there are some cases where -ambiguity can occur; we define a set of best practices below to prevent -these cases from occurring in practice. +Both of these data structures are used in the xDS transport protocol, +but they are not part of the resource name and therefore do not appear as +part of the resource graph. + +When a client subscribes to a resource, it specifies a set of dynamic +parameters. In response, the server will send a resource whose dynamic +parameter constraints match the dynamic parameters in the subscription +request. The client will use the dynamic parameter constraints on the +returned resource to determine which of its subscriptions the resource is +associated with. + +#### Constraints Representation + +Dynamic parameter constraints will be represented in protobuf form as follows: + +```proto +message DynamicParameterConstraints { + // A list of constraints that may be combined with AND or OR semantics. + message ConstraintList { + // A constraint for a given key. + message Constraint { + message Exists {} + // The key to match against. + string key = 1; + // How to match. + oneof constraint_type { + // Matches this exact value. 
+ string value = 2; + // Key is present (matches any value except for the key being absent). + Exists exists = 3; + } + // If set to true, the match is inverted -- i.e., the key must NOT + // match the specified value. + bool invert = 4; + } + + enum MatchType { + // Default value. + MATCH_TYPE_UNSPECIFIED = 0; + // Logical AND of constraints. + MATCH_TYPE_AND = 1; + // Logical OR of constraints. + MATCH_TYPE_OR = 2; + } + + // A list of key/value constraints. + repeated Constraint constraints = 1; + + // How to match the constraints. + MatchType match_type = 2; + } + + // A list of constraint lists. All constraint lists must match (i.e., + // logical AND semantics). + repeated ConstraintList constraints = 1; +} +``` + +#### Matching Behavior + +Note that both xDS servers and clients need to evaluate matching between +a set of dynamic parameters and a set of constraints. The server does +this when deciding which variant of a given resource to return for a +given subscription request. When the client receives the resource from +the server, it needs to do the same matching to determine which of its +subscriptions that resource is associated with. Therefore, the matching +behavior becomes an inherent part of the xDS transport protocol. + +(In effect, the resource cache in an xDS client is basically the same +logic as that on an xDS server; the only difference is that in the case +of a client, the resources in the cache come from an xDS stream instead +of from an authoritative database. Similarly, a caching xDS proxy is +simply an xDS client where the subscriptions come from an incoming xDS +stream.) For example, let's say that the clients are currently categorized by the -parameter `env`, whose value is either `prod` or `test`. The resource -variants on the server will therefore have the following sets of dymamic -parameters: +parameter `env`, whose value is either `prod` or `test`. 
So any given +client will send one of the following sets of dynamic parameters: - `{env=prod}` - `{env=test}` -Clients will send one of the following two sets of dynamic parameter -constraints, depending on whether they are `prod` or `test` clients: +The resource variants on the server will have the following sets of dynamic +parameter constraints: ```textproto // For {env=prod} -{key_constraints:[ - {key:"env" value:{ - constraints:[{value:"prod"}] - }} +{constraints:[ + { + constraints:[{key:"env" value:"prod"}] + match_type: MATCH_TYPE_AND + } ]} // For {env=test} -{key_constraints:[ - {key:"env" value:{ - constraints:[{value:"test"}] - }} +{constraints:[ + { + constraints:[{key:"env" value:"test"}] + match_type: MATCH_TYPE_AND + } ]} ``` #### Matching Ambiguity -Flexible matching means that there may be ambiguities when determining -which resources match which subscriptions. This section defines the matching -behavior and a set of best practices for deployments to follow to avoid this -kind of ambiguity. +Dynamic parameters, unlike context parameters, will not be +exact-match-only. Dynamic parameter constraints will be able to represent +certain simple types of flexible matching, such as matching an exact +value or the existance of a key, and simple AND and OR combinations +of constraints. This flexible matching semantic means that there may be +ambiguities when determining which resources match which subscriptions. +This section defines the matching behavior and a set of best practices for +deployments to follow to avoid this kind of ambiguity. To illustrate where this comes up in practice, it is useful to consider what happens in transition scenarios, where a deployment initially @@ -215,147 +244,197 @@ Consider the above example where the clients are already divided into an additional key called `version`, whose value will be either `v1` or `v2`, so that it can further subdivide its clients' configs. 
-If the new key is added on the clients first, then the clients will -start subscribing with dynamic parameters constraints like the following: +If the new key is added on the server side first, then the server will +have resource variants with constraints like this: ```textproto // For {env=prod, version=v1} -{key_constraints:[ - {key:"env" value:{ - constraints:[{value:"prod"}] - }}, - {key:"version" value:{ - constraints:[{value:"v1"}] - }} +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" value:"v1"} + ] + match_type: MATCH_TYPE_AND + } +]} + +// For {env=prod, version=v2} +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" value:"v2"} + ] + match_type: MATCH_TYPE_AND + } ]} ``` -The server or cache has to match that set of constraints against the -existing sets of dynamic parameters, which do not specify the `version` -key at all. +But at this point, the clients are continuing to subscribe without +specifying this new key. So the server or cache would not have any way +to know which of the above variants to use for a subscription specifying +`{env=prod}` but not specifying `version`. -Conversely, if the new key is added on the server side first, then the -server will have resource variants with parameters like this: +Conversely, if the new key is added on the clients first, then the clients +will start subscribing with dynamic parameters like the following: - `{env=prod, version=v1}` - `{env=prod, version=v2}` - `{env=test, version=v1}` - `{env=test, version=v2}` -But at this point, the clients are continuing to subscribe without -constraints on this new key. So the server or cache needs to figure out -(e.g.) which of the first two sets of constraints to use for constraints -that require `env` to be `prod` but do not specify `version`. - -We address this transition scenario by allowing the set of constraints -for a given key to match any resource variant that does not specify that -key at all. 
This allows constraints for new keys to be added on clients -before the corresponding keys are added on the resources on the server, but -it does introduce some additional ambiguity into the matching. For example, -let's say that the server has the following two variants of a resource: -- `{env=prod}` -- `{env=prod, version=v1}` - -Now consider what happens if a client subscribes with the following -constraints: +The server or cache has to match those sets of dynamic parameters against +the existing sets of dynamic parameter constraints, which do not specify the +`version` key at all. + +We address this transition scenario by allowing a set of constraints +to match a set of dynamic parameters that includes a key that is not +specified by the constraints. This allows new keys to be added on +clients before the corresponding constraints are added on the resources, +which we expect to be the common case. (In general, we expect clients +to send a lot of keys that may not actually be used by the server, since +deployments often divide their clients into categories before they have +a need to differentiate the configs for those categories.) + +As mentioned above, this approach does introduce the possibility of +matching ambiguity in certain cases, where there may be more than one +variant of a resource that matches the dynamic parameters specified by +the client. If an xDS transport protocol implementation does encounter +multiple possible matching variants of a resource, its behavior is +undefined. In the following sections, we evaluate the cases where that +can occur and specify how each one will be addressed. + +##### Adding a New Key on the Server First + +As stated above, we are optimizing for the case where new keys are added +on clients first, since that is expected to be the common scenario. +However, there may be cases where it is not feasible to have all clients +start sending a new key before the server needs to start making use of +that key. 
+ +For example, let's consider the same case as above, where the clients +are initially sending only the `env` key, and the server now wants to +introduce the `version` key. However, let's say that this is in an +environment where the xDS server is controlled by one team and the clients +are controlled by various other teams, so it's not feasible to force all +clients to start sending the new `version` key all at once. But there +is one particular client team that is eager to start using the new +`version` key to differentiate the configs of their clients, and they +don't want to wait for all of the other client teams to start sending +the new key. + +Consider what happens if the server simply adds a variant of the +resource with the new key: ```textproto -{key_constraints:[ - {key:"env" value:{ - constraints:[{value:"prod"}] - }}, - {key:"version" value:{ - constraints:[{value:"v1"}] - }} +// Existing variant for older clients that are not yet sending the +// version key. +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"} + ] + match_type: MATCH_TYPE_AND + } +]} + +// New variant intended for clients sending the version key. +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" value:"v1"} + ] + match_type: MATCH_TYPE_AND + } ]} ``` -These constraints can match either of the above variants of the resource. -This situation can be avoided by establishing a best practice that all -variants of a given resource must have the same set of keys. +This will work fine for older clients that are not yet sending the +`version` key, because their dynamic parameters will not match the new +variant's constraints. However, newer clients that are sending dynamic +parameters `{env=prod, version=v1}` will run into ambiguity: those +parameters can match either of the above variants of the resource. -There is still a possible ambiguity that can occur if a server adds -multiple variants of a new key that clients are not yet sending. 
-For example, let's say that the server has the following two variants -of a resource: -- `{env=prod, version=v1}` -- `{env=prod, version=v2}` +This situation will be avoided by requiring that **all variants of a +given resource must specify constraints for the same set of keys**. -Consider what happens if a client subscribes with the following constraints: +However, in order to make this work for the case where the server starts +sending the constraint on the new key before all clients are sending it, +we provide the `exists` matcher, which will allow the server to specify +a default explicitly for clients that are not yet sending a new key. +In this example, the server would actually have the following two +variants: ```textproto -{key_constraints:[ - {key:"env" value:{ - constraints:[{value:"prod"}] - }} +// Existing variant for older clients that are not yet sending the +// version key. +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" exists:{} invert:true} + ] + match_type: MATCH_TYPE_AND + } ]} -``` - -These constraints can match either variant of the above resource. -This can be avoided by establishing a best practice of not adding multiple -variants of a new parameter until clients are sending the new parameter. -However, if this does happen, the cache implementation is free to pick -one of the variants at random. -So, the expected order of changes for this kind of transition would be: -1. Change clients to start sending a constraint for `version=v1`. -2. Add the dynamic parameter `version=v1` to all existing resources. -3. Create new variants of each resource with `version=v2`. -4. Change the desired set of clients to send a constraint for - `version=v2` instead of `version=v1`. - -##### Alternatives Considered - -We could avoid much of the matching ambiguity described above by saying that -a set of constraints must specify all keys present on the resource in order -to match. 
However, this would mean that if the client starts subscribing -with a constraint for the new key before the corresponding key is added on -the resources on the server, then it will fail to match the existing resources. -In other words, the process would be: - -1. Add a variant of all resources on the server side with `version=v1` - (in addition to all existing dynamic parameters). -2. Change clients to start sending constraints with the new key. -3. When all clients are updated, remove the resource variants that do - *not* have the new key. +// New variant for clients sending the version key. +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" value:"v1"} + ] + match_type: MATCH_TYPE_AND + } +]} +``` -This will effectively require adding new keys on the server side first, -which seems like a large burden on users. It also seems fairly tricky -for most users to get the exactly correct set of dynamic parameters on -each resource variant, and if they fail to do it right, they will break -their existing configuration. +This allows maintaining the requirement that all variants of a given +resource have constraints on the same set of keys, while also allowing +the server to explicitly provide a result for older clients that do not +yet send the new key. -We also considered having the client add the new constraint but mark it -as optional using an `is_optional` field. That way, it would match -resources both before and after the new key is added on the server. -However, the `is_optional` field would introduce another type of ambiguity -in matching. 
Specifically, let's say that the server has the following -two variants of the resource: -- `{env=prod}` -- `{env=prod, version=v1}` +##### Variants With Overlapping Constraint Values -Now a client subscribes with the following set of constraints: +There is also a possible ambiguity that can occur if a server provides +multiple variants of a resource whose constraints for a given key +overlap in terms of the values they can match. For example, let's say +that a server has the following two variants of a resource: ```textproto -{key_constraints:[ - {key:"env" value:{ - constraints:[{value:"prod"}] - }}, - {key:"version" value:{ - constraints:[{value:"v1"}] - is_optional: true - }} +// Matches {env=prod} or {env=test}. +{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"env" value:"test"} + ] + match_type: MATCH_TYPE_OR + } +]} + +// Matches {env=qa} or {env=test}. +{constraints:[ + { + constraints:[ + {key:"env" value:"qa"}, + {key:"env" value:"test"} + ] + match_type: MATCH_TYPE_OR + } ]} ``` -These constraints can match either of the above variants of the resource. -For authoritative servers, this could be addressed by establishing -a best practice of not having two variants of a resource that differ -only by keys that the client will send as optional. However, this -requires coordination between client and server, and requires -machinery on the client to determine when to set the `is_optional` bit. +Now consider what happens if a client subscribes with dynamic parameters +`{env=test}`. Those dynamic parameters can match either of the above +variants of the resource. -Ultimately, although this approach is more semantically precise, it is -also considered too rigid and difficult for users to work with. +This situation will be avoided by requiring that **all variants of a given +resource must specify non-overlapping constraints for the same set of keys**. +Control planes must not accept a set of resources that violates this +requirement. 
#### Matching Behavior and Best Practices @@ -363,95 +442,25 @@ We advise deployments to avoid ambiguity through the following best practices: - Whenever there are multiple variants of a resource, all variants must list the same set of keys. This allows the server to ignore constraints on keys sent by the client that do not affect the choice of variant - without causing ambiguity in cache misses. -- Servers should not create multiple variants of a parameter that is not yet - being sent by clients. If they do, clients that do not send that parameter - will get one of the variants at random. + without causing ambiguity in cache misses. Servers may use the + `exists` mechanism to provide backward compatibility for clients that + are not yet sending a newly added key. +- The constraints on each variant of a given resource must be mutually + exclusive. For example, if one variant of a resource matches a given key + with values "foo" or "bar", and another variant matches that same key + with values "bar" or "baz", that would cause ambiguity, because both + variants would match the value "bar". - There must be a variant of the resource for every value of a key that is going to be present. For example, if clients will send constraints on the `env` key requiring the value to be one of `prod`, `test`, or `qa`, then - you must have each of those three variants of the resource. -- For cases where a constraint may match multiple values (e.g., a - range constraint), the largest possible matching value is preferred. - This means that caches (both on clients and on caching xDS proxies) - must attempt to fetch a larger value even if they already have a smaller - matching value already present in the cache. 
For example, let's say - that a cache contains a variant of a resource with the parameter - `{shard=3}` and a client subscribes with the following constraints: - ```textproto - {key_constraints:[ - {key:"shard" value:{ - constraints:[ - {integer_range_list:[ - {range:{min_value:0 max_value:5}} - ]} - ] - }} - ]} - ``` - In this case, the cache must attempt to fetch a resource from the - authoritative server with that constraint before falling back to - using the one it already has cached, because the preferred value is - `{shard=5}`, not `{shard=3}`. - - Note: This is not an issue for glob collections, because in that case - all matching variants of the resource will be used. - -#### API Changes - -The API changes necessary to implement this proposal are in -https://github.com/envoyproxy/envoy/pull/17192. - -Dynamic parameter constraints will be represented as follows: - -```proto -// A set of dynamic parameter constraints used to select the variant of -// a given resource desired by a client. Clients send a set of -// constraints with each subscription request, and servers respond by -// sending a resource with a matching set of dynamic parameters. -message DynamicParameterConstraints { - // Constraints for a given key. - message KeyConstraints { - message Constraint { - // A list of one or more integer ranges. - // A value is considered to match if it falls in any of the ranges. - message IntegerRangeList { - // At least one of *min_value* or *max_value* must be set. - message Range { - // If specified, value may not be less than this. - uint64 min_value = 1; - - // If specified, value may not be greater than this. - uint64 max_value = 2; - } - - repeated Range range = 1; - } - - oneof constraint { - // The key must have this specific value. - string value = 1; + you must have each of those three variants of the resource. (Failure + to do this will result in the server acting as if the requested + resource does not exist.) 
- // The key's value must be integers and within one of the ranges in this list. - IntegerRangeList integer_range_list = 2; - } - } - - // A list of one or more constraints on the value of the key. - // All constraints must be met. - repeated Constraint constraints = 2; - } - - // One entry per key. - // Note that if a key has a constraint here, it will place restrictions - // on the key's value if the key is present on a variant of the resource. - // However, if a key has a constraint here but is not present on the - // resource, it will match, regardless of what the constraint says. - map<string, KeyConstraints> key_constraints = 1; -} -``` +#### Transport Protocol Changes The following message will be added to represent a subscription to a -resource by name with associated dynamic parameter constraints: +resource by name with associated dynamic parameters: ```proto // A specification of a resource used when subscribing or unsubscribing. @@ -459,22 +468,21 @@ message ResourceLocator { // The resource name to subscribe to. string name = 1; - // A set of constraints used to match against the dynamic parameters on the resource. This - // allows clients to select between multiple variants of the same resource. - DynamicParameterConstraints dynamic_parameter_constraints = 2; + // A set of dynamic parameters used to match against the dynamic parameter + // constraints on the resource. This allows clients to select between + // multiple variants of the same resource. + map<string, string> dynamic_parameters = 2; } ``` The following new field will be added to `DiscoveryRequest`, to allow clients -to specify constraints when subscribing to a resource: +to specify dynamic parameters when subscribing to a resource: ```proto // Alternative to resource_names field that allows specifying cache - // keys along with each resource name. If this is populated in the - // first request for a resource type on a stream, resource_names is ignored - // for all subsequent requests for that resource type on that stream.
- // Clients that populate this field must be able to handle responses - // from the server where resources are wrapped in a Resource message. + // keys along with each resource name. Clients that populate this field + // must be able to handle responses from the server where resources are + // wrapped in a Resource message. repeated ResourceLocator resource_locators = 7; ``` @@ -482,17 +490,11 @@ Similarly, the following fields will be added to `DeltaDiscoveryRequest`: ```proto // Alternative to resource_names_subscribe field that allows specifying cache - // keys along with each resource name. If this is populated in the - // first request for a resource type on a stream, resource_names_subscribe - // and resource_names_unsubscribe are ignored for all subsequent requests - // for that resource type on that stream. + // keys along with each resource name. repeated ResourceLocator resource_locators_subscribe = 8; // Alternative to resource_names_unsubscribe field that allows specifying cache - // keys along with each resource name. If resource_locators_subscribe is - // populated in the first request for a resource type on a stream, - // this field is used instead of resource_named_unsubscribe for all - // subsequent requests for that resource type on that stream. + // keys along with each resource name. repeated ResourceLocator resource_locators_unsubscribe = 9; ``` @@ -500,9 +502,10 @@ The following field will be added to the `Resource` message, to allow the server to return the dynamic parameters associated with each resource: ```proto - // Dynamic parameters associated with this resource. To be used by client-side caches - // (including xDS proxies) when matching subscribed resource locators. - map dynamic_parameters = 8; + // Dynamic parameter constraints associated with this resource. To be used + // by client-side caches (including xDS proxies) when matching subscribed + // resource locators. 
+ DynamicParameterConstraints dynamic_parameter_constraints = 8; ``` ### Migrating From Node Metadata @@ -519,132 +522,42 @@ contains the entry `env=prod`, this would be translated to a constraint Any given xDS client may support either or both of these mechanisms. -### Examples +### Example This section shows how the mechanism described in this proposal can be -used to address each of the use-cases identified in the "Background" +used to address the use-case described in the "Background" section above. -#### xDS Minor/Patch Version Negotiation - -The client will send the following dynamic parameter constraints, which -indicate the range of versions that it supports: - -```textproto -{key_constraints:[ - {key:"xds.version.minor" value:{ - constraints:[ - {integer_range_list:[ - {range:{min_value:0 max_value:5}} - ]} - ] - }} -]} -``` - -Let's say that a server has a resource that wants to use a new feature -introduced in version 3.0.5 for clients that support that version. It will -provide two versions of that resource: -- For clients at version 3.0.5 or higher, a resource with keys - `{"xds.version.patch"=5}`. -- For clients at version 3.0.4 or lower, there will need to be at least one - variant of the resource for every possible version range that any client - may request, all with the exact same content. For example: - - `{"xds.version.patch"=4}` - - `{"xds.version.patch"=3}` - - `{"xds.version.patch"=2}` - - `{"xds.version.patch"=1}` - - `{"xds.version.patch"=0}` - -#### Sharding Clusters - -In this use-case, the client will have a set of shard ranges determined -by some client-side code, resulting in a pair of dynamic parameter -constraints, one for SRDS and another for CDS. For example, let's -say that a client should use shard ranges [4-6], [11-15], and [46-90]. -The dynamic parameter constraints for SRDS and CDS would be: - -```textproto -// For SRDS (single resource).
-{key_constraints:[ - {key:"shards" value:{ - constraints:[ - {value:"[4-6],[11-15],[46-90]"} - ] - }} -]} - -// For CDS (glob collection). -{key_constraints:[ - {key:"shards" value:{ - constraints:[ - {integer_range_list:[ - {range:{min_value:4 max_value:6}}, - {range:{min_value:11 max_value:15}}, - {range:{min_value:46 max_value:90}} - ]} - ] - }} -]} -``` - -The resulting SRDS resource will tell the client what RDS resources -to fetch. The server can either generate different resource names for -each variant of the RDS resource, or it can choose to apply the same -constraints to RDS as it uses for SRDS. - -#### Sharding Endpoints - -For this use-case, dynamic parameters cannot be used, because the server -would need to determine which dynamic parameter constraints a given -client would use, and that decision is inherently non-cacheable. As a -result, we will not address this use-case via dynamic parameters; -instead, the server will supply -[non-cacheable](https://github.com/envoyproxy/envoy/blob/5e80b8255d267dbd7b128244605e93f9541ccaa5/api/envoy/service/discovery/v3/discovery.proto#L245) -EDS resources. - -In the future, we may consider alternative designs that will better -address this use-case. - -#### Selecting Cluster Based on ACL - -For this use-case, dynamic parameters cannot be used, because the server -would need to determine which dynamic parameter constraints a given -client would use, and that decision is inherently non-cacheable. As a -result, we will not address this use-case via dynamic parameters; -instead, the server will supply -[non-cacheable](https://github.com/envoyproxy/envoy/blob/5e80b8255d267dbd7b128244605e93f9541ccaa5/api/envoy/service/discovery/v3/discovery.proto#L245) -RDS resources. - -In the future, we may consider alternative designs that will better -address this use-case. 
- -#### Dynamic Route Selection - Let's say that every client uses two different dynamic selection -parameters, `env` (which can have one of the values `prod`, `canary`, or -`test`) and `version` (which can have one of the values `v1`, `v2`, or `v3`). -Now let's say that there is a RouteConfiguration with one route that should -be selected via the parameter `env=prod` and another route that should be -selected via the parameter `version=v1`. Normally, the server will need to -actually provide the cross-product of these parameter values, so there -will be 9 different variants of the resource, even though there are only -4 unique contents for the resource: +parameters, `env` (which can have one of the values `prod`, `canary`, +or `test`) and `version` (which can have one of the values `v1`, `v2`, +or `v3`). Now let's say that there is a `RouteConfiguration` with one +route that should be selected via the parameter `env=prod` and another +route that should be selected via the parameter `version=v1`. Without +this design, the server would need to actually provide the cross-product +of these parameter values, so there would be 9 different variants of the +resource, even though there are only 4 unique contents for the resource. +However, this design instead allows the server to provide only the 4 +unique variants of the resource, with constraints allowing each client +to get the appropriate one: - + @@ -850,13 +792,10 @@ to get the appropriate one: @@ -870,14 +809,11 @@ to get the appropriate one:
Dynamic Parameters on ResourceDynamic Parameter Constraints on Resource Resource Contents
-
    -
  • {env=canary,version=v2} -
  • {env=test,version=v2} -
  • {env=canary,version=v3} -
  • {env=test,version=v3} -
+{constraints:[ + { + constraints:[ + {key:"env" value:"prod" invert:true}, + {key:"version" value:"v1" invert:true} + ] + match_type: MATCH_TYPE_AND + } +]}
    @@ -656,10 +569,15 @@ will be 9 different variants of the resource, even though there are only
-
    -
  • {env=prod,version=v2} -
  • {env=prod,version=v3} -
+{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" value:"v1" invert:true} + ] + match_type: MATCH_TYPE_AND + } +]}
    @@ -671,10 +589,15 @@ will be 9 different variants of the resource, even though there are only
-
    -
  • {env=canary,version=v1} -
  • {env=test,version=v1} -
+{constraints:[ + { + constraints:[ + {key:"env" value:"prod" invert:true}, + {key:"version" value:"v1"} + ] + match_type: MATCH_TYPE_AND + } +]}
    @@ -686,9 +609,15 @@ will be 9 different variants of the resource, even though there are only
-
    -
  • {env=prod,version=v1} -
+{constraints:[ + { + constraints:[ + {key:"env" value:"prod"}, + {key:"version" value:"v1"} + ] + match_type: MATCH_TYPE_AND + } +]}
    @@ -713,6 +642,28 @@ some down-sides: this doc is completely independent of the new xDS naming scheme; it can be used with the legacy xDS naming scheme as well.) +We could avoid much of the matching ambiguity described above by saying that +a set of constraints must specify all keys present in the subscription +request in order to match. However, this would mean that if the client +starts subscribing with a new key before the corresponding constraint is +added on the resources on the server, then it will fail to match the +existing resources. In other words, the process would be: + +1. Add a variant of all resources on the server side with a constraint + for `version=v1` (in addition to all existing constraints). +2. Change clients to start sending the new key. +3. When all clients are updated, remove the resource variants that do + *not* have the new key. + +This will effectively require adding new keys on the server side first, +which seems like a large burden on users. It also seems fairly tricky +for most users to get the exactly correct set of dynamic parameters on +each resource variant, and if they fail to do it right, they will break +their existing configuration. + +Ultimately, although this approach is more semantically precise, it is +also considered too rigid and difficult for users to work with. + ## Implementation TBD (Will probably be implemented in gRPC before Envoy) From f3ef06b77b584de2aacb7398c3450057fc20c568 Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Thu, 14 Oct 2021 23:09:27 +0000 Subject: [PATCH 08/14] add note about limitation on extending matching Signed-off-by: Mark D. 
Roth --- ...cally-generated-cacheable-xds-resources.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 3021e9d8..04797177 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -631,6 +631,23 @@ to get the appropriate one: ## Rationale +This section documents limitations and design alternatives that we +considered. + +### Limitation on Enhancing Matching in the Future + +One limitation of this design is that, because all xDS transport protocol +implementations (clients, servers, and caching proxies) need to implement +this matching behavior, it will be very difficult to add new matching +behavior in the future. Doing so will probably require some sort of +client capability. + +Because of this, reviewers of this design are encouraged to carefully +scrutinize the proposed matching semantics to ensure that they meet our +expected needs. + +### Using Context Parameters + We considered extending the context parameter mechanism from [xRFC TP1](TP1-xds-transport-next.md) to support flexible matching semantics, rather that its current exact-match semantics. However, that approach had @@ -642,6 +659,8 @@ some down-sides: this doc is completely independent of the new xDS naming scheme; it can be used with the legacy xDS naming scheme as well.) +### Stricter Matching to Avoid Ambiguity + We could avoid much of the matching ambiguity described above by saying that a set of constraints must specify all keys present in the subscription request in order to match. However, this would mean that if the client From cec4e7952e6cb6d3c3bd138d487eadc53ad2b407 Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Wed, 20 Oct 2021 17:00:50 +0000 Subject: [PATCH 09/14] address review feedback from ejona86 Signed-off-by: Mark D. 
Roth --- ...cally-generated-cacheable-xds-resources.md | 223 ++++++++++++------ 1 file changed, 145 insertions(+), 78 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 04797177..ea9c96dd 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -130,6 +130,14 @@ request. The client will use the dynamic parameter constraints on the returned resource to determine which of its subscriptions the resource is associated with. +Dynamic parameters, unlike context parameters, will not be +exact-match-only. Dynamic parameter constraints will be able to represent +certain simple types of flexible matching, such as matching an exact +value or the existence of a key, and simple AND and OR combinations +of constraints. This flexible matching semantic means that there may be +ambiguities when determining which resources match which subscriptions, +which are discussed below. + #### Constraints Representation Dynamic parameter constraints will be represented in protobuf form as follows: @@ -177,16 +185,28 @@ message DynamicParameterConstraints { } ``` -#### Matching Behavior +#### Where Matching Is Performed -Note that both xDS servers and clients need to evaluate matching between -a set of dynamic parameters and a set of constraints. The server does -this when deciding which variant of a given resource to return for a -given subscription request. When the client receives the resource from +Both xDS servers and clients need to evaluate matching between a set +of dynamic parameters and a set of constraints. The server does this +when deciding which variant of a given resource to return for a given +subscription request. When the client receives the resource from the server, it needs to do the same matching to determine which of its subscriptions that resource is associated with.
Therefore, the matching behavior becomes an inherent part of the xDS transport protocol. +Note that because leaf clients should only ever receive a single variant +of a given resource, implementations may be tempted to not bother with +this matching on the client side. However, that is not true for caching +xDS proxies; a proxy may have multiple clients that request different +variants of the same resource. Because we do not want to get into a +situation where xDS servers typically do not populate dynamic parameter +constraints in their responses and then need changes to work with +caching proxies, this design requires that leaf clients validate that the +constraints in the response match the requested dynamic parameters. +This ensures that the wire protocol used by leaf clients and caching xDS +proxies remains the same. + (In effect, the resource cache in an xDS client is basically the same logic as that on an xDS server; the only difference is that in the case of a client, the resources in the cache come from an xDS stream instead @@ -194,14 +214,16 @@ of from an authoritative database. Similarly, a caching xDS proxy is simply an xDS client where the subscriptions come from an incoming xDS stream.) -For example, let's say that the clients are currently categorized by the -parameter `env`, whose value is either `prod` or `test`. So any given -client will send one of the following sets of dynamic parameters: +#### Example: Basic Dynamic Parameters Usage + +Let's say that the clients are currently categorized by the parameter +`env`, whose value is either `prod` or `test`. 
So any given client will +send one of the following sets of dynamic parameters: - `{env=prod}` - `{env=test}` -The resource variants on the server will have the following sets of dynamic -parameter constraints: +Now let's say that the server has two variants of a given resource, and +the variants have the following dynamic parameter constraints: ```textproto // For {env=prod} @@ -221,31 +243,66 @@ parameter constraints: ]} ``` -#### Matching Ambiguity - -Dynamic parameters, unlike context parameters, will not be -exact-match-only. Dynamic parameter constraints will be able to represent -certain simple types of flexible matching, such as matching an exact -value or the existance of a key, and simple AND and OR combinations -of constraints. This flexible matching semantic means that there may be -ambiguities when determining which resources match which subscriptions. -This section defines the matching behavior and a set of best practices for -deployments to follow to avoid this kind of ambiguity. - -To illustrate where this comes up in practice, it is useful to consider -what happens in transition scenarios, where a deployment initially +When a client subscribes to this resource with dynamic parameters +`{env=prod}`, the server will return the first variant; when a client +subscribes to this resource with dynamic parameters `{env=test}`, the +server will return the second variant. When the client receives the +returned resource, it will verify that the dynamic parameters it sent +match the constraints of the returned resource. + +#### Unconstrained Parameters + +Note that clients may send dynamic parameters that are not specified in +the constraints on the resulting resource. If a set of constraints does +not specify any constraint for a given parameter sent by the client, that +parameter does not prevent the constraints from matching. This allows +clients to add new parameters before a server begins using them. 
+(In general, we expect clients to send a lot of keys that may not +actually be used by the server, since deployments often divide their +clients into categories before they have a need to differentiate the +configs for those categories.) + +Continuing the example above, if the server wanted to send the same +contents for a given resource to both `{env=prod}` and `{env=test}` clients, +it would have only a single variant of that resource, and that variant would +not have any constraints. The server would therefore send that variant to +all clients, and the clients would consider it a match for the dynamic +parameters that they subscribed with. + +#### Example: Transition Scenarios + +Consider what happens in transition scenarios, where a deployment initially groups its clients on a single key but then wants to add a second key. The second key needs to be added both in the constraints on the server side and in the clients' configurations, but those two changes cannot occur atomically. -Consider the above example where the clients are already divided into +Let's start with the above example where the clients are already divided into `env=prod` and `env=test`. Let's say that now the deployment wants to add an additional key called `version`, whose value will be either `v1` or `v2`, so that it can further subdivide its clients' configs. -If the new key is added on the server side first, then the server will -have resource variants with constraints like this: +The first step is to add the new key on the clients, so that any +given client will send one of the following sets of dynamic parameters: +- `{env=prod, version=v1}` +- `{env=prod, version=v2}` +- `{env=test, version=v1}` +- `{env=test, version=v2}` + +At this point, the server still does not have a variant of any resource +that has constraints for the `version` key; it has only variants that +differentiate between `env=prod` and `env=test`.
But the addition of +the new key on the clients will not affect which resource variant is +sent to each client, because it does not affect the matching. Clients +sending `{env=prod, version=v1}` or `{env=prod, version=v2}` will both get +the resource variant for `env=prod`, and clients sending +`{env=test, version=v1}` or `{env=test, version=v2}` will both get the +resource variant for `env=test`. + +Once the clients have all been updated to send the new key, then the +server can be updated to have different resource variants based on the +`version` key. For example, it may replace the single resource variant +for `env=prod` with the following two variants: ```textproto // For {env=prod, version=v1} @@ -271,60 +328,49 @@ have resource variants with constraints like this: ]} ``` -But at this point, the clients are continuing to subscribe without -specifying this new key. So the server or cache would not have any way -to know which of the above variants to use for a subscription specifying -`{env=prod}` but not specifying `version`. - -Conversely, if the new key is added on the clients first, then the clients -will start subscribing with dynamic parameters like the following: -- `{env=prod, version=v1}` -- `{env=prod, version=v2}` -- `{env=test, version=v1}` -- `{env=test, version=v2}` - -The server or cache has to match those sets of dynamic parameters against -the existing sets of dynamic parameter constraints, which do not specify the -`version` key at all. +Once that change happens on the server, the clients will start getting +the correct variant of the resource based on their `version` key. -We address this transition scenario by allowing a set of constraints -to match a set of dynamic parameters that includes a key that is not -specified by the constraints. This allows new keys to be added on -clients before the corresponding constraints are added on the resources, -which we expect to be the common case. 
(In general, we expect clients -to send a lot of keys that may not actually be used by the server, since -deployments often divide their clients into categories before they have -a need to differentiate the configs for those categories.) +#### Matching Ambiguity -As mentioned above, this approach does introduce the possibility of +As mentioned above, this design does introduce the possibility of matching ambiguity in certain cases, where there may be more than one variant of a resource that matches the dynamic parameters specified by -the client. If an xDS transport protocol implementation does encounter -multiple possible matching variants of a resource, its behavior is -undefined. In the following sections, we evaluate the cases where that -can occur and specify how each one will be addressed. +the client. + +If an xDS transport protocol implementation does encounter multiple +possible matching variants of a resource, its behavior is undefined. +In the following sections, we evaluate the cases where that can occur +and specify how each one will be addressed. ##### Adding a New Key on the Server First +Consider what would happen in the above transition scenario if we changed +the server to have multiple variants of a resource differentiated by +the new `version` key before all of the clients were upgraded to use +that key. For clients sending `{env=prod}`, there would be two possible +matching variants of the resource, one for `version=v1` and another for +`version=v2`, and there would be no way to determine which variant to +use for that client. + As stated above, we are optimizing for the case where new keys are added on clients first, since that is expected to be the common scenario. However, there may be cases where it is not feasible to have all clients start sending a new key before the server needs to start making use of that key. 
-For example, let's consider the same case as above, where the clients -are initially sending only the `env` key, and the server now wants to -introduce the `version` key. However, let's say that this is in an -environment where the xDS server is controlled by one team and the clients -are controlled by various other teams, so it's not feasible to force all -clients to start sending the new `version` key all at once. But there -is one particular client team that is eager to start using the new -`version` key to differentiate the configs of their clients, and they -don't want to wait for all of the other client teams to start sending -the new key. +For example, let's say that this transition scenario is occurring in +an environment where the xDS server is controlled by one team and the +clients are controlled by various other teams, so it's not feasible to +force all clients to start sending the new `version` key all at once. +But there is one particular client team that is eager to start using +the new `version` key to differentiate the configs of their clients, +and they don't want to wait for all of the other client teams to start +sending the new key. Consider what happens if the server simply adds a variant of the -resource with the new key: +resource with the new key, while leaving the original resource variant +in place: ```textproto // Existing variant for older clients that are not yet sending the @@ -356,12 +402,13 @@ variant's constraints. However, newer clients that are sending dynamic parameters `{env=prod, version=v1}` will run into ambiguity: those parameters can match either of the above variants of the resource. -This situation will be avoided by requiring that **all variants of a -given resource must specify constraints for the same set of keys**. +This situation will be avoided via a best practice that all authoritative +xDS servers should have **all variants of a given resource specify +constraints for the same set of keys**. 
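The ambiguity described here, and the best practice that avoids it, can be illustrated with a small sketch. This is only an illustration under simplifying assumptions, not the proposal's matching algorithm: each variant's constraints are modeled as a plain key-to-value dict that is implicitly ANDed, the `exists` matcher, `invert`, and OR combinations are not modeled, and a value constraint on a key that the client did not send is assumed not to match. The names `matches`, `matching_variants`, and `same_key_sets` are hypothetical.

```python
from typing import Dict, List

# Hypothetical simplification: constraints are key -> required value,
# implicitly ANDed. `exists`, `invert`, and OR are not modeled here.
Constraints = Dict[str, str]
Params = Dict[str, str]


def matches(constraints: Constraints, params: Params) -> bool:
    """Return True if every constrained key is present with the required value.

    Keys that the client sends but that no constraint mentions are
    ignored, so they never prevent a match.
    """
    return all(params.get(key) == value for key, value in constraints.items())


def matching_variants(variants: List[Constraints], params: Params) -> List[Constraints]:
    """Return every variant whose constraints match the given parameters."""
    return [c for c in variants if matches(c, params)]


def same_key_sets(variants: List[Constraints]) -> bool:
    """Check the best practice: all variants constrain the same set of keys."""
    key_sets = {frozenset(c) for c in variants}
    return len(key_sets) <= 1


# The existing variant, plus a newly added one that also constrains `version`.
variants = [
    {"env": "prod"},
    {"env": "prod", "version": "v1"},
]

old_client = {"env": "prod"}
new_client = {"env": "prod", "version": "v1"}

print(len(matching_variants(variants, old_client)))  # a single match
print(len(matching_variants(variants, new_client)))  # more than one: ambiguous
print(same_key_sets(variants))  # the two variants constrain different keys
```

A check along the lines of `same_key_sets` is the kind of validation a control plane could run before accepting a set of variants; how a real server would enforce it is outside what this sketch shows.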
-However, in order to make this work for the case where the server starts -sending the constraint on the new key before all clients are sending it, -we provide the `exists` matcher, which will allow the server to specify +In order to make this work for the case where the server starts sending +the constraint on the new key before all clients are sending it, we +provide the `exists` matcher, which will allow the server to specify a default explicitly for clients that are not yet sending a new key. In this example, the server would actually have the following two variants: @@ -431,10 +478,10 @@ Now consider what happens if a client subscribes with dynamic parameters `{env=test}`. Those dynamic parameters can match either of the above variants of the resource. -This situation will be avoided by requiring that **all variants of a given -resource must specify non-overlapping constraints for the same set of keys**. -Control planes must not accept a set of resources that violates this -requirement. +This situation will be avoided via a best practice that all authoritative +xDS servers should have **all variants of a given resource specify +non-overlapping constraints for the same set of keys**. Control planes +must not accept a set of resources that violates this requirement. #### Matching Behavior and Best Practices @@ -508,7 +555,29 @@ server to return the dynamic parameters associated with each resource: DynamicParameterConstraints dynamic_parameter_constraints = 8; ``` -### Migrating From Node Metadata +### Client Configuration + +Client configuration is outside of the scope of this design. However, +this section lists some considerations for client implementors to take +into account. + +#### Configuring Dynamic Parameters + +Each leaf client should have a way of configuring the dynamic parameters +that it sends. 
+ +For old-style resource names (those not using the new `xdstp` URI +scheme from [xRFC TP1](TP1-xds-transport-next.md)), clients should +send the same set of dynamic parameters for all resource subscriptions. +The client's configuration should allow setting these default dynamic +parameters globally. + +For new-style resource names, clients should send the same set of +dynamic parameters for all resource subscriptions in a given authority. +The client's configuration should allow setting the dynamic parameters to +use for each authority. + +#### Migrating From Node Metadata Today, the equivalent of dynamic parameter constraints is node metadata, which can be used by servers to determine the set of resources to send @@ -516,9 +585,7 @@ for LDS and CDS wildcard subscriptions or to determine the contents of other resources (e.g., to select individual routes to be included in an RDS resource). For transition purposes, this mechanism can continue to be supported by the client performing direct translation of node -metadata to exact-match constraints. For example, if the node metadata -contains the entry `env=prod`, this would be translated to a constraint -`{key_constraints:[{key:"env" value:{constraints:[{value:"prod"}]}}]}`. +metadata to dynamic parameters. Any given xDS client may support either or both of these mechanisms. From cb05b8cfd73b02814de782bf16a882f76f5035ca Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Thu, 21 Oct 2021 23:12:34 +0000 Subject: [PATCH 10/14] remove lingering reference to sharded endpoint use-case Signed-off-by: Mark D.
Roth --- ...mically-generated-cacheable-xds-resources.md | 17 ++++++----------- 1 file changed, 6 insertions(+), 11 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index ea9c96dd..557a23a1 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -87,17 +87,12 @@ observation that resource names are used in two places: example, a `RouteConfiguration` refers to individual `Cluster` resources by name. -The use-cases for dynamic resource selection share one important property -that we can take advantage of. When multiple variants of a given resource -exist, any given client will only ever use one of those variants at a -given time. That means that the parameters that affect which variant -of the resource is used are required by the transport protocol, but -they are not required by the client's data model. (For example, in the -"sharding endpoints for scalability" use-case, different clients may see -different variants of the EDS resource, but once a given client has the -right variant, it will be unique on that client, which means that the -CDS resource does not need to refer to different EDS resource names on -a different client.) +The use-cases that we're aware of for dynamic resource selection have +an important property that we can take advantage of. When multiple +variants of a given resource exist, any given client will only ever use +one of those variants at a given time. That means that the parameters +that affect which variant of the resource is used are required by the +transport protocol, but they are not required by the client's data model. 
It should be noted that caching xDS proxies, unlike "leaf" clients, will need to track multiple variants of each resource, since a given caching From 4856c1993d240ee6d483284ca609a448c29ea736 Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Tue, 7 Dec 2021 17:49:47 +0000 Subject: [PATCH 11/14] review comments Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 260 +++++++++++++++--- 1 file changed, 226 insertions(+), 34 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 557a23a1..618c8c7c 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -121,7 +121,8 @@ part of the resource graph. When a client subscribes to a resource, it specifies a set of dynamic parameters. In response, the server will send a resource whose dynamic parameter constraints match the dynamic parameters in the subscription -request. The client will use the dynamic parameter constraints on the +request. A client that subscribes to multiple variants of a resource (such +as a caching xDS proxy) will use the dynamic parameter constraints on the returned resource to determine which of its subscriptions the resource is associated with. @@ -180,34 +181,139 @@ message DynamicParameterConstraints { } ``` -#### Where Matching Is Performed - -Both xDS servers and clients need to evaluate matching between a set -of dynamic parameters and a set of constraints. The server does this -when deciding which variant of a given resource to return for a given -subscription request. When the client receives the resource from -the server, it needs to do the same matching to determine which of its -subscriptions that resource is associated with. Therefore, the matching -behavior becomes an inherent part of the xDS transport protocol. 
- -Note that because leaf clients should only ever receive a single variant -of a given resource, implementations may be tempted to not bother with -this matching on the client side. However, that is not true for caching -xDS proxies; a proxy may have multiple clients that request different -variants of the same resource. Because we do not want to get into a -situation where xDS servers typically do not populate dynamic parameter -constraints in their responses and then need changes to work with -caching proxies, this design requires that leaf clients validate that the -constraints in the response match the requested dynamic parameters. -This ensures that the wire protocol used by leaf clients and caching xDS -proxies remains the same. - -(In effect, the resource cache in an xDS client is basically the same -logic as that on an xDS server; the only difference is that in the case -of a client, the resources in the cache come from an xDS stream instead -of from an authoritative database. Similarly, a caching xDS proxy is -simply an xDS client where the subscriptions come from an incoming xDS -stream.) +#### Background: xDS Client and Server Architecture + +Before discussing where dynamic parameter matching is performed, it is +useful to provide some additional background on xDS client and server +architecture, independent of this design. + +The xDS transport protocol is fundamentally a mechanism that matches up +subscriptions provided by a client with resources provided by a server. +The client controls what it is subscribing to at any given time, +and the server must send the resources from its database that match the +currently active subscriptions. + +An xDS server may be thought of as containing a database of resources, +in which each resource has an associated list of clients that are currently +subscribed to that resource. 
Whenever a client subscribes to a resource, +the server will send the current version of that resource to the client, +and it will add the client to the list of clients currently subscribed to +that resource. Whenever the server receives a new version of that resource +in its database, it will send the update to all clients that are currently +subscribed to that resource. Whenever a client unsubscribes from a +resource, it is removed from the list of clients subscribed to that +resource, so that the server knows not to send it subsequent updates for +that resource. + +This same paradigm of matching up subscriptions with resources actually +applies to the xDS client as well. Because the xDS transport protocol +does not require a server to resend a resource unless its contents have +changed, clients need to cache the most recently seen value locally in +case they need it again. In general, the best way to structure an xDS +transport protocol client is as an API where the caller can start or +stop subscribing to a given resource at any time, and the xDS client will +handle the wire-level communication and cache the resources returned by +the server. The cache in the xDS client functions very similarly to the +database in an xDS server: each cache entry contains the current value +of the resource received from the xDS server and a list of subscribers to +that resource. When the xDS client sees the first subscription start for +a given resource, it will create the cache entry for that resource, add +the subscriber to the list of subscribers for that resource, and request +that resource from the xDS server. When it receives the resource from +the server, it will store the resource in the cache entry and deliver +it to all subscribers. When the xDS client sees a second subscription +start for the same resource, it will add the new subscriber to the list +of subscribers for that resource and immediately deliver the cached value +of the resource to the new subscriber. 
Whenever the server sends an +updated version of the resource, the xDS client will deliver the update +to all subscribers. When all subscriptions are stopped, the xDS client +will unsubscribe from the resource on the wire, so that the xDS server +knows to stop sending updates for that resource to the client. + +In effect, the logic in an xDS client is essentially the same as that in an +xDS server, with only two differences. First, subscriptions come from local +API callers instead of downstream RPC clients. And second, the database does +not contain the authoritative source of the resource contents but rather cached +values obtained from the server, and the database entries are removed when +the last subscription for a given resource is stopped. + +The logic in a caching xDS proxy is also essentially the same as that in an xDS +server, with only one difference. Just like an xDS client, the database +does not contain the authoritative source of the resource contents but +rather cached values obtained from the server. However, like an xDS +server, subscriptions do come from downstream RPC clients rather than local +API callers. + +The following table summarizes this structure: + + + + + + + + + + + + + + + + + + + + + + + + + + + +
| xDS Node Type | Source of Subscriptions | Source of Resource Contents |
| --- | --- | --- |
| xDS Server | downstream xDS clients | authoritative data |
| xDS Client | local API callers | cached data from upstream xDS server |
| xDS Caching Proxy | downstream xDS clients | cached data from upstream xDS server |
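
The client-side cache behavior described in this section can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the names `XdsClientCache`, `Watcher`, and `ResourceKey` are hypothetical, resource contents are modeled as plain strings, and the wire-level requests are left to the caller, which learns from the return values when a subscribe or unsubscribe needs to be sent to the server.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// All names here are hypothetical, chosen for illustration only.
// A cache key is the pair (resource type, resource name).
using ResourceKey = std::pair<std::string, std::string>;

// A local API caller that wants to be told about resource updates.
struct Watcher {
  std::vector<std::string> received;  // Values delivered so far.
  void OnUpdate(const std::string& value) { received.push_back(value); }
};

class XdsClientCache {
 public:
  // Starts a subscription. Returns true if this is the first subscriber
  // for the resource, in which case the caller must request the resource
  // from the xDS server.
  bool Subscribe(const ResourceKey& key, Watcher* watcher) {
    assert(watcher != nullptr);
    CacheEntry& entry = cache_[key];
    const bool first = entry.watchers.empty();
    entry.watchers.insert(watcher);
    // Later subscribers get the cached value immediately, if there is one.
    if (entry.value.has_value()) watcher->OnUpdate(*entry.value);
    return first;
  }

  // Stops a subscription. Returns true if this was the last subscriber,
  // in which case the cache entry is dropped and the caller must
  // unsubscribe from the resource on the wire.
  bool Unsubscribe(const ResourceKey& key, Watcher* watcher) {
    auto it = cache_.find(key);
    if (it == cache_.end()) return false;
    it->second.watchers.erase(watcher);
    if (!it->second.watchers.empty()) return false;
    cache_.erase(it);
    return true;
  }

  // Called when the xDS server sends a new value for a resource.
  void OnResourceFromServer(const ResourceKey& key, std::string value) {
    auto it = cache_.find(key);
    if (it == cache_.end()) return;  // No longer subscribed; ignore.
    it->second.value = std::move(value);
    for (Watcher* w : it->second.watchers) w->OnUpdate(*it->second.value);
  }

 private:
  struct CacheEntry {
    std::optional<std::string> value;  // Most recent value from the server.
    std::set<Watcher*> watchers;       // Current subscribers.
  };
  std::map<ResourceKey, CacheEntry> cache_;
};
```

An xDS server or caching proxy could be structured the same way, with downstream xDS clients taking the place of the `Watcher` objects.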
    + +#### Where Dynamic Parameter Matching is Performed + +Because of the architecture described above, evaluation of matching between +a set of dynamic parameters and a set of constraints may need to be +performed by both xDS servers and xDS clients. + +xDS servers that support multiple variants of a resource perform this +matching when deciding which variant of a given resource to return for a +given subscription request. xDS servers that support multiple variants of +a resource MUST send the dynamic parameter constraints associated with a +resource variant to the client along with that variant. Any server +implementation that fails to do so is in violation of this specification. + +xDS caching proxies that support multiple variants of a resource also +perform this matching when deciding which variant of a given resource to +return for a given subscription request. Caching proxies MUST store the +dynamic parameter constraints obtained from the upstream server along with +each resource variant, which they will use when deciding which variant of a +given resource to return for a given subscription request from a downstream +xDS client. Caching proxies MUST send those dynamic parameter constraints to +the downstream client when sending that variant of the resource. + +Note this design assumes that a given leaf client will use a fixed set of +dynamic parameters, typically configured in a local bootstrap file, for all +subscriptions over its lifetime. Given that, it is not strictly necessary +for a leaf client to perform this matching, since it should only ever +receive a single variant of a given resource, which should always match the +dynamic parameters it subscribed with. However, clients MAY perform this +matching, which may be useful in cases where the same cache implementation +is used on both a leaf client and a caching proxy. + +It is important to note that the dynamic parameter matching behavior becomes +an inherent part of the xDS transport protocol. 
xDS servers that interact +only with leaf clients may be tempted not to send dynamic parameter +constraints to the client along with the chosen resource variant, and +leaf clients may accept that. However, as soon as that server wants to +start interacting with a caching proxy or a client that does verify the +constraints, it will run into problems. xDS server implementors are +strongly encouraged not to omit the dynamic parameter constraints in their +responses. #### Example: Basic Dynamic Parameters Usage @@ -257,7 +363,7 @@ actually be used by the server, since deployments often divide their clients into categories before they have a need to differentiate the configs for those categories.) -Continuing the example above, if the server wanted to sent the same +Continuing the example above, if the server wanted to send the same contents for a given resource to both `{env=prod}` and `{env=test}` clients, it would have only a single variant of that resource, and that variant would not have any constraints. The server would therefore send that variant to @@ -277,8 +383,8 @@ Let's start with the above example where the clients are already divided into an additional key called `version`, whose value will be either `v1` or `v2`, so that it can further subdivide its clients' configs. -The first step is to add the new key on the clients first, so that any -given client will send one of the following sets of dynamic parameters: +The first step is to add the new key on the clients, so that any given client +will send one of the following sets of dynamic parameters: - `{env=prod, version=v1}` - `{env=prod, version=v2}` - `{env=test, version=v1}` @@ -326,6 +432,14 @@ for `env=prod` with the following two variants: Once that change happens on the server, the clients will start getting the correct variant of the resource based on their `version` key. 
+Note that in order to avoid causing matching ambiguity, the server must +handle this kind of change by sending the deletion of the original resource +variant and the creation of the replacement resource variants in a +single xDS response. This will allow the client to atomically apply the +change to its database. For any given subscriber, the client should +present the change as if there was only one variant of the resource and +that variant had just been updated. + #### Matching Ambiguity As mentioned above, this design does introduce the possibility of @@ -584,11 +698,89 @@ metadata to dynamic parameters. Any given xDS client may support either or both of these mechanisms. +### Considerations for Implementations + +This specification does not prescribe implementation details for xDS +clients or servers. However, for illustration purposes, this section +describes how a naive implementation might be structured. + +The database of an xDS server or cache of an xDS client can be thought +of as a map, keyed by resource type and resource name. Prior to this +specification, the value of the map would have been the current value of the +resource and a list of subscribers that need to be updated when the +resource changes. In C++ syntax, the data structure might look like this: + +```c++ +// Represents a subscriber (either a downstream xDS client or a local API caller). +class Subscriber { + public: + // ... +}; + +struct DatabaseEntry { + // Current contents of resource. + // Whenever this changes, the change will be sent to all subscribers. + std::optional resource_contents; + + // Current list of subscribers. + // Entries are added and removed as subscriptions are started and stopped. + std::set subscribers; +}; + +using Database = + std::map; +``` + +This design does not change the key structure of the map, but it does +change the structure of the value of the map. 
In particular, instead of +storing a single value for the resource contents, it will need to store +multiple values, keyed by the associated dynamic parameter constraints. +And for each subscriber, it will need to store the dynamic parameters that +the subscriber specified. In a naive implementation (not optimized at all), +the modified data structure may look like this: + +```c++ +// Represents a subscriber (either a downstream xDS client or a local API caller). +class Subscriber { + public: + // ... + + // Returns the dynamic parameters specified for the subscription. + DynamicParameters dynamic_parameters() const; +}; + +struct DatabaseEntry { + // Resource contents for each variant of the resource, keyed by + // dynamic parameter constraints. + // Whenever a given variant of the resource changes, the change will be + // sent to all subscribers whose dynamic parameters match the constraints + // of the resource variant that changed. + std::map> resource_contents; + + // Current list of subscribers. + // Entries are added and removed as subscriptions are started and stopped. + std::set subscribers; +}; +``` + +When a variant of a resource is updated, the variant is stored in the map +based on its dynamic parameter constraints. The implementation will then +iterate through the list of subscribers, sending the updated resource +variant and its dynamic parameter constraints to each subscriber whose +dynamic parameters match those constraints. + +A more optimized implementation may instead choose to store a separate list +of subscribers for each resource variant, thus avoiding the need to perform +matching for every subscriber upon every update of a resource variant. +However, this would require moving subscribers from one variant to another +whenever the dynamic parameters change on the resource variants. 
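
To make the matching step of the naive implementation more concrete, the following sketch evaluates a nested constraint expression against a subscriber's dynamic parameters and delivers an updated variant to each matching subscriber. The `Constraints` struct is a hypothetical in-memory mirror of the `DynamicParameterConstraints` proto rather than generated protobuf code, `Subscriber` is a simplified stand-in for the class shown above, and resource contents are modeled as strings. The treatment of an empty AND list (matches everything) and an empty OR list (matches nothing) is an assumption that this specification does not state.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Dynamic parameters sent with a subscription, e.g. {env=prod, version=v1}.
using DynamicParameters = std::map<std::string, std::string>;

// Hypothetical in-memory mirror of the DynamicParameterConstraints proto.
struct Constraints {
  enum class Kind { kValue, kExists, kAnd, kOr, kNot };
  Kind kind;
  std::string key;                    // Used by kValue and kExists.
  std::string value;                  // Used by kValue.
  std::vector<Constraints> children;  // Used by kAnd, kOr, and kNot.
};

// Convenience constructors for the five node kinds.
Constraints Value(std::string key, std::string value) {
  return Constraints{Constraints::Kind::kValue, std::move(key),
                     std::move(value), {}};
}
Constraints Exists(std::string key) {
  return Constraints{Constraints::Kind::kExists, std::move(key), "", {}};
}
Constraints And(std::vector<Constraints> children) {
  return Constraints{Constraints::Kind::kAnd, "", "", std::move(children)};
}
Constraints Or(std::vector<Constraints> children) {
  return Constraints{Constraints::Kind::kOr, "", "", std::move(children)};
}
Constraints Not(Constraints child) {
  return Constraints{Constraints::Kind::kNot, "", "", {std::move(child)}};
}

// Returns true if params satisfy the constraint expression.
// Assumption: an empty AND matches everything; an empty OR matches nothing.
bool Matches(const Constraints& c, const DynamicParameters& params) {
  switch (c.kind) {
    case Constraints::Kind::kValue: {
      auto it = params.find(c.key);
      return it != params.end() && it->second == c.value;
    }
    case Constraints::Kind::kExists:
      return params.count(c.key) > 0;
    case Constraints::Kind::kAnd:
      for (const Constraints& child : c.children) {
        if (!Matches(child, params)) return false;
      }
      return true;
    case Constraints::Kind::kOr:
      for (const Constraints& child : c.children) {
        if (Matches(child, params)) return true;
      }
      return false;
    case Constraints::Kind::kNot:
      // NOT nodes built with Not() always have exactly one child.
      assert(c.children.size() == 1);
      return !Matches(c.children.front(), params);
  }
  return false;
}

// Simplified stand-in for the Subscriber class in the text above.
struct Subscriber {
  DynamicParameters dynamic_parameters;
  std::vector<std::string> received;  // Resource contents delivered so far.
};

// Delivers an updated variant to every subscriber whose dynamic parameters
// match the variant's constraints. Returns the number of deliveries.
int DeliverUpdate(const Constraints& constraints, const std::string& contents,
                  std::vector<Subscriber>& subscribers) {
  int delivered = 0;
  for (Subscriber& s : subscribers) {
    if (!Matches(constraints, s.dynamic_parameters)) continue;
    s.received.push_back(contents);
    ++delivered;
  }
  return delivered;
}
```

For example, a variant built as `And({Value("env", "prod"), Not(Exists("version"))})` would be delivered only to subscribers that send `env=prod` without a `version` key. The more optimized per-variant subscriber lists mentioned above would replace the loop in `DeliverUpdate` with a lookup.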
+ ### Example This section shows how the mechanism described in this proposal can be -used to address each the use-case described in the "Background" -section above. +used to address the use-case described in the "Background" section above. Let's say that every client uses two different dynamic selection parameters, `env` (which can have one of the values `prod`, `canary`, From 1f8409932a9ede103d0ff57a4030de40f60c6fcd Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Fri, 28 Jan 2022 19:38:01 +0000 Subject: [PATCH 12/14] add a way for the server to remove a specific variant of a resource Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 34 ++++++++++++++++--- 1 file changed, 30 insertions(+), 4 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 618c8c7c..6ac3feb3 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -654,14 +654,40 @@ Similarly, the following fields will be added to `DeltaDiscoveryRequest`: repeated ResourceLocator resource_locators_unsubscribe = 9; ``` +The following message will be added to represent the name of a specific +variant of a resource: + +```proto +// Specifies a concrete resource name. +message ResourceName { + // The name of the resource. + string name = 1; + + // Dynamic parameter constraints associated with this resource. To be used by + // client-side caches (including xDS proxies) when matching subscribed + // resource locators. + DynamicParameterConstraints dynamic_parameter_constraints = 2; +} +``` + The following field will be added to the `Resource` message, to allow the server to return the dynamic parameters associated with each resource: ```proto - // Dynamic parameter constraints associated with this resource. 
To be used - // by client-side caches (including xDS proxies) when matching subscribed - // resource locators. - DynamicParameterConstraints dynamic_parameter_constraints = 8; + // Alternative to the *name* field, to be used when the server supports + // multiple variants of the named resource that are differentiated by + // dynamic parameter constraints. + // Only one of *name* or *resource_name* may be set. + ResourceName resource_name = 8; +``` + +And finally, the following field will be added to `DeltaDiscoveryResponse`: + +```proto + // Alternative to removed_resources that allows specifying which variant of + // a resource is being removed. This variant must be used for any resource + // for which dynamic parameter constraints were sent to the client. + repeated ResourceName removed_resource_names = 8; ``` ### Client Configuration From 8b00b7e6b9747ec22254edb3425a7ea15d6c718a Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Wed, 9 Feb 2022 16:41:21 +0000 Subject: [PATCH 13/14] update wording Signed-off-by: Mark D. Roth --- ...2-dynamically-generated-cacheable-xds-resources.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 6ac3feb3..9e3b4212 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -3,7 +3,7 @@ TP2: Dynamically Generated Cacheable xDS Resources * Author(s): markdroth, htuch * Approver: htuch * Implemented in: -* Last updated: 2021-10-14 +* Last updated: 2022-02-09 ## Abstract @@ -109,10 +109,11 @@ data structures: client when subscribing to a resource. - **Dynamic parameter constraints**, which are a set of criteria that can be used to determine whether a set of dynamic parameters matches - the constraints. 
These constraints are part of the cache key for an - xDS resource (in addition to the resource name itself) on xDS servers, - xDS clients, and xDS caching proxies. This provides a mechanism to - represent multiple variants of a given resource in a cacheable way. + the constraints. These constraints are considered part of the unique + identifier for an xDS resource (along with the resource name itself) + on xDS servers, xDS clients, and xDS caching proxies. This provides a + mechanism to represent multiple variants of a given resource in a + cacheable way. Both of these data structures are used in the xDS transport protocol, but they are not part of the resource name and therefore do not appear as From fc060e4babceba0cc80513f456dbb371a8c1bceb Mon Sep 17 00:00:00 2001 From: "Mark D. Roth" Date: Thu, 21 Apr 2022 00:02:34 +0000 Subject: [PATCH 14/14] allow arbitrary nesting of AND, OR, and NOT expressions Signed-off-by: Mark D. Roth --- ...cally-generated-cacheable-xds-resources.md | 228 +++++++----------- 1 file changed, 89 insertions(+), 139 deletions(-) diff --git a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md index 9e3b4212..7b9b79b1 100644 --- a/proposals/TP2-dynamically-generated-cacheable-xds-resources.md +++ b/proposals/TP2-dynamically-generated-cacheable-xds-resources.md @@ -141,44 +141,37 @@ Dynamic parameter constraints will be represented in protobuf form as follows: ```proto message DynamicParameterConstraints { - // A list of constraints that may be combined with AND or OR semantics. - message ConstraintList { - // A constraint for a given key. - message Constraint { - message Exists {} - // The key to match against. - string key = 1; - // How to match. - oneof constraint_type { - // Matches this exact value. - string value = 2; - // Key is present (matches any value except for the key being absent). 
- Exists exists = 3; - } - // If set to true, the match is inverted -- i.e., the key must NOT - // match the specified value. - bool invert = 4; + // A single constraint for a given key. + message SingleConstraint { + message Exists {} + // The key to match against. + string key = 1; + // How to match. + oneof constraint_type { + // Matches this exact value. + string value = 2; + // Key is present (matches any value except for the key being absent). + Exists exists = 3; } + } - enum MatchType { - // Default value. - MATCH_TYPE_UNSPECIFIED = 0; - // Logical AND of constraints. - MATCH_TYPE_AND = 1; - // Logical OR of constraints. - MATCH_TYPE_OR = 2; - } + message ConstraintList { + repeated DynamicParameterConstraints constraints = 1; + } - // A list of key/value constraints. - repeated Constraint constraints = 1; + oneof type { + // A single constraint to evaluate. + SingleConstraint constraint = 1; - // How to match the constraints. - MatchType match_type = 2; - } + // A list of constraints to be ORed together. + ConstraintList or_constraints = 2; - // A list of constraint lists. All constraint lists must match (i.e., - // logical AND semantics). - repeated ConstraintList constraints = 1; + // A list of constraints to be ANDed together. + ConstraintList and_constraints = 3; + + // The inverse (NOT) of a set of constraints. 
+ DynamicParameterConstraints not_constraints = 4; + } } ``` @@ -329,20 +322,10 @@ the variants have the following dynamic parameter constraints: ```textproto // For {env=prod} -{constraints:[ - { - constraints:[{key:"env" value:"prod"}] - match_type: MATCH_TYPE_AND - } -]} +{constraint:{key:"env" value:"prod"}} // For {env=test} -{constraints:[ - { - constraints:[{key:"env" value:"test"}] - match_type: MATCH_TYPE_AND - } -]} +{constraint:{key:"env" value:"test"}} ``` When a client subscribes to this resource with dynamic parameters @@ -408,25 +391,15 @@ for `env=prod` with the following two variants: ```textproto // For {env=prod, version=v1} -{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" value:"v1"} - ] - match_type: MATCH_TYPE_AND - } +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {constraint:{key:"version" value:"v1"}} ]} // For {env=prod, version=v2} -{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" value:"v2"} - ] - match_type: MATCH_TYPE_AND - } +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {constraint:{key:"version" value:"v2"}} ]} ``` @@ -485,24 +458,12 @@ in place: ```textproto // Existing variant for older clients that are not yet sending the // version key. -{constraints:[ - { - constraints:[ - {key:"env" value:"prod"} - ] - match_type: MATCH_TYPE_AND - } -]} +{constraint:{key:"env" value:"prod"}} // New variant intended for clients sending the version key. -{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" value:"v1"} - ] - match_type: MATCH_TYPE_AND - } +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {constraint:{key:"version" value:"v1"}} ]} ``` @@ -526,25 +487,17 @@ variants: ```textproto // Existing variant for older clients that are not yet sending the // version key. 
-{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" exists:{} invert:true} - ] - match_type: MATCH_TYPE_AND +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {not_constraints: + {constraint:{key:"version" exists:{}}} } ]} // New variant for clients sending the version key. -{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" value:"v1"} - ] - match_type: MATCH_TYPE_AND - } +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {constraint:{key:"version" value:"v1"}} ]} ``` @@ -562,25 +515,15 @@ that a server has the following two variants of a resource: ```textproto // Matches {env=prod} or {env=test}. -{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"env" value:"test"} - ] - match_type: MATCH_TYPE_OR - } +{or_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {constraint:{key:"env" value:"test"}} ]} // Matches {env=qa} or {env=test}. -{constraints:[ - { - constraints:[ - {key:"env" value:"qa"}, - {key:"env" value:"test"} - ] - match_type: MATCH_TYPE_OR - } +{or_constraints:[ + {constraint:{key:"env" value:"qa"}}, + {constraint:{key:"env" value:"test"}} ]} ``` @@ -830,13 +773,12 @@ to get the appropriate one:
-{constraints:[ - { - constraints:[ - {key:"env" value:"prod" invert:true}, - {key:"version" value:"v1" invert:true} - ] - match_type: MATCH_TYPE_AND +{and_constraints:[ + {not_constraints: + {constraint:{key:"env" value:"prod"}} + }, + {not_constraints: + {constraint:{key:"version" value:"v1"}} } ]}
-{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" value:"v1" invert:true} - ] - match_type: MATCH_TYPE_AND +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {not_constraints: + {constraint:{key:"version" value:"v1"}} } ]}
-{constraints:[ - { - constraints:[ - {key:"env" value:"prod" invert:true}, - {key:"version" value:"v1"} - ] - match_type: MATCH_TYPE_AND - } +{and_constraints:[ + {not_constraints: + {constraint:{key:"env" value:"prod"}} + }, + {constraint:{key:"version" value:"v1"}} ]} @@ -890,14 +826,9 @@ to get the appropriate one:
-{constraints:[ - { - constraints:[ - {key:"env" value:"prod"}, - {key:"version" value:"v1"} - ] - match_type: MATCH_TYPE_AND - } +{and_constraints:[ + {constraint:{key:"env" value:"prod"}}, + {constraint:{key:"version" value:"v1"}} ]} @@ -921,12 +852,31 @@ One limitation of this design is that, because all xDS transport protocol implementations (clients, servers, and caching proxies) need to implement this matching behavior, it will be very difficult to add new matching behavior in the future. Doing so will probably require some sort of -client capability. +client capability. This will make it feasible to expand this mechanism +in an environment where all of the caching xDS proxies are under centralized +control, but it will be quite difficult to deploy those changes in +environments that depend on distributed third-party caching xDS proxies. Because of this, reviewers of this design are encouraged to carefully scrutinize the proposed matching semantics to ensure that they meet our expected needs. +### Complexity of Constraint Expressions + +Although the `DynamicParameterConstraints` proto allows specifying +arbitrarily nested combinations of AND, OR, and NOT expressions, control +planes do not need to actually support that full arbitrary power. It is +possible to limit the sets of supported constraints to (e.g.) a +simple flat list of AND or OR expressions, which would make it easier +for a control plane to optimize its implementation. + +Similarly, caching xDS proxies may be able to provide an optimized +implementation if all of the constraints that they see are limited to +some subset of the full flexibility allowed by the protocol. However, +any general-purpose caching proxy implementation will likely need to +support a less optimized implementation that does support the full +flexibility allowed by the protocol. + ### Using Context Parameters We considered extending the context parameter mechanism from [xRFC