Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: named filter config / filter config discovery service #7867

Closed
mattklein123 opened this issue Aug 8, 2019 · 46 comments · Fixed by #11826
Closed

api: named filter config / filter config discovery service #7867

mattklein123 opened this issue Aug 8, 2019 · 46 comments · Fixed by #11826
Labels
api/v3 Major version release @ end of Q3 2019 no stalebot Disables stalebot from closing an issue
Milestone

Comments

@mattklein123
Copy link
Member

I would like the ability to have individual filter configurations be named and fetchable from a discrete management server. This would allow federation of filter configuration from multiple sources, for example fault testing. Thought needs to be put into how this plays with FCDS for listener filters, draining, whether filters can swap config atomically via a new optional interface, etc.

If we do this right we should not need the TapDS IMO.

cc @envoyproxy/api-shepherds @htuch @yuval-k

@mattklein123 mattklein123 added api/v3 Major version release @ end of Q3 2019 no stalebot Disables stalebot from closing an issue labels Aug 8, 2019
@lambdai
Copy link
Contributor

lambdai commented Aug 8, 2019

individual configurations is the filter chain, correct?

@htuch
Copy link
Member

htuch commented Sep 23, 2019

@mattklein123 do you think this is needed for 1.12.0 or can we do this as a v3 add-on once shipped?

@mattklein123
Copy link
Member Author

@mattklein123 do you think this is needed for 1.12.0 or can we do this as a v3 add-on once shipped?

We can do it as an add on, but can we do it without any structural changes? I suppose we probably can by just having an optional config source on a filter config? I'm mostly just hoping to avoid deprecation. We might want to look at it briefly and at least put the existing static config in a oneof?

@htuch
Copy link
Member

htuch commented Sep 24, 2019

Sure, I think this aspect of the issue is manageable in the v3 release cycle.

@rshriram
Copy link
Member

adding another use case that I came across

Certain filters like RBAC etc. have the access control rules [something that changes dynamically] embedded inside the filter as static configuration. When access control rules change, we end up sending an update to the listeners with the updated filter config. This causes all existing connections to drain. A more efficient thing to do would be to reload the rbac filter config alone, and force the rbac filter instances to run some event handler [e.g., onConfigUpdate], wherein they can reevaluate existing connections to check if they still pass the rbac rules.

@mattklein123
Copy link
Member Author

@rshriram do you really need an event? I think this is possible but can't RBAC just run on new connections? If you want an event can we track as a separate issue?

@rshriram
Copy link
Member

So, think about HTTP connections.. in this case, new rbac would have to become effective on new requests on the same connection right? Otherwise, people will end up forcing listener drains for the newer rules to take effect. I suggested the event thing as just one idea but as long as there is an ability to apply the newer config on existing connections, I am good.

This would apply to faults as well wont it? like you could pick one h2 stream, introduce some faults and then remove those faults, to see how the client withstands these intermittent failures. Wont this be helpful in mobile use cases?

@mattklein123
Copy link
Member Author

This would apply to faults as well wont it? like you could pick one h2 stream, introduce some faults and then remove those faults, to see how the client withstands these intermittent failures. Wont this be helpful in mobile use cases?

We explicitly want this for fault testing. :)

Basically, at the network level, any new connection will get the new config. At the HTTP level, any new stream will get the new configuration. I think this is sufficient?

@lambdai
Copy link
Contributor

lambdai commented Oct 2, 2019

It's fairly easy for the new connection to get the new config.

The goal of new http stream get new configuration can be achieve by above if we drain listener/filter. But it's not smooth to achieve it without using new connection. I will start to think about it when the first step is done.

@mattklein123
Copy link
Member Author

The goal of new http stream get new configuration can be achieve by above if we drain listener/filter. But it's not smooth to achieve it without using new connection. I will start to think about it when the first step is done.

I think this is actually not that hard and we can do this without any draining. Essentially we just need to have a layer of indirection between the filter chain and the actual filter config, such that the filter config itself can be snapped by the connection when it is created. I'm very happy to work with whoever wants to implement this on the design. Note also that we need this at Lyft so if no one gets to it first @wgallagher is likely to implement.

@lambdai
Copy link
Contributor

lambdai commented Oct 7, 2019

Essentially we just need to have a layer of indirection between the filter chain and the actual filter config
Are you saying the filter chain from LDS doesn't mutate? Just do some extra calculation when applying to newly created http stream? This seems easier. We already support similar calculation when applying network filter at ActiveTcpListener::newConnection

What is the missing part?
Should the similar delayed calculation apply to http_conn_manager?
Support async applier by giving it a provider and provider is supposed to fetch the latest change?

@mattklein123
Copy link
Member Author

Are you saying the filter chain from LDS doesn't mutate? Just do some extra calculation when applying to newly created http stream?

Correct that's what I'm thinking.

What is the missing part? Should the similar delayed calculation apply to http_conn_manager? Support async applier by giving it a provider and provider is supposed to fetch the latest change?

I haven't thought it through fully (we will need to iterate on a design) but I think we move the the filter configuration into a oneof that either is defined inline statically or comes from a config source (cc @htuch because we want to move the static filter config into a oneof for v3 to future proof it). Once we have a config source, I think we can have the owner of the filter chain starting fetching for dynamic updates with warming, etc. and then when a new connection (for L4) or a new stream (for L7) is created, we can snap the correct config for the filter.

@htuch
Copy link
Member

htuch commented Oct 8, 2019

Yep, moving to a ConfigSource is aligned with what we want in UDPA. Let's promote to a oneof for v3.

@htuch htuch modified the milestones: 1.12.0, 1.13.0 Oct 16, 2019
@yxue
Copy link
Contributor

yxue commented Nov 2, 2019

@PiotrSikora @jplevyak I am wondering if we can use the filter config discovery service for the WASM. WASM can fetch the code from some management server instead of using inline code.

@kyessenov
Copy link
Contributor

Envoy position is that config is trusted and user supplied. Shipping code through config server may compromise the security of the data plane. There is also a high-risk of head-of-line blocking for large (>10MB Wasm modules) in ADS.

@htuch
Copy link
Member

htuch commented Nov 17, 2019

@kyessenov I think we don't want to be too prescriptive here; folks can use some of these features with caveats around not pushing large updates inline. It's no different shipping 10MB+ WASM or 10k+ filter chains, each with a 1KB certificate inline.

@kyessenov
Copy link
Contributor

If we do end-up using this for Wasm, it would be great to be able to add filters, not just update existing filter configs. Maybe using some sort of stubs if Wasm filter is not yet needed (e.g. empty filter config should be allowed).

@kyessenov
Copy link
Contributor

Just copying thoughts from the linked issue:

  1. Can we please make filter config payload be a list of filters? It's more flexible, doesn't restrict to a single filter type, and allows empty lists.
  2. Consider changing how filters are ordered. Maybe it's better to have a weighted map of filters that are sorted before application? That way we can insert the filter config at any place.

@htuch
Copy link
Member

htuch commented Dec 5, 2019

Why a list when the set of resources return by xDS is already a list? I worry that a list of lists is a bit confusing to reconcile.

@kyessenov
Copy link
Contributor

List is used to 1) enforce ordering of filters in a "patch", 2) allow flexible number of filters to be returned. Think of Wasm filters. We can either add 10 dynamic filter configs with some unset, or have one filter config group with 0..10 Wasm filters in it.

@htuch
Copy link
Member

htuch commented Dec 5, 2019

I think this is a conflation of concerns. We have one concern that is managing which filters are in the filter chain and their relative order, and another that is how individual filters are configured. Ideally we can provide config independently for each of these, and potentially federate the responsibilities to distinct parties.

I would like to see a filter config discover service that just supplies config for a named filter. The existing Filter Chain Discovery Service is providing the list of filters and relative order. If that needs to be further decomposed, I think a distinct design proposal is needed.

@mandarjog
Copy link
Contributor

+1 for fail-open and fail-closed, we have been discussing this in the WASM context. For example if a telemetry WASM filter fails to load you would still want to declare success and move on (and increment a metric for bad configs).

Using a URL fetcher to load WASM code has the same issue. The listener will go in "warming" state, with no way to notify the control plane.
So the ask here specifically is to be able send back a NACK after a listener enters warming state, or after config is accepted from FCDS. Is this possible ?

@mattklein123
Copy link
Member Author

So the ask here specifically is to be able send back a NACK after a listener enters warming state, or after config is accepted from FCDS. Is this possible ?

This is not possible today and will require a bunch of thinking to figure out the right way to do it, but it's certainly possible and something we should do. In the near term, the hacky way of dealing with this is to just delete the listener with a stat, and then have it re-added in the next go around. Though, this won't work with incremental.

@mandarjog
Copy link
Contributor

Ok, since ETA on async NACK is unclear I think it is better to use “local file” load when using WASM so that bad wasm module or “module not found” can be rejected synchronously with a NACK. Wdyt @mattklein123 ?
I think this is generally true for any other interaction of this type.

@mattklein123
Copy link
Member Author

Wdyt @mattklein123 ? I think this is generally true for any other interaction of this type.

I think it really depends on the deployment. IMO fail open with a default config will work in many cases, especially since with eventually consistency the config should eventually get loaded later when things are fixed.

@mattklein123 mattklein123 modified the milestones: 1.13.0, 1.14.0 Dec 26, 2019
@mattklein123
Copy link
Member Author

I'm moving this out to 1.14.0, but we still need to move the static filter config into a oneof for v3. cc @lizan @htuch to track the automation around this.

@htuch
Copy link
Member

htuch commented Dec 27, 2019

@mattklein123 looking at filter config, today, we have:

  // Filter specific configuration which depends on the filter being
  // instantiated. See the supported filters for further documentation.
  oneof config_type {
    google.protobuf.Struct config = 2 [deprecated = true];

    google.protobuf.Any typed_config = 4;
  }

Do we need a new oneof, or could the dynamic ConfigSource just live under config_type?

@mattklein123
Copy link
Member Author

Do we need a new oneof, or could the dynamic ConfigSource just live under config_type?

Oops yeah sorry I forget we already have a oneof here from the config -> typed_config conversion. Yeah this oneof should be fine. Sorry for the runaround.

@mattklein123 mattklein123 modified the milestones: 1.14.0, 1.15.0 Mar 18, 2020
@mattklein123 mattklein123 modified the milestones: 1.15.0, 1.16.0 Jun 24, 2020
htuch pushed a commit that referenced this issue Jun 25, 2020
Define filter config discovery. Add FDS for HTTP filters (HTTP extensions is where the pain is felt the most). Modelled after RDS with a twist of config override for re-use.

Risk Level: low (not implemented)
Testing:
Docs Changes:
Release Notes:

Issue: #7867

Signed-off-by: Kuat Yessenov <kuat@google.com>
songhu pushed a commit to songhu/envoy that referenced this issue Jun 25, 2020
Define filter config discovery. Add FDS for HTTP filters (HTTP extensions is where the pain is felt the most). Modelled after RDS with a twist of config override for re-use.

Risk Level: low (not implemented)
Testing:
Docs Changes:
Release Notes:

Issue: envoyproxy#7867

Signed-off-by: Kuat Yessenov <kuat@google.com>
yashwant121 pushed a commit to yashwant121/envoy that referenced this issue Jul 24, 2020
Define filter config discovery. Add FDS for HTTP filters (HTTP extensions is where the pain is felt the most). Modelled after RDS with a twist of config override for re-use.

Risk Level: low (not implemented)
Testing:
Docs Changes:
Release Notes:

Issue: envoyproxy#7867

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: yashwant121 <yadavyashwant36@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/v3 Major version release @ end of Q3 2019 no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants