Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unknown fields in the configuration and new features policy #4053

Closed
PiotrSikora opened this issue Aug 4, 2018 · 17 comments · Fixed by #4096
Closed

Unknown fields in the configuration and new features policy #4053

PiotrSikora opened this issue Aug 4, 2018 · 17 comments · Fixed by #4096
Assignees
Labels
Milestone

Comments

@PiotrSikora
Copy link
Contributor

PiotrSikora commented Aug 4, 2018

Starting with #3778, Envoy ignores unknown fields in the configuration. This is enabled by default and without the ability to turn it off, which already lead to invalid configuration sneaking into tests (see: #4051).

Presumably, the reasoning behind that change was for proxies to be able to accept configuration produced by newer control planes, effectively ignoring new features they don't know about yet.

Unfortunately, this leads to Envoy silently accepting invalid (e.g. simple typo means that the setting is ignored and feature is not configured) and/or not working (e.g. new feature that's required for the deployment is silently ignored by older proxies, leading to unexpected behavior) configurations.

Having said that, we cannot rely on proxies and control planes always being in sync, and the drift can happen the other way around as well, i.e. having proxies with new features, without the ability to configure or even disable (!) them by older control planes, so we should have a solution for both scenarios.

Ultimately, having new features disabled by default and requiring explicit opt-in, might be the safest and easiest option to solve that. Though, it pushes the responsibility for configuring sane defaults onto control planes.

As for the unknown fields in the configuration, I think that control planes could send a list of fields that are safe to be ignored by the proxies.

Thoughts?

cc @mattklein123 @alyssawilk @htuch @ggreenway @kyessenov @rshriram

@mattklein123 mattklein123 added this to the 1.8.0 milestone Aug 4, 2018
@mattklein123
Copy link
Member

@kyessenov can you refresh our memory on the rationale behind the original change? FWIW, I saw that change go by and I thought about pushing back, but didn't. I probably should have.

IMO this creates a lot of confusion for configuration, and it's a problem we had a lot of in v1 prior to a strict JSON schema. People think they are setting options, but they have a typo, and then the option is ignored and they get the default. This leads to a lot of support queries.

My general feeling is that we should make the change to ignore unknown fields configurable on the CLI, and default to off. Then operators that want this functionality can turn it on via CLI? I think that is the simplest option.

@htuch
Copy link
Member

htuch commented Aug 5, 2018

I think the motivation was having consistency between proto behavior and JSON; unknown fields in proto are not automatically rejected (https://developers.google.com/protocol-buffers/docs/proto#updating).

I generally agree with the sentiment that we should make this more robust for users, but consistent behavior between proto and JSON/YAML config where possible is another consideration to throw in there.

@mattklein123
Copy link
Member

I generally agree with the sentiment that we should make this more robust for users, but consistent behavior between proto and JSON/YAML config where possible is another consideration to throw in there.

As I recall I though the issue here was more with how we handle struct/any encoding? @kyessenov? In any case, though I understand the difference between JSON/YAML vs. proto, IMO this creates a lot of extra confusion and I think we should make this configurable if people want both behaviors.

@ramaraochavali
Copy link
Contributor

FWIW, I ran in to this issue today where envoy silently accepted a typo in config and it took some time to figure that out.

@kyessenov
Copy link
Contributor

Sorry, I missed notification for this issue. I tend to agree with @mattklein123 that a CLI option for strict semantics is a good idea since it enforces validation. My position is that we should apply it consistently between protos and proto structs. I think it's reasonable to disallow unknown fields in all protos (bootstrap, and proto Any fields), as long as we match the behavior between filter configs and the rest of the config. As an author of filter config, it is unexpected that envoy rejects the config when a new field is added to the proto, since if we were to use a standard proto representation, this would not happen. It's not immediately obvious that there is this difference between structs and Any.

We can be more fine-grained, and customize unknown field behavior by the type of config if necessary.

@mattklein123
Copy link
Member

@kyessenov SGTM. Are you willing to work on this? I would suggest we make the default what it was before with a CLI option to opt-out of strict semantics?

@kyessenov
Copy link
Contributor

kyessenov commented Aug 7, 2018 via email

@htuch
Copy link
Member

htuch commented Aug 7, 2018

+1 on the general approach; as long as we have consistency between protobuf and JSON/YAML, I'm all good.

@lizan
Copy link
Member

lizan commented Aug 8, 2018

In addition to CLI option, perhaps also make it a flag in runtime configuration, so it can be changed without restarting?

@htuch
Copy link
Member

htuch commented Aug 9, 2018

@lizan what's the motivation for runtime support? It seems this is something you want to set once for the lifetime of an Envoy instance. For existing deployments, they will need to pickup the new binary and do a restart to be able to take advantage of it.

@lizan
Copy link
Member

lizan commented Aug 9, 2018

The motivation for runtime support is to let envoy accepts unknown fields only during control-plane upgrade. Control planes doesn't always track each Envoy instances version and may emit unknown fields to some of Envoy instances. After the Envoy instances pick up new binary, it can be set back to strict.

@kyessenov
Copy link
Contributor

@lizan if we can selectively enable runtime flag to allow unknown fields, then we can surely teach the control plane to do the same on the server side. I think this is mostly an issue for control planes that do not track versions of the proxies.

@lizan
Copy link
Member

lizan commented Aug 9, 2018

@kyessenov I'm not saying you would selectively enable this based on each instances. Flow like this:

  1. Initially the runtime flag could be set to strict when you deploy Envoy instances
  2. You make all of Envoy instances lenient when you deploy control plane.
  3. After deployed control plane, pick up new Envoy binary and deploy with strict.

A runtime flag for 2 without restarting, thoughts?

@lizan
Copy link
Member

lizan commented Aug 10, 2018

btw, recalled more of this context from protocolbuffers/protobuf#1704, this is more like proto-text vs protobuf binary. JSON is in the middle, could be both handwritten (which shouldn't be parsed leniently as it may contain typos), or machine produced (which could be parsed leniently for backward compatibility, typos are not there).

A couple more things I can think of:

  • Raise warnings when JSON/YAML parsing encountered an unknown field.
    Unfortunately protobuf doesn't have an interface to do this easiliy :(

  • Differentiate protobuf binary (machine produced) and JSON/YAML strictly.
    This is an unfortunate side effect of using Struct as an extensible config format, when parsing extensions everything is a JSON/YAML, thus we need a flag to control that. I don't want bikeshedding more on Any vs Struct, but it add a point to do oneof, to differentiate the two aspect.

@kyessenov
Copy link
Contributor

kyessenov commented Aug 10, 2018

I think the new flag seems to fit fine with the paradigm:

  • if the config is machine generated, then run with --allow-unknown-fields which applies lenient parsing;
  • if the config is hand-written (including xDS payloads), then don't set the flag; that helps novices and test harnesses;
  • a hybrid of human/machine config is not something we want to encourage, I think.

@lizan
Copy link
Member

lizan commented Aug 10, 2018

a hybrid of human/machine config is not something we want to encourage, I think.

But certainly we do have a use ;) https://istio.io/docs/reference/config/istio.networking.v1alpha3/#EnvoyFilter

htuch pushed a commit that referenced this issue Aug 12, 2018
 Add a flag allow-unknown-fields that controls whether proto conversion ignores unknown fields. Proto structs are uniformly used for extension configuration. A smaller fix than #4094.

The validation is applied in the following places:

Any conversion (e.g. xDS payloads)
Struct conversion (e.g. filter configs)
bootstrap proto binary
Risk Level: Medium. Allow unknown fields is set to false by default, meaning unknown fields will be rejected by envoy. A missing check for xDS type checking was discovered.

Testing: Unit tests were updated to have it strict by default. Some wrong filter configs were discovered.

Docs Changes: None

Release Notes:
Add a flag allow-unknown-fields to control validation of protobuf configurations for unknown fields.

Fixes #4053

Signed-off-by: Kuat Yessenov <kuat@google.com>
@htuch
Copy link
Member

htuch commented Mar 27, 2019

I think we might need to revisit this in light of #6271. If we want stable APIs, in the absence of version negotiation (#5405), we're going to need to allow control planes to set fields unknown to earlier Envoys.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants