
Need care with hashing protobufs #6252

Closed
kyessenov opened this issue Mar 11, 2019 · 8 comments

kyessenov commented Mar 11, 2019

It appears Envoy hashes protobuf config structures to determine whether a config has changed. This is a problem because configs are placed into typed_config fields as Any, and protobuf specifically does not guarantee deterministic serialization. The main source of non-determinism is map ordering.

If the control plane repeatedly serializes and pushes the same configs without any change, Envoy might incorrectly detect a change and drain/reload listeners, causing a significant CPU spike and drained connections.

For Go, the solution is to force the control plane to serialize maps in a stable manner. C++ has a global SetDefaultSerializationDeterministic. I assume the same applies to Python and Java, but that needs to be checked against the protobuf libraries for those languages.
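As an illustration of the Go side of this, here is a minimal sketch of stable map serialization using the Deterministic option of the google.golang.org/protobuf API (the library and message type used here are assumptions for the example, not the control plane's actual code):

```go
package main

import (
	"bytes"
	"fmt"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// A message containing a map field; map iteration order is the usual
	// source of non-deterministic protobuf output.
	msg, err := structpb.NewStruct(map[string]interface{}{
		"a": 1, "b": 2, "c": 3,
	})
	if err != nil {
		panic(err)
	}

	// Deterministic marshaling sorts map entries by key, so repeatedly
	// serializing an unchanged message yields byte-identical output
	// (within the same protobuf library version).
	opts := proto.MarshalOptions{Deterministic: true}
	b1, _ := opts.Marshal(msg)
	b2, _ := opts.Marshal(msg)
	fmt.Println(bytes.Equal(b1, b2)) // true
}
```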

kyessenov (author) commented:
/cc @snowp

htuch pushed a commit that referenced this issue Mar 11, 2019
Using proto.MarshalAny results in unstable output due to non-deterministic map ordering. This in turn causes Envoy's diff to reload a config since the hash of the structure changes.
Enable stable marshaler for gogoproto to avoid this problem. See #6252
Risk Level: low
Testing: n/a

Signed-off-by: Kuat Yessenov <kuat@google.com>
lizan commented Mar 11, 2019

FYI protocolbuffers/protobuf#5731

mattklein123 commented:
@lizan @htuch I think we need a holistic approach to solving this problem as this keeps coming up in various forms. Any thoughts here? Do we need someone to take a look at this and think about a general solution?

mandarjog commented Mar 12, 2019

Map ordering is the most prominent source of non-determinism. However, there are many other reasons why the binary encodings of two messages can differ even though the messages are the same: for example, fields can be encoded out of order, and varint encoding allows zero extensions.

I think for the API to be clear we could have a "sha" field as a peer of the Any. That way the onus is on the caller to supply it. At present it is a hidden contract of the API.
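As a rough sketch of this idea (the type and field names below are hypothetical, not part of the xDS API), the caller that serializes the config would also compute and attach the hash, so the consumer compares caller-supplied hashes instead of re-hashing bytes it did not produce:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashedResource is a hypothetical wrapper illustrating the proposal: the
// party that serializes the config also supplies its hash as a peer field.
type hashedResource struct {
	TypedConfig []byte // serialized Any payload
	Sha         string // hex-encoded SHA-256 of TypedConfig, set by the caller
}

func newHashedResource(serialized []byte) hashedResource {
	sum := sha256.Sum256(serialized)
	return hashedResource{TypedConfig: serialized, Sha: hex.EncodeToString(sum[:])}
}

func main() {
	r1 := newHashedResource([]byte("serialized-config-bytes"))
	r2 := newHashedResource([]byte("serialized-config-bytes"))
	// The consumer compares the supplied hashes; it never re-serializes
	// the message itself, so encoding differences cannot cause a spurious diff.
	fmt.Println(r1.Sha == r2.Sha) // true
}
```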

htuch commented Mar 12, 2019

@mandarjog arguably that just punts the problem to the management server. It is certainly in a position to compute the SHA, but it's going to be a lot of tedious work if it can't rely on the proto library for this. With resource-level versioning in incremental xDS (CC @fredlas), this might improve, since the management server then provides a per-resource "version" at each update, which can be used to avoid diffing.
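A minimal sketch of how a per-resource version could be used to skip diffing entirely (names are illustrative, not Envoy's actual update path):

```go
package main

import "fmt"

// applyIfChanged skips the update when the version string the management
// server attached to the resource is unchanged, so no hashing or diffing
// of the serialized config is needed.
func applyIfChanged(knownVersions map[string]string, name, version string, apply func()) {
	if knownVersions[name] == version {
		return // same version: no reload, no connection draining
	}
	knownVersions[name] = version
	apply()
}

func main() {
	known := map[string]string{}
	applyIfChanged(known, "listener-80", "v1", func() { fmt.Println("applied v1") })
	applyIfChanged(known, "listener-80", "v1", func() { fmt.Println("never printed") }) // skipped
	applyIfChanged(known, "listener-80", "v2", func() { fmt.Println("applied v2") })
}
```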

JimmyCYJ commented Mar 13, 2019

FYI, when Istio Pilot generates SDS config, we need to marshal FileBasedMetadataConfig and place it in a field (TypedConfig) in GrpcService_GoogleGrpc_CallCredentials. FileBasedMetadataConfig does not support deterministic marshaling, and forcing it to be marshaled deterministically yields an empty string. So the workaround I use is to marshal it once and store the marshaled string in a map. Subsequent SDS config generation just takes a copy of the marshaled string instead of re-marshaling FileBasedMetadataConfig.
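A minimal sketch of that marshal-once caching workaround (names are hypothetical, not the actual Istio Pilot code):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// marshalCache marshals a message at most once per key, caches the bytes,
// and hands out copies on later config pushes so the serialized form stays
// byte-identical across pushes.
type marshalCache struct {
	mu    sync.Mutex
	cache map[string][]byte
}

func newMarshalCache() *marshalCache {
	return &marshalCache{cache: make(map[string][]byte)}
}

// bytesFor returns the cached serialization for key, calling marshal at most once.
func (c *marshalCache) bytesFor(key string, marshal func() ([]byte, error)) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if b, ok := c.cache[key]; ok {
		return append([]byte(nil), b...), nil // copy, so callers cannot mutate the cache
	}
	b, err := marshal()
	if err != nil {
		return nil, err
	}
	c.cache[key] = b
	return append([]byte(nil), b...), nil
}

func main() {
	c := newMarshalCache()
	marshalOnce := func() ([]byte, error) { return []byte("marshaled FileBasedMetadataConfig"), nil }
	b1, _ := c.bytesFor("file-based-metadata", marshalOnce)
	b2, _ := c.bytesFor("file-based-metadata", marshalOnce)
	fmt.Println(bytes.Equal(b1, b2)) // true: second call reuses the cached bytes
}
```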

lizan commented Sep 25, 2019

Short term, this is fixed by #8231. Long-term discussion: #8301.

incfly commented Oct 2, 2019

/cc @incfly
