downstream circuit breakers #373
Comments
Can you elaborate on phase 2, please? Circuit breakers as such make sense when calling dependencies. At the downstream level, it's more of a handshaking pattern (assuming the other end is Envoy). If not, it boils down to throttling/load shedding, doesn't it?
This is primarily for Envoys that are acting as edge proxies. It's a defense-in-depth measure to help with too many incoming connections. The two-phase approach allows Envoy to degrade gracefully in phase 1 before switching to "shed load as fast as possible" in phase 2.
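As a rough illustration of the two-phase idea, the sketch below maps a connection count to a phase. It is not Envoy code: the names `Phase`, `select_phase`, `soft_limit`, and `hard_limit` are hypothetical, and the thread does not say which signals or thresholds actually trigger each phase.

```python
from enum import Enum


class Phase(Enum):
    NORMAL = "normal"      # accept connections as usual
    GRACEFUL = "graceful"  # phase 1: degrade gracefully
    SHED = "shed"          # phase 2: shed load as fast as possible


def select_phase(active_connections: int, soft_limit: int, hard_limit: int) -> Phase:
    """Pick a phase from a connection count (thresholds are assumptions)."""
    if soft_limit > hard_limit:
        raise ValueError("soft_limit must not exceed hard_limit")
    if active_connections >= hard_limit:
        return Phase.SHED
    if active_connections >= soft_limit:
        return Phase.GRACEFUL
    return Phase.NORMAL
```

For example, with `soft_limit=100` and `hard_limit=200`, 150 active connections would fall into the graceful phase and 200 into the shedding phase.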
@dnoe assigning to you.
We are facing the 'too many open files' problem on an edge proxy, which is the same problem described in this issue. We have built a gRPC service mesh with Envoy in our system, and we use a front Envoy (as the edge) to route client-side gRPC requests. How do you handle this scenario in your setup? Do you use some other h2 proxy, e.g. nginx? I would prefer a single, unified solution, though.
Ah, a third phase of this would be to simply stop accepting new incoming connections when over [a configured file descriptor limit?]. It's drastic, but better than crashing when under DoS attack, and will likely be implemented as part of this issue.
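A minimal sketch of that third phase, assuming a simple counter of open descriptors compared against a configured limit. `AcceptGate` and its methods are hypothetical names for illustration; how Envoy would actually track descriptors is not specified in this thread.

```python
class AcceptGate:
    """Refuses new connections once a configured descriptor limit is reached."""

    def __init__(self, fd_limit: int) -> None:
        if fd_limit <= 0:
            raise ValueError("fd_limit must be positive")
        self.fd_limit = fd_limit
        self.open_fds = 0

    def try_accept(self) -> bool:
        # At or over the limit: stop accepting instead of risking a crash.
        if self.open_fds >= self.fd_limit:
            return False
        self.open_fds += 1
        return True

    def release(self) -> None:
        # Called when a connection closes and its descriptor is freed.
        if self.open_fds > 0:
            self.open_fds -= 1
```

The gate starts accepting again as soon as a connection is released, so the refusal is temporary rather than a permanent shutdown.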
@anticpp I will say that incoming connections can generally be controlled pretty well by having appropriate upstream circuit breakers, downstream idle timeouts, and a high enough FD limit for the process. With that said, I would love to make progress on this issue, but it hasn't been resourced yet.
I believe our team plans to pick it up in Q2.
Here's a draft design doc for an "Overload Manager" component for Envoy that would help address this: https://docs.google.com/document/d/1NXHAYibd6N7B4PVrukhACqzNK7WUZHWDSil9maeG1c8/edit?usp=sharing Comments welcome!
LGTM; left some minor comments on the doc but the overall design looks very solid.
+1 this looks great.
to be used by the overload manager (issue envoyproxy#373) Signed-off-by: Elisha Ziskind <eziskind@google.com>
Add an extensible resource monitor framework for monitoring resource "pressures" (usage/limit). This will be used by the overload manager to implement downstream circuit breaking (issue #373 - see design doc linked from there).
Risk Level: low (not yet used in envoy main)
Signed-off-by: Elisha Ziskind <eziskind@google.com>
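To make "pressure (usage/limit)" concrete, here is a small sketch of what a pluggable monitor could look like. `ResourceMonitor`, `read_usage`, and `collect_pressures` are hypothetical names, not the interface from the commit, and clamping the result to the range 0 to 1 is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass
class ResourceMonitor:
    name: str
    read_usage: Callable[[], float]  # returns current usage, e.g. bytes
    limit: float                     # configured maximum for the resource

    def pressure(self) -> float:
        """Return usage/limit, clamped to [0.0, 1.0] (clamping is assumed)."""
        if self.limit <= 0:
            raise ValueError("limit must be positive")
        return min(max(self.read_usage() / self.limit, 0.0), 1.0)


def collect_pressures(monitors: Iterable[ResourceMonitor]) -> Dict[str, float]:
    """Poll every registered monitor once and report pressure by name."""
    return {m.name: m.pressure() for m in monitors}
```

A monitor reading 512 units of usage against a limit of 1024 would report a pressure of 0.5; new resource types plug in by supplying a different `read_usage` callable.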
@eziskind I added stats/docs open items to the issue description for tracking.
Initialize on startup and add documentation (issue #373)
Risk Level: low
Testing: unit tests
Docs Changes: add docs for overload manager
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Remaining item is to implement overload actions in the connection manager.
Lookups in this cache can be used as an alternative to registering a callback for overload action state changes. Useful for objects (like the http connection manager) with dynamic lifetimes that are created after envoy initialization - currently callback registration must be done during initialization and there isn't support for unregistration. Those restrictions could be relaxed if we need to in the future but for now this keeps things simpler. For issue #373.
Risk Level: low
Testing: unit tests
Signed-off-by: Elisha Ziskind <eziskind@google.com>
Add an overload action in the http connection manager to immediately close new streams in case of envoy overload (issue #373). Signed-off-by: Elisha Ziskind <eziskind@google.com>
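The two commits above (a state cache that can be looked up at any time, and a connection manager action that closes new streams) can be sketched together as follows. This is only an illustration under assumed names: `OverloadActionCache`, `ConnectionManager`, and the action string are hypothetical and do not mirror Envoy's real classes.

```python
from typing import Dict

# Hypothetical action name used only in this sketch.
STOP_ACCEPTING_REQUESTS = "stop_accepting_requests"


class OverloadActionCache:
    """Holds the latest state of each overload action for on-demand lookup."""

    def __init__(self) -> None:
        self._active: Dict[str, bool] = {}

    def set_state(self, action: str, active: bool) -> None:
        # Written by the overload manager whenever an action changes state.
        self._active[action] = active

    def is_active(self, action: str) -> bool:
        # Unknown actions are treated as inactive.
        return self._active.get(action, False)


class ConnectionManager:
    """Created after startup, so it polls the cache instead of registering a callback."""

    def __init__(self, cache: OverloadActionCache) -> None:
        self._cache = cache

    def on_new_stream(self) -> str:
        if self._cache.is_active(STOP_ACCEPTING_REQUESTS):
            return "closed"  # close the new stream immediately under overload
        return "accepted"
```

Because the connection manager only reads the cache when a stream arrives, it needs no registration or unregistration step, which is the simplification the commit message describes.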
@eziskind should we close this issue as complete and track further work in new issues? I think the main feature is done?
@mattklein123 I'm working on an overload action to stop accepting new network connections, mentioned in this comment. After that I think we can call this done.
Per previous comment, calling this done. We can track further overload actions in new issues.
…roxy#373) * Changed the type of init_service_configs and added rollout_id * Renamed to service_config_rollout. Added getter and setter of current_rollout_id_
Add foreign function support to Wasm.
…-filter zh-translation:docs/root/configuration/other_protocols/thrift_filters…
Adds 2 new jobs that build the Swift and Objective-C demo apps on CI. Note that the `Envoy.framework` artifact has to be zipped during upload and unzipped during download since the upload/download GitHub actions seem to use the _contents_ of `Envoy.framework` instead of the whole folder (and thus incorrectly end up with the contents of `Envoy.framework` in `dist/` after downloading instead of an `Envoy.framework`). These will be used to validate that the demos continue to work properly. Signed-off-by: Michael Rebello <me@michaelrebello.com> Signed-off-by: JP Simard <jp@jpsim.com>
2 different levels:
Remaining items: