Resource-based circuit breaking #3332

Open
danielhochman opened this issue May 9, 2018 · 4 comments
Labels: enhancement (Feature requests. Not bugs or questions.), help wanted (Needs help!)

Comments

@danielhochman
Contributor

Description

Envoy should have the ability to circuit break on system resources like CPU.

Circuit breakers at ingress are used to protect our hosts from resource exhaustion. To determine circuit breaker thresholds, we run a "redline" test, which steadily ramps up traffic on a single host until it degrades. We note rq_active at that point, then set the threshold to that value less some buffer.
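The threshold calculation described above can be sketched as follows. This is only an illustration: the function name, the use of a latency limit as the definition of "degraded", and the 25% buffer are all assumptions, not part of the procedure described in this issue.

```python
def redline_threshold(samples, latency_limit_ms, buffer_fraction=0.25):
    """Derive an rq_active threshold from a redline ramp.

    samples: (rq_active, latency_ms) pairs in the order the ramp produced them.
    latency_limit_ms: hypothetical definition of "degraded" for this sketch.
    buffer_fraction: safety margin subtracted from the last healthy value.
    """
    last_healthy = None
    for rq_active, latency_ms in samples:
        if latency_ms > latency_limit_ms:
            break  # the host has degraded; stop at the previous step
        last_healthy = rq_active
    if last_healthy is None:
        raise ValueError("host was already degraded at the first ramp step")
    return int(last_healthy * (1 - buffer_fraction))


# Illustrative ramp: latency stays flat, then jumps at 80 active requests.
ramp = [(10, 20), (20, 25), (40, 30), (80, 250)]
print(redline_threshold(ramp, latency_limit_ms=100))  # prints 30
```

The result is a fixed concurrency number, which is why it only tracks the real bottleneck for as long as the load profile measured during the test still holds.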

Over time, this ends up being a poor approximation of the real bottleneck for most of our services, which is CPU:

  • If any of the service's dependencies slow down, a single host can handle additional concurrency without exhausting its local resources.
  • If a service (the local service or a downstream) ships code that changes the overall load profile and/or the overall request mix, the circuit breaker may no longer protect the service, or it may trip too early.
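The first bullet follows from Little's law: average concurrency equals request rate times average latency. A small worked example, with made-up numbers and a hypothetical threshold, shows how a slower dependency can push rq_active past a fixed limit even though the request rate has not changed:

```python
def expected_rq_active(requests_per_second, latency_ms):
    """Average in-flight requests, by Little's law (rate x time in system)."""
    return requests_per_second * latency_ms / 1000


THRESHOLD = 30  # hypothetical rq_active limit taken from a redline test

healthy = expected_rq_active(200, 50)   # dependency healthy
slow = expected_rq_active(200, 200)     # dependency 4x slower, same request rate

print(healthy, healthy > THRESHOLD)  # prints 10.0 False
print(slow, slow > THRESHOLD)        # prints 40.0 True
```

If the extra latency is spent waiting on the dependency rather than computing, local CPU cost per request is roughly unchanged, yet the concurrency-based breaker trips.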

Working out the platform-dependent implementation and the algorithm will be the fun part. I'd like to get a first impression from other users before getting into that.

danielhochman added the enhancement label May 9, 2018
@mattklein123
Member

IMO this is best done as a dedicated filter, as the circuit breaking is a bit different from what we currently do and I think it's pretty self-contained. I think I would make this a general resource-based ingress circuit breaking filter that could eventually be extended to memory and other things. As long as we have the right platform abstractions for getting the information we need, I think this sounds like a very useful feature to add.
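To make the shape of such a filter concrete, here is a minimal sketch in Python. It is not Envoy's filter API: `ResourceMonitor`, `ResourceCircuitBreaker`, the threshold values, and the choice of a 503 response are all hypothetical names and assumptions used for illustration. The platform abstraction is reduced to a callable that reports usage as a fraction.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical platform abstraction: a monitor is any callable that returns
# the current usage of one resource as a fraction between 0.0 and 1.0.
ResourceMonitor = Callable[[], float]


@dataclass
class ResourceCircuitBreaker:
    monitors: Dict[str, ResourceMonitor]   # e.g. {"cpu": ..., "memory": ...}
    thresholds: Dict[str, float]           # per-resource trip points

    def tripped_resource(self):
        """Return the name of the first resource at or over its threshold, or None."""
        for name, read_usage in self.monitors.items():
            if read_usage() >= self.thresholds[name]:
                return name
        return None

    def handle(self, request, next_handler):
        """Reject while any resource is over threshold; otherwise pass through."""
        resource = self.tripped_resource()
        if resource is not None:
            return 503, f"circuit open: {resource}"
        return next_handler(request)


# Usage with fake monitors standing in for real platform readings.
usage = {"cpu": 0.5}
breaker = ResourceCircuitBreaker(
    monitors={"cpu": lambda: usage["cpu"], "memory": lambda: 0.3},
    thresholds={"cpu": 0.9, "memory": 0.8},
)
print(breaker.handle("GET /", lambda req: (200, "ok")))  # prints (200, 'ok')
```

Keeping the monitors behind a single callable type is what would let memory or other resources be added later without touching the filter logic.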

mattklein123 added this to the 1.7.0 milestone May 11, 2018
mattklein123 modified the milestones: 1.7.0, 1.8.0 May 28, 2018
@alyssawilk
Contributor

I think we may also want to tie this in to the centralized system for #373. I can imagine hitting some threshold (event loop time?) at which we simply stop accepting new requests so we can make forward progress on existing ones.

@stale

stale bot commented Jul 11, 2018

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

stale bot added the stale label Jul 11, 2018
danielhochman added the help wanted label Jul 11, 2018
stale bot removed the stale label Jul 11, 2018
mattklein123 modified the milestones: 1.8.0, 1.9.0 Sep 21, 2018
mattklein123 removed this from the 1.9.0 milestone Oct 5, 2018
@eightnoteight

I came across this issue and wondered about the implementation in the context of containers, where Envoy runs as a sidecar in an ECS task or a k8s pod.

One way to do this would be to share the process namespace, so that Envoy can track the CPU usage of the other container using /proc/{main-container-pid}/root/sys/fs/cgroup/cpuacct/cpuacct.usage_all

But this solution places too many requirements on the end user: sharing the process namespace, running Envoy as root, and allowing Envoy to access the entire disk of the main container.

Is there any other easy and secure way to track system resources in a container-based setup?
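For reference, the sampling step in the approach above could look roughly like this. The sketch assumes the cgroup v1 layout of cpuacct.usage_all: a header row followed by one row per CPU holding cumulative user and system time in nanoseconds. The function names are hypothetical, and the sketch does not address the access and security concerns raised in this comment.

```python
def parse_usage_all(text):
    """Sum user + system nanoseconds across all CPUs.

    Assumes cpuacct.usage_all-style text: a "cpu user system" header row,
    then one "<cpu> <user_ns> <system_ns>" row per CPU.
    """
    total_ns = 0
    for line in text.strip().splitlines()[1:]:  # skip the header row
        _cpu, user_ns, system_ns = line.split()
        total_ns += int(user_ns) + int(system_ns)
    return total_ns


def cpu_utilization(before_ns, after_ns, interval_s, num_cpus):
    """Fraction of available CPU time consumed between two samples."""
    return (after_ns - before_ns) / (interval_s * 1e9 * num_cpus)


# Two illustrative samples taken one second apart on a 2-CPU host.
first = "cpu user system\n0 100 50\n1 200 50\n"
second = "cpu user system\n0 600000100 50\n1 400000200 50\n"

used = cpu_utilization(parse_usage_all(first), parse_usage_all(second), 1, 2)
print(used)  # prints 0.5
```

In a real deployment the two strings would come from reading the file twice, which is exactly the step that requires the shared namespace and elevated access.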
