Add support for N levels of failover endpoints #1929
I would like to use Envoy to load balance long-lived TCP connections under the following setup:

Say I have two machines, m1 and m2. I would like all traffic to go to m1. Traffic should go to m2 only if m1 fails (either active or passive health check), or if m1 exceeds a certain number of connections defined per host.

This is similar to the "first" load balancer in HAProxy: https://cbonte.github.io/haproxy-dconv/1.7/configuration.html#balance
So I did a quick PoC to see what it would take to add this feature. Here are the changes (Envoy changes: master...surki:load_balancer_standby) and the corresponding config file:
I'm not sure whether the above changes are in line with general Envoy architecture/good practice. A few questions:
PS: We are also planning to add a MySQL filter for exposing interesting stats (just as other filters such as the DynamoDB one do). I see that there is experimental Lua support; I may try that out for MySQL protocol parsing/stats extraction.
Some quick questions:
ID: I need a way to deterministically pick hosts from first to last. Since DNS resolution happens asynchronously for all hosts concurrently, the values populated in LoadBalancerBase::host_set_ end up in random order. Is there another way to achieve the ordering? (I need to choose hosts in the exact order specified in the JSON config file.)

max_connections on a per-host basis: I thought it would be more flexible, since different hosts can have different throughput/capacity, etc. We can move max_connections up to the cluster level if you think host-level granularity won't be needed (personally I do not need the per-host level, though I could see a use case for others).
I think there is a general question here as to whether this should be a new load balancing policy, or actually built directly into the tcp_proxy code. For example, one could imagine adding the concept of a fallback cluster to tcp_proxy routing, and then having rules on when the fallback is actually used. I think @rshriram and @ggreenway had some thoughts on this. In some ways it fits a bit better this way vs. a discrete balancing policy.

If we go with the LB built-in approach, I understand the need for ID. It seems unfortunate to have to add this to the Address structure, but I'm not sure how else to deal with this without doing a breaking change, which we can't do at this point. @htuch any thoughts here?

Re: max connections per host, unless there is a specific need I would probably add this as a property of the cluster, where we already have max_requests_per_connection.

I would like to get some additional thoughts on whether we should do this as a new LB type vs. building it into tcp_proxy. I think I am leaning towards doing what you have done here, but would like some other opinions.
I wish we could somehow use the max connections settings from the circuit breaker. This use case seems a bit niche, to create a special load balancer (if there are other scenarios where this bin-packing style LB is needed, I would love to hear them).

Even in this use case, what would you do if you have multiple slaves? It seems that the only requirement here is to use the master after the slaves are filled up, and among the slaves it doesn't matter how connections are assigned. Is this assumption correct? If so, we could imagine creating some form of fallback cluster option that contains the MySQL master, while the slaves are in the main cluster. The fallback option seems like it would get more mileage across multiple use cases. You can set the circuit breaker in the main cluster to the total number of connections acceptable in the slaves, and failing that, the fallback cluster will be picked.

For the MySQL filter, I suggest doing it in C++, for performance reasons and the fact that Lua filters for TCP are not here yet. You could take a look at the Mongo filter for reference.

Another idea to satisfy the particular MySQL use case you have (assuming typical MySQL installations have the same needs): create the MySQL filter very much like the Redis filter. You can have first-class support for master and slave clusters in the filter. You will have control over when to fall back to the master.
On 10/26/2017 04:53 AM, Matt Klein wrote:

> I think there is a general question here as to whether this should be a new load balancing policy, or actually built directly into the tcp_proxy code. For example, one could imagine adding the concept of a fallback cluster to tcp_proxy routing, and then having rules on when the fallback is actually used. I think @rshriram and @ggreenway had some thoughts on this. In some ways it fits a bit better this way vs. a discrete balancing policy.
>
> If we go with the LB built-in approach, I understand the need for ID. It seems unfortunate to have to add this to the Address structure, but I'm not sure how else to deal with this without doing a breaking change, which we can't do at this point. @htuch any thoughts here?

Yes, ID is the only remaining part (we can move out or drop max_connections; please see below), unless I can find another way to get the hosts deterministically sorted.

> Re: max connections per host, unless there is a specific need I would probably add this as a property of the cluster, where we already have max_requests_per_connection.

That should be fine. Or we can probably drop max_connections altogether; it can be just true standby load balancing (i.e., fail over only on active/passive health check failure). I was only trying to match our equivalent HAProxy config, for no good reason.

> I would like to get some additional thoughts on whether we should do this as a new LB type vs. building it into tcp_proxy. I think I am leaning towards doing what you have done here, but would like some other opinions.
On 10/26/2017 08:20 AM, Shriram Rajagopalan wrote:

> I wish we could somehow use the max connections settings from the circuit breaker. This use case seems a bit niche, to create a special load balancer (if there are other scenarios where this bin-packing style LB is needed, I would love to hear them).

Thinking about this a bit more, we probably don't need max_connections (please see the reply above). All I want is: "if m1 fails, switch over to m2; if m2 fails, switch over to m3," etc.

> Even in this use case, what would you do if you have multiple slaves? It seems that the only requirement here is to use the master after the slaves are filled up, and among the slaves it doesn't matter how connections are assigned. Is this assumption correct? If so, we could imagine creating some form of fallback cluster option that contains the MySQL master, while the slaves are in the main cluster. The fallback option seems like it would get more mileage across multiple use cases. You can set the circuit breaker in the main cluster to the total number of connections acceptable in the slaves, and failing that, the fallback cluster will be picked.

There can be more than two machines. I am okay either way: LB or some kind of cluster fallback/tcp_proxy routing. The LB seemed simple enough to implement (and it could mimic our current HAProxy setup), so I could get to the other interesting bit (the MySQL filter).

> For the MySQL filter, I suggest doing it in C++, for performance reasons and the fact that Lua filters for TCP are not here yet. You could take a look at the Mongo filter for reference.

Agreed, C++ could be the right choice. I wasn't sure how easy it would be to maintain an out-of-tree Envoy filter, so I thought Lua could simplify things. Looking at the other filters, the filter API is very nice/simple and shouldn't be too difficult to maintain against.

> Another idea to satisfy the particular MySQL use case you have (assuming typical MySQL installations have the same needs): create the MySQL filter very much like the Redis filter. You can have first-class support for master and slave clusters in the filter. You will have control over when to fall back to the master.

Interesting. Unfortunately, for us the app currently makes this decision (I didn't show the full config above; we have multiple clusters with different combinations, like ClusterA: "m1, m2, m3", ClusterB: "m2, m1, m3", etc.). But that is something that could be moved to a filter at some point down the road.
If you don't need max connections, let's just drop that. Also, I just looked at the code, and the hosts actually should remain ordered for static/strict_dns clusters. For strict_dns this is true only if DNS for each "host" returns a single IP, which I'm guessing is your use case: https://github.com/envoyproxy/envoy/blob/master/source/common/upstream/upstream_impl.cc#L483

Assuming the above is OK for you, I think this is pretty simple: just write a new LB which assumes hosts are in order and always sends to the first; if that fails, it sends to the next, etc. Would that work?
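A minimal sketch of what such an ordered-failover ("first") LB could look like, assuming the hosts arrive already in config order; the class and field names here are illustrative, not Envoy's actual API:

```cpp
#include <memory>
#include <vector>

// Illustrative stand-in for an Envoy upstream host; not the real type.
struct Host {
  bool healthy = true;
};
using HostPtr = std::shared_ptr<Host>;

// Ordered-failover LB sketch: always pick the first healthy host in config
// order, falling through to the next host when earlier ones are unhealthy.
class OrderedFailoverLoadBalancer {
public:
  explicit OrderedFailoverLoadBalancer(std::vector<HostPtr> hosts)
      : hosts_(std::move(hosts)) {}

  // Returns the first healthy host, or nullptr if no host is healthy.
  HostPtr chooseHost() const {
    for (const auto& host : hosts_) {
      if (host->healthy) {
        return host;
      }
    }
    return nullptr;
  }

private:
  const std::vector<HostPtr> hosts_;
};
```

With hosts {m1, m2, m3}, all picks go to m1 until it fails a health check, then to m2, and so on, which matches the HAProxy "first" behavior described in the issue.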
If you are going to do a MySQL filter and are willing to go through the code review process, etc., please just contribute it! I think there are many people who would love a MySQL filter.
The more general solution is priority groups (at least that's what F5 called it). Basically, you have multiple groups/pools of machines, and the groups are priority-ordered: you try to pick a host from the first group; if that fails, you try the next group, etc. To represent that in Envoy, I think it makes more sense to put it on the cluster/LB side than in the tcp_proxy. This could also be applicable to HTTP easily enough. I see two ways of doing it that could make sense:

1. Attach a priority to each endpoint (e.g., via metadata) and have the LB prefer higher-priority endpoints.
2. A wrapper cluster and a wrapper LB that operate on sub-clusters, failing over between them.
Agreed with @ggreenway that this would be a nice general solution. I like (2). It's somewhat similar/related to what @zuercher did with the subset LB.
You mean (1)? Using metadata is (1): you could have reserved metadata keywords that define priority.

On Thu, Oct 26, 2017 at 6:24 PM, Matt Klein wrote:

> Agreed with @ggreenway that this would be a nice general solution. I like (2). It's somewhat similar/related to what @zuercher did with the subset LB.
Yes, I guess we could use metadata for (1), though that does not work with DNS, which I think is going to be quite common for this use case. Personally I kind of like (2), a wrapper cluster and a wrapper LB which operate on sub-clusters. There are several ways to implement this that are fine with me.
FYI, I'm having a long conversation with @alyssawilk about this offline. I think we are potentially converging on doing (1) above (adding a priority group per endpoint), but I will update this ticket after some more discussion.
Really long update over here: tl;dr I'll rewrite #1980 to address this use case as well :-)
Sorry, I was on vacation for the last week+. @alyssawilk @mattklein123 I see that we are going to do priority grouping. Is this being worked on currently? If so, do let me know if I can help in any way (testing, etc.). If it is not being implemented/planned yet, I can help with the implementation as well (though it looks like it is being folded into a larger refactoring, so I'm not sure how much help I can be...).
I was planning on picking it up next week, but as I'm traveling I'm realistically more likely to get something out for review the week after. If you think you can do something faster, feel free to pick it up - just let me know!

On Thu, Nov 9, 2017 at 9:20 AM, Suresh Kumar wrote:

> Sorry, I was on vacation for the last week+. @alyssawilk @mattklein123 I see that we are going to do priority grouping. Is this being worked on currently? If so, do let me know if I can help in any way (testing, etc.). If it is not being implemented/planned yet, I can help with the implementation as well (though it looks like it is being folded into a larger refactoring, so I'm not sure how much help I can be...).
I will wait for your changes, but do let me know if you want me to test/try out your changes earlier.
Reworking the organization of endpoints from HostSets-with-localities to a tree with PrioritySets at the top and HostSets as nodes. This allows the failover characteristics requested in #1929, as well as satisfying the defunct #1354. This will be followed up by a PR (or set of PRs) converting the load balancers over to use the new priority logic.

API Changes: envoyproxy/data-plane-api#259
Risk Level: High. It's a pretty big refactor and there were some subtle gotchas (like stats initially updating before the backends).
Testing: Current tests passing. Health checker unit tests verify that all hosts are health checked across priority groups; outlier detection unit tests do the same for outlier detection. upstream_impl_test.cc has unit tests for basic priority set operations. The load stats integration test is extended for priorities; it verifies stats reporting insofar as there's no way to direct load yet, but the localities show up as expected.

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
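For orientation, a rough sketch of the shape this refactor describes; the type names and fields are illustrative stand-ins, not Envoy's actual classes:

```cpp
#include <vector>

// Illustrative stand-in for an upstream host (address, health status, ...).
struct Host {};

// A HostSet holds the hosts, and the healthy subset, for one priority level.
struct HostSet {
  std::vector<Host> hosts;
  std::vector<Host> healthy_hosts;
};

// The PrioritySet sits at the top: one HostSet per priority level, with
// index 0 as the highest priority. Failover walks down the vector.
struct PrioritySet {
  std::vector<HostSet> host_sets;
};
```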
OK, I'm about 70% of the way done with the naive implementation, which, while sufficient to close this issue off, is not yet sufficient for my self-respect :-P

My plan A would be to mimic the load balancing code I'm most familiar with. It basically has an "overprovision factor" (default 1.4) and starts spilling traffic out of a priority level when overprovision_factor * host-set-health-ratio < 1. I guess we'd have to adapt this to priority levels so that a level spills iff the sum of the health of the higher priority levels is less than 1. Assuming you had 2 priority levels with 100% healthy backends, all traffic goes to P=0 (see the sketch below).

I'd have to dig up details on the ratio drop-off as P=1 starts going unhealthy as well, but I figure rather than doing it in code I'd poll for improvements/enhancements, with enough time to dicker, while I wrap up the naive implementation for ring hash etc.
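A minimal sketch of this spillover rule under the assumptions above; the function name and the fallback normalization at the end are illustrative choices, not Envoy's actual implementation:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch only: distribute traffic across priority levels. Each level can
// absorb up to overprovision * healthy_ratio of total traffic; whatever it
// cannot absorb spills to the next level.
std::vector<double> computePriorityLoad(const std::vector<double>& healthy_ratios,
                                        double overprovision = 1.4) {
  std::vector<double> load(healthy_ratios.size(), 0.0);
  double remaining = 1.0;  // fraction of traffic not yet assigned
  for (std::size_t i = 0; i < healthy_ratios.size() && remaining > 0.0; ++i) {
    const double capacity = std::min(1.0, overprovision * healthy_ratios[i]);
    load[i] = std::min(capacity, remaining);
    remaining -= load[i];
  }
  if (remaining > 0.0 && !load.empty()) {
    load[0] += remaining;  // every level degraded; one possible fallback choice
  }
  return load;
}
```

For example, with two levels at health ratios {0.5, 1.0} and the default 1.4 factor, P=0 takes min(1, 0.7) = 70% of the traffic and P=1 absorbs the remaining 30%; with {1.0, 1.0}, P=0 takes everything.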
@alyssawilk ^ SGTM. The only complexity is that I think it would probably break the way the zone-aware balancing algorithm works in terms of spillover calculations between zones (or make it substantially more complicated). I'm honestly not sure this matters; if failover is occurring, we could potentially even disable zone-aware balancing. I would keep it in mind, though.
Changing the LoadBalancerBase to use priorities, following the simplistic strategy documented in the EDS proto, where we route to the highest priority level with non-zero healthy endpoints. This results in simple priority-based load balancing for RoundRobinLoadBalancer, LeastRequestLoadBalancer, and RandomLoadBalancer.

Risk Level: Medium. It's a big change, but then again no one is using it :-)
Testing: Existing tests parameterized to target P=0 and P=1 to verify equivalence. Additional unit tests and one integration test explicitly testing primary/failover/... interactions.
Docs Changes: envoyproxy/data-plane-api#326
Release Notes: Added support for priorities for several types of load balancer (RoundRobinLoadBalancer, LeastRequestLoadBalancer, RandomLoadBalancer).

Continued progress on #1929
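In sketch form, this hard-failover rule amounts to something like the following; an illustrative helper, not the actual LoadBalancerBase code:

```cpp
#include <cstddef>
#include <vector>

// Pick the highest priority level (lowest index) that still has at least one
// healthy endpoint; all traffic goes to that level.
std::size_t choosePriority(const std::vector<std::size_t>& healthy_counts) {
  for (std::size_t p = 0; p < healthy_counts.size(); ++p) {
    if (healthy_counts[p] > 0) {
      return p;
    }
  }
  return 0;  // nothing healthy anywhere; default to P=0
}
```

The later follow-up PR replaces this all-or-nothing rule with the gradual, health-percentage-based trickle sketched above.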
@surki The behavior that you asked for should work now if you want to play around with it. There are 2 more follow-up PRs coming, but they're for gentle failover in the case where there's more than one endpoint per priority set.
Changes Envoy load balancing across priority levels from a hard failover to trickling traffic based on the health percentage of each priority level.

Risk Level: Medium
Testing: Added thorough unit testing for the LB failover code, as well as fixing up ring hash failover testing and adding ring-hash-specific tests.
Docs Changes: envoyproxy/data-plane-api#359
Release Notes: n/a; falls under existing note

Fixes #1929