http: prefetch for upstreams #14143
@@ -589,7 +589,6 @@ message Cluster {
    google.protobuf.Duration max_interval = 2 [(validate.rules).duration = {gt {nanos: 1000000}}];
  }

  // [#not-implemented-hide:]
  message PrefetchPolicy {
Review comment: And then one question for @mattklein123: assuming we do end up with this at minimum, let me know if you think it's worth having wrapper messages for per-upstream and per-cluster.
    // Indicates how many streams (rounded up) can be anticipated per-upstream for each
    // incoming stream. This is useful for high-QPS or latency-sensitive services. Prefetching
@@ -998,7 +997,6 @@ message Cluster {
  // Configuration to track optional cluster stats.
  TrackClusterStats track_cluster_stats = 49;

  // [#not-implemented-hide:]
  // Prefetch configuration for this cluster.
  PrefetchPolicy prefetch_policy = 50;
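To make the ratio semantics above concrete, here is a small editorial sketch. The helper below is hypothetical and not part of the Envoy API; it combines the "rounded up" wording from the proto comment with the demand formula quoted in the cluster manager TODO further down.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not Envoy code: how much stream capacity a pool would
// aim to have established for the currently known demand, given a per-upstream
// prefetch ratio. The "rounded up" wording in the proto comment corresponds to
// the ceil() here.
uint32_t anticipatedStreamCapacity(uint32_t pending_streams, uint32_t active_streams,
                                   double prefetch_ratio) {
  return static_cast<uint32_t>(std::ceil((pending_streams + active_streams) * prefetch_ratio));
}

// Example: 10 active streams, 0 pending, ratio 1.5 -> capacity for 15 streams
// is anticipated, i.e. connections for 5 additional streams are prefetched.
```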
@@ -855,25 +855,31 @@ ThreadLocalCluster* ClusterManagerImpl::get(absl::string_view cluster) {

void ClusterManagerImpl::maybePrefetch(
    ThreadLocalClusterManagerImpl::ClusterEntryPtr& cluster_entry,
    const ClusterConnectivityState& state,
    std::function<ConnectionPool::Instance*()> pick_prefetch_pool) {
  // TODO(alyssawilk) As currently implemented, this will always just prefetch
  // one connection ahead of actually needed connections.
  //
  // Instead we want to track the following metrics across the entire connection
  // pool and use the same algorithm we do for per-upstream prefetch:
  // ((pending_streams_ + num_active_streams_) * global_prefetch_ratio >
  //  (connecting_stream_capacity_ + num_active_streams_)))
  // and allow multiple prefetches per pick.
  // Also cap prefetches such that
  //   num_unused_prefetch < num hosts
  // since if we have more prefetches than hosts, we should consider kicking into
  // per-upstream prefetch.
  //
  // Once we do this, this should loop capped number of times while shouldPrefetch is true.
-  if (cluster_entry->cluster_info_->peekaheadRatio() > 1.0) {
+  auto peekahead_ratio = cluster_entry->cluster_info_->peekaheadRatio();
+  if (peekahead_ratio <= 1.0) {
+    return;
+  }

  // 3 here is arbitrary. Just as in ConnPoolImplBase::tryCreateNewConnections
  // we want to limit the work which can be done on any given prefetch attempt.
  for (int i = 0; i < 3; ++i) {
    if ((state.pending_streams_ + 1 + state.active_streams_) * peekahead_ratio <=
Review comment: Why the +1 in the expression above? I see a similar +1 in ConnPoolImplBase::shouldCreateNewConnection when doing global prefetch, but not when doing per-upstream prefetch. It seems that this is mimicking the global prefetch logic in shouldCreateNewConnection. Worth a comment that references shouldCreateNewConnection?

Reply: Added a comment - let me know if that doesn't clarify.

Reply: FWIW I've read this a few times and I'm still struggling a bit on what we are comparing. When I see <= my eye wants to not return, but we do return. Perhaps invert? Up to you.
        (state.connecting_stream_capacity_ + state.active_streams_)) {
      return;
    }
    ConnectionPool::Instance* prefetch_pool = pick_prefetch_pool();
    if (prefetch_pool) {
-      prefetch_pool->maybePrefetch(cluster_entry->cluster_info_->peekaheadRatio());
+      if (!prefetch_pool->maybePrefetch(cluster_entry->cluster_info_->peekaheadRatio())) {
Review comment: nit: if (!prefetch_pool->maybePrefetch(peekahead_ratio)) {
        // Given that the next prefetch pick may be entirely different, we could
        // opt to try again even if the first prefetch fails. Err on the side of
        // caution and wait for the next attempt.
        return;
      }
    } else {
      // If unable to find a prefetch pool, exit early.
      return;
    }
  }
}
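For readers following the +1 discussion in the thread above, a standalone sketch of the comparison may help. The struct and function names below are illustrative only (the real code operates on ClusterConnectivityState and the connection pools); the +1 stands for the stream the caller is about to create, mirroring the global prefetch branch of ConnPoolImplBase::shouldCreateNewConnection referenced by the reviewer.

```cpp
#include <cstdint>

// Illustrative stand-in for the ClusterConnectivityState fields used above.
struct ConnectivityStateSketch {
  uint32_t pending_streams_{0};
  uint32_t active_streams_{0};
  uint32_t connecting_stream_capacity_{0};
};

// True when another connection should be prefetched: anticipated demand
// (including the stream the caller is about to create, hence the +1), scaled
// by the ratio, still exceeds the capacity that is active or already
// connecting. This is the inverted form of the early-return check above.
bool shouldGlobalPrefetch(const ConnectivityStateSketch& state, double peekahead_ratio) {
  return (state.pending_streams_ + 1 + state.active_streams_) * peekahead_ratio >
         (state.connecting_stream_capacity_ + state.active_streams_);
}

// Worked example: ratio 1.15, 0 pending, 20 active, capacity for 3 more
// streams already connecting. LHS = (0 + 1 + 20) * 1.15 = 24.15, RHS = 3 + 20
// = 23, so one more connection is prefetched. Without the +1, LHS would be
// 23.0 and nothing would be prefetched for the incoming stream.
```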
@@ -898,9 +904,10 @@ ClusterManagerImpl::httpConnPoolForCluster(const std::string& cluster, ResourceP
  // performed here in anticipation of the new stream.
  // TODO(alyssawilk) refactor to have one function call and return a pair, so this invariant is
  // code-enforced.
-  maybePrefetch(entry->second, [&entry, &priority, &protocol, &context]() {
-    return entry->second->connPool(priority, protocol, context, true);
-  });
+  maybePrefetch(entry->second, cluster_manager.cluster_manager_state_,
+                [&entry, &priority, &protocol, &context]() {
+                  return entry->second->connPool(priority, protocol, context, true);
+                });

  return ret;
}
@@ -924,9 +931,10 @@ ClusterManagerImpl::tcpConnPoolForCluster(const std::string& cluster, ResourcePr
  // TODO(alyssawilk) refactor to have one function call and return a pair, so this invariant is
  // code-enforced.
  // Now see if another host should be prefetched.
-  maybePrefetch(entry->second, [&entry, &priority, &context]() {
-    return entry->second->tcpConnPool(priority, context, true);
-  });
+  maybePrefetch(entry->second, cluster_manager.cluster_manager_state_,
+                [&entry, &priority, &context]() {
+                  return entry->second->tcpConnPool(priority, context, true);
+                });

  return ret;
}
@@ -1405,8 +1413,10 @@ ClusterManagerImpl::ThreadLocalClusterManagerImpl::ClusterEntry::connPool(
    LoadBalancerContext* context, bool peek) {
  HostConstSharedPtr host = (peek ? lb_->peekAnotherHost(context) : lb_->chooseHost(context));
  if (!host) {
-    ENVOY_LOG(debug, "no healthy host for HTTP connection pool");
-    cluster_info_->stats().upstream_cx_none_healthy_.inc();
+    if (!peek) {
+      ENVOY_LOG(debug, "no healthy host for HTTP connection pool");
+      cluster_info_->stats().upstream_cx_none_healthy_.inc();
+    }
    return nullptr;
  }
@@ -1466,8 +1476,10 @@ ClusterManagerImpl::ThreadLocalClusterManagerImpl::ClusterEntry::tcpConnPool(
    ResourcePriority priority, LoadBalancerContext* context, bool peek) {
  HostConstSharedPtr host = (peek ? lb_->peekAnotherHost(context) : lb_->chooseHost(context));
  if (!host) {
-    ENVOY_LOG(debug, "no healthy host for TCP connection pool");
-    cluster_info_->stats().upstream_cx_none_healthy_.inc();
+    if (!peek) {
+      ENVOY_LOG(debug, "no healthy host for TCP connection pool");
+      cluster_info_->stats().upstream_cx_none_healthy_.inc();
+    }
    return nullptr;
  }
@@ -108,16 +108,16 @@ LoadBalancerBase::LoadBalancerBase(
      priority_set_(priority_set) {
  for (auto& host_set : priority_set_.hostSetsPerPriority()) {
    recalculatePerPriorityState(host_set->priority(), priority_set_, per_priority_load_,
-                                per_priority_health_, per_priority_degraded_);
+                                per_priority_health_, per_priority_degraded_, total_healthy_hosts_);
  }
  // Recalculate panic mode for all levels.
  recalculatePerPriorityPanic();

-  priority_set_.addPriorityUpdateCb(
-      [this](uint32_t priority, const HostVector&, const HostVector&) -> void {
-        recalculatePerPriorityState(priority, priority_set_, per_priority_load_,
-                                    per_priority_health_, per_priority_degraded_);
-      });
+  priority_set_.addPriorityUpdateCb([this](uint32_t priority, const HostVector&,
+                                           const HostVector&) -> void {
+    recalculatePerPriorityState(priority, priority_set_, per_priority_load_, per_priority_health_,
+                                per_priority_degraded_, total_healthy_hosts_);
+  });

  priority_set_.addPriorityUpdateCb(
      [this](uint32_t priority, const HostVector&, const HostVector&) -> void {
@@ -146,11 +146,13 @@ void LoadBalancerBase::recalculatePerPriorityState(uint32_t priority,
                                                    const PrioritySet& priority_set,
                                                    HealthyAndDegradedLoad& per_priority_load,
                                                    HealthyAvailability& per_priority_health,
-                                                   DegradedAvailability& per_priority_degraded) {
+                                                   DegradedAvailability& per_priority_degraded,
+                                                   uint32_t& total_healthy_hosts) {
  per_priority_load.healthy_priority_load_.get().resize(priority_set.hostSetsPerPriority().size());
  per_priority_load.degraded_priority_load_.get().resize(priority_set.hostSetsPerPriority().size());
  per_priority_health.get().resize(priority_set.hostSetsPerPriority().size());
  per_priority_degraded.get().resize(priority_set.hostSetsPerPriority().size());
+  total_healthy_hosts = 0;

  // Determine the health of the newly modified priority level.
  // Health ranges from 0-100, and is the ratio of healthy/degraded hosts to total hosts, modified
@@ -232,6 +234,10 @@ void LoadBalancerBase::recalculatePerPriorityState(uint32_t priority,
                                   per_priority_load.healthy_priority_load_.get().end(), 0) +
                   std::accumulate(per_priority_load.degraded_priority_load_.get().begin(),
                                   per_priority_load.degraded_priority_load_.get().end(), 0));

+  for (auto& host_set : priority_set.hostSetsPerPriority()) {
+    total_healthy_hosts += host_set->healthyHosts().size();
+  }
}

// Method iterates through priority levels and turns on/off panic mode.
@@ -774,6 +780,10 @@ void EdfLoadBalancerBase::refresh(uint32_t priority) {
}

HostConstSharedPtr EdfLoadBalancerBase::peekAnotherHost(LoadBalancerContext* context) {
+  if (stashed_random_.size() + 1 > total_healthy_hosts_) {
+    return nullptr;
+  }
Review comment: Is 1 the right max ratio of prefetched connections to healthy hosts? I imagine that when the number of endpoints is small it would be beneficial to set this ratio > 1.0, especially if host weights are not all equal.

Reply: This one isn't a ratio thing - we currently cap the number of prefetches to the number of healthy hosts.

Reply: Are there plans to relax that restriction? It seems to get in the way of getting to a fixed number of connections in cases where the number of healthy upstreams is less than the desired number of connections.

Reply: As needed. Right now I was aiming at the server side, where for low QPS you'd generally prefetch a few, and for high QPS you'd want per-upstream prefetch; but as we move towards mobile, if we only have a single upstream per endpoint we'd want more than one prefetch.
  const absl::optional<HostsSource> hosts_source = hostSourceToUse(context, random(true));
  if (!hosts_source) {
    return nullptr;
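The cap added above compares the number of stashed (peeked but not yet consumed) random picks against the healthy host count, so at most one prefetch is outstanding per healthy host, as discussed in the thread. As a rough illustration of how a peek/stash random source of this shape can work (the names stashed_random_ and random(true) come from the diff; the rest of this class is an editorial guess, not the actual Envoy implementation):

```cpp
#include <cstdint>
#include <deque>
#include <random>

// Rough sketch of a peek/stash random source. Peeked values are remembered so
// that later non-peek picks replay them, keeping peekAnotherHost() and
// chooseHost() in agreement about which host a prefetched connection was for.
class PeekableRandomSketch {
public:
  uint64_t random(bool peek) {
    if (peek) {
      // Speculative (prefetch) pick: generate and stash a value.
      stashed_random_.push_back(rng_());
      return stashed_random_.back();
    }
    if (!stashed_random_.empty()) {
      // Real pick: consume the oldest stashed value first.
      const uint64_t value = stashed_random_.front();
      stashed_random_.pop_front();
      return value;
    }
    return rng_();
  }

  // Outstanding peeked picks; this is the quantity that the
  // "stashed_random_.size() + 1 > total_healthy_hosts_" cap bounds.
  size_t stashedCount() const { return stashed_random_.size(); }

private:
  std::deque<uint64_t> stashed_random_;
  std::mt19937_64 rng_;
};
```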
@@ -859,6 +869,9 @@ HostConstSharedPtr LeastRequestLoadBalancer::unweightedHostPick(const HostVector
}

HostConstSharedPtr RandomLoadBalancer::peekAnotherHost(LoadBalancerContext* context) {
+  if (stashed_random_.size() + 1 > total_healthy_hosts_) {
+    return nullptr;
+  }
  return peekOrChoose(context, true);
}
Review comment: So I know for mobile we're going to want fixed prefetch numbers (prefetch 6/8 connections), so I was thinking to make it a oneof (ratio, fixed_prefetch). But I think fixed_prefetch will end up being the moral equivalent of min_connections_picked, at which point we should leave this as-is and allow extending it. SG @antoniovicente?