Add FULL_SCAN selection mode to least request LB #31507

Merged
@@ -6,6 +6,7 @@ licenses(["notice"]) # Apache 2

api_proto_package(
deps = [
"//envoy/annotations:pkg",
"//envoy/config/core/v3:pkg",
"//envoy/extensions/load_balancing_policies/common/v3:pkg",
"@com_github_cncf_xds//udpa/annotations:pkg",
@@ -7,6 +7,7 @@ import "envoy/extensions/load_balancing_policies/common/v3/common.proto";

import "google/protobuf/wrappers.proto";

import "envoy/annotations/deprecation.proto";
import "udpa/annotations/status.proto";
import "validate/validate.proto";

@@ -22,10 +23,34 @@ option (udpa.annotations.file_status).package_version_status = ACTIVE;
// This configuration allows the built-in LEAST_REQUEST LB policy to be configured via the LB policy
// extension point. See the :ref:`load balancing architecture overview
// <arch_overview_load_balancing_types>` for more information.
// [#next-free-field: 6]
// [#next-free-field: 7]
message LeastRequest {
// Available methods for selecting the host set from which to return the host with the
// fewest active requests.
enum SelectionMethod {
Member
Can you add some 1-liners explaining what it means to set these? I don't think it's explained anywhere what a FULL_SCAN method means.

Contributor

Can you please add some guidance/recommendation on when full scan should be used?

Contributor Author

@barroca : Can you comment on your use case for full scan mode? I know it was different than mine. I'd like to mention both if applicable.

Contributor

I didn't have a use case for it :) I picked up the work because it looked simple enough to be a learning opportunity for contributing to the project.

The use case I was trying to address is when the number of hosts is very small, so the random selection is likely to pick the same host more than once, or to miss the host with the fewest requests, leaving the load unbalanced.

But this mostly matters when dealing with a small number of requests: at high throughput the probability of repeating a random selection decreases and requests balance out between hosts (according to the paper used for the original algorithm).

The second implementation, instead of always starting the full scan from the same index in the host list, randomised the start index to further reduce the chance of choosing the same host when the request counts are equal.

I guess the main use case for a full scan is to guarantee that the request is sent to the host with the fewest requests (which can be beneficial in some cases, like evenly splitting work in a map-reduce job).

Contributor Author

Just added a line that explains N_CHOICES is best for most scenarios, and also explained the niche scenarios in which FULL_SCAN may be preferable. Let me know what you think, @tonya11en and @ramaraochavali. (I think this is the last outstanding review comment)

Contributor

thank you

// Return host with fewest requests from a set of ``choice_count`` randomly selected hosts.
// Best selection method for most scenarios.
N_CHOICES = 0;

// Return host with fewest requests from all hosts.
// Useful in some niche use cases involving low request rates and one of:
// (example 1) low request limits on workloads, or (example 2) few hosts.
//
// Example 1: Consider a workload type that can only accept one connection at a time.
// If such workloads are deployed across many hosts, only a small percentage of those
// workloads have zero connections at any given time, and the rate of new connections is low,
// the ``FULL_SCAN`` method is more likely to select a suitable host than ``N_CHOICES``.
//
// Example 2: Consider a workload type that is only deployed on 2 hosts. With default settings,
// the ``N_CHOICES`` method will return the host with more active requests 25% of the time.
// If the request rate is sufficiently low, the behavior of always selecting the host with least
// requests as of the last metrics refresh may be preferable.
FULL_SCAN = 1;
}

// The number of random healthy hosts from which the host with the fewest active requests will
// be chosen. Defaults to 2 so that we perform two-choice selection if the field is not set.
// Only applies to the ``N_CHOICES`` selection method.
google.protobuf.UInt32Value choice_count = 1 [(validate.rules).uint32 = {gte: 2}];

// The following formula is used to calculate the dynamic weights when hosts have different load
@@ -61,8 +86,12 @@ message LeastRequest {
common.v3.LocalityLbConfig locality_lb_config = 4;

// [#not-implemented-hide:]
// Configuration for performing full scan on the list of hosts.
// If this configuration is set, when selecting the host a full scan on the list hosts will be
// used to select the one with least requests instead of using random choices.
google.protobuf.BoolValue enable_full_scan = 5;
// Unused. Replaced by the `selection_method` enum for better extensibility.
google.protobuf.BoolValue enable_full_scan = 5
[deprecated = true, (envoy.annotations.deprecated_at_minor_version) = "3.0"];

// Method for selecting the host set from which to return the host with the fewest active requests.
//
// Defaults to ``N_CHOICES``.
SelectionMethod selection_method = 6 [(validate.rules).enum = {defined_only: true}];
}
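The two examples in the ``FULL_SCAN`` comment both reduce to the same arithmetic: how often do ``choice_count`` uniform draws, taken with replacement, fail to sample any of the hosts you would want? The sketch below works that out under the assumption that draws are independent and uniform over all hosts. The helper name `probabilityNoTargetSampled` is hypothetical and is not part of Envoy.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not part of Envoy.
// Probability that `choice_count` independent uniform draws (with replacement)
// from `num_hosts` hosts never land on any of `num_target_hosts` desirable
// hosts (for example, the hosts that currently have the fewest requests).
double probabilityNoTargetSampled(unsigned num_hosts, unsigned num_target_hosts,
                                  unsigned choice_count) {
  assert(num_hosts > 0);
  assert(num_target_hosts <= num_hosts);
  const double miss_one_draw =
      static_cast<double>(num_hosts - num_target_hosts) / num_hosts;
  // Draws are independent, so the per-draw miss probabilities multiply.
  return std::pow(miss_one_draw, choice_count);
}
```

For Example 2 (2 hosts, 1 of them less loaded, ``choice_count`` of 2), `probabilityNoTargetSampled(2, 1, 2)` is (1/2)^2 = 0.25, which matches the 25% figure in the comment. For an Example 1 style setup with illustrative numbers of 100 hosts and 5 idle ones, `probabilityNoTargetSampled(100, 5, 2)` is 0.95^2 = 0.9025, so two random choices would usually miss every idle host.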
7 changes: 7 additions & 0 deletions changelogs/current.yaml
@@ -38,5 +38,12 @@ new_features:
change: |
added support for :ref:`%UPSTREAM_CONNECTION_ID% <config_access_log_format_upstream_connection_id>` for the upstream connection
identifier.
- area: upstream
change: |
Added :ref:`selection_method <envoy_v3_api_msg_extensions.load_balancing_policies.least_request.v3.LeastRequest>`
option to the least request load balancer. If set to ``FULL_SCAN``,
Envoy will select the host with the fewest active requests from the entire host set rather than
:ref:`choice_count <envoy_v3_api_msg_extensions.load_balancing_policies.least_request.v3.LeastRequest>`
random choices.

deprecated:
63 changes: 62 additions & 1 deletion source/common/upstream/load_balancer_impl.cc
@@ -1299,19 +1299,80 @@ HostConstSharedPtr LeastRequestLoadBalancer::unweightedHostPick(const HostVector
const HostsSource&) {
HostSharedPtr candidate_host = nullptr;

switch (selection_method_) {
case envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest::FULL_SCAN:
candidate_host = unweightedHostPickFullScan(hosts_to_use);
break;
case envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest::N_CHOICES:
candidate_host = unweightedHostPickNChoices(hosts_to_use);
break;
default:
IS_ENVOY_BUG("unknown selection method specified for least request load balancer");
}

return candidate_host;
}

HostSharedPtr LeastRequestLoadBalancer::unweightedHostPickFullScan(const HostVector& hosts_to_use) {
HostSharedPtr candidate_host = nullptr;

size_t num_hosts_known_tied_for_least = 0;

const size_t num_hosts = hosts_to_use.size();

for (size_t i = 0; i < num_hosts; ++i) {
const HostSharedPtr& sampled_host = hosts_to_use[i];

if (candidate_host == nullptr) {
// Make a first choice to start the comparisons.
num_hosts_known_tied_for_least = 1;
candidate_host = sampled_host;
continue;
}

const auto candidate_active_rq = candidate_host->stats().rq_active_.value();
const auto sampled_active_rq = sampled_host->stats().rq_active_.value();

if (sampled_active_rq < candidate_active_rq) {
// Reset the count of known tied hosts.
num_hosts_known_tied_for_least = 1;
candidate_host = sampled_host;
} else if (sampled_active_rq == candidate_active_rq) {
++num_hosts_known_tied_for_least;

// Use reservoir sampling to select 1 unique sample from the total number of hosts N
// that will tie for least requests after processing the full hosts array.
//
// Upon each new tie encountered, replace candidate_host with sampled_host
// with probability 1 / num_hosts_known_tied_for_least.
// The end result is that each tied host has an equal 1 / N chance of being the
// candidate_host returned by this function.
const size_t random_tied_host_index = random_.random() % num_hosts_known_tied_for_least;
if (random_tied_host_index == 0) {
candidate_host = sampled_host;
}
}
}

return candidate_host;
}
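The tie-breaking in the full scan can be tried outside Envoy with a small standalone sketch. It is not the Envoy implementation: plain integer counters stand in for host stats, `std::mt19937_64` with `std::uniform_int_distribution` stands in for `random_.random() % n`, and the function name `pickLeastLoaded` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for the full scan: returns the index of an entry with
// the smallest counter, choosing uniformly among ties in a single pass
// (reservoir sampling with a reservoir of size 1).
std::size_t pickLeastLoaded(const std::vector<uint64_t>& active_requests,
                            std::mt19937_64& rng) {
  assert(!active_requests.empty());
  std::size_t candidate = 0;
  std::size_t num_tied = 1;
  for (std::size_t i = 1; i < active_requests.size(); ++i) {
    if (active_requests[i] < active_requests[candidate]) {
      // Strictly better entry found: it becomes the only known minimum.
      candidate = i;
      num_tied = 1;
    } else if (active_requests[i] == active_requests[candidate]) {
      ++num_tied;
      // Replace the candidate with probability 1 / num_tied, so every tied
      // entry ends up selected with equal probability after the full pass.
      std::uniform_int_distribution<std::size_t> dist(0, num_tied - 1);
      if (dist(rng) == 0) {
        candidate = i;
      }
    }
  }
  return candidate;
}
```

With counters `{3, 3, 1, 1, 1}`, repeated calls should only ever return indices 2, 3, or 4, each about one third of the time. The k-th tied entry replaces the candidate with probability 1/k and then survives each later tie j with probability (j - 1)/j, which multiplies out to 1/N for N tied entries.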

HostSharedPtr LeastRequestLoadBalancer::unweightedHostPickNChoices(const HostVector& hosts_to_use) {
HostSharedPtr candidate_host = nullptr;

for (uint32_t choice_idx = 0; choice_idx < choice_count_; ++choice_idx) {
const int rand_idx = random_.random() % hosts_to_use.size();
const HostSharedPtr& sampled_host = hosts_to_use[rand_idx];

if (candidate_host == nullptr) {

// Make a first choice to start the comparisons.
candidate_host = sampled_host;
continue;
}

const auto candidate_active_rq = candidate_host->stats().rq_active_.value();
const auto sampled_active_rq = sampled_host->stats().rq_active_.value();

if (sampled_active_rq < candidate_active_rq) {
candidate_host = sampled_host;
}
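For comparison, the N-choices loop shown in the visible part of this hunk can be approximated by the standalone sketch below. The same assumptions apply: integer counters replace host stats, the standard library replaces Envoy's random generator, and `pickNChoices` is a hypothetical name. As in the hunk, sampling is with replacement and the earlier sample is kept on a tie.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for N-choices selection: draw `choice_count` indices
// uniformly at random (with replacement) and keep the one with the smallest
// counter. The first sample is kept when counters are equal.
std::size_t pickNChoices(const std::vector<uint64_t>& active_requests,
                         uint32_t choice_count, std::mt19937_64& rng) {
  assert(!active_requests.empty());
  assert(choice_count >= 1);
  std::uniform_int_distribution<std::size_t> dist(0, active_requests.size() - 1);
  std::size_t candidate = dist(rng);
  for (uint32_t choice_idx = 1; choice_idx < choice_count; ++choice_idx) {
    const std::size_t sampled = dist(rng);
    if (active_requests[sampled] < active_requests[candidate]) {
      candidate = sampled;
    }
  }
  return candidate;
}
```

With counters `{5, 0}` and a `choice_count` of 2, the busier entry at index 0 is returned only when both draws land on it, which should happen in roughly a quarter of calls.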
7 changes: 6 additions & 1 deletion source/common/upstream/load_balancer_impl.h
@@ -710,7 +710,8 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
least_request_config.has_active_request_bias()
? absl::optional<Runtime::Double>(
{least_request_config.active_request_bias(), runtime})
: absl::nullopt) {
: absl::nullopt),
selection_method_(least_request_config.selection_method()) {
initialize();
}

@@ -737,6 +738,8 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
const HostsSource& source) override;
HostConstSharedPtr unweightedHostPick(const HostVector& hosts_to_use,
const HostsSource& source) override;
HostSharedPtr unweightedHostPickFullScan(const HostVector& hosts_to_use);
HostSharedPtr unweightedHostPickNChoices(const HostVector& hosts_to_use);

const uint32_t choice_count_;

@@ -746,6 +749,8 @@ class LeastRequestLoadBalancer : public EdfLoadBalancerBase {
double active_request_bias_{};

const absl::optional<Runtime::Double> active_request_bias_runtime_;
const envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest::SelectionMethod
selection_method_{};
};

/**
1 change: 1 addition & 0 deletions test/common/upstream/BUILD
@@ -265,6 +265,7 @@ envoy_cc_test(
srcs = ["load_balancer_impl_test.cc"],
deps = [
":utility_lib",
"//source/common/common:random_generator_lib",
"//source/common/network:utility_lib",
"//source/common/upstream:load_balancer_lib",
"//source/common/upstream:upstream_includes",
91 changes: 91 additions & 0 deletions test/common/upstream/load_balancer_impl_test.cc
@@ -11,6 +11,7 @@
#include "envoy/config/core/v3/base.pb.h"
#include "envoy/config/core/v3/health_check.pb.h"

#include "source/common/common/random_generator.h"
#include "source/common/network/utility.h"
#include "source/common/upstream/load_balancer_impl.h"
#include "source/common/upstream/upstream_impl.h"
@@ -2880,6 +2881,96 @@ TEST_P(LeastRequestLoadBalancerTest, PNC) {
EXPECT_EQ(hostSet().healthy_hosts_[3], lb_5.chooseHost(nullptr));
}

TEST_P(LeastRequestLoadBalancerTest, DefaultSelectionMethod) {
envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest lr_lb_config;
EXPECT_EQ(lr_lb_config.selection_method(),
envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest::N_CHOICES);
}

TEST_P(LeastRequestLoadBalancerTest, FullScanOneHostWithLeastRequests) {
hostSet().healthy_hosts_ = {makeTestHost(info_, "tcp://127.0.0.1:80", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:81", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:82", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:83", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:84", simTime())};
hostSet().hosts_ = hostSet().healthy_hosts_;
hostSet().runCallbacks({}, {}); // Trigger callbacks. The added/removed lists are not relevant.

hostSet().healthy_hosts_[0]->stats().rq_active_.set(4);
hostSet().healthy_hosts_[1]->stats().rq_active_.set(3);
hostSet().healthy_hosts_[2]->stats().rq_active_.set(2);
hostSet().healthy_hosts_[3]->stats().rq_active_.set(1);
hostSet().healthy_hosts_[4]->stats().rq_active_.set(5);

envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest lr_lb_config;

// Enable FULL_SCAN on hosts.
lr_lb_config.set_selection_method(
envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest::FULL_SCAN);

LeastRequestLoadBalancer lb{priority_set_, nullptr, stats_, runtime_,
random_, 1, lr_lb_config, simTime()};

// With FULL_SCAN we will always choose the host with least number of active requests.
EXPECT_EQ(hostSet().healthy_hosts_[3], lb.chooseHost(nullptr));
}

TEST_P(LeastRequestLoadBalancerTest, FullScanMultipleHostsWithLeastRequests) {
hostSet().healthy_hosts_ = {makeTestHost(info_, "tcp://127.0.0.1:80", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:81", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:82", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:83", simTime()),
makeTestHost(info_, "tcp://127.0.0.1:84", simTime())};
hostSet().hosts_ = hostSet().healthy_hosts_;
hostSet().runCallbacks({}, {}); // Trigger callbacks. The added/removed lists are not relevant.

hostSet().healthy_hosts_[0]->stats().rq_active_.set(3);
hostSet().healthy_hosts_[1]->stats().rq_active_.set(3);
hostSet().healthy_hosts_[2]->stats().rq_active_.set(1);
hostSet().healthy_hosts_[3]->stats().rq_active_.set(1);
hostSet().healthy_hosts_[4]->stats().rq_active_.set(1);

envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest lr_lb_config;

// Enable FULL_SCAN on hosts.
lr_lb_config.set_selection_method(
envoy::extensions::load_balancing_policies::least_request::v3::LeastRequest::FULL_SCAN);

auto random = Random::RandomGeneratorImpl();

LeastRequestLoadBalancer lb{priority_set_, nullptr, stats_, runtime_,
random, 1, lr_lb_config, simTime()};

// Make 1 million selections. Then, check that the selection probability is
// approximately equal among the 3 hosts tied for least requests.
// Accept an absolute deviation of +/-0.5 percentage points (abs_error / num_selections)
// from the expected selection probability (33.3..%).
size_t num_selections = 1000000;
size_t expected_approx_selections_per_tied_host = num_selections / 3;
size_t abs_error = 5000;

size_t host_2_counts = 0;
size_t host_3_counts = 0;
size_t host_4_counts = 0;

for (size_t i = 0; i < num_selections; ++i) {
auto selected_host = lb.chooseHost(nullptr);

if (selected_host == hostSet().healthy_hosts_[2]) {
++host_2_counts;
} else if (selected_host == hostSet().healthy_hosts_[3]) {
++host_3_counts;
} else if (selected_host == hostSet().healthy_hosts_[4]) {
++host_4_counts;
} else {
FAIL() << "Must only select hosts with least requests";
}
}

EXPECT_NEAR(expected_approx_selections_per_tied_host, host_2_counts, abs_error);
EXPECT_NEAR(expected_approx_selections_per_tied_host, host_3_counts, abs_error);
EXPECT_NEAR(expected_approx_selections_per_tied_host, host_4_counts, abs_error);
}
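A quick way to judge whether the `abs_error` of 5000 in the test above is loose enough to avoid flakes is to compare it with the standard deviation of a binomial count. The sketch below assumes each selection is an independent trial that picks a given tied host with probability 1/3. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Standard deviation of a binomial count: sqrt(n * p * (1 - p)).
double binomialStdDev(double num_trials, double probability) {
  assert(num_trials >= 0.0);
  assert(probability >= 0.0 && probability <= 1.0);
  return std::sqrt(num_trials * probability * (1.0 - probability));
}

// How many standard deviations an absolute tolerance on the count covers.
// Requires a non-degenerate distribution (0 < probability < 1, trials > 0).
double toleranceInSigmas(double num_trials, double probability, double abs_error) {
  const double sigma = binomialStdDev(num_trials, probability);
  assert(sigma > 0.0);
  return abs_error / sigma;
}
```

For 1,000,000 selections with probability 1/3, `binomialStdDev` gives about 471 selections, so a tolerance of 5000 is roughly 10.6 standard deviations. Under the independence assumption, a correct uniform tie-break should essentially never trip the `EXPECT_NEAR` checks by chance.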

TEST_P(LeastRequestLoadBalancerTest, WeightImbalance) {
hostSet().healthy_hosts_ = {makeTestHost(info_, "tcp://127.0.0.1:80", simTime(), 1),
makeTestHost(info_, "tcp://127.0.0.1:81", simTime(), 2)};
1 change: 1 addition & 0 deletions tools/spelling/spelling_dictionary.txt
@@ -747,6 +747,7 @@ exe
execlp
exprfor
expectable
extensibility
extrahelp
faceplant
facto