Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subset lb: allow ring hash/maglev LB to work with subsets #8030

Merged
20 changes: 14 additions & 6 deletions docs/root/intro/arch_overview/upstream/load_balancing/subsets.rst
Original file line number Diff line number Diff line change
Expand Up @@ -35,12 +35,20 @@ therefore, contain a definition that has the same keys as a given route in order
balancing to occur.

This feature can only be enabled using the V2 configuration API. Furthermore, host metadata is only
supported when using the EDS discovery type for clusters. Host metadata for subset load balancing
must be placed under the filter name ``"envoy.lb"``. Similarly, route metadata match criteria use
the ``"envoy.lb"`` filter name. Host metadata may be hierarchical (e.g., the value for a top-level
key may be a structured value or list), but the subset load balancer only compares top-level keys
and values. Therefore when using structured values, a route's match criteria will only match if an
identical structured value appears in the host's metadata.
supported when hosts are defined using
:ref:`ClusterLoadAssignments <envoy_api_msg_ClusterLoadAssignment>`. ClusterLoadAssignments are
available via EDS or the Cluster :ref:`load_assignment <envoy_api_field_Cluster.load_assignment>`
field. Host metadata for subset load balancing must be placed under the filter name ``"envoy.lb"``.
Similarly, route metadata match criteria use ``"envoy.lb"`` filter name. Host metadata may be
hierarchical (e.g., the value for a top-level key may be a structured value or list), but the
subset load balancer only compares top-level keys and values. Therefore when using structured
values, a route's match criteria will only match if an identical structured value appears in the
host's metadata.

Finally, note that subset load balancing is not available for the
:ref:`ORIGINAL_DST_LB <envoy_api_enum_value_Cluster.LbPolicy.ORIGINAL_DST_LB>` or
:ref:`CLUSTER_PROVIDED <envoy_api_enum_value_Cluster.LbPolicy.CLUSTER_PROVIDED>` load balancer
policies.

Examples
^^^^^^^^
Expand Down
22 changes: 13 additions & 9 deletions source/common/upstream/cluster_manager_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -654,17 +654,21 @@ void ClusterManagerImpl::loadCluster(const envoy::api::v2::Cluster& cluster,
const auto cluster_entry_it = cluster_map.find(cluster_reference.info()->name());

// If an LB is thread aware, create it here. The LB is not initialized until cluster pre-init
// finishes.
// finishes. For RingHash/Maglev don't create the LB here if subset balancing is enabled.
zuercher marked this conversation as resolved.
Show resolved Hide resolved
if (cluster_reference.info()->lbType() == LoadBalancerType::RingHash) {
cluster_entry_it->second->thread_aware_lb_ = std::make_unique<RingHashLoadBalancer>(
cluster_reference.prioritySet(), cluster_reference.info()->stats(),
cluster_reference.info()->statsScope(), runtime_, random_,
cluster_reference.info()->lbRingHashConfig(), cluster_reference.info()->lbConfig());
if (!cluster_reference.info()->lbSubsetInfo().isEnabled()) {
cluster_entry_it->second->thread_aware_lb_ = std::make_unique<RingHashLoadBalancer>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also inclined to think if we're changing the thread_aware logic we should probably have an integration test for it, as I think the unit tests don't really exercise the therad-aware code. Or am I misremembering?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean an integration test for subset load balancer + hash-ring or maglev? I can look at adding one.

There are integration tests exercising the hash-ring lb, but not maglev (unless I'm just not seeing it).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, basically my point is this change affects what we do in what thread, in a way that wouldn't be really exercised by unit tests, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a very basic integration test.

cluster_reference.prioritySet(), cluster_reference.info()->stats(),
cluster_reference.info()->statsScope(), runtime_, random_,
cluster_reference.info()->lbRingHashConfig(), cluster_reference.info()->lbConfig());
}
} else if (cluster_reference.info()->lbType() == LoadBalancerType::Maglev) {
cluster_entry_it->second->thread_aware_lb_ = std::make_unique<MaglevLoadBalancer>(
cluster_reference.prioritySet(), cluster_reference.info()->stats(),
cluster_reference.info()->statsScope(), runtime_, random_,
cluster_reference.info()->lbConfig());
if (!cluster_reference.info()->lbSubsetInfo().isEnabled()) {
cluster_entry_it->second->thread_aware_lb_ = std::make_unique<MaglevLoadBalancer>(
cluster_reference.prioritySet(), cluster_reference.info()->stats(),
cluster_reference.info()->statsScope(), runtime_, random_,
cluster_reference.info()->lbConfig());
}
} else if (cluster_reference.info()->lbType() == LoadBalancerType::ClusterProvided) {
cluster_entry_it->second->thread_aware_lb_ = std::move(new_cluster_pair.second);
}
Expand Down
12 changes: 12 additions & 0 deletions source/common/upstream/upstream_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -643,12 +643,24 @@ ClusterInfoImpl::ClusterInfoImpl(
envoy::api::v2::Cluster_LbPolicy_Name(config.lb_policy()),
envoy::api::v2::Cluster_DiscoveryType_Name(config.type())));
}
if (config.has_lb_subset_config()) {
throw EnvoyException(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test? (I think the one below is covered but not this one)
Also out of curiosity, what was the prior behavior if someone configured this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out this isn't reachable because of the check in original_dst_cluster.cc.

From the code/comments it seemed the plan was for the original dst cluster to become a custom cluster type to be used in conjunction with cluster_provided lb policy. When that happens I think this case become redundant and will get removed.

That said, I could remove the subset lb check in original_dst_cluster.cc in favor of this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'd either suggest a TODO for you to pick up the change which clean this up, or a refactor now, rather than have redundant code in here indefinitely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remove the check in the original dst LB.

fmt::format("cluster: LB policy {} cannot be combined with lb_subset_config",
envoy::api::v2::Cluster_LbPolicy_Name(config.lb_policy())));
}

lb_type_ = LoadBalancerType::ClusterProvided;
break;
case envoy::api::v2::Cluster::MAGLEV:
lb_type_ = LoadBalancerType::Maglev;
break;
case envoy::api::v2::Cluster::CLUSTER_PROVIDED:
if (config.has_lb_subset_config()) {
throw EnvoyException(
fmt::format("cluster: LB policy {} cannot be combined with lb_subset_config",
envoy::api::v2::Cluster_LbPolicy_Name(config.lb_policy())));
}

lb_type_ = LoadBalancerType::ClusterProvided;
break;
default:
Expand Down
5 changes: 5 additions & 0 deletions test/common/upstream/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,9 @@ envoy_cc_test(
envoy_cc_test(
name = "cluster_manager_impl_test",
srcs = ["cluster_manager_impl_test.cc"],
external_deps = [
"abseil_optional",
],
deps = [
":utility_lib",
"//include/envoy/stats:stats_interface",
Expand All @@ -43,6 +46,7 @@ envoy_cc_test(
"//source/common/stats:stats_lib",
"//source/common/upstream:cluster_factory_lib",
"//source/common/upstream:cluster_manager_lib",
"//source/common/upstream:subset_lb_lib",
"//source/extensions/transport_sockets/raw_buffer:config",
"//source/extensions/transport_sockets/tls:context_lib",
"//test/mocks/access_log:access_log_mocks",
Expand All @@ -61,6 +65,7 @@ envoy_cc_test(
"//test/test_common:simulated_time_system_lib",
"//test/test_common:threadsafe_singleton_injector_lib",
"//test/test_common:utility_lib",
"@envoy_api//envoy/api/v2:cds_cc",
],
)

Expand Down
116 changes: 91 additions & 25 deletions test/common/upstream/cluster_manager_impl_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
#include <string>

#include "envoy/admin/v2alpha/config_dump.pb.h"
#include "envoy/api/v2/cds.pb.h"
#include "envoy/api/v2/core/base.pb.h"
#include "envoy/network/listen_socket.h"
#include "envoy/upstream/upstream.h"
Expand All @@ -17,6 +18,7 @@
#include "common/singleton/manager_impl.h"
#include "common/upstream/cluster_factory_impl.h"
#include "common/upstream/cluster_manager_impl.h"
#include "common/upstream/subset_lb.h"

#include "extensions/transport_sockets/tls/context_manager_impl.h"

Expand All @@ -38,6 +40,7 @@
#include "test/test_common/threadsafe_singleton_injector.h"
#include "test/test_common/utility.h"

#include "absl/strings/str_replace.h"
#include "gmock/gmock.h"
#include "gtest/gtest.h"

Expand Down Expand Up @@ -577,14 +580,45 @@ TEST_F(ClusterManagerImplTest, OriginalDstLbRestriction2) {
"'ORIGINAL_DST_LB' is allowed only with cluster type 'ORIGINAL_DST'");
}

TEST_F(ClusterManagerImplTest, SubsetLoadBalancerInitialization) {
const std::string yaml = R"EOF(
class ClusterManagerSubsetInitializationTest
: public ClusterManagerImplTest,
public testing::WithParamInterface<envoy::api::v2::Cluster_LbPolicy> {
public:
ClusterManagerSubsetInitializationTest() = default;

static std::vector<envoy::api::v2::Cluster_LbPolicy> lbPolicies() {
int first = static_cast<int>(envoy::api::v2::Cluster_LbPolicy_LbPolicy_MIN);
int last = static_cast<int>(envoy::api::v2::Cluster_LbPolicy_LbPolicy_MAX);
ASSERT(first < last);

std::vector<envoy::api::v2::Cluster_LbPolicy> policies;
for (int i = first; i <= last; i++) {
if (envoy::api::v2::Cluster_LbPolicy_IsValid(i)) {
auto policy = static_cast<envoy::api::v2::Cluster_LbPolicy>(i);
policies.push_back(policy);
}
}
return policies;
}

static std::string paramName(const testing::TestParamInfo<ParamType>& info) {
zuercher marked this conversation as resolved.
Show resolved Hide resolved
const std::string& name = envoy::api::v2::Cluster_LbPolicy_Name(info.param);
return absl::StrReplaceAll(name, {{"_", ""}});
}
};

TEST_P(ClusterManagerSubsetInitializationTest, SubsetLoadBalancerInitialization) {
const std::string yamlPattern = R"EOF(
static_resources:
clusters:
- name: cluster_1
connect_timeout: 0.250s
type: static
lb_policy: round_robin
lb_policy: "{}"
lb_subset_config:
fallback_policy: ANY_ENDPOINT
subset_selectors:
- keys: [ "x" ]
load_assignment:
endpoints:
- lb_endpoints:
Expand All @@ -600,39 +634,71 @@ TEST_F(ClusterManagerImplTest, SubsetLoadBalancerInitialization) {
port_value: 8001
)EOF";

envoy::config::bootstrap::v2::Bootstrap bootstrap = parseBootstrapFromV2Yaml(yaml);
envoy::api::v2::Cluster::LbSubsetConfig* subset_config =
bootstrap.mutable_static_resources()->mutable_clusters(0)->mutable_lb_subset_config();
subset_config->set_fallback_policy(envoy::api::v2::Cluster::LbSubsetConfig::ANY_ENDPOINT);
subset_config->add_subset_selectors()->add_keys("x");
const std::string& policy_name = envoy::api::v2::Cluster_LbPolicy_Name(GetParam());
const std::string yaml = fmt::format(yamlPattern, policy_name);

create(bootstrap);
checkStats(1 /*added*/, 0 /*modified*/, 0 /*removed*/, 1 /*active*/, 0 /*warming*/);
if (GetParam() == envoy::api::v2::Cluster_LbPolicy_ORIGINAL_DST_LB ||
GetParam() == envoy::api::v2::Cluster_LbPolicy_CLUSTER_PROVIDED) {
EXPECT_THROW(create(parseBootstrapFromV2Yaml(yaml)), EnvoyException);
} else {
create(parseBootstrapFromV2Yaml(yaml));
checkStats(1 /*added*/, 0 /*modified*/, 0 /*removed*/, 1 /*active*/, 0 /*warming*/);

factory_.tls_.shutdownThread();
Upstream::ThreadLocalCluster* tlc = cluster_manager_->get("cluster_1");
EXPECT_NE(nullptr, tlc);

if (tlc) {
Upstream::LoadBalancer& lb = tlc->loadBalancer();
EXPECT_NE(nullptr, dynamic_cast<Upstream::SubsetLoadBalancer*>(&lb));
}

factory_.tls_.shutdownThread();
}
}

TEST_F(ClusterManagerImplTest, SubsetLoadBalancerRestriction) {
INSTANTIATE_TEST_SUITE_P(ClusterManagerSubsetInitializationTest,
ClusterManagerSubsetInitializationTest,
testing::ValuesIn(ClusterManagerSubsetInitializationTest::lbPolicies()),
ClusterManagerSubsetInitializationTest::paramName);

TEST_F(ClusterManagerImplTest, SubsetLoadBalancerOriginalDstRestriction) {
const std::string yaml = R"EOF(
static_resources:
clusters:
- name: cluster_1
connect_timeout: 0.250s
type: original_dst
lb_policy: original_dst_lb
lb_subset_config:
fallback_policy: ANY_ENDPOINT
subset_selectors:
- keys: [ "x" ]
)EOF";

envoy::config::bootstrap::v2::Bootstrap bootstrap = parseBootstrapFromV2Yaml(yaml);
envoy::api::v2::Cluster::LbSubsetConfig* subset_config =
bootstrap.mutable_static_resources()->mutable_clusters(0)->mutable_lb_subset_config();
subset_config->set_fallback_policy(envoy::api::v2::Cluster::LbSubsetConfig::ANY_ENDPOINT);
subset_config->add_subset_selectors()->add_keys("x");

EXPECT_THROW_WITH_MESSAGE(
create(bootstrap), EnvoyException,
create(parseBootstrapFromV2Yaml(yaml)), EnvoyException,
"cluster: cluster type 'original_dst' may not be used with lb_subset_config");
}

TEST_F(ClusterManagerImplTest, SubsetLoadBalancerClusterProvidedLbRestriction) {
const std::string yaml = R"EOF(
static_resources:
clusters:
- name: cluster_1
connect_timeout: 0.250s
type: static
lb_policy: cluster_provided
lb_subset_config:
fallback_policy: ANY_ENDPOINT
subset_selectors:
- keys: [ "x" ]
)EOF";

EXPECT_THROW_WITH_MESSAGE(
create(parseBootstrapFromV2Yaml(yaml)), EnvoyException,
"cluster: LB policy CLUSTER_PROVIDED cannot be combined with lb_subset_config");
}

TEST_F(ClusterManagerImplTest, SubsetLoadBalancerLocalityAware) {
const std::string yaml = R"EOF(
static_resources:
Expand All @@ -641,6 +707,11 @@ TEST_F(ClusterManagerImplTest, SubsetLoadBalancerLocalityAware) {
connect_timeout: 0.250s
type: STATIC
lb_policy: ROUND_ROBIN
lb_subset_config:
fallback_policy: ANY_ENDPOINT
subset_selectors:
- keys: [ "x" ]
locality_weight_aware: true
load_assignment:
endpoints:
- lb_endpoints:
Expand All @@ -656,12 +727,7 @@ TEST_F(ClusterManagerImplTest, SubsetLoadBalancerLocalityAware) {
port_value: 8001
)EOF";

envoy::config::bootstrap::v2::Bootstrap bootstrap = parseBootstrapFromV2Yaml(yaml);
envoy::api::v2::Cluster::LbSubsetConfig* subset_config =
bootstrap.mutable_static_resources()->mutable_clusters(0)->mutable_lb_subset_config();
subset_config->set_locality_weight_aware(true);

EXPECT_THROW_WITH_MESSAGE(create(bootstrap), EnvoyException,
EXPECT_THROW_WITH_MESSAGE(create(parseBootstrapFromV2Yaml(yaml)), EnvoyException,
"Locality weight aware subset LB requires that a "
"locality_weighted_lb_config be set in cluster_1");
}
Expand Down