Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: change default initial fetch timeout to 15s #7571

Merged
3 changes: 1 addition & 2 deletions api/envoy/api/v2/core/config_source.proto
Original file line number Diff line number Diff line change
Expand Up @@ -112,13 +112,12 @@ message ConfigSource {
AggregatedConfigSource ads = 3;
}

// Optional initialization timeout.
// When this timeout is specified, Envoy will wait no longer than the specified time for first
// config response on this xDS subscription during the :ref:`initialization process
// <arch_overview_initialization>`. After reaching the timeout, Envoy will move to the next
// initialization phase, even if the first config is not delivered yet. The timer is activated
// when the xDS API subscription starts, and is disarmed on first config update or on error. 0
// means no timeout - Envoy will wait indefinitely for the first xDS config (unless another
// timeout applies). Default 0.
// timeout applies). The default is 15s.
google.protobuf.Duration initial_fetch_timeout = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a constraint to make this non-zero if it is set?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about it, but decided against it because it may allow people get back old behaviour it they need. I was thinking of updating the doc with that info. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc already exists
// 0 means no timeout - Envoy will wait indefinitely for the first xDS config (unless another
// timeout applies)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK sounds good.

}
18 changes: 11 additions & 7 deletions docs/root/intro/arch_overview/operations/init.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,20 +10,24 @@ accepting new connections.
* During startup, the :ref:`cluster manager <arch_overview_cluster_manager>` goes through a
multi-phase initialization where it first initializes static/DNS clusters, then predefined
:ref:`EDS <arch_overview_dynamic_config_eds>` clusters. Then it initializes
:ref:`CDS <arch_overview_dynamic_config_cds>` if applicable, waits for one response (or failure),
:ref:`CDS <arch_overview_dynamic_config_cds>` if applicable, waits for one response (or failure)
for a :ref:`bounded period of time <envoy_api_field_core.ConfigSource.initial_fetch_timeout>`,
and does the same primary/secondary initialization of CDS provided clusters.
* If clusters use :ref:`active health checking <arch_overview_health_checking>`, Envoy also does a
single active health check round.
* Once cluster manager initialization is done, :ref:`RDS <arch_overview_dynamic_config_rds>` and
:ref:`LDS <arch_overview_dynamic_config_lds>` initialize (if applicable). The server
doesn't start accepting connections until there has been at least one response (or failure) for
LDS/RDS requests.
* If LDS itself returns a listener that needs an RDS response, Envoy further waits until an RDS
:ref:`LDS <arch_overview_dynamic_config_lds>` initialize (if applicable). The server waits
for a :ref:`bounded period of time <envoy_api_field_core.ConfigSource.initial_fetch_timeout>`
for at least one response (or failure) for LDS/RDS requests. After which, it starts accepting connections.
* If LDS itself returns a listener that needs an RDS response, Envoy further waits for
a :ref:`bounded period of time <envoy_api_field_core.ConfigSource.initial_fetch_timeout>` until an RDS
response (or failure) is received. Note that this process takes place on every future listener
addition via LDS and is known as :ref:`listener warming <config_listeners_lds>`.
* After all of the previous steps have taken place, the listeners start accepting new connections.
This flow ensures that during hot restart the new process is fully capable of accepting and
processing new connections before the draining of the old process begins.

All mentioned "waiting for one response" periods can be limited by setting corresponding
:ref:`initial_fetch_timeout <envoy_api_field_core.ConfigSource.initial_fetch_timeout>`.
A key design principle of initialization is that an Envoy is always guaranteed to initialize within
:ref:`initial_fetch_timeout <envoy_api_field_core.ConfigSource.initial_fetch_timeout>`,
with a best effort made to obtain the complete set of xDS configuration within that subject to the
management server availability.
1 change: 1 addition & 0 deletions docs/root/intro/version_history.rst
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ Version history

1.12.0 (pending)
================
* config: changed the default value of :ref:`initial_fetch_timeout <envoy_api_field_core.ConfigSource.initial_fetch_timeout>` from 0s to 15s. This is a change in behaviour in the sense that Envoy will move to the next initialization phase, even if the first config is not delivered in 15s. Refer to :ref:`initialization process <arch_overview_initialization>` for more details.
* config: async data access for local and remote data source.
* http: added the ability to reject HTTP/1.1 requests with invalid HTTP header values, using the runtime feature `envoy.reloadable_features.strict_header_validation`.

Expand Down
2 changes: 1 addition & 1 deletion source/common/config/utility.cc
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,7 @@ std::chrono::milliseconds Utility::apiConfigSourceRequestTimeout(
std::chrono::milliseconds
Utility::configSourceInitialFetchTimeout(const envoy::api::v2::core::ConfigSource& config_source) {
return std::chrono::milliseconds(
PROTOBUF_GET_MS_OR_DEFAULT(config_source, initial_fetch_timeout, 0));
PROTOBUF_GET_MS_OR_DEFAULT(config_source, initial_fetch_timeout, 15000));
}

void Utility::translateRdsConfig(
Expand Down
6 changes: 3 additions & 3 deletions test/common/config/subscription_factory_impl_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -226,7 +226,7 @@ TEST_F(SubscriptionFactoryTest, HttpSubscriptionCustomRequestTimeout) {
EXPECT_CALL(cm_, clusters()).WillOnce(Return(cluster_map));
EXPECT_CALL(cluster, info()).Times(2);
EXPECT_CALL(*cluster.info_, addedViaApi());
EXPECT_CALL(dispatcher_, createTimer_(_));
EXPECT_CALL(dispatcher_, createTimer_(_)).Times(2);
EXPECT_CALL(cm_, httpAsyncClientForCluster("static_cluster"));
EXPECT_CALL(
cm_.async_client_,
Expand All @@ -246,7 +246,7 @@ TEST_F(SubscriptionFactoryTest, HttpSubscription) {
EXPECT_CALL(cm_, clusters()).WillOnce(Return(cluster_map));
EXPECT_CALL(cluster, info()).Times(2);
EXPECT_CALL(*cluster.info_, addedViaApi());
EXPECT_CALL(dispatcher_, createTimer_(_));
EXPECT_CALL(dispatcher_, createTimer_(_)).Times(2);
EXPECT_CALL(cm_, httpAsyncClientForCluster("static_cluster"));
EXPECT_CALL(cm_.async_client_, send_(_, _, _))
.WillOnce(Invoke([this](Http::MessagePtr& request, Http::AsyncClient::Callbacks&,
Expand Down Expand Up @@ -301,7 +301,7 @@ TEST_F(SubscriptionFactoryTest, GrpcSubscription) {
return async_client_factory;
}));
EXPECT_CALL(random_, random());
EXPECT_CALL(dispatcher_, createTimer_(_));
EXPECT_CALL(dispatcher_, createTimer_(_)).Times(2);
EXPECT_CALL(callbacks_, onConfigUpdateFailed(_));
subscriptionFromConfigSource(config)->start({"static_cluster"});
}
Expand Down
2 changes: 1 addition & 1 deletion test/common/config/utility_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ TEST(UtilityTest, ApiConfigSourceRequestTimeout) {

TEST(UtilityTest, ConfigSourceDefaultInitFetchTimeout) {
envoy::api::v2::core::ConfigSource config_source;
EXPECT_EQ(0, Utility::configSourceInitialFetchTimeout(config_source).count());
EXPECT_EQ(15000, Utility::configSourceInitialFetchTimeout(config_source).count());
}

TEST(UtilityTest, ConfigSourceInitFetchTimeout) {
Expand Down