Skip to content

Commit

Permalink
add detailed error explanations page (#3734)
Browse files Browse the repository at this point in the history
I was speaking to several customers that found our connection poisoning
terminology confusing. To address this, I've changed the messaging and
I've also created a new section in the docs that explains some errors in
greater detail. To direct users to these explanations, I've added a link
to the `tracing::info!` call.

I propose that we consider doing the same any time we identify error
messaging that
- must be kept succinct to avoid logging too much info
- must be more understandable to our users.

----

_By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice._
  • Loading branch information
Velfi authored Jul 3, 2024
1 parent dc1ffb8 commit fe1b341
Show file tree
Hide file tree
Showing 3 changed files with 43 additions and 4 deletions.
36 changes: 36 additions & 0 deletions design/src/client/detailed_error_explanations.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
# Detailed Error Explanations

This page collects detailed explanations for some errors. If you encounter an
error and are interested in learning more about what it means and why it occurs,
check here.

**If you can't find the explanation on this page, please file an issue asking
for it to be added.**

## "Connection encountered an issue and should not be re-used. Marking it for closure"

The SDK clients each maintain their own connection pool (_except when they share
an `HttpClient`_). By the convention of some services, when a request fails due
to a [transient error](#transient-errors), that connection should not be re-used
for a retry. Instead, it should be dropped and a new connection created instead.
This prevents clients from repeatedly sending requests over a failed connection.

This feature is referred to as "connection poisoning" internally.

## Transient Errors

When requests to a service time out, or when a service responds with a 500, 502,
503, or 504 error, it's considered a 'transient error'. Transient errors are
often resolved by making another request.

When retrying transient errors, the SDKs may avoid re-using connections to
overloaded or otherwise unavailable service endpoints, choosing instead to
establish a new connection. This behavior is referred to internally as
"connection poisoning" and is configurable.

To configure this behavior, set the [reconnect_mode][reconnect-mode] in an SDK
client config's [RetryConfig].

[file an issue]: https://github.com/smithy-lang/smithy-rs/issues/new?assignees=&labels=&projects=&template=blank_issue.md
[RetryConfig]: https://docs.rs/aws-types/latest/aws_types/sdk_config/struct.RetryConfig.html
[reconnect-mode]: https://docs.rs/aws-types/latest/aws_types/sdk_config/struct.RetryConfig.html#method.with_reconnect_mode
5 changes: 4 additions & 1 deletion rust-runtime/aws-smithy-runtime-api/src/client/connection.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,10 @@ pub struct ConnectionMetadata {
impl ConnectionMetadata {
/// Poison this connection, ensuring that it won't be reused.
pub fn poison(&self) {
tracing::info!("smithy connection was poisoned");
tracing::info!(
see_for_more_info = "https://smithy-lang.github.io/smithy-rs/design/client/detailed_error_explanations.html",
"Connection encountered an issue and should not be re-used. Marking it for closure"
);
(self.poison_fn)()
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -81,14 +81,14 @@ impl Intercept for ConnectionPoisoningInterceptor {
reconnect_mode == ReconnectMode::ReconnectOnTransientError;

if error_is_transient && connection_poisoning_is_enabled {
debug!("received a transient error, poisoning the connection...");
debug!("received a transient error, marking the connection for closure...");

if let Some(captured_connection) = captured_connection.and_then(|conn| conn.get()) {
captured_connection.poison();
debug!("the connection was poisoned")
debug!("the connection was marked for closure")
} else {
error!(
"unable to poison the connection because no connection was found! The underlying HTTP connector never set a connection."
"unable to mark the connection for closure because no connection was found! The underlying HTTP connector never set a connection."
);
}
}
Expand Down

0 comments on commit fe1b341

Please sign in to comment.