Improved connection / stream results when wai applications throw exceptions. #740
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
After submitting your PR:
Overview
Our services operate behind Google load balancer instances, which assume that opened connections stay open for up to 10 minutes. In the event of uncaught exceptions being handled by Warp, the current behavior for HTTP/1 is to close the connection after sending the error response. The HTTP/2 code ignores the user-provided exception response handler from settings and closes the stream with an HTTP/2 stream
INTERNAL_ERROR
signal.This caused the issue for us on HTTP/1 connections that subsequent requests would attempt to reuse the closed connection and fail (since the connection was gone). For
GET
requests, that wasn't a huge deal, since the load balancer would replay them (yay idempotency 🎉). For non-idempotent requests, we'd just lose the data. In HTTP/2, the issue was just the inconvenience of not getting proper 500s back.Arguably according to the HTTP specs, Warp's current behavior is acceptable, and probably the easiest way to recover from any odd error states, but this PR attempts to make connection reuse a bit more robust so we can keep our architecture the same.
HTTP/1 Implementation
We need to distinguish between several failure modes:
Application
prior to returning a response.Settings
, can generate a well-formed response)Socket
s breaking while sending responseIOException
s from files referenced inResponseFile
. (for example: what happens if a while gets modified/deleted while it's being streamed as a result?)ResponseBuilder
orResponseStream
.The general heuristic here is that once we start to flush the response and an exception is thrown any time during the flushing process, consider the connection to be broken. In order to accomplish this, we wrap exceptions thrown while responding with
ExceptionInsideResponseBody
. If an uncaught exception has this wrapper, then we can close the connection, otherwise we can use the provided exception handler fromSettings
and keep the connection alive since we're not wedged in an irrecoverable state.HTTP/2 Implementation
HTTP/2 doesn't reuse streams, so if a stream breaks, new requests on the same connection are fine. However, we use similar logic to differentiate between exceptions thrown in the flushing process with the same
ExceptionInsideResponseBody
.Before:
After: