Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http2: Remove RELEASE_ASSERTs in sendPendingFrames() error handling #13546

Merged
merged 5 commits into from
Oct 16, 2020

Conversation

yanavlasov
Copy link
Contributor

Commit Message:
Remove RELEASE_ASSERTs in sendPendingFrames() error handling

Additional Description:
This PR removes RELEASE_ASSERT on the status of the sendPendingFrames() call and replaces it with the error handling that schedules callback to disconnect connection. The change makes sendPendingFrames() for server codecs to always return the status of the protocol constraint check regardless of the dispatching/non-dispatching context.

During tests I have discovered that returning an error from the onSend and onDataSourceSend callbacks in non-dispatching context does not work well because it aborts the processing of the nghttp2_session_send() and as a result any nghttp2 callbacks that may have been invoked on pending frames will not be called. This is a problem for frames like RST_STREAM and GOAWAY because they invoke callbacks in codecs and http connection manager.

As a result I have unwound the code that returns errors from the onSend and onDataSourceSend callbacks when outbound frame queue limits had been exceeded and instead let the nghttp2_session_send() complete normally and check for protocol constraint violations after it has completed.

Risk Level: Medium
Testing: Unit Tests
Docs Changes: N/A
Release Notes: N/A
Fixes #12280

Signed-off-by: Yan Avlasov yavlasov@google.com

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wohoo! I'm all about reduced release asserts :-)

auto status = parent_.addOutboundFrameFragment(output, framehd, FRAME_HEADER_SIZE);
if (!status.ok()) {
parent_.addOutboundFrameFragment(output, framehd, FRAME_HEADER_SIZE);
if (!parent_.protocol_constraints_.checkOutboundFrameLimits().ok()) {
ENVOY_CONN_LOG(debug, "error sending data frame: Too many frames in the outbound queue",
parent_.connection_);
setDetails(Http2ResponseCodeDetails::get().outbound_frame_flood);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be an error return for this case, or should this be void and the call site always return 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error will be handled in the sendPendingFrames(). I've made this function void.

// See comment in the `encodeHeadersBase()` method about this RELEASE_ASSERT.
RELEASE_ASSERT(status.ok(), "sendPendingFrames() failure in non dispatching context");
parent_.checkProtocolConstraintViolation();
parent_.sendPendingFramesAndHandleError();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question before I review farther, does it make sense to continue processing if an error handler has been scheduled? Most of these are the last function called.

Also, where we sendPendingFramesAndHandleError, are we confident we have tests for error handling at each call site? Would it make sense to always
if (sendPendingFramesAndHandleError()) {
return;
}
so we can coverage-verify the error handling is tested? Or given these replace release asserts are we asserting they can't happen and mainly hoping they don't cause issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think in some cases additional processing, like scheduling timers is not needed. I have changed these paths.
I have added in last few PRs 16 new integration test cases for all invocations of the sendPendingFrames(). To the best of my analysis all call sites in all possible contexts are covered. It does not mean that I'm right of course. So I've added the if clause as you have suggested and will check the coverage output.

Signed-off-by: Yan Avlasov <yavlasov@google.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work!!! LGTM. Code is much easier to understand now. Will defer to @alyssawilk and others for any final comments.

Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

Copy link
Contributor

@antoniovicente antoniovicente left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice change. Go go improvements to H2 codec safety.

bool ConnectionImpl::sendPendingFramesAndHandleError() {
if (!sendPendingFrames().ok()) {
scheduleProtocolConstraintViolationCallback();
return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning true on error is a bit unusual, but you are documenting the return value so it seems fine.

@yanavlasov
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Signed-off-by: Yan Avlasov <yavlasov@google.com>
@mattklein123
Copy link
Member

I think CI issues should be fixed. Let's try one more main merge?

/wait

Signed-off-by: Yan Avlasov <yavlasov@google.com>
@mattklein123
Copy link
Member

Merge main once #13598 merges. Thanks!

/wait

Signed-off-by: Yan Avlasov <yavlasov@google.com>
@mattklein123 mattklein123 merged commit 11289b9 into envoyproxy:master Oct 16, 2020
mpuncel added a commit to mpuncel/envoy that referenced this pull request Oct 17, 2020
* master: (22 commits)
  delay health checks until transport socket secrets are ready. (envoyproxy#13516)
  test, oauth2: Make sure config test runs field validation (envoyproxy#13496)
  [http] swap codec implementations to default new (envoyproxy#13579)
  wasm: update proxy-wasm-cpp-host (envoyproxy#13606)
  postgres: do not copy and linearize received data when it is not going to be used (envoyproxy#13393)
  configs: Update configs v2 -> v3 (envoyproxy#13562)
  http2: Remove RELEASE_ASSERTs in sendPendingFrames() error handling (envoyproxy#13546)
  dependencies: track untracked implied dependencies, wrapup dashboard. (envoyproxy#13571)
  listener: add match all filter chain (envoyproxy#13449)
  fix mistakes in docstrings (envoyproxy#13603)
  ratelimit: add route entry metadata to ratelimit actions (envoyproxy#13269)
  cluster manager: avoid immediate activation for dynamic inserted cluster when initialize (envoyproxy#12783)
  ext_authz: Avoid calling check multiple times (envoyproxy#13288)
  docs: Unexclude remaining configs from validation (envoyproxy#13534)
  build: update rules_rust to allow Rustc in RBE (envoyproxy#13595)
  docs: Update sphinxext.rediraffe (envoyproxy#13589)
  Deprecate moonjit support on Windows before beta (envoyproxy#13541)
  dependencies: bump LuaJIT to 2.1 branch HEAD @ e9af1ab. (envoyproxy#13474)
  docs: add TLS stats to cluster stats doc (envoyproxy#13561)
  ci: stop building alpine-debug images in favor of ubuntu-based debug image (envoyproxy#13598)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
@yanavlasov yanavlasov deleted the remove-release-assert branch February 1, 2021 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[HTTP/2] implement protocol error handling out of dispatch context
4 participants