tls: fix SslSocket read resumption after readDisable when processing the SSL record that contains the last bytes of the HTTP message #13234

antoniovicente · 2020-09-23T15:55:44Z

Commit Message:
tls: fix SslSocket read resumption after readDisable when processing the SSL record that contains the last bytes of the HTTP message

Drain SSL_pending additional bytes from RawSSL after hitting buffer high watermark to avoid read resumption error when all remaining bytes of an HTTP message are trapped in SSL internal buffers.
Additional Description:
Risk Level: medium, small localized fixed to IO resumption.
Testing: new unit and integration tests
Docs Changes: n/a
Release Notes: added
Fixes #12304

…fer high watermark to avoid read resumption error when all remaining bytes of an HTTP message are trapped in SSL internal buffers. Signed-off-by: Antonio Vicente <avd@google.com>

Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente · 2020-09-23T15:55:58Z

/assign @alyssawilk

alyssawilk

This is fantastic! thanks so much for sorting this out

alyssawilk · 2020-09-23T17:31:20Z

source/extensions/transport_sockets/tls/ssl_socket.cc

+        ASSERT(pending_bytes < 16 * 1024, "SSL record should be at most 16KB");
+        if (pending_bytes > 0) {
+          ASSERT(keep_reading);
+          reserve_size = pending_bytes;


So I get the need to drain from the ssl buffer. What I'm less sure of is if we explicitly set pending_bytes to the bytes already buffered, if we're guaranteed that SSL_read won't read more from transport, but will only consume the current record which is already buffered. I was think of paging boringssl devs for that, but I think it'd be better to just regression test the behavior.

Is it possible to have a test where we send [lots of data][evenmoredata]
and verify that if the first read buffers [lots of data] and we tickle this corner case to drain it, that we won't also read and buffer evenmoredata? Otherwise I'm concerned we'll trade lack of resumption for lack of ceding. I think HighWatermarkReadResumptionProcessingHeaders probably lets you visually verify, but if SSL_read got greedy I don't think that test would fail (I think we'd need a unit test, but correct me if I'm wrong!)

(NB: I don't know anything about how Envoy handles I/O.)

This is sufficient of an edge case on the abstraction that testing things seems worthwhile (for better or worse, OpenSSL's SSL API is supposed to look vaguely like a UNIX TCP socket which abstracts over most buffer handling). But SSL_read will currently never touch the transport if there's is already an unconsumed, decrypted record in the buffer. I wouldn't anticipate us changing that. Likewise, when readahead is off (currently always off and even if we did implement it, it'd be optional), non-zero SSL_pending implies there isn't also an undecrypted record in the buffer.

Though I think the intention of the API (especially if, like OpenSSL but unlike BoringSSL, you believe in renegotiation) is that callers not make as strong of assumptions about the correlation between SSL reads and transport reads. If there's a renegotiation (again, not applicable for BoringSSL), SSL_read may be blocked on the handshake, which may be blocked on transport write or some random callback. A more general implementation (closer to what Chromium does) might look something like:

RunReadStateMachine()

If we're not ready to consume read from SSL (application layer doesn't want data, too much is buffered already, etc.), return. Ensure that RunReadStateMachine() is called again when this changes.

Call SSL_read, independent of the state of the transport.

If SSL_read returned data, great! Pass along to the application. Go back to the top of RunReadStateMachine() in case the application still wants data.

If SSL_read failed with SSL_ERROR_WANT_READ, return. Ensure that RunReadStateMachine() is called again when the transport is readable.

Ditto for other SSL_ERROR_WANT_* codes that we care to support. Though without renego or client 0-RTT, I think only WANT_READ matters here.

This doesn't depend on the state of the buffers, and it's always safe to call RunReadStateMachine(). However, one difference is RunReadStateMachine() will not pass data to an overbuffered application, even if the data is secretly available in SSL_read already. I think this is fine and conceptually simpler, but Antonio mentioned to me the intent was to pass that data along if available?

alyssawilk: SslReadBufferLimitTest.DrainToSslRecordBoundary attempts to cover the case you describe by doing 10 10kb writes and verifying that the calls to doRead on the server connection with a configured high-watermark of 16KB deliver the data in 20KB chunks.

I reached out to davidben about to help confirm my understanding of the current implementation and how it's likely to evolve going forward since the fix implemented here ends up depending on some internal details how boringssl deals with partial reads. Seems like the assumptions I'm making are fairly reasonable, but we may want to also think about implications they have on the transport socket contract and how likely it is for other transport socket implementations to fall on similar traps.

There is one other possible way to fix resumption in this case: explicitly schedule read resumption in cases where the read buffer is empty.
Change set: master...antoniovicente:ssl_read_resumption_alternate
Related comments: #12304 (comment)
This approach seems similar to the idea that davidben describes on how to ideally interact with SSL read/write.

A benefit of this alternate approach is that imposes less strict requirements on transport socket implementations but it does involve some possibly spurious synthetic fd readable events. I have an old draft PR that optimizes away fd re-registrations on readDisable transitions which seems to fix this resumption issue, but its blocked on performance data and debugging of some test failures: master...antoniovicente:optimize_read_disable

I think I have a preference for @davidben solution as it avoid assuming something about internal behavior that may change. What if we add a method to the transport socket to tell us if the has some buffered bytes, in this case you can avoid useless read events.

@yanavlasov So it would be better to do something similar to this alternate solution? master...antoniovicente:ssl_read_resumption_alternate

The issue with a state machine that specific to SSL handling is that it doesn't generalize well to other transport socket implementations. I think the more eager event scheduling proposed in my alternate fix is the closest we can get to davidben's suggestions.

I'm mildly inclined to avoid wake-ups every time. If we don't think we can do a clean and cheap "is there buffered data to process" check I'd lean towards the original solution with extensive testing in case boring or openssl ever change.

I'd definitely like alternate opinions here on which of the ugly options are the least ugly. cc @mattklein123 @snowp @ggreenway for thoughts

I think this is related to the my question above. I'm confused why we need this in the first place. If the old impl set read to ready, why wouldn't the data be sucked out the next time read is ready?

@mattklein123 In the normal case the current resumption logic works correctly. The problem is when we factor in calls to readDisable(true) + readDisable(false) done as we trigger and untrigger the buffer high watermark

Calls to readDisable unregister and re-register the fd for events. When fds are unregistered, any pending synthetic events on the fd are cleared, so the synthetic read event scheduled via setReadBufferReady is never delivered. Resumption after readDisable(false) depends on either there being some bytes in the user-space read buffer or some bytes in the kernel's socket receive buffer. When handling SSL traffic it is possible to have readable bytes in SSL internal buffers while there are 0 bytes in the stream read buffer and 0 bytes in the kernel buffer; when that happens fd re-registration does not lead to a wakeup despite the transport socket being able to produce bytes.

Calls to readDisable unregister and re-register the fd for events. When fds are unregistered, any pending synthetic events on the fd are cleared, so the synthetic read event scheduled via setReadBufferReady is never delivered. Resumption after readDisable(false) depends on either there being some bytes in the user-space read buffer or some bytes in the kernel's socket receive buffer. When handling SSL traffic it is possible to have readable bytes in SSL internal buffers while there are 0 bytes in the stream read buffer and 0 bytes in the kernel buffer; when that happens fd re-registration does not lead to a wakeup despite the transport socket being able to produce bytes.

Thanks this makes sense. At minimum, can you add more comments?

Another implementation idea: what if we had a transport socket specific way of asking "are there more bytes?" This could be overridden for TLS to look for kernel/user/within SSL. Then on readEnable() the check would pass and we would read again?

See #13772 instead. Your comment inspired me to think a bit more about cases where the resumption event is lost due to calls to readDisable. The connection impl can detect and remember when that happens and schedule resumption when needed.

alyssawilk · 2020-09-23T17:36:44Z

test/extensions/transport_sockets/tls/integration/ssl_integration_test.cc

+// Regression test for https://github.com/envoyproxy/envoy/issues/12304
+TEST_P(RawWriteSslIntegrationTest, HighWatermarkReadResumptionProcessingHeaders) {
+  // The raw writer will perform a separate SSL_write for each of the chunks below. Chunk sizes were
+  // picked such that the connection's high watermark will trigger while processing the last SSL


Is there a way to make sure this remains true even if we changed the default buffer limits / read limits? I think probably the unit test would do it.

The test SslReadBufferLimitTest.DrainToSslRecordBoundary further down ensures that when processing a stream of 10KB SSL records with a read limit of 16KB we end up reading 2 full records per read operation (e.i. 20KB)

That was one of the better unit test ideas that I could come up with. Another potentially good unit test would be to create a full server connection with SSL transport and verify behavior when readDisable(true) + readDisable(false) is called after each of the read wakeups, and that the drain operation runs to completion over a sequence of wakeups. I'll try to get that implemented.

alyssawilk · 2020-09-23T17:38:28Z

test/integration/utility.h

      last_connection_event_ = event;
      closed_ |= (event == Network::ConnectionEvent::RemoteClose ||
                  event == Network::ConnectionEvent::LocalClose);
      connected_ |= (event == Network::ConnectionEvent::Connected);
    }
-    void onAboveWriteBufferHighWatermark() override {}
-    void onBelowWriteBufferLowWatermark() override {}
+    void onAboveWriteBufferHighWatermark() override { high_watermark_triggered_ = true; }


Oh, or maybe you planned to use this to make sure it was regression-proof? Looks like it may not be used?

I thought I needed to pass this info to the write function, but it ended up not being necessary. Cleaned up.

Signed-off-by: Antonio Vicente <avd@google.com>

mattklein123 · 2020-09-30T15:47:16Z

source/extensions/transport_sockets/tls/ssl_socket.cc

+        // important to ensure read resumption works correctly after calls to
+        // Network::Connection::readDisable().
+        int pending_bytes = SSL_pending(rawSsl());
+        ASSERT(pending_bytes < 16 * 1024, "SSL record should be at most 16KB");


Should this be <= ?

The record size is at most 16KB, but at least 1 byte from the record must have been consumed by an earlier SSL_read, so SSL_pending can return at most 16KB-1

OK makes sense. Can you add a small comment and adjust the assert error to s/at most/less than?

mattklein123 · 2020-09-30T15:53:04Z

source/extensions/transport_sockets/tls/ssl_socket.cc

+        // internal buffers, and do one last read iteration if bytes are available. This is
+        // important to ensure read resumption works correctly after calls to
+        // Network::Connection::readDisable().


Do you mind adding a bit more color here on the control flow that requires this? Naively I would assume that in the previous logic we would signal to read again later, then we would attempt to read again, and get the data?

mattklein123 · 2020-09-30T15:56:15Z

source/extensions/transport_sockets/tls/ssl_socket.cc

+        ASSERT(pending_bytes < 16 * 1024, "SSL record should be at most 16KB");
+        if (pending_bytes > 0) {
+          ASSERT(keep_reading);
+          reserve_size = pending_bytes;


I think this is related to the my question above. I'm confused why we need this in the first place. If the old impl set read to ready, why wouldn't the data be sucked out the next time read is ready?

stale · 2020-10-11T11:45:40Z

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

antoniovicente · 2020-10-12T22:43:25Z

Back from vacation, will continue working on this PR soon.

antoniovicente · 2020-10-28T01:20:51Z

Abandoning this change in favor of #13772

antoniovicente added 2 commits September 23, 2020 03:42

tls: Drain SSL_pending additional bytes from RawSSL after hitting buf…

bc56faa

…fer high watermark to avoid read resumption error when all remaining bytes of an HTTP message are trapped in SSL internal buffers. Signed-off-by: Antonio Vicente <avd@google.com>

Add release note

83a8dce

Signed-off-by: Antonio Vicente <avd@google.com>

antoniovicente requested review from lizan and PiotrSikora as code owners September 23, 2020 15:55

repokitteh-read-only bot assigned alyssawilk Sep 23, 2020

alyssawilk reviewed Sep 23, 2020

View reviewed changes

alyssawilk assigned ggreenway Sep 23, 2020

cleanup unused class members.

58a818c

Signed-off-by: Antonio Vicente <avd@google.com>

yanavlasov self-assigned this Sep 24, 2020

mattklein123 reviewed Sep 30, 2020

View reviewed changes

stale bot added the stale stalebot believes this issue/PR has not been touched recently label Oct 11, 2020

stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Oct 12, 2020

antoniovicente added the waiting label Oct 12, 2020

antoniovicente closed this Oct 28, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

tls: fix SslSocket read resumption after readDisable when processing the SSL record that contains the last bytes of the HTTP message #13234

tls: fix SslSocket read resumption after readDisable when processing the SSL record that contains the last bytes of the HTTP message #13234

antoniovicente commented Sep 23, 2020

antoniovicente commented Sep 23, 2020

alyssawilk left a comment

alyssawilk Sep 23, 2020

davidben Sep 23, 2020 •

edited

Loading

antoniovicente Sep 24, 2020

yanavlasov Sep 24, 2020

antoniovicente Sep 25, 2020

alyssawilk Sep 28, 2020

mattklein123 Sep 30, 2020

antoniovicente Oct 1, 2020

mattklein123 Oct 1, 2020

antoniovicente Oct 27, 2020

alyssawilk Sep 23, 2020

antoniovicente Oct 1, 2020

alyssawilk Sep 23, 2020

antoniovicente Sep 24, 2020

mattklein123 Sep 30, 2020

antoniovicente Oct 1, 2020

mattklein123 Oct 1, 2020

mattklein123 Sep 30, 2020

mattklein123 Sep 30, 2020

stale bot commented Oct 11, 2020

antoniovicente commented Oct 12, 2020

antoniovicente commented Oct 28, 2020

tls: fix SslSocket read resumption after readDisable when processing the SSL record that contains the last bytes of the HTTP message #13234

tls: fix SslSocket read resumption after readDisable when processing the SSL record that contains the last bytes of the HTTP message #13234

Conversation

antoniovicente commented Sep 23, 2020

antoniovicente commented Sep 23, 2020

alyssawilk left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

davidben Sep 23, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stale bot commented Oct 11, 2020

antoniovicente commented Oct 12, 2020

antoniovicente commented Oct 28, 2020

davidben Sep 23, 2020 •

edited

Loading