
GetBody on client Requests #543

Closed
wants to merge 390 commits into from

Conversation

emcfarlane
Contributor

@emcfarlane emcfarlane commented Jul 6, 2023

Enable client request retries by setting request.GetBody, which allows retryable errors to trigger a retry. We create a new temporary buffer for the request writer. On write, both the stream and the buffer are written to. If the buffer grows too large, or once the response has been returned, the buffer is discarded; after that, writes skip the buffer but continue to pass through to the underlying stream.
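To ground this, here is a minimal, self-contained sketch (not this PR's code) of the mechanism being relied on: when request.GetBody is set, Go's HTTP transports can obtain a fresh copy of the body and replay the request after a retryable failure such as a GOAWAY. The URL and payload below are placeholders.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// newReplayableRequest builds a request whose body the transport can re-read.
func newReplayableRequest(url string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	// http.NewRequest already sets GetBody for *bytes.Reader bodies; it is
	// spelled out here to show what this PR arranges for streamed bodies.
	req.GetBody = func() (io.ReadCloser, error) {
		return io.NopCloser(bytes.NewReader(payload)), nil
	}
	return req, nil
}

func main() {
	req, err := newReplayableRequest("https://example.com/ping.v1.PingService/Ping", []byte(`{}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
}
```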

Tested by sending two requests over the same connection and triggering a GOAWAY by setting "Connection" to "close". The streams after the GOAWAY are retried.

Measured overhead with the connect benchmark. More testing is needed (the sample is too small), but the pipe rewrite saved some allocations.

$ go test -v -run=^$ -bench=BenchmarkConnect -benchmem -benchtime=5s
- BenchmarkConnect/unary-8                     477           2174951 ns/op         5709248 B/op        233 allocs/op
+ BenchmarkConnect/unary-8                     597           1983015 ns/op         5610329 B/op        230 allocs/op

Fixes https://github.com/bufbuild/connect-go/issues/541

doriable and others added 30 commits April 6, 2022 18:12
The more I look at it, the more convinced I am that this option is a bad
idea. It's very unclear what it's trying to accomplish, and there are
many better options:

* Limiting heap usage? Use the upcoming soft memory limit APIs
  (golang/go#48409).
* Limiting network I/O? Use `http.MaxBytesReader` and set a per-stream
  limit.
* Banning "large messages"? Be clear what you mean, and use
  `unsafe.SizeOf` or `proto.Size` in an interceptor.

Basically, the behavior here (and in grpc-go) is an incoherent middle
ground between Go runtime settings, HTTP-level settings, and a vague "no
large messages" policy.

I'm doubly sure we should delete this because we've decided not to
expose the metrics to track how close users are to the configured limit
:)
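As an aside, a hypothetical sketch of the per-stream `http.MaxBytesReader` alternative mentioned in the list above (the handler, route, and 4 MiB limit are illustrative, not anything in this repository):

```go
package main

import (
	"io"
	"net/http"
)

const maxStreamBytes = 4 << 20 // hypothetical 4 MiB per-stream limit

func limitBody(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// Reads beyond the limit return an error and ask the server to close
		// the connection; handlers can treat the body like any other reader.
		r.Body = http.MaxBytesReader(w, r.Body, maxStreamBytes)
		next.ServeHTTP(w, r)
	})
}

func handler(w http.ResponseWriter, r *http.Request) {
	if _, err := io.ReadAll(r.Body); err != nil {
		http.Error(w, "request body too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.ListenAndServe(":8080", limitBody(http.HandlerFunc(handler)))
}
```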
The gRPC-Web specification is explicitly a description of the reference
implementations rather than a proper specification. Cross-testing
reveals that the gRPC-Web JS expects trailers-only responses to have the
trailing metadata sent as HTTP headers (what a mouthful).
This code used to be in a separate package, so we were doing this by
hand. Using the helper is just as fast and less verbose.
Discard is designed to throw away the request body. Typically, we've
encountered an error and we're trying to get to the HTTP trailers or
we're making a best effort to re-use TCP connections. In a few places,
though, it makes sense to bubble errors in discard further up.
Looking at the compression code again, we're not getting much value from
generics. `WithCompression` is also the only generic `Option`, which is
a little weird.

This PR changes the compression pools to work with interfaces instead,
which makes them quite a bit simpler. They're just as resistant to user
error, but ever so slightly easier for us to mess up; I think it's a
worthwhile tradeoff for the simplicity.

The PR also handles errors from the pool in `protocol_grpc_lpm.go`.
Against my better judgment, we'll just kick some of these errors back to
the caller - it's not worth a special hook just to log this one error.
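For a rough idea of the interface-based direction (the type and method names below are invented for illustration and are not connect-go's internals), a pool can hand out concrete gzip writers and take them back after use:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
	"sync"
)

// compressionPool recycles gzip writers behind a plain, non-generic API.
type compressionPool struct {
	writers sync.Pool
}

func newGzipPool() *compressionPool {
	return &compressionPool{
		writers: sync.Pool{
			New: func() any { return gzip.NewWriter(io.Discard) },
		},
	}
}

// Compress streams src through a pooled gzip writer into dst.
func (p *compressionPool) Compress(dst io.Writer, src io.Reader) error {
	gw := p.writers.Get().(*gzip.Writer)
	defer p.writers.Put(gw)
	gw.Reset(dst)
	if _, err := io.Copy(gw, src); err != nil {
		return err
	}
	return gw.Close()
}

func main() {
	var out bytes.Buffer
	pool := newGzipPool()
	if err := pool.Compress(&out, strings.NewReader("hello, world")); err != nil {
		panic(err)
	}
	fmt.Println("compressed bytes:", out.Len())
}
```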
The termination condition on this while loop was incorrect; I think this
snuck in during an automated rename of a super-short variable.
* Add test for errors marshaling protobuf Status

If a user-supplied Codec errors when marshaling a protobuf Status
message (which we use internally when converting errors to HTTP
trailers), we currently drop all the error information on the floor and
return "Unknown: EOF".

This commit adds a reproduction as a test case. Relates to #197.

* Use CodeInternal for errors marshaling proto Status

If we can't marshal a protobuf Status message, send CodeInternal to the
client with some details.

* Fix error-wrapping lint
So that we can share code between our implementations of the Connect and
gRPC protocols, factor enveloping (aka length-prefixed messages) and
compression out of the gRPC-specific code.
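For reference, a small sketch of the length-prefixed "envelope" framing this commit refers to, as defined by the gRPC wire protocol: one flag byte (bit 0 marks a compressed payload), a four-byte big-endian length, then the payload. This is an illustration, not connect-go's internal code:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"io"
)

// writeEnvelope frames payload as a length-prefixed message:
// 1 flag byte (bit 0 = compressed) + 4-byte big-endian length + payload.
func writeEnvelope(w io.Writer, payload []byte, compressed bool) error {
	var head [5]byte
	if compressed {
		head[0] = 1
	}
	binary.BigEndian.PutUint32(head[1:], uint32(len(payload)))
	if _, err := w.Write(head[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

func main() {
	var buf bytes.Buffer
	if err := writeEnvelope(&buf, []byte("ping"), false); err != nil {
		panic(err)
	}
	fmt.Println(hex.EncodeToString(buf.Bytes())) // prints 000000000470696e67 (flag, length, "ping")
}
```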
* Add test for unary RPC with zero-byte messages

* Add failing test for handler timeout handling

We're parsing timeouts, but not properly propagating them into user
code. Thanks, crosstests!

* Fix timeout handling

Since we know the shape of the Connect protocol, we can simplify the
protocol interfaces and move some shared utility functions into
`protocol.go`. This also fixes server-side timeout handling.

* Keep Accept-Post string manipulation shorter

We're only doing this at startup, so it's okay to make it slow. The code
doesn't get much shorter, but it's arguably more readable.

* Add indirection to constant limit

Move the literal for our discard limit into a constant.
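Tying back to the "Fix timeout handling" item above, a minimal sketch (hypothetical helper names, not this PR's code) of what propagating a parsed timeout into user code amounts to on the server side:

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

// withParsedTimeout applies an already-parsed protocol timeout to the request,
// so handler code observes it via r.Context().Deadline().
func withParsedTimeout(r *http.Request, timeout time.Duration) (*http.Request, context.CancelFunc) {
	if timeout <= 0 {
		return r, func() {}
	}
	ctx, cancel := context.WithTimeout(r.Context(), timeout)
	return r.WithContext(ctx), cancel
}

func main() {
	req, _ := http.NewRequest(http.MethodPost, "https://example.com/Ping", nil)
	req, cancel := withParsedTimeout(req, 250*time.Millisecond)
	defer cancel()
	deadline, ok := req.Context().Deadline()
	fmt.Println(ok, time.Until(deadline) > 0)
}
```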
…nable to be sent (#211)

Co-authored-by: Steve Ayers <sayers@uber.com>
In preparation for factoring out some of the complicated HTTP stuff
happening in the gRPC client stream implementation, move timeout
encoding to a less-convoluted portion of the code.
The most complex portion of the client-side gRPC sender and receiver is
the HTTP layer, where we use `io.Pipe` to create a streaming request
body. So that we can reuse this code for the Connect protocol, factor it
out of the gRPC-specific code. This has the side effect of simplifying
the gRPC implementation and more clearly separating the gRPC and
gRPC-Web protocols.
The params structs for protocols are somewhat proven now, so we can stop
hand-copying them into protocol-specific structs and embed instead.
Review threads on duplex_http_call_test.go and duplex_http_call.go (outdated, resolved).
@emcfarlane emcfarlane changed the title from "WriteBuffer for client requests" to "GetBody on client Requests" on Jul 7, 2023
Must copy reads not writes on io.Pipe, 1-1 favouring read side.
Member

@jhump jhump left a comment

I think this approach has a couple of fundamental flaws:

  1. We don't know message boundaries at the point where we are buffering. This matters because ideally we would always save at least the first message for any RPC. If we can save it without making a copy, then we aren't actually using any more memory (we're just keeping the first message(s) pinned to the heap a little longer).
  2. There is more copying and re-allocation of buffers than is necessary. It looks like this is done (1) to continue using io.Pipe instead of an alternate mechanism to deliver data, and (2) to preserve the current pattern where the writer, rather than the reader, releases a buffer back to the pool. If we change those two assumptions, I think a much more efficient solution is possible.

@emcfarlane
Contributor Author

@jhump thanks for reviewing.

  1. I was assuming the copies would be short-lived and hopefully small, like the copy of the buffer made when compressing, and it only applies to the client's request. For any retry logic we still need to limit the memory; copying the buffer effectively halves the usable retry limit.
  2. I agree this is a nicer solution.

@jhump
Member

jhump commented Jul 7, 2023

the copies would be short lived and hopefully small

Unfortunately, Go's GC is not generational, so it being short-lived doesn't help much with GC pressure. As to whether they are small, that depends on the message sizes. This approach effectively double-allocates the bytes for the requests of every unary RPC, which feels kinda bad. While requests may usually be small, they could be quite large. So RPCs with large requests pay a proportionally bigger penalty with this approach.

@emcfarlane
Contributor Author

Closing in favour of the message queue.

@emcfarlane emcfarlane closed this Jul 7, 2023
@emcfarlane
Contributor Author

emcfarlane commented Jul 10, 2023

@jhump I think the initial memory analysis of this implementation was wrong. Currently, for each send, a message is written and its buffer is returned to the pool for reuse. With the queue implementation, each buffer that isn't returned to the pool accumulates. Buffering is therefore equivalent to accumulating buffers, minus the cost of the copy into the buffer, which is small. Having a single *bytes.Buffer for the buffer instead of a []*bytes.Buffer message queue also reduces the number of live buffers. I can use this implementation to benchmark the two. The queue will have a very similar interface, but Write([]byte) (int, error) will be replaced with WriteMessage(*bytes.Buffer) (int, error), plus the logic for conditionally freeing buffers.
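To make the proposed signature change concrete, a hypothetical sketch of the two shapes (names illustrative, not connect-go's actual internals):

```go
package sketch

import "bytes"

// Current shape: the caller keeps ownership of p and may reuse it as soon as
// Write returns, which forces the pipe to copy if it wants to retain bytes.
type streamWriter interface {
	Write(p []byte) (n int, err error)
}

// Proposed shape: ownership of buf transfers to the implementation, which can
// queue the message for retries and decide when to release it to the pool.
type messageWriter interface {
	WriteMessage(buf *bytes.Buffer) (n int, err error)
}
```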

For the GOAWAY condition, I think we will only need a small buffer to manage the race between writing to a Conn and receiving a GOAWAY frame from an earlier stream on the same Conn. This needs testing, but either implementation has to have a limit based on bytes. I don't think a time limit makes sense, as this would grow the heap excessively for a problem that I think only requires a small amount of buffering.

Going to add the following improvements to this PR:

  1. Add a method Rewind() bool to reset the reader and avoid the copy on error for GetBody (sketched below).
  2. Avoid buffering on Read for unary client requests by blocking on establishing the connection.
  3. Use a custom Pipe implementation to keep reads serialized with buffer writes using a single lock.
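A hypothetical illustration of item 1 (the type and field names are invented, not this PR's code): a reader that tees served bytes into a buffer and can rewind, so GetBody can replay the request without an extra copy on error.

```go
package sketch

import (
	"bytes"
	"io"
)

// rewindReader buffers everything it has handed out so the prefix can be
// replayed after a retryable error.
type rewindReader struct {
	src io.Reader    // remaining unread source
	buf bytes.Buffer // bytes already served to the transport
	pos int          // read offset into buf after a Rewind
}

// Read drains any rewound bytes first, then tees fresh bytes into buf.
func (r *rewindReader) Read(p []byte) (int, error) {
	if r.pos < r.buf.Len() {
		n := copy(p, r.buf.Bytes()[r.pos:])
		r.pos += n
		return n, nil
	}
	n, err := r.src.Read(p)
	if n > 0 {
		r.buf.Write(p[:n])
		r.pos = r.buf.Len()
	}
	return n, err
}

// Rewind resets the read position so the body can be replayed from the start;
// a real implementation would return false once the buffer had been discarded.
func (r *rewindReader) Rewind() bool {
	r.pos = 0
	return true
}
```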

@emcfarlane emcfarlane reopened this Jul 10, 2023
@jhump
Member

jhump commented Jul 10, 2023

Having a single *bytes.Buffer for the buffer instead of []*bytes.Buffer for the message queue also reduces the number of alive buffers.

This isn't really true. The vast majority of RPCs are unary, in which case we create a single buffer that goes into the queue. So with both approaches, we're keeping a single extra buffer that can't be released until later.

minus the cost of the copy to the buffer, which is small

This is an assumption that may not always hold. For example, we have several RPCs in the buf CLI that can send many megabytes in a unary request. If this situation were to occur in a high-volume server, we'd suddenly be adding lots of wasted copying as well as extra memory pressure of having to duplicate the request bytes in memory for every operation.

I don't think a time limit makes sense as this would grow the heap excessively

I don't follow. The time limit wouldn't grow the heap excessively, because there's a size limit already. The time limit is strictly about decreasing the time for which a buffer is pinned to the heap: it allows us to reclaim the buffer sooner for cases where the server takes a very long time to reply with a status code and headers.

@mattrobenolt
Contributor

@emcfarlane jumping in here late and haven't fully read through the implementation, but would splitting the path for unary requests and bidirectional requests simplify a solution here?

The abstraction of a pipe and a writable buffer sorta isn't needed to the same extent for a unary request, and I think that abstraction is what's making it a bit more complicated to support the GetBody call.

I was sorta tinkering with a separate unary_http_call.go implementation rather than this duplex variant to simplify and speed up the unary path, which is also likely a more common path.

I suspect splitting the behavior on this boundary would help since we should easily have the entire request buffered already.

I also apologize since I haven't taken the time to fully read everything; I just wanted to toss out my 2 cents for an alternative instead of trying to shove it into the current duplex behavior, which has extra complexities to handle bidi streams.

@emcfarlane
Contributor Author

Hey @mattrobenolt, it would be great to see the tinkering on the unary client implementation! The unary solution would avoid the need to maintain buffers between send/recvs. I'm hoping an optimised solution for the streaming case won't add any additional overhead for unary calls; if that's the case we can avoid special-pathing it, but that might not be realistic.

@jhump
Member

jhump commented Jul 10, 2023

@mattrobenolt, by coincidence, @emcfarlane and I were discussing something similar: the approach we're looking at here would improve the robustness for all kinds of calls, but it is more intrusive and thus higher risk. So perhaps a better tradeoff for the short-term is to make a more surgical fix just for unary calls.

@mattrobenolt
Contributor

mattrobenolt commented Jul 10, 2023

I'm hoping an optimised solution for the streaming case won't add any additional overhead for unary calls. If that's the case we can avoid special pathing it, but that might not be realistic.

That's kinda why I was going down this path. The overhead of the io.Pipe and using a goroutine for the request is kinda the biggest bottleneck for a unary request, and it's only needed to support bidi. If anything, I'd personally prefer swapping the idea around: be slightly less optimal for bidi and hyper-focused on client unary.

Also arguably, bidi is by nature generally long-running, and the allocations or setup required to set up the stream should be much less significant compared to clients doing thousands of unary RPCs per second. You probably aren't doing thousands or tens of thousands of bidi streams per second; that'd be weird.

@emcfarlane
Contributor Author

Putting this PR on hold. The next step is to move away from the Write interface to a WriteMessage one, which would hand over ownership of the message buffer and let us avoid copying buffers.
