net/http: performance collapse when http/2 requests wait for connection #34944
Labels
NeedsInvestigation
Performance
Does this issue reproduce with the latest release? / What version of Go are you using (go version)?
The code that causes this behavior is present in Go 1.13 and in tip, but #34941 shadows the bug in releases newer than Go 1.11, so for today I'll demonstrate it with go1.11.13.
What operating system and processor architecture are you using (go env)?
What did you do?
In the test, I set up an http/2 server/client pair, limited the number of TCP connections that could carry HTTP requests and started a large number of HTTP requests. I then measured how long it took to create and cancel one additional request.
What did you expect to see?
I expected the speed of creating and canceling additional requests to not vary based on the number of other HTTP requests waiting on the TCP connection.
What did you see instead?
Creating and canceling an HTTP request gets slow as a function of how many requests are waiting for the same TCP connection when HTTP/2 is active.
The code in net/http/h2_bundle.go that bridges between the channel for canceling a single request and the sync.Cond that guards the TCP connection (net/http.http2ClientConn.awaitOpenSlotForRequest) responds to request cancelation by waking up every goroutine that's waiting to use the TCP connection (with sync.Cond.Broadcast). Each of those goroutines in sequence will acquire the lock on the *http2ClientConn to check if there's room to send another request.
On top of that, the contention on the sync.Mutex protecting a single connection results in a slowdown on the sync.Mutex protecting the Transport's HTTP/2 connection pool when *http2clientConnPool.getClientConn calls cc.idleState while holding the pool's lock.
In the reproducer, the baseline speed of creating and canceling an HTTP/2 request is 1ms since the test waits that long before canceling to give the RoundTrip goroutine time to find the TCP connection in the pool and start waiting for an available slot.
When there are a small number of outstanding requests (100 or 200), creating and canceling an additional request takes about 1.3ms: that baseline of 1ms plus 300µs of actual time.
As the number of outstanding requests grows past the capacity of the single TCP connection (the default Go http/2 server sets that to 250 HTTP requests), creating and canceling a request wakes up more and more goroutines. The cost of this is still small with 1600 idle requests (1.1ms over the 1ms baseline), but with 6400 idle requests it's grown to 5.9ms over the 1ms baseline. With 100k idle requests, the cost to cancel one is nearly one second of work.
With N idle requests that all time out / are canceled, the total cost is O(N^2), because each cancelation wakes every goroutine still waiting on the connection. The cost should be O(N).