Skip to content

Commit

Permalink
shared: don't infinitely retry on I/O timeout
Browse files Browse the repository at this point in the history
After the introduction of shared clients, we received reports of
excessive CPU usage. Through a debug logging build, the cause was
identified to be mishandling of "i/o timeout" errors, which are returned
from read (or write) calls when the read (or write) deadline of a
connection are exceeded.

In the existing code, once we hit such an I/O timeout, the receiving
goroutine was calling 'ReadMsg' in a busy loop, as the connection was
not closed, but reads were not succeeding either. No useful work was
performed anymore, until a further attempt at sending would reset the
read deadline (c.f. the 'SendContext' method). This, however, depends on
whether the retry occurs on the same five tuple, which isn't guaranteed.
Indeed, it seems more than conceivable that a DNS client might retry
with a different source port.

To remedy the issue, this patch adds a check for the exceeded deadline.
Unceremoniously, we shut down the handling goroutine if we hit this I/O
timeout. This then closes the connection, which also shuts down the
receiving goroutine.

Reported-by: John Watson <john@dctrwatson.com>
Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
  • Loading branch information
bimmlerd committed Nov 16, 2023
1 parent eaf71f6 commit 085befa
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions shared_client.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ import (
"errors"
"fmt"
"net"
"os"
"sync"
"time"
)
Expand Down Expand Up @@ -208,6 +209,11 @@ func handler(wg *sync.WaitGroup, client *Client, conn *Conn, requests chan reque
close(waiter.ch)
}
waitingResponses = make(map[uint16]waiter)
// If we've exceeded the read deadline, stop handling. The conn will be closed, and
// the receive goroutine cleaned up.
if errors.Is(resp.err, os.ErrDeadlineExceeded) {
return
}
} else if resp.msg != nil {
if waiter, ok := waitingResponses[resp.msg.Id]; ok {
delete(waitingResponses, resp.msg.Id)
Expand Down

0 comments on commit 085befa

Please sign in to comment.