-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor client RPC timeouts #14965
Refactor client RPC timeouts #14965
Conversation
76d85fc
to
03c53f9
Compare
03c53f9
to
ed4c9ca
Compare
ed4c9ca
to
41a7c7a
Compare
41a7c7a
to
40e5e49
Compare
@@ -364,7 +364,7 @@ func (p *ConnPool) dial( | |||
tlsRPCType RPCType, | |||
) (net.Conn, HalfCloser, error) { | |||
// Try to dial the conn | |||
d := &net.Dialer{LocalAddr: p.SrcAddr, Timeout: p.Timeout} | |||
d := &net.Dialer{LocalAddr: p.SrcAddr, Timeout: DefaultDialTimeout} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was changed in #11500 but I reverted it back to DefaultDialTimeout
to be consistent with other usages.
func (c *TimeoutConn) Read(b []byte) (int, error) { | ||
timeout := c.DefaultTimeout | ||
// Apply timeout to first read then zero it out | ||
if c.FirstReadTimeout > 0 { | ||
timeout = c.FirstReadTimeout | ||
c.FirstReadTimeout = 0 | ||
} | ||
var deadline time.Time | ||
if timeout > 0 { | ||
deadline = time.Now().Add(timeout) | ||
} | ||
if err := c.Conn.SetReadDeadline(deadline); err != nil { | ||
return 0, err | ||
} | ||
return c.Conn.Read(b) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This type was introduced in #11500 but while debugging timeout issues I found this difficult to understand.
Now, instead of a wrapped net.Conn
which sets a deadline per Read
call, we simply set the deadline during the rpc
call
type BlockableQuery interface { | ||
// BlockingTimeout returns duration > 0 if the query is blocking. | ||
// Otherwise returns 0 for non-blocking queries. | ||
BlockingTimeout(maxQueryTime, defaultQueryTime time.Duration) time.Duration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that narrowly defining this interface was better than giving RPCInfo
a Timeout
method and forcing every type to return a timeout (in most cases is rpc_hold_timeout)
agent/consul/client_test.go
Outdated
var out struct{} | ||
err := c1.RPC("Long.Wait", &structs.NodeSpecificRequest{}, &out) | ||
require.Error(t, err) | ||
require.Contains(t, err.Error(), "rpc error making call: i/o deadline reached") | ||
// We use structs.KVSRequest, which does not implement pool.BlockableQuery | ||
// and should have no timeouts defined. | ||
require.NoError(t, c1.RPC("Long.Wait", &structs.KVSRequest{}, &out)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to sanity check that the setting the deadline on the stream connection did not persist when the client was cached in the connection pool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add the new field to api?
No, the new fields are config and don't need to be in api |
@@ -31,7 +32,7 @@ type muxSession interface { | |||
|
|||
// streamClient is used to wrap a stream with an RPC client | |||
type StreamClient struct { | |||
stream *TimeoutConn | |||
stream net.Conn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simply reverting a change from #11500
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks good.
Couple of small fixes noted inline but should be quick!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amazing job @kisunji 🥳
Thanks for patiently going through those extra bits to add reloading etc. This seems solid now!
Co-authored-by: Paul Banks <banks@banksco.de>
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d) Co-authored-by: Chris S. Kim <ckim@hashicorp.com>
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Fix an issue where rpc_hold_timeout was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling with rpc_hold_timeout. A new configuration `rpc_read_timeout` is created. Refactor some implementation from the original PR 11500 to remove the misleading linkage between RPCInfo's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts. (cherry picked from commit 29a297d)
Description
This PR has two objectives:
rpc_hold_timeout
was being used as the timeout for non-blocking queries. Users should be able to tune read timeouts without fiddling withrpc_hold_timeout
. A new configurationrpc_client_timeout
was created.RPCInfo
's timeout (used to retry in case of certain modes of failures) and the client RPC timeouts.Testing & Reproduction steps
PR Checklist