-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: Add shared client support #7
Conversation
42a5898
to
6f69545
Compare
24113b5
to
bba3eb4
Compare
shared_client.go
Outdated
client.refcount-- | ||
if client.refcount == 0 { | ||
// Make client unreachable and close it's connection. | ||
// Must hold the proxy mutex for this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: I guess proxy
is not relevant once this code landed in the library, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, will amend!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thanks for the review!
bba3eb4
to
6ce6bc6
Compare
Fixed comments still mentioning the proxy and squashed the commits to clean things up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks!
Left a suggestion for a marginal optimization when key == ""
and a question about a possible perf regression.
shared_client.go
Outdated
// connection must be closed while holding the proxy lock to avoid a race | ||
// where a new client is created with the same 5-tuple before this one is | ||
// closed, which could happen if the proxy lock is released before this | ||
// Close call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the lock on the SharedClients
struct serves two purposes:
- protect each client's
refcount
- avoid binding errors because of the race mentioned in this comment
Actually, both 1) and 2) needs to be serialized only in case of concurrent calls to GetSharedClient
for the same key/5-tuple. So, to increase parallelism, we could use an approach similar to the one in https://github.com/cilium/cilium/blob/e40fd083a503b21dd3c88e8e21e83814356a3616/daemon/cmd/fqdn.go#L687 (having a slice of mutexes and hashing the 5-tuple to get the mutex to lock).
Looking at the two critical sections protected by the lock, I think the only relevant case where we might experience an unexpected delay is the closing of a TCP connection. Might that become a problem?
OTOH the hashing + mutex will surely increase complexity even more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The lock on the SharedClients also protects the map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I missed to say that I was thinking about using a sync.Map
instead of a plain map (docs reports one of its suggested use as (2) when multiple goroutines read, write, and overwrite entries for disjoint sets of keys
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Came up with this pattern:
- client's refcount is protected by it's own mutex
- collection's (global) lock is only held during collection map operations and during the new Client creation in between a failing lookup and store, but Client creation does not make any system calls
- only client's own lock is held while it is dialing OR closing (including system calls)
- this means that while waiting for the client's lock to increase it's refcount, the client may start closing, but in this case
refcount == 0
and we may discard this client and start again.
func (s *SharedClients) GetSharedClient(key string, conf *Client, serverAddrStr string) (client *SharedClient, closer func()) {
for {
// lock for s.clients access
s.lock.Lock()
if key != "" {
// locate client to re-use if possible.
client = s.clients[key]
}
if client == nil {
client = newSharedClient(conf, serverAddrStr)
if key != "" {
s.clients[key] = client
}
s.lock.Unlock()
// new client, we are done
break
}
s.lock.Unlock()
// reusing client that may start closing while we wait for its lock
client.Lock()
if client.refcount > 0 {
// not closed, add our refcount
client.refcount++
client.Unlock()
break
}
// client was closed while we waited for it's lock, discard and try again
client.Unlock()
client = nil
}
return client, func() {
client.Lock()
defer client.Unlock()
client.refcount--
if client.refcount == 0 {
// connection close must be completed while holding the client's lock to
// avoid a race where a new client dials using the same 5-tuple and gets a
// bind error.
// The client remains findable so that new users with the same key may wait
// for this closing to be done with.
close(client.requestsToSend)
client.conn = nil
client.wg.Wait()
// Make client unreachable
// Must take s.lock for this.
s.lock.Lock()
if key != "" {
delete(s.clients, key)
}
s.lock.Unlock()
}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SharedClients
' lock
is now private and it is only used within GetSharedClient
function. When it is held no client locks can be taken.
shared_client.go
Outdated
s.Lock() | ||
defer s.Unlock() | ||
|
||
if key != "" { | ||
// locate client to re-use if possible. | ||
client = s.clients[key] | ||
} | ||
if client == nil { | ||
client = newSharedClient(conf, serverAddrStr) | ||
if key != "" { | ||
s.clients[key] = client | ||
} | ||
} | ||
client.refcount++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the client won't be shared in case key == ""
we can avoid setting refcount
and taking the lock:
if key == "" {
client = newSharedClient(conf, serverAddrStr)
return client, func() {
if client.conn != nil {
client.conn.Close()
}
}
}
// here we can take the lock (but we don't have to check if key is empty anymore)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, adopted!
shared_client_test.go
Outdated
if err != nil { | ||
errors.Store(id, fmt.Errorf("failed to exchange: %v", err)) | ||
} else { | ||
if r == nil { | ||
errors.Store(id, fmt.Errorf("response is nil")) | ||
} else { | ||
if r.Id != id { | ||
errors.Store(id, fmt.Errorf("incorrect id (%d != %d)", r.Id, id)) | ||
} | ||
if r.Rcode != RcodeSuccess { | ||
errors.Store(id, fmt.Errorf("failed to get an valid answer\n%v", r)) | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Early return makes this more readable:
if err != nil { | |
errors.Store(id, fmt.Errorf("failed to exchange: %v", err)) | |
} else { | |
if r == nil { | |
errors.Store(id, fmt.Errorf("response is nil")) | |
} else { | |
if r.Id != id { | |
errors.Store(id, fmt.Errorf("incorrect id (%d != %d)", r.Id, id)) | |
} | |
if r.Rcode != RcodeSuccess { | |
errors.Store(id, fmt.Errorf("failed to get an valid answer\n%v", r)) | |
} | |
} | |
} | |
if err != nil { | |
errors.Store(id, fmt.Errorf("failed to exchange: %v", err)) | |
return | |
} | |
if r == nil { | |
errors.Store(id, fmt.Errorf("response is nil")) | |
return | |
} | |
if r.Id != id { | |
errors.Store(id, fmt.Errorf("incorrect id (%d != %d)", r.Id, id)) | |
return | |
} | |
if r.Rcode != RcodeSuccess { | |
errors.Store(id, fmt.Errorf("failed to get an valid answer\n%v", r)) | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, adopted!
6a4a248
to
3d09a0a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should solve any performance concern in case of high parallelism! 💯
I've left some suggestions to simplify the code a bit, but nothing major.
shared_client.go
Outdated
*Client | ||
|
||
requestsToSend chan request | ||
responses chan sharedClientResponse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to make the responses
channel part of the SharedClient.
We can instantiate this in handler
and keep it there, since:
- only the goroutines in
handler
send to, receive from and close the channel handler
is called only once when theSharedClient
connects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed!
wg.Add(1) | ||
go func() { | ||
defer wg.Done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can get rid of these wg.Add
and wg.Done()
here. That's because handler
can't exit until responses
is closed, and only after that the wg.Done()
deferred at the beginning of the function will be called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me that handler can exit before responses
is closed, for example if requests
is closed. I saw this happen in tests when I was working on getting rid of channel buffers (which can hide bugs).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I've missed that case. 👍
shared_client.go
Outdated
// Make client unreachable | ||
// Must take s.lock for this. | ||
s.lock.Lock() | ||
if key != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to check for key
being not-empty after checking it at the beginning of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed!
shared_client.go
Outdated
client.conn = nil | ||
client.wg.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should swap these two. Doing that we'll be sure that conn
is not still being used by the goroutines in handler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
shared_client.go
Outdated
client.conn = nil | ||
client.wg.Wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here for swapping the wait and the setting of conn
to nil
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!, also refactored this repetition to SharedClient.close()
3d09a0a
to
8f678c9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent, thanks! 💯
Add SharedClient and SharedClients types to support multiple different requests in flight on the same DNS client connection. Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
8f678c9
to
a547953
Compare
Add SharedClient and SharedClients types.