proposal: sync: mechanism to select on condition variables #16620
It seems to me that you can turn a condition variable into a channel by writing
I'm sure I'm missing something, but what?
Yeah, except then I can leak goroutines if I'm not careful. I also pay the additional 4+ KB cost per select, which can remain blocked for some time.
I would still reacquire the mutex afterwards. Instead of this example from our docs:

```go
c.L.Lock()
for !condition() {
	c.Wait()
}
... make use of condition ...
c.L.Unlock()
```

I would write:

```go
c.L.Lock()
for !condition() {
	c.L.Unlock()
	select {
	case <-ctx.Done():
	case <-c.WaitChan():
	}
	c.L.Lock()
}
... make use of condition ...
c.L.Unlock()
```
There's nothing atomically special about the Cond.Wait method. It's just an unlock, wait, and lock: https://github.com/golang/go/blob/release-branch.go1.7/src/sync/cond.go#L53

I don't see how this is any more of a footgun than anybody using the sync package in general. Or anybody using two goroutines and sharing global state without the sync package. Bad Go code is full of data races. I'm proposing this as a mechanism for people who do know what they're doing. Not as some recommended high-level mechanism.
A more generic solution is perhaps for the compiler to pass stack frame size hints to the runtime. For example, in Ian's example, the compiler could recognize that the function will usually use very little stack, even less than the minimum stack size, so it could hint the runtime to start the goroutine with a smaller stack.

In general, if we make the compiler determine the initial stack size for a new goroutine, could that solve at least half of the problem? (Goroutine leaking might still occur, but that's another issue.)
Kinda. Really I want a magic type of hchan (a sentinel value of hchan.elemtype) which makes hchan.buf mean it's a pointer to a notifyList, and have the select implementation do the notifyListAdd on entry to select; then, if a separate case fires in the select, we do something like a notifyListRemove and unregister that waiter.

Otherwise I'm afraid of leaks where my condition variable is woken up (because, say, I selected on my context.Done()), but then I find that my condition is satisfied and I exit the function, never waiting on the condition variable again. I need to tell the cond.Wait running in the goroutine to stop. Which makes me think it's better done in the implementation of select with a special channel type.

This wouldn't be a language change, but it would involve modifying the runtime to support an API addition.
Could you design a user-facing sync.Cond API for this? We need to automatically lock cond.L when the channel …
@minux, I covered all this above. |
Something that has come to trouble me about this: channels are level-triggered. A channel has a value or it does not, and values do not disappear until somebody reads them. Condition variables are edge-triggered: a call to Signal or Broadcast wakes only the goroutines waiting at that instant, and is otherwise lost.

If we turn an edge-triggered mechanism into a level-triggered mechanism, then we either do something laborious to drain the level (read the value from the channel) if nothing needs it, or we have too many triggers. This is particularly obvious when we consider Broadcast.

The select statement, unlike channels, can be either edge-triggered or level-triggered. It would be meaningful to design some way to attach an edge trigger to a select statement; it just wouldn't go through a channel. So I think this is a fairly subtle change.
In particular, a send-with-default converts an edge-triggered event into a level-triggered event by "latching" on the first edge-crossing. A receive-with-default converts a level-triggered event to an edge-triggered event by "unlatching" the channel when the edge-trigger no longer applies.
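That latching idiom can be sketched directly with a channel of capacity 1 (a minimal illustration, not code from the thread):

```go
package main

import "fmt"

func main() {
	latch := make(chan struct{}, 1)

	// Edge -> level: a non-blocking send "latches" the first edge;
	// further edges while latched are coalesced into one.
	fire := func() {
		select {
		case latch <- struct{}{}:
		default: // already latched; drop the duplicate edge
		}
	}

	// Level -> edge: a non-blocking receive "unlatches" when the
	// trigger no longer applies, discarding any stale event.
	clear := func() {
		select {
		case <-latch:
		default: // not latched; nothing to drain
		}
	}

	fire()
	fire() // coalesced with the first edge
	fmt.Println("latched events:", len(latch)) // 1, not 2
	clear()
	fmt.Println("latched events:", len(latch)) // 0
}
```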
With the "channel as cond-var" pattern we can implement … without such a hook. With a built-in mechanism, we could do even better: we could have the … The tricky part, I think, is constructing the API such that either the sender or the receiver can abandon a case after it has marked the other side as "ready".
I added a proposal for a "future" internal type (#17466), and it looks similar (in spirit) to this proposal.
After implementation of #8898 we'd necessarily have the runtime support for alternate kinds of channels. At that point it might make sense to try this. |
If we do this, we might also want to extend it to sync.Lockers. I'm working with some code now where I have a sync.Mutex that might be held by someone else for a long time and a context.Context to respect. For the same reasons as Brad already outlined, I'd like to be able to use a select to either acquire the mutex or read from ctx.Done():

```go
select {
case <-mu.C():
	// do work
	mu.Unlock()
case <-ctx.Done():
	// clean up, mu is not locked
}
```
@josharian Looks like a channel with a buffer of size 1:

Then why not implement a "future"? But perhaps I didn't quite understand it; excuse me.
Sorry to stumble in, but this edge/level-trigger goroutine leak issue reminds me of an old golang-dev thread [1] about "non-parallel channels". On that thread, I (admittedly without much thought) proposed a new way to make a channel/goroutine pair where the two are intimately tied, such that when all receivers are garbage collected, the goroutine can also be garbage collected. It's a language change, and not fully thought out, but perhaps some such thing could present a more general solution to this type of goroutine leak without needing special handling for select or timer-specific channels. [1] https://groups.google.com/d/msg/golang-dev/VoswK7OBckY/-4ZPi3XoSI4J
I keep running into this when plumbing context through things. For example, I have an implementation of a connection pool which is cleanly expressed with a mutex and a sync.Cond. If there are no available connections nor capacity to create new ones, a caller waits to borrow with cond.Wait. I want to be able to pass a context.Context through and cancel the wait, so it would be nice to select on both cond.Wait and ctx.Done().
@cespare Are you running into significant cases where you can't easily replace the sync.Cond with a channel?
IIUC, @bcmills, you are proposing that folks do something like https://play.golang.org/p/zyi-LBlyBN? That would solve the use case I stumbled upon this issue for. But I wouldn't say it's obvious. :) I have no idea about the runtime efficiency.
@zombiezen Provided that you can keep the locked:unlocked ratio low (to reduce contention), I would only use a mutex for the … But I think the example you linked is a bit too simple anyway: the …
Also running into this issue: I have a gRPC call that waits for a state transformation on a server and reports changes. If clients cancel these blocking/waiting RPCs, I currently leave goroutines behind, waiting on the condition variable. What about having a …
@EdSchouten, is there a reason you can't replace the condition variable with a channel? See my talk on Rethinking Classical Concurrency Patterns: the section on condition variables starts on slide 37, and there is more detail on patterns for broadcast events in the backup slides (101-105).
I just found an implementation of a condition variable using channels at https://github.com/anacrolix/missinggo/blob/master/chancond.go . It's similar to some of the backup slides in @bcmills' talk referenced above. While it seems useful for event notifications, because of its locking behaviors it would be prone to race conditions if you tried to use it for some of the things one might traditionally use a condition variable for.
A cancellable Cond package that implements all sync.Cond methods and passes all the sync.Cond tests. Please share your comments about it.
@jasajona I'm not convinced, I'm afraid. It can panic when Signal is called at the same time as Broadcast. For example, this code panics immediately on my machine:
Concurrency code is tricky!
Thank you for pointing out the bug! I fixed it and added a test:
Is the evaluation of this proposal still on hold for #8898? |
@smasher164 I suppose so. But I'm not sure that there is a clear consensus that this proposal should be implemented. See also #21165. |
Both https://github.com/anacrolix/missinggo/blob/master/chancond.go and https://gitlab.com/jonas.jasas/condchan have a race condition where they can drop a Signal() if the waiting thread isn't yet receiving from the channel. (I filed an issue.) For those needing it, I've implemented a Cond with WaitContext(): https://pkg.go.dev/github.com/Jille/contextcond
For those watching this, the shiny new context.AfterFunc can be used for this. I couldn't actually use …
@benhoyt That only works if you can Broadcast your semaphore every time one context gets cancelled. If you use Signal instead of Broadcast, context.AfterFunc won't solve this.
Now that #8898 is merged, is there anything else blocking this issue?
Thanks, I took this proposal off hold. |
Sometimes condition variables are the best fit for a problem, and sometimes channels are. They don't compose well, though.
I've increasingly been finding myself in a position where I would like to select on both channel(s) and a condition variable. This has come up a few times in http2, where I use condition variables for flow control, but there are plenty of channels (including the net/http.Request.Cancel channel) in play too. Sometimes I need to wait either for flow control, or for a user to cancel their request, or for the peer side to close their connection, etc.
I would love to be able to select on a condition variable, say if the sync.Cond type had a WaitChan method. The semantics would be that receiving from it acts like a runtime_Syncsemacquire on c.sema. We'd need to define what happens if another case in the select fires first (does the semacquire get canceled?). Thoughts?