select_general hang on Windows and macOS, sometimes on Linux too #44

Closed
tesuji opened this issue Sep 9, 2020 · 11 comments · Fixed by #70

@tesuji
Contributor

tesuji commented Sep 9, 2020

The select_general test (fn select_general()) hangs quite often.

It has been disabled in #43 for Windows and macOS only.

@Restioson
Collaborator

Restioson commented Sep 15, 2020

This hangs on Linux too, according to CI.

@tesuji
Contributor Author

tesuji commented Sep 16, 2020

I guess we have to disable it on Linux too or fix it.

@zesterer
Owner

This implies there is a race condition. Should anybody get the time to investigate: Selector is quite similar to async receivers in that it doesn't immediately pick up items (since multiple receivers could pick up an item at once, leading to a dropped item).

@tesuji changed the title from "select_general hang on Windows and macOS" to "select_general hang on Windows and macOS, sometimes on Linux too" on Sep 17, 2020

@JavaDerg

JavaDerg commented Dec 3, 2020

I'm experiencing the same issue on Windows and Linux.

@JavaDerg

JavaDerg commented Dec 3, 2020

Using wait_timeout or similar seems to work around the problem somewhat: the Selector still hangs, but it returns successfully after the timeout has passed.
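
The workaround described here amounts to bounding every wait, so that a missed wakeup costs one timeout instead of a permanent hang. A minimal sketch of that pattern follows. It uses the standard library's `mpsc::Receiver::recv_timeout` as a stand-in and is not flume's `Selector`; the helper name `recv_with_retry` and the durations are hypothetical, chosen only for illustration.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: wait in bounded steps instead of blocking forever.
/// Returns `None` if the sender disconnected or all attempts timed out.
fn recv_with_retry<T>(rx: &Receiver<T>, step: Duration, max_attempts: u32) -> Option<T> {
    for _ in 0..max_attempts {
        match rx.recv_timeout(step) {
            Ok(msg) => return Some(msg),
            // A timeout here is where a missed wakeup would be recovered:
            // the loop simply polls the channel again.
            Err(RecvTimeoutError::Timeout) => continue,
            Err(RecvTimeoutError::Disconnected) => return None,
        }
    }
    None
}

fn main() {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(30));
        tx.send(42).unwrap();
    });
    // Illustrative values: 10 ms steps, up to 100 attempts.
    let got = recv_with_retry(&rx, Duration::from_millis(10), 100);
    assert_eq!(got, Some(42));
}
```

This only masks the underlying race; each recovered hang still costs up to one full timeout of latency.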

JavaDerg added a commit to freepaint/threadmill that referenced this issue Dec 3, 2020
Due to flumes current bug with the `Selector` struct I am forced to use this to prevent lockups in the scheduling process.
zesterer/flume#44
@zesterer
Owner

zesterer commented Dec 3, 2020

@post-rex Sorry about this. It's on my to-do list of things to fix, perhaps over the next week.

@JavaDerg

JavaDerg commented Dec 3, 2020

> @post-rex Sorry about this. It's on my to-do list of things to fix, perhaps over the next week.

Thanks, that would be awesome!

@matklad
Contributor

matklad commented Dec 20, 2020

Hm, I think I am hitting this. In a program I'm writing, Selector::wait reliably blocks forever and takes 3/4 messages to unblock (on Linux). I'll try to publish a reproduction shortly (this is code for my yet-to-be-published blog post, I'll re-make it using crossbeam).

This feels like a pretty critical correctness bug to me ;-) I do not want to push for a fix here (this is open source software, it absolutely is ok to let things like this slip), but I do want to note that, imo, the graveness of this bug doesn't match the production-readiness status, signaled by the readme. (and of course there's a chance that it's my code which is broken :) ).

@zesterer
Owner

@matklad Yep, this is a persistent issue that I've been trying to make time to fix for a while now. To the best of my knowledge, it's the only actual bug in the crate (at least, the crate has now been in use by quite a few projects for several months and nothing has come up besides this). Part of the reason I've not gotten to resolving it yet is that I'm rather unhappy with the API of Selector overall (it doesn't map particularly well to a select! macro) so resolving it would likely come with an overhaul of the API. This past weekend I've been overhauling another crate I maintain, euc, so I think I'll try to get to fixing this issue and making other improvements to flume in the week after the holiday period.

mbrubeck added a commit to mbrubeck/flume that referenced this issue Feb 7, 2021
Using Arc::ptr_eq on trait object pointers can fail unpredictably
because of rust-lang/rust#46139.

This can prevent a Hook from being removed when its RecvSelection is
de-inited, which makes it incorrectly push Tokens to a queue owned by a
Selector that no longer exists.

This may be the cause of zesterer#44.
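
The commit message above describes a lookup by pointer identity that can miss its target. One way to make such a comparison independent of trait-object metadata is to compare only the data addresses. This is a minimal sketch under stated assumptions, not the actual patch: the `Signal` trait, the `Hook` struct, and the helpers `same_allocation` and `remove_hook` are hypothetical stand-ins, not flume's real types.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a hook trait; not flume's actual definition.
trait Signal {
    fn id(&self) -> u32;
}

struct Hook(u32);

impl Signal for Hook {
    fn id(&self) -> u32 {
        self.0
    }
}

/// Compare two trait-object Arcs by data address only.
/// Casting the fat pointer to `*const ()` discards the vtable half,
/// so differing vtable pointers cannot affect the result.
fn same_allocation(a: &Arc<dyn Signal>, b: &Arc<dyn Signal>) -> bool {
    Arc::as_ptr(a) as *const () == Arc::as_ptr(b) as *const ()
}

/// Remove every entry that points at the same allocation as `target`.
fn remove_hook(hooks: &mut Vec<Arc<dyn Signal>>, target: &Arc<dyn Signal>) {
    hooks.retain(|h| !same_allocation(h, target));
}

fn main() {
    let a: Arc<dyn Signal> = Arc::new(Hook(1));
    let b: Arc<dyn Signal> = Arc::new(Hook(2));
    let mut hooks = vec![a.clone(), b.clone()];

    remove_hook(&mut hooks, &a);

    assert_eq!(hooks.len(), 1);
    assert_eq!(hooks[0].id(), 2);
}
```

If a stale hook stays registered because the comparison failed, it keeps receiving notifications meant for a live waiter, which is consistent with the hang reported in this thread.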
@zesterer
Owner

zesterer commented Feb 7, 2021

@matklad Would you happen to still be in a position to see whether you have this issue?

@zesterer
Owner

zesterer commented Feb 8, 2021

The test that this issue specifically mentions now passes (thanks @tesuji). On the assumption that this implies the bug is fixed, I'm closing the issue. If this turns out not to be the case, I can re-open the issue.
