-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
A version of the sync macro that throws earlier #34198
Conversation
Couldn’t this be the new default behavior? |
base/task.jl
Outdated
during error handling. | ||
|
||
!!! Note | ||
This interface is experimental and subject to change or removal without notice. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for marking this experimental. Otherwise, implementing structured concurrency #33248 becomes much harder.
Maybe we should call this |
You mean |
Isn't it strange to put an experimental API in |
Letting tasks drop out the bottom seems like a pretty big change to the semantics of As you say, it would ideally interact with some kind of cancellation. That's hard but it feels a lot more consistent with the way It seems handy though and could be a big practical improvement for the test suite. For the moment, could we just name it clearly and not export it? How about |
From triage: let's call this |
|
Let's merge this and see if it helps CI. Being in experimental makes it explicitly unsupported, so we can change or remove it if we feel like it at any point. |
I've been looking at what causes deadlocks in our test suite in an effort to cut down on the number of failed tests on CI that result in hangs (since those are hard to diagnose and resolve). I found that by playing with various resource limits, it is easy to create hangs in the test suite. The reason we get a hang rather than a more easily diagnosable error is two fold. We either: 1. Aren't watching for the error (e.g. a socket remote end closing) 2. We aren't propagating the error to the top level A very common situation for case 2) is that the test is wrapped in @sync which doesn't return until all tasks have finished or error'ed. However, in many cases one of the tasks produces data for the others, so if that task errors, the remaining tasks will wait forever. This PR aims to address that situation by introducing a new `Experimental.@sync` macro that immediately rethrows any errors thrown by a contained task rather than waiting for all of them to finish. The implementation isn't super performant (it allocates a new task per object being waited on), but should be sufficient for use in the test suite. A better implementation would create a new scheduler object that can be inserted into multiple wait queues. Example usage of the new macro: ``` @sync begin @async error("Hello") @async sleep(1000) end # Waits 1000s Experimental.@sync begin @async error("Hello") @async sleep(1000) end # Throws immediately ``` The macro doesn't do any sort of cleanup for the tasks that do not finish, and just lets them run. In the future, we may want to automatically cancel those tasks, but that seemed like a bigger design problem than the simple thing that I wanted (something that propagates error messages more readily, so we see them in the logs).
I've been looking at what causes deadlocks in our test suite in an effort to cut down on the number of failed tests on CI that result in hangs (since those are hard to diagnose and resolve). I found that by playing with various resource limits, it is easy to create hangs in the test suite. The reason we get a hang rather than a more easily diagnosable error is two fold. We either: 1. Aren't watching for the error (e.g. a socket remote end closing) 2. We aren't propagating the error to the top level A very common situation for case 2) is that the test is wrapped in @sync which doesn't return until all tasks have finished or error'ed. However, in many cases one of the tasks produces data for the others, so if that task errors, the remaining tasks will wait forever. This PR aims to address that situation by introducing a new `Experimental.@sync` macro that immediately rethrows any errors thrown by a contained task rather than waiting for all of them to finish. The implementation isn't super performant (it allocates a new task per object being waited on), but should be sufficient for use in the test suite. A better implementation would create a new scheduler object that can be inserted into multiple wait queues. Example usage of the new macro: ``` @sync begin @async error("Hello") @async sleep(1000) end # Waits 1000s Experimental.@sync begin @async error("Hello") @async sleep(1000) end # Throws immediately ``` The macro doesn't do any sort of cleanup for the tasks that do not finish, and just lets them run. In the future, we may want to automatically cancel those tasks, but that seemed like a bigger design problem than the simple thing that I wanted (something that propagates error messages more readily, so we see them in the logs).
I've been looking at what causes deadlocks in our test suite in an effort to cut down on the number of failed tests on CI that result in hangs (since those are hard to diagnose and resolve). I found that by playing with various resource limits, it is easy to create hangs in the test suite. The reason we get a hang rather than a more easily diagnosable error is two fold. We either:
A very common situation for case 2) is that the test is wrapped in
@sync
which doesn't return until all tasks have finished or error'ed. However, in many cases one of the tasks produces data for the others, so if that task errors, the remaining tasks will wait forever. This PR aims to address that situation by introducing a new@syncany
macro that immediately rethrows any errors thrown by a contained task rather than waiting for all of them to finish. The implementation isn't super performant (it allocates a new task per object being waited on), but should be sufficient for use in the test suite. A better implementation would create a new scheduler object that can be inserted into multiple wait queues.Example usage of the new macro:
The macro doesn't do any sort of cleanup for the tasks that do not finish, and just lets them run. In the future, we may want to automatically cancel those tasks, but that seemed like a bigger design problem than the simple thing that I wanted (something that propagates error messages more readily, so we see them in the logs).