Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More thorough basic supervision tests #91

Merged
merged 10 commits into from
Nov 23, 2019
Merged

Conversation

goodboy
Copy link
Owner

@goodboy goodboy commented Nov 22, 2019

Attention all lurkers: code reviews and specifically criticisms are appreciated!

This is mostly to resolve #43 but I created some related issues (#87, #88, #89) after thinking and working through other types of failure cases we need to handle.

The main solution to get this working was adding proper MultiError propagation support in the core.

Interestingly writing the nested sub-actor error propagation tests actually found some serious holdups in the multiprocessing forkserver code. Even with my patches to the stdlib (which are pretty questionable) the server breaks pretty easily (causing hangs) when spawning any more then a couple "levels" (really lifetime scopes) of process tree. It's all good fodder for #84 and either writing our own forkserver or looking at the proper way to get the stdlib's version to play with our model. Surprisingly the spawn method works just fine so trying out trio_run_in_process has some more incentive; interested to see how it performs vs. the stdlib's version.

In an effort towards #43. This completes the first major bullet's worth of tests
described in that issue.
This exemplifies the undefined behaviour in #88 and begins to test for
the last bullet in #43.
`trio.MultiError` isn't an `Exception` (derived instead from
`BaseException`) so we have to specially catch it in the task
invocation machinery and ship it upwards (like regular errors)
since nurseries running in sub-actors can raise them.
Add a test to verify that `trio.MultiError`s are properly propagated up
a simple actor nursery tree. We don't have any exception marshalling
between processes (yet) so we can't validate much more then a simple
2-depth tree. This satisfies the final bullet in #43.

Note I've limited the number of subactors per layer to around 5 since
any more then this seems to break the `multiprocessing` forkserver;
zombie subprocesses seem to be blocking teardown somehow...

Also add a single depth fast fail test just to verify that it's the
nested spawning that triggers this forkserver bug.
If a nursery fails to cancel (some sub-actors presumably) then hard kill
the whole process tree to avoid hangs during a catastrophic failure.
This logic may get factored out (and changed) as we introduce custom
supervisor strategies.
@goodboy
Copy link
Owner Author

goodboy commented Nov 22, 2019

And there you have it. Windows doesn't like "nested sub-processes" even with the spawn method...

@goodboy goodboy force-pushed the more_thorough_super_tests branch 2 times, most recently from 3ffab4b to 421459d Compare November 23, 2019 02:19
Seems like we've probably got some greater limitations
with Windows and "nested" spawned sub-processes...
@goodboy goodboy merged commit 4d43f25 into master Nov 23, 2019
@goodboy goodboy deleted the more_thorough_super_tests branch August 19, 2021 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Initial supervision tests
1 participant