Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve restarter logic #717

Merged
merged 6 commits into from
Nov 22, 2021
Merged

Conversation

vidartf
Copy link
Contributor

@vidartf vidartf commented Nov 1, 2021

Resolves #715.

There is a small change in the behavior of KernelRestarter.poll() and the async variant that should be considered whether it is appropriate (or if a better alternative exists). I.e. 2fabf92 . The other parts should be uncontroversial.

One simple test for recovery after a single "crash" + one test of handling "infinitely" crashing kernels, i.e. that the restart limit is respected.
Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spending time on this Vidar. The tests look really useful as well.

I'm happy to help with the provisioner-based changes if you like.

jupyter_client/ioloop/restarter.py Show resolved Hide resolved
jupyter_client/ioloop/restarter.py Outdated Show resolved Hide resolved
jupyter_client/restarter.py Show resolved Hide resolved
This is mainly due to the weakness of the previous method: `is_alive` only tests that the kernel process is alive, it does not indicate that the kernel has successfully completed startup. To solve this correctly, we would need to wait for a kernel info reply, but it is not necessarily appropriate to start a kernel client + channels in the restarter. Therefore, we use a "has been alive continuously for X time" as a heuristic for a stable start up.
@kevin-bates
Copy link
Member

The ubuntu 3.6 test was hung up (after 5 hours) so I cancelled and restarted the workflow.

Copy link
Member

@kevin-bates kevin-bates left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thank you @vidartf. I think the test intermittencies are unrelated.

@vidartf vidartf marked this pull request as draft November 17, 2021 13:51
@vidartf vidartf marked this pull request as ready for review November 17, 2021 17:44
@blink1073 blink1073 added the bug label Nov 22, 2021
@blink1073 blink1073 added this to the 7.1 milestone Nov 22, 2021
@blink1073 blink1073 merged commit 670ee79 into jupyter:master Nov 22, 2021
@vidartf vidartf deleted the fix-async-restarter branch November 29, 2021 11:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Restarter issues with slow to start kernels
4 participants