Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-123940: Ensure force-terminated daemon threads can be joined #124150

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

mpage
Copy link
Contributor

@mpage mpage commented Sep 17, 2024

During finalization, daemon threads are forced to exit immediately:

cpython/Python/pylifecycle.c

Lines 2023 to 2026 in 10de360

/* Remaining daemon threads will automatically exit
when they attempt to take the GIL (ex: PyEval_RestoreThread()). */
_PyInterpreterState_SetFinalizing(tstate->interp, tstate);
_PyRuntimeState_SetFinalizing(runtime, tstate);

without returning through the call-stack normally upon acquiring the GIL:

cpython/Python/ceval_gil.c

Lines 294 to 302 in 10de360

if (_PyThreadState_MustExit(tstate)) {
/* bpo-39877: If Py_Finalize() has been called and tstate is not the
thread which called Py_Finalize(), exit immediately the thread.
This code path can be reached by a daemon thread after Py_Finalize()
completes. In this case, tstate is a dangling pointer: points to
PyThreadState freed memory. */
PyThread_exit_thread();
}

Finalizers that run after this must be able to join the forcefully terminated threads.

The current implementation notified of thread completion before returning from thread_run:

_PyEvent_Notify(&handle->thread_is_exiting);

Unfortunately, this code will never execute if the thread is forced to exit during finalization. Any code that attempts to join such a thread will block indefinitely.

To fix this, keep track of daemon threads and have the runtime notify their events after they are forced to exit.

During finalization, daemon threads are force to exit immediately (without returning
through the call-stack normally) upon acquiring the GIL. Finalizers that run after
this must be able to join the forcefully terminated threads.

The current implementation notified of thread completion before returning
from `thread_run`. This code will never execute if the thread is forced to exit during
finalization. Any code that attempts to join such a thread will block indefinitely.

To fix this, use the old approach of notifying of thread completion when the thread state is
cleared. This happens both when `thread_run` exits normally and when thread states are
destroyed as part of finalization (which happens immediately after forcing daemon threads
to exit, before any python code can run).
@mpage mpage requested review from colesbury and pitrou September 17, 2024 03:11
@mpage mpage marked this pull request as ready for review September 17, 2024 03:12
Python/pystate.c Outdated Show resolved Hide resolved
Python/pystate.c Outdated Show resolved Hide resolved
Include/cpython/pystate.h Outdated Show resolved Hide resolved
Include/internal/pycore_lock.h Outdated Show resolved Hide resolved
Include/internal/pycore_lock.h Outdated Show resolved Hide resolved
Tools/tsan/suppressions_free_threading.txt Show resolved Hide resolved
Lib/test/test_threading.py Show resolved Hide resolved
Comment on lines +1184 to +1186
def loop():
while True:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A slight variant where loop() calls time.sleep(1) hard crashes in 3.11 and 3.12. I'm not sure what to make of that, other than this sort of behavior wasn't robust previously either. I think it's noteworthy that the change that led to the issue was from just a few days ago -- it doesn't look like it was some longstanding code that just broke now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

possibly #87135 related.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the issue #116514 colesbury linked when I asked (I wasn't able to repro just the time.sleep variant mentioned here)

Python/pylifecycle.c Outdated Show resolved Hide resolved
Python/pylifecycle.c Outdated Show resolved Hide resolved
@mpage
Copy link
Contributor Author

mpage commented Sep 18, 2024

Marking this as a draft while I figure out the linker errors on windows.

@mpage mpage marked this pull request as draft September 18, 2024 20:44
Include threadmodule when building _freeze_module
@mpage mpage marked this pull request as ready for review September 18, 2024 21:29
@mpage mpage requested a review from a team as a code owner September 18, 2024 21:29
@mpage
Copy link
Contributor Author

mpage commented Sep 18, 2024

@colesbury @ericsnowcurrently - Would you please have another look at this? I think it makes sense to move ThreadHandle from _threadmodule.c into pycore_pythread.h/Python/thread.c but that's a bigger change, so I went with a smaller fix here. I can do the refactor in a separate PR.

Copy link
Member

@ericsnowcurrently ericsnowcurrently left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly LGTM

There's just the matter of where the code lives. It should mostly go in Python/thread.c. However, we have a bit of a mess in threadmodule.c already, which we needn't try to fix all in this PR. I might be okay with punting on that in favor of a follow-up PR, but I'd like the opinion of others on that first.

Modules/_threadmodule.c Show resolved Hide resolved
Modules/_threadmodule.c Show resolved Hide resolved
Comment on lines +668 to +669
llist_init(&interp->threads.daemon_handles);
llist_init(&interp->threads.non_daemon_handles);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to have a _PyThread_InitThreadHandles() for this, to match _PyThread_ClearThreadHandles().

Lib/test/test_threading.py Show resolved Hide resolved
@bedevere-app
Copy link

bedevere-app bot commented Sep 26, 2024

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@mpage
Copy link
Contributor Author

mpage commented Sep 26, 2024

@ericsnowcurrently and I chatted in person and the consensus is that this PR is ok as-is, but we should follow it up with a refactor (only on main) to move functionality out of Modules/_threadmodule.c and into Python/thread.c.

@gpshead
Copy link
Member

gpshead commented Sep 27, 2024

Unfortunately, this code will never execute if the thread is forced to exit during finalization. Any code that attempts to join such a thread will block indefinitely.

To fix this, keep track of daemon threads and have the runtime notify their events after they are forced to exit.

FYI - #105805, if merged, is set to change this so that these threads do not exit because our C API has no way for them to do so cleanly to cause an unwind of the call stack. They'll be made to hang instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants