Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test.libregrtest race condition in runtest_mp leads to 30 second delay in free-threaded build #117293

Closed
Tracked by #108219
colesbury opened this issue Mar 27, 2024 · 0 comments
Assignees
Labels
tests Tests in the Lib/test dir topic-free-threading

Comments

@colesbury
Copy link
Contributor

colesbury commented Mar 27, 2024

This is related to #90363.

In #30470, the (partial) fix for that issue I wrote:

There are still potential race conditions between checking if any workers are alive and reading from the output queue. This means that there can be up to an extra 30 second delay (instead of an hanging forever), while waiting for self.output.get().

We are now seeing that 30 second delay when running some tests with the GIL disabled, such as test_regrtest.

I think it's worth pursuing the approach used here, which avoids the race condition by tracking exiting workers by passing a sentinel value through the queue:
colesbury/nogil@406e5d7

Linked PRs

@colesbury colesbury added tests Tests in the Lib/test dir topic-free-threading labels Mar 27, 2024
@colesbury colesbury self-assigned this Mar 27, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Mar 27, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Mar 27, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Mar 27, 2024
The worker thread may still be alive after it enqueues it's last result,
which can lead to a delay of 30 seconds after the test finishes. This
happens much more frequently in the free-threaded build with the GIL
disabled.

This changes run_workers.py to track of live workers by enqueueing a
`WorkerExited()` instance before the worker exits.
colesbury added a commit to colesbury/cpython that referenced this issue Mar 27, 2024
The worker thread may still be alive after it enqueues it's last result,
which can lead to a delay of 30 seconds after the test finishes. This
happens much more frequently in the free-threaded build with the GIL
disabled.

This changes run_workers.py to track of live workers by enqueueing a
`WorkerExited()` instance before the worker exits.
colesbury added a commit to colesbury/cpython that referenced this issue Mar 27, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Mar 28, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Apr 1, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Apr 2, 2024
colesbury added a commit to colesbury/cpython that referenced this issue Apr 4, 2024
colesbury added a commit that referenced this issue Apr 8, 2024
The worker thread may still be alive after it enqueues it's last result,
which can lead to a delay of 30 seconds after the test finishes. This
happens much more frequently in the free-threaded build with the GIL
disabled.

This changes run_workers.py to track of live workers by enqueueing a
`WorkerExited()` instance before the worker exits.
diegorusso pushed a commit to diegorusso/cpython that referenced this issue Apr 17, 2024
The worker thread may still be alive after it enqueues it's last result,
which can lead to a delay of 30 seconds after the test finishes. This
happens much more frequently in the free-threaded build with the GIL
disabled.

This changes run_workers.py to track of live workers by enqueueing a
`WorkerExited()` instance before the worker exits.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Tests in the Lib/test dir topic-free-threading
Projects
None yet
Development

No branches or pull requests

1 participant