-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor WaitForThreadExit #752
Conversation
862b3eb
to
27b1129
Compare
add TestThreadDetachCleanup test remove some dead PerThread code module use init counter to 0 fmt update UnitTests.cmake unload oneapi/compiler before running undo some changes
Description: This commit updates the thread tracking variables in the `CHIPBackend` class. The variable `NumPerThreadQueues` has been renamed to `NumQueuesAlive` for better clarity. The changes have been made in both the source file `CHIPBackend.cc` and the header file `CHIPBackend.hh`.
3ce11b5
to
90094d7
Compare
{ | ||
auto NumPerThreadQueuesActive = ::Backend->getPerThreadQueuesActive(); | ||
if (!NumPerThreadQueuesActive) | ||
// go through all devices checking their NumQueuesAlive until all they're all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// go through all devices checking their NumQueuesAlive until all they're all | |
// go through all devices checking their NumQueuesAlive until they're all |
logWarn("Waiting for all per-thread queues to exit... This condition " | ||
"would indicate that the main thread didn't call " | ||
"join()"); | ||
sleep(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sleep here feels wrong/hacky. Could it be a condition variable invoked when the device's ref count goes to zero?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what we do above - we have an atomic counter in the queue ctor/dtor.
The issue the main thread can exit before all the threads have reached the ctor. I agree that this is not optimal but I can't think of a better way to solve this. I tried creating another thread counter class and making it public thread_local hoping that as soon as a thread is created the constructor would get called but it still gets called later down in the callstack as indicated in the comment.
Perhaps there's a clang trick to force a certain object constructor to get execute before others? @pjaaskel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I thought you meant the first sleep statement but my question still stands.
As for this, this sleep is part of the look where we check the counter. I think we want this debug statement and if we remove the sleep here it will spam stdout. This section of the code is only run in the case that a user wrote a bad hip program where threads are not joined() explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I just meant to have a cleaner approach for the wait instead of sleep-polling with a rather long sleep time - a (pthread) condition variable kind of approach. Not critical if you think it's good enough.
Since sleep polling only happens in a very edge case - merging. |
In the case that a user creates a bunch of threads and main() exits without calling join() on those threads, we must delay backend destruction so that those other threads can finish their work and cleanup via thread_local ctor/dtor sequence.
Otherwise, it's possible that the main thread will attempt to destroy the device and context handles while they're still being used by other detached threads.
This case is an example of a bad program since main thread should call join() but it's best if we handle it more gracefully than let it segfault.
chipstar::Queue
ctor/dtor