
Refactor WaitForThreadExit #752
Merged: pvelesko merged 8 commits into main from WaitForPerThread on Jan 18, 2024
Conversation

@pvelesko (Collaborator) commented Jan 13, 2024

In the case that a user creates a bunch of threads and main() exits without calling join() on those threads, we must delay backend destruction so that the other threads can finish their work and clean up via the thread_local ctor/dtor sequence.

Otherwise, the main thread may attempt to destroy the device and context handles while they are still in use by the detached threads.

This is an example of a bad program, since the main thread should call join(), but it's better to handle it gracefully than to let it segfault.

  • Add a test
  • Refactor using atomic operations in the thread_local chipstar::Queue ctor/dtor (see the sketch below)
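
A minimal sketch of the counting scheme described above, assuming hypothetical scaffolding (PerThreadQueue, waitForThreadExit) around an atomic counter named after the PR's NumQueuesAlive; the real logic lives in chipstar::Queue and CHIPBackend.cc:

#include <atomic>
#include <chrono>
#include <thread>

// Counter named after the PR's NumQueuesAlive; the surrounding types are
// hypothetical stand-ins for chipstar::Queue and the backend teardown path.
std::atomic<int> NumQueuesAlive{0};

struct PerThreadQueue {
  PerThreadQueue() { NumQueuesAlive.fetch_add(1, std::memory_order_relaxed); }
  ~PerThreadQueue() { NumQueuesAlive.fetch_sub(1, std::memory_order_release); }
};

// Constructed lazily, on first use in each thread.
thread_local PerThreadQueue Queue;

void waitForThreadExit() {
  // Backend teardown on the main thread: poll until every per-thread
  // queue's thread_local destructor has run.
  while (NumQueuesAlive.load(std::memory_order_acquire) != 0)
    std::this_thread::sleep_for(std::chrono::seconds(1));
}

int main() {
  std::thread T([] { (void)&Queue; }); // worker touches its per-thread queue
  T.detach();                          // simulate a program that never join()s
  waitForThreadExit();                 // delay teardown until the dtor has run
}

Note the remaining window: if main reaches waitForThreadExit() before the worker has constructed its queue, the counter is still zero and the wait returns early; this is the race discussed in the review below.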

@pvelesko pvelesko marked this pull request as draft January 13, 2024 18:52
Commits:
  • add TestThreadDetachCleanup test
  • remove some dead PerThread code
  • module use
  • init counter to 0
  • fmt
  • update UnitTests.cmake
  • unload oneapi/compiler before running
  • undo some changes
Commit description: This commit updates the thread-tracking variables in the `CHIPBackend` class. The variable `NumPerThreadQueues` has been renamed to `NumQueuesAlive` for clarity. The change spans both the source file `CHIPBackend.cc` and the header file `CHIPBackend.hh`.
@pvelesko pvelesko marked this pull request as ready for review January 16, 2024 09:38
{
auto NumPerThreadQueuesActive = ::Backend->getPerThreadQueuesActive();
if (!NumPerThreadQueuesActive)
// go through all devices checking their NumQueuesAlive until all they're all
Collaborator (suggested change):
- // go through all devices checking their NumQueuesAlive until all they're all
+ // go through all devices checking their NumQueuesAlive until they're all

logWarn("Waiting for all per-thread queues to exit... This condition "
"would indicate that the main thread didn't call "
"join()");
sleep(1);
Collaborator:
The sleep here feels wrong/hacky. Could it be a condition variable signaled when the device's ref count goes to zero?

Collaborator Author (pvelesko):

That's what we do above: we have an atomic counter in the queue ctor/dtor.

The issue is that the main thread can exit before all the threads have reached the ctor. I agree that this is not optimal, but I can't think of a better way to solve it. I tried creating another thread-counter class and making it a public thread_local, hoping that its constructor would run as soon as a thread is created, but it still gets called later down the call stack, as indicated in the comment.

Perhaps there's a clang trick to force a certain object's constructor to execute before others? @pjaaskel
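
For background on why the early-registration attempt fails: C++ only guarantees that a thread_local with dynamic initialization is constructed before its first odr-use in that thread, and clang/gcc construct it lazily on first access rather than at thread start. A standalone illustration, with a hypothetical Announcer type (not chipstar code):

#include <cstdio>
#include <thread>

struct Announcer {
  Announcer() { std::puts("Announcer constructed"); }
};

// Dynamic initialization, so construction is deferred per thread.
thread_local Announcer A;

int main() {
  std::thread T([] {
    std::puts("thread body entered"); // prints first
    (void)&A; // A is constructed here, at first odr-use, not at thread start
  });
  T.join();
}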

Collaborator Author (pvelesko):

Sorry, I thought you meant the first sleep statement, but my question still stands.

As for this one, the sleep is part of the loop where we check the counter. I think we want this debug statement, and if we remove the sleep here it will spam stdout. This section of the code only runs when a user has written a bad HIP program whose threads are not explicitly join()ed.

Collaborator:

Yes, I just meant a cleaner approach for the wait instead of sleep-polling with a rather long sleep time: a (pthread) condition-variable kind of approach. Not critical if you think it's good enough.
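
For reference, a sketch of the condition-variable approach being suggested, assuming hypothetical onQueueCreated/onQueueDestroyed hooks called from the queue ctor/dtor (this is not what the PR merged):

#include <condition_variable>
#include <mutex>

std::mutex Mtx;
std::condition_variable AllQueuesGone;
int NumQueuesAlive = 0; // guarded by Mtx

void onQueueCreated() {
  std::lock_guard<std::mutex> Lock(Mtx);
  ++NumQueuesAlive;
}

void onQueueDestroyed() {
  {
    std::lock_guard<std::mutex> Lock(Mtx);
    --NumQueuesAlive;
  }
  AllQueuesGone.notify_all(); // wake the waiting main thread
}

void waitForThreadExit() {
  // Block instead of sleep-polling; wakes as soon as the count hits zero.
  std::unique_lock<std::mutex> Lock(Mtx);
  AllQueuesGone.wait(Lock, [] { return NumQueuesAlive == 0; });
}

This removes the polling and the one-second latency, but not the race described above: the main thread can still reach waitForThreadExit() before a newly spawned thread has registered its queue.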

@pvelesko (Collaborator Author):

Since sleep-polling only happens in a rare edge case, merging.

@pvelesko pvelesko merged commit b18a3af into main Jan 18, 2024
28 checks passed
@pvelesko pvelesko deleted the WaitForPerThread branch January 18, 2024 13:40