Measure and report executor starvation time #322

psalz · 2024-12-20T13:44:31Z

We previously had a rudimentary Tracy zone for executor starvation, but it was only semi-correct: if the scheduler is not currently producing any instructions (for example, because the user is doing something other than submitting tasks to Celerity), the executor is not really starving.

I've extended the scheduler to report its idle state to the runtime, which in turn feeds it to the executor for measuring starvation. If the total starvation time exceeds a percentage threshold of the time spent doing actual work (i.e., processing instructions), we print a warning indicating that the application might be scheduler-bound.

Since the scheduler looks at task and command queue sizes to decide whether it is busy or idle, the lookahead mechanism can cause the scheduler to remain busy even if it is not actually producing any instructions. I consider this a feature rather than a bug, because it means that applications that attempt to interleave Celerity kernels with other work will produce a starvation warning if the kernels aren't actually being executed. I've opted to include a hint for this in the warning itself.

The executor is considered to be "starving" when it is out of instructions to process while the scheduler is busy. This means that phases of the user program that do not interact with the Celerity API do not count as starvation periods.
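To make the rule above concrete, here is a minimal sketch of the accounting it describes. All names (`starvation_tracker`, `add_wait_time`, `check`) and the threshold value are hypothetical and chosen for illustration only; this is not the code from this PR.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>

// Hypothetical illustration, not Celerity's implementation.
class starvation_tracker {
  public:
	using duration = std::chrono::nanoseconds;

	// Time the executor spent processing instructions.
	void add_active_time(const duration d) { m_active += d; }

	// Time the executor spent without instructions. It only counts as
	// starvation if the scheduler was busy during that period.
	void add_wait_time(const duration d, const bool scheduler_busy) {
		if(scheduler_busy) { m_starved += d; }
	}

	duration total_starvation_time() const { return m_starved; }

	// Returns a warning text if starvation exceeds `threshold` (a fraction,
	// e.g. 0.1 for 10%; the value is an assumption) of the active time.
	std::optional<std::string> check(const double threshold) const {
		const auto starved = static_cast<double>(m_starved.count());
		const auto active = static_cast<double>(m_active.count());
		if(starved <= threshold * active) { return std::nullopt; }
		return std::string("executor was starved for ") + std::to_string(m_starved.count() / 1000000)
		       + " ms; the application might be scheduler-bound";
	}

  private:
	duration m_active{0};
	duration m_starved{0};
};
```

In this sketch, waiting while the scheduler is idle is simply dropped, which matches the statement that program phases outside the Celerity API do not count as starvation.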

github-actions · 2024-12-20T13:49:30Z

Check-perf-impact results: (b640933924085610d15d7a9597b095a1)

✔️ No significant performance change in the microbenchmark set. You are good to go!

Relative execution time per category: (mean of relative medians)

command-graph : 1.03x
graph-nodes : 0.98x
grid : 1.03x
instruction-graph : 1.02x
scheduler : 1.00x
system : 1.06x
task-graph : 0.94x

github-actions

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

github-actions · 2024-12-20T13:56:23Z

test/scheduler_tests.cc

@@ -287,3 +287,76 @@ TEST_CASE("scheduler(lookahead::automatic) avoids reallocations in the RSim patt
 		CHECK(iq.count<free_instruction_record>() == num_devices);
 	});
 }
+
+TEST_CASE("scheduler reports idle and busy phases") { //
+	struct test_delegate : scheduler::delegate {


⚠️ cppcoreguidelines-virtual-class-destructor ⚠️
destructor of test_delegate is public and non-virtual
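For reference, this check fires when a class with virtual functions has a destructor that is neither public and virtual nor protected and non-virtual. A minimal sketch of one way to satisfy it follows; `delegate_like` is a hypothetical stand-in for `scheduler::delegate`, whose real interface is not shown in this diff excerpt, and the member names are assumptions.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for scheduler::delegate (assumed shape).
struct delegate_like {
	virtual void on_idle_state_change(bool idle) = 0;

  protected:
	~delegate_like() = default; // protected and non-virtual: accepted by the check
};

struct test_delegate : delegate_like {
	// Declaring the destructor public and virtual addresses the warning.
	virtual ~test_delegate() = default;

	void on_idle_state_change(const bool idle) override {
		last_idle = idle;
		++num_changes;
	}

	bool last_idle = false;
	int num_changes = 0;
};
```

Whether this or a different fix (such as restricting how the test type can be destroyed) is preferable depends on the actual delegate interface.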

coveralls · 2024-12-20T14:02:11Z

Pull Request Test Coverage Report for Build 12432570567

Details

86 of 86 (100.0%) changed or added relevant lines in 6 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage decreased (-0.05%) to 95.004%

Totals
Change from base Build 12390656669:	-0.05%
Covered Lines:	7133
Relevant Lines:	7238

💛 - Coveralls

PeterTh

LGTM

GagaLP

LGTM!

fknorr

Nice, another simple but very useful debugging tool!

I agree on how starvation time is measured, but I believe that the design can be improved by moving the actual tracking from live_executor to runtime.

Proposal:

- Move total_starvation_time and total_active_time into class runtime::impl
- Have on_executor_busy and on_executor_idle in executor_delegate, mirroring scheduler_delegate
- If either scheduler or executor changes between busy/idle states, take a lock and record which of the 4 possible states we are in (each event also means suspending / resuming a thread, so I suppose the overhead is manageable)
- Remove getters (and dry-run dummy implementations) from executors

This would have the following advantages:

- executor would not need to concern itself with scheduler state conceptually (they are fully separated atm)
- executor public API would not expose any query functions, which makes reasoning about its behavior in the multithreaded context easier
- Cross-thread counter updates would be centralized in runtime (precedent: runtime::impl::m_latest_epoch_reached)
- idle_state_change would not need to traverse the executor queue, so its reporting would be more accurate in time.

The only downside I see in this is that the Tracy integration will not be able to distinguish between executor starvation and idleness.
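The proposal above could be sketched roughly as follows: a single tracker, owned by the runtime, that both delegates notify on busy/idle transitions and that accumulates time per combined state under a lock. The class and member names are hypothetical, and timestamps are passed in explicitly only to keep the sketch deterministic; none of this is taken from the Celerity code base.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <mutex>

// Hypothetical sketch of runtime-side tracking of the 4 scheduler/executor states.
class busy_idle_tracker {
  public:
	using clock = std::chrono::steady_clock;
	enum state : unsigned { both_idle = 0, scheduler_busy_only = 1, executor_busy_only = 2, both_busy = 3 };

	explicit busy_idle_tracker(const clock::time_point start) : m_last_change(start) {}

	// Called from the scheduler thread via its delegate.
	void set_scheduler_busy(const bool busy, const clock::time_point now) {
		std::lock_guard lock(m_mutex);
		account(now);
		m_scheduler_busy = busy;
	}

	// Called from the executor thread via its delegate.
	void set_executor_busy(const bool busy, const clock::time_point now) {
		std::lock_guard lock(m_mutex);
		account(now);
		m_executor_busy = busy;
	}

	clock::duration time_in(const state s) const {
		std::lock_guard lock(m_mutex);
		return m_time[s];
	}

	// Starvation: the scheduler is busy while the executor has nothing to do.
	clock::duration starvation_time() const { return time_in(scheduler_busy_only); }

  private:
	// Attribute the time since the last transition to the state we were in.
	void account(const clock::time_point now) {
		const unsigned current = (m_scheduler_busy ? 1u : 0u) | (m_executor_busy ? 2u : 0u);
		m_time[current] += now - m_last_change;
		m_last_change = now;
	}

	mutable std::mutex m_mutex;
	bool m_scheduler_busy = false;
	bool m_executor_busy = false;
	clock::time_point m_last_change;
	std::array<clock::duration, 4> m_time{};
};
```

With this shape the executor exposes no query functions; the runtime can derive active time as the sum of the two executor-busy states and compare it against `starvation_time()`.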

psalz added 2 commits December 20, 2024 14:46

Update benchmark results for executor starvation tracking

26b9c3f

psalz force-pushed the report-executor-starvation branch from cf2be90 to 26b9c3f Compare December 20, 2024 13:48

github-actions bot reviewed Dec 20, 2024

PeterTh approved these changes Dec 31, 2024

GagaLP approved these changes Jan 7, 2025

fknorr requested changes Jan 16, 2025
