Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Profiling specific threads? #39743

Closed
NHDaly opened this issue Feb 18, 2021 · 9 comments
Closed

Profiling specific threads? #39743

NHDaly opened this issue Feb 18, 2021 · 9 comments
Labels
kind:feature Indicates new feature / enhancement requests performance Must go faster

Comments

@NHDaly
Copy link
Member

NHDaly commented Feb 18, 2021

Are there any plans for configuring the Profile profiler to only collect profiles for a specific threadid?

It would be nice to e.g. only profile what's happening on threadid 1.

The use case I have in mind is that we're starting an HTTP.jl HTTP Server on thread 1, via HTTP.serve(server=tcp) do http_request ; ... end, which from what I understand uses @async to schedule the provided callback on new Tasks, on the current thread. I am interested in profiling the performance and utilization of just that one thread.

Is this something that could be easily supported and added to Profile? Or maybe something that could be performed in post-processing of the collected profile data? Do we record a thread id in the traces we collect? (I think we don't, but maybe i'm missing something?)

Thanks! 🙂 ❤️ 📈

@NHDaly NHDaly added kind:feature Indicates new feature / enhancement requests performance Must go faster labels Feb 18, 2021
@vchuravy
Copy link
Sponsor Member

Looking at

julia/src/signals-unix.c

Lines 677 to 724 in 0d47bfe

// sample each thread, round-robin style in reverse order
// (so that thread zero gets notified last)
if (critical || profile)
jl_lock_profile();
for (int i = jl_n_threads; i-- > 0; ) {
// notify thread to stop
jl_thread_suspend_and_get_state(i, &signal_context);
// do backtrace on thread contexts for critical signals
// this part must be signal-handler safe
if (critical) {
bt_size += rec_backtrace_ctx(bt_data + bt_size,
JL_MAX_BT_SIZE / jl_n_threads - 1,
signal_context, NULL);
bt_data[bt_size++].uintptr = 0;
}
// do backtrace for profiler
if (profile && running) {
if (jl_profile_is_buffer_full()) {
// Buffer full: Delete the timer
jl_profile_stop_timer();
}
else {
// unwinding can fail, so keep track of the current state
// and restore from the SEGV handler if anything happens.
jl_ptls_t ptls = jl_get_ptls_states();
jl_jmp_buf *old_buf = ptls->safe_restore;
jl_jmp_buf buf;
ptls->safe_restore = &buf;
if (jl_setjmp(buf, 0)) {
jl_safe_printf("WARNING: profiler attempt to access an invalid memory location\n");
} else {
// Get backtrace data
bt_size_cur += rec_backtrace_ctx((jl_bt_element_t*)bt_data_prof + bt_size_cur,
bt_size_max - bt_size_cur - 1, signal_context, NULL);
}
ptls->safe_restore = old_buf;
// Mark the end of this block with 0
bt_data_prof[bt_size_cur++].uintptr = 0;
}
}
// notify thread to resume
jl_thread_resume(i, sig);
}
we currently record from which thread the backtrace is coming from.

@NHDaly
Copy link
Member Author

NHDaly commented Feb 19, 2021

Do we? I don't see it in there. :) did you mean to type "we currently don't record"?

But I see that we could record it!

Thanks for linking me to the right spot to look ❤️

@NHDaly
Copy link
Member Author

NHDaly commented Feb 19, 2021

@vchuravy - per your suggestion, i gave this a shot here:
master...NHDaly:nhd-profile-threadid_mask

It seems like it should be pretty straightforward, actually, but the code I wrote doesn't seem to affect anything. Even the exit(1) i've added in the second commit (57cab58) doesn't have any effect. Is this not the right place to be looking (for macOS)?

exit(1);

@vchuravy
Copy link
Sponsor Member

did you mean to type "we currently don't record"?

Yes...

Is this not the right place to be looking (for macOS)?

Ah yes if you are on that other operating system, that only pretends to be unix.

julia/src/signals-mach.c

Lines 494 to 542 in 0d47bfe

for (i = jl_n_threads; i-- > 0; ) {
// if there is no space left, break early
if (jl_profile_is_buffer_full()) {
jl_profile_stop_timer();
break;
}
unw_context_t *uc;
jl_thread_suspend_and_get_state(i, &uc);
if (running) {
#ifdef LIBOSXUNWIND
/*
* Unfortunately compact unwind info is incorrectly generated for quite a number of
* libraries by quite a large number of compilers. We can fall back to DWARF unwind info
* in some cases, but in quite a number of cases (especially libraries not compiled in debug
* mode, only the compact unwind info may be available). Even more unfortunately, there is no
* way to detect such bogus compact unwind info (other than noticing the resulting segfault).
* What we do here is ugly, but necessary until the compact unwind info situation improves.
* We try to use the compact unwind info and if that results in a segfault, we retry with DWARF info.
* Note that in a small number of cases this may result in bogus stack traces, but at least the topmost
* entry will always be correct, and the number of cases in which this is an issue is rather small.
* Other than that, this implementation is not incorrect as the other thread is paused while we are profiling
* and during stack unwinding we only ever read memory, but never write it.
*/
forceDwarf = 0;
unw_getcontext(&profiler_uc); // will resume from this point if the next lines segfault at any point
if (forceDwarf == 0) {
// Save the backtrace
bt_size_cur += rec_backtrace_ctx((jl_bt_element_t*)bt_data_prof + bt_size_cur, bt_size_max - bt_size_cur - 1, uc, NULL);
}
else if (forceDwarf == 1) {
bt_size_cur += rec_backtrace_ctx_dwarf((jl_bt_element_t*)bt_data_prof + bt_size_cur, bt_size_max - bt_size_cur - 1, uc, NULL);
}
else if (forceDwarf == -1) {
jl_safe_printf("WARNING: profiler attempt to access an invalid memory location\n");
}
forceDwarf = -2;
#else
bt_size_cur += rec_backtrace_ctx((jl_bt_element_t*)bt_data_prof + bt_size_cur, bt_size_max - bt_size_cur - 1, uc, NULL);
#endif
// Mark the end of this block with 0
bt_data_prof[bt_size_cur++].uintptr = 0;
}
// We're done! Resume the thread.
jl_thread_resume(i, 0);

@NHDaly
Copy link
Member Author

NHDaly commented Feb 19, 2021

did you mean to type "we currently don't record"?

Yes...

Haha 🙈 it didn't seem like a likely typo, but i read the code over a few times to be sure 😋

Cool, thanks again for the second link! 😁

@NHDaly
Copy link
Member Author

NHDaly commented Feb 19, 2021

What does "mach" refer to? I see it all the time, but i have never actually asked.

@vchuravy
Copy link
Sponsor Member

What does "mach" refer to? I see it all the time, but i have never actually asked.

Mach -> XNU IIUC Mach is the microkernel basis of XNU which is the kernel for Darwin/MacOS

@NHDaly
Copy link
Member Author

NHDaly commented Feb 19, 2021

Thanks! that explains that for me. :)


haha okay, so thanks to your pointers, this was easy enough that i just did it myself:
#39746

😁 Interested to hear what you and @vtjnash have to say! ❤️ night!

@NHDaly
Copy link
Member Author

NHDaly commented Jan 2, 2022

Closed by #41742.

@NHDaly NHDaly closed this as completed Jan 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature Indicates new feature / enhancement requests performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants