Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make allocation profiling work after exec #342

Merged
merged 12 commits into from
Nov 14, 2023
Merged

Conversation

nsavoire
Copy link
Collaborator

@nsavoire nsavoire commented Nov 10, 2023

What does this PR do?

This PR add the ability to profile allocations even after a process has exec'd into a new binary image.

How do this work ?

ddprof now keeps a unix socket open, unix socket path can be overridden with --socket input option.
Worker accepts connections on this socket in a separate thread and sends ring buffer information upon request.
When a new process is exec'd, library looks up DD_PROFILING_NATIVE_LIB_SOCKET env variable to get socket path and if it exists, connects to it otherwise it spawns a ddprof process, then uses a pipe to synchronize with ddprof start and get socket path, and then connects to it.
By default ddprof creates an abstract socket, that does not exist on the filesystem: \0/tmp/ddprof-<pid>.sock.
Exec following can be disabled by setting env variable DD_PROFILING_NATIVE_ALLOCATION_PROFILING_FOLLOW_EXECS to 0/off/no.

@pr-commenter
Copy link

pr-commenter bot commented Nov 10, 2023

Benchmark results for collatz

Parameters

Baseline Candidate
config baseline candidate
profiler-version ddprof 0.15.0+f257ed6c.22893508 ddprof 0.15.0+e3efa971.23283787

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean execution_time
scenario:ddprof -S bench-collatz --preset cpu_only collatz_runner.sh same

@pr-commenter
Copy link

pr-commenter bot commented Nov 10, 2023

Benchmark results for BadBoggleSolver_run

Parameters

Baseline Candidate
config baseline candidate
profiler-version ddprof 0.15.0+f257ed6c.22893508 ddprof 0.15.0+e3efa971.23283787

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1 metrics, 0 unstable metrics.

See unchanged results
scenario Δ mean execution_time
scenario:ddprof -S bench-bad-boggle-solver BadBoggleSolver_run work 1000 same

@nsavoire nsavoire force-pushed the nsavoire/follow_execs branch from b7b6411 to a95dfb7 Compare November 13, 2023 12:22
@nsavoire nsavoire force-pushed the nsavoire/follow_execs branch from a95dfb7 to 6a91d73 Compare November 13, 2023 14:04
check "./ddprof bash -c \"./test/simple_malloc ${opts}\"" 1

# Test disabling follow execs
check "env DD_PROFILING_NATIVE_ALLOCATION_PROFILING_FOLLOW_EXECS=0 DD_PROFILING_NATIVE_STARTUP_WAIT_MS=200 ./ddprof bash -c \"./test/simple_malloc ${opts}\"" 0 0 "inuse-space"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting to test the removal of the feature. For overhead, there might be a use case where users do not want to follow some execs. Good thinking.

@@ -297,10 +297,14 @@ int DDProfCLI::parse(int argc, const char *argv[]) {
app.add_flag("--help_extended", help_extended, "Show extended options")
->group(""));
extended_options.push_back(
app.add_option("--socket", socket,
"Profiler's IPC socket, as a file descriptor")
app.add_option("--socket", socket_path, "Profiler socket path")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: we could give a more detailed explanation of why we establish this socket.

Copy link
Collaborator Author

@nsavoire nsavoire Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean adding details to the help text or adding comments to explain why we need a socket ?
FYI I added an overview of the communication process between profiler/library here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a user I would be confused with what I can do with this.
The overview is great.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess these are more developer options. We do not use them apart from testing use cases. Would something like:
Override the automatically created socket with a specific path
Be better ?

ddprof now keeps a unix socket open, unix socket path can be overridden with --socket input option.
Worker accepts connections on this socket in a separate thread and sends ring buffer information upon request.
When a new process is exec'd, library looks up DD_PROFILING_NATIVE_LIB_SOCKET env variable to get socket path
and if it exists, connects to it otherwise it spawns a ddprof process, then uses a pipe to synchronize with ddprof
start and get socket path, and then connects to it.
By default ddprof creates an abstract socket, that does not exist on the filesystem: \0/tmp/ddprof-<pid>.sock.
This behaviour can be disabled by setting env variable DD_PROFILING_NATIVE_ALLOCATION_PROFILING_FOLLOW_EXECS
to 0/off/no.
@nsavoire nsavoire force-pushed the nsavoire/follow_execs branch from 6a91d73 to d57d94b Compare November 13, 2023 17:00
src/lib/dd_profiling.cc Outdated Show resolved Hide resolved
@@ -109,6 +111,45 @@ DDRes spawn_workers(PersistentWorkerState *persistent_worker_state,
return {};
}

ReplyMessage create_reply_message(const DDProfContext &ctx) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: it feels slightly weird to have this function defined here.


if (ctx.watchers[alloc_watcher_idx].aggregation_mode ==
EventAggregationMode::kLiveSum) {
reply.allocation_flags |= (1 << ReplyMessage::kLiveCallgraph);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems someone forgot to rename the options all the way down 😅

int pfd_len = 0;
pollfd_setup(&ctx.worker_ctx.pevent_hdr, poll_fds, &pfd_len);

DDRES_CHECK_FWD(signalfd_setup(&poll_fds[pfd_len]));
int const signal_pos = pfd_len++;

WorkerServer const server =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow I had not thought about making this a separate thread. I thought we would make it live within the mainloop.
I think it is better this way.

r1viollet
r1viollet previously approved these changes Nov 14, 2023
Copy link
Collaborator

@r1viollet r1viollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
A very impressive capability.
I am eager to start playing with this.

@nsavoire nsavoire requested a review from r1viollet November 14, 2023 13:24
Copy link
Collaborator

@r1viollet r1viollet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nsavoire nsavoire merged commit 0b28374 into main Nov 14, 2023
@nsavoire nsavoire deleted the nsavoire/follow_execs branch November 14, 2023 14:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants