Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configure stack size #284

Merged
merged 12 commits into from
Jul 7, 2023
Merged

Configure stack size #284

merged 12 commits into from
Jul 7, 2023

Conversation

r1viollet
Copy link
Collaborator

@r1viollet r1viollet commented Jul 5, 2023

What does this PR do?

Add a parameter to configure the size of samples.

Motivation

Ensure we can unwind deeper stacks.
Discussions in #276

Additional Notes

For now the maximum value allowed by linux is the closest value to USHORT_MAX, 8 byte aligned:
65528
If this is not enough we should consider a different kernel API.

How to test the change?

A test with deep stacks was added. I will run this in CI.
https://github.com/DataDog/ddprof-build/pull/101

@r1viollet r1viollet force-pushed the r1viollet/config_sample_size branch from fb7350d to 6d3d342 Compare July 5, 2023 14:09
@@ -55,11 +58,34 @@ static void display_system_config(void) {
}
}

int pevent_compute_min_mmap_order(int min_buffer_size_order,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nsavoire @sanchda would you agree with computing the size of the ring buffer based on the requested sample size ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for me

@r1viollet r1viollet force-pushed the r1viollet/config_sample_size branch from 0e9bced to e7249b6 Compare July 6, 2023 07:30
@r1viollet r1viollet marked this pull request as ready for review July 6, 2023 07:44
@r1viollet r1viollet requested a review from sanchda as a code owner July 6, 2023 07:44
@r1viollet r1viollet force-pushed the r1viollet/config_sample_size branch from aa5ba50 to f136a1c Compare July 6, 2023 07:55
@nsavoire
Copy link
Collaborator

nsavoire commented Jul 6, 2023

I find that the naming sample_stack_user used throughout the PR is not very good at telling what the parameter really is.
Perhaps something like: stack_sample_size or user_stack_sample_size

include/ddprof_defs.hpp Outdated Show resolved Hide resolved
src/ddprof_cli.cc Outdated Show resolved Hide resolved
@r1viollet r1viollet changed the title [WIP] Configure stack size Configure stack size Jul 6, 2023
@r1viollet
Copy link
Collaborator Author

I find that the naming sample_stack_user used throughout the PR is not very good at telling what the parameter really is. Perhaps something like: stack_sample_size or user_stack_sample_size

I thought the same, though I wanted the consistency with the perf_event_open API

       sample_stack_user (since Linux 3.7)
              This defines the size of the user stack to dump if
              PERF_SAMPLE_STACK_USER is specified.

I guess we can forget about consistency and make it more user friendly ?

src/ddprof_cmdline.cc Outdated Show resolved Hide resolved
src/pevent_lib.cc Outdated Show resolved Hide resolved
src/pevent_lib.cc Outdated Show resolved Hide resolved
test/deep_stacks.cc Show resolved Hide resolved
test/deep_stacks.sh Show resolved Hide resolved
test/deep_stacks.sh Show resolved Hide resolved
test/deep_stacks.sh Outdated Show resolved Hide resolved
test/deep_stacks.cc Outdated Show resolved Hide resolved
test/deep_stacks.cc Outdated Show resolved Hide resolved
./ddprof --config ${CONFIG_DEEP_STACK} ./test/deep_stacks | tee ./log_ddprof.txt

# Check for truncated stack traces
truncated_input=$(awk -F': ' '/datadog.profiling.native.unwind.stack.truncated_input/ { print $NF }' ./log_ddprof.txt)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not test that the stacks are correct ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need to open source prof-correctness to avoid duplicating some of the logics for this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but in its current state this test does not test much (and is quite complicated).
In release mode, everything is probably inlined in main function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying out release mode, everything looks OK
image

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did add a "no inline" attribute to make sure. Thanks for mentioning this.

Various adjustments based on pull request #284 comments
- Adjust option naming
- Add a template configuration element to properly account for default settings
- other minor fixes
@r1viollet
Copy link
Collaborator Author

The failures in CI are related to mmap failures..

@r1viollet
Copy link
Collaborator Author

I find that the naming sample_stack_user used throughout the PR is not very good at telling what the parameter really is. Perhaps something like: stack_sample_size or user_stack_sample_size

I thought the same, though I wanted the consistency with the perf_event_open API

       sample_stack_user (since Linux 3.7)
              This defines the size of the user stack to dump if
              PERF_SAMPLE_STACK_USER is specified.

I guess we can forget about consistency and make it more user friendly ?

I adjusted the naming, we are now inconsistent, though the setting is easier to understand.

- Ensure we take into account values from the template configuration when parsing a new configuration
@r1viollet
Copy link
Collaborator Author

The failures in CI are related to mmap failures..

I reverted the increase in buffer size :-(

- Removal of useless volatile keyword
include/lib/allocation_event.hpp Outdated Show resolved Hide resolved
test/deep_stacks.cc Outdated Show resolved Hide resolved
test/deep_stacks.cc Outdated Show resolved Hide resolved
test/deep_stacks.cc Show resolved Hide resolved
src/event_parser/event_parser.y Outdated Show resolved Hide resolved
// considering sample size, we adjust the size of ring buffers.
// Following is considered as a minimum number of samples to be fit in the
// ring buffer.
constexpr auto k_min_number_samples_per_ring_buffer = 7;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was default config seems wasteful:

  • event size is 32992 bytes (32768 + 224)
  • hence buffer size order to accommodate 7 samples will be 6 , ie. 266240 bytes with a usable size of 262144 bytes
  • meaning that with 7 samples in the buffer, 31200 bytes will remain free

What about decreasing default sample stack size to something like 32000 bytes, then we could fit 8 samples in the buffer and 4352 bytes would remain for other events ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change on default sample sizes.

- More tweaking on default sizes
Copy link
Collaborator

@nsavoire nsavoire left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM !

@r1viollet r1viollet merged commit 4b9cc81 into main Jul 7, 2023
@r1viollet r1viollet deleted the r1viollet/config_sample_size branch July 7, 2023 08:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants