Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(profiling): remove execute_internal hook #2719

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

morrisonlevi
Copy link
Collaborator

PROF-9982

Description

Background and Motivation

For wall- and CPU- time, we walk the call stack in the VM interrupt handler. Unfortunately, if an internal function call is the leaf frame at the time the interrupt was triggered, it gets popped off the call stack before the VM interrupt handler is triggered. To work around this, we add the zend_execute_internal hook and manually call the routine to collect the call stack while the internal frame was still on the call stack.

Using the zend_execute_internal hook has performance costs today. These costs are made even worse on the upcoming PHP 8.4 release because it adds a new optimization for frameless calls, and setting zend_execute_internal disables this optimization.

The goal of this task is to avoid installing this zend_execute_internal hook, while not regressing on performance, and without losing functionality--we definitely need that leaf frame. A lot of functions show up there, like regular expression and database related functions.

Implementation

This does some... uh... let's say "clever" things to avoid using the zend_execute_internal hook. In thread B which triggers the interrupt, it also saves the current value of thread A's EG(current_execute_data). Then when thread A handles its interrupt, it compares that against EG(vm_stack_top). If they are equal, then that's used as the top of the stack instead of the execute_data passed to the interrupt handler by the engine, and otherwise we just use execute_data as normal.

Reviewer checklist

  • Test coverage seems ok.
  • Appropriate labels assigned.

@github-actions github-actions bot added profiling Relates to the Continuous Profiler tracing labels Jun 18, 2024
@pr-commenter
Copy link

pr-commenter bot commented Jun 18, 2024

Benchmarks

Benchmark execution time: 2024-06-28 02:26:52

Comparing candidate commit 6d1ae69 in PR branch levi/vm_stack_top with baseline commit 44a07ba in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 27 metrics, 9 unstable metrics.

@morrisonlevi morrisonlevi force-pushed the levi/vm_stack_top branch 5 times, most recently from fb2ae5e to de2181c Compare June 21, 2024 18:59
@morrisonlevi morrisonlevi force-pushed the levi/vm_stack_top branch 2 times, most recently from 1ad5a64 to 2f1f393 Compare June 27, 2024 18:28
This does some... uh... let's say "clever" things to avoid using the
zend_execute_internal hook. In PHP 8.4, that would prevent frameless
function call optimizations from being used. Using the hook also has
performance costs even on older versions.
@codecov-commenter
Copy link

codecov-commenter commented Jun 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.52%. Comparing base (44a07ba) to head (6d1ae69).

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #2719      +/-   ##
============================================
+ Coverage     79.24%   79.52%   +0.28%     
  Complexity     2216     2216              
============================================
  Files           201      201              
  Lines         22595    22595              
============================================
+ Hits          17905    17969      +64     
+ Misses         4690     4626      -64     
Flag Coverage Δ
tracer-extension 78.82% <ø> (ø)
tracer-php 80.54% <ø> (+0.69%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

see 6 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44a07ba...6d1ae69. Read the comment docs.

@morrisonlevi
Copy link
Collaborator Author

It took quite a few randomized test runs but I managed to get a crash. Here's the top of the stack:

#0  datadog_php_profiling::profiling::stack_walking::detail::StringCache::get_or_insert (slot=0, f=..., self=<optimized out>)
    at profiling/src/profiling/stack_walking.rs:103
#1  datadog_php_profiling::profiling::stack_walking::detail::handle_function_cache_slot (func=0x7fbf16da3600, string_cache=<optimized out>)
    at profiling/src/profiling/stack_walking.rs:284
#2  datadog_php_profiling::profiling::stack_walking::detail::collect_call_frame (string_set=0x565364b5dba8, execute_data=<optimized out>)
    at profiling/src/profiling/stack_walking.rs:242
#3  datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample::{{closure}} (cell=<optimized out>) at profiling/src/profiling/stack_walking.rs:202
#4  std::thread::local::LocalKey<T>::try_with (self=<optimized out>, f=...) at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:270
#5  std::thread::local::LocalKey<T>::with (f=...) at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:246
#6  datadog_php_profiling::profiling::stack_walking::detail::collect_stack_sample (top_execute_data=<optimized out>) at profiling/src/profiling/stack_walking.rs:195
#7  0x00007fbf1d960303 in datadog_php_profiling::profiling::Profiler::collect_time (self=0x7fbf1deee698 <datadog_php_profiling::PROFILER+8>, 
    execute_data=0x23880000, interrupt_count=537535084) at profiling/src/profiling/mod.rs:693
#8  0x00007fbf1d961a1c in datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function::{{closure}} (cell=<optimized out>) at profiling/src/wall_time.rs:76
#9  std::thread::local::LocalKey<T>::try_with (self=<optimized out>, f=...) at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:270
#10 std::thread::local::LocalKey<T>::with (f=...) at /rustc/eb26296b556cef10fb713a38f3d16b9886080f26/library/std/src/thread/local.rs:246
#11 datadog_php_profiling::wall_time::ddog_php_prof_interrupt_function (execute_data=0x7fbf1fc14090) at profiling/src/wall_time.rs:36
#12 0x00007fbf1eddc0fc in dd_uhook_end (invocation=44259, execute_data=0x7fbf1fc14090, retval=0x7ffde6a66b10, auxiliary=0x7fbf1fdb5550, dynamic=0x7fbf16aab1c0)
    at /home/circleci/datadog/tmp/build_extension/ext/hook/uhook.c:383

Basically, the tracer calls the profiler's interrupt function before calling a closure in an end hook. The profiler crashes when looking at the run time cache of the "risky" frame.

Havnen't yet identified why the run time cache is invalid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
profiling Relates to the Continuous Profiler tracing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants