Skip to content

Latest commit

 

History

History
246 lines (201 loc) · 9.48 KB

signal-safe-tracing.md

File metadata and controls

246 lines (201 loc) · 9.48 KB

Signal-Safe Stack Tracing

Overview

Stack traces from signal handlers can provide very helpful information for debugging application crashes, e.g. from SIGSEGV or SIGTRAP handlers. Signal handlers are really restrictive environments as your application could be interrupted by a signal at any point, including in the middle of malloc or buffered IO or while holding a lock. Doing a stack trace in a signal handler is possible but it requires a lot of care. This is difficult to do correctly and most examples online do this incorrectly.

It is not possible to resolve debug symbols safely in the process from a signal handler without heroic effort. In order to produce a full trace there are three options:

  1. Carefully save the object trace information to be resolved at a later time outside the signal handler
  2. Write the object trace information to a file to be resolved later
  3. Spawn a new process, communicate object trace information to that process, and have that process do the trace resolution

For traces on segfaults, e.g., only options 2 and 3 are viable. The this guide will go over approach 3 and the cpptrace safe API.

Big-Picture

In order to do this full process safely the way to go is collecting basic information in the signal handler and then either resolving later or handing that information to another process to resolve.

It's not as simple as calling cpptrace::generate_trace().print() but this is what is needed to really do this safely as far as I can tell.

FAQ: What's the worst that could happen if you call cpptrace::generate_trace().print() from a signal handler? In many cases you might be able to get away with it but you risk deadlocking or memory corruption.

Important

Currently signal-safe stack unwinding is only possible with libunwind, more details later.

Important

_dl_find_object is required for signal-safe stack tracing. This is a relatively recent addition to glibc, added in glibc 2.35.

Caution

Calls to shared objects can be lazy-loaded where the first call to the shared object invokes non-signal-safe functions such as malloc(). Because of this, the signal safe api must be "warmed up" ahead of a signal handler.

API

Cpptrace provides APIs for generating raw trace information safely and then also safely resolving those raw pointers to the most minimal object information needed to resolve later.

namespace cpptrace {
    // signal-safe
    std::size_t safe_generate_raw_trace(frame_ptr* buffer, std::size_t size, std::size_t skip = 0);
    // signal-safe
    std::size_t safe_generate_raw_trace(frame_ptr* buffer, std::size_t size, std::size_t skip, std::size_t max_depth);

    struct safe_object_frame {
        frame_ptr raw_address;
        frame_ptr address_relative_to_object_start;
        char object_path[CPPTRACE_PATH_MAX + 1];
        object_frame resolve() const; // To be called outside a signal handler. Not signal safe.
    };

    // signal-safe
    void get_safe_object_frame(frame_ptr address, safe_object_frame* out);
    // signal-safe
    bool can_signal_safe_unwind();
}

safe_generate_raw_trace produces instruction pointers found while unwinding the stack. These addresses are sufficient information to resolve a stack trace in the currently running process after a signal handler exits, but they aren't sufficient for resolving outside of the currently running process, unless there is no position-independent code or shared-library code.

To resolve outside the current process safe_object_frame information is needed. This contains the path to the object where the address is located as well as the address before address randomization.

Strategy

Signal-safe tracing can be done three ways:

  • In a signal handler, call safe_generate_raw_trace and then outside a signal handler construct a cpptrace:raw_trace and resolve.
  • In a signal handler, call safe_generate_raw_trace, then write cpptrace::safe_object_frame information to a file to be resolved later.
  • In a signal handler, call safe_generate_raw_trace, fork() and exec() a process to handle the resolution, pass cpptrace::safe_object_frame information to that child through a pipe, and wait for the child to exit.

It's not as simple as calling cpptrace::generate_trace().print(), I know, but these are truly the only ways to do this safely as far as I can tell.

Technical Requirements

Note: Not all back-ends and platforms support these interfaces. If signal-safe unwinding isn't supported safe_generate_raw_trace will just produce an empty trace and if object information can't be resolved in a signal-safe way then get_safe_object_frame will not populate fields beyond the raw_address.

Currently the only back-end that can unwind safely is libunwind. Currently, the only way I know to get dladdr's information in a signal-safe manner is _dl_find_object, which doesn't exist on macos (or windows of course). If anyone knows ways to do these safely on other platforms, I'd be much appreciative.

Signal-Safe Tracing With fork() + exec()

Of the three strategies, fork() + exec(), is the most technically involved and the only way to resolve while the signal handler is running. I think it's worthwhile to do a deep-dive into how to do this.

In the source code, signal_demo.cpp and signal_tracer.cpp provide a working example for what is described here.

In the main program

The main program handles most of the complexity for tracing from signal handlers:

  • Collecting a raw trace
  • Spawning a child process
  • Resolving raw frame pointers to minimal object frames
  • Sending that info to the other process

Also note: A warmup for the library is done in main.

A basic implementation is as follows:

#include <sys/wait.h>
#include <cstring>
#include <signal.h>
#include <unistd.h>

#include <cpptrace/cpptrace.hpp>

// This is just a utility I like, it makes the pipe API more expressive.
struct pipe_t {
    union {
        struct {
            int read_end;
            int write_end;
        };
        int data[2];
    };
};

void do_signal_safe_trace(cpptrace::frame_ptr* buffer, std::size_t count) {
    // Setup pipe and spawn child
    pipe_t input_pipe;
    pipe(input_pipe.data);
    const pid_t pid = fork();
    if(pid == -1) {
        const char* fork_failure_message = "fork() failed\n";
        write(STDERR_FILENO, fork_failure_message, strlen(fork_failure_message));
        return;
    }
    if(pid == 0) { // child
        dup2(input_pipe.read_end, STDIN_FILENO);
        close(input_pipe.read_end);
        close(input_pipe.write_end);
        execl("signal_tracer", "signal_tracer", nullptr);
        const char* exec_failure_message = "exec(signal_tracer) failed: Make sure the signal_tracer executable is in "
            "the current working directory and the binary's permissions are correct.\n";
        write(STDERR_FILENO, exec_failure_message, strlen(exec_failure_message));
        _exit(1);
    }
    // Resolve to safe_object_frames and write those to the pipe
    for(std::size_t i = 0; i < count; i++) {
        cpptrace::safe_object_frame frame;
        cpptrace::get_safe_object_frame(buffer[i], &frame);
        write(input_pipe.write_end, &frame, sizeof(frame));
    }
    close(input_pipe.read_end);
    close(input_pipe.write_end);
    // Wait for child
    waitpid(pid, nullptr, 0);
}

void handler(int signo, siginfo_t* info, void* context) {
    // Print basic message
    const char* message = "SIGSEGV occurred:\n";
    write(STDERR_FILENO, message, strlen(message));
    // Generate trace
    constexpr std::size_t N = 100;
    cpptrace::frame_ptr buffer[N];
    std::size_t count = cpptrace::safe_generate_raw_trace(buffer, N);
    do_signal_safe_trace(buffer, count);
    // Up to you if you want to exit or continue or whatever
    _exit(1);
}

void warmup_cpptrace() {
    // This is done for any dynamic-loading shenanigans
    cpptrace::frame_ptr buffer[10];
    std::size_t count = cpptrace::safe_generate_raw_trace(buffer, 10);
    cpptrace::safe_object_frame frame;
    cpptrace::get_safe_object_frame(buffer[0], &frame);
}

int main() {
    warmup_cpptrace();
    // Setup signal handler
    struct sigaction action = { 0 };
    action.sa_flags = 0;
    action.sa_sigaction = &handler;
    if (sigaction(SIGSEGV, &action, NULL) == -1) {
        perror("sigaction");
    }

    /// ...
}

In the tracer program

The tracer program is quite simple. It just has to read cpptrace::safe_object_frames from the pipe, resolve to cpptrace::object_frames, and resolve an object_trace.

#include <cstdio>
#include <iostream>
#include <unistd.h>

#include <cpptrace/cpptrace.hpp>

int main() {
    cpptrace::object_trace trace;
    while(true) {
        cpptrace::safe_object_frame frame;
        // fread used over read because a read() from a pipe might not read the full frame
        std::size_t res = fread(&frame, sizeof(frame), 1, stdin);
        if(res == 0) {
            break;
        } else if(res != 1) {
            std::cerr<<"Something went wrong while reading from the pipe"<<res<<" "<<std::endl;
            break;
        } else {
            trace.frames.push_back(frame.resolve());
        }
    }
    trace.resolve().print();
}