rocallahan edited this page Mar 7, 2016 · 52 revisions

This page is in a somewhat disorganized state; please bear with us.

Userspace recording

No target recompilation or VM hypervisor required.

Chronomancer/Chronicle

gdb reverse debugging ("process recorder")

cjones:

Process record and replay works by logging the execution of each machine instruction in the child process (the program being debugged), together with each corresponding change in machine state (the values of memory and registers).
  • unclear how modifications of user memory during syscalls are recorded (apparently not at all)
  • unclear how process-shared memory is handled (apparently not at all)
  • very high overhead (single-steps the program using ptrace)
  • good approach for efficiently replaying reverse-step and similar commands
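
To make the trade-off above concrete, here is a minimal sketch of per-instruction logging on a toy machine. It is not gdb's implementation: the `Machine` class, the two-opcode instruction set, and the `step`/`reverse_step` helpers are all hypothetical names made up for illustration. It only shows why logging the old value of every changed register and memory cell makes reverse-step cheap, and why doing that work on every instruction is expensive.

```python
# Toy model only: a tiny "machine" with named registers and a flat memory dict.
# Nothing here reflects gdb internals; all names are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class Machine:
    regs: dict = field(default_factory=lambda: {"pc": 0, "a": 0})
    mem: dict = field(default_factory=dict)


def step(machine, program, log):
    """Execute one instruction and log the old value of everything it changes."""
    op, arg = program[machine.regs["pc"]]
    undo = [("reg", "pc", machine.regs["pc"])]
    if op == "add":
        undo.append(("reg", "a", machine.regs["a"]))
        machine.regs["a"] += arg
    elif op == "store":
        # None marks "this address had no value before the store".
        undo.append(("mem", arg, machine.mem.get(arg)))
        machine.mem[arg] = machine.regs["a"]
    else:
        raise ValueError(f"unknown op: {op}")
    machine.regs["pc"] += 1
    log.append(undo)


def reverse_step(machine, log):
    """Undo the most recent instruction by restoring the logged old values."""
    for kind, key, old in reversed(log.pop()):
        if kind == "reg":
            machine.regs[key] = old
        elif old is None:
            machine.mem.pop(key, None)
        else:
            machine.mem[key] = old


program = [("add", 5), ("store", 100), ("add", 2)]
machine, log = Machine(), []
for _ in program:
    step(machine, program, log)
```

Calling `reverse_step(machine, log)` twice after the loop rewinds the toy machine to the state just after the first `add`. The log grows by one entry per executed instruction, which is the overhead concern noted above.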

UndoDB

  • Similar design to rr: records whole Linux process
  • Relies on code instrumentation in some manner
  • Single-core execution
  • Currently (4.0.3363) crashes when trying to record Firefox
  • Integrates with gdb

Nirvana

Hypervisor recording

ReVirt

VMWare Record & Replay

  • Project canceled

PANDA

Xen-TT

Performance Counters

Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations (Weaver, Terpstra, Moore)

Language/VM-specific Replay

WebReplay

Chakra JS Debugger

Python Time Travel Debugger

Chronon

roc:
It sounds like a very similar architecture to Chronomancer, but for the Java VM: instrument code to record variable changes and memory writes, pass raw trace data to helper threads which use carefully optimized compression, and provide a query engine to build a debugger on. Overheads quoted in that slide deck range from >200x (even more than Chronomancer) for well-optimized, CPU-bound Java code, down to 2x when you spend plenty of time in I/O or in code that's excluded from Chronon instrumentation.

That's probably a reasonable thing to do for J2EE code, and they get to use multiple cores to run the application. For something like Firefox though, where you really want to instrument the entire software stack and parallelism is not a big issue, rr's approach seems much better.

Also I suspect they don't actually have complete heap information when they're not instrumenting the entire program, though they might do some heap traversal to mitigate that. Scalability issues are mentioned here: http://blog.jetbrains.com/idea/2014/03/try-chronon-debugger-with-intellij-idea-13-1-eap/#comment-55286.

Also it sounds like they don't have divergence support. Of course Java VMs don't support cloning, so they could only implement divergence using emulation, but you'd need a lot of heap data to make that work reliably.
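
The pipeline described above (instrumented writes, helper-thread compression, query engine) can be sketched roughly as follows. This is a toy in Python, not Chronon's design: the function names, the JSON encoding, the batch size, and the use of `zlib` as a stand-in for their "carefully optimized compression" are all assumptions for illustration.

```python
# Toy sketch of "record variable changes -> compress on a helper thread -> query".
# All names and the trace format are illustrative assumptions, not Chronon's.
import json
import queue
import threading
import zlib


def compress_worker(raw, chunks):
    """Helper thread: compress batches of raw trace events as they arrive."""
    while True:
        batch = raw.get()
        if batch is None:  # sentinel: recording is finished
            break
        chunks.append(zlib.compress(json.dumps(batch).encode()))


def record(events, batch_size=2):
    """Feed (variable, value) events to the helper thread in batches."""
    raw, chunks = queue.Queue(), []
    worker = threading.Thread(target=compress_worker, args=(raw, chunks))
    worker.start()
    for i in range(0, len(events), batch_size):
        raw.put(events[i:i + batch_size])
    raw.put(None)
    worker.join()
    return chunks


def history(chunks, name):
    """Minimal 'query engine': every recorded value of one variable, in order."""
    values = []
    for chunk in chunks:
        for var, value in json.loads(zlib.decompress(chunk)):
            if var == name:
                values.append(value)
    return values


events = [("x", 1), ("y", 2), ("x", 3)]
chunks = record(events)
x_history = history(chunks, "x")
```

Here the application thread only enqueues raw events, and the compression cost lands on the helper thread. That is the part that benefits from spare cores, as noted above.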

GUI-level Record And Replay

Valera

Reran

(Not yet categorized)

Scribe

roc:

There are a few major differences between Scribe and rr:
  • Scribe doesn't serialize all threads. Instead they do a bunch of work to make sure all threads can run simultaneously. This reduces overhead in some places and adds overhead in others.
  • They say their approach doesn't require "changing, relinking or recompiling the kernel", but it has to track internal kernel state like inodes and VFS path traversal, and it's not really clear how they do that. They also say "Scribe records by intercepting all interactions of processes with their environment, capturing all nondeterminism in events that are stored in log queues inside the kernel" so my guess is they're using a kernel module. That's a pretty big negative in my view.
  • Scribe doesn't use performance counters to record asynchronous events. Instead they defer signal delivery until the next time the process enters the kernel. If the process doesn't enter the kernel for a long time, they basically take a snapshot of the entire state, force the process into the kernel and restart recording, which is extremely heavyweight. For some bugs, it's essential to allow async signal delivery at any program point, so I don't like Scribe's approach there.
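
As a rough illustration of the deferred-delivery point above, the sketch below models a process as a list of `"compute"` and `"syscall"` steps and holds each signal until the next kernel entry. This is a toy model, not Scribe's code: the trace format and the `deliver_deferred` helper are hypothetical.

```python
# Toy model of "defer signal delivery until the next kernel entry".
# The trace representation and function name are illustrative assumptions.

def deliver_deferred(trace, signals):
    """trace: list of "compute"/"syscall" steps.
    signals: {step_index: signal_name}, the step at which each signal arrives.
    Returns (delivered, pending): delivered is a list of (step_index, signal)
    pairs showing where each signal was actually delivered; pending holds
    signals still waiting when the trace ends."""
    pending, delivered = [], []
    for i, step in enumerate(trace):
        if i in signals:
            pending.append(signals[i])
        if step == "syscall":
            # Kernel entry: the only point where deferred signals are delivered.
            delivered.extend((i, sig) for sig in pending)
            pending.clear()
    return delivered, pending


trace = ["compute", "compute", "syscall", "compute", "compute"]
signals = {0: "SIGALRM", 3: "SIGUSR1"}
delivered, pending = deliver_deferred(trace, signals)
```

In this run `SIGALRM` arrives at step 0 but is only delivered at the syscall in step 2, and `SIGUSR1` is still pending when the trace ends. The second case is the one that needs the heavyweight snapshot-and-restart fallback, and the first shows why a bug that depends on the exact delivery point can be missed.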

iDNA

Jockey

Pinplay

Respec

Echo

OS Support

BackTracker

Time-Traveling Virtual Machines

ExtraVirt

SubVirt

SMP-ReVirt

Speck

DoublePlay

See this page.

ReTrace

CLAP

Capo

QuickRec / Capo3

FlashBack

ORDER: Object centRic DEterministic Replay for Java

PRES: Probabilistic replay with execution sketching on multiprocessors

Arnold

Dune

cjones:

This isn't a record/replay tool per se, but rather creates a framework on which one could be built. The elevator pitch is approximately that Dune exposes hardware virtualization features to userspace. So userspace can manage its own page tables, directly process exceptions, and so forth. With those tools, one could build a userspace-only ptrace equivalent. And that, in theory, could allow building an rr-like tool without rr's libpreload hackery (syscallbuf and seccomp-bpf) but with comparable performance. There are further interesting things that could be done with custom page-table entries.

Lingering issues:
  • does Dune expose rdtsc and cpuid virtualization?
  • does Dune expose some kind of interrupting programmable hwtimer?

Checkpointing

CRIU checkpointing of user-space Linux processes

Tonic Docker-based checkpointing for JS REPLs

seccomp-bpf

Mbox
