
Abstract Machines

Just as Java compiles down to bytecode to be run on a virtual machine, functional programming languages compile down to some bytecode of their own. But, confusingly, the functional-programming community uses the term "abstract machine" instead of "virtual machine", even though there is not really much of a difference between the two.

Really, abstract machines are where the "rubber meets the road". Want to change semantics from eager evaluation to lazy? It's in the abstract machine. Swap out the garbage collector? Look at the abstract machine.

There has been some attempt to systematically study abstract machines.

Basic Structure

The basic structure of an abstract machine is like that of a virtual machine (usually a stack machine) or of a CPU (at least, it mimics the CPU's instruction cycle). There is a limited "instruction set" (some binary bytecode or low-level "assembly code") and a main loop, which schematically looks like:

#include <stdbool.h>
#include <stddef.h>

typedef unsigned char bytecode;            /* one opcode per byte */
enum { INST_1, /* ..., */ INST_TERMINATE };

struct MachineState;                       /* registers, stack, heap, ... */

/* Each instruction handler updates the state and the instruction pointer,
   returning false when the machine should halt. */
bool instruction_1(struct MachineState *state, bytecode *program,
                   size_t programSize, size_t *ip);

void main_loop(struct MachineState *state, bytecode *program, size_t programSize) {
    bool isRunning = true;
    size_t ip = 0;
    while (isRunning) {
        switch (program[ip]) {
            case INST_1:
                isRunning = instruction_1(state, program, programSize, &ip);
                break;
            // other cases
            case INST_TERMINATE:
                isRunning = false;
                break;
            default:
                // panic: unknown opcode
                isRunning = false;
                break;
        }
    }
}

There are other ways to do this, e.g., using an array of function pointers to emulate threaded code. (It is unclear to me whether using inline assembly and goto to avoid the function calls would be faster or better, that is, whether coroutine-style dispatch would be superior in performance.)

#include <stddef.h>

typedef void (*Instruction)(struct MachineState *state);

/* Here 'program' is a NULL-terminated array of instruction pointers. */
void main_loop(struct MachineState *state, Instruction *program) {
    while (NULL != *program) {   // stop at the terminating NULL entry
        (*program)(state);       // run the current instruction
        program++;               // then move on to the next one
    }
}
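
A third option, alluded to in the parenthetical above, is direct-threaded dispatch via computed goto. Here is a minimal sketch using the GCC/Clang "labels as values" extension (not portable ISO C); the opcodes and the MachineState layout are invented for this example:

#include <stddef.h>

enum { OP_INCR, OP_DECR, OP_HALT };

struct MachineState { long acc; };

long run(struct MachineState *state, const unsigned char *program) {
    /* One label per opcode; the table maps an opcode to a label address. */
    static void *dispatch[] = { &&do_incr, &&do_decr, &&do_halt };
    size_t ip = 0;

    #define NEXT() goto *dispatch[program[ip++]]

    NEXT();
do_incr:
    state->acc++;
    NEXT();
do_decr:
    state->acc--;
    NEXT();
do_halt:
    return state->acc;

    #undef NEXT
}

Running a zero-initialized state over the program { OP_INCR, OP_INCR, OP_DECR, OP_HALT } returns 1; the point is that each handler jumps directly to the next one with no central switch.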

(Perhaps using inline functions would give the best of both worlds? The advantage of writing a function for each instruction is that it makes unit testing easier; the disadvantage is that the call overhead can noticeably penalize performance. The approach to take is (i) make it work, then (ii) make it fast.)
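
For example, a sketch of the one-function-per-instruction style (the instruction and its semantics are invented here): each handler is a small static inline function, so it can be unit-tested directly while the compiler is still free to inline it at the dispatch site:

#include <assert.h>
#include <stdbool.h>

struct MachineState { long acc; };

/* Each instruction is an ordinary (static inline) function, so it can be
   tested in isolation, while the compiler may inline it at the dispatch
   site and remove the call overhead. */
static inline bool inst_add1(struct MachineState *state) {
    state->acc += 1;
    return true;   /* keep running */
}

/* A unit test that exercises the instruction without the main loop. */
int main(void) {
    struct MachineState s = { .acc = 41 };
    assert(inst_add1(&s) && s.acc == 42);
    return 0;
}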

There is no typechecking at the abstract machine level; it is all done in the earlier stages of the compiler.

But there is a need for memory management via garbage collection.
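
For instance, here is a sketch of what a heap-object header might look like (the layout and field names are my assumptions, not any particular implementation): the source-level types are gone, but each object still records enough layout information for the collector to find the pointers it must trace:

#include <stddef.h>

/* Sketch of a heap-object header; the layout and field names are invented.
   No source-level types survive to this level: all the collector needs is
   to know which payload words are pointers. */
struct HeapObject {
    size_t n_pointers;    /* payload words the GC must follow   */
    size_t n_raw_words;   /* unboxed payload words the GC skips */
    /* payload words follow the header in memory */
};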

Overviews

SECD Machine

Landin's SECD machine was the first "abstract machine" for functional programming languages, at least in theory.
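
The name comes from the four components of its state: Stack, Environment, Control, and Dump. A rough sketch of that state (the cons-cell representation is my assumption, not Landin's original formulation):

/* Rough sketch of the SECD machine's state. */
typedef struct Cell Cell;   /* cons cells making up lists and closures */

typedef struct {
    Cell *stack;     /* S: intermediate values                            */
    Cell *env;       /* E: the current environment (variable bindings)    */
    Cell *control;   /* C: the code still to be evaluated                 */
    Cell *dump;      /* D: saved (S, E, C) triples, i.e. the call stack   */
} SECD;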

Categorical Abstract Machine (CAM)

ZINC Machine

A slight optimization of the CAM.

Krivine Machine

Warren Abstract Machine

Three Instruction Machine

G-Machine

The G-machine has various enhancements, notably the spineless G-machine and the spineless tagless G-machine (STG-machine for short). From what I understand, the first people who worked on G-machines for graph reduction were:

Subsequent reviews and enhancements culminated in the STG:

Calling Conventions

There are two main calling conventions for functional programming languages: push/enter and eval/apply. Traditionally, push/enter has been used by lazy (call-by-need) implementations and eval/apply by eager (call-by-value) ones, but neither convention is tied to an evaluation strategy, and either can implement either strategy. Marlow and Peyton Jones discuss the two (and conclude that eval/apply is preferable even for a lazy language); a toy sketch contrasting the conventions follows the references.

  • Simon Marlow and Simon L. Peyton Jones, "Making a fast curry: push/enter vs. eval/apply for higher-order languages". Eprint, 35 pages.

  • Maciej Piróg and Jeremy Gibbons, "From Push/Enter to Eval/Apply by Program Transformation". arXiv:1606.06380

  • Beniamino Accattoli and Giulio Guerrieri, "Implementing Open Call-by-Value (Extended Version)". 39 pages.

  • Stephen Chang, David Van Horn, Matthias Felleisen, "Evaluating Call-By-Need on the Control Stack". arXiv:1009.3174
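
To make the contrast concrete, here is a toy sketch in C; the representation (Closure, arg_stack) and the function names are invented for illustration and are not the scheme used in any of the papers above. In eval/apply the caller inspects the closure's arity and decides between a saturated call and a partial application; in push/enter the caller just pushes its arguments and enters the code, and the callee checks whether enough arguments are available:

#include <stdio.h>

#define MAX_ARGS 8

/* A toy closure: a code pointer, the arity its code expects, and any
   arguments captured so far by a partial application. */
typedef struct Closure {
    int arity;
    int nheld;
    int held[MAX_ARGS];
    int (*code)(const int *args);
} Closure;

/* eval/apply: the CALLER compares its argument count with the closure's
   arity.  Saturated calls run the body; unsaturated calls build a new
   partial application.  (Over-application is omitted from this sketch.) */
int eval_apply(Closure *f, const int *args, int nargs, Closure *partial_out) {
    int total = f->nheld + nargs;
    int all[MAX_ARGS];
    for (int i = 0; i < f->nheld; i++) all[i] = f->held[i];
    for (int i = 0; i < nargs; i++)    all[f->nheld + i] = args[i];

    if (total == f->arity)
        return f->code(all);                   /* saturated: run the body   */

    *partial_out = *f;                         /* unsaturated: capture args */
    for (int i = 0; i < nargs; i++)
        partial_out->held[f->nheld + i] = args[i];
    partial_out->nheld = total;
    return 0;                                  /* result is in partial_out  */
}

/* push/enter: the CALLER just pushes its arguments and enters the closure;
   the CALLEE checks whether enough arguments are on the stack. */
int arg_stack[MAX_ARGS];
int sp = 0;                                    /* arguments pushed so far   */

int push_enter(Closure *f) {
    if (sp >= f->arity) {
        sp -= f->arity;
        return f->code(&arg_stack[sp]);        /* enough args: run the body */
    }
    /* too few args: a real machine would build a partial application here */
    return 0;
}

static int add_code(const int *args) { return args[0] + args[1]; }

int main(void) {
    Closure add = { .arity = 2, .nheld = 0, .code = add_code };
    Closure spare;

    int args[] = { 3, 4 };
    printf("eval/apply: %d\n", eval_apply(&add, args, 2, &spare));

    arg_stack[sp++] = 3;                       /* caller pushes ...         */
    arg_stack[sp++] = 4;
    printf("push/enter: %d\n", push_enter(&add)); /* ... callee checks      */
    return 0;
}

Real machines additionally handle over-application, calls to functions of unknown arity, and (for push/enter) argument-satisfaction checks or stack marks; the sketch only shows where the arity check lives in each convention.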

Formal Verification

Abstracting Abstract Machines

David Van Horn and Matthew Might seem to have invented the notion of "abstracting abstract machines". The first papers they wrote on the subject are:

  • David Van Horn and Matthew Might, "Abstracting abstract machines: a systematic approach to higher-order program analysis." arXiv:1105.1743, 8 pages (includes discussion of "abstract garbage collection")
  • David Van Horn and Matthew Might, "Systematic abstraction of abstract machines". arXiv:1107.3539, 39 pages

Follow-up papers appeared:

  • J. Ian Johnson, Nicholas Labich, Matthew Might, David Van Horn, "Optimizing Abstract Abstract Machines". arXiv:1211.3722, 12 pages
  • J. Ian Johnson, David Van Horn, "Abstracting Abstract Control (Extended)". arXiv:1305.3163, 14 pages
  • David Darais, Nicholas Labich, Phuc C. Nguyen, David Van Horn, "Abstracting Definitional Interpreters". arXiv:1707.04755, 25 pages.

Verification

  • Ranjit Jhala, Rupak Majumdar, Andrey Rybalchenko, "HMC: Verifying Functional Programs Using Abstract Interpreters". arXiv:1004.2884, 12 pages
  • Sigurd Schneider, Gert Smolka, Sebastian Hack, "Verifying Programs via Intermediate Interpretation". arXiv:1705.06738, 37 pages.

Algebraic Data Types