Abstract Machines
Just as, say, Java compiles down to bytecode to be run on a virtual machine, functional programming languages similarly compile down to some bytecode. But functional languages confusingly use the term "abstract machine" instead of "virtual machine", even though there's not really much of a difference between the two.
Really, abstract machines are where the "rubber meets the road". Want to change semantics from eager evaluation to lazy? It's in the abstract machine. Swap out the garbage collector? Look at the abstract machine.
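To illustrate the first point, here is a rough sketch (the struct layout and names are invented, not any particular machine's) of the thunks a lazy machine has to allocate and force, where an eager machine would simply store the computed value:

```c
#include <stdbool.h>

struct Thunk {
    bool  evaluated;              /* has this thunk been forced yet? */
    long  value;                  /* valid once evaluated is true */
    long (*compute)(void *env);   /* the suspended code... */
    void *env;                    /* ...and its captured environment */
};

/* Forcing a thunk runs the suspended code at most once and caches the
   result (the caching is what makes it call-by-need rather than
   call-by-name). */
long force(struct Thunk *t) {
    if (!t->evaluated) {
        t->value = t->compute(t->env);
        t->evaluated = true;
    }
    return t->value;
}
```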
There has been some attempt to systematically study abstract machines.
The basic structure of an abstract machine is like that of a virtual machine (usually a stack machine) or a CPU (at least, mimicking its instruction cycle). There is some limited "instruction set", some binary bytecode or low-level "assembly code", and a main loop which schematically looks like:
void main_loop(struct MachineState *state, bytecode *program, size_t programSize) {
    bool isRunning = true;
    size_t ip = 0;                       /* instruction pointer */
    while (isRunning) {
        switch (program[ip]) {
        case INST_1:                     /* instruction 1 */
            isRunning = instruction_1(state, program, programSize, &ip);
            break;
        /* ... other cases ... */
        case INST_TERMINATE:
            isRunning = false;
            break;
        default:
            abort();                     /* panic!!! unknown opcode */
        }
    }
}
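The loop above leaves the machine state, the bytecode type, and the instructions themselves unspecified. A purely hypothetical fleshing-out (the opcode names, the operand-stack state, and the push-a-literal behaviour of `instruction_1` are all invented) might look like this, sitting above `main_loop` in the same file:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned char bytecode;          /* one opcode (or operand) per byte */

enum { INST_1, INST_TERMINATE };         /* the "instruction set" */

struct MachineState {
    long   stack[256];                   /* a small operand stack */
    size_t sp;
};

/* "Instruction 1": push the next byte of the program as a literal, then
   advance the instruction pointer past the opcode and its operand. */
bool instruction_1(struct MachineState *state, bytecode *program,
                   size_t programSize, size_t *ip) {
    if (*ip + 1 >= programSize)
        return false;                    /* ran off the end of the program */
    state->stack[state->sp++] = program[*ip + 1];
    *ip += 2;
    return true;
}
```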
There are other ways to do this, e.g., using function pointers to try to emulate threaded code. (It's unclear to me if using inline assembly and goto to avoid function calls would be faster/better --- i.e., if coroutines would be superior in performance.)
typedef void (*Instruction)(struct MachineState *state);

/* Here "program" is a NULL-terminated array of function pointers, one per
   instruction; the loop stops when it reaches the terminating NULL. */
void main_loop(struct MachineState *state, Instruction *program) {
    while (NULL != *program) {
        (*program)(state); // run the current instruction
        program++;         // then move onto the next one
    }
}
(Perhaps using inline functions would be the best of both worlds? The advantage of making a function for each instruction is that it makes unit testing easier. The disadvantage is it can penalize performance noticeably. The approach to take is (i) make it work, (ii) make it fast.)
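Regarding the goto aside above: GCC and Clang support a "labels as values" extension, which is the usual way to get direct-threaded dispatch without a function call per instruction. Here is a minimal, hypothetical sketch (the opcodes and the toy program are invented); whether it actually beats the switch loop is something to measure, not assume:

```c
#include <stdio.h>

enum { OP_HELLO, OP_HALT };

void run(const unsigned char *program) {
    /* dispatch table: one label per opcode (GCC/Clang extension) */
    void *dispatch[] = { &&do_hello, &&do_halt };
    const unsigned char *ip = program;

    #define NEXT() goto *dispatch[*ip++]

    NEXT();                              /* kick off the loop */
do_hello:
    puts("hello from the machine");
    NEXT();
do_halt:
    return;
}

int main(void) {
    unsigned char program[] = { OP_HELLO, OP_HELLO, OP_HALT };
    run(program);
    return 0;
}
```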
There is no typechecking at the abstract machine level; that is all handled in the earlier stages. But there is a need for memory management, via garbage collection.
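For a concrete (if drastically simplified) picture of what that involves, here is a sketch of the marking half of a mark-and-sweep collector over an invented heap-cell layout:

```c
#include <stdbool.h>
#include <stddef.h>

struct Cell {
    bool marked;
    struct Cell *left, *right;   /* e.g. the two fields of a cons cell or closure */
};

/* Mark everything reachable from a root; already-marked cells stop the
   recursion, so cycles in the heap graph terminate. */
void mark(struct Cell *cell) {
    if (cell == NULL || cell->marked)
        return;
    cell->marked = true;
    mark(cell->left);
    mark(cell->right);
}

/* The machine would call mark() on every root (stack slots, registers,
   globals) and then sweep: free every cell whose mark bit is still false. */
```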
- Beniamino Accattoli, The Complexity of Abstract Machines, 15 pages
- Małgorzata Biernacka & Olivier Danvy, A concrete framework for environment machines shows how lambda calculus with explicit substitution is "the same" as an abstract machine.
Landin's SECD machine was the first "abstract machine" for functional programming languages, at least in theory.
- Olivier Danvy, A Rational Deconstruction of Landin's SECD Machine, 32 pages
- Olivier Danvy, Kevin Millikin, A Rational Deconstruction of Landin's SECD Machine with the J Operator
- Ralf Hinze, The Categorical Abstract Machine: Basics and Enhancements, a great "tutorial" introduction focusing on practicalities rather than an airy category-theoretic presentation
- Cousineau, Curien, Mauny, The Categorical Abstract Machine, which first introduces the thing
A slight optimization of the CAM.
- Wouter Swierstra, From Mathematics to Abstract Machine: A formal derivation of an executable Krivine machine
- Brandon Bohrer, Karl Crary, TWAM: A Certifying Abstract Machine for Logic Programs
- Jon Fairbairn, Stuart Wray, TIM: A simple, lazy abstract machine to execute supercombinators
The G-machine has various enhancements: the spineless G-machine and the spineless tagless G-machine (the STG machine for short). From what I understand, the first people who worked on G-machines for graph reduction were:
- Thomas Johnsson, Efficient compilation of lazy evaluation
- Lennart Augustsson, A compiler for Lazy ML
Subsequent reviews and enhancements culminated in the STG:
- R. B. Kieburtz, The G-Machine
- G. L. Burn, S. L. Peyton Jones, J. D. Robson, The Spineless G-Machine
- Simon Peyton Jones, Implementing lazy functional languages on stock hardware: the Spineless Tagless G-Machine
There are two main calling conventions for functional programming languages: push/enter, which is the natural fit for call-by-name and lazy strategies (the Krivine machine, TIM, and the original STG machine all use it), and eval/apply, which is the natural fit for call-by-value. We could, if we really wanted, use either convention with either evaluation strategy, but that is really "swimming against the current". Marlow and Peyton Jones compare the two (and argue for eval/apply even in a lazy language); a schematic sketch of both protocols follows the reference list below.
- Simon Marlow & Simon L. Peyton Jones, "Making a fast curry: push/enter vs. eval/apply for higher-order languages". Eprint, 35 pages
- Maciej Piróg, Jeremy Gibbons, "From Push/Enter to Eval/Apply by Program Transformation". arXiv:1606.06380
- Beniamino Accattoli, Giulio Guerrieri, Implementing Open Call-by-Value (Extended Version), 39 pages
- Stephen Chang, David Van Horn, Matthias Felleisen, "Evaluating Call-By-Need on the Control Stack". arXiv:1009.3174
- Fabian Kunze, Gert Smolka, Yannick Forster, Formal Small-step Verification of a Call-by-value Lambda Calculus Machine, 20 pages
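Here is the schematic sketch promised above. Everything in it (the struct layout, the function names, the global argument stack) is invented for illustration and is not taken from any of the papers; the only point it makes is where the arity check lives. In eval/apply the call site inspects the closure's arity; in push/enter the caller dumps the arguments on the argument stack and the callee performs the argument-satisfaction check.

```c
#include <stdio.h>
#include <stdlib.h>

typedef long Value;

typedef struct {
    int arity;                      /* how many arguments the compiled code expects */
    Value (*code)(Value *args);     /* the compiled body */
} Closure;

/* eval/apply: the *caller* evaluates the function and its arguments, then
   compares the argument count against the closure's arity. */
Value call_eval_apply(Closure *f, Value *args, int nargs) {
    if (nargs == f->arity)
        return f->code(args);       /* exact call: the common fast path */
    /* nargs < arity would build a partial application (PAP); nargs > arity
       would make an exact call and then apply the result to the leftovers;
       both are omitted from this sketch */
    fprintf(stderr, "inexact application not handled in this sketch\n");
    exit(EXIT_FAILURE);
}

/* push/enter: the caller just pushes the arguments and "enters" the closure;
   the *callee* checks whether enough arguments are present (the
   argument-satisfaction check, simulated here in the trampoline). */
static Value arg_stack[64];
static int sp = 0;                  /* number of pending arguments */

Value call_push_enter(Closure *f, Value *args, int nargs) {
    for (int i = 0; i < nargs; i++)
        arg_stack[sp++] = args[i];
    if (sp >= f->arity) {           /* enough arguments: run the body */
        sp -= f->arity;
        return f->code(&arg_stack[sp]);
    }
    fprintf(stderr, "too few arguments: would build a PAP here\n");
    exit(EXIT_FAILURE);
}

static Value add2(Value *args) { return args[0] + args[1]; }

int main(void) {
    Closure add = { 2, add2 };
    Value args[2] = { 40, 2 };
    printf("%ld\n", call_eval_apply(&add, args, 2));   /* 42 */
    printf("%ld\n", call_push_enter(&add, args, 2));   /* 42 */
    return 0;
}
```

Partial and over-applications are where the two conventions really diverge (building PAPs at the call site versus leaving extra arguments on the stack for the callee); the sketch punts on both.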
David Van Horn and Matt Might seem to have invented the notion of "abstracting abstract machines". The first papers they wrote on the subject are:
- David Van Horn and Matthew Might, "Abstracting abstract machines: a systematic approach to higher-order program analysis." arXiv:1105.1743, 8 pages (includes discussion of "abstract garbage collection")
- David Van Horn and Matthew Might, "Systematic abstraction of abstract machines". arXiv:1107.3539, 39 pages
Follow-up papers appeared:
- J. Ian Johnson, Nicholas Labich, Matthew Might, David Van Horn, "Optimizing Abstract Abstract Machines". arXiv:1211.3722, 12 pages
- J. Ian Johnson, David Van Horn, "Abstracting Abstract Control (Extended)". arXiv:1305.3163, 14 pages
- David Darais, Nicholas Labich, Phuc C. Nguyen, David Van Horn, "Abstracting Definitional Interpreters". arXiv:1707.04755, 25 pages.
- Ranjit Jhala, Rupak Majumdar, Andrey Rybalchenko, "HMC: Verifying Functional Programs Using Abstract Interpreters". arXiv:1004.2884, 12 pages
- Sigurd Schneider, Gert Smolka, Sebastian Hack, "Verifying Programs via Intermediate Interpretation". arXiv:1705.06738, 37 pages.
- Danko Ilik, The exp-log normal form of types
- Tuan-Hung Pham, Andrew Gacek, Michael W. Whalen, Reasoning about Algebraic Data Types with Abstractions
- Hossein Hojjat, Philipp Rümmer, Deciding and Interpolating Algebraic Data Types by Reduction
- Peter J. Stuckey, Martin Sulzmann, Type Inference for Guarded Recursive Data Types --- guarded recursive data types are a generalization of algebraic data types
- J. Garrett Morris, Variations on Variants; ADTs are (possibly recursive) variant types