diff --git a/doc/FASTALLOC.md b/doc/FASTALLOC.md
new file mode 100644
index 00000000..21abd459
--- /dev/null
+++ b/doc/FASTALLOC.md
@@ -0,0 +1,321 @@
+# Fastalloc Design Overview
+
+Fastalloc is a register allocator made specifically for fast
+compile times. It's based on the reverse linear scan register
+allocation/SSRA algorithm.
+This document describes the data structures used and the allocation steps.
+
+# Data Structures
+
+The main data structures that Fastalloc uses to track its state are
+described below.
+
+## Current VReg Allocations (`vreg_allocs`)
+
+This is a vector that is used to hold the current allocation for every
+VReg during execution.
+
+## VReg Spillslots (`vreg_spillslots`)
+
+Whenever a VReg needs a spillslot, a dedicated slot is allocated for it.
+This vector is where every VReg's spillslot is stored.
+
+## Live VRegs (`live_vregs`)
+
+Live VReg information is kept in a `VRegSet`, a doubly linked list
+based on a vector. This is used for quick insertion, removal, and
+iteration.
+
+## Least Recently Used Caches (`lrus`)
+
+Every register class (int, float, and vector) has its own LRU and they
+are stored together in an array: `lrus`. An LRU is represented similarly
+to a `VRegSet`: it's a circular, doubly-linked list based on a vector.
+
+The last PReg in an LRU is the least-recently allocated PReg:
+
+most recently used PReg (head) -> 2nd MRU PReg -> ... -> LRU PReg
+
+## Current VReg In PReg Info (`vreg_in_preg`)
+
+During allocation, it's necessary to determine which VReg is in a PReg
+to generate the right move(s) for eviction.
+`vreg_in_preg` is a vector that stores this information.
+
+## Available PRegs For Use In Instruction (`available_pregs`)
+
+This is a 2-tuple of `PRegSet`s (bitsets of physical registers): one for
+the instruction's early phase and one for the late phase.
+They are used to determine which registers are available for use in the
+early/late phases of an instruction.
+
+Prior to the beginning of any instruction's allocation, both sets are reset
+to include all allocatable physical registers, some of which may already
+contain a VReg.
+
+## VReg Liverange Location Info (`vreg_to_live_inst_range`)
+
+This is a vector of 3-tuples containing the beginning and the end
+of each VReg's liverange, along with an allocation the VReg is guaranteed
+to be in throughout that liverange.
+This is used to build the debug locations vector after allocation
+is complete.
+
+# Allocation Process Breakdown
+
+Allocation proceeds in reverse: from the last block to the first block,
+and in each block: from the last instruction to the first instruction.
+
+The allocation for each operand in an instruction can be viewed as happening
+in four phases: selection, assignment, eviction, and edit insertion.
+
+## Allocation Phase: Selection
+
+In this phase, a PReg is selected from `available_pregs` for the
+operand based on the operand constraints. Depending on the operand's
+position, the selected PReg is removed from the early set, the late
+set, or both, indicating that the PReg is no longer available for
+allocation by other operands in that phase.
+
+## Allocation Phase: Assignment
+
+In this phase, the selected PReg is set as the allocation for 
+the operand in the final output.
+
+## Allocation Phase: Eviction
+
+In this phase, the VReg that previously occupied the allocation
+assigned to an operand, if any, is evicted.
+
+During eviction, a dedicated spillslot is allocated for the evicted 
+VReg and an edit is inserted after the instruction to move from the
+slot to the allocation it's expected to be in after the instruction.
+
+## Allocation Phase: Edit Insertion
+
+In this phase, edits are inserted to ensure that the dataflow from
+before the instruction, through the selected allocation, to after
+the instruction remains correct.
+
+# Invariants
+
+Some invariants that remain true throughout execution:
+
+1. During processing, the allocation of a VReg as indicated in
+`vreg_allocs` changes exactly two or three times.
+Initially it is set to none. When the VReg is allocated, the entry is
+changed to that allocation. After this, it doesn't change unless
+the VReg is evicted or spilled across a block boundary;
+if it is, then its current allocation changes to its dedicated
+spillslot. After this, it doesn't change again until its definition
+is reached and it's deallocated, at which point its `vreg_allocs`
+entry is set to none. The only exception is block parameters that
+are never used: these are never allocated.
+
+2. A virtual register that outlives the block it was defined in will 
+be in its dedicated spillslot by the end of the block.
+
+3. At the end of a block, before edits are inserted to move values 
+from branch arguments to block parameters spillslots, all branch 
+arguments will be in their dedicated spillslots.
+
+4. At the beginning of a block, all block parameters and livein
+virtual registers will be in their dedicated spillslots.
+
+# Instruction Allocation
+
+To allocate a single instruction, the first step is to reset the
+`available_pregs` sets to all allocatable PRegs.
+
+Next, the selection phase is carried out for all operands with
+fixed register constraints: the registers they are constrained to use are
+marked as unavailable in the `available_pregs` set, depending on the
+phase that they are valid in. If the operand is an early use or late
+def operand, then the register will be marked as unavailable in the
+early set or late set, respectively. Otherwise, the PReg is marked
+as unavailable in both the early and late sets, because a PReg
+assigned to an early def or late use operand cannot be reused by another
+operand in the same instruction.
+
+After selection for fixed register operands, the eviction phase is
+carried out for fixed register operands. Any VReg in their selected
+registers, indicated by `vreg_in_preg`, is evicted: a dedicated
+spillslot is allocated for the VReg (if it doesn't have one already),
+an edit is inserted to move from the slot to the PReg, which is where
+the VReg is expected to be after the instruction, and its current
+allocation in `vreg_allocs` is set to the spillslot.
+
+Next, all clobbers are removed from the early and late `available_pregs`
+sets to avoid allocating a clobbered register to a def.
+
+Next, the selection, assignment, eviction, and edit insertion phases are
+carried out for all def operands. When each def operand's allocation is
+complete, the def operand is immediately freed, marking the end of the
+VReg's liverange. It is removed from the `live_vregs` set, its allocation
+in `vreg_allocs` is set to none, and if it was in a PReg, that PReg's
+entry in `vreg_in_preg` is set to none. The selection and eviction phases
+are omitted if the operand has a fixed constraint, as those phases have
+already been carried out.
+
+Next, the selection, assignment, and eviction phases are carried out for all
+use operands. As with def operands, the selection and eviction phases are 
+omitted if the operand has a fixed constraint, as those phases have already
+been carried out.
+
+Then the edit insertion phase is carried out for all use operands.
+
+Lastly, if the instruction being processed is a branch instruction, the
+parallel move resolver is used to insert edits before the instruction
+to move from the branch arguments spillslots to the block parameter
+spillslots.
+
+## Operand Allocation
+
+During the allocation of an operand, a check is first made to 
+see if the VReg's current allocation as indicated in 
+`vreg_allocs` is within the operand constraints.
+
+If it is, the assignment phase is carried out, setting the final
+allocation output's entry for that operand to the allocation.
+The selection phase is then carried out, marking the PReg
+(if the allocation is a PReg) as unavailable in the respective
+early/late sets. The state of the LRUs is also updated to reflect
+the new most recently used PReg.
+No eviction is needed, since the VReg is already in the
+allocation, and no edit insertion is needed either.
+
+On the other hand, if the VReg's current allocation is not within
+constraints, the selection and eviction phases are carried out for
+non-fixed operands. First, a set of PRegs that can be drawn from is
+created from `available_pregs`. For early uses and late defs,
+this draw-from set is the early set or late set respectively.
+For late uses and early defs, the draw-from set is an intersection
+of the available early and late sets (because a PReg used for a late
+use can't be reassigned to another operand in the early phase;
+likewise, a PReg used for an early def can't be reassigned to another
+operand in the late phase).
+The LRU for the VReg's regclass is then traversed from the end to find
+the least-recently used PReg in the draw-from set. Once a PReg is found,
+it is marked as the most recently used in the LRU, unavailable in the
+`available_pregs` sets, and whatever VReg was in it before is evicted.
+
+The assignment phase is carried out next: the final allocation for the
+operand is set to the selected register.
+
+If the newly allocated operand has not been allocated before, that is,
+this is the first use/def of the VReg encountered, the VReg is
+inserted into `live_vregs` and marked as the value in the allocated
+PReg in `vreg_in_preg`.
+
+Otherwise, if the VReg has been allocated before, then an edit will need
+to be inserted to ensure that the dataflow remains correct.
+The edit insertion phase is now carried out if the operand is a def
+operand: an edit is inserted after the instruction to move from the
+new allocation to the allocation it's expected to be in after the
+instruction.
+
+The edit insertion phase for use operands is done after all operands
+have been processed. Edits are inserted to move from the current
+allocations in `vreg_allocs` to the final allocated position before
+the instruction. This is to account for the possibility of multiple
+uses of the same operand in the instruction.
+
+## Reuse Operands
+
+Reuse def operands are handled by creating a new operand identical to the
+reuse def, except that its constraint is that of the reused input,
+and allocating the new operand in its place.
+
+Reused inputs are handled by creating a new operand with a fixed register
+constraint to use whatever register was assigned to the reuse def.
+
+Because of the way reuse operands and reused inputs are handled, when
+selecting a register for an early use operand with a fixed constraint,
+the PReg is also marked as unavailable in the `available_pregs` late
+set if the operand is a reused input. Likewise, when selecting a register
+for reuse def operands, the selected register is marked as unavailable
+in the `available_pregs` early set.
+
+## VReg Spillslots
+
+Whenever a VReg needs a spillslot, a suitable one is allocated and
+marked as the VReg's dedicated spillslot in `vreg_spillslots`.
+If a VReg never needs a spillslot, none is allocated for it.
+To ensure that a VReg will always be in its spillslot when expected,
+during the processing of a def operand, before it's deallocated,
+an edit is inserted to move from its current allocation as indicated
+in `vreg_allocs` to its dedicated spillslot, if one is present in
+`vreg_spillslots`.
+
+## Branch Instructions
+
+As an invariant, all branch arguments will be in their dedicated
+spillslots at the end of the block before edits are inserted to
+move from those spillslots to the block parameter spillslots
+of the successor blocks.
+
+If a branch argument is already in an allocation that isn't
+its spillslot (this could happen if the branch argument is used
+as an operand in the same instruction, because all normal
+instruction processing is completed before branch-specific
+processing), then an edit is inserted
+to move from the spillslot to that allocation and its current
+allocation in `vreg_allocs` is set to the spillslot.
+
+Only after these edits have been inserted is the parallel move
+resolver used to generate and insert edits to move from
+those spillslots to the spillslots of the block parameters.
+
+# Across Blocks
+
+When a block completes processing, some VRegs will still be live.
+These VRegs are either block parameters or livein VRegs.
+As an invariant, prior to the first instruction in a block, all
+block parameters and livein VRegs will be in their dedicated spillslots.
+
+To maintain this invariant, after a block completes processing, edits
+are inserted at the beginning of the block to move from the block
+parameter and livein spillslots to the allocation they are expected
+to be in from the first instruction.
+All block parameters are freed, just like defs, and liveins' current
+allocations in `vreg_allocs` are set to their spillslots.
+
+# Edits Order
+
+`regalloc2`'s outward interface guarantees that edits are in
+sorted order. Since allocation proceeds in reverse, all edits
+are also added in reverse. After all blocks have completed
+processing, the edits are simply reversed to put them in the
+correct order.
+
+One reason allocation proceeds in the order it does is this
+edit-order constraint. Because edits are added in reverse, all edits
+that occur after an instruction must be inserted before all edits
+that occur before that instruction.
+
+# Debug Info
+
+After all blocks have completed processing, the debug locations
+vector is built.
+The information it's built from is assembled from liverange info
+that is tracked throughout the allocation.
+Whenever a VReg is allocated for the first time, its liverange end
+is saved in the VReg's slot in the `vreg_to_live_inst_range`
+vector. Whenever a VReg's definition is encountered, its liverange
+beginning is saved too, along with the allocation it will be in
+throughout that range.
+
+To determine the allocation the VReg will be in throughout the
+liverange, the first invariant is used: after a VReg is first
+allocated, its current allocation in `vreg_allocs` doesn't
+change unless it's evicted or spilled across block boundaries.
+Using this info, if by the time the def of a VReg is allocated,
+that VReg has no dedicated spillslot,
+that implies that the VReg was never evicted or spilled, so whatever
+value its `vreg_allocs` entry says is the location it will be in
+throughout its liverange. Otherwise, if it has a spillslot
+allocated to it, that implies that the VReg was either evicted
+at some point or it was a livein of a predecessor or a block parameter.
+Either way, since all spillslots are dedicated to their respective VRegs,
+it is safe to record the spillslot as the allocation for the
+`vreg_to_live_inst_range` info.
diff --git a/doc/GENERAL.md b/doc/GENERAL.md
new file mode 100644
index 00000000..f5d70ff2
--- /dev/null
+++ b/doc/GENERAL.md
@@ -0,0 +1,212 @@
+# regalloc2 Design Overview
+
+This document describes the basic architecture of the regalloc2
+register allocator. It describes the externally-visible interface:
+the input CFG, instructions, and operands, with their invariants, and
+the meaning of various parts of the output.
+`ION.md` and `FASTALLOC.md` describe the specifics of the main Ion
+allocator and the fast allocator, respectively.
+
+# API, Input IR and Invariants
+
+The toplevel API to regalloc2 consists of a single entry point `run()`
+that takes a register environment, which specifies all physical
+registers, and the input program. The function returns either an error
+or an `Output` struct that provides allocations for each operand and a
+vector of additional instructions (moves, loads, stores) to insert.
+
+## Register Environment
+
+The allocator takes a `MachineEnv` which specifies, for each of the
+two register classes `Int` and `Float`, a vector of `PReg`s by index. A
+`PReg` is nothing more than the class and index within the class; the
+allocator does not need to know anything more.
+
+The `MachineEnv` provides a vector of preferred and non-preferred
+physical registers per class. Any register not in either vector will
+not be allocated. Usually, registers that do not need to be saved in
+the prologue if used (i.e., caller-save registers) are given in the
+"preferred" vector. The environment also provides exactly one scratch
+register per class. This register must not be in the preferred or
+non-preferred vectors, and is used whenever a set of moves that need
+to occur logically in parallel have a cycle (for a simple example,
+consider a swap `r0, r1 := r1, r0`).
+
+With some more work, we could potentially remove the need for the
+scratch register by requiring support for an additional edit type from
+the client ("swap"), but we have not pursued this.
+
+## CFG and Instructions
+
+The allocator operates on an input program that is in a standard CFG
+representation: the function body is a sequence of basic blocks, and
+each block has a sequence of instructions and zero or more
+successors. The allocator also requires the client to provide
+predecessors for each block, and these must be consistent with the
+successors.
+
+Instructions are opaque to the allocator except for a few important
+bits: (1) `is_ret` (is a return instruction); (2) `is_branch` (is a
+branch instruction); and (3) a vector of Operands, covered below.
+Every block must end in a return or branch.
+
+Both instructions and blocks are named by indices in contiguous index
+spaces. A block's instructions must be a contiguous range of
+instruction indices, and block i's first instruction must come
+immediately after block i-1's last instruction.
+
+The CFG must have *no critical edges*. A critical edge is an edge from
+block A to block B such that A has more than one successor *and* B has
+more than one predecessor. For this definition, the entry block has an
+implicit predecessor, and any block that ends in a return has an
+implicit successor.
+
+Note that there are *no* requirements related to the ordering of
+blocks, and there is no requirement that the control flow be
+reducible. Some *heuristics* used by the allocator will perform better
+if the code is reducible and ordered in reverse postorder (RPO),
+however: in particular, (1) this interacts better with the
+contiguous-range-of-instruction-indices live range representation that
+we use, and (2) the "approximate loop depth" metric will actually be
+exact if both these conditions are met.
+
+## Operands and VRegs
+
+Every instruction operates on values by way of `Operand`s. An operand
+consists of the following fields:
+
+- VReg, or virtual register. *Every* operand mentions a virtual
+  register, even if it is constrained to a single physical register in
+  practice. This is because we track liveranges uniformly by vreg.
+
+- Policy, or "constraint". Every reference to a vreg can apply some
+  constraint to the vreg at that point in the program. Valid policies are:
+
+  - Any location;
+  - Any register of the vreg's class;
+  - Any stack slot;
+  - A particular fixed physical register; or
+  - For a def (output), a *reuse* of an input register.
+
+- The "kind" of reference to this vreg: Def, Use, Mod. A def
+  (definition) writes to the vreg, and disregards any possible earlier
+  value. A mod (modify) reads the current value then writes a new
+  one. A use simply reads the vreg's value.
+
+- The position: before or after the instruction.
+  - Note that to have a def (output) register available in a way that
+    does not conflict with inputs, the def should be placed at the
+    "before" position. Similarly, to have a use (input) register
+    available in a way that does not conflict with outputs, the use
+    should be placed at the "after" position.
+
+VRegs, or virtual registers, are specified by an index and a register
+class (Float or Int). The classes are not given separately; they are
+encoded on every mention of the vreg. (In a sense, the class is an
+extra index bit, or part of the register name.) The input function
+trait does require the client to provide the exact vreg count,
+however.
+
+Implementation note: both vregs and operands are bit-packed into
+u32s. This is essential for memory-efficiency. As a result of the
+operand bit-packing in particular (including the policy constraints!),
+the allocator supports up to 2^21 (2M) vregs per function, and 2^6
+(64) physical registers per class. Later we will also see a limit of
+2^20 (1M) instructions per function. These limits are considered
+sufficient for the anticipated use-cases (e.g., compiling Wasm, which
+also has function-size implementation limits); for larger functions,
+it is likely better to use a simpler register allocator in any case.
+
+## Reuses and Two-Address ISAs
+
+Some instruction sets primarily have instructions that name only two
+registers for a binary operator, rather than three: both registers are
+inputs, and the result is placed in one of the registers, clobbering
+its original value. The most well-known modern example is x86. It is
+thus imperative that we support this pattern well in the register
+allocator.
+
+This instruction-set design is somewhat at odds with an SSA
+representation, where a value cannot be redefined.
+
+Thus, the allocator supports a useful fiction of sorts: the
+instruction can be described as if it has three register mentions --
+two inputs and a separate output -- and neither input will be
+clobbered. The output, however, is special: its register-placement
+policy is "reuse input i" (where i == 0 or 1). The allocator
+guarantees that the register assignment for that input and the output
+will be the same, so the instruction can use that register as its
+"modifies" operand. If the input is needed again later, the allocator
+will take care of the necessary copying.
+
+We will see below how the allocator makes this work by doing some
+preprocessing so that the core allocation algorithms do not need to
+worry about this constraint.
+
+## SSA
+
+regalloc2 takes an SSA IR as input, where the usual definitions apply:
+every vreg is defined exactly once, and every vreg use is dominated by
+its one def. (Using blockparams means that we do not need additional
+conditions for phi-nodes.)
+
+## Block Parameters
+
+Every block can have *block parameters*, and a branch to a block with
+block parameters must provide values for those parameters via
+operands. When a branch has more than one successor, it provides
+separate operands for each possible successor. These block parameters
+are equivalent to phi-nodes; we chose this representation because they
+are in many ways a more consistent representation of SSA.
+
+To see why we believe block parameters are a slightly nicer design
+choice than use of phi nodes, consider: phis are special
+pseudoinstructions that must come first in a block, are all defined in
+parallel, and whose uses occur on the edge of a particular
+predecessor. All of these facts complicate any analysis that scans
+instructions and reasons about uses and defs. It is much closer to the
+truth to actually put those uses *in* the predecessor, on the branch,
+and put all the defs at the top of the block as a separate kind of
+def. The tradeoff is that a vreg's def now has two possibilities --
+ordinary instruction def or blockparam def -- but this is fairly
+reasonable to handle.
+
+## Output
+
+The allocator produces two main data structures as output: an array of
+`Allocation`s and a sequence of edits. Some other miscellaneous data is also
+provided.
+
+### Allocations
+
+The allocator provides an array of `Allocation` values, one per
+`Operand`. Each `Allocation` has a kind and an index. The kind may
+indicate that this is a physical register or a stack slot, and the
+index gives the respective register or slot. All allocations will
+conform to the constraints given, and will faithfully preserve the
+dataflow of the input program.
+
+### Inserted Moves
+
+In order to implement the necessary movement of data between
+allocations, the allocator needs to insert moves at various program
+points.
+
+The vector of inserted moves contains tuples that name a program point
+and an "edit". The edit is either a move, from one `Allocation` to
+another, or else a kind of metadata used by the checker to know which
+VReg is live in a given allocation at any particular time. The latter
+sort of edit can be ignored by a backend that is just interested in
+generating machine code.
+
+Note that the allocator will never generate a move from one stackslot
+directly to another, by design. Instead, if it needs to do so, it will
+make use of the scratch register. (Sometimes such a move occurs when
+the scratch register is already holding a value, e.g. to resolve a
+cycle of moves; in this case, it will allocate another spillslot and
+spill the original scratch value around the move.)
+
+Thus, the single "edit" type can become either a register-to-register
+move, a load from a stackslot into a register, or a store from a
+register into a stackslot.
+
diff --git a/doc/DESIGN.md b/doc/ION.md
similarity index 85%
rename from doc/DESIGN.md
rename to doc/ION.md
index 4172a063..aea2be23 100644
--- a/doc/DESIGN.md
+++ b/doc/ION.md
@@ -1,217 +1,12 @@
-# regalloc2 Design Overview
+# Ion Design Overview
 
-This document describes the basic architecture of the regalloc2
-register allocator. It describes the externally-visible interface
-(input CFG, instructions, operands, with their invariants; meaning of
-various parts of the output); core data structures; and the allocation
+This document describes the basic architecture of the Ion
+register allocator. It describes the core data structures and the allocation
 pipeline, or series of algorithms that compute an allocation. It ends
 with a description of future work and expectations, as well as an
 appendix that notes design influences and similarities to the
 IonMonkey backtracking allocator.
 
-# API, Input IR and Invariants
-
-The toplevel API to regalloc2 consists of a single entry point `run()`
-that takes a register environment, which specifies all physical
-registers, and the input program. The function returns either an error
-or an `Output` struct that provides allocations for each operand and a
-vector of additional instructions (moves, loads, stores) to insert.
-
-## Register Environment
-
-The allocator takes a `MachineEnv` which specifies, for each of the
-two register classes `Int` and `Float`, a vector of `PReg`s by index. A
-`PReg` is nothing more than the class and index within the class; the
-allocator does not need to know anything more.
-
-The `MachineEnv` provides a vector of preferred and non-preferred
-physical registers per class. Any register not in either vector will
-not be allocated. Usually, registers that do not need to be saved in
-the prologue if used (i.e., caller-save registers) are given in the
-"preferred" vector. The environment also provides exactly one scratch
-register per class. This register must not be in the preferred or
-non-preferred vectors, and is used whenever a set of moves that need
-to occur logically in parallel have a cycle (for a simple example,
-consider a swap `r0, r1 := r1, r0`).
-
-With some more work, we could potentially remove the need for the
-scratch register by requiring support for an additional edit type from
-the client ("swap"), but we have not pursued this.
-
-## CFG and Instructions
-
-The allocator operates on an input program that is in a standard CFG
-representation: the function body is a sequence of basic blocks, and
-each block has a sequence of instructions and zero or more
-successors. The allocator also requires the client to provide
-predecessors for each block, and these must be consistent with the
-successors.
-
-Instructions are opaque to the allocator except for a few important
-bits: (1) `is_ret` (is a return instruction); (2) `is_branch` (is a
-branch instruction); and (3) a vector of Operands, covered below.
-Every block must end in a return or branch.
-
-Both instructions and blocks are named by indices in contiguous index
-spaces. A block's instructions must be a contiguous range of
-instruction indices, and block i's first instruction must come
-immediately after block i-1's last instruction.
-
-The CFG must have *no critical edges*. A critical edge is an edge from
-block A to block B such that A has more than one successor *and* B has
-more than one predecessor. For this definition, the entry block has an
-implicit predecessor, and any block that ends in a return has an
-implicit successor.
-
-Note that there are *no* requirements related to the ordering of
-blocks, and there is no requirement that the control flow be
-reducible. Some *heuristics* used by the allocator will perform better
-if the code is reducible and ordered in reverse postorder (RPO),
-however: in particular, (1) this interacts better with the
-contiguous-range-of-instruction-indices live range representation that
-we use, and (2) the "approximate loop depth" metric will actually be
-exact if both these conditions are met.
-
-## Operands and VRegs
-
-Every instruction operates on values by way of `Operand`s. An operand
-consists of the following fields:
-
-- VReg, or virtual register. *Every* operand mentions a virtual
-  register, even if it is constrained to a single physical register in
-  practice. This is because we track liveranges uniformly by vreg.
-
-- Policy, or "constraint". Every reference to a vreg can apply some
-  constraint to the vreg at that point in the program. Valid policies are:
-
-  - Any location;
-  - Any register of the vreg's class;
-  - Any stack slot;
-  - A particular fixed physical register; or
-  - For a def (output), a *reuse* of an input register.
-
-- The "kind" of reference to this vreg: Def, Use, Mod. A def
-  (definition) writes to the vreg, and disregards any possible earlier
-  value. A mod (modify) reads the current value then writes a new
-  one. A use simply reads the vreg's value.
-
-- The position: before or after the instruction.
-  - Note that to have a def (output) register available in a way that
-    does not conflict with inputs, the def should be placed at the
-    "before" position. Similarly, to have a use (input) register
-    available in a way that does not conflict with outputs, the use
-    should be placed at the "after" position.
-
-VRegs, or virtual registers, are specified by an index and a register
-class (Float or Int). The classes are not given separately; they are
-encoded on every mention of the vreg. (In a sense, the class is an
-extra index bit, or part of the register name.) The input function
-trait does require the client to provide the exact vreg count,
-however.
-
-Implementation note: both vregs and operands are bit-packed into
-u32s. This is essential for memory-efficiency. As a result of the
-operand bit-packing in particular (including the policy constraints!),
-the allocator supports up to 2^21 (2M) vregs per function, and 2^6
-(64) physical registers per class. Later we will also see a limit of
-2^20 (1M) instructions per function. These limits are considered
-sufficient for the anticipated use-cases (e.g., compiling Wasm, which
-also has function-size implementation limits); for larger functions,
-it is likely better to use a simpler register allocator in any case.
-
-## Reuses and Two-Address ISAs
-
-Some instruction sets primarily have instructions that name only two
-registers for a binary operator, rather than three: both registers are
-inputs, and the result is placed in one of the registers, clobbering
-its original value. The most well-known modern example is x86. It is
-thus imperative that we support this pattern well in the register
-allocator.
-
-This instruction-set design is somewhat at odds with an SSA
-representation, where a value cannot be redefined.
-
-Thus, the allocator supports a useful fiction of sorts: the
-instruction can be described as if it has three register mentions --
-two inputs and a separate output -- and neither input will be
-clobbered. The output, however, is special: its register-placement
-policy is "reuse input i" (where i == 0 or 1). The allocator
-guarantees that the register assignment for that input and the output
-will be the same, so the instruction can use that register as its
-"modifies" operand. If the input is needed again later, the allocator
-will take care of the necessary copying.
-
-We will see below how the allocator makes this work by doing some
-preprocessing so that the core allocation algorithms do not need to
-worry about this constraint.
-
-## SSA
-
-regalloc2 takes an SSA IR as input, where the usual definitions apply:
-every vreg is defined exactly once, and every vreg use is dominated by
-its one def. (Using blockparams means that we do not need additional
-conditions for phi-nodes.)
-
-## Block Parameters
-
-Every block can have *block parameters*, and a branch to a block with
-block parameters must provide values for those parameters via
-operands. When a branch has more than one successor, it provides
-separate operands for each possible successor. These block parameters
-are equivalent to phi-nodes; we chose this representation because they
-are in many ways a more consistent representation of SSA.
-
-To see why we believe block parameters are a slightly nicer design
-choice than use of phi nodes, consider: phis are special
-pseudoinstructions that must come first in a block, are all defined in
-parallel, and whose uses occur on the edge of a particular
-predecessor. All of these facts complicate any analysis that scans
-instructions and reasons about uses and defs. It is much closer to the
-truth to actually put those uses *in* the predecessor, on the branch,
-and put all the defs at the top of the block as a separate kind of
-def. The tradeoff is that a vreg's def now has two possibilities --
-ordinary instruction def or blockparam def -- but this is fairly
-reasonable to handle.
-
-## Output
-
-The allocator produces two main data structures as output: an array of
-`Allocation`s and a sequence of edits. Some other miscellaneous data is also
-provided.
-
-### Allocations
-
-The allocator provides an array of `Allocation` values, one per
-`Operand`. Each `Allocation` has a kind and an index. The kind may
-indicate that this is a physical register or a stack slot, and the
-index gives the respective register or slot. All allocations will
-conform to the constraints given, and will faithfully preserve the
-dataflow of the input program.
-
-### Inserted Moves
-
-In order to implement the necessary movement of data between
-allocations, the allocator needs to insert moves at various program
-points.
-
-The vector of inserted moves contains tuples that name a program point
-and an "edit". The edit is either a move, from one `Allocation` to
-another, or else a kind of metadata used by the checker to know which
-VReg is live in a given allocation at any particular time. The latter
-sort of edit can be ignored by a backend that is just interested in
-generating machine code.
-
-Note that the allocator will never generate a move from one stackslot
-directly to another, by design. Instead, if it needs to do so, it will
-make use of the scratch register. (Sometimes such a move occurs when
-the scratch register is already holding a value, e.g. to resolve a
-cycle of moves; in this case, it will allocate another spillslot and
-spill the original scratch value around the move.)
-
-Thus, the single "edit" type can become either a register-to-register
-move, a load from a stackslot into a register, or a store from a
-register into a stackslot.
-
 # Data Structures
 
 We now review the data structures that regalloc2 uses to track its