…stalloc-doc
d-sonuga committed Sep 30, 2024
2 parents 3702c90 + 24e5c9d commit 54da368
Showing 3 changed files with 21 additions and 31 deletions.
14 changes: 7 additions & 7 deletions doc/GENERAL.md
@@ -78,21 +78,21 @@ consists of the following fields:
- VReg, or virtual register. *Every* operand mentions a virtual
register, even if it is constrained to a single physical register in
practice. This is because we track liveranges uniformly by vreg.

- Policy, or "constraint". Every reference to a vreg can apply some
constraint to the vreg at that point in the program. Valid policies are:

- Any location;
- Any register of the vreg's class;
- Any stack slot;
- A particular fixed physical register; or
- For a def (output), a *reuse* of an input register.

- The "kind" of reference to this vreg: Def, Use, Mod. A def
(definition) writes to the vreg, and disregards any possible earlier
value. A mod (modify) reads the current value then writes a new
one. A use simply reads the vreg's value.

- The position: before or after the instruction.
- Note that to have a def (output) register available in a way that
does not conflict with inputs, the def should be placed at the
@@ -157,7 +157,7 @@ block parameters must provide values for those parameters via
operands. When a branch has more than one successor, it provides
separate operands for each possible successor. These block parameters
are equivalent to phi-nodes; we chose this representation because they
are in many ways a more consistent representation of SSA.
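
As a rough illustration of the block-parameter scheme described above (not part of this commit), the sketch below pairs each branch argument with the parameter it feeds, one argument list per successor. The names `Block`, `Branch`, and `edge_moves` are hypothetical and do not reflect the allocator's real API.

```rust
/// Hypothetical stand-in: a virtual register is just an index here.
type VReg = u32;

/// A basic block that declares its parameters (the phi-node equivalent).
struct Block {
    params: Vec<VReg>,
}

/// A branch carries one argument list per possible successor.
struct Branch {
    /// (successor block index, argument vregs for that successor's params)
    targets: Vec<(usize, Vec<VReg>)>,
}

/// For each edge leaving the branch, pair every argument with the block
/// parameter it feeds, as (source vreg, destination param vreg).
fn edge_moves(blocks: &[Block], br: &Branch) -> Vec<(usize, Vec<(VReg, VReg)>)> {
    br.targets
        .iter()
        .map(|(succ, args)| {
            let params = &blocks[*succ].params;
            // Each successor must receive exactly one value per parameter.
            assert_eq!(params.len(), args.len(), "one argument per block parameter");
            let moves = args.iter().copied().zip(params.iter().copied()).collect();
            (*succ, moves)
        })
        .collect()
}
```

Because the values travel with the edge rather than living in a special instruction at the top of the successor, a two-way branch simply carries two independent argument lists.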

To see why we believe block parameters are a slightly nicer design
choice than use of phi nodes, consider: phis are special
@@ -174,8 +174,8 @@ reasonable to handle.
## Output

The allocator produces two main data structures as output: an array of
-`Allocation`s and a sequence of edits. Some other data, such as
-stackmap slot info, is also provided.
+`Allocation`s and a sequence of edits. Some other miscellaneous data is also
+provided.
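
To make the shape of that output concrete, here is a minimal sketch (not part of this commit). It assumes a flattened allocation array indexed through per-instruction offsets; that layout, and every name below (`Output`, `Edit`, `inst_alloc_offsets`, and so on), is a hypothetical simplification rather than the crate's actual definitions.

```rust
/// Hypothetical stand-in for where a value lives after allocation.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Allocation {
    Reg(u8),
    Stack(u32),
}

/// Whether an edit happens before or after its instruction.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Position {
    Before,
    After,
}

/// An edit: a move the client must insert at a program point.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Edit {
    inst: usize,
    pos: Position,
    from: Allocation,
    to: Allocation,
}

/// Assumed layout: one allocation per operand, flattened across instructions.
struct Output {
    allocs: Vec<Allocation>,
    /// Start offset into `allocs` for each instruction.
    inst_alloc_offsets: Vec<usize>,
    /// Edits, assumed sorted by program point.
    edits: Vec<Edit>,
}

impl Output {
    /// Allocations for the operands of one instruction.
    fn inst_allocs(&self, inst: usize) -> &[Allocation] {
        let start = self.inst_alloc_offsets[inst];
        let end = self
            .inst_alloc_offsets
            .get(inst + 1)
            .copied()
            .unwrap_or(self.allocs.len());
        &self.allocs[start..end]
    }

    /// Edits to emit at a given position relative to one instruction.
    fn edits_at(&self, inst: usize, pos: Position) -> impl Iterator<Item = &Edit> + '_ {
        self.edits
            .iter()
            .filter(move |e| e.inst == inst && e.pos == pos)
    }
}
```

A client would walk its instructions in order, emit the moves from `edits_at(inst, Position::Before)`, rewrite the operands using `inst_allocs(inst)`, then emit the `Position::After` moves.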

### Allocations

29 changes: 14 additions & 15 deletions doc/ION.md
@@ -24,8 +24,7 @@ The livein and liveout bitsets (`liveins` and `liveouts` on the `Env`)
are allocated one per basic block and record, per block, which vregs
are live entering and leaving that block. They are computed using a
standard backward iterative dataflow analysis and are exact; they do
-not over-approximate (this turns out to be important for performance,
-and is also necessary for correctness in the case of stackmaps).
+not over-approximate (this turns out to be important for performance).
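
For readers unfamiliar with the technique, the following is a minimal sketch of a backward iterative liveness analysis (not part of this commit). It assumes per-block summaries of upward-exposed uses and defs, and uses `BTreeSet` in place of the bitsets the text mentions; `Block` and `compute_liveness` are hypothetical names, not the allocator's code.

```rust
use std::collections::BTreeSet;

/// Hypothetical, simplified basic block; vregs are plain integers.
struct Block {
    /// Vregs read in this block before any write to them.
    uses: BTreeSet<u32>,
    /// Vregs written in this block.
    defs: BTreeSet<u32>,
    /// Indices of successor blocks.
    succs: Vec<usize>,
}

/// Returns (liveins, liveouts), one set per block, computed to a fixpoint.
fn compute_liveness(blocks: &[Block]) -> (Vec<BTreeSet<u32>>, Vec<BTreeSet<u32>>) {
    let n = blocks.len();
    let mut liveins: Vec<BTreeSet<u32>> = vec![BTreeSet::new(); n];
    let mut liveouts: Vec<BTreeSet<u32>> = vec![BTreeSet::new(); n];
    let mut changed = true;
    while changed {
        changed = false;
        // Visit blocks in reverse index order; for a backward problem this
        // usually needs fewer passes, but any order reaches the same result.
        for b in (0..n).rev() {
            // liveout(b) = union of livein(s) over all successors s.
            let mut out = BTreeSet::new();
            for &s in &blocks[b].succs {
                out.extend(liveins[s].iter().copied());
            }
            // livein(b) = uses(b) ∪ (liveout(b) \ defs(b)).
            let mut inn = blocks[b].uses.clone();
            inn.extend(out.difference(&blocks[b].defs).copied());
            if inn != liveins[b] || out != liveouts[b] {
                liveins[b] = inn;
                liveouts[b] = out;
                changed = true;
            }
        }
    }
    (liveins, liveouts)
}
```

Starting from empty sets and only ever growing them yields the least fixpoint, which is what makes the result exact rather than an over-approximation.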

### Blockparam Vectors: Source-Side and Dest-Side

@@ -426,7 +425,7 @@ them all here.
across its entire range. This has the effect of causing bundles to
be more important (more likely to evict others) the more they are
split.

- Requirement: a bundle's requirement is a value in a lattice that we
have defined, where top is "Unknown" and bottom is
"Conflict". Between these two, we have: any register (of a class);
@@ -435,7 +434,7 @@ them all here.
different requirements meets to Conflict. Requirements are derived
from the operand constraints for all uses in all liveranges in a
bundle, and then merged with the lattice meet-function.

The lattice is as follows (diagram simplified to remove multiple
classes and multiple fixed registers which parameterize nodes; any two
differently-parameterized values are unordered with respect to each
@@ -971,13 +970,13 @@ similarities than the differences.

* The core abstractions of "liverange", "bundle", "vreg", "preg", and
"operand" (with policies/constraints) are the same.

* The overall allocator pipeline is the same, and the top-level
structure of each stage should look similar. Both allocators begin
by computing liveranges, then merging bundles, then handling bundles
and splitting/evicting as necessary, then doing second-chance
allocation, then reifying the decisions.

* The cost functions are very similar, though the heuristics that make
decisions based on them are not.

@@ -999,33 +998,33 @@ Several notable high-level differences are:
and does not depend on scanning the code at all. In general, we
should be able to state simple invariants and see by inspection (as
well as fuzzing -- see above) that they hold.

* The data structures themselves are simplified. Where IonMonkey uses
linked lists in many places, this allocator stores simple inline
smallvecs of liveranges on bundles and vregs, and smallvecs of uses
on liveranges. We also (i) find a way to construct liveranges
in-order immediately, without any need for splicing, unlike
IonMonkey, and (ii) relax sorting invariants where possible to allow
for cheap append operations in many cases.

* The splitting heuristics are significantly reworked. Whereas
IonMonkey has an all-at-once approach to splitting an entire bundle,
and has a list of complex heuristics to choose where to split, this
allocator does conflict-based splitting, and tries to decide whether
to split or evict and which split to take based on cost heuristics.

* The liverange computation is exact, whereas IonMonkey approximates
using a single-pass algorithm that makes vregs live across entire
loop bodies. We have found that precise liveness improves allocation
performance and generated code quality, even though the liveness
itself is slightly more expensive to compute.

* Many of the algorithms in the IonMonkey allocator are built with
helper functions that do linear scans. These "small quadratic" loops
are likely not a huge issue in practice, but nevertheless have the
potential to be in corner cases. As much as possible, all work in
this allocator is done in linear scans.

* There are novel schemes for solving certain interesting design
challenges. One example: in IonMonkey, liveranges are connected
across blocks by, when reaching one end of a control-flow edge in a
@@ -1041,7 +1040,7 @@ Several notable high-level differences are:
for the core regalloc. Ion instead has to tweak its definition of
minimal bundles and create two liveranges that overlap (!) to
represent the two uses.

* Using block parameters rather than phi-nodes significantly
simplifies handling of inter-block data movement. IonMonkey had to
special-case phis in many ways because they are actually quite
@@ -1052,7 +1051,7 @@ Several notable high-level differences are:
* The allocator supports irreducible control flow and arbitrary block
ordering (its only CFG requirement is that critical edges are
split).

* The allocator supports non-SSA code, and has native support for
handling program moves specially.

Expand All @@ -1073,7 +1072,7 @@ number of general principles:
an allocation map for each PReg. This turned out to be significantly
(!) less efficient than Rust's built-in BTree data structures, for
the usual cache-efficiency vs. pointer-chasing reasons.

* We initially used dense bitvecs, as IonMonkey does, for
livein/liveout bits. It turned out that a chunked sparse design (see
below) was much more efficient.
@@ -1097,7 +1096,7 @@ number of general principles:
append liveranges to in-progress vreg liverange vectors and then
reverse at the end. The expensive part is a single pass; only the
bitset computation is a fixpoint loop.

* Sorts are better than always-sorted data structures (like btrees):
they amortize all the comparison and update cost to one phase, and
this phase is much more cache-friendly than a bunch of spread-out
9 changes: 0 additions & 9 deletions src/checker.rs
@@ -162,15 +162,6 @@ pub enum CheckerError {
         op: Operand,
         alloc: Allocation,
     },
-    ConflictedValueInStackmap {
-        inst: Inst,
-        alloc: Allocation,
-    },
-    NonRefValuesInStackmap {
-        inst: Inst,
-        alloc: Allocation,
-        vregs: FxHashSet<VReg>,
-    },
     StackToStackMove {
         into: Allocation,
         from: Allocation,