From 24e5c9dba768a3d838ec20f0a80a51c9983ba344 Mon Sep 17 00:00:00 2001 From: Nick Fitzgerald Date: Mon, 30 Sep 2024 11:28:28 -0700 Subject: [PATCH] Remove some final references to stack maps (#197) Stack maps are not provided by regalloc2 anymore. This removes the final references to stack maps in the codebase. --- doc/DESIGN.md | 43 +++++++++++++++++++++---------------------- src/checker.rs | 9 --------- 2 files changed, 21 insertions(+), 31 deletions(-) diff --git a/doc/DESIGN.md b/doc/DESIGN.md index 0c5f35ed..4172a063 100644 --- a/doc/DESIGN.md +++ b/doc/DESIGN.md @@ -80,21 +80,21 @@ consists of the following fields: - VReg, or virtual register. *Every* operand mentions a virtual register, even if it is constrained to a single physical register in practice. This is because we track liveranges uniformly by vreg. - + - Policy, or "constraint". Every reference to a vreg can apply some constraint to the vreg at that point in the program. Valid policies are: - + - Any location; - Any register of the vreg's class; - Any stack slot; - A particular fixed physical register; or - For a def (output), a *reuse* of an input register. - + - The "kind" of reference to this vreg: Def, Use, Mod. A def (definition) writes to the vreg, and disregards any possible earlier value. A mod (modify) reads the current value then writes a new one. A use simply reads the vreg's value. - + - The position: before or after the instruction. - Note that to have a def (output) register available in a way that does not conflict with inputs, the def should be placed at the @@ -159,7 +159,7 @@ block parameters must provide values for those parameters via operands. When a branch has more than one successor, it provides separate operands for each possible successor. These block parameters are equivalent to phi-nodes; we chose this representation because they -are in many ways a more consistent representation of SSA. +are in many ways a more consistent representation of SSA. To see why we believe block parameters are a slightly nicer design choice than use of phi nodes, consider: phis are special @@ -176,8 +176,8 @@ reasonable to handle. ## Output The allocator produces two main data structures as output: an array of -`Allocation`s and a sequence of edits. Some other data, such as -stackmap slot info, is also provided. +`Allocation`s and a sequence of edits. Some other miscellaneous data is also +provided. ### Allocations @@ -229,8 +229,7 @@ The livein and liveout bitsets (`liveins` and `liveouts` on the `Env`) are allocated one per basic block and record, per block, which vregs are live entering and leaving that block. They are computed using a standard backward iterative dataflow analysis and are exact; they do -not over-approximate (this turns out to be important for performance, -and is also necessary for correctness in the case of stackmaps). +not over-approximate (this turns out to be important for performance). ### Blockparam Vectors: Source-Side and Dest-Side @@ -631,7 +630,7 @@ them all here. across its entire range. This has the effect of causing bundles to be more important (more likely to evict others) the more they are split. - + - Requirement: a bundle's requirement is a value in a lattice that we have defined, where top is "Unknown" and bottom is "Conflict". Between these two, we have: any register (of a class); @@ -640,7 +639,7 @@ them all here. different requirements meets to Conflict. Requirements are derived from the operand constraints for all uses in all liveranges in a bundle, and then merged with the lattice meet-function. - + The lattice is as follows (diagram simplified to remove multiple classes and multiple fixed registers which parameterize nodes; any two differently-parameterized values are unordered with respect to each @@ -1176,13 +1175,13 @@ similarities than the differences. * The core abstractions of "liverange", "bundle", "vreg", "preg", and "operand" (with policies/constraints) are the same. - + * The overall allocator pipeline is the same, and the top-level structure of each stage should look similar. Both allocators begin by computing liveranges, then merging bundles, then handling bundles and splitting/evicting as necessary, then doing second-chance allocation, then reifying the decisions. - + * The cost functions are very similar, though the heuristics that make decisions based on them are not. @@ -1204,7 +1203,7 @@ Several notable high-level differences are: and does not depend on scanning the code at all. In general, we should be able to state simple invariants and see by inspection (as well as fuzzing -- see above) that they hold. - + * The data structures themselves are simplified. Where IonMonkey uses linked lists in many places, this allocator stores simple inline smallvecs of liveranges on bundles and vregs, and smallvecs of uses @@ -1212,25 +1211,25 @@ Several notable high-level differences are: in-order immediately, without any need for splicing, unlike IonMonkey, and (ii) relax sorting invariants where possible to allow for cheap append operations in many cases. - + * The splitting heuristics are significantly reworked. Whereas IonMonkey has an all-at-once approach to splitting an entire bundle, and has a list of complex heuristics to choose where to split, this allocator does conflict-based splitting, and tries to decide whether to split or evict and which split to take based on cost heuristics. - + * The liverange computation is exact, whereas IonMonkey approximates using a single-pass algorithm that makes vregs live across entire loop bodies. We have found that precise liveness improves allocation performance and generated code quality, even though the liveness itself is slightly more expensive to compute. - + * Many of the algorithms in the IonMonkey allocator are built with helper functions that do linear scans. These "small quadratic" loops are likely not a huge issue in practice, but nevertheless have the potential to be in corner cases. As much as possible, all work in this allocator is done in linear scans. - + * There are novel schemes for solving certain interesting design challenges. One example: in IonMonkey, liveranges are connected across blocks by, when reaching one end of a control-flow edge in a @@ -1246,7 +1245,7 @@ Several notable high-level differences are: for the core regalloc. Ion instead has to tweak its definition of minimal bundles and create two liveranges that overlap (!) to represent the two uses. - + * Using block parameters rather than phi-nodes significantly simplifies handling of inter-block data movement. IonMonkey had to special-case phis in many ways because they are actually quite @@ -1257,7 +1256,7 @@ Several notable high-level differences are: * The allocator supports irreducible control flow and arbitrary block ordering (its only CFG requirement is that critical edges are split). - + * The allocator supports non-SSA code, and has native support for handling program moves specially. @@ -1278,7 +1277,7 @@ number of general principles: an allocation map for each PReg. This turned out to be significantly (!) less efficient than Rust's built-in BTree data structures, for the usual cache-efficiency vs. pointer-chasing reasons. - + * We initially used dense bitvecs, as IonMonkey does, for livein/liveout bits. It turned out that a chunked sparse design (see below) was much more efficient. @@ -1302,7 +1301,7 @@ number of general principles: append liveranges to in-progress vreg liverange vectors and then reverse at the end. The expensive part is a single pass; only the bitset computation is a fixpoint loop. - + * Sorts are better than always-sorted data structures (like btrees): they amortize all the comparison and update cost to one phase, and this phase is much more cache-friendly than a bunch of spread-out diff --git a/src/checker.rs b/src/checker.rs index 297a89ef..d67dc2da 100644 --- a/src/checker.rs +++ b/src/checker.rs @@ -162,15 +162,6 @@ pub enum CheckerError { op: Operand, alloc: Allocation, }, - ConflictedValueInStackmap { - inst: Inst, - alloc: Allocation, - }, - NonRefValuesInStackmap { - inst: Inst, - alloc: Allocation, - vregs: FxHashSet, - }, StackToStackMove { into: Allocation, from: Allocation,