Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Arc] Improve LowerState to never produce read-after-write conflicts #7703

Merged
merged 1 commit into from
Oct 28, 2024

Commits on Oct 28, 2024

  1. [Arc] Improve LowerState to never produce read-after-write conflicts

    This is a complete rewrite of the `LowerState` pass that makes the
    `LegalizeStateUpdate` pass obsolete.
    
    The old implementation of `LowerState` produces `arc.model`s that still
    contain read-after-write conflicts. This primarily happens because the
    pass simply emits `arc.state_write` operations that write updated values
    to simulation memory for each `arc.state`, and any user of `arc.state`
    would use an `arc.state_read` operation to retrieve the original value
    of the state before any writes occurred. Memories are similar. The Arc
    dialect considers `arc.state_write` and `arc.memory_write` operations to
    be _deferred_ writes until the `LegalizeStateUpdate` pass runs, at which
    point they become _immediate_ since the legalization inserts the
    necessary temporaries to resolve the read-after-write conflicts.
    
    The previous implementation would also not handle state-to-output and
    state-to-side-effecting-op propagation paths correctly. When a model's
    eval function is called, registers are updated to their new value, and
    any outputs that combinatorially depend on those new values should also
    immediately update. Similarly, operations such as asserts or debug
    trackers should observe new values for states immediately after they
    have been written. However, since all writes are considered deferred,
    there is no way for `LowerState` to produce a mixture of operations that
    depend on a register's _old_ state (because they are used to compute a
    register's new state), and on a _new_ state because they are
    combinatorially derived values.
    
    This new implementation of `LowerState` completely avoids
    read-after-write conflicts. It does this by changing the way modules are
    lowered in two ways:
    
    **Phases:** The pass tracks in which _phase_ of the simulation lifecycle
    a value is needed and allows for operations to have different lowerings
    in different phases. An `arc.state` operation for example requires its
    inputs, enable, and reset to be computed based on the _old_ value they
    had, i.e. the value the end of the previous call to the model's eval
    function. The clock however has to be computed based on the _new_ value
    it has in the current call to eval. Therefore, the ops defining the
    inputs, enable, and reset are lowered in the _old_ phase, while the ops
    defining the clock are lowered in the _new_ phase. The `arc.state` op
    lowering will then write its _new_ value to simulation storage.
    
    This phase tracking allows registers to be used as the clock for other
    registers: since the clocks require _new_ values, registers serving as
    clock to others are lowered first, such that the dependent registers can
    immediately react to the updated clock. It also allows for module
    outputs and side-effecting ops based on `arc.state`s to be scheduled
    after the states have been updated, since they depend on the state's
    _new_ value.
    
    The pass also covers the situation where an operation depends on a
    module input and a state, and feeds into a module output as well as
    another state. In this situation that operation has to be lowered twice:
    once for the _old_ phase to serve as input to the subsequent state, and
    once for the _new_ phase to compute the new module output.
    
    In addition to the _old_ and _new_ phases representing the previous and
    current call to eval, the pass also models an _initial_ and _final_
    phase. These are used for `seq.initial` and `llhd.final` ops, and in
    order to compute the initial values for states. If an `arc.state` op has
    an initial value operand it is lowered in the _initial_ phase. Similarly
    for the ops in `llhd.final`. The pass places all ops lowered in the
    initial and final phases into corresponding `arc.initial` and
    `arc.final` ops. At a later point we may want to generate the
    `*_initial`, `*_eval`, and `*_final` functions directly.
    
    **No more clock trees:** The new implementation also no longer generates
    `arc.clock_tree` and `arc.passthrough` operations. These were a holdover
    from the early days of the Arc dialect, where no eval function would be
    generated. Instead, the user was required to directly call clock
    functions. This was never able to model clocks changing at the exact
    same moment, or having clocks derived from registers and other
    combinatorial operations. Since Arc has since switched to generating an
    eval function that can accurately interleave the effects of different
    clocks, grouping ops by clock tree is no longer needed. In fact,
    removing the clock tree ops allows for the pass to more efficiently
    interleave the operations from different clock domains.
    
    The Rocket core in the circt/arc-tests repository still works with this
    new implementation of LowerState. In combination with the MergeIfs pass
    the performance stays the same.
    
    I have renamed the implementation and test files to make the git diffs
    easier to read. The names will be changed back in a follow-up commit.
    fabianschuiki committed Oct 28, 2024
    Configuration menu
    Copy the full SHA
    c0246c2 View commit details
    Browse the repository at this point in the history