-
Notifications
You must be signed in to change notification settings - Fork 296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[arcilator] Add clock divider integration test #7705
Conversation
1427a02
to
57b6019
Compare
2c1bc29
to
7da872c
Compare
57b6019
to
9d3e32a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs #7703 to work correctly, doesn't it?
Independently of that, it is currently blasting through the C calling conventions, giving me this output locally:
0 0 2191024128 1808528296
0 0 2191024128 1808528296
0 0 2191024128 1808528296
[...]
The changes below should fix that.
Uh oh, damn calling conventions 🙈. Thanks for the fixes @fzi-hielscher 🥳. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, it's impressive that ClockDiv just works!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice that this just works!
507d712
to
af82ee5
Compare
7da872c
to
2af2512
Compare
af82ee5
to
43f6fa3
Compare
2af2512
to
b4d2ae4
Compare
43f6fa3
to
fceb310
Compare
b4d2ae4
to
8bef704
Compare
fceb310
to
0511ca9
Compare
8bef704
to
fce7727
Compare
0511ca9
to
1bf933e
Compare
This is a complete rewrite of the `LowerState` pass that makes the `LegalizeStateUpdate` pass obsolete. The old implementation of `LowerState` produces `arc.model`s that still contain read-after-write conflicts. This primarily happens because the pass simply emits `arc.state_write` operations that write updated values to simulation memory for each `arc.state`, and any user of `arc.state` would use an `arc.state_read` operation to retrieve the original value of the state before any writes occurred. Memories are similar. The Arc dialect considers `arc.state_write` and `arc.memory_write` operations to be _deferred_ writes until the `LegalizeStateUpdate` pass runs, at which point they become _immediate_ since the legalization inserts the necessary temporaries to resolve the read-after-write conflicts. The previous implementation would also not handle state-to-output and state-to-side-effecting-op propagation paths correctly. When a model's eval function is called, registers are updated to their new value, and any outputs that combinatorially depend on those new values should also immediately update. Similarly, operations such as asserts or debug trackers should observe new values for states immediately after they have been written. However, since all writes are considered deferred, there is no way for `LowerState` to produce a mixture of operations that depend on a register's _old_ state (because they are used to compute a register's new state), and on a _new_ state because they are combinatorially derived values. This new implementation of `LowerState` completely avoids read-after-write conflicts. It does this by changing the way modules are lowered in two ways: **Phases:** The pass tracks in which _phase_ of the simulation lifecycle a value is needed and allows for operations to have different lowerings in different phases. An `arc.state` operation for example requires its inputs, enable, and reset to be computed based on the _old_ value they had, i.e. the value the end of the previous call to the model's eval function. The clock however has to be computed based on the _new_ value it has in the current call to eval. Therefore, the ops defining the inputs, enable, and reset are lowered in the _old_ phase, while the ops defining the clock are lowered in the _new_ phase. The `arc.state` op lowering will then write its _new_ value to simulation storage. This phase tracking allows registers to be used as the clock for other registers: since the clocks require _new_ values, registers serving as clock to others are lowered first, such that the dependent registers can immediately react to the updated clock. It also allows for module outputs and side-effecting ops based on `arc.state`s to be scheduled after the states have been updated, since they depend on the state's _new_ value. The pass also covers the situation where an operation depends on a module input and a state, and feeds into a module output as well as another state. In this situation that operation has to be lowered twice: once for the _old_ phase to serve as input to the subsequent state, and once for the _new_ phase to compute the new module output. In addition to the _old_ and _new_ phases representing the previous and current call to eval, the pass also models an _initial_ and _final_ phase. These are used for `seq.initial` and `llhd.final` ops, and in order to compute the initial values for states. If an `arc.state` op has an initial value operand it is lowered in the _initial_ phase. Similarly for the ops in `llhd.final`. The pass places all ops lowered in the initial and final phases into corresponding `arc.initial` and `arc.final` ops. At a later point we may want to generate the `*_initial`, `*_eval`, and `*_final` functions directly. **No more clock trees:** The new implementation also no longer generates `arc.clock_tree` and `arc.passthrough` operations. These were a holdover from the early days of the Arc dialect, where no eval function would be generated. Instead, the user was required to directly call clock functions. This was never able to model clocks changing at the exact same moment, or having clocks derived from registers and other combinatorial operations. Since Arc has since switched to generating an eval function that can accurately interleave the effects of different clocks, grouping ops by clock tree is no longer needed. In fact, removing the clock tree ops allows for the pass to more efficiently interleave the operations from different clock domains. The Rocket core in the circt/arc-tests repository still works with this new implementation of LowerState. In combination with the MergeIfs pass the performance stays the same. I have renamed the implementation and test files to make the git diffs easier to read. The names will be changed back in a follow-up commit.
The new LowerState pass does not produce `arc.clock_tree` and `arc.passthrough` ops anymore. Remove them from the dialect entirely.
Add a test to check that arcilator can simulate a simple clock divider. This exercises a corner case of arcilator's simulation model scheduling, where a state updating its value can trigger other states and module outptus to update their values. In this case, a cascade of clock edges is generated by feeding one state's output into the clock input of the next state.
1bf933e
to
15bde05
Compare
fce7727
to
33c0e5a
Compare
Add a test to check that arcilator can simulate a simple clock divider. This exercises a corner case of arcilator's simulation model scheduling, where a state updating its value can trigger other states and module outptus to update their values. In this case, a cascade of clock edges is generated by feeding one state's output into the clock input of the next state.