Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Calyx-FIRRTL] Add fud2 support for FIRRTL-written primitives + externalized memory simulation #1898

Closed
wants to merge 35 commits into from

Conversation

ayakayorihiro
Copy link
Contributor

@ayakayorihiro ayakayorihiro commented Feb 6, 2024

Sorry that the writeup is very long! There was a lot I wanted to verbalize... The Simulating with externalized memories section was also meant to be a potential FAQ so it's ok to be skipped.

General

Currently, we have two "modes" of translating Calyx to FIRRTL with respect to primitives: (1) using extmodule in FIRRTL that uses the pre-existing Verilog implementation of primitives, (2) translating to FIRRTL implementations of primitives using a FIRRTL template and the primitive-uses backend. Previously, fud2 only supported (1) because I didn't have (2) implemented. This PR extends fud2 by allowing FIRRTL to be generated and simulated in both options.

I tried to make this code-clone-free as possible, but please let me know of more improvements I can make!! Also, I'm not the biggest fan of the namings I'm using (ex. verilog-mem-external and others), so I'm more than happy to get suggestions on that front too.

Simulating with externalized memories

I'd like to elaborate a bit on why there are weirdly so many moving parts in this PR, hopefully to answer potential questions like: "Why did you add so many states and transitions to fud2?", "What the heck is externalized_mem_tb.sv?".

FIRRTL does not have a mechanism to dump memory contents a la writememh, which is how @external memories are retrieved in simulation (implemented in the Verilog backend). Since we can't dump memory contents, we need to actually externalize the memory by turning its ports inside out. The problem with this is that the current Verilog testbench does not support main components with externalized memories. So, I wrote an adapted testbench (externamized_mem_tb.sv) that supports main components with externalized memories, and also feed in the Verilog implementation of std_mem_d1. That's what testbench_mem_externalized_setup does. I had to add a bunch of states and transitions to fud2 because (currently) this new testbench should be used if and only if we are simulating (2): FIRRTL with FIRRTL primitive implementations.

Lastly, I'd like to try to answer this potential question:

"Why did you change calyx-opt/src/passes/well_formed.rs and break the CI build?"
I was initially using the externalize pass on the original Calyx programs to produce the FIRRTL code for (2). The externalize pass exposes all input ports of a memory, producing a signature that looked like this:

component main(@go go: 1, @clk clk: 1, @reset reset: 1, mem_read_data: 32, mem_done: 1) -> (@done done: 1, mem_addr0: 1, mem_write_data: 32, mem_write_en: 1, mem_clk: 1, mem_reset: 1) {

This was causing a type error in the FIRRTL, since clk (that had the @clock attribute) had type Clock whereas mem_clk (which didn't have @clock) had type UInt<1>, and mem_clk had clk supplied into it.

@sampsyo suggested we replace @external with ref because (a) ref cells get around this problem (it doesn't produce a mem_clk port in the first place), (b) ref is much more general than @external (and it might be good to have ref simply subsume @external sometime in the future). I'll try to make an issue about (b) soon, I just hope I can explain things well 😅

That being said, I think my current changes for calyx-opt/src/passes/well_formed.rs are not quite it, since I didn't want to break the CI build. It would be nice if I can limit that conditional to just when the firrtl and primitive-uses backends are used, but I wasn't sure how to effectively do that... Suggestions would be very appreciated!

Usage

To generate + simulate (1) FIRRTL with Verilog primitive implementations, you can continue using --to firrtl and --through firrtl.

To generate (2) FIRRTL with FIRRTL primitive implementations, please use --to firrtl-with-primitives.
ex)

fud2 examples/tutorial/language-tutorial-control.futil --to firrtl-with-primitives

To simulate (2) FIRRTL with FIRRTL primitive implementations, you can use --through firrtl-with-primitives.
ex)

fud2 examples/tutorial/language-tutorial-control.futil --to dat --through firrtl-with-primitives -s sim.data=data.json -m dot

Below is the fud2 graph that got generated when I ran the above command with -m graph:
image

Sorry if it's hard to read, but I hope the picture helps conceptualize what's going on! Please let me know if you have any questions, or let me know of any suggestions!!!

@ayakayorihiro
Copy link
Contributor Author

There's a good chance I might make a big change to this branch, in which case I'll close this PR out for now. Apologies in advance!

Copy link
Contributor

@sampsyo sampsyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!!! Super cool that this is working, and really nice work plumbing it all the way through the build tool and all that nonsense. This was unavoidable, but it's too bad that it required you to duplicate the fud2 Verilog states for another "style" (doubling the total Verilog variants to 4), but that's the way that it is!

Generally, I suggest we revisit this post-deadline and see what we can do to unify the two styles. But I strongly endorse cutting all corners temporarily to just make stuff work in the near term.

Another thing to put on the list as a post-deadline task: let's add your very helpful "usage" notes here to a new documentation page about the FIRRTL backend!

Comment on lines 110 to 111
// Main component cannot use `ref` cells
if comp.name == ctx.entrypoint {
if ctx.bc.emit_primitive_extmodules && comp.name == ctx.entrypoint {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll write a tiny bit more about this elsewhere, but what about using ctx.bc.synthesis_mode? (That is, the --synthesis flag to the Calyx compiler.) This flag already has the effect of disabling the readmemh/writememh "built-in testbench," which sorta implies that it is appropriate when you want "just the hardware" and nothing else. It seems like it would be safe to allow ref in that case, because you're saying that you don't expect to use the compiled code with our little tb.sv anyway? And there's a chance this would cause less test disruption…

fud2/graph.pdf Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this was committed by mistake.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this 😅

fud2/src/main.rs Outdated Show resolved Hide resolved
fud2/src/main.rs Outdated
e.var("testbench", &format!("{}/tb.sv", e.config_val("rsrc")?))?;
Ok(())
});
let testbench_mem_externalized_setup =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is a suggestion purely about naming stuff: what if we came up with a single, short, made-up word that refers to this "externalized memory" concept? Even if it's less explanatory than mem_externalized or mem-external or other variants, being made-up will make it easier to grep for and to recognize the connections between these things.

Anything could work, but I propose refmem (avoiding "externalized" to avoid confusion with the @external annotation, which we are actually not using). So this would be called testbench_refmem_setup.

fud2/src/main.rs Outdated
Comment on lines 157 to 159
let verilog_mem_external = bld.state("verilog-mem-external", &["sv"]);
let verilog_noverify_mem_external =
bld.state("verilog-noverify-mem-external", &["sv"]); // Need to use alternative testbench.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Under my proposed naming convention, these states would be named something like verilog_refmem and verilog_noverify_refmem or whatever.

fud2/src/main.rs Outdated
verilog_noverify,
simulator,
|e, input, output| {
e.build("icarus-compile", input, output)?;
Ok(())
},
);
bld.op(
"icarus-mem-external",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this would be icarus-refmem or whatever.

);

let firrtl_primitives_setup = bld.setup("FIRRTL with primitives", |e| {
e.rule("external-to-ref", "sed 's/@external([0-9]*)/ref/g' $in | sed 's/@external/ref/g' > $out")?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The awesome power of sed!!!!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's definitely up there in my list of favorite commands 😄

@@ -220,6 +304,13 @@ fn build_driver() -> Driver {
verilog_noverify,
firrtl_compile,
);
bld.op(
"firrtl-with-primitives-noverify",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why, but it is annoying we have to duplicate these FIRRTL compilers for the "noverify" variants! Otherwise, the Verilog will work with only Icarus or Verilator but not both. The real solution here, of course, is to somehow make our extra checks work in Icarus so we don't need this distinction… but that is a problem for another day.

Comment on lines 80 to 82
with open(calyx_program) as f:
for line in f:
if ("ref" in line) and "comb_mem_d1" in line:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to admit that this is a very hacky way to extract the memory declarations from a Calyx program, but it will do for a v1.0! At some point, it would be nice to actually parse the Calyx syntax to extract these things… maybe with yet another custom backend. But don't let that stop you from forging ahead with this string-matching version for now!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree 😅 If I weren't pressed for time I definitely would've written a backend for this. I'll add it to my TODOs to fix this up after next week!

@sampsyo
Copy link
Contributor

sampsyo commented Feb 10, 2024

I wrote a little more about what to do (again, post-deadline) about the refmem Verilog emission mode here: #1603 (comment). Namely, I think we can get rid of it if we finally take the plunge and remove @external altogether, replacing it with ref instead.

@ayakayorihiro
Copy link
Contributor Author

ayakayorihiro commented Feb 10, 2024

Thanks a lot @sampsyo for the review! I'm sorry I forgot to let you know that this PR still contains WIP, I should've closed out the PR but somehow thought the changes I'd need to make are minimal (haha). Nevertheless your feedback is extremely helpful!!

I came up with an idea last night about how I can avoid making extra states in the fud2 setup while integrating my testbench generation script, so I'll spend a bit of time trying it out. If it doesn't work I'll defer to the multiple states setup for now and make the changes you suggested! Either way I'll make another comment when I'm done.

@ayakayorihiro
Copy link
Contributor Author

I added full fud2 support for the new custom testbench strategy and fixed some things up! I ended up not being able to remove the many styles/states that are currently residing in fud2, but we can figure things out post-deadline.

In my testing, I got to run some tests from the base tests/correctness directory: if.futil, if-static-different-latencies.futil, par.futil, seq.futil, while.futil (so exciting to be finally running programs that are not from the tutorials! 😄 ). I found a bug in the FIRRTL backend that I need to account for from invoke-memory.futil because some ports that are never used need to be initialized to invalid in FIRRTL to get the FIRRTL compiler working. I'll add a separate PR for that fix since I think it's out of scope for here. The rest of the tests require me to implement std_mult_pipe, while I'll get done soon.

If I can get another round of checking of this PR, I'd super appreciate it!

@ayakayorihiro ayakayorihiro marked this pull request as draft February 12, 2024 21:27
@sampsyo
Copy link
Contributor

sampsyo commented Feb 27, 2024

Things generally look good here! I could be wrong, but it looks like a few unrelated changes might have gotten mixed up with this PR? See, for example, the Dockerfile change at the top of the diff. Which is totally fine and normal during a deadline push, of course, but maybe we want to trim this down to the bare necessities to get it merged into the main branch.

I wouldn't worry about splitting this into smaller PRs—there are many FIRRTL-related things in here, and maybe we should just go ahead and merge the important ones? (And perhaps make ourselves a little to-do list of the small things to clean up afterward?)

@ayakayorihiro
Copy link
Contributor Author

ayakayorihiro commented Feb 27, 2024

@sampsyo Really sorry for the mess! It seems like what happened is that a couple of days ago I (1) merged the main branch into this branch, then (2) immediately "reverted" some changes on this branch because #1610 's changes to primitives/memories/seq.sv broke my benchmarks (the Dahlia-to-Calyx compiler doesn't emit the new version of seq_mem_d* and my benchmarks are from the Dahlia implementations of Polybench). Because the evaluation repo references this branch, I wanted to make sure the benchmarks can still run on this branch.
So the changes to Dockerfile and such should be from the revert. I didn't realize how much that complicated this PR :(

The other smaller issue is that I have to update my testbench generation script to produce the new/correct signature for seq_mem_d*. But I can take care of that. I think the best way to go from here is to keep this branch as is for the sake of the evaluation but close this PR, and make a new PR for a new branch that is up-to-date with main. I hope that sounds ok?

@sampsyo
Copy link
Contributor

sampsyo commented Feb 29, 2024

Got it; makes sense, and no worries!! It seems extremely reasonable to leave this PR here so that the eval repo can reference the exact commit with all of the caveats.

What you suggest here sounds exactly right:

I think the best way to go from here is to keep this branch as is for the sake of the evaluation but close this PR, and make a new PR for a new branch that is up-to-date with main. I hope that sounds ok?

Namely, please do whatever's easiest/simplest, but one option could be to git cherry-pick a few changes at a time and turn those into discrete PRs. Like, maybe one set of changes is just the fud2 stuff (which I promise to help with w/r/t resource files!!), another is for the testbench stuff, maybe another for any necessary tweaks to the primitive library… anyway, truly, you do you, but parceling things out like that might make the path forward clear.

@ayakayorihiro
Copy link
Contributor Author

@sampsyo Awesome!! Thanks for the suggestion :) Will close this PR now accordingly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants