Build a stand-alone Miden assembly parser #298
Comments
I can extend the existing docs parser to handle this. |
I think we should probably use the current |
Got it. I'd like to take this on, as it seems relatively straightforward. Is there any preference for the binary format in which we should store the AST? |
I think as the first step we can implement the parsing part only. Serialization to binary format could be done in future PRs. Before we start working on this, might be good to figure out answers to the questions I had in the original post (especially for the |
Yes, it seems like exec in a local .masm file is replaceable with a local index (though probably sorted, as I assume order in the .masm file doesn't matter and shouldn't affect the AST). |
I think we can still store names in the AST. I was thinking more about the future, when we want to serialize to binary and should try to be as compact as possible. So, when serializing, we'd need to replace all local procedure names with indexes (which could probably fit in 1 or 2 bytes), and I thought it might be easier to handle this from the start. Another related reason is that we should be able to parse the binary format into an AST as well. And if we implement serialization as described above, procedure names would not be available. Thus, the procedure name should probably be stored in the relevant node as
The order of local procedures does matter, so we should assign indexes in the order in which they were declared in the source file. |
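A minimal sketch of the name-to-index replacement described in this comment, assuming indexes are assigned in declaration order and fit in 2 bytes. The `Instruction` variants, the `Procedure` struct, and `resolve_local_calls` are hypothetical names used for illustration only, and rewriting the AST in place is just one possible way to do it:

```rust
use std::collections::HashMap;

// Hypothetical instruction forms; the names are illustrative assumptions.
#[derive(Debug, PartialEq)]
enum Instruction {
    Add,
    ExecByName(String), // as written in the source: exec.<name>
    ExecLocal(u16),     // after resolution: index of a local procedure
}

// Hypothetical local procedure: a name plus a flat instruction list.
struct Procedure {
    name: String,
    body: Vec<Instruction>,
}

// Assigns indexes in declaration order, then rewrites every exec-by-name
// into exec-by-index. Fails on duplicate or undeclared procedure names.
fn resolve_local_calls(procs: &mut [Procedure]) -> Result<(), String> {
    let mut index_of = HashMap::new();
    for (i, p) in procs.iter().enumerate() {
        if i > u16::MAX as usize {
            return Err("too many local procedures".to_string());
        }
        if index_of.insert(p.name.clone(), i as u16).is_some() {
            return Err(format!("duplicate procedure: {}", p.name));
        }
    }
    for p in procs.iter_mut() {
        for instr in p.body.iter_mut() {
            if let Instruction::ExecByName(name) = instr {
                let idx = *index_of
                    .get(name.as_str())
                    .ok_or_else(|| format!("unknown procedure: {}", name))?;
                *instr = Instruction::ExecLocal(idx);
            }
        }
    }
    Ok(())
}

fn main() {
    let mut procs = vec![
        Procedure { name: "foo".to_string(), body: vec![Instruction::Add] },
        Procedure {
            name: "bar".to_string(),
            body: vec![Instruction::ExecByName("foo".to_string())],
        },
    ];
    resolve_local_calls(&mut procs).unwrap();
    println!("{:?}", procs[1].body);
}
```

Because the index is simply the position in the declaration list, a binary reader could rebuild the same table without ever seeing the names.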
One option to remove the stdlib as a dependency of the assembly is to deprecate This new token will consume a given The source provider will be a mapping We can, alternatively, change the behavior of |
Closed by #490 and many others. |
As discussed in #225, it might be a good idea to separate Miden assembly parsing from MAST generation. There are two primary reasons for that:
Script struct), some info about the original assembly file is already lost. Specifically, the procedures have already been inlined, and thus, serializing MAST could result in a much bigger output size than the original .masm file.
Overall, the compilation process could look something like this:
As the first step, we should probably just implement parsing of source .masm files into simple ASTs. We could use off-the-shelf parsers such as Pest or Rowan, but given how simple Miden assembly syntax is, my initial thinking is that we can implement the parser ourselves.
In terms of structure, we could probably define a Node enum which could look something like this:
In the above, Instruction could be an enum itself, something like:
A few things left to figure out:
.masm file) and another one for external procedures (e.g., from stdlib). And there should be a Procedure struct which will contain procedure name, number of locals etc.
How should the exec instruction work? In the .masm file we provide a name of the procedure to be invoked. For local procedures we should probably replace names with indexes. For external procedures, we should probably replace names with procedure hashes, but I haven't thought through the implications completely.
Once we have the parsing working, we can implement Miden assembly serialization into binary format, and then switch over to building MAST from the AST rather than directly from .masm source.
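The Node and Instruction enums proposed above, together with a hand-written parser, might look roughly like the following sketch. It covers only a small subset of the syntax (add, mul, push.N, exec.name, if.true/else/end, while.true, repeat.N) and tokenizes on whitespace. The variant names, the `Terminator` helper, and all function names are assumptions for illustration, not the definitions the project adopted:

```rust
// Hypothetical AST for a small subset of Miden assembly. All names here are
// illustrative assumptions, not the project's actual definitions.
#[derive(Debug, PartialEq)]
enum Instruction {
    Add,
    Mul,
    Push(u64),
    Exec(String), // procedure name kept exactly as written in the source
}

#[derive(Debug, PartialEq)]
enum Node {
    Instruction(Instruction),
    IfElse(Vec<Node>, Vec<Node>), // (true branch, false branch)
    Repeat(u32, Vec<Node>),
    While(Vec<Node>),
}

// Records which token (if any) closed the block that was just parsed.
#[derive(Debug, PartialEq)]
enum Terminator {
    Else,
    End,
    Eof,
}

fn parse_instruction(tok: &str) -> Result<Instruction, String> {
    if let Some(v) = tok.strip_prefix("push.") {
        let value = v.parse::<u64>().map_err(|_| format!("bad immediate: {}", tok))?;
        return Ok(Instruction::Push(value));
    }
    if let Some(name) = tok.strip_prefix("exec.") {
        return Ok(Instruction::Exec(name.to_string()));
    }
    match tok {
        "add" => Ok(Instruction::Add),
        "mul" => Ok(Instruction::Mul),
        _ => Err(format!("unknown token: {}", tok)),
    }
}

// Parses a block body that must be closed by `end`.
fn parse_closed<'a, I: Iterator<Item = &'a str>>(tokens: &mut I) -> Result<Vec<Node>, String> {
    match parse_block(tokens)? {
        (body, Terminator::End) => Ok(body),
        _ => Err("block must be closed by `end`".to_string()),
    }
}

// Parses nodes until `else`, `end`, or the end of input.
fn parse_block<'a, I: Iterator<Item = &'a str>>(
    tokens: &mut I,
) -> Result<(Vec<Node>, Terminator), String> {
    let mut nodes = Vec::new();
    while let Some(tok) = tokens.next() {
        let node = match tok {
            "else" => return Ok((nodes, Terminator::Else)),
            "end" => return Ok((nodes, Terminator::End)),
            "if.true" => {
                let (then_body, term) = parse_block(tokens)?;
                let else_body = match term {
                    Terminator::End => Vec::new(),
                    Terminator::Else => parse_closed(tokens)?,
                    Terminator::Eof => return Err("unterminated `if.true`".to_string()),
                };
                Node::IfElse(then_body, else_body)
            }
            "while.true" => Node::While(parse_closed(tokens)?),
            _ => match tok.strip_prefix("repeat.") {
                Some(n) => {
                    let count = n.parse::<u32>().map_err(|_| format!("bad count: {}", tok))?;
                    Node::Repeat(count, parse_closed(tokens)?)
                }
                None => Node::Instruction(parse_instruction(tok)?),
            },
        };
        nodes.push(node);
    }
    Ok((nodes, Terminator::Eof))
}

// Entry point: a stray `else` or `end` at top level is an error.
fn parse(source: &str) -> Result<Vec<Node>, String> {
    let mut tokens = source.split_whitespace();
    match parse_block(&mut tokens)? {
        (nodes, Terminator::Eof) => Ok(nodes),
        (_, term) => Err(format!("unexpected {:?} at top level", term)),
    }
}

fn main() {
    let ast = parse("push.1 if.true exec.foo else add end repeat.2 mul end").unwrap();
    println!("{:#?}", ast);
}
```

Keeping the procedure name inside `Exec` matches the idea of storing names in the AST for now. Replacing names with indexes or hashes could then be a separate pass run before serialization.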