-
Notifications
You must be signed in to change notification settings - Fork 244
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[compiler] disentangle IR parser from bindings (#13990)
Currently the binding structure is redundantly specified in two places: Binds.scala, and the parser. We need the binding structure in the parser to propagate the environment, so we can annotate `Ref` nodes (and a few other things) with their types. But we can't use Binds.scala because we don't yet have an IR. This PR removes environment maintenance from the parser by deferring type annotation to a separate pass (which is simple, because it can use the Binds.scala infrastructure). One consequence is that we can't assign types to nodes like `Ref` during parsing, which means we can't ask for the type of any node during parsing, and by extension we can't ask for types of children in IR node constructors. Instead, all typechecking logic is moved to the `TypeCheck` pass. Some benefits of this change: * The parser is simpler, as it doesn't have to maintain a typing environment. * Binds.scala is now the single source of truth on the binding structure of the IR. * Instead of typechecking being split in an ad-hoc way between IR constructors and the `TypeCheck` pass, all typechecking and type error reporting logic is in one place. * The parser parses a context-free grammar, no more and no less. If the input is gramatically correct, the parser succeeds. * We can round trip IR with type errors through the text representation. For instance, if we log an IR that fails TypeCheck, we can copy the IR from the log, parse it, then debug. This change was motivated by my work in progress to convert the parser to use the SSA grammar, which this should greatly simplify. I chose to make the type annotation pass after parsing mutate the IR in place (with the unfortunate exception of `Apply`, which can change into an `ApplyIR` or `ApplySpecial`. Do these really need to be separate nodes?). The type of a `Ref` node was already mutable to allow this sort of deferred annotation, and I've had to make a few other things mutable as well. Alternatively we could rebuild the entire IR to include type annotations, but I think the mutation is sufficiently well localized, and this is a common compiler pattern. This touches a lot of lines, but the high level changes are: * Remove the `refMap` from the `IRParserEnvironment`, and remove all code that modifies the typing environment from the parser. Nodes like `Ref` that need a type from the environment get types set to `null`, to be filled in after parsing. * Add `annotateTypes` pass to fill in type annotations from the environment. This is currently written in the parser, and always called after parsing. This means for the moment we can only parse type correct IR. But in the future we could move this to a separate stage of the compilation pipeline. * Move typechecking logic on relational nodes from the constructors to a `typecheck` method, which is called from the `TypeCheck` pass. * Make the `typ` field on IR nodes consistently lazy (or a def when the type is a child's type without modification). Before we sometimes did this for performance, but it's now required to avoid querying children's types during construction. * Make types mutable on `AggSignature` and `ComparisonOp`, so they can be filled in after parsing. * Ensure that the structure in `Binds.scala` satisfies the following invariant: to construct the typing environment of child `i`, we only need the types of children with index less than `i`. This was almost always satisfied already, and allows us to use the generic binds infrastucture in the pass to annotate types (where when visiting child `i`, we can only query types of already visited children). * Change the text representation of `TailLoop`/`Recur` to move the explicit type annotation from `Recur` to `TailLoop`. This is necessary to comply with the previous point. It's also consistent with `Ref`, where types of references are inferred from the environment. * Add explicit types in the text representation of `TableFilterIntervals` and `MatrixFilterIntervals`, where the types were needed during parsing and we can no longer get them from child types. * Fix IR examples used in parser tests to be type correct. * Add an explicit return type to the `Apply` node. Before the parser parsed an `Apply` node to one of `Apply/ApplySpecial/ApplyIR`; now it always parses to `Apply`, and the type annotation pass converts to the appropriate specialization, which needs the parsed return type.
- Loading branch information
1 parent
258b26f
commit 9dc3495
Showing
41 changed files
with
967 additions
and
808 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.