Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove procedure cache from the assembler #1411

Merged
merged 3 commits into from
Jul 23, 2024
Merged

Conversation

bobbinth
Copy link
Contributor

@bobbinth bobbinth commented Jul 23, 2024

This PR builds on #1409 and tries to further simplify the assembler by removing ProcedureCache struct.

The PR consists of 3 commits:

  • The first commit just removes the ProcedureCache struct and moves all relevant functionality into the MastForestBuilder struct.
  • The second commit removes ResolvedTarget::Cached variant. The motivation is that it doesn't really matter if the procedure is cached or not - MastForestBuilder will make sure that we cannot create nodes with duplicate MAST roots, and so the same procedure cannot be inserted twice into the forest.
  • The last commit completely removes root tracking from the ModuleGraph. This means that the name resolver does not try to resolve InvocationTarget::MastRoot but just maps it directly to ResolvedTarget::PhantomCall (which I renamed into ResolvedTarget::MastRoot). The idea here is similar to the previous point in that MastForestBuilder has enough information to resolve a MAST root to a correct node in the forest.

I'm pretty sure that the first commit does not break anything, but I'm less sure about the other two commits (especially the 3rd one). So, it is possible to that I've missed some important details (tests are passing, but it is possible that something is not being tested).


Update: the last commit was removed from this PR as we can't test its effects sufficiently yet.

Comment on lines 344 to 359
// This is a phantom procedure - we know its root, but do not have its
// definition
break Err(AssemblyError::Failed {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not really clear to me why we have this error condition here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the find method expects to resolve a FullyQualifiedProcedureName to a GlobalProcedureIndex or return an error if that isn't possible. This is one of the reasons why it is useful to represent basic information about compiled modules/procedures in the graph, since those procedures will get assigned a GlobalProcedureIndex, and can be used to resolve invocations throughout the compilation graph.


// We don't have a cache entry yet, but we do want to make sure we don't have a conflicting
// cache entry with the same MAST root:
if let Some(cached) = self.find_procedure(&proc_root) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to discuss adding a new node type to MAST that represents procedure entry, specifically to ensure that the number of procedure locals is represented in the digest. This also lets us recover essential procedure metadata when loading a library from MAST (such as number of procedure locals).

Copy link
Contributor

@bitwalker bitwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of these changes should be fine. However, without some tests to verify assembly of various combinations of compiled and source-form modules, it is very difficult to say for sure if there are any issues that fall out of this. Unfortunately, we can't really add all of the ones we need until after we have #1401 merged. We could probably add some tests around phantoms though.

One of the risks of conflating procedures we have information about and those we don't (phantoms), is that there are places where we handled phantoms differently (and now treat them like resolved MAST root), and places where we handled a resolved MAST root differently (and will now treat them like a phantom). This may result in things that would have previously compiled successfully, failing to compile; or vice versa. We'll definitely need a good suite of tests to cover the various ways (successfully and unsuccessfully) that you can assemble a mix of compiled and source-form modules, to ensure that the behavior is correct.

kind,
target: target.clone(),
});
}
Ok(ResolvedTarget::Phantom(_)) => (),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should start populating self.invoked for ResolvedTarget::Phantom, since we use that information to construct the set of procedures referenced transitively from any given procedure in the module being visited. Previously, we only did that for ResolvedTarget::Cached, because we only cared about procedures we actually had on hand, but now that all targets resolved to a MAST root will use ResolvedTarget::Phantom, we should track all such references.

Comment on lines 344 to 359
// This is a phantom procedure - we know its root, but do not have its
// definition
break Err(AssemblyError::Failed {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the find method expects to resolve a FullyQualifiedProcedureName to a GlobalProcedureIndex or return an error if that isn't possible. This is one of the reasons why it is useful to represent basic information about compiled modules/procedures in the graph, since those procedures will get assigned a GlobalProcedureIndex, and can be used to resolve invocations throughout the compilation graph.

@bobbinth
Copy link
Contributor Author

I think all of these changes should be fine. However, without some tests to verify assembly of various combinations of compiled and source-form modules, it is very difficult to say for sure if there are any issues that fall out of this. Unfortunately, we can't really add all of the ones we need until after we have #1401 merged. We could probably add some tests around phantoms though.

One of the risks of conflating procedures we have information about and those we don't (phantoms), is that there are places where we handled phantoms differently (and now treat them like resolved MAST root), and places where we handled a resolved MAST root differently (and will now treat them like a phantom). This may result in things that would have previously compiled successfully, failing to compile; or vice versa. We'll definitely need a good suite of tests to cover the various ways (successfully and unsuccessfully) that you can assemble a mix of compiled and source-form modules, to ensure that the behavior is correct.

Agreed. What I'll do is roll-back the 3rd commit in this PR and then once we have a more complete implementation of different parts, we can re-apply it in a separate PR.

Base automatically changed from plafer-single-use-assembler to next July 23, 2024 20:49
@bobbinth bobbinth merged commit 51ab7bb into next Jul 23, 2024
9 checks passed
@bobbinth bobbinth deleted the bobbin-remove-proc-cache branch July 23, 2024 21:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants