Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Compile lists and comprehensions #70

Merged
merged 28 commits into from
Jan 16, 2024
Merged

feat: Compile lists and comprehensions #70

merged 28 commits into from
Jan 16, 2024

Conversation

mark-koch
Copy link
Collaborator

@mark-koch mark-koch commented Dec 20, 2023

Compiles list comprehensions into a Hugr via nested Loops and Conditionals

@ss2165 ss2165 added this to the Release v0.1 milestone Jan 10, 2024
Copy link
Contributor

@acl-cqc acl-cqc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally good-to-great. This is quite complex stuff and I don't have any magic solutions to make things simpler! (Tho one suggestion about combining ifs might help a bit. Possibly.) There are some suggestions for wider-scoped followups, but mostly these are quite small.

  • One could add a ConditionalCompiler, or something like that, that knows that inside the conditional you only have Cases and does not have a self.dfg (because the body of the conditional isn't a Dfg)...I'm not sure that helps enough to justify the extra code, however.

But, is there a different approach? I find DesugaredConditional a bit odd - it does a lot of temporary-variable generation in advance and the two-stage approach seems a little hard to follow. Maybe that could be improved by minor tweaks, but just to outline an alternative:

  • Replace DesugaredConditional. Rather than a list of generators each with a list of if's, use a nested structure: each (generator or guard) contains the next (generator or guard) as a field, and you recurse on that, using the visitor pattern to choose between generator and guard. At the end of the list you'd have the return element. Then you'd have a ForComprehensionCompiler that only knows about for (generator), if (guard), and on seeing any other expression delegates to ExprCompiler. Then at the start you call FCC with the outer TreeStructuredGenerator.

I think this might be clearer, but I'm not sure enough that it'd be enough of an improvement (if any) to argue very strongly. But maybe have a think :)

guppy/compiler/expr_compiler.py Show resolved Hide resolved
) -> Iterator[None]:
"""Context manager to build a graph inside a new `DFContainer`.

Automatically updates `self.dfg` and makes the inputs available.
Copy link
Contributor

@acl-cqc acl-cqc Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unusual to have Iterator[None] and yield nothing (normally you'd yield the object that is here self.dfg), but agreed we have self.dfg already and need it everywhere, so ok.

Proposal for future (not this PR) though - is DFContainer the right name? It has a DFContainingNode, which sounds like much the same thing, and in fact the DFContainer doesn't contain anything (the ...Node) does. Maybe (DFG|DSG)(Compiler|Adaptor|Wrapper) (dataflow sibling graph) to emphasize that it's about (helping with) compiling a single graph/level? Then we could call this _in_subgraph or in_context or similar - the point being that contextmanager here doesn't create the containing node, only some book-keeping necessary for compiling into said containing node.

All naming/semantic nits, though. This all looks pretty good :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, the naming is not ideal. I added an issue to rethink this #90

old = self.dfg
inp = self.graph.add_input(parent=node)
new_locals = {
name.id: PortVariable(name.id, inp.add_out_port(get_type(name)), name, None)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paranoia, but is it worth checking inputs are all unique? (Does the order matter, could inputs be a set?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order does matter (for example since it has to be the same for both Cases).

Adding a uniqueness check sounds good 👍

"""
loop = self.graph.add_tail_loop([self.visit(name) for name in inputs], parent)
with self._new_dfcontainer(inputs, loop):
yield
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahaha! Very cool :)

with self._new_dfcontainer(inputs, loop):
yield
# Output the branch predicate and the inputs for the next iteration
self.graph.add_output(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes reasonable sense until you are familiar with Hugr...then I am wondering, how does this get turned into a TupleSum (as the underlying Hugr TailLoop requires)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type checking ensures that branch has type bool, which is turned into the Hugr type Sum((), ())

Copy link
Contributor

@acl-cqc acl-cqc Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I see - right, so suddenly this "obviously works" :-). But (IIUC) - in the comment - the "inputs for the next iteration" are the extra values not in the tuple-sum, so they are also loop exit values. Maybe just call them loop variants or something like that?

# generator
with self._new_case(inputs, inputs, cond):

def compile_ifs(ifs: list[ast.expr]) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there another way here? We already have nested compile_gens in order to get the correct nested/exit behaviour, but how about if there are multiple if's just turning them into a conjunction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I was also wondering about := inside such if guards....)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, short-circuiting expressions and := assignments are not allowed in list comprehensions at all, since we can only compile them as a CFG for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Struggling to figure out what you mean here - by "as a CFG" do you mean "as a DFG"? That would make a lot more sense....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, the "as a CFG" applies to the "short-circuiting expressions" and ":= assignments", not to the "list comprehensions"....(same thing, I'd read it inverted).

So there's no reason one couldn't pass extra variables defined by := into a Conditional, but this does sound quite hard (i.e. certainly not something we'd want to have to get right more than once!); and you'd have to have nested Conditionals.

However, given a single if guard could itself be a shortcircuit (and/or with :=) then I think making a nested CFG (selecting between the true/false arms here, with perhaps extra variables being introduced along the way) would be the ideal. Happy not to do that in this PR, though - just good to comment/document the limitations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also just slightly wondering about how you're handling the variables bound by each for - presumably these are added to the context by an earlier stage in compilation?

yield
# Output the branch predicate and the inputs for the next iteration
self.graph.add_output(
[self.visit(branch), *(self.visit(name) for name in inputs)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note repeated calls to self.visit(name) for name in inputs from a few lines above. Presume that's ok? Does it make sense to common up?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a helper function

def compile_inputs() -> Iterator[OutPortV]:
    return self.visit(name) for name in inputs

but not sure if that makes it clearer? The point is that we have to make a fresh call to self.visit since we are in a new nested context

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. Don't think I'd bother with the helper, but commenting that we need to revisit in the new context would be good

guppy/compiler/expr_compiler.py Outdated Show resolved Hide resolved
guppy/hugr/hugr.py Outdated Show resolved Hide resolved
guppy/hugr/hugr.py Outdated Show resolved Hide resolved
Copy link
Contributor

@acl-cqc acl-cqc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd hit approve at this point if there were tests...I'm slightly nervous of doing so without so what's the situation there?

@mark-koch
Copy link
Collaborator Author

The tests are here: #71

Copy link
Contributor

@acl-cqc acl-cqc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks. One last thing is that it sounds like an if guard that just contains an and (or or) will not be compiled - that is a bit of a limitation, we should doc that somewhere. (and you could compile to another if, but not or so I don't think I'd bother)

@mark-koch
Copy link
Collaborator Author

That is a bit of a limitation, we should doc that somewhere.

Created issue #98 for this.

and you could compile to another if, but not or so I don't think I'd bother

Agreed

@mark-koch mark-koch self-assigned this Jan 16, 2024
Base automatically changed from lists/check to feat/lists January 16, 2024 13:02
@mark-koch mark-koch merged commit 270ea9e into feat/lists Jan 16, 2024
1 check passed
@mark-koch mark-koch deleted the lists/compile branch January 16, 2024 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants