Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(Inference): Work harder in variable instantiation #591

Merged
merged 22 commits into from
Nov 8, 2023

Conversation

croyzor
Copy link
Contributor

@croyzor croyzor commented Oct 5, 2023

Extend the instantiate_variables method in extension inference to handle the case where we end up with variables which form loops of Plus constraints with other metavariables. This is required to get the rest of the test variants in test_cfg_loops to work

Resolves #598

@croyzor croyzor marked this pull request as ready for review October 9, 2023 14:22
src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved
while new_scope_size > scope_size {
scope_size = new_scope_size;
for m in variable_scope.clone().iter() {
for c in self.get_constraints(m).unwrap().iter() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it's not clear to me from this what happens when we call self.get_constraints(m) for an m that we've seen before (on a previous iteration of the outermost loop). Will we get anything back? If so, is there any chance that we'll actually get a new thing to add to variable_scope? Or is it only the new/never-seen-before values of m that lead to finding new things? The latter would allow us to rewrite this loop much more simply but I don't know whether it's true....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's the latter - if we process an m we've seen before, we'll see all the same constraints and redundantly add them to the variable_scope set

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so if each iteration made a set of new stuff to process in the next iteration, you could do away with the sizes and just have two sets - something like

let mut seen_variable_scope = HashSet::new();
let mut new_variables = self.variables.clone();
while !new_variables.empty() {
   let mut newer_variables = HashSet::new();
   for m in new_variables.iter() {
      ....add to newer_variables....
   }
   seen_variable_scope.extend(new_variables);
   new_variables = newer_variables
}

Moreover, this'll change your algorithm from O(n^2) to something closer to linear, so....

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or have a Set (processed and finished with) and a VecDeque (queue), and add to the queue only if not previously in the set (if seen.insert(n) { queue.push(n); } sort of thing) - then you can do while let Some(m) = queue.pop() which is pretty neat

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone for the former

src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved

// Strongly connected components are looping constraint dependencies.
// This means that each metavariable in the CC has the same solution.
for cc in relations.ccs() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing that bothers me a bit here is that we don't have much of a guarantee that relations.ccs() contains every node that we didn't add a solution for earlier. (That guarantee comes from the idea that only if there was a cyclic dependency would we have failed to find a solution earlier, so every node added to relations must be in a cycle, but still.)

One possible answer?? Don't mutate via self.add_solution but return a complete Map. Then you can assert that the map has the right number of entries (same as variable_scope.len()). Then you can add that Map to self.solutions all in one go (after debug_assert!ing that the keyset is disjoint, as per fn add_solution)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(There are other ways, you could record the ms for which the first loop failed to find a solution, then tick them off here. But that'd be more expensive when debug_assert is disabled)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add an assert that the number of nodes in relations.css() (flattened) is the same as the length of variable_scope?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd need to filter out the number of those that were already in self.solved (there's an if !self.solved.contains)...

But the flattening idea is good - how about just that the number of nodes in flattened-relations.css() is equal to the number of edges in relations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, some edges share endpoints. This works:

Suggested change
for cc in relations.ccs() {
assert_eq!(relations.node_map.len(), relations.ccs().iter().map(Vec::len).sum::<usize>());

Common up the two relations.ccs tho!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that check really makes sense, since it's apparent that the size of node_map and the number of nodes in the graph will be the same from looking at GraphContainer::add_or_retrieve, and sccs will return a clustered version of every node in the graph

Copy link
Contributor

@acl-cqc acl-cqc Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed size of node map equals number of nodes in the graph, yes the latter is what I meant.

But SCC = Strongly Connected Component, i.e. every node reaches every other? In a directed graph? Ah, are ccs undirectly-connected?

So A -> B -> C -> A is one SCC, with all three nodes. But A -> B -> C is NOT an SCC by that definition, but are you saying that .ccs() would give you [[A],[B],[C]]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed it does. In which case yeah, no assert/check really seems necessary, we're only really protecting against somehow getting in a muddle ourselves between the two loops.

Does make me wonder whether we could just put everything through the relations map, then. A node with no connections will trivially (and cheaply) be its own SCC. So this above:

if unresolved_metas,is_empty() {
   self.add_solution(m, solution.clone())
} else {
  unresolved_metas.iter().for_each(.....)
}

could just be

relations.add_or_retrieve(m);
unresolved_metas.iter().for_each(....)

and then all the add_solution calls would be in the second loop.

(ExtensionSet::new(), ExtensionSet::new(), just_a.clone()),
];

let mut variants = Vec::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahaha! Neat :-) You might consider flat_map (*3) to avoid mut but like it either way :)

src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved
src/extension/infer.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@acl-cqc acl-cqc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Craig - looks good to me, some minor suggestions

combined_solution = combined_solution.union(sol);
}
for m in cc.iter() {
self.add_solution(*m, combined_solution.clone());
Copy link
Contributor

@acl-cqc acl-cqc Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to add to solutions here do we? What if we had, say, two SCCs with an edge between them:

A -> B -> C -> A
D -> E -> F -> D
A -> D

I note Petgraph's tarjan_scc returns the components in a defined order (reverse topsort or something), so in theory (maybe you have to reverse cc) it's soluble, but I think you need to consider not just solutions.get(m) for each m in the SCC but the solutions to the constraints upon m (which by that topsort ordering have already been computed).

Or, maybe such a structure can't occur, I dunno....????

Copy link
Contributor

@acl-cqc acl-cqc Oct 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try branch inference-variable/fix-dependent-sccs (the last commit) - some uncertainties detailed in the comments there, and needs a test of a structure such as the ABCDEF above.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops, realize that wasn't passing tests! A missed self.resolve should make it pass now.

@acl-cqc acl-cqc self-requested a review October 13, 2023 08:24
@acl-cqc
Copy link
Contributor

acl-cqc commented Oct 23, 2023

Also, I have been wondering. Why is it sufficient to do this in instantiate_variables?

Firstly, shouldn't we include Metas 'solved' by finding they are in cycles, in the solution rather than the closure? But secondly, is it actually sufficient. Or, should we detect cycles of plus constraints earlier - perhaps before the first main_loop - and then convert them to equality constraints?

@croyzor croyzor added this pull request to the merge queue Nov 8, 2023
Merged via the queue into main with commit 2269161 Nov 8, 2023
8 checks passed
@croyzor croyzor deleted the fix/inference-variable branch November 8, 2023 09:37
@acl-cqc acl-cqc mentioned this pull request Nov 8, 2023
@github-actions github-actions bot mentioned this pull request Jan 3, 2024
github-merge-queue bot pushed a commit that referenced this pull request Jan 15, 2024
## 🤖 New release
* `quantinuum-hugr`: 0.1.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## 0.1.0 (2024-01-15)

### Bug Fixes

- Subgraph boundaries with copies
([#440](#440))
- [**breaking**] Use internal tag for SumType enum serialisation
([#462](#462))
- Check kind before unwrap in insert_identity
([#475](#475))
- Allow for variables to get solved in inference
([#478](#478))
- In IdentityInsertion add noop to correct parent
([#477](#477))
- Failing release tests ([#485](#485))
- [**breaking**] Serialise `Value`, `PrimValue`, and `TypeArg` with
internal tags ([#496](#496))
- Serialise custom constants with internal tag
([#502](#502))
- [**breaking**] Reduce max int width in arithmetic extension to 64
([#504](#504))
- HugrView::get_function_type
([#507](#507))
- TODO in resolve_extension_ops: copy across input_extensions
([#510](#510))
- Use given input extensions in `define_function`
([#524](#524))
- Lessen requirements for hugrs in outline_cfg
([#528](#528))
- Make unification logic less strict
([#538](#538))
- Simple replace incorrectly copying metadata
([#545](#545))
- Account for self-referencial constraints
([#551](#551))
- Consider shunted metas when comparing equality
([#555](#555))
- Join labels in issue workflow
([#563](#563))
- Comment out broken priority code
([#562](#562))
- Handling of issues with no priority label
([#573](#573))
- Don't insert temporary wires when extracting a subgraph
([#582](#582))
- Improve convexity checking and fix test
([#585](#585))
- Ignore input->output links in
SiblingSubgraph::try_new_dataflow_subgraph
([#589](#589))
- Enforce covariance of SiblingMut::RootHandle
([#594](#594))
- Erratic stack overflow in infer.rs (live_var)
([#638](#638))
- Work harder in variable instantiation
([#591](#591))
- Actually add the error type to prelude
([#672](#672))
- Serialise dynamically computed opaqueOp signatures
([#690](#690))
- FuncDefns don't require that their extensions match their children
([#688](#688))
- Binary compute_signature returning a PolyFuncType with binders
([#710](#710))
- Use correct number of args for int ops
([#723](#723))
- [**breaking**] Add serde tag to TypeParam enum
([#722](#722))
- Allow widening and narrowing to same width.
([#735](#735))
- Case node should not have an external signature
([#749](#749))
- [**breaking**] Normalize input/output value/static/other ports in
`OpType` ([#783](#783))
- No dataflow_signature for block types
([#792](#792))
- Ignore unsupported test in miri
([#794](#794))
- Include schema rather than read file
([#807](#807))

### Documentation

- Add operation constraint to quantum extension
([#543](#543))
- Coverage check section in DEVELOPMENT.md
([#648](#648))
- Remove "quantum extension" from HUGR spec.
([#694](#694))
- Improve crate-level docs, including example code.
([#698](#698))
- Spec cleanups and clarifications
([#742](#742))
- Spec clarifications ([#738](#738))
- Spec updates ([#741](#741))
- [spec] Remove references to causal cone and Order edges from Input
([#762](#762))
- Mention experimental inference in readme
([#800](#800))
- Collection of spec updates for 0.1
([#801](#801))
- Add schema v0 ([#805](#805))
- Update spec wrt. polymorphism
([#791](#791))

### Features

- Derive things for builder structs
([#229](#229))
- Return a slice of types from the signature
([#238](#238))
- Move `dot_string` to `HugrView`
([#271](#271))
- [**breaking**] Change `TypeParam::USize` to `TypeParam::BoundedNat`
and use in int extensions
([#445](#445))
- TKET2 compatibility requirements
([#450](#450))
- Split methods between `HugrMut` and `HugrMutInternals`
([#441](#441))
- Add `HugrView::node_connections` to get all links between nodes
([#460](#460))
- Location information in extension inference error
([#464](#464))
- Add T, Tdg, X gates ([#466](#466))
- [**breaking**] Add `ApplyResult` associated type to `Rewrite`
([#472](#472))
- Implement rewrite `IdentityInsertion`
([#474](#474))
- Instantiate inferred extensions
([#461](#461))
- Default DFG builders to open extension sets
([#473](#473))
- Some helper methods ([#482](#482))
- Extension inference for conditional nodes
([#465](#465))
- Add ConvexChecker ([#487](#487))
- Add clippy lint for mut calls in a debug_assert
([#488](#488))
- Default more builder methods to open extension sets
([#492](#492))
- Port is serializable ([#489](#489))
- More general portgraph references
([#494](#494))
- Add extension deltas to CFG ops
([#503](#503))
- Implement petgraph traits on Hugr
([#498](#498))
- Make extension delta a parameter of CFG builders
([#514](#514))
- [**breaking**] SiblingSubgraph does not borrow Hugr
([#515](#515))
- Validate TypeArgs to ExtensionOp
([#509](#509))
- Derive Debug & Clone for `ExtensionRegistry`.
([#530](#530))
- Const nodes are built with some extension requirements
([#527](#527))
- Some python errors and bindings
([#533](#533))
- Insert_hugr/insert_view return node map
([#535](#535))
- Add `SiblingSubgraph::try_from_nodes_with_checker`
([#547](#547))
- PortIndex trait for undirected port parameters
([#553](#553))
- Insert/extract subgraphs from a HugrView
([#552](#552))
- Add `new_array` operation to prelude
([#544](#544))
- Add a `DataflowParentID` node handle
([#559](#559))
- Make TypeEnum and it's contents public
([#558](#558))
- Optional direction check when querying a port index
([#566](#566))
- Extension inference for CFGs
([#529](#529))
- Better subgraph verification errors
([#587](#587))
- Compute affected nodes for `SimpleReplacement`
([#600](#600))
- Move `SimpleReplace::invalidation_set` to the `Rewrite` trait
([#602](#602))
- [**breaking**] Resolve extension ops (mutating Hugr) in
(infer_and_->)update_validate
([#603](#603))
- Add accessors for node index and const values
([#605](#605))
- [**breaking**] Expose the value of ConstUsize
([#621](#621))
- Nicer debug and display for core types
([#628](#628))
- [**breaking**] Static checking of Port direction
([#614](#614))
- Builder and HugrMut add_op_xxx default to open extensions
([#622](#622))
- [**breaking**] Add panicking integer division ops
([#625](#625))
- Add hashable `Angle` type to Quantum extension
([#608](#608))
- [**breaking**] Remove "rotations" extension.
([#645](#645))
- Port.as_directed to match on either direction
([#647](#647))
- Impl GraphRef for PetgraphWrapper
([#651](#651))
- Provide+implement Replace API
([#613](#613))
- Require the node's metadata to always be a Map
([#661](#661))
- [**breaking**] Polymorphic function types (inc OpDefs) using dyn trait
([#630](#630))
- Make prelude error type public
([#669](#669))
- Shorthand for retrieving custom constants from `Const`, `Value`
([#679](#679))
- [**breaking**] HugrView API improvements
([#680](#680))
- Make FuncDecl/FuncDefn polymorphic
([#692](#692))
- [**breaking**] Simplify `SignatureFunc` and add custom arg validation.
([#706](#706))
- [**breaking**] Drop the `pyo3` feature
([#717](#717))
- [**breaking**] `OpEnum` trait for common opdef functionality
([#721](#721))
- MakeRegisteredOp trait for easier registration
([#726](#726))
- Getter for `PolyFuncType::body`
([#727](#727))
- `Into<OpType>` for custom ops
([#731](#731))
- Always require a signature in `OpaqueOp`
([#732](#732))
- Values (and hence Consts) know their extensions
([#733](#733))
- [**breaking**] Use ConvexChecker trait
([#740](#740))
- Custom const for ERROR_TYPE
([#756](#756))
- Implement RemoveConst and RemoveConstIgnore
([#757](#757))
- Constant folding implemented for core and float extension
([#769](#769))
- Constant folding for arithmetic conversion operations
([#720](#720))
- DataflowParent trait for getting inner signatures
([#782](#782))
- Constant folding for logic extension
([#793](#793))
- Constant folding for list operations
([#795](#795))
- Add panic op to prelude
([#802](#802))
- Const::from_bool function
([#803](#803))

### HugrMut

- Validate nodes for set_metadata/get_metadata_mut, too
([#531](#531))

### HugrView

- Validate nodes, and remove Base
([#523](#523))

### Miscellaneous Tasks

- [**breaking**] Update portgraph 0.10 and pyo3 0.20
([#612](#612))
- [**breaking**] Hike MSRV to 1.75
([#761](#761))

### Performance

- Use lazy static definittion for prelude registry
([#481](#481))

### Refactor

- Move `rewrite` inside `hugr`, `Rewrite` -> `Replace` implementing new
'Rewrite' trait ([#119](#119))
- Use an excluded upper bound instead of max log width.
([#451](#451))
- Add extension info to `Conditional` and `Case`
([#463](#463))
- `ExtensionSolution` only consists of input extensions
([#480](#480))
- Remove builder from more DFG tests
([#490](#490))
- Separate hierarchy views
([#500](#500))
- [**breaking**] Use named struct for float constants
([#505](#505))
- Allow NodeType::new to take any Into<Option<ExtensionSet>>
([#511](#511))
- Move apply_rewrite from Hugr to HugrMut
([#519](#519))
- Use SiblingSubgraph in SimpleReplacement
([#517](#517))
- CFG takes a FunctionType
([#532](#532))
- Remove check_custom_impl by inlining into check_custom
([#604](#604))
- Insert_subgraph just return HashMap, make InsertionResult new_root
compulsory ([#609](#609))
- [**breaking**] Rename predicate to TupleSum/UnitSum
([#557](#557))
- Simplify infer.rs/report_mismatch using early return
([#615](#615))
- Move the core types to their own module
([#627](#627))
- Change &NodeType->&OpType conversion into op() accessor
([#623](#623))
- Infer.rs 'fn results' ([#631](#631))
- Be safe ([#637](#637))
- NodeType constructors, adding new_auto
([#635](#635))
- Constraint::Plus stores an ExtensionSet, which is a BTreeSet
([#636](#636))
- [**breaking**] Remove `SignatureDescription`
([#644](#644))
- [**breaking**] Remove add_op_<posn> by generalizing add_node_<posn>
with "impl Into" ([#642](#642))
- Rename accidentally-changed Extension::add_node_xxx back to add_op
([#659](#659))
- [**breaking**] Remove quantum extension
([#670](#670))
- Use type schemes in extension definitions wherever possible
([#678](#678))
- [**breaking**] Flatten `Prim(Type/Value)` in to parent enum
([#685](#685))
- [**breaking**] Rename `new_linear()` to `new_endo()`.
([#697](#697))
- Replace NodeType::signature() with io_extensions()
([#700](#700))
- Validate ExtensionRegistry when built, not as we build it
([#701](#701))
- [**breaking**] One way to add_op to extension
([#704](#704))
- Remove Signature struct
([#714](#714))
- Use `MakeOpDef` for int_ops
([#724](#724))
- [**breaking**] Use enum op traits for floats + conversions
([#755](#755))
- Avoid dynamic dispatch for non-folding operations
([#770](#770))
- Simplify removeconstignore verify
([#768](#768))
- [**breaking**] Unwrap BasicBlock enum
([#781](#781))
- Make clear const folding only for leaf ops
([#785](#785))
- [**breaking**] `s/RemoveConstIgnore/RemoveLoadConstant`
([#789](#789))
- Put extension inference behind a feature gate
([#786](#786))

### SerSimpleType

- Use Vec not TypeRow ([#381](#381))

### SimpleReplace+OutlineCfg

- Use HugrMut methods rather than .hierarchy/.graph
([#280](#280))

### Testing

- Update inference test to not use DFG builder
([#550](#550))
- Strengthen "failing_sccs_test", rename to "sccs" as it's not failing!
([#660](#660))
- [**breaking**] Improve coverage in signature and validate
([#643](#643))
- Use insta snapshots to add dot_string coverage
([#682](#682))
- Miri ignore file-opening test
([#684](#684))
- Unify the serialisation tests
([#730](#730))
- Add schema validation to roundtrips
([#806](#806))

### `ConstValue

- :F64` and `OpaqueOp::new`
([#206](#206))

### Cleanup

- Remove outdated comment
([#536](#536))

### Cosmetic

- Format + remove stray TODO
([#444](#444))

### Doc

- Crate name as README title + add reexport
([#199](#199))

### S/EdgeKind

- :Const/EdgeKind::Static/
([#201](#201))

### Simple_replace.rs

- Use HugrMut::remove_node, includes clearing op_types
([#242](#242))

### Spec

- Remove "Draft 3" from title of spec document.
([#590](#590))
- Rephrase confusing paragraph about TailLoop inputs/outputs
([#567](#567))

### Src/ops/validate.rs

- Common-up some type row calculations
([#254](#254))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/MarcoIeni/release-plz/).

---------

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Agustín Borgna <121866228+aborgna-q@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix variable instantiation in extension inference
2 participants