Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rule graph ambiguity can be triggered in situations that should be non-ambiguous due to Gets #7710

Closed
stuhood opened this issue May 13, 2019 · 5 comments
Assignees
Labels

Comments

@stuhood
Copy link
Member

stuhood commented May 13, 2019

#7379 introduced a constraint that only @rule subgraphs that consume a Get parameter are eligible for use: this constraint continues to be useful, but it is not being applied early enough in rule graph construction to eliminate some ambiguities.

Trivially applying the constraint earlier (in choose_dependencies rather than construct_dependencies) mostly works, but results in cyclic rules failing to determine that they actually consume the dependency that they provide to themselves. Fixing this might involve cleaning up how cyclic rules are handled.

@stuhood stuhood added the engine label May 13, 2019
@stuhood stuhood self-assigned this May 13, 2019
stuhood added a commit that referenced this issue May 22, 2019
### Problem

The `RuleGraph` is relatively intertwined with the rest of the engine and has dependencies on the python code and the externs. This makes it more challenging to test, and its interface less clear. 

### Solution

Extract implementations of `Rule` and `DependencyKey` from the `RuleGraph`, then move `RuleGraph` to its own crate and parameterize it on a `R: Rule` type.

### Result

Adding tests for #7710 should be more straightforward.
@stuhood
Copy link
Member Author

stuhood commented May 24, 2019

The RuleGraph extraction has completed, and the first lower level tests are in place. This should be unblocked, but it's still only a background task at the moment.

@stuhood
Copy link
Member Author

stuhood commented Sep 24, 2019

FTR: I've been working on this one for the last two days: made some progress on a refactor today that will hopefully make the solution a bit clearer.

@stuhood
Copy link
Member Author

stuhood commented Sep 30, 2019

Spent some more time fighting with this one. The refactor(s) made it slightly easier to reason about, although I'm not sure I'll keep them.

I have a reasonably good understanding of what is happening here. It's roughly that when a mutually-recursive cycle is created in the @rule graph (such as back and forth between FilesContent and Digest), the effect is that that cycle ends up sealed off as an infinite cycle between the two rules without any input parameters. And because the cycle has no input parameters, it 1) is the "cheapest" way to compute something according to our current heuristic of fewest-Params-wins, 2) doesn't satisfy the constraint that a subgraph must consume the parameter that is provided by a Get.

So the two problems are roughly:

  1. There is a bug, in that in cases where we could reasonably consume a Param (which would satisfy the existing constraint that the Param provided by a Get must be consumed), monomorphization will not choose to generate a choice of Entry that actually consumes that Param, and so that subgraph will never be chosen.
  2. To fix that bug, a choice also needs to be made about the semantics of the Get constraint: namely, whether the constraint applies transitively in the subgraph (ie, the Get's Param becomes the only source of that type in the subgraph), or "exactly once" (if one rule in the subgraph uses it, nothing else is required to). The former seems more understandable, and likely easier to implement.

@stuhood
Copy link
Member Author

stuhood commented Jan 10, 2020

I haven't forgotten about this one... just need to find some time to lean in on it.

stuhood added a commit that referenced this issue Aug 24, 2020
### Problem

Rule graph construction time became exponential worse recently as more rules were added to a particular cluster in the graph: in particular, the set of rules around target construction. See #10504.

Additionally, rule graph construction error reporting had become unmanageable with existing recursive rule graph construction algorithm: failing to construct a rule graph meant running post-facto logic to attempt to discern what had happened by re-constructing a graph of likely errors, which was slow and error prone. See #10440.

Attempting to solve these two issues in the existing recursive rule graph algorithm proved to be a dead end: the logic that it used to attempt to deal with cycles was beginning to fail to deal with more complicated cyclic structures, and was contributing to the total runtime by backtracking out of cycles. 

### Solution

Move to constructing a `RuleGraph` using data flow analysis on the rule call graph. Additionally, replace `RootRule` definitions (which were used to "plug" holes in the rule graph in places where external inputs were expected, but which were in practice sprinkled everywhere someone thought they might be necessary) with `QueryRule` definitions, which correspond almost* 1:1 with `Schedule.product_request` callsites.

The new construction algorithm is broken up into phases:

1. Building a polymorphic graph, where all sources of a particular dependency are included, but without regard for which types are available at a callsite. This phase will fail fast if no `Query` or `Get` anywhere in the graph provides a particular `Param` type.
2. Run [live variable analysis](https://en.wikipedia.org/wiki/Live_variable_analysis) on the polymorphic graph to compute the initial set of `Params` used by each node in the graph. During this phase, each node in the graph has a reference to all possible sources of a particular type, and so the computed set is very conservative (ie, overly large). 
3. Monomorphize the polymorphic graph by using the liveness sets to partition nodes (and their dependees) for each valid combination of their dependencies. Unfortunately, this phase remains the most complicated component: while it is implemented using an algorithm similar to live variable analysis, the fit isn't perfect, and so there is a special case to break out of fruitless loops: see the "splits" TODO in `fn monomorphize`.
4. Choose the best dependencies via in/out sets, and prune unambiguous choices. Once the monomorphic graph has converged, each node in the graph will ideally have exactly one source of each dependency: in cases where a node has more, it is because given a particular set of input `Params`, there was more than one way to compute a dependency (ie, the `Params` were ambiguous). This phase is the second phase that renders errors: it does so by walking nodes that were marked deleted during `monomorphize` to attempt to find the root cause of a particular failure.
5. Once the graph is known to be valid, generate the final `RuleGraph` for all rules reachable from queries.

### Result

RuleGraph construction is faster in almost all cases, so this fixes #10504. #7710 is fixed because pruning is applied lazily. But unfortunately codegen still contributes to a cluster of rules that takes a very long time to converge in the `monomorphize` phase, and it eventually fails (the "splits" workaround mentioned above likely contributes to this). To unblock the common case speedup, we will temporarily disable protobuf: see #10683.

Using this new foundation, error messages have become slightly better: #10440 is fixed and #8862 is fixed. In other places (in particular the codegen case above) they are roughly equivalent to before. #10649 should be a straightforward followup to re-enable rule reachability checks.

Finally, `QueryRule` replaces `RootRule`, and has useful errors to explain its own usage in most cases. Thanks a ton to @Eric-Arellano for his help in applying that portion of the change!
@stuhood
Copy link
Member Author

stuhood commented Aug 24, 2020

Fixed by #10645.

@stuhood stuhood closed this as completed Aug 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant