-
-
Notifications
You must be signed in to change notification settings - Fork 632
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rule graph ambiguity can be triggered in situations that should be non-ambiguous due to Gets #7710
Comments
### Problem The `RuleGraph` is relatively intertwined with the rest of the engine and has dependencies on the python code and the externs. This makes it more challenging to test, and its interface less clear. ### Solution Extract implementations of `Rule` and `DependencyKey` from the `RuleGraph`, then move `RuleGraph` to its own crate and parameterize it on a `R: Rule` type. ### Result Adding tests for #7710 should be more straightforward.
The |
FTR: I've been working on this one for the last two days: made some progress on a refactor today that will hopefully make the solution a bit clearer. |
Spent some more time fighting with this one. The refactor(s) made it slightly easier to reason about, although I'm not sure I'll keep them. I have a reasonably good understanding of what is happening here. It's roughly that when a mutually-recursive cycle is created in the So the two problems are roughly:
|
I haven't forgotten about this one... just need to find some time to lean in on it. |
### Problem Rule graph construction time became exponential worse recently as more rules were added to a particular cluster in the graph: in particular, the set of rules around target construction. See #10504. Additionally, rule graph construction error reporting had become unmanageable with existing recursive rule graph construction algorithm: failing to construct a rule graph meant running post-facto logic to attempt to discern what had happened by re-constructing a graph of likely errors, which was slow and error prone. See #10440. Attempting to solve these two issues in the existing recursive rule graph algorithm proved to be a dead end: the logic that it used to attempt to deal with cycles was beginning to fail to deal with more complicated cyclic structures, and was contributing to the total runtime by backtracking out of cycles. ### Solution Move to constructing a `RuleGraph` using data flow analysis on the rule call graph. Additionally, replace `RootRule` definitions (which were used to "plug" holes in the rule graph in places where external inputs were expected, but which were in practice sprinkled everywhere someone thought they might be necessary) with `QueryRule` definitions, which correspond almost* 1:1 with `Schedule.product_request` callsites. The new construction algorithm is broken up into phases: 1. Building a polymorphic graph, where all sources of a particular dependency are included, but without regard for which types are available at a callsite. This phase will fail fast if no `Query` or `Get` anywhere in the graph provides a particular `Param` type. 2. Run [live variable analysis](https://en.wikipedia.org/wiki/Live_variable_analysis) on the polymorphic graph to compute the initial set of `Params` used by each node in the graph. During this phase, each node in the graph has a reference to all possible sources of a particular type, and so the computed set is very conservative (ie, overly large). 3. Monomorphize the polymorphic graph by using the liveness sets to partition nodes (and their dependees) for each valid combination of their dependencies. Unfortunately, this phase remains the most complicated component: while it is implemented using an algorithm similar to live variable analysis, the fit isn't perfect, and so there is a special case to break out of fruitless loops: see the "splits" TODO in `fn monomorphize`. 4. Choose the best dependencies via in/out sets, and prune unambiguous choices. Once the monomorphic graph has converged, each node in the graph will ideally have exactly one source of each dependency: in cases where a node has more, it is because given a particular set of input `Params`, there was more than one way to compute a dependency (ie, the `Params` were ambiguous). This phase is the second phase that renders errors: it does so by walking nodes that were marked deleted during `monomorphize` to attempt to find the root cause of a particular failure. 5. Once the graph is known to be valid, generate the final `RuleGraph` for all rules reachable from queries. ### Result RuleGraph construction is faster in almost all cases, so this fixes #10504. #7710 is fixed because pruning is applied lazily. But unfortunately codegen still contributes to a cluster of rules that takes a very long time to converge in the `monomorphize` phase, and it eventually fails (the "splits" workaround mentioned above likely contributes to this). To unblock the common case speedup, we will temporarily disable protobuf: see #10683. Using this new foundation, error messages have become slightly better: #10440 is fixed and #8862 is fixed. In other places (in particular the codegen case above) they are roughly equivalent to before. #10649 should be a straightforward followup to re-enable rule reachability checks. Finally, `QueryRule` replaces `RootRule`, and has useful errors to explain its own usage in most cases. Thanks a ton to @Eric-Arellano for his help in applying that portion of the change!
Fixed by #10645. |
#7379 introduced a constraint that only
@rule
subgraphs that consume aGet
parameter are eligible for use: this constraint continues to be useful, but it is not being applied early enough in rule graph construction to eliminate some ambiguities.Trivially applying the constraint earlier (in
choose_dependencies
rather thanconstruct_dependencies
) mostly works, but results in cyclic rules failing to determine that they actually consume the dependency that they provide to themselves. Fixing this might involve cleaning up how cyclic rules are handled.The text was updated successfully, but these errors were encountered: