rust-lang · firefighterduck · Feb 2, 2021 · Feb 2, 2021 · Feb 4, 2021 · jackh726
diff --git a/book/src/recursive/coinduction.md b/book/src/recursive/coinduction.md
@@ -1,3 +1,80 @@
 # Coinduction
 
-TBD
+This sub-chapter is meant to describe the current handling of coinductive goals in the recursive solver rather than providing an extensive overview over the theoretical backgrounds and ideas.
+It follows the description in [this GitHub comment](https://github.com/rust-lang/chalk/issues/399#issuecomment-643420016) and the Zulip topic linked there.
+
+## General Idea
+The general idea for the handling of coinductive cycles in the recursive solver is to start by assuming the goal is provable and then try to find evidence that it is not.
+This search for a disproof is done by the standard recursive solving process described in the sub-chapters before.
+
+Albeit this approach would allow for the handling of mixed inductive/co-inductive cycles, these are actually handled as errors to prevent propagation of the assumed provability outside of the coinductive cycle.
+This propagation of the assumed solution might also happen in pure coinductive cycles and can potentially lead to false positives.
+
+## Prevention of False Positives
+The problem of false positives propagated outside of the coinductive cycle is also described in the [Coinduction chapter](../engine/logic/coinduction.md) for the SLG solver alongside the rather complex handling used with it.
+
+### The Problem
+The problem arises if a solution that is purely based on the positive starting value for the coinductive cycle is cached and as such propagated to other goals that are possibly reliant on this. An example may look like this (cf. the test case `coinduction::coinductive_unsound1`):
+
+```notrust
+C :- C1.
+C :- C2
+C1 :- C2, C3.
+C2 :- C1.
+```
+
+Here `C` may be proved by either showing `C1` or `C2`.
+Assuming the solver starts evaluating the branch with `C1` first, it then recursively tries to prove `C2` and `C3`.
+For proving `C2` it needs to show `C1` again, the coinductive cycle becomes evident.
+Therefore, `C1` is assumed to be provable and the solver proves `C2` with this information.
+Assuming, the solve does not handle this case specifically, the solution for `C2` is cached. 
+Now it tries solving `C3` but fails due to the lack of information about it.
+As such, `C1` can also not be proven for this program.
+The recursive solver will now attempt to prove the initial goal `C` by solving `C2`.
+Unfortunately, it finds the invalidly cached solution and returns it as proof for `C`.
+
+By visualizing this path of computation, it becomes evident, where the problem lies:
+* Start proving `C` with `C1`:
+    * For `C1` prove `C2` and `C3`:
+        * For `C2` prove `C1`:
+            * This is a coinductive cycle. Assume that `C1` holds.
+        * Thus `C2` also holds. Store this result about `C2` in the cache.
+        * There is no way to prove `C3`. Lift this failure up.
+    * Due to the failure of `C3` there is also no solution for `C1`.
+* Try proving `C` with `C2`:
+    * Find the cached result that `C2` has a solution and return it as the solution for `C`.
+* Stop with the invalid result for `C`.
+
+### The Solution
+The above example should make it evident that the caching of found solutions in coinductive cycles can lead to false positives and should therefore be prevented.
+This can be achieved by delaying the caching of all results inside the coinductive cycle until it is clear whether the start of the cycle (i.e. `C1` in the example above) is provable.
+If the start of the cycle can be proven by the results of the cycle and related subgoals then the assumption about it was correct and thus all results for goals inside the cycle are also valid.
+If, however, the start of the cycle can not be proved, i.e. the initial assumption was false, then a subset of the found solutions for the coinductive cycle may be invalid (i.e. the solution for `C2` in the example).
+This subset can (probably) only consist of positive results, i.e. derived solutions that could reference the assumption.
+
+Negative results should not be influenced by the positive assumption as refutability can not be followed from provability.
+As a result, it is sound to over-approximate and remove all positive solutions if the start of the cycle is disproved.
+Therefore, the recursive solver delays caching by relating all solutions in the cycle too its start (i.e. by adjusting the minimum in this subtree) and removes all positive results inside the coinductive cycle if and only if the start is disproven.
+The only downside of this approach is the late caching which might lead to increased work for shared subgoals in the same proof sub-tree as the coinductive cycle.
+
+With this procedure, the example is handled as follows:
+* Start proving `C` with `C1`:
+    * For `C1` prove `C2` and `C3`:
+        * For `C2` prove `C1`:
+            * This is a coinductive cycle. Assume that `C1` holds.
+        * Thus `C2` also holds. Delay the caching of the result about `C2`.
+        * There is no way to prove `C3`. Lift this failure up.
+    * Due to the failure of `C3` there is also no solution for `C1`. Cache this result as well as the failure of `C3` but delete everything about `C2`.
+* Start proving `C` with `C2`:
+    * For `C2` prove `C1`:
+        * Find the cached result that `C1` is disproved.
+    * Follow that `C2` is also disproved and cache this result.
+* Stop with the valid disproof of `C`.
+
+### Other Concerns
+Another possible concern is related to the [search graph](./search_graph.md).
+As the solutions of the coinductive cycle are only moved to the cache after a certain delay, they are still part of the search graph and might influence other computations as such.
+Fortunately, as long as the subtree with the coinductive cycle is not finished all solutions within the graph can be assumed to be valid. 
+This follows directly from the initial assumption and the fact that every goal that is investigated by the solver can still contribute to the whole cycle's validity (errors are propagated upwards directly).
+This is especially important as reoccurring nodes in the search graph could also denote inductive cycles.
+Though, as all nodes from inside the coinductive cycle are directly related to its start, they are not treated as part of any other cycle and the computed solutions are still valid in this context. 
diff --git a/chalk-recursive/src/recursive.rs b/chalk-recursive/src/recursive.rs
@@ -224,16 +224,25 @@ impl<'me, I: Interner> SolveDatabase<I> for Solver<'me, I> {
             // Check if this table is still on the stack.
             if let Some(depth) = self.context.search_graph[dfn].stack_depth {
                 // Is this a coinductive goal? If so, that is success,
-                // so we can return normally. Note that this return is
-                // not tabled.
-                //
-                // XXX how does caching with coinduction work?
+                // so we can return and set the minimum to its DFN.
+                // Note that this return is not tabled. And so are
+                // all other solutions in the cycle until the cycle
+                // start is finished. This avoids prematurely cached
+                // false positives.
                 if self.context.stack.coinductive_cycle_from(depth) {
                     let value = ConstrainedSubst {
                         subst: goal.trivial_substitution(self.program.interner()),
                         constraints: Constraints::empty(self.program.interner()),
                     };
+
                     debug!("applying coinductive semantics");
+
+                    // Set minimum to first occurrence of cyclic goal to prevent premature caching of possibly invalid solutions
+                    minimums.update_from(self.context.search_graph[dfn].links);
+
+                    // Mark the start of the coinductive cycle
+                    self.context.search_graph[dfn].coinductive_start = true;
+
                     return Ok(Solution::Unique(Canonical {
                         value,
                         binders: goal.canonical.binders,
@@ -275,6 +284,11 @@ impl<'me, I: Interner> SolveDatabase<I> for Solver<'me, I> {
             // worst of the repeated work that we do during tabling.
             if subgoal_minimums.positive >= dfn {
                 if self.context.caching_enabled {
+                    // Remove possible false positives from coinductive cycle
+                    if self.context.search_graph[dfn].coinductive_start && result.is_err() {
+                        self.context.search_graph.remove_false_positives_after(dfn);
+                    }
+
                     self.context
                         .search_graph
                         .move_to_cache(dfn, &mut self.context.cache);

diff --git a/chalk-recursive/src/search_graph.rs b/chalk-recursive/src/search_graph.rs
@@ -37,6 +37,11 @@ pub(super) struct Node<I: Interner> {
     /// from the stack, it contains the DFN of the minimal ancestor
     /// that the table reached (or MAX if no cycle was encountered).
     pub(crate) links: Minimums,
+
+    /// If this is true, the node is the start of coinductive cycle.
+    /// Thus, some cleanup has to be done before its result can be
+    /// cached to rule out false positives.
+    pub(crate) coinductive_start: bool,
 }
 
 impl<I: Interner> SearchGraph<I> {
@@ -71,6 +76,7 @@ impl<I: Interner> SearchGraph<I> {
             solution_priority: ClausePriority::High,
             stack_depth: Some(stack_depth),
             links: Minimums { positive: dfn },
+            coinductive_start: false,
         };
         self.nodes.push(node);
         let previous_index = self.indices.insert(goal.clone(), dfn);
@@ -101,6 +107,32 @@ impl<I: Interner> SearchGraph<I> {
             cache.insert(node.goal, node.solution);
         }
     }
+
+    /// Removes all nodes that are part of a coinductive cycle and
+    /// have a solution as they might be false positives due to
+    /// coinductive reasoning.
+    #[instrument(level = "debug", skip(self))]
+    pub(crate) fn remove_false_positives_after(&mut self, dfn: DepthFirstNumber) {
+        let mut false_positive_indices = vec![];
+
+        // Find all possible false positives in the graph below the
+        // start of the coinductive cycle
+        for (index, node) in self.nodes[dfn.index + 1..].iter().enumerate() {
+            if node.solution.is_ok() {
+                false_positive_indices.push(index + dfn.index + 1);
+            }
+        }
+
+        // Remove the potential false positives from the indices
+        self.indices
+            .retain(|_key, value| !false_positive_indices.contains(&value.index));
+
+        // Remove the potential false positives from the nodes
+        // in descending order to avoid unnecessary shifts
+        for false_positive_index in false_positive_indices.into_iter().rev() {
+            self.nodes.remove(false_positive_index);
+        }
+    }
 }
 
 impl<I: Interner> Index<DepthFirstNumber> for SearchGraph<I> {

diff --git a/tests/test/coinduction.rs b/tests/test/coinduction.rs
@@ -213,23 +213,58 @@ fn coinductive_unsound1() {
 
         goal {
             forall<X> { X: C1orC2 }
-        } yields[SolverChoice::slg(3, None)] {
+        } yields {
             "No possible solution"
         }
+    }
+}
+
+/// The only difference between this test and `coinductive_unsound1`
+/// is the order of the final `forall` clauses.
+#[test]
+fn coinductive_unsound2() {
+    test! {
+        program {
+            trait C1orC2 { }
+
+            #[coinductive]
+            trait C1 { }
+
+            #[coinductive]
+            trait C2 { }
+
+            #[coinductive]
+            trait C3 { }
+
+            forall<T> {
+                T: C1 if T: C2, T: C3
+            }
+
+            forall<T> {
+                T: C2 if T: C1
+            }
+
+            forall<T> {
+                T: C1orC2 if T: C2
+            }
+
+            forall<T> {
+                T: C1orC2 if T: C1
+            }
+        }
 
         goal {
             forall<X> { X: C1orC2 }
-        } yields[SolverChoice::recursive_default()] {
-            // FIXME(chalk#399) recursive solver doesn't handle coinduction correctly
-            "Unique; substitution [], lifetime constraints []"
+        } yields {
+            "No possible solution"
         }
     }
 }
 
-/// The only difference between this test and `coinductive_unsound1`
-/// is the order of the final `forall` clauses.
+/// Same as the two before but needs to show T: C2 in both
+// branches of T: C1 :- T: C2, T: C3.
 #[test]
-fn coinductive_unsound2() {
+fn coinductive_unsound3() {
     test! {
         program {
             trait C1orC2 { }
@@ -243,6 +278,13 @@ fn coinductive_unsound2() {
             #[coinductive]
             trait C3 { }
 
+            #[coinductive]
+            trait C4 { }
+
+            forall<T> {
+                T: C3 if T: C2, T: C4
+            }
+
             forall<T> {
                 T: C1 if T: C2, T: C3
             }
@@ -251,13 +293,62 @@ fn coinductive_unsound2() {
                 T: C2 if T: C1
             }
 
+            forall<T> {
+                T: C1orC2 if T: C1
+            }
+
             forall<T> {
                 T: C1orC2 if T: C2
             }
+        }
+
+        goal {
+            forall<X> { X: C1orC2 }
+        } yields {
+            "No possible solution"
+        }
+    }
+}
+
+/// Tests whether a nested coinductive cycle
+/// that is also unsound is handled correctly.
+#[test]
+fn coinductive_unsound_nested() {
+    test! {
+        program {
+            trait C1orC2 { }
+
+            #[coinductive]
+            trait C1 { }
+
+            #[coinductive]
+            trait C2 { }
+
+            #[coinductive]
+            trait C3 { }
+
+            #[coinductive]
+            trait C4 { }
+
+            forall<T> {
+                T: C4 if T:C2, T: C3
+            }
+
+            forall<T> {
+                T: C1 if T: C2, T: C3
+            }
+
+            forall<T> {
+                T: C2 if T: C1, T: C4
+            }
 
             forall<T> {
                 T: C1orC2 if T: C1
             }
+
+            forall<T> {
+                T: C1orC2 if T: C2
+            }
         }
 
         goal {
@@ -450,14 +541,8 @@ fn coinductive_multicycle4() {
 
         goal {
             forall<X> { X: Any }
-        } yields_all[SolverChoice::slg(3, None)] {
-        }
-
-        goal {
-            forall<X> { X: Any }
-        } yields[SolverChoice::recursive_default()] {
-            // FIXME(chalk#399) recursive solver doesn't handle coinduction correctly
-            "Unique; substitution [], lifetime constraints []"
+        } yields {
+            "No possible solution"
         }
     }
 }