Skip to content

Commit 5883018

Browse files
committed
explain
1 parent 57144ff commit 5883018

File tree

1 file changed

+248
-78
lines changed

1 file changed

+248
-78
lines changed

compiler/rustc_mir_build/src/thir/pattern/usefulness.rs

+248-78
Original file line numberDiff line numberDiff line change
@@ -12,85 +12,54 @@
1212
//!
1313
//! -----
1414
//!
15-
//! This file includes the logic for exhaustiveness and reachability checking for pattern-matching.
16-
//! Specifically, given a list of patterns for a type, we can tell whether:
17-
//! (a) each pattern is reachable (reachability)
15+
//! This file contains the logic for exhaustiveness and reachability checking for pattern-matching.
16+
//! Specifically, given a list of patterns in a match, we can tell whether:
17+
//! (a) a given pattern is reachable (reachability)
1818
//! (b) the patterns cover every possible value for the type (exhaustiveness)
1919
//!
20-
//! The algorithm implemented here is a modified version of the one described in [this
21-
//! paper](http://moscova.inria.fr/~maranget/papers/warn/index.html). We have however generalized
22-
//! it to accommodate the variety of patterns that Rust supports. We thus explain our version here,
23-
//! without being as rigorous.
20+
//! The algorithm implemented here is inspired from the one described in [this
21+
//! paper](http://moscova.inria.fr/~maranget/papers/warn/index.html). We have however changed it in
22+
//! various ways to accommodate the variety of patterns that Rust supports. We thus explain our
23+
//! version here, without being as precise.
2424
//!
2525
//!
2626
//! # Summary
2727
//!
28-
//! The core of the algorithm is the notion of "usefulness". A pattern `q` is said to be *useful*
29-
//! relative to another pattern `p` of the same type if there is a value that is matched by `q` and
30-
//! not matched by `p`. This generalizes to many `p`s: `q` is useful w.r.t. a list of patterns
31-
//! `p_1 .. p_n` if there is a value that is matched by `q` and by none of the `p_i`. We write
32-
//! `usefulness(p_1 .. p_n, q)` for a function that returns a list of such values. The aim of this
33-
//! file is to compute it efficiently.
34-
//!
35-
//! This is enough to compute reachability: a pattern in a `match` expression is reachable iff it
36-
//! is useful w.r.t. the patterns above it:
37-
//! ```rust
38-
//! # fn foo(x: Option<i32>) {
39-
//! match x {
40-
//! Some(_) => {},
41-
//! None => {}, // reachable: `None` is matched by this but not the branch above
42-
//! Some(0) => {}, // unreachable: all the values this matches are already matched by
43-
//! // `Some(_)` above
44-
//! }
45-
//! # }
46-
//! ```
28+
//! The algorithm is given as input a list of patterns, one for each arm of a match, and computes
29+
//! the following:
30+
//! - a set of values that match none of the patterns (if any),
31+
//! - for each subpattern (taking into account or-patterns), whether it would catch any value that
32+
//! isn't caught by a pattern before it, i.e. whether it is reachable.
4733
//!
48-
//! This is also enough to compute exhaustiveness: a match is exhaustive iff the wildcard `_`
49-
//! pattern is _not_ useful w.r.t. the patterns in the match. The values returned by `usefulness`
50-
//! are used to tell the user which values are missing.
51-
//! ```compile_fail,E0004
52-
//! # fn foo(x: Option<i32>) {
53-
//! match x {
54-
//! Some(0) => {},
55-
//! None => {},
56-
//! // not exhaustive: `_` is useful because it matches `Some(1)`
57-
//! }
58-
//! # }
59-
//! ```
34+
//! To a first approximation, the algorithm works by exploring all possible values for the type
35+
//! being matched on, and determining which arm(s) catch which value. To make this tractable we
36+
//! cleverly group together values, as we'll see below.
6037
//!
6138
//! The entrypoint of this file is the [`compute_match_usefulness`] function, which computes
62-
//! reachability for each match branch and exhaustiveness for the whole match.
39+
//! reachability for each subpattern and exhaustiveness for the whole match.
40+
//!
41+
//! In this page we explain the necessary concepts to understand how the algorithm works.
6342
//!
6443
//!
6544
//! # Constructors and fields
6645
//!
67-
//! Note: we will often abbreviate "constructor" as "ctor".
68-
//!
69-
//! The idea that powers everything that is done in this file is the following: a (matchable)
70-
//! value is made from a constructor applied to a number of subvalues. Examples of constructors are
71-
//! `Some`, `None`, `(,)` (the 2-tuple constructor), `Foo {..}` (the constructor for a struct
72-
//! `Foo`), and `2` (the constructor for the number `2`). This is natural when we think of
73-
//! pattern-matching, and this is the basis for what follows.
74-
//!
75-
//! Some of the ctors listed above might feel weird: `None` and `2` don't take any arguments.
76-
//! That's ok: those are ctors that take a list of 0 arguments; they are the simplest case of
77-
//! ctors. We treat `2` as a ctor because `u64` and other number types behave exactly like a huge
78-
//! `enum`, with one variant for each number. This allows us to see any matchable value as made up
79-
//! from a tree of ctors, each having a set number of children. For example: `Foo { bar: None,
80-
//! baz: Ok(0) }` is made from 4 different ctors, namely `Foo{..}`, `None`, `Ok` and `0`.
81-
//!
82-
//! This idea can be extended to patterns: they are also made from constructors applied to fields.
83-
//! A pattern for a given type is allowed to use all the ctors for values of that type (which we
84-
//! call "value constructors"), but there are also pattern-only ctors. The most important one is
85-
//! the wildcard (`_`), and the others are integer ranges (`0..=10`), variable-length slices (`[x,
86-
//! ..]`), and or-patterns (`Ok(0) | Err(_)`). Examples of valid patterns are `42`, `Some(_)`, `Foo
87-
//! { bar: Some(0) | None, baz: _ }`. Note that a binder in a pattern (e.g. `Some(x)`) matches the
88-
//! same values as a wildcard (e.g. `Some(_)`), so we treat both as wildcards.
89-
//!
90-
//! From this deconstruction we can compute whether a given value matches a given pattern; we
91-
//! simply look at ctors one at a time. Given a pattern `p` and a value `v`, we want to compute
92-
//! `matches!(v, p)`. It's mostly straightforward: we compare the head ctors and when they match
93-
//! we compare their fields recursively. A few representative examples:
46+
//! In the value `Pair(Some(0), true)`, `Pair` is called the constructor of the value, and `Some(0)`
47+
//! and `true` are its fields. Every matcheable value can be decomposed in this way. Examples of
48+
//! constructors are: `Some`, `None`, `(,)` (the 2-tuple constructor), `Foo {..}` (the constructor
49+
//! for a struct `Foo`), and `2` (the constructor for the number `2`).
50+
//!
51+
//! Each constructor takes a fixed number of fields; this is called its arity. `Pair` and `(,)` have
52+
//! arity 2, `Some` has arity 1, `None` and `42` have arity 0. Each type has a known set of
53+
//! constructors. Some types have many (like `u64`) or even an infinitely-many (like `&str` or
54+
//! `&[]`) constructors.
55+
//!
56+
//! Patterns are similar: `Pair(Some(_), _)` has constructor `Pair` and two fields. The difference
57+
//! is that we get some extra pattern-only constructors, namely: the wildcard `_`, integer ranges
58+
//! like `0..=10`, and variable-length slices like `[_, ..]`. We treat or-patterns separately, see
59+
//! the dedicated section below.
60+
//!
61+
//! Now to check if a value `v` matches a pattern `p`, we check if `v`'s constructor matches `p`'s
62+
//! constructor, then recursively compare their fields if necessary. A few representative examples:
9463
//!
9564
//! - `matches!(v, _) := true`
9665
//! - `matches!((v0, v1), (p0, p1)) := matches!(v0, p0) && matches!(v1, p1)`
@@ -100,21 +69,87 @@
10069
//! - `matches!(v, 1..=100) := matches!(v, 1) || ... || matches!(v, 100)`
10170
//! - `matches!([v0], [p0, .., p1]) := false` (incompatible lengths)
10271
//! - `matches!([v0, v1, v2], [p0, .., p1]) := matches!(v0, p0) && matches!(v2, p1)`
103-
//! - `matches!(v, p0 | p1) := matches!(v, p0) || matches!(v, p1)`
104-
//!
105-
//! Constructors, fields and relevant operations are defined in the [`super::deconstruct_pat`] module.
10672
//!
107-
//! Note: this constructors/fields distinction may not straightforwardly apply to every Rust type.
108-
//! For example a value of type `Rc<u64>` can't be deconstructed that way, and `&str` has an
109-
//! infinitude of constructors. There are also subtleties with visibility of fields and
110-
//! uninhabitedness and various other things. The constructors idea can be extended to handle most
111-
//! of these subtleties though; caveats are documented where relevant throughout the code.
73+
//! Constructors, fields and relevant operations are defined in the [`super::deconstruct_pat`]
74+
//! module. The question of whether a constructor is matched by another one is answered by
75+
//! [`Constructor::is_covered_by`].
11276
//!
113-
//! Whether constructors cover each other is computed by [`Constructor::is_covered_by`].
77+
//! Note 1: variables (like in `Some(x)`) match anything, so we treat them as wildcards.
78+
//! Note 2: this only applies to matcheable values. For example a value of type `Rc<u64>` can't be
79+
//! deconstructed that way.
11480
//!
11581
//!
11682
//! # Specialization
11783
//!
84+
//! Recall that we need to try all possible values to see if a match is exhaustive. We do it
85+
//! constructor-by-constructor, e.g. for the type `Option<T>`, "these patterns match all values" is
86+
//! equivalent to "these patterns match all values with constructor `Some` as well as all values
87+
//! with constructor `None`".
88+
//!
89+
//! Now observe that "the following matches all values that look like `(Some(_), _)`"
90+
//! ```rust,ignore(example)
91+
//! match x {
92+
//! (Some(0), _) => 1,
93+
//! (_, false) => 2,
94+
//! (Some(0), false) => 3,
95+
//! }
96+
//! ```
97+
//!
98+
//! is equivalent to "the following matches all values"
99+
//! ```rust,ignore(example)
100+
//! match x {
101+
//! (0, _) => 1,
102+
//! (_, false) => 2,
103+
//! (0, false) => 3,
104+
//! }
105+
//! ```
106+
//!
107+
//! and "the following matches all values that look like `(None, _)`"
108+
//! ```rust,ignore(example)
109+
//! match x {
110+
//! (Some(0), _) => 1,
111+
//! (_, false) => 2,
112+
//! (Some(0), false) => 3,
113+
//! }
114+
//! ```
115+
//!
116+
//! is equivalent to "the following matches all values"
117+
//! ```rust,ignore(example)
118+
//! match x {
119+
//! false => 2,
120+
//! }
121+
//! ```
122+
//!
123+
//! In other words, this is exhaustive:
124+
//! ```rust,ignore(example)
125+
//! match x {
126+
//! (Some(0), _) => 1,
127+
//! (_, false) => 2,
128+
//! (Some(0), false) => 3,
129+
//! }
130+
//! ```
131+
//!
132+
//! if and only if these two are exhaustive:
133+
//! ```rust,ignore(example)
134+
//! match x {
135+
//! (0, _) => 1,
136+
//! (_, false) => 2,
137+
//! (0, false) => 3,
138+
//! }
139+
//! match x {
140+
//! false => 2,
141+
//! }
142+
//! ```
143+
//!
144+
//! This is how the algorithm works: we recursively peel off one constructor at a time until we have
145+
//! tried them all. This "peeling off" step is called "specialization".
146+
//!
147+
//! TODO: we operate on rows
148+
//! TODO: define and illustrate specialize
149+
//! TODO: unspecialization to reconstruct witnesses
150+
//!
151+
//! Note: we will sometimes abbreviate "constructor" as "ctor".
152+
//!
118153
//! Recall that we wish to compute `usefulness(p_1 .. p_n, q)`: given a list of patterns `p_1 ..
119154
//! p_n` and a pattern `q`, all of the same type, we want to find a list of values (called
120155
//! "witnesses") that are matched by `q` and by none of the `p_i`. We obviously don't just
@@ -289,6 +324,12 @@
289324
//! The details are not necessary to understand this file, so we explain them in
290325
//! [`super::deconstruct_pat`]. Splitting is done by the `Constructor::split` function.
291326
//!
327+
//!
328+
//! # Or-patterns
329+
//!
330+
//! TODO
331+
//!
332+
//!
292333
//! # Constants in patterns
293334
//!
294335
//! There are two kinds of constants in patterns:
@@ -304,6 +345,138 @@
304345
//! In order to honor the `==` implementation, constants of types that implement `PartialEq` manually
305346
//! stay as a full constant and become an `Opaque` pattern. These `Opaque` patterns do not participate
306347
//! in exhaustiveness, specialization or overlap checking.
348+
//!
349+
//!
350+
//! # Example
351+
//!
352+
//! A picture is worth a thousand words.
353+
//!
354+
//! ```rust,ignore(example)
355+
//! match x {
356+
//! Pair(Some(0), _) => 1,
357+
//! Pair(_, false) => 2,
358+
//! Pair(Some(0), false) => 3,
359+
//! }
360+
//! ```
361+
//!
362+
//! We start:
363+
//! ┐ Patterns:
364+
//! │ 1. `[Pair(Some(0), _)]`
365+
//! │ 2. `[Pair(_, false)]`
366+
//! │ 3. `[Pair(Some(0), false)]`
367+
//! │
368+
//! │ Dig into `Pair`:
369+
//! ├─┐ Patterns:
370+
//! │ │ 1. `[Some(0), _]`
371+
//! │ │ 2. `[_, false]`
372+
//! │ │ 3. `[Some(0), false]`
373+
//! │ │
374+
//! │ │ Dig into `Some`:
375+
//! │ ├─┐ Patterns:
376+
//! │ │ │ 1. `[0, _]`
377+
//! │ │ │ 2. `[_, false]`
378+
//! │ │ │ 3. `[0, false]`
379+
//! │ │ │
380+
//! │ │ │ Dig into `0`:
381+
//! │ │ ├─┐ Patterns:
382+
//! │ │ │ │ 1. `[_]`
383+
//! │ │ │ │ 3. `[false]`
384+
//! │ │ │ │
385+
//! │ │ │ │ Dig into `true`:
386+
//! │ │ │ ├─┐ Patterns:
387+
//! │ │ │ │ │ 1. `[]`
388+
//! │ │ │ │ │
389+
//! │ │ │ │ │ We note arm 1 is reachable (by `Pair(Some(0), true)`).
390+
//! │ │ │ ├─┘
391+
//! │ │ │ │
392+
//! │ │ │ │ Dig into `false`:
393+
//! │ │ │ ├─┐ Patterns:
394+
//! │ │ │ │ │ 1. `[]`
395+
//! │ │ │ │ │ 3. `[]`
396+
//! │ │ │ │ │
397+
//! │ │ │ │ │ We note arm 1 is reachable (by `Pair(Some(0), false)`).
398+
//! │ │ │ ├─┘
399+
//! │ │ ├─┘
400+
//! │ │ │
401+
//! │ │ │ Dig into `1..`:
402+
//! │ │ ├─┐ Patterns:
403+
//! │ │ │ │ 2. `[false]`
404+
//! │ │ │ │
405+
//! │ │ │ │ Dig into `true`:
406+
//! │ │ │ ├─┐ Patterns:
407+
//! │ │ │ │ │ // no rows left
408+
//! │ │ │ │ │
409+
//! │ │ │ │ │ We have found an unmatched value! This gives us a witness.
410+
//! │ │ │ │ │ New witnesses:
411+
//! │ │ │ │ │ `[]`
412+
//! │ │ │ ├─┘
413+
//! │ │ │ │ New witnesses from `true`:
414+
//! │ │ │ │ `[true]`
415+
//! │ │ │ │
416+
//! │ │ │ │ Dig into `false`:
417+
//! │ │ │ ├─┐ Patterns:
418+
//! │ │ │ │ │ 2. `[]`
419+
//! │ │ │ │ │
420+
//! │ │ │ │ │ We note arm 2 is reachable (by `Pair(Some(1..), false)`).
421+
//! │ │ │ ├─┘
422+
//! │ │ │ │
423+
//! │ │ │ │ Total witnesses for `1..`:
424+
//! │ │ │ │ `[true]`
425+
//! │ │ ├─┘
426+
//! │ │ │ New witnesses from `1..`:
427+
//! │ │ │ `[1.., true]`
428+
//! │ │ │
429+
//! │ │ │ Total witnesses for `Some`:
430+
//! │ │ │ `[1.., true]`
431+
//! │ ├─┘
432+
//! │ │ New witnesses from `Some`:
433+
//! │ │ `[Some(1..), true]`
434+
//! │ │
435+
//! │ │ Dig into `None`:
436+
//! │ ├─┐ Patterns:
437+
//! │ │ │ 2. `[false]`
438+
//! │ │ │
439+
//! │ │ │ Dig into `true`:
440+
//! │ │ ├─┐ Patterns:
441+
//! │ │ │ │ // no rows left
442+
//! │ │ │ │
443+
//! │ │ │ │ We have found an unmatched value! This gives us a witness.
444+
//! │ │ │ │ New witnesses:
445+
//! │ │ │ │ `[]`
446+
//! │ │ ├─┘
447+
//! │ │ │ New witnesses from `true`:
448+
//! │ │ │ `[true]`
449+
//! │ │ │
450+
//! │ │ │ Dig into `false`:
451+
//! │ │ ├─┐ Patterns:
452+
//! │ │ │ │ 2. `[]`
453+
//! │ │ │ │
454+
//! │ │ │ │ We note arm 2 is reachable (by `Pair(None, false)`).
455+
//! │ │ ├─┘
456+
//! │ │ │
457+
//! │ │ │ Total witnesses for `None`:
458+
//! │ │ │ `[true]`
459+
//! │ ├─┘
460+
//! │ │ New witnesses from `None`:
461+
//! │ │ `[None, true]`
462+
//! │ │
463+
//! │ │ Total witnesses for `Pair`:
464+
//! │ │ `[Some(1..), true]`
465+
//! │ │ `[None, true]`
466+
//! ├─┘
467+
//! │ New witnesses from `Pair`:
468+
//! │ `[Pair(Some(1..), true)]`
469+
//! │ `[Pair(None, true)]`
470+
//! │
471+
//! │ Final witnesses:
472+
//! │ `[Pair(Some(1..), true)]`
473+
//! │ `[Pair(None, true)]`
474+
//! ┘
475+
//!
476+
//! We conclude:
477+
//! - Arm 3 is unreachable (it was never marked as reachable);
478+
//! - The match is not exhaustive;
479+
//! - Adding `Pair(Some(1..), true)` and `Pair(None, true)` would make the match exhaustive.
307480
308481
use super::deconstruct_pat::{
309482
Constructor, ConstructorSet, DeconstructedPat, IntRange, MaybeInfiniteInt, SplitConstructorSet,
@@ -1069,9 +1242,6 @@ pub(crate) struct UsefulnessReport<'p, 'tcx> {
10691242

10701243
/// The entrypoint for the usefulness algorithm. Computes whether a match is exhaustive and which
10711244
/// of its arms are reachable.
1072-
///
1073-
/// Note: the input patterns must have been lowered through
1074-
/// `check_match::MatchVisitor::lower_pattern`.
10751245
#[instrument(skip(cx, arms), level = "debug")]
10761246
pub(crate) fn compute_match_usefulness<'p, 'tcx>(
10771247
cx: &MatchCheckCtxt<'p, 'tcx>,

0 commit comments

Comments
 (0)