Other than the type universes and Pi types, inductively defined types provide the only means of defining new types in the Calculus of Inductive Constructions. We have also seen that, fundamentally, the constructors and the recursors provide the only means of defining functions on these types. By the propositions-as-types correspondence, this means that induction is the fundamental method of proof for these types.
Working with induction and recursion is therefore fundamental to working in the Calculus of Inductive Constructions. For that reason Lean provides more natural ways of defining recursive functions, performing pattern matching, and writing inductive proofs. Behind the scenes, these are “compiled” down to recursors, using some of the auxiliary definitions described in Section Automatically Generated Constructions. Thus, the function definition package, which performs this reduction, is not part of the trusted code base.
The cases_on
recursor can be used to define functions and prove
theorems by cases. But complicated definitions may use several nested
cases_on
applications, and may be hard to read and understand.
Pattern matching provides a more convenient and standard way of
defining functions and proving theorems. Lean supports a very general
form of pattern matching called dependent pattern matching.
A pattern-matching definition is of the following form:
definition [name] [parameters] : [domain] → [codomain]
| [name] [patterns_1] := [value_1]
...
| [name] [patterns_n] := [value_n]
The parameters are fixed, and each assignment defines the value of the
function for a different case specified by the given pattern. As a
first example, we define the function sub2
for natural numbers:
open nat
definition sub2 : nat → nat
| sub2 0 := 0
| sub2 1 := 0
| sub2 (a+2) := a
example : sub2 5 = 3 := rfl
The default compilation method guarantees that the pattern matching equations hold definitionally.
open nat
definition sub2 : nat → nat
| sub2 0 := 0
| sub2 1 := 0
| sub2 (a+2) := a
-- BEGIN
example : sub2 0 = 0 := rfl
example : sub2 1 = 0 := rfl
example (a : nat) : sub2 (a + 2) = a := rfl
-- END
We can use the command print definition
to inspect how our definition was compiled into
recursors.
open nat
definition sub2 : nat → nat
| sub2 0 := 0
| sub2 1 := 0
| sub2 (a+2) := a
-- BEGIN
print definition sub2
-- END
We will say a term is a constructor application if it is of the form
c a_1 ... a_n
where c
is the constructor of some inductive
datatype. Note that in the definition sub2
, the terms 1
and a+2
are not constructor applications. However, the compiler normalizes
them at compilation time, and obtains the constructor applications
succ zero
and succ (succ a)
respectively. This normalization step
is just a convenience that allows us to write definitions resembling
the ones found in textbooks. There is no magic here: the compiler
simply uses the kernel’s ordinary evaluation mechanism. If we had
written 2+a
, the definition would be rejected since 2+a
does not
normalize into a constructor application.
In the next example, we use pattern-matching to define Boolean
negation bnot
, and proving bnot (bnot b) = b
.
open bool
definition bnot : bool → bool
| bnot tt := ff
| bnot ff := tt
theorem bnot_bnot : ∀ (b : bool), bnot (bnot b) = b
| bnot_bnot tt := rfl -- proof that bnot (bnot tt) = tt
| bnot_bnot ff := rfl -- proof that bnot (bnot ff) = ff
As described in Chapter Inductive Types, Lean inductive datatypes can
be parametric. The following example defines the tail
function using
pattern matching. The argument A : Type
is a parameter and occurs
before the colon to indicate it does not participate in the pattern
matching. Lean allows parameters to occur after :
, but it cannot
pattern match on them.
import data.list
open list
definition tail {A : Type} : list A → list A
| tail nil := nil
| tail (h :: t) := t
-- Parameter A may occur after ':'
definition tail2 : Π {A : Type}, list A → list A
| tail2 (@nil A) := (@nil A)
| tail2 (h :: t) := t
-- @ is allowed on the left-hand-side
definition tail3 : Π {A : Type}, list A → list A
| @tail3 A nil := nil
| @tail3 A (h :: t) := t
-- A is explicit parameter
definition tail4 : Π (A : Type), list A → list A
| tail4 A nil := nil
| tail4 A (h :: t) := t
The function definition package supports structural recursion, that is, recursive applications where one of the arguments is a subterm of the corresponding term on the left-hand-side. Later, we describe how to compile recursive equations using well-founded recursion. The main advantage of the default compilation method is that the recursive equations hold definitionally.
Here are some examples from the last chapter, written in the new style:
namespace hide
inductive nat : Type :=
| zero : nat
| succ : nat → nat
namespace nat
-- BEGIN
definition add : nat → nat → nat
| add m zero := m
| add m (succ n) := succ (add m n)
infix `+` := add
theorem add_zero (m : nat) : m + zero = m := rfl
theorem add_succ (m n : nat) : m + succ n = succ (m + n) := rfl
theorem zero_add : ∀ n, zero + n = n
| zero_add zero := rfl
| zero_add (succ n) := eq.subst (zero_add n) rfl
definition mul : nat → nat → nat
| mul n zero := zero
| mul n (succ m) := mul n m + m
-- END
end nat
end hide
The “definition” of zero_add
makes it clear that proof by induction
is really a form of induction in Lean.
As with definition by pattern matching, parameters to a structural recursion or induction may appear before the colon. Such parameters are simply added to the local context before the definition is processed. For example, the definition of addition may be written as follows:
namespace hide
inductive nat : Type :=
| zero : nat
| succ : nat → nat
namespace nat
-- BEGIN
definition add (m : nat) : nat → nat
| add zero := m
| add (succ n) := succ (add n)
-- END
end nat
end hide
This may seem a little odd, but you should read the definition as
follows: “Fix m
, and define the function which adds something to m
recursively, as follows. To add zero, return m
. To add the successor
of n
, first add n
, and then take the successor.” The mechanism
for adding parameters to the local context is what makes it possible
to process match expressions within terms, as described below.
A more interesting example of structural recursion is given by the
Fibonacci function fib
. The subsequent theorem, fib_pos
, combines
pattern matching, recursive equations, and calculational proof.
import data.nat
open nat algebra
definition fib : nat → nat
| fib 0 := 1
| fib 1 := 1
| fib (a+2) := fib (a+1) + fib a
-- the defining equations hold definitionally
example : fib 0 = 1 := rfl
example : fib 1 = 1 := rfl
example (a : nat) : fib (a+2) = fib (a+1) + fib a := rfl
-- fib is always positive
theorem fib_pos : ∀ n, 0 < fib n
| fib_pos 0 := show 0 < 1, from zero_lt_succ 0
| fib_pos 1 := show 0 < 1, from zero_lt_succ 0
| fib_pos (a+2) := show 0 < fib (a+1) + fib a, from calc
0 = 0 + 0 : rfl
... < fib (a+1) + 0 : add_lt_add_right (fib_pos (a+1)) 0
... < fib (a+1) + fib a : add_lt_add_left (fib_pos a) (fib (a+1))
Another classic example is the list append
function.
import data.list
open list
definition append {A : Type} : list A → list A → list A
| append nil l := l
| append (h::t) l := h :: append t l
example : append [(1 : ℕ), 2, 3] [4, 5] = [1, 2, 3, 4, 5] := rfl
All the examples we have seen so far can be easily written
using cases_on
and rec_on
. However, this is not the case
with indexed inductive families, such as vector A n
.
A lot of boilerplate code needs to be written to define
very simple functions such as map
, zip
, and unzip
using
recursors.
To understand the difficulty, consider what it would take to define a
function tail
which takes a vector v : vector A (succ n)
and
deletes the first element. A first thought might be to use the
cases_on
function:
namespace hide
-- BEGIN
open nat
inductive vector (A : Type) : nat → Type :=
| nil {} : vector A zero
| cons : Π {n}, A → vector A n → vector A (succ n)
open vector
notation h :: t := cons h t
check @vector.cases_on
-- Π {A : Type}
-- {C : Π (a : ℕ), vector A a → Type}
-- {a : ℕ}
-- (n : vector A a),
-- (e1 : C 0 nil)
-- (e2 : Π {n : ℕ} (a : A) (a_1 : vector A n), C (succ n) (cons a a_1)),
-- C a n
-- END
end hide
But what value should we return in the nil
case? Something funny is
going on: if v
has type vector A (succ n)
, it can’t be nil, but
it is not clear how to tell that to cases_on
.
One standard solution is to define an auxiliary function:
import data.examples.vector
open nat vector
-- BEGIN
definition tail_aux {A : Type} {n m : nat} (v : vector A m) :
m = succ n → vector A n :=
vector.cases_on v
(assume H : 0 = succ n, nat.no_confusion H)
(take m (a : A) w : vector A m,
assume H : succ m = succ n,
have H1 : m = n, from succ.inj H,
eq.rec_on H1 w)
definition tail {A : Type} {n : nat} (v : vector A (succ n)) : vector A n :=
tail_aux v rfl
-- END
In the nil
case, m
is instantiated to 0
, and no_confusion
(discussed in Section Automatically Generated Constructions) makes use
of the fact that 0 = succ n
cannot occur. Otherwise, v
is of the
form a :: w
, and we can simply return w
, after casting it from a
vector of length m
to a vector of length n
.
The difficulty in defining tail
is to maintain the relationships
between the indices. The hypothesis e : m = succ n
in tail_aux
is
used to “communicate” the relationship between n
and the index
associated with the minor premise. Moreover, the zero = succ n
case
is “unreachable,” and the canonical way to discard such a case is to
use no_confusion
.
The tail
function is, however, easy to define using recursive
equations, and the function definition package generates all the
boilerplate code automatically for us.
Here are a number of examples:
import data.examples.vector
open nat vector prod
-- BEGIN
definition head {A : Type} : Π {n}, vector A (succ n) → A
| head (h :: t) := h
definition tail {A : Type} : Π {n}, vector A (succ n) → vector A n
| tail (h :: t) := t
theorem eta {A : Type} : ∀ {n} (v : vector A (succ n)), head v :: tail v = v
| eta (h::t) := rfl
definition map {A B C : Type} (f : A → B → C)
: Π {n : nat}, vector A n → vector B n → vector C n
| map nil nil := nil
| map (a::va) (b::vb) := f a b :: map va vb
definition zip {A B : Type} : Π {n}, vector A n → vector B n → vector (A × B) n
| zip nil nil := nil
| zip (a::va) (b::vb) := (a, b) :: zip va vb
-- END
Note that we can omit recursive equations for “unreachable” cases such
as head nil
. The automatically generated definitions for indexed
families are far from straightforward. For example:
import data.examples.vector
open nat vector prod
definition map {A B C : Type} (f : A → B → C)
: Π {n : nat}, vector A n → vector B n → vector C n
| map nil nil := nil
| map (a::va) (b::vb) := f a b :: map va vb
-- BEGIN
print map
/-
definition map : Π {A : Type} {B : Type} {C : Type},
(A → B → C) → (Π {n : ℕ}, vector A n → vector B n → vector C n)
λ (A : Type) (B : Type) (C : Type) (f : A → B → C) {n : ℕ}
(a : vector A n) (a_1 : vector B n),
nat.brec_on n
(λ {n : ℕ} (b : nat.below n) (a : vector A n) (a_1 : vector B n),
nat.cases_on n
(λ (b : nat.below 0) (a : vector A 0) (a_1 : vector B 0),
(λ (t_1 : ℕ) (a_2 : vector A t_1),
vector.cases_on a_2
(λ (H_1 : 0 = 0) (H_2 : a == nil),
(λ (t_1 : ℕ) (a_1_1 : vector B t_1),
vector.cases_on a_1_1
(λ (H_1 : 0 = 0) (H_2 : a_1 == nil), nil)
(λ (n : ℕ) (a : B) (a_2 : vector B n)
(H_1 : 0 = succ n),
nat.no_confusion H_1))
0
a_1
(eq.refl 0)
-/
-- END
The map
function is even more tedious to define by hand than the
tail
function. We encourage you to try it, using rec_on
,
cases_on
and no_confusion
.
The name of the function being defined can be omitted from the
left-hand side of pattern matching equations. This feature is
particularly useful when the function name is long or there are many
cases. When the name is omitted, Lean will silently include @f
in
the left-hand-side of every pattern matching equation, where f
is
the name of the function being defined. Here is an example:
import data.examples.vector
open nat vector prod
-- BEGIN
variables {A B : Type}
definition unzip : Π {n : nat}, vector (A × B) n → vector A n × vector B n
| zero nil := (nil, nil)
| (succ n) ((a, b)::v) :=
match unzip v with
(va, vb) := (a :: va, b :: vb)
end
example : unzip (((1 : ℕ), (10 : ℕ)) :: (2, 20) :: nil) =
(1 :: 2 :: nil, 10 :: 20 :: nil) :=
rfl
-- END
We say that a set of recursive equations overlaps when there is an
input that more than one left-hand-side can match. In the following
definition the input 0 0
matches the left-hand-side of the first two
equations. Should the function return 1
or 2
?
open nat
-- BEGIN
definition f : nat → nat → nat
| f 0 y := 1
| f x 0 := 2
| f (x+1) (y+1) := 3
-- END
Overlapping patterns are often used to succinctly express complex patterns in data, and they are allowed in Lean. Lean handles the ambiguity by using the first applicable equation. In the example above, the following equations hold definitionally:
open nat
definition f : nat → nat → nat
| f 0 y := 1
| f x 0 := 2
| f (x+1) (y+1) := 3
-- BEGIN
variables (a b : nat)
example : f 0 0 = 1 := rfl
example : f 0 (a+1) = 1 := rfl
example : f (a+1) 0 = 2 := rfl
example : f (a+1) (b+1) = 3 := rfl
-- END
Lean also supports wildcard patterns, also known as anonymous
variables. They are used to create patterns where we don’t care about
the value of a specific argument. In the function f
defined above,
the values of x
and y
are not used in the right-hand-side. Here is
the same example using wildcards:
open nat
definition f : nat → nat → nat
| f 0 _ := 1
| f _ 0 := 2
| f _ _ := 3
variables (a b : nat)
example : f 0 0 = 1 := rfl
example : f 0 (a+1) = 1 := rfl
example : f (a+1) 0 = 2 := rfl
example : f (a+1) (b+1) = 3 := rfl
Some functional languages support incomplete patterns. In these
languages, the interpreter produces an exception or returns an
arbitrary value for incomplete cases. We can simulate the arbitrary
value approach using the inhabited
type class, discussed in Chapter
Type Classes. Roughly, an element of inhabited A
is simply a
witness to the fact that there is an element of A
; in Chapter Type
Classes, we will see that Lean can be instructed that suitable base
types are inhabited, and can automatically infer that other
constructed types are inhabited on that basis. On this basis, the
standard library provides an arbitrary element, arbitrary A
, of any
inhabited type.
We can also use the type option A
to simulate incomplete patterns.
The idea is to return some a
for the provided patterns, and use
none
for the incomplete cases. The following example demonstrates
both approaches.
open nat option
definition f1 : nat → nat → nat
| f1 0 _ := 1
| f1 _ 0 := 2
| f1 _ _ := arbitrary nat -- the "incomplete" case
variables (a b : nat)
example : f1 0 0 = 1 := rfl
example : f1 0 (a+1) = 1 := rfl
example : f1 (a+1) 0 = 2 := rfl
example : f1 (a+1) (b+1) = arbitrary nat := rfl
definition f2 : nat → nat → option nat
| f2 0 _ := some 1
| f2 _ 0 := some 2
| f2 _ _ := none -- the "incomplete" case
example : f2 0 0 = some 1 := rfl
example : f2 0 (a+1) = some 1 := rfl
example : f2 (a+1) 0 = some 2 := rfl
example : f2 (a+1) (b+1) = none := rfl
Sometimes an argument in a dependent matching pattern is not essential
to the definition, but nonetheless has to be included to specialize
the type of the expression appropriately. Lean allows users to mark
such subterms as inaccessible for pattern matching. These
annotations are essential, for example, when a term occurring in the
left-hand side is neither a variable nor a constructor application,
because these are not suitable targets for pattern matching. We can
view such inaccessible terms as “don’t care” components of the
patterns. You can declare a subterm inaccesible by writing ⌞t⌟
(the
brackets are entered as \cll
and \clr
, for “corner-lower-left” and
“corner-lower-right”) or ?(t)
.
The following example can be found in \cite{goguen:et:al:06}. We
declare an inductive type that defines the property of “being in the
image of f
”. You can view an element of the type image_of f b
as
evidence that b
is in the image of f
, whereby the constructor
imf
is used to build such evidence. We can then define any function
f
with an “inverse” which takes anything in the image of f
to an
element that is mapped to it. The typing rules forces us to write f
a
for the first argument, but this term is not a variable nor a
constructor application, and plays no role in the pattern-matching
definition. To define the function inverse
below, we have to mark f
a
inaccessible.
variables {A B : Type}
inductive image_of (f : A → B) : B → Type :=
imf : Π a, image_of f (f a)
open image_of
definition inverse : Π f : A → B, Π b, image_of f b → A
| inverse f ⌞f a⌟ (imf _ _) := a
Inaccessible terms can also be used to reduce the complexity of the
generated definition. Dependent pattern matching is compiled using the
cases_on
and no_confusion
constructions. The number of instances
of cases_on
introduced by the compiler can be reduced by marking
parts that only report specialization. In the next example, we define
the type of finite ordinals finord n
, a type with n
inhabitants. We
also define the function to_nat
that maps an element of finord n
to
an elmeent of nat
. If we do not mark n+1
as inaccessible, the
compiler will generate a definition containing two cases_on
expressions. We encourage you to replace ⌞n+1⌟
with (n+1)
in the
next example and inspect the generated definition using print
definition to_nat
.
open nat
inductive finord : nat → Type :=
| fz : Π n, finord (succ n)
| fs : Π {n}, finord n → finord (succ n)
open finord
definition to_nat : Π {n : nat}, finord n → nat
| @to_nat ⌞n+1⌟ (fz n) := zero
| @to_nat ⌞n+1⌟ (fs f) := succ (to_nat f)
Lean also provides a compiler for match-with expressions found in many functional languages. It uses essentially the same infrastructure used to compile recursive equations.
import data.list
open nat bool list
-- BEGIN
definition is_not_zero (a : nat) : bool :=
match a with
| zero := ff
| succ _ := tt
end
-- We can use recursive equations and match
variable {A : Type}
variable p : A → bool
definition filter : list A → list A
| filter nil := nil
| filter (a :: l) :=
match p a with
| tt := a :: filter l
| ff := filter l
end
example : filter is_not_zero [1, 0, 0, 3, 0] = [1, 3] := rfl
-- END
You can also use pattern matching in a local have
expression:
import data.nat logic
open bool nat
definition mult : nat → nat → nat :=
have plus : nat → nat → nat
| 0 b := b
| (succ a) b := succ (plus a b),
have mult : nat → nat → nat
| 0 b := 0
| (succ a) b := plus (mult a b) b,
mult
In some definitions, we have to help the compiler by providing some implicit arguments explicitly in the left-hand-side of recursive equations. In such cases, if we don’t provide the implicit arguments, the elaborator is unable to solve some placeholders (i.e.~meta-variables) in the nested match expression.
import data.examples.vector
open nat vector prod
-- BEGIN
variables {A B : Type}
definition unzip : Π {n : nat}, vector (A × B) n → vector A n × vector B n
| @unzip zero nil := (nil, nil)
| @unzip (succ n) ((a, b)::v) :=
match unzip v with
(va, vb) := (a :: va, b :: vb)
end
example : unzip (((1 : ℕ), (10 : ℕ)) :: (2, 20) :: nil) =
(1 :: 2 :: nil, 10 :: 20 :: nil) :=
rfl
-- END
Next, we define the function diag
which extracts the diagonal of a
square matrix vector (vector A n) n
. Note that, this function is
defined by structural induction. However, the term map tail v
is not
a subterm of ((a :: va) :: v)
. Could you explain what is going on?
import data.examples.vector
open nat vector
-- BEGIN
variables {A B : Type}
definition tail : Π {n}, vector A (succ n) → vector A n
| tail (h :: t) := t
definition map (f : A → B)
: Π {n : nat}, vector A n → vector B n
| map nil := nil
| map (a::va) := f a :: map va
definition diag : Π {n : nat}, vector (vector A n) n → vector A n
| diag nil := nil
| diag ((a :: va) :: v) := a :: diag (map tail v)
-- END
[TODO: write this section.]