Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Constant generics (restricted Π types) for Rust, core RFC (v2) #1931

Closed
wants to merge 39 commits into from
Closed
Changes from 37 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
27c33a3
rfc: Π-types
ticki Jun 22, 2016
e4446c5
Expand the `overview` section
ticki Jun 22, 2016
eb60cd3
Propose alternative syntaxes
ticki Jun 22, 2016
71f2d8b
Improve type inference with a new rule allowing direct substitution
ticki Jun 22, 2016
b355811
Add section on impl unification
ticki Jun 22, 2016
a705532
Add transitivity suggestion
ticki Jun 22, 2016
946aa55
Change syntax, add transitivity rule
ticki Jun 22, 2016
d52e728
Add 'How we teach this'
ticki Jun 22, 2016
551688c
Fix the code example
ticki Jun 22, 2016
ea92a57
Add optional extension
ticki Jun 22, 2016
1d8813b
s/baz/foo
ticki Jun 22, 2016
d6a9b05
Use unicode for the lines
ticki Jun 22, 2016
99b2005
Expand on the implication checking and the rules governing it
ticki Jun 23, 2016
60467c2
Add subsection of 'motivation' dedicated to examples, at exit-point i…
ticki Jun 23, 2016
468f16a
Expand motivation with aims
ticki Jun 23, 2016
a01250c
Double negation and division by zero handling
ticki Jun 23, 2016
817ec48
Prove decidability
ticki Jun 23, 2016
d46f550
Thanks for the feedback, gnzlbg, killercup, and kennytm
ticki Jun 23, 2016
54e00af
"Informalize" the rules
ticki Jun 23, 2016
5718d4b
Fix some typos in Pi Types RFCs
killercup Jun 23, 2016
386de25
Merge pull request #1 from killercup/patch-1
ticki Jun 23, 2016
dc4635a
Add section on SMT-solvers, fix gnzlbg's notes
ticki Jun 23, 2016
d75674b
Merge branch 'pi-types' of github.com:Ticki/rfcs into pi-types
ticki Jun 23, 2016
4e36bed
Add remark
ticki Jun 23, 2016
b555120
Add section on possible extensions
ticki Jun 23, 2016
892e3d6
Fix title on the second question
ticki Jun 23, 2016
754b944
Add section on structural equality, add symbolic implication
ticki Jun 23, 2016
c7a2f28
Fix structural equality section
ticki Jun 23, 2016
9573c4d
Add alternative with inherited bounds
ticki Jun 23, 2016
74ef189
Move the SMT-section
ticki Jun 24, 2016
9f06922
Add Hoare-based value/type optional extension
ticki Jun 24, 2016
6b88ae5
Fix typos
ticki Jun 24, 2016
998c292
Transcription rules for inequalities
ticki Jun 24, 2016
09aef59
Clarify notation
ticki Jun 24, 2016
0f3a499
Typos
ticki Jun 25, 2016
2dbfe2d
Fix typos
ticki Jun 27, 2016
41c8ff3
Pi types core RFC.
ticki Feb 26, 2017
aaf7d55
Π → Γ
ticki Feb 26, 2017
9a9c028
Add note on notation.
ticki Mar 12, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
301 changes: 301 additions & 0 deletions text/0000-pi-types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,301 @@
- Feature Name: pi-types
- Start Date: 2017-02-26
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

We propose a simple, yet sufficiently expressive, addition of dependent-types
(also known as, Π-types and value-types).
Copy link

@hanna-kruppe hanna-kruppe Feb 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As already pointed out in previous discussion, these really aren't dependent types or Pi-types, at least not until #1933 gets accepted. I had to carefully read both RFCs to come to this conclusion, so this should really be clarified to make this RFC precise and self-contained. A good term that has been used before is "const-dependent types".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to being inaccurate I think "pi types" is not a very explanatory choice of name for most users. Const parameters seems more obvious.

Copy link
Member

@eddyb eddyb Feb 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"const generics", "const generic parameters" and "const parameters" are all better (both user-facing, and even in implementation details) than Π IMO.

Copy link

@clarfonthey clarfonthey Feb 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I linked this discussion in #1930 because it affects all three RFCs and is super important to bring up.


Type checking will not require SMT-solvers or other forms of theorem provers.

## Generic value parameters

A `const` type parameter acts like a generic parameter, containing a constant
expression. Declaring a generic parameter `const a: usize`, declares a constant
variable `a` of type `usize`.

One can create implementations, structs, enums, and traits, abstracting over
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, is the intent to allow impl<const a :usize> Trait for a { ... }, aka, will the constant be allowed to be used as bare type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given specialization, this would also allow things like having a general algorithm for all n and a specialized impl of it when n happens to be 0 or 1, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't be sensible, at least as I understand the trait system. The Self parameter of a trait is of kind type, not of kind const.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@est31 I am not sure that makes sense... Why can't you just do impl Trait for usize ?

@Ixrec Wouldn't that require trait bounds? i.e. where a == 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the subject of specialization, it seems plausible that Foo<0> is a specialization of for<const n: usize> Foo<n>, but I'd like to keep specializing on const parameters to a future RFC.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-i-m yeah, it would require the "with bounds" proposed in the 2nd RFC

this generic value parameter.

Such a parameter acts type-like in the context of types, generics, and
polymorphism, and value-like in the context of expressions, function bodies,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you are meaning, but I think it would be clearer if you removed "function bodies" here. I mean in function bodies you should be able to use it in both expression and type parameter form.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, a function body is in fact a block expression now, so "expressions and patterns" is enough.

and applications.

## Compile time calculations on constant parameters

Since it is simply consisting of constexprs, one can apply constant functions
(`const fn`) to the parameter, to perform compile time, type level calculations
on the parameter. This allows for great expressiveness as `const fn` improves.

# Motivation
[motivation]: #motivation

An often requested feature is the "type-level numerals", which enables generic
length arrays. The current proposals are often limited to integers or even lack
of value maps, and other critical features.

_Note:_ In an earlier version of this RFC, a `where` bound was proposed, but it
proved too complex, so for now, types with invariants about the constant
parameter can be used.

It allows for creating powerful abstractions without type-level hackery.

## What we want, and what we don't want

We have to be very careful to avoid certain things, while still preserving the core features:

1. Ability to use and manipulate values at type-level.
2. The ability to use said values on expression-level (runtime).

Yet, we do not want:

1. SMT-solvers, due to not only undecidability (note, although, that SAT is
decidable) and performance, but the complications it adds to `rustc`.
2. Monomorphisation-time errors, i.e. errors that happens during codegen of
generic functions. We try to avoid adding _more_ of these (as noted by
petrochenkov, these [already exists](https://github.com/rust-lang/rfcs/pull/1657#discussion_r68202733))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This link is broken for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again? 😆 The original note:


This error is currently reported during monomorphization, each of the arrays has acceptable size, but they cross the limit when combined together in one structure S:

    struct S<T, U>(T, U);

    fn f<T, U>() {
        let s: S<T, U>; // error: the type `S<[u8; 70368744177664], [u8; 70368744177664]>` is too big
    }

    fn main() {
        f::<[u8; 1 << 46], [u8; 1 << 46]>();
    }

Overflows or division-by-zero in type level arithmetic with integer generic parameters feel very close to this situation, i.e. requiring where clauses for performing addition of two generic integers is similar to requiring where clauses for combining two generic types in a tuple like this:

struct S<T, U>(T, U) where sizeof(T) + sizeof(U) + PADDING < MAX_SIZE;


# Detailed design
[design]: #detailed-design

## The new value-type construct, `const`

Declaring a parameter `const x: T` allows using `x` in both an expression context
(as a value of type `T`) and a type context (as a type parameter). In a sense,
const "bridges" the world between values and types, since it allows us to
declare value dependent types ([`ε → τ` constructors](https://en.wikipedia.org/wiki/Dependent_type)).

Such a parameter is declared, like type parameters, in angle brackets (e.g.
`struct MyStruct<const x: usize>`).

The expr behavior is described as:

ValueParameterDeclaration:
Π ⊢ const x: T
──────────────
Π ⊢ x: T

In human language, this simply means that one can use a constant parameter,
`const x: T`, in expression context, as a value of type `T`.

On the type level, we use the very same semantics as the ones generic
parameters currently follows.

## `const fn`s as Π-constructors

We are interested in value dependency, but at the same time, we want to avoid
complications such as [SMT-solvers](https://en.wikipedia.org/wiki/Satisfiability_modulo_theories).

We achieve this by `const fn`, which allows us to take some const parameter and
map it by some arbitrary, pure function, following the rules described in [RFC
0911](https://github.com/rust-lang/rfcs/blob/master/text/0911-const-fn.md#detailed-design).

## Type inference

Since we are able to evaluate the function at compile time, we can easily infer
const parameters, by adding an unification relation, simply

PiRelationInference
Γ ⊢ y = f(x)
Γ ⊢ T: U<y>
──────────────
Γ ⊢ T: U<f(x)>

Informally, this means that, you can substitute equal terms (in this case, `const fn` relations).

The relational edge between two const parameters is simple a const fn, which is
resolved under unification.

We add an extra rule to improve inference:

DownLiftEquality:
Γ ⊢ T: A → 𝓤
Γ ⊢ c: A
Γ ⊢ x: A
Γ ⊢ a: T<c>
Γ ⊢ a: T<x>
────────────
Γ ⊢ c = x

So, if two types share constructor by some Π-constructor, share a value, their
value parameter is equal. Take `a: [u8; 4]` as an example. If we have some
unknown variable `x`, such that `a: [u8; x]`, we can infer that `x = 4`.

This allows us to infer e.g. array length parameters in functions:

```rust
// [T; N] is a constructor, T → usize → 𝓤 (parameterize over T and you get A → 𝓤).
fn foo<const n: usize, const l: [u32; n]>() -> [u32; n] {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my browser, the letter lowercase L (l) and the number 1 are almost identical. For a little bit I actually thought it was the number 1 and I was really confused. Something like arr instead of l would eliminate any possible confusion

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use UPPER_SNAKE_CASE for const globals and I still would like that somewhat - unless, of course, there's a good argument for keeping it lowercase.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean SCREAMING_SNAKE_CASE?

// ^ note how l depends on n.
l
}

// We know n from the length of the array.
let l = foo::<_, [1, 2, 3, 4, 5, 6]>();
// ^ ^^^^^^^^^^^^^^^^
```

## Structural equality

Structural equality plays a key role in type checking of dependent types.

Structural equality, in this case, is defined as an equivalence relation, which
allows substitution without changing semantics.

Any constant parameter must have the `structural_match` property as defined in
[RFC #1445](https://github.com/rust-lang/rfcs/pull/1445). This property, added
Copy link
Member

@cramertj cramertj Feb 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd think we'd want to require type parameters to be Copy, at least for the time being. Types can currently be structural_match but not Copy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the downside of allowing non-Copy structural_match types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we'll also need is Display to be able to include values in error messages (got AwesomeType<2> but expected AwesomeType<3>...), but I'm not sure if that's already included in the definition of structural_match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it'd look more like fmt::Debug but the compiler invoking the formatting machinery would be pretty hard, compared to #[structural_match] which is straight-forward.

through the `#[structural_match]` attribute, essentially states that the `Eq`
implementation is structural.

Without this form of equality, substitution wouldn't be possible, and thus
typechecking an arbitrarily value-depending type constructor would not be
possible.

## Parsing

Introducing expr subgrammar in type position isn't possible without setting
some restrictions to the possible expressions.

In this case, we put restrictions to arithmetic and bitwise operations (`+-/!^`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly are those restrictions? Could you give some examples what is / isn't allowed?

Copy link
Contributor

@Ixrec Ixrec Feb 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure this sentence enumerates exactly what those restrictions are, but the grammar does need work. I'd phrase it as:

This RFC proposes restricting the expressions that can appear in type position to arithmetic and bitwise operations (+-/!^ etc.) and function calls (myconstfn(a, b, c, ...) with a, b, c, ... being const-expr fragments).

Copy link
Contributor

@petrochenkov petrochenkov Feb 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think disambiguation "type vs expression" is better performed based on the first (ok, maybe first two) tokens. Sets of tokens that can start types or expressions are known, select tokens from the expression list that don't start a type and will never start a type in the future (e.g. LITERAL or {) and prepend everything else with disambiguating const.

f::<u8, 1>() // OK
f::<u8, ident> // Not OK, `ident` is interpreted as a type
f::<u8, const ident> // OK, disambiguated by `const`, `ident` is interpreted as a value
f::<u8, {ident}> // OK, `{` can't start a type, `{ident}` is interpreted as a value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ident looks really weird to me, but I'd be happy with a rule like "only a literal or a parenthesized expression containing ... " if we want to completely rule out any risk of ambiguity.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full list for reference: {, LITERAL, -, |, ||, .., ..., #, box, break, continue, false, if, loop, match, move, return, true, while.

From those {, -, LITERAL, true and false look like simultaneously 1) useful and 2) future proof subset.
Undisambiguated # is certainly not future proof, everything else can require disambiguation just to be conservative.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer {...} because of the reduced verbosity, and because it is already expression syntax (if we want to use some crazy cover grammar scheme in the future).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{...} would be a bit weird looking and it does not declare intent... I think const ident is more explicit, but I am even more in favor of using a ; to separate const expressions from types:

struct Foo<T, U; x: u32, y: isize>;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or to be more explicit:

struct Foo<T, U; const x: u32, const y: isize>;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think everyone saying const ident means const expr, right? e.g. const a+b.
@mark-i-m Also, I thought we were talking about use, not definition?
e.g. Array<T, {n+1}> instead of Array<T, const n+1>.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦‍♂️ 😆 @eddyb You are right on both counts

etc.) and function calls (`myconstfn(a, b, c, ...)` with `a, b, c, ...` being
`const-expr` fragments).

Additionally, we allow parenthesizes, which can contain any general const-expr
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Types can already be parenthesized though. How do you parse a<(b<c,d>::e)>?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is "parenthesizes" a typo? Maybe you mean "parentheses" here.

fragment.

# How we teach this

This RFC aims to keep a "symmetric" syntax to the current construct, giving an
intuitive behavior, however there are multiple things that are worth explaining
and/or clearing up:

**What are dependent types?**

Dependent types are types, which _depend_ on values, instead of types. For
example, [T; 3], is dependent since it depends on the value, `3`, for
constructing the type. Dependent types, in a sense, are similar to normal
generics, where types can depend on other types (e.g. `Vec<T>`), whereas
dependent types depend on values.

**How does this differ from other languages' implementations of dependent types?**

Various other languages have dependent type systems. Strictly speaking, all
that is required for a dependent type system is value-to-type constructors,
although some languages (coq, agda, etc.) goes a step further and remove the
boundary between value and type. Unfortunately, as cool as it sounds, it has
some severe disadvantages: most importantly, the type checking becomes
undecidable. Often you would need some form of theorem prover to type check
the program, and those have their limitations too.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to say that the part about undecidability here is very misleading.

To be clear, type checking in Agda and Coq is decidable in general. Undecidability potentially comes into play from non-termination of some program fragment during evaluation that occurs while type checking. But by default, both Agda and Coq require that programs pass the termination check before they can be used. So that source of undecidability cannot arise unless termination checking is explicitly disabled.

The case may be different for Idris since it does not enforce termination by default as far as I know.

However, even in the case where there is undecidability, it's also misleading to bring it up as a serious disadvantage in the context of Rust. The fact of the matter is that type checking in Rust is already undecidable due to the expressive power of associated types. For instance, you can already encode a full lambda calculus interpreter in the type system.


**What are `const fn` and how is it linked to this RFC?**

`const fn` is a function, which can be evaluated at compile time. While it
is currently rather limited, in the future it will be extended (see
[Miri](https://github.com/solson/miri)). You can use constexprs to take one
type-level value, and non-trivially calculate a new one.

**What are the usecases?**

There are many usecases for this. The most prominent one, perhaps, is
abstracting over generically sized arrays. Dependent types allows one to lift
the length of the array up to the type-level, effectively allowing one to
parameterize over them.

# Drawbacks
[drawbacks]: #drawbacks

If we want to have type-level Turing completeness, the halting problem is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are const exprs turing-complete? Are there plan to make them so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Miri-based constexprs could be turing-complete, allowing arbitrary looping and recursion. Miri already has configurable stack-depth, runtime, and memory usage limits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, how useful is it to have a Turing-complete type system? Is it worth the undecidability?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have one, so the issue is more "does this have enough benefits to justify making the Turing-completeness more likely to crop up accidentally, with the footguns that arise from that."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can always restrict Turing completeness by setting a reasonable time and memory bound.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, but I would like to avoid getting errors like "Sorry, compiling your program is undecidable". Also, such an error is hard to explain if you don't know some CS theory.

Copy link

@hanna-kruppe hanna-kruppe Mar 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-i-m The type system is already Turing complete in other respects, and like in those contexts, an error would not mention computability theory but just the specific limit you overstepped. (I am 9% sure there will a limit in any case, to avoid going in an infinite loop on erroneous programs — choosing Turing-completeness would just mean allowing the user to raise this limit arbitrarily high.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant was that we should avoid making these errors more likely to come up if possible. I can imagine new users getting such an error and just thinking that the compiler is really slow or poorly engineered, rather than realizing that they are actually asking the compiler to do something impossible.

inevitable. One could "fix" this by adding timeouts, like the current recursion
bounds.

Another drawback is the lack of implication proves.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean "implication proofs" or "implication provers" here? I am not well-versed in type theory, but "implication proves" seems like strange wording to me...


# Alternatives
[alternatives]: #alternatives

## Alternative syntax

### A constexpr type constructor

Add some language item type constructor, `Const<T>`, allowing for constructing
a constexpr-only types.

`x: T` can coerce into `Const<T>` if `x` is constexpr. Likewise, can `Const<T>`
coerce into `T`.

```rust
fn do_something(x: Const<u32>) -> u32 { x }

struct Abc {
constfield: Const<u32>,
}
```

The pro is that it adds ability to implement e.g. constant indexing, `Index<Const<usize>>`.

The syntax is described above is, in fact, ambiguous, and multiple other better
or worse candidates exists:

### Blending the value parameters into the arguments

This one is an interesting one. It allows for defining functions with constant
_arguments_ instead of constant _parameters_. This allows for bounds on e.g.
`atomic::Ordering`.

```rust
fn do_something(const x: u32) -> u32 { x }
```

From the callers perspective, this one is especially nice to work with, however
it can lead to confusion about mixing up constargs and runtime args. One
possible solution is to segregate the constargs from the rest arguments by a
`;` (like in array types).

Another way to semantically justify such a change is by the [`Const` type constructor](#an-extension-a-constexpr-type-constructor)

### Square brackets

Use square brackets for dependent parameters:

```rust
fn do_something[x: u32]() -> u32 { x }

do_something::[2]();
```

### `const` as an value-type constructor

Create a keyword, `const`:

```rust
fn do_something<x: const u32>() -> u32 { x }

do_something::<2>();
```
# Unresolved questions
[unresolved]: #unresolved-questions

## Syntactical/conventional

What syntax is preferred?

What should be the naming conventions?

Should we segregate the value parameters and type parameters by `;`?

## Compatibility

How does this play together with HKP?

What API would need to be rewritten to take advantage of Π-types?

## Features

Should there be a way to parameterize functions dynamically?

## Semantics

Find some edge cases, which can be confusing.