Merge remote-tracking branch 'aturon/slice-notation'

rust-lang · Sep 11, 2014 · 7e4046e · 7e4046e
2 parents 61c64a2 + c233ec7
commit 7e4046e
Showing 1 changed file with 227 additions and 0 deletions.
diff --git a/active/0000-slice-notation.md b/active/0000-slice-notation.md
@@ -0,0 +1,227 @@
+- Start Date: (fill me in with today's date, 2014-08-12)
+- RFC PR #: (leave this empty)
+- Rust Issue #: (leave this empty)
+
+# Summary
+
+This RFC adds *overloaded slice notation*:
+
+- `foo[]` for `foo.as_slice()`
+- `foo[n..m]` for `foo.slice(n, m)`
+- `foo[n..]` for `foo.slice_from(n)`
+- `foo[..m]` for `foo.slice_to(m)`
+- `mut` variants of all the above
+
+via two new traits, `Slice` and `SliceMut`.
+
+It also changes the notation for range `match` patterns to `...`, to
+signify that they are inclusive whereas `..` in slices are exclusive.
+
+# Motivation
+
+There are two primary motivations for introducing this feature.
+
+### Ergonomics
+
+Slicing operations, especially `as_slice`, are a very common and basic thing to
+do with vectors, and potentially many other kinds of containers.  We already
+have notation for indexing via the `Index` trait, and this RFC is essentially a
+continuation of that effort.
+
+The `as_slice` operator is particularly important. Since we've moved away from
+auto-slicing in coercions, explicit `as_slice` calls have become extremely
+common, and are one of the
+[leading ergonomic/first impression](https://github.com/rust-lang/rust/issues/14983)
+problems with the language. There are a few other approaches to address this
+particular problem, but these alternatives have downsides that are discussed
+below (see "Alternatives").
+
+### Error handling conventions
+
+We are gradually moving toward a Python-like world where notation like `foo[n]`
+calls `fail!` when `n` is out of bounds, while corresponding methods like `get`
+return `Option` values rather than failing. By providing similar notation for
+slicing, we open the door to following the same convention throughout
+vector-like APIs.
+
+# Detailed design
+
+The design is a straightforward continuation of the `Index` trait design. We
+introduce two new traits, for immutable and mutable slicing:
+
+```rust
+trait Slice<Idx, S> {
+    fn as_slice<'a>(&'a self) -> &'a S;
+    fn slice_from(&'a self, from: Idx) -> &'a S;
+    fn slice_to(&'a self, to: Idx) -> &'a S;
+    fn slice(&'a self, from: Idx, to: Idx) -> &'a S;
+}
+
+trait SliceMut<Idx, S> {
+    fn as_mut_slice<'a>(&'a mut self) -> &'a mut S;
+    fn slice_from_mut(&'a mut self, from: Idx) -> &'a mut S;
+    fn slice_to_mut(&'a mut self, to: Idx) -> &'a mut S;
+    fn slice_mut(&'a mut self, from: Idx, to: Idx) -> &'a mut S;
+}
+```
+
+(Note, the mutable names here are part of likely changes to naming conventions
+that will be described in a separate RFC).
+
+These traits will be used when interpreting the following notation:
+
+*Immutable slicing*
+
+- `foo[]` for `foo.as_slice()`
+- `foo[n..m]` for `foo.slice(n, m)`
+- `foo[n..]` for `foo.slice_from(n)`
+- `foo[..m]` for `foo.slice_to(m)`
+
+*Mutable slicing*
+
+- `foo[mut]` for `foo.as_mut_slice()`
+- `foo[mut n..m]` for `foo.slice_mut(n, m)`
+- `foo[mut n..]` for `foo.slice_from_mut(n)`
+- `foo[mut ..m]` for `foo.slice_to_mut(m)`
+
+Like `Index`, uses of this notation will auto-deref just as if they were method
+invocations. So if `T` implements `Slice<uint, [U]>`, and `s: Smaht<T>`, then
+`s[]` compiles and has type `&[U]`.
+
+Note that slicing is "exclusive" (so `[n..m]` is the interval `n <= x
+< m`), while `..` in `match` patterns is "inclusive". To avoid
+confusion, we propose to change the `match` notation to `...` to
+reflect the distinction. The reason to change the notation, rather
+than the interpretation, is that the exclusive (respectively
+inclusive) interpretation is the right default for slicing
+(respectively matching).
+
+## Rationale for the notation
+
+The choice of square brackets for slicing is straightforward: it matches our
+indexing notation, and slicing and indexing are closely related.
+
+Some other languages (like Python and Go -- and Fortran) use `:` rather than
+`..` in slice notation. The choice of `..` here is influenced by its use
+elsewhere in Rust, for example for fixed-length array types `[T, ..n]`. The `..`
+for slicing has precedent in Perl and D.
+
+See [Wikipedia](http://en.wikipedia.org/wiki/Array_slicing) for more on the
+history of slice notation in programming languages.
+
+### The `mut` qualifier
+
+It may be surprising that `mut` is used as a qualifier in the proposed
+slice notation, but not for the indexing notation. The reason is that
+indexing includes an implicit dereference. If `v: Vec<Foo>` then
+`v[n]` has type `Foo`, not `&Foo` or `&mut Foo`. So if you want to get
+a mutable reference via indexing, you write `&mut v[n]`. More
+generally, this allows us to do resolution/typechecking prior to
+resolving the mutability.
+
+This treatment of `Index` matches the C tradition, and allows us to
+write things like `v[0] = foo` instead of `*v[0] = foo`.
+
+On the other hand, this approach is problematic for slicing, since in
+general it would yield an unsized type (under DST) -- and of course,
+slicing is meant to give you a fat pointer indicating the size of the
+slice, which we don't want to immediately deref. But the consequence
+is that we need to know the mutability of the slice up front, when we
+take it, since it determines the type of the expression.
+
+# Drawbacks
+
+The main drawback is the increase in complexity of the language syntax. This
+seems minor, especially since the notation here is essentially "finishing" what
+was started with the `Index` trait.
+
+## Limitations in the design
+
+Like the `Index` trait, this forces the result to be a reference via
+`&`, which may rule out some generalizations of slicing.
+
+One way of solving this problem is for the slice methods to take
+`self` (by value) rather than `&self`, and in turn to implement the
+trait on `&T` rather than `T`. Whether this approach is viable in the
+long run will depend on the final rules for method resolution and
+auto-ref.
+
+In general, the trait system works best when traits can be applied to
+types `T` rather than borrowed types `&T`. Ultimately, if Rust gains
+higher-kinded types (HKT), we could change the slice type `S` in the
+trait to be higher-kinded, so that it is a *family* of types indexed
+by lifetime. Then we could replace the `&'a S` in the return value
+with `S<'a>`. It should be possible to transition from the current
+`Index` and `Slice` trait designs to an HKT version in the future
+without breaking backwards compatibility by using blanket
+implementations of the new traits (say, `IndexHKT`) for types that
+implement the old ones.
+
+# Alternatives
+
+For improving the ergonomics of `as_slice`, there are two main alternatives.
+
+## Coercions: auto-slicing
+
+One possibility would be re-introducing some kind of coercion that automatically
+slices.
+We used to have a coercion from (in today's terms) `Vec<T>` to
+`&[T]`. Since we no longer coerce owned to borrowed values, we'd probably want a
+coercion `&Vec<T>` to `&[T]` now:
+
+```rust
+fn use_slice(t: &[u8]) { ... }
+
+let v = vec!(0u8, 1, 2);
+use_slice(&v)           // automatically coerce here
+use_slice(v.as_slice()) // equivalent
+```
+
+Unfortunately, adding such a coercion requires choosing between the following:
+
+* Tie the coercion to `Vec` and `String`. This would reintroduce special
+  treatment of these otherwise purely library types, and would mean that other
+  library types that support slicing would not benefit (defeating some of the
+  purpose of DST).
+
+* Make the coercion extensible, via a trait. This is opening pandora's box,
+  however: the mechanism could likely be (ab)used to run arbitrary code during
+  coercion, so that any invocation `foo(a, b, c)` might involve running code to
+  pre-process each of the arguments. While we may eventually want such
+  user-extensible coercions, it is a *big* step to take with a lot of potential
+  downside when reasoning about code, so we should pursue more conservative
+  solutions first.
+
+## Deref
+
+Another possibility would be to make `String` implement `Deref<str>` and
+`Vec<T>` implement `Deref<[T]>`, once DST lands. Doing so would allow explicit
+coercions like:
+
+```rust
+fn use_slice(t: &[u8]) { ... }
+
+let v = vec!(0u8, 1, 2);
+use_slice(&*v)          // take advantage of deref
+use_slice(v.as_slice()) // equivalent
+```
+
+There are at least two downsides to doing so, however:
+
+* It is not clear how the method resolution rules will ultimately interact with
+  `Deref`. In particular, a leading proposal is that for a smart pointer `s: Smaht<T>`
+  when you invoke `s.m(...)` only *inherent* methods `m` are considered for
+  `Smaht<T>`; *trait* methods are only considered for the maximally-derefed
+  value `*s`.
+
+  With such a resolution strategy, implementing `Deref` for `Vec` would make it
+  impossible to use trait methods on the `Vec` type except through UFCS,
+  severely limiting the ability of programmers to usefully implement new traits
+  for `Vec`.
+
+* The idea of `Vec` as a smart pointer around a slice, and the use of `&*v` as
+  above, is somewhat counterintuitive, especially for such a basic type.
+
+Ultimately, notation for slicing seems desireable on its own merits anyway, and
+if it can eliminate the need to implement `Deref` for `Vec` and `String`, all
+the better.