Add cloning example for dot operator behaviour #292

Merged
merged 6 commits into from
Jul 23, 2021
Changes from 3 commits
108 changes: 107 additions & 1 deletion src/dot-operator.md
@@ -2,5 +2,111 @@

The dot operator will perform a lot of magic to convert types. It will perform
auto-referencing, auto-dereferencing, and coercion until types match.
The detailed mechanics of method lookup are defined [here][method_lookup],
but here is a brief overview that outlines the main steps.

TODO: steal information from http://stackoverflow.com/questions/28519997/what-are-rusts-exact-auto-dereferencing-rules/28552082#28552082
Suppose we have a function `foo` that has a receiver (a `self`, `&self` or
`&mut self` parameter). If we call `value.foo()`, the compiler needs to determine
what type `Self` is before it can call the correct implementation of the function.
For this example, we will say that `value` has type `T`.

We will use [fully-qualified syntax][fqs]
to be more clear about exactly which type we are calling a function on.

- First, the compiler checks if we can call `T::foo(value)` directly.
This is called a "by value" method call.
- If we can't call this function (for example, if the function has the wrong type
or a trait isn't implemented for `Self`), then the compiler tries to add in an
automatic reference. This means that the compiler tries `<&T>::foo(value)` and
`<&mut T>::foo(value)`. This is called an "autoref" method call.
- If none of these candidates worked, we dereference `T` and try again. This
uses the `Deref` trait - if `T: Deref<Target = U>` then we try again with type `U`
instead of `T`. If we can't dereference `T`, we can also try _unsizing_ `T`.
This just means that if `T` has a size parameter known at compile time, we "forget"
it for the purpose of resolving methods. For instance, this unsizing step can
convert `[i32; 2]` into `[i32]` by "forgetting" the size of the array.
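
The steps above can be sketched with a small example. `Counter` and `lookup_demo` are hypothetical names introduced only for illustration, and the fully-qualified calls in the comments are rough equivalents rather than the compiler's literal output.

```rust
// Hypothetical type used only for illustration.
struct Counter {
    hits: u32,
}

impl Counter {
    fn peek(&self) -> u32 {
        self.hits
    }

    fn bump(&mut self) {
        self.hits += 1;
    }

    fn into_hits(self) -> u32 {
        self.hits
    }
}

fn lookup_demo() -> (u32, u32, usize) {
    let mut boxed: Box<Counter> = Box::new(Counter { hits: 0 });

    // `Box<Counter>` and its autorefs have no `bump`, so the compiler
    // dereferences to `Counter` and then autorefs:
    // roughly `Counter::bump(&mut *boxed)`.
    boxed.bump();

    // Same dereference, with a shared autoref: roughly `Counter::peek(&*boxed)`.
    let seen = boxed.peek();

    // After the dereference the by-value candidate matches:
    // roughly `Counter::into_hits(*boxed)`.
    let total = boxed.into_hits();

    // `len` is defined on slices, so `[i32; 3]` is unsized to `[i32]` and
    // autoref'd: roughly `<[i32]>::len(&array)`.
    let array = [10, 20, 30];
    let length = array.len();

    (seen, total, length)
}
```

Calling `lookup_demo()` exercises a deref followed by each kind of receiver, plus one unsizing step.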

Here is an example of the method lookup algorithm.
```rust,ignore
let array: Rc<Box<[T; 3]>> = ...;
let first_entry = array[0];
```

How does the compiler actually compute `array[0]` when the array is behind so
many indirections? First, `array[0]` is really just syntax sugar for the [`Index`][index]
trait - the compiler will convert `array[0]` into `array.index(0)`. Now, the
compiler checks to see if `array` implements `Index`, so that we can call the
function.

First, the compiler checks if `Rc<Box<[T; 3]>>` implements `Index`, but it
does not, and neither do `&Rc<Box<[T; 3]>>` or `&mut Rc<Box<[T; 3]>>`. Since
none of these worked, the compiler dereferences the `Rc<Box<[T; 3]>>` into
`Box<[T; 3]>` and tries again. `Box<[T; 3]>`, `&Box<[T; 3]>` and `&mut Box<[T; 3]>`
do not implement `Index`, so it dereferences again. `[T; 3]` and its autorefs
also do not implement `Index`. We can't dereference `[T; 3]`, so the compiler
unsizes it, giving `[T]`. Finally, `[T]` implements `Index`, so we can now call the
actual `index` function.
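
As a rough illustration of that chain, here is a sketch that fixes `T = i32` (an assumption, so that the element can be copied out) and spells out each step by hand. The function names are hypothetical, and the exact set of candidates the compiler considers may differ between compiler versions.

```rust
use std::ops::Index;
use std::rc::Rc;

// The sugared form: the compiler finds the indirections for us.
fn first_entry_sugar(array: &Rc<Box<[i32; 3]>>) -> i32 {
    array[0]
}

// The same lookup written out step by step.
fn first_entry_explicit(array: &Rc<Box<[i32; 3]>>) -> i32 {
    // Dereference the `Rc` to reach the `Box`.
    let boxed: &Box<[i32; 3]> = &**array;
    // Dereference the `Box` to reach the array.
    let fixed: &[i32; 3] = &**boxed;
    // Unsize the array by "forgetting" its length.
    let slice: &[i32] = fixed;
    // Call `Index::index` on the slice; it returns a reference to the element.
    *Index::index(slice, 0usize)
}
```

Both functions return the same element; the second only makes the dereferences and the unsizing visible.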

Consider the following more complicated example of the dot operator at work.
```rust,ignore
fn do_stuff<T: Clone>(value: &T) {
    let cloned = value.clone();
}
```
What type is `cloned`? First, the compiler checks if we can call by value.
The type of `value` is `&T`, and so the `clone` function has signature
`fn clone(&T) -> T`. We know that `T: Clone`, so the compiler finds that
`cloned: T`.

What would happen if the `T: Clone` bound were removed? We would not be able
to call by value, since there is no implementation of `Clone` for `T`. So the
compiler tries to call by autoref. In this case, the function has signature
`fn clone(&&T) -> &T` since `Self = &T`. The compiler sees that `&T: Clone`, and
then deduces that `cloned: &T`.
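
The two cases can be checked by writing the deduced type as a return type. This is a minimal sketch; the function names are hypothetical, and the `allow` attribute is there only because recent compilers may warn that the second call does nothing.

```rust
// With the bound, `clone` is called by value on `&T` and yields an owned `T`.
fn clone_with_bound<T: Clone>(value: &T) -> T {
    value.clone()
}

// Without the bound, `clone` is called by autoref with `Self = &T`,
// so the result is just another `&T` pointing at the same value.
#[allow(noop_method_call)]
fn clone_without_bound<T>(value: &T) -> &T {
    value.clone()
}
```

Both functions compile only because the declared return types match what the compiler deduces for `value.clone()`; swapping the return types would be a type error.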

Here is another example where the autoref behaviour is used to create some subtle
effects.
```rust,ignore
use std::sync::Arc;

#[derive(Clone)]
struct Container<T>(Arc<T>);

fn clone_containers<T>(foo: &Container<i32>, bar: &Container<T>) {
    let foo_cloned = foo.clone();
    let bar_cloned = bar.clone();
}
```
What types are `foo_cloned` and `bar_cloned`? We know that `Container<i32>: Clone`,
so the compiler calls `clone` by value to give `foo_cloned: Container<i32>`.
However, `bar_cloned` actually has type `&Container<T>`. Surely this doesn't make
sense - we added `#[derive(Clone)]` to `Container`, so it must implement `Clone`!
Looking closer, the code generated by the `derive` macro is (roughly):
```rust,ignore
impl<T> Clone for Container<T> where T: Clone {
    fn clone(&self) -> Self {
        Self(Arc::clone(&self.0))
    }
}
```
The derived `Clone` implementation is
[only defined where `T: Clone`][clone],
so there is no implementation of `Clone` for `Container<T>` when `T` is an
arbitrary generic type. The
compiler then looks to see if `&Container<T>` implements `Clone`, which it does.
So it deduces that `clone` is called by autoref, and so `bar_cloned` has type
`&Container<T>`.
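
One way to observe this, sketched here with hypothetical helper functions, is to pin the deduced types down as return types and watch the `Arc` reference count. The `allow` attribute only silences a warning that recent compilers may emit for the no-op call.

```rust
use std::sync::Arc;

#[derive(Clone)]
struct Container<T>(Arc<T>);

// `Container<i32>: Clone` holds, so this is a by-value call that
// produces a new `Container` sharing the same `Arc`.
fn clone_concrete(foo: &Container<i32>) -> Container<i32> {
    foo.clone()
}

// For an unconstrained `T` the derived impl does not apply, so this
// resolves to `<&Container<T> as Clone>::clone` and only copies the reference.
#[allow(noop_method_call)]
fn clone_generic<T>(bar: &Container<T>) -> &Container<T> {
    bar.clone()
}
```

After `clone_concrete` the strong count of the inner `Arc` goes up by one, whereas `clone_generic` leaves it unchanged and returns a reference to the original container.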

We can fix this by implementing `Clone` manually without requiring `T: Clone`.
```rust,ignore
impl<T> Clone for Container<T> {
    fn clone(&self) -> Self {
        Self(Arc::clone(&self.0))
    }
}
```
Now, the type checker deduces that `bar_cloned: Container<T>`.
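
A self-contained sketch of the fix follows. `NotClone` and `clone_generic` are hypothetical names added for illustration; the payload type deliberately does not implement `Clone`, to show that the manual impl no longer depends on it.

```rust
use std::sync::Arc;

struct Container<T>(Arc<T>);

// Manual impl with no `T: Clone` bound: cloning the `Arc` never clones `T`.
impl<T> Clone for Container<T> {
    fn clone(&self) -> Self {
        Self(Arc::clone(&self.0))
    }
}

// Hypothetical payload type that deliberately does not implement `Clone`.
struct NotClone;

// The by-value candidate now matches for every `T`, so the result is an
// owned `Container<T>` rather than a reference.
fn clone_generic<T>(bar: &Container<T>) -> Container<T> {
    bar.clone()
}
```

Calling `clone_generic` on a `Container<NotClone>` now yields a second handle to the same allocation.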

[fqs]: https://doc.rust-lang.org/nightly/book/ch19-03-advanced-traits.html#fully-qualified-syntax-for-disambiguation-calling-methods-with-the-same-name
[method_lookup]: https://rustc-dev-guide.rust-lang.org/method-lookup.html
[index]: https://doc.rust-lang.org/std/ops/trait.Index.html
[clone]: https://doc.rust-lang.org/std/clone/trait.Clone.html#derivable