-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove markers and replace with a default rule for unused type/lifetime parameters #12
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,168 @@ | ||
- Start Date: (fill me in with today's date, YYY-MM-DD) | ||
- RFC PR #: (leave this empty) | ||
- Rust Issue #: (leave this empty) | ||
|
||
# Summary | ||
|
||
Change Rust's variance inference rules to make them less error-prone. | ||
|
||
# Motivation | ||
|
||
Rust currently employs automatic variance inference to decide whether | ||
a type parameter is covariant, contravariant, or invariant. The | ||
current inference scheme is as general as possible but sometimes leads | ||
to surprising behavior, particularly around type or lifetime | ||
parameters that are not used. | ||
|
||
For example, consider the following struct definition: | ||
|
||
struct Wrap<T>; | ||
|
||
Here, the type parameter `T` is not used at all. The inference | ||
algorithm will infer that this is *bivariant*, meaning that no matter | ||
what value we supply for `T`, accesses to the fields of `Wrap` will | ||
continue to be safe. In other words, if I have `&Wrap<int>`, that is | ||
interchangable with `&Wrap<uint>`. The algorithm is not wrong. Since | ||
`Wrap` has no fields, there is nothing you can do with this `T` that | ||
is incorrect per se. But the result is certainly surprising. | ||
|
||
The current algorithm is particularly broken when people employ unsafe | ||
code that may encode uses for types that are not immediately evident. | ||
For example, one idiom is to attach a lifetime to a struct that | ||
contains unsafe pointers, to prevent that struct from being used | ||
outside of a certain scope. For example, at some point the iterator | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe it is just me, but from an engineering point of view, this seems like a really horrible idiom and we should not encourage it. If there is a need for this we should have some sort of explicit visibility thing in the language. I don't think we should change our type system to help its use. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's not obvious to me why this is horrible or what the alternatives would be. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, that was strong language to use without explanation. I meant it is horrible because it feels like a hack - you add an unused parameter to enforce using the struct in a certain scope - there is no indication of intent there. Someone reading the code would not know that that was what the lifetime parameter was for if they didn't know about the idiom. I would prefer something explicit to limit the usage, rather than implicitly relying on lifetime checking where there is not really a lifetime (from the perspective of inside the strucure). I also have no idea what such a thing would look like though. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, we have an existing design that gives a hint at what it looks like -- |
||
for a vector was defined as follows: | ||
|
||
struct Elements<'a, T> { | ||
start: *T, end: *T | ||
} | ||
|
||
Here of course, the lifetime `'a` is not used, and hence the inference | ||
algorithm again infers bivariance. This effectively means that the | ||
lifetime parameter `'a` is completely ignored, which is certainly not | ||
what was intended. | ||
|
||
The current way to control the inference algorithm is to employ marker | ||
types like `CovariantType` or `ContravariantLifetime`. These are | ||
maximally flexible but also dangerous as defaults -- if you don't know | ||
that you need to use them, you will get very lax typing, much laxer | ||
than you might expect. | ||
|
||
I'd like to remove the need for marker types altogether. Currently | ||
they are used to opt out of the builtin kinds, but we are moving to an | ||
opt-in system which makes those markers unnecessary. The remaining | ||
place marker types are used is for specifying variance, but with this | ||
proposal they will not be needed there either. | ||
|
||
# Detailed design | ||
|
||
The proposal has three parts: | ||
|
||
1. Change the default behavior of the inference algorithm as follows: | ||
If a type or lifetime parameter is not used within the body of a | ||
type, we default to covariance for types and contravariance for | ||
lifetimes. | ||
2. Treat `Unsafe<T>` as invariant with respect to `T`. This is not | ||
strictly necessary (the current behavior can be simulated, | ||
described below), but it's more convenient for users. | ||
3. Remove the marker types for variance inference. | ||
|
||
Let's address each part in turn. | ||
|
||
## Change 1. Adjust how inference algorithm treats unused parameters. | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would encourage invariant as the default for types. I think programmers should assume invariant subtyping, even though covariant is the intuitive thing. We should not encourage this wrong-thinking. |
||
This change chooses as defaults what I believe most people would expect. | ||
More specifically, if you write a struct with an unused type parameter `T`: | ||
|
||
struct Foo<T> { } // Note: T is not used | ||
|
||
That is equivalent (from the variance algorithm's point of view) to | ||
a struct that contains an instance of `T`: | ||
|
||
struct Foo<T> { field: T } // Foo<T> above equivalent to this | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would expect this to make There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not how Rust works, though. Mutability is imposed from the outside: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (Note that if you had There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So you will infer different variances for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, because |
||
|
||
Similarly, if you write a struct with an unused lifetime parameter `'a`: | ||
|
||
struct Bar<'a> { } // Note: 'a is not used | ||
|
||
That is equivalent to a struct containing a reference of lifetime `'a`: | ||
|
||
struct Bar<'a> { f: &'a () } // Bar<'a> above equivalent to this | ||
|
||
This is what I intuitively expect when I see `Foo<T>` or `Bar<'a>`. | ||
|
||
## Change 2. Treat `Unsafe<T>` as invariant in the inference analysis. | ||
|
||
We've already decided that interior mutability which does not build on | ||
`Unsafe<T>` is undefined behavior (this is needed to prevent segfaults | ||
and so forth from static constants). Therefore, since `Unsafe<T>` is | ||
built into the language rules, it just makes sense for `Unsafe<T>` to | ||
be known to the variance inference as well. The inference algorithm | ||
can treat `T` as invariant, which means there will be no need for a | ||
marker type here. | ||
|
||
This is more convenient for users since an `Unsafe` static constant | ||
can be created with having to write out any markers: | ||
|
||
``` | ||
// Before | ||
static UNSAFE_ZERO: Unsafe<uint> = Unsafe { value: 0, | ||
marker: marker::InvariantType }; | ||
|
||
// After: | ||
static UNSAFE_ZERO: Unsafe<uint> = Unsafe { value: 0 }; | ||
``` | ||
|
||
Note that we could not make this change, but it would require keeping | ||
markers or something equivalent to markers, as described in the next | ||
section. | ||
|
||
## Change 3. Remove marker types. | ||
|
||
If we change the algorithm as described above, then I think there is | ||
no longer any need for marker types. | ||
|
||
One reason for this is that their effect can be completely simulated: | ||
|
||
``` | ||
struct CovariantType<T>; | ||
struct ContravariantType<T> { m: CovariantType<fn(T)> } | ||
struct InvariantType<T> { m: CovariantType<fn(T) -> T> } | ||
struct CovariantLifetime<'a> { m: CovariantType<fn(&'a int)> } | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was really worried about this, but now I like the idea, seeing how |
||
struct ContravariantLifetime<'a>; | ||
struct InvariantLifetime<'a> { m: CovariantType<fn(&'a int) -> &'a int> } | ||
``` | ||
|
||
But the real reason for this change is not precisely that the current | ||
markers can be simulated; it's more that I don't see the need for | ||
them. Previously, there were two known situations where inference | ||
was insufficient and markers were needed: | ||
|
||
1. Unused lifetimes not capturing the expected constraint. Addressed by | ||
Change 1 above. | ||
2. Interior mutability (`Cell<T>`, `RefCell<T>`), which should be invariant | ||
with respect to `T`. This is addressed by Change 2. | ||
|
||
# Alternatives | ||
|
||
There are many possible alternatives: | ||
|
||
**Keep the current system, with the attendant risks.** I think we'll | ||
see broken code this way, where users are not getting the type system | ||
guarantees they think they are getting. | ||
|
||
**Default to invariance for unused type parameters, not covariance.** | ||
This would mean that, e.g., `Foo<&'static int>` is not a subtype of | ||
`Foo<&'a int>`. Probably not much difference in practice, but I think | ||
it's slightly more intuitive to follow the rule that unused type | ||
parameters act *as if* they appeared the struct had a member of that | ||
type. | ||
|
||
**Make it an error to have an unused type or lifetime parameters, | ||
except for in marker types.** This is the most explicit system, but | ||
also requires that we keep marker types (which are undeniably awkward) | ||
and requires the most direct interaction with variance annotations. | ||
|
||
# Unresolved questions | ||
|
||
None. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a lint (warning you about bivariant type parameters or lifetime parameters) deal with this? The lint could then point the programmer at the
std::kinds::marker
module, and then the programmer who had some particular intuition about what that phantom type was for could actually explicitly write out their intent.(You have of course mentioned
std::kinds::marker
below. I just want to understand if the purpose of this RFC is to reduce risk, which I think a lint will be better at, or for programmer convenience, which is where choosing the defaults to match expectations (or perhaps to match actual frequency of occurrence) would be relevant.)((ugh, this is exactly what bill-myers said above. Sorry for the noise.. :( ))