-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict constants in patterns #1445
Restrict constants in patterns #1445
Conversation
| u8 | u16 | u32 | u64 | usize // unsigned integers | ||
| char // characters | ||
| bool // booleans | ||
| (B, ..., B) // tuples of builtin types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be useful to consider including &str
and arrays as well, since they have a well defined “primitive” memory-representation equality.
Pretty strongly in favour. |
This RFC is a good idea. Since it only affects matching on constants of non-primitive types which is quite the sticky situation, I am in favor of this RFC. |
-- the same issues arise if you consider exhaustiveness checking. | ||
|
||
On the other hand, it feels very silly for the compiler not to | ||
understand that `match some_bool { true => ..., false => ... }` is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say true
and false
feel more like unit variants of "enum bool { false, true }
" than opaque constants.
This breaks lots of code of this kind: pub struct Errno(pub c_int);
pub const Interrupted: Errno = Errno(EINTR); My code contains at least 26 instances of this pattern (excluding flags) and an uncountable number of variants:
For instance, the Errno struct comes with 132 variants. How am I supposed to repair this? |
Newtypes like this are exactly the pattern which is responsible for the most of breakage on crates.io and in rustc itself. |
I've also no idea what this has to do with constants. match x {
Errno(1) => 1,
_ => 0,
}
if x == Errno(1) { 1 } else { 0 } are not guaranteed to produce the same result either. Whether |
Of course the same also applies to enums. E.g. enum A {
X(u8),
Y,
}
impl PartialEq for A {
fn partial_cmp(&self, other: &A) -> bool { false }
} Matching will return results that are incompatible with the PartialEq results. |
Given that the behavior for enums is already fixed (at least the RFC doesn't suggest otherwise), and given that newtypes with all fields public are very similar to enums, it seems that the current behavior (which agrees with the enum behavior) is already the expected behavior. |
@mahkoh the problem here is using enums has a well specified structural equivalency, but using a constant doesn’t make the equivalency used obvious. There’s a bunch of additional reasons outlined by the RFC why structural equivalency with constants is not satisfactory. |
I see a difference between matching on a constant Looking at it like this, there is no reason to expect that match foo { Errno(1) => ... } and const C: Errno = Errno(1);
match foo { C => ... } would do the same thing, just as you would not necessarily expect that: if errno.0 == 1 { ... } and if errno == C { ... } would do the same thing. PS, I am summarizing something I wrote on the internals thread, just for reference. |
Yes, the proposal is backwards incompatible, and this is exactly the kind of cases that would no longer work. You would have to either use the feature-gate or translate such code from UPDATE: To be clear, I don't want to break code either, and I also really want constants of user-defined types to work in patterns! I don't mean to sound glib. But, based on the crater results, it does seem that we have room here to rollback to semantics everyone can agree on. This would allow us to have a productive discussion later on focusing on what the best overall semantics ought to be for constants in a pattern. |
I'm fine with a feature gate if it can be enabled at the point where the type is defined. E.g. #[structural_match]
pub struct Errno(pub c_int); |
By this, do you mean that the clients who are matching on |
@nikomatsakis Exactly. |
@mahkoh Hmm. It is normally something we would not allow, since if we decided to use semantic equality (for example), then there might not be an equivalent behavior to |
@nikomatsakis As long as the attribute is behind a feature gate, breaking it again doesn't seem to be a problem. |
This sort of breaking change seems like it should be a semver major bump. It's not a security, bug, or soundness fix. It's instead trying to improve reasoning about code. Not that the goal is bad, but that breakage for that goal seems unacceptable for the 1.x versions. |
@nikomatsakis Whether the behavior is the current one or one that uses PartialEq, the code I'm worried about behaves the same. This attribute is simply supposed to bridge the gap. Maybe call it |
Personally, I consider this a bug fix. That is, I did not expect that constants of arbitrary types should be matchable. In fact, I opened a bug about it before 1.0, but that bug was accidentally closed when I wrote a comment like "does not try to fix #20489", and hence it dropped off of my radar when triaging for "things that ought to be feature-gated before 1.0". However, clearly there is room for disagreement here. |
Curiously, Exhaustiveness checking is a separate orthogonal concern, it still may be better to turn it off for constants regardless of |
That is mostly true, but only if all types embedded within the struct also |
@nikomatsakis
So, yes, it was implied, that the both |
So, on reflection, I quite like this idea. My assumption is that this would be something like
Shall I adjust the RFC? I think so. (*) Actually, the attribute could even stay as a kind of performance optimization. That is, @pnkfelix and I have talked about the compiler recognizing when the |
|
On interaction of Variant 1. Vartiant 2. Variant 3. Some bikeshedding, names for |
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get: * No accidentally passing some other kind of string to a thing expecting a `TargetTriple` * Serialization/deserialization is still transparent, no schema changes or anything * We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR) * Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example) * We can "find all references" to the type itself ("where do we care about targets?") * We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?") * We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.) That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases). You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose). There is however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`. I've added `.as_explicit_ref` to work around those cases. Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!). Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least that). I would've used `LazyLock` but our MSRV is currently 1.79 and LazyLock is since 1.80 :(
Feature-gate the use of constants in patterns unless those constants have simple types, like integers, booleans, and characters. The semantics of constants in general were never widely discussed and the compiler's current implementation is not broadly agreed upon (though it has many proponents). The intention of adding a feature-gate is to give us time to discuss and settle on the desired semantics in an "affirmative" way.
Because the compiler currently accepts a larger set of constants, this is a backwards incompatible change. This is justified as part of the "underspecified language semantics" clause of RFC 1122. A crater run found 14 regressions on crates.io, which suggests that the impact of this change on real code would be minimal.
Note: this was also discussed on an internals thread. Major points from that thread are summarized either inline or in alternatives.
Rendered view.
tracking issue: rust-lang/rust#31434