Restrict constants in patterns #1445

nikomatsakis · 2016-01-06T18:05:02Z

Feature-gate the use of constants in patterns unless those constants have simple types, like integers, booleans, and characters. The semantics of constants in general were never widely discussed and the compiler's current implementation is not broadly agreed upon (though it has many proponents). The intention of adding a feature-gate is to give us time to discuss and settle on the desired semantics in an "affirmative" way.

Because the compiler currently accepts a larger set of constants, this is a backwards incompatible change. This is justified as part of the "underspecified language semantics" clause of RFC 1122. A crater run found 14 regressions on crates.io, which suggests that the impact of this change on real code would be minimal.

Note: this was also discussed on an internals thread. Major points from that thread are summarized either inline or in alternatives.

Rendered view.
tracking issue: rust-lang/rust#31434

nagisa · 2016-01-06T18:47:30Z

text/0000-restrict-constants-in-patterns.md

+  | u8 | u16 | u32 | u64 | usize  // unsigned integers
+  | char                          // characters 
+  | bool                          // booleans
+  | (B, ..., B)                   // tuples of builtin types


Might be useful to consider including &str and arrays as well, since they have a well defined “primitive” memory-representation equality.

nagisa · 2016-01-06T19:14:34Z

Pretty strongly in favour.

retep998 · 2016-01-06T19:24:41Z

This RFC is a good idea. Since it only affects matching on constants of non-primitive types which is quite the sticky situation, I am in favor of this RFC.

petrochenkov · 2016-01-06T19:24:56Z

text/0000-restrict-constants-in-patterns.md

+-- the same issues arise if you consider exhaustiveness checking.
+
+On the other hand, it feels very silly for the compiler not to
+understand that `match some_bool { true => ..., false => ... }` is


I'd say true and false feel more like unit variants of "enum bool { false, true }" than opaque constants.

mahkoh · 2016-01-06T19:28:15Z

This breaks lots of code of this kind:

pub struct Errno(pub c_int);

pub const Interrupted: Errno = Errno(EINTR);

My code contains at least 26 instances of this pattern (excluding flags) and an uncountable number of variants:

ack "pub struct .*\(pub .*\)" | grep -iv flags | wc -l

For instance, the Errno struct comes with 132 variants. How am I supposed to repair this?

petrochenkov · 2016-01-06T19:43:00Z

pub struct Errno(pub c_int);

Newtypes like this are exactly the pattern responsible for most of the breakage on crates.io and in rustc itself.

mahkoh · 2016-01-06T19:43:47Z

I've also no idea what this has to do with constants.

match x {
    Errno(1) => 1,
    _ => 0,
}

if x == Errno(1) { 1 } else { 0 }

are not guaranteed to produce the same result either. Whether Errno(1) is behind a constant or not is irrelevant here.

mahkoh · 2016-01-06T19:48:58Z

Of course the same also applies to enums. E.g.

enum A {
    X(u8),
    Y,
}

impl PartialEq for A {
    fn eq(&self, other: &A) -> bool { false }
}

Matching will return results that are incompatible with the PartialEq results.
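To make the divergence concrete, here is a minimal sketch. It assumes a hypothetical `Errno` newtype over `i32` with a deliberately odd, always-false `PartialEq` impl; the function names `via_pattern` and `via_eq` are made up for illustration and do not come from the RFC.

```rust
// Hypothetical newtype for illustration; `i32` stands in for `c_int`.
#[derive(Clone, Copy)]
pub struct Errno(pub i32);

impl PartialEq for Errno {
    // Deliberately odd: no two values ever compare equal.
    fn eq(&self, _other: &Errno) -> bool {
        false
    }
}

/// Structural test: destructures the newtype and compares the inner integer.
pub fn via_pattern(x: Errno) -> i32 {
    match x {
        Errno(1) => 1,
        _ => 0,
    }
}

/// Semantic test: goes through the user-written `PartialEq::eq`.
pub fn via_eq(x: Errno) -> i32 {
    if x == Errno(1) { 1 } else { 0 }
}
```

With this impl, `via_pattern(Errno(1))` takes the first arm while `via_eq(Errno(1))` falls through to `0`, with no constant involved anywhere.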

mahkoh · 2016-01-06T19:52:03Z

Given that the behavior for enums is already fixed (at least the RFC doesn't suggest otherwise), and given that newtypes with all fields public are very similar to enums, it seems that the current behavior (which agrees with the enum behavior) is already the expected behavior.

nagisa · 2016-01-06T19:56:47Z

@mahkoh the problem here is that matching on enum variants has a well-specified structural equivalence, whereas matching on a constant doesn’t make it obvious which equivalence is being used. There’s a bunch of additional reasons outlined by the RFC why structural equivalence with constants is not satisfactory.

nikomatsakis · 2016-01-06T19:56:49Z

@mahkoh

I've also no idea what this has to do with constants.

I see a difference between matching on a constant C (which is defined as Errno(1)) and matching on Errno(1) directly in the pattern. Basically, to me, the pattern translates into a set of "test-and-extract" operations that will be performed. So, the pattern Errno(<pattern>) compiles to "test that the variant is Errno and extract the value". We then recursively apply <pattern> to that value. Sometimes there are no values to extract: if you match a nullary enum variant like None, it just means "test that the variant is None". Along these lines, matching a constant pattern like C (resp. 1) corresponds to "test that the value is equal (for some definition of equal) to C (resp. 1) and extract no values".

Looking at it like this, there is no reason to expect that

match foo { Errno(1) => ... }

and

const C: Errno = Errno(1);
match foo { C => ... }

would do the same thing, just as you would not necessarily expect that:

if errno.0 == 1 { ... }

and

if errno == C { ... }

would do the same thing.

PS, I am summarizing something I wrote on the internals thread, just for reference.

nikomatsakis · 2016-01-06T19:59:24Z

@mahkoh

This breaks lots of code of this kind

Yes, the proposal is backwards incompatible, and these are exactly the kinds of cases that would no longer work. You would have to either use the feature-gate or translate such code from match foo { BAR => ... } to match foo { c if c == BAR => ... }.

UPDATE: To be clear, I don't want to break code either, and I also really want constants of user-defined types to work in patterns! I don't mean to sound glib. But, based on the crater results, it does seem that we have room here to rollback to semantics everyone can agree on. This would allow us to have a productive discussion later on focusing on what the best overall semantics ought to be for constants in a pattern.
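For reference, the guard-based rewrite mentioned above might look roughly like this. It is only a sketch: the `Errno` newtype over `i32`, the constant names, and the numeric values (4 and 11, the usual Linux values for EINTR and EAGAIN) are assumptions for illustration.

```rust
// Hypothetical newtype; `i32` stands in for `c_int`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct Errno(pub i32);

// Assumed values, chosen only so the example is concrete.
pub const INTERRUPTED: Errno = Errno(4);
pub const WOULD_BLOCK: Errno = Errno(11);

/// Each arm binds the scrutinee to `c` and compares it with `==`,
/// so the comparison explicitly goes through `PartialEq`.
pub fn describe(e: Errno) -> &'static str {
    match e {
        c if c == INTERRUPTED => "interrupted",
        c if c == WOULD_BLOCK => "would block",
        // Guarded arms do not count toward exhaustiveness,
        // so a catch-all arm is required.
        _ => "other",
    }
}
```

One consequence of this form is that the compiler no longer reasons about the constants for exhaustiveness or unreachable-arm checking; every guarded arm is opaque to it.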

mahkoh · 2016-01-06T20:03:12Z

@nikomatsakis

I'm fine with a feature gate if it can be enabled at the point where the type is defined. E.g.

#[structural_match]
pub struct Errno(pub c_int);

nikomatsakis · 2016-01-06T20:05:22Z

@mahkoh

I'm fine with a feature gate if it can be enabled at the point where the type is defined.

By this, do you mean that the clients who are matching on Errno values would not need to use the feature-gate?

mahkoh · 2016-01-06T20:05:32Z

@nikomatsakis Exactly.

nikomatsakis · 2016-01-06T20:10:53Z

@mahkoh Hmm. It is normally something we would not allow, since if we decided to use semantic equality (for example), then there might not be an equivalent behavior to #[structural_match] in the future (and I wouldn't want to preserve the attribute). OTOH, since I suspect you do define Eq for errno to act exactly as structural match acts, that might be ok (modulo exhaustiveness). I have to think about it.

mahkoh · 2016-01-06T20:12:52Z

@nikomatsakis As long as the attribute is behind a feature gate, breaking it again doesn't seem to be a problem.

seanmonstar · 2016-01-06T20:19:20Z

This sort of breaking change seems like it should be a semver major bump. It's not a security, bug, or soundness fix. It's instead trying to improve reasoning about code. Not that the goal is bad, but that breakage for that goal seems unacceptable for the 1.x versions.

mahkoh · 2016-01-06T20:21:00Z

@nikomatsakis Whether the behavior is the current one or one that uses PartialEq, the code I'm worried about behaves the same. This attribute is simply supposed to bridge the gap. Maybe call it #[deprecated_const_match] if that makes it more obvious.

nikomatsakis · 2016-01-06T20:24:19Z

@seanmonstar

This sort of breaking change seems like it should be a semver major bump. It's not a security, bug, or soundness fix. It's instead trying to improve reasoning about code. Not that the goal is bad, but that breakage for that goal seems unacceptable for the 1.x versions.

Personally, I consider this a bug fix. That is, I did not expect that constants of arbitrary types should be matchable. In fact, I opened a bug about it before 1.0, but that bug was accidentally closed when I wrote a comment like "does not try to fix #20489", and hence it dropped off of my radar when triaging for "things that ought to be feature-gated before 1.0". However, clearly there is room for disagreement here.

petrochenkov · 2016-01-06T21:03:17Z

Curiously, #[structural_match] already exists and is called #[derive(PartialEq)].
#[derive(PartialEq)] is essentially an assertion that structural and semantic matching do the same thing, so there's no choice left what to do when performing a match and we can unambiguously generate the same code as today.
Maybe #[derive(PartialEq)] should not only generate some code during expansion, but also emit a flag this_adt_is_usable_in_const_pattern_matching, which is considered in later stages of compilation.
this_adt_is_usable_in_const_pattern_matching can become a separate attribute eventually if necessary, but generating it as a part of #[derive(PartialEq)] allows us to avoid breakage and generally looks like a reasonable way forward, covering most practical cases.

Exhaustiveness checking is a separate, orthogonal concern; it may still be better to turn it off for constants regardless of this_adt_is_usable_in_const_pattern_matching.
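A sketch of what this suggestion means for user code: when the type derives its equality, the constant can stay in pattern position unchanged. The `Errno` type, the constant, and its value are hypothetical. The example derives both `PartialEq` and `Eq`, which, as far as I understand, is what compilers released after this RFC accept for constants in patterns.

```rust
// Hypothetical newtype; the derived impls compare the field structurally,
// so structural and semantic matching cannot disagree.
#[derive(PartialEq, Eq, Clone, Copy, Debug)]
pub struct Errno(pub i32);

// Assumed value, for illustration only.
pub const INTERRUPTED: Errno = Errno(4);

/// Uses the constant directly as a pattern, with no guard.
pub fn is_interrupted(e: Errno) -> bool {
    match e {
        INTERRUPTED => true,
        _ => false,
    }
}
```

Replacing the derive with a hand-written `impl PartialEq` is the case the thread is worried about: there the compiler has no assurance that the two notions of equality agree.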

nikomatsakis · 2016-01-06T22:05:56Z

@petrochenkov

Curiously, #[structural_match] already exists and is called #[derive(PartialEq)].

That is mostly true, but only if all types embedded within the struct also #[derive(PartialEq)]. (Still, that's a very interesting thought.)

petrochenkov · 2016-01-07T13:27:57Z

@nikomatsakis
#[structural_match] should not affect nested things, otherwise you could make an "opaque" structure transparent/matchable by using a newtype:

#[structural_match]
struct Transparent(Opaque);

So, yes, it was implied that both #[structural_match] and #[derive(PartialEq)] have to be checked recursively.
In theory marking a structure containing non-#[structural_match] fields with #[structural_match] can be made an error, but derive is not smart enough to emit #[structural_match] conditionally to avoid this error, so I don't consider this variant.

nikomatsakis · 2016-01-08T16:38:38Z

@petrochenkov @mahkoh

So, on reflection, I quite like this idea. My assumption is that this would be something like #[fundamental] -- that is, an attribute that we intend to never stabilize but which lets us adopt the subset of semantics we know we want while we dicker about the remainder. In this case, using the semantics you proposed, we could ultimately resolve this in at least two ways:

Remove the attribute and adopt the current "structural match" implementation.
Remove the attribute and adopt the "semantic equality" interpretation. (*)

Shall I adjust the RFC? I think so.

(*) Actually, the attribute could even stay as a kind of performance optimization. That is, @pnkfelix and I have talked about the compiler recognizing when the PartialEq impl resulted from derive and generating better match code in that case, since its semantics are well understood. One could imagine adopting an unsafe attribute or something along those lines, which PartialEq implicitly provides.

petrochenkov · 2016-01-08T21:15:35Z

~~#[structural_match] can probably be a (possibly unsafe) structural trait (OIBIT) and not an attribute.~~
Edit: It can't be an OIBIT, but it can be a normal trait, maybe even a sub-trait for PartialEq, kind of like Copy for Clone.

trait TrivialPartialEq: PartialEq {}
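To spell out the Copy/Clone analogy, here is a minimal sketch of how such a marker sub-trait could be used. Everything except the trait declaration itself is hypothetical (the `Errno` type, the manual opt-in impl, and the `requires_trivial` helper), and nothing here is enforced by the compiler.

```rust
// Marker sub-trait as proposed above: it adds no methods, only a promise
// that `==` behaves like a field-by-field comparison.
pub trait TrivialPartialEq: PartialEq {}

// Hypothetical type with a derived (structural) equality.
#[derive(PartialEq, Debug)]
pub struct Errno(pub i32);

// Opting in by hand; in the proposal this would presumably come from the derive.
impl TrivialPartialEq for Errno {}

/// Generic code can demand the stronger guarantee through a bound,
/// much like `T: Copy` implies `T: Clone`.
pub fn requires_trivial<T: TrivialPartialEq>(a: &T, b: &T) -> bool {
    a == b
}
```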

petrochenkov · 2016-01-08T21:31:54Z

On interaction of #[structural_match] with floats.
(I assume that match can be used only with #[structural_match] types)
Edit: by "permitted in match" below I mean permitted in constants in match.

Variant 1.
The attribute gives hard guarantees about PartialEq properties.
In this case it can be used for optimizations, for example operators == for #[structural_match] types without padding can be translated into memcmps.
In this case floats simply can't be used in match, their equality comparison is not trivial.

Variant 2.
The attribute is a rubber stamp, it says "yeah, this can be used in match" and that's all. match in its turn performs a structural comparison OR runs partial_eq even if they do different things. In this case floats are allowed in match and compared structurally OR semantically, respectively.

Variant 3.
Floats are an exception, they are not #[structural_match], but permitted in match and compared structurally or semantically. Structures containing floats are not permitted in match.
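The reason floats need special treatment in all three variants can be shown with a small sketch. The helper names are made up, and "structural" is modelled here as a comparison of the raw bit patterns, which is an assumption about what a structural float comparison would mean.

```rust
/// Semantic comparison: IEEE 754 equality, as implemented by `PartialEq` for `f64`.
pub fn semantic_eq(a: f64, b: f64) -> bool {
    a == b
}

/// "Structural" comparison, modelled as equality of the raw bit patterns.
pub fn structural_eq(a: f64, b: f64) -> bool {
    a.to_bits() == b.to_bits()
}
```

The two disagree in both directions: `semantic_eq(0.0, -0.0)` is true while `structural_eq(0.0, -0.0)` is false, and `semantic_eq(f64::NAN, f64::NAN)` is false even though a NaN is normally bit-identical to itself.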

Some bikeshedding, names for structural_match:
trivially_comparable, default_comparable - "comparable" can be confused with PartialOrd
trivial_equality
trivial_partial_eq (or TrivialPartialEq for a trait) - says exactly what it does, seems like a good choice if it provides hard guarantees.

As discussed, for the price of having to think about `TargetTriple` (like `String`) vs `&TargetTripleRef` (like `&str`), we get:

* No accidentally passing some other kind of string to a thing expecting a `TargetTriple`
* Serialization/deserialization is still transparent, no schema changes or anything
* We can add methods to it (like `is_windows()` in this PR - note that I dream of a `ParsedTargetTriple` in a separate PR)
* Those methods are the only place where we check properties of the string (before this commit, we have `.contains("windows")` and `.contains("pc-windows")` for example)
* We can "find all references" to the type itself ("where do we care about targets?")
* We can "find all references" to `TargetTriple::new` ("where do we build targets from strings?")
* We can "find all references" to `TargetTripleRef::as_str` ("where do we coerce it back into a string to pass it to a tool like cargo/wix/etc.?")

That kind of change is invaluable for me when working on cross-compilation support, and I suspect it will be invaluable for any current and future maintainers of cargo-dist as well (I've used it with great success in other large codebases).

You can still treat `TargetTriple` as a string, but it'll be uglier (on purpose).

There is, however, some ugliness that isn't on purpose. In this changeset I discovered some annoyances around `.iter()` (which returns an `Iterator<Item = &TargetTriple>` instead of an `Iterator<Item = &TargetTripleRef>`). I've added `.as_explicit_ref` to work around those cases.

Similarly, calling `Vec<TargetTriple>::contains()` with a `&TargetTripleRef` doesn't work (and you cannot convert a `&TargetTripleRef` into a `&TargetTriple`, the same way you cannot convert a `&str` back into a `&String` - you don't know where it's allocated from!).

Finally, I ran into <rust-lang/rfcs#1445> while making this change: there was a big `match` for converting target triples to their display names, and although that works with `&str` constants, it doesn't work with `&TargetTripleRef` constants, due to Rust limitations right now. That explains the lazy_static (which we already depended on transitively, so at least there's that). I would've used `LazyLock`, but our MSRV is currently 1.79 and `LazyLock` has only been stable since 1.80 :(

add new RFC

4b471de

nikomatsakis added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jan 6, 2016

nagisa reviewed Jan 6, 2016

petrochenkov reviewed Jan 6, 2016

nrc assigned nikomatsakis Jan 7, 2016

tomusdrw mentioned this pull request Apr 6, 2016

Removing match on constant openethereum/parity-ethereum#888

Merged

mathstuf mentioned this pull request Jun 12, 2016

message: derive MessageType from Eq srwalter/dbus-bytestream#11

Merged

kennytm mentioned this pull request Jun 23, 2016

RFC: const-dependent type system. #1657

Closed

lifthrasiir added a commit to chronotope/chrono that referenced this pull request Jul 25, 2016

Fixed warnings from rust-lang/rfcs#1445.

0b32182

This was referenced Oct 1, 2016

Tracking issue for illegal_floating_point_constant_pattern compatibility lint rust-lang/rust#36890

Closed

Tracking issue for illegal_struct_or_enum_constant_pattern compatibility lint rust-lang/rust#36891

Closed

jsantell mentioned this pull request Feb 2, 2017

Parse and display EDN values for NaN, +Infinity and -Infinity. Fixes … mozilla/mentat#238

Merged

carols10cents mentioned this pull request Apr 7, 2017

Chapter 18: Patterns rust-lang/book#469

Merged

est31 mentioned this pull request Apr 29, 2017

Tracking issue for illegal_floating_point_literal_pattern compatibility lint rust-lang/rust#41620

Closed

ExpHP mentioned this pull request Mar 15, 2018

RFC: #[derive_no_bound(..)] and #[derive_field_bound(..)] #2353

Closed

Centril added A-patterns Pattern matching related proposals & ideas A-const Proposals relating to const items A-const-eval Proposals relating to compile time evaluation (CTFE). labels Nov 23, 2018

pnkfelix mentioned this pull request Oct 16, 2019

ICE resolving non-existent PartialEq::Eq from match of const rust-lang/rust#65466

Closed

petrochenkov mentioned this pull request Apr 3, 2020

Fully destructure constants into patterns rust-lang/rust#70743

Merged

pnkfelix mentioned this pull request Apr 28, 2020

function pointers as match patterns have optimization-dependent behavior rust-lang/rust#70861

Closed

ecstatic-morse mentioned this pull request May 14, 2020

Aggressively check for non-PartialEq types in const_to_pat rust-lang/rust#72184

Closed

fasterthanlime mentioned this pull request Oct 22, 2024

Introduce strongly-typed strings, starting with TargetTriple axodotdev/cargo-dist#1474

Merged


nikomatsakis commented Jan 6, 2016 (edited by pnkfelix)
