Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In macros, $($x:expr),* fragments can be used to bypass future-proofing restrictions #25658

Closed
nikomatsakis opened this issue May 20, 2015 · 33 comments
Assignees
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) P-high High priority T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

This should not be accepted, but it is:

macro_rules! foo {
    ( $($x:expr),* ... ) => { 1 }
}

fn main() { }

cc @pnkfelix @cmr

@nikomatsakis
Copy link
Contributor Author

cc @bluss

@nikomatsakis
Copy link
Contributor Author

Note that

macro_rules! foo {
    ( $x:expr ... ) => { 1 }
}

fn main() { }

is (correctly) prohibited.

@bluss
Copy link
Member

bluss commented May 20, 2015

It's not just the type expr, I also tested pat.

@DanielKeep
Copy link
Contributor

Nooooooo! You're going to break my macro article again, aren't you? :(

(Note: I realise the following is not a good reason to not fix this problem.)

The problem for me, personally, is that the following doesn't work:

macro_rules! foo {
    ( $($x:expr),* , ... ) => { 1 }
}

fn main() {
// error: local ambiguity: multiple parsing options: built-in NTs expr ('x') or 1 other options.
    let _ = foo!(1, 2, 3, ...);
//                        ^~~
}

It would be nice if MBE preferred literal matches to captures.

@Stebalien
Copy link
Contributor

@DanielKeep Shouldn't that be $($x:expr,)*? Your example won't work anyways for the reason you posted but fixing this bug shouldn't change anything. Anyways, you should be able to do this with a recursive macro:

macro_rules! foo {
    (...) => {{
        ()
    }};
    ($e:expr, $($tok:tt)+) => {{
        ($e, foo!($($tok)+))
    }}
}

fn main() {
    let tpl = foo!(1,2,3,4,...);
    println!("{:?}", tpl);
}

@steveklabnik steveklabnik added the A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) label May 21, 2015
@alexcrichton
Copy link
Member

triage: I-nominated

@nikomatsakis
Copy link
Contributor Author

cc @pnkfelix

@nikomatsakis nikomatsakis added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Jun 2, 2015
@nikomatsakis
Copy link
Contributor Author

So my feeling is that we should probably fix this, but it'd be good to evaluate the impact. It might be a good place to use a more elongated "breaking change procedure", or at least actively reach out to help people rewrite.

@emberian
Copy link
Member

emberian commented Jun 2, 2015

Is this a bug where we need to consider "FOLLOW"? I thought I saw a PR/RFC that added detailed FOLLOW checking but I could be wrong.

@bluss
Copy link
Member

bluss commented Jun 2, 2015

Would it be a middle ground to always warn on this, and then "just let it break" if expression syntax ever changes so that it does -- the macro would suddenly match differently.

@DanielKeep and I found out a parsing strategy with $x:tt $($xs:tt)* where you can split on any token you want to and reinterpret the parts on either side. That is like an explicit way to circumvent any syntax restriction, and it's not tied to any changing meaning of expr or similar macro parts.

@nikomatsakis
Copy link
Contributor Author

triage: P-high

Assigning high priority as, if we are going to fix this, we need to do it soon. If we're just going to warn, urgency is somewhat lower. I'm not yet sure of best overall strategy here. A crater report might be helpful in making the determination.

@rust-highfive rust-highfive added P-high High priority and removed I-nominated labels Jun 3, 2015
@pnkfelix pnkfelix self-assigned this Aug 6, 2015
@jonas-schievink
Copy link
Contributor

I just found this gem, which is probably related to this issue:

macro_rules! e {
    ($($e:expr)*) => ($($e)*);
}

fn main() {
    e!(() "");
    e!("" 4);
}

The first invocation is accepted, the second leads to a type error:

<anon>:7:8: 7:10 error: mismatched types:
 expected `()`,
    found `&'static str`
(expected (),
    found &-ptr) [E0308]
<anon>:7     e!("" 4);
                ^~

(this doesn't depend on the return type of the function, it gives the same error when it returns something else than ())

@pnkfelix
Copy link
Member

@jonas-schievink Hmm, I don't think that is related to this ticket... it is a strange inconsistency though.

(I also think that all of those e! invocations should either be universally rejected or universally accepted. Based on the output I get from rustc -Z unstable-options --pretty expanded, it seems like there are semi-colons inserted after each expression emitted in the expansion, but that does not explain why the second e! invocation is rejected.)

update: spawned off into #29799

@pnkfelix
Copy link
Member

Fixing the bug here without adding a warning seems simple: just let the checking for delimited-sequences fall, on success, into the code for non-delimited ones. (Wish I had looked at this for a day back in May...)

I'm going to double-check that this passes make check before I attempt a crater run with it.

(And in parallel with the crater run, I'll see how hard it would be to instead emit a (hard) warning... it may require more invasive surgery on the code in question.)

@pnkfelix
Copy link
Member

I did a crater run for my simple (error-generating, not a warning cycle) fix for this.

While that was running, I hacked together a warning-cycle version of the change. (As predicted, it is more invasive, but also lays out further changes we can make to simplify this code if we go down the route I'm suggesting.)

Once that builds successfully, I'll put it through a crater run as well, since I am not thrilled with 17 root regressions.

@DanielKeep
Copy link
Contributor

@pnkfelix Ha! I'm responsible for four of those 17.

Oh... this breaks trailing commas. That's really going to suck.

@pnkfelix
Copy link
Member

Some of the errors from my crater run above seem like they may be due to our limited approach to having sequences followed by other sequences, which the code in some ways acts like has always been disallowed, but clearly we let it through for delimited sequences. I'm having difficulty interpreting the intent of RFC 550 here; I'll report more later after I've finished writing up an analysis of the results.

@pnkfelix
Copy link
Member

@DanielKeep my initial guess is that my naive approach, while accepted by our internal make check, is not going to be acceptable in practice. But I'm not sure yet, I want to avoid jumping to any conclusion until I finish analysis.

@pnkfelix
Copy link
Member

I looked through all of the crater reported root regressions; the analysis writeup is here:

https://gist.github.com/pnkfelix/be4b76cd8891230eb065

Of the 17 root regressions

  • 2 are what I think are crater issues (internal fails or misclassification)
  • 3 are regressions that are fundamental consequences of fixing this bug, and
  • the other 12 are what I am calling "seq rep followed by seq rep" failures.

For the "fundamental consequences", I mean the cases:

  • dlib has a matcher $($vargs: ty),+ ..., which is now flagged saying "you can't have ... after a ty fragment."
  • cssparser has a matcher $( $string: expr => $result: expr ),+ _, which is now flagged saying "you can't have _ after a expr fragment
  • closet has a matcher $($parname:ident: $partyp:ty),+|, which is now flagged saying "you can't have | after a ty fragment."

For "seq rep followed by seq rep", the code as written disallows $(seq)* $(seq)*, but it originally did not disallow e.g. $(seq),* $(seq)* -- the use of a delimited sequence is allowed. (As far as I can tell this was an unintentional oversight in the implementation.)

But there's no fundamental reason that we cannot allow sequences followed by sequences. The reason why I think the current code tried to disallow them is to allow for a simple computation to find the "first" token that immediately follows any given sequence. (This is supported by comments in the code like "This doesn't actually compute the FIRST of the rest of the matcher yet, it only considers single tokens and simple NTs" (emphasis added).)

@nikomatsakis
Copy link
Contributor Author

My feeling here is that we should not rush to issue a breaking change IF we think there are simple ways to make the code more intelligent that would mitigate the effect of that change. In other words, if being smart about follow sets means we can both fix this bug AND support more macros, seems like we should do that. I'm not entirely clear if that is the case from your write-up (perhaps because I need to read it more carefully :)

@nikomatsakis
Copy link
Contributor Author

OK, so, this "trailing comma" pattern jumps out: $($vs:expr),* $(,)*, and it seems like this IS a case where a knowledge of FIRST would solve the issue, right? (depending, I guess, on what comes after the $(,)*...?)

@pnkfelix
Copy link
Member

this "trailing comma" pattern jumps out: $($vs:expr),* $(,)*, and it seems like this IS a case where a knowledge of FIRST would solve the issue, right?

@nikomatsakis yes I think that is right. I'm working on a revised version of the code that does a more precise computation of FIRST for the suffix after the cursor (and also tracks LAST for the token sequence prefix before the cursor, since you can have things like $($a:expr $( stuff; )*) $(,)*), and so when the cursor is right before the $(,)*, the LAST would hold { expr, ';' } The current code doesn't appear to be general enough to possibly be handling that correctly.

@pnkfelix
Copy link
Member

results of new crater run with a smarter (still in development) changeset:

https://gist.github.com/pnkfelix/b351caa523d45f9aa86b

I interpret this as a signal that I need to add { to the follow set for expr and probably all other NT's. But the other non-noise regressions there look legit:

  • expr followed by ...
  • ty followed by functions
  • ty followed by |
  • ty followed by varargs
  • expr followed by _
  • ty followed by [
  • expr followed by copy
  • expr followed by swap
  • ty followed by where (actually, we can/should expand the follow set of ty to include where).

(Some of the regressing crates fall into more than one of the buckets above...)

@durka
Copy link
Contributor

durka commented Nov 24, 2015

I don't understand disallowing ty |, to be honest. Clearly a type can be followed by | since that is the syntax of closures.

@durka
Copy link
Contributor

durka commented Nov 24, 2015

Also, as a matter of policy, could/should crate authors be notified (assuming an email was provided) when their crate is caught regressing in a crater run related to an accepted breaking change?

@huonw
Copy link
Member

huonw commented Nov 24, 2015

I don't understand disallowing ty |, to be honest. Clearly a type can be followed by | since that is the syntax of closures.

Almost certainly an oversight. Good point!

@nikomatsakis
Copy link
Contributor Author

On Wed, Nov 18, 2015 at 08:53:16AM -0800, Felix S Klock II wrote:

results of new crater run with a smarter (still in development) changeset:

https://gist.github.com/pnkfelix/b351caa523d45f9aa86b

I interpret this as a signal that I need to add { to the follow set for expr and probably all other NT's.

I still feel uncomfortable with this. I would rather than a nt for
"expr that cannot have {", which would correspond to basically "the
stuff you can put in an if". I realize this is not backwards compat
though, but it seems like having { in follow set of expr is
just...bogus. (e.g. imagine we make break take an argument.)

Perhaps though there is a compromise:

  • add another fragment specifier for expr-minus-{
  • add a lint warning that suggests people change expr to expr-minus-{ when
    appropriate

@pnkfelix
Copy link
Member

pnkfelix commented Dec 2, 2015

@nikomatsakis

Perhaps though there is a compromise:

  • add another fragment specifier for expr-minus-{
  • add a lint warning that suggests people change expr to expr-minus-{ when appropriate

yeah, I have come around to seeing this as a better solution (see e.g. rust-lang/rfcs#1384 (comment) )

Just need to think of a good name for the specifier. :)

@pnkfelix
Copy link
Member

now that amendment rust-lang/rfcs#1384 has been accepted, this issue should be fixed once the implementation has landed (i.e. #30450 is resolved).

@retep998
Copy link
Member

I should point out that crater is still massively deficient as it still does not test Windows targets at all, and did not notice that winapi would be hit by the ty followed by [ change. Even if I fix this on my end, it will still end up breaking everyone on windows who uses an older version of winapi. Please reconsider at least this case.

C:\msys64\home\Peter\.cargo\registry\src\github.com-88ac128001ac3a9a\winapi-0.2.5\src\macros.rs:159:46: 159:47 warning: `$fieldtype:ty` is followed by `[`, which is not allowed for `ty` fragments
C:\msys64\home\Peter\.cargo\registry\src\github.com-88ac128001ac3a9a\winapi-0.2.5\src\macros.rs:159     ($base:ident $field:ident: $fieldtype:ty [
                                                                                                                                                 ^
C:\msys64\home\Peter\.cargo\registry\src\github.com-88ac128001ac3a9a\winapi-0.2.5\src\macros.rs:159:46: 159:47 note: The above warning will be a hard error in the next release.
C:\msys64\home\Peter\.cargo\registry\src\github.com-88ac128001ac3a9a\winapi-0.2.5\src\macros.rs:159     ($base:ident $field:ident: $fieldtype:ty [
                                                                                                                                                 ^

@pnkfelix
Copy link
Member

Cc @brson re above crater note

(I strongly doubt we would revert the RFC amendment or associated PR. you might find support for adding [ to FOLLOW of ty )

@nikomatsakis
Copy link
Contributor Author

@retep998 Thanks for bringing this up. I assure you the deficiencies of crater are well-known, and it'd be really great if we could improve it -- though even if we got 100% coverage for crates.io, obviously it still wouldn't be 100% coverage of all the Rust code out there. In any case though, this is part of why we try our best to do warning cycles for bug fixes that are likely to have major impact, so that we can find out about other problems before changes become permanent (and also of course to give people time to transition).

As @pnkfelix said, I don't think it makes sense to rollback this bug fix completely or anything like that. It might alleviate some short term pain but it'd be guaranteed to cause more breakage in the long-term, since basically every change to Rust's grammar would be a potential breaking change to some macro.

However, also as @pnkfelix said, adding [ to the follow set is a plausible change. I don't think I'd have added it initially, as it is not something that follows from today's grammar (I don't think). It'd be a restriction on future types, but I tend to think since we have [T], I'd be disinclined to add T[...] anyway, since it seems ripe for confusion. Worth strong consideration, certainly.

@pnkfelix
Copy link
Member

This issue (as described in the description) is fixed by #30694

bors added a commit that referenced this issue Jun 8, 2016
Remove the old FOLLOW checking (aka `check_matcher_old`).

It was supposed to be removed at the next release cycle but is still in the tree since like 6 months.
Potential breaking change, since some cases (such as #25658) will change from a warning to an error. But the warning stating that it will be a hard error in the next release has been there for 6 months now.
I think it's safe to break this code. ^_^
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-macros Area: All kinds of macros (custom derive, macro_rules!, proc macros, ..) P-high High priority T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests