Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match hyphen in multi-revision comment matchers #124137

Merged
merged 1 commit into from
Apr 20, 2024

Conversation

tgross35
Copy link
Contributor

@tgross35 tgross35 commented Apr 18, 2024

Currently, the matcher //[rev-foo,rev-bar]~ does not get selected by the regex. Change the matcher to include -.

@rustbot
Copy link
Collaborator

rustbot commented Apr 18, 2024

r? @compiler-errors

rustbot has assigned @compiler-errors.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) labels Apr 18, 2024
@rustbot
Copy link
Collaborator

rustbot commented Apr 18, 2024

Some changes occurred in src/tools/compiletest

cc @jieyouxu

@tgross35
Copy link
Contributor Author

Reasoning: https://github.com/rust-lang/rust/pull/123816/files#diff-0238262528c07a864b8aa49d6d3aa989c49b045a012b1a8e2c491c84e3ccac3b wasn't matching anything labeled //[legacy,verbose-legacy]~^, @michaelwoerister suggested fixing the regex.

@compiler-errors
Copy link
Member

r? @jieyouxu

@michaelwoerister
Copy link
Member

The new regex looks good, @tgross35. Would you mind adding a unit test for the parse_expected function that covers the basics? See this for examples of other unit tests in compiletest:
https://github.com/rust-lang/rust/blob/master/src/tools/compiletest/src/header/tests.rs

@jieyouxu
Copy link
Member

Making revision names accept - makes sense to me, but I don't remember if the dash messes with crate names generated by revisions, could you double check if something like

//@ revisions: hello-world
//@[hello-world] run-pass
fn main() {}

isn't borked?

@jieyouxu
Copy link
Member

IIRC revision names are used in crate names for artifact names and output, I don't immediately recall if we normalized it to become underscore.

@@ -118,14 +118,14 @@ fn parse_expected(
// //[rev1]~
// //[rev1,rev2]~^^
static RE: Lazy<Regex> =
Lazy::new(|| Regex::new(r"//(?:\[(?P<revs>[\w,]+)])?~(?P<adjust>\||\^*)").unwrap());
Lazy::new(|| Regex::new(r"//(?:\[(?P<revs>.+)])?~(?P<adjust>\||\^*)").unwrap());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this regex mean test.name is accepted, along other non-alphanumeric characters? If so, that will break anything using revision name as crate names unless you also perform "name mangling" to normalize the non-alphanumeric characters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another case of "silently failing" in compiletest where it really really needs to loudly report that your directives don't work and won't work as expected...

@michaelwoerister
Copy link
Member

michaelwoerister commented Apr 19, 2024

#123816 is modifying a test case that already uses - in a revision name. It seems to work except for when used in error annotations.

However, generally restricting revision names to valid ASCII-only identifiers seems reasonable to me (as long as we give a good error message when compiletest encounters an invalid revision name 🙂)

@jieyouxu
Copy link
Member

jieyouxu commented Apr 19, 2024

#123816 is modifying a test case that already uses - in a revision name. It seems to work except for when used in error annotations.

However, generally restricting revision names to valid ASCII-only identifiers seems reasonable to me (as long as we give a good error message when compiletest encounters an invalid revision name 🙂)

I think that only recently works because I added some explicit crate name mangling for revisioned run-rustfix: #123601, which modifying the accepted revision name regex luckily hits. But that only handles . and - I want to say? So if the revision name contains a $ it probably breaks?

@jieyouxu
Copy link
Member

However, generally restricting revision names to valid ASCII-only identifiers seems reasonable to me (as long as we give a good error message when compiletest encounters an invalid revision name 🙂)

I think we could just only accept revision names that are crate-name-like (that is, restrict revision names to only consist of alphanumeric + underscore characters), and report an error otherwise. I don't think we want to perform any weird crate name mangling, because e.g. if you naively mangle rev-name as rev_name, then the corresponding crate name will conflict with another revision rev_name. People usually don't write revision names like this, but yeah...

@michaelwoerister
Copy link
Member

I agree, the simpler the better.

@tgross35
Copy link
Contributor Author

Having - in revision names seems reasonably common, just based on a quick search https://github.com/search?q=repo%3Arust-lang%2Frust+%2F%40.*revisions%3A.*-%2F&type=code, so it seems like that should work without any special mangling. I just changed the regex to match everything so if all revision names get matched even if invalid, and let them fail elsewhere rather than silently be skipped (doesn’t help if the revision doesn’t exist, but that is a different problem…).

What would you prefer for this PR? I can leave it as-is and add a test, or change back to the original pattern and just add -.

@jieyouxu
Copy link
Member

Having - in revision names seems reasonably common, just based on a quick search https://github.com/search?q=repo%3Arust-lang%2Frust+%2F%40.*revisions%3A.*-%2F&type=code, so it seems like that should work without any special mangling. I just changed the regex to match everything so if all revision names get matched even if invalid, and let them fail elsewhere rather than silently be skipped (doesn’t help if the revision doesn’t exist, but that is a different problem…).

What would you prefer for this PR? I can leave it as-is and add a test, or change back to the original pattern and just add -.

Huh, I suppose then it does work with -s. In that case, let's respect the existing accepted revision names (alphanumeric + _ + -), but just update the error annotation regex to accept _ and - in addition to alphanumeric characters only (and not any character). Could you also enforce that revision names can only contain alphanumeric + _ + -, and report an error if revision names don't match the regex? A test is also desirable to verify our behavior here.

@jieyouxu
Copy link
Member

But otherwise it seems like an oversight that revisions accept - but not the error annotations (lol). What we really should do in compiletest (in the long term) is to have the revisions parsing be strict about what revision names it accept. Thanks for the fix, helping compiletest be more self-consistent is very helpful.

@tgross35
Copy link
Contributor Author

I changed this PR to just do the simplest thing and match - in addition to what was there before. I plan on refactoring #123882, which I think would be a better place to check that the name is valid.

Since there is no new rejecting behavior, is this okay as-is or do you want a test still? (if so, what would that look like - just a dummy UI test?)

@tgross35 tgross35 changed the title Match any characters in multi-revision comment matchers Match hyphen in multi-revision comment matchers Apr 19, 2024
@jieyouxu
Copy link
Member

jieyouxu commented Apr 19, 2024

Since there is no new rejecting behavior, is this okay as-is or do you want a test still? (if so, what would that look like - just a dummy UI test?)

Could you still add a unit test for parse_expected checking that a revision with a name containing - is accepted? I realize that uhh the error reporting here is less than ideal, but for the time being having a regression test might remind future reworks in terms of what revision names can be accepted.

@tgross35
Copy link
Contributor Author

Alrightie I added a simple test, let me know if that looks fine.

@tgross35
Copy link
Contributor Author

Tidy failed so it didn't push before, should be up to date now

@rust-log-analyzer

This comment has been minimized.

Currently, the matcher `//[rev-foo,rev-bar]~` does not get selected by
the regex. Change the matcher to also match strings that contain a `-`.h
@jieyouxu
Copy link
Member

Looks good to me, thanks for the patch!

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Apr 19, 2024

📌 Commit 282488c has been approved by jieyouxu

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 19, 2024
@tgross35
Copy link
Contributor Author

Thanks!

bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 20, 2024
…iaskrgr

Rollup of 3 pull requests

Successful merges:

 - rust-lang#123409 (Implement Modified Condition/Decision  Coverage)
 - rust-lang#124104 (Fix capturing duplicated lifetimes via parent in `precise_captures` (`impl use<'...>`))
 - rust-lang#124137 (Match hyphen in multi-revision comment matchers)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit f58ef08 into rust-lang:master Apr 20, 2024
12 checks passed
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Apr 20, 2024
Rollup merge of rust-lang#124137 - tgross35:testsuite-multi-rev-regex, r=jieyouxu

Match hyphen in multi-revision comment matchers

Currently, the matcher `//[rev-foo,rev-bar]~` does not get selected by the regex. Change the matcher to include `-`.
@rustbot rustbot added this to the 1.79.0 milestone Apr 20, 2024
@michaelwoerister
Copy link
Member

Thanks for getting this fixed, everyone!

@tgross35 tgross35 deleted the testsuite-multi-rev-regex branch July 21, 2024 07:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants