Fix logical error with what text is considered whitespace. #134366

Merged
merged 1 commit into from
Dec 20, 2024

harrisonkaiser
@harrisonkaiser harrisonkaiser commented Dec 16, 2024

There appears to be a logical issue around what counts as leading whitespace. There is code which does a subtraction assuming that no errors will be reported inside the leading whitespace. However, we compute the length of that whitespace with std::char::is_whitespace and not rustc_lexer::is_whitespace. The former includes a no-break space while the latter excludes it. We can only safely assume that no errors will be reported in whitespace if it is all "Rust Standard" whitespace. Indeed, an error does occur in Unicode whitespace if it contains a no-break space. In that case the subtraction will cause an ICE (for a compiler built with debug assertions) as described in #132918.
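
To make the mismatch concrete, here is a minimal sketch, not the code in emitter.rs. `is_rust_whitespace` is a local stand-in written on the assumption that rustc_lexer::is_whitespace matches the Unicode Pattern_White_Space set; `leading_len` and `column_after_trim` are hypothetical helpers introduced only for illustration.

```rust
/// Stand-in for `rustc_lexer::is_whitespace` (assumption: it mirrors the
/// Unicode `Pattern_White_Space` set, which does not contain U+00A0).
fn is_rust_whitespace(c: char) -> bool {
    matches!(
        c,
        '\u{0009}' | '\u{000A}' | '\u{000B}' | '\u{000C}' | '\u{000D}' | '\u{0020}'
            | '\u{0085}' | '\u{200E}' | '\u{200F}' | '\u{2028}' | '\u{2029}'
    )
}

/// Number of leading characters of `line` that satisfy `pred`.
fn leading_len(line: &str, pred: impl Fn(char) -> bool) -> usize {
    line.chars().take_while(|&c| pred(c)).count()
}

/// Hypothetical helper: annotation column relative to the trimmed line.
/// Returns `None` in the case where a plain `col - trimmed` would underflow,
/// which is the situation that panics when overflow checks are enabled.
fn column_after_trim(col: usize, trimmed: usize) -> Option<usize> {
    col.checked_sub(trimmed)
}
```

For a line that starts with two no-break spaces, `leading_len(line, char::is_whitespace)` counts both of them while `leading_len(line, is_rust_whitespace)` counts none. An error reported at column 1 then sits inside the span that the first predicate treats as trimmable, so `column_after_trim(1, 2)` yields `None`.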

@rustbot
Collaborator

rustbot commented Dec 16, 2024

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @jieyouxu (or someone else) some time within the next two weeks.

Please see the contribution instructions for more information. Namely, to keep review lag to a minimum, PR authors and assigned reviewers should ensure that the review labels (S-waiting-on-review and S-waiting-on-author) stay updated, invoking these commands when appropriate:

  • @rustbot author: the review is finished, PR author should check the comments and take action accordingly
  • @rustbot review: the author is ready for a review, this PR will be queued again in the reviewer's queue

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 16, 2024
@rustbot
Collaborator

rustbot commented Dec 16, 2024

These commits modify the Cargo.lock file. Unintentional changes to Cargo.lock can be introduced when switching branches and rebasing PRs.

If this was unintentional then you should revert the changes before this PR is merged.
Otherwise, you can ignore this comment.

@harrisonkaiser
Author

Addresses: #132918 (comment)

Member

@jieyouxu jieyouxu left a comment

Thanks for the PR. We'll need to add a regression test for #132918 to prevent it from happening again in the future. Note that the test probably will need to be marked with

//@ needs-rustc-debug-assertions

to catch the overflow assertion that only happens if rustc was built with debug assertions.

Might also need to artificially constrain diagnostics width with something like

//@ compile-flags: --diagnostics-width=80

to prevent terminal size from interfering with test outcome. I'm not super familiar with what the logic here is intended to do, so I'm going to reroll a WG-diagnostics reviewer.

compiler/rustc_errors/src/emitter.rs (outdated review thread, resolved)
@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 16, 2024
@jieyouxu
Member

r? diagnostics

@rustbot rustbot assigned davidtwco and unassigned jieyouxu Dec 16, 2024
There is a logical issue around what counts as leading whitespace.
There is code which does a subtraction assuming that no errors will be reported
inside the leading whitespace. However, we compute the length of
that whitespace with std::char::is_whitespace and not
rustc_lexer::is_whitespace. The former includes a no-break space while
the latter excludes it. We can only safely assume that no errors
will be reported in whitespace if it is all "Rust Standard" whitespace.
Indeed, an error does occur in Unicode whitespace if it contains a no-break
space.
@harrisonkaiser
Author

Thank you for the prompt response and the guidance on adding a compile test.

I've fixed the typo and added a test case and confirmed that it doesn't pass on master and does pass on this branch.

@rustbot rustbot added the has-merge-commits PR has merge commits, merge with caution. label Dec 17, 2024
@rustbot rustbot removed the has-merge-commits PR has merge commits, merge with caution. label Dec 17, 2024
@harrisonkaiser
Author

I also noticed that is_whitespace is called here:

if source_string.chars().take(ann.start_col.display).all(|c| c.is_whitespace()) {

and I don't know whether that should also be rustc_lexer::is_whitespace. I figured I'd stick to the failure case I could reproduce.

@harrisonkaiser
Author

@rustbot review

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 18, 2024
@davidtwco
Member

I also noticed that is_whitespace is called here:

if source_string.chars().take(ann.start_col.display).all(|c| c.is_whitespace()) {

and I don't know whether that should also be rustc_lexer::is_whitespace. I figured I'd stick to the failure case I could reproduce.

I'd be happy to see this changed to rustc_lexer::is_whitespace, it's probably better to be consistent. But that can be a follow-up so this can land quicker.

@bors r+ rollup
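
As an illustration of the consistency point about the second call site, the prefix check can be written with a pluggable predicate. This is only a sketch: `prefix_is_whitespace` is a hypothetical helper, and `is_rust_whitespace` is a local stand-in assumed to match the same character set as rustc_lexer::is_whitespace (Unicode Pattern_White_Space), not the real function.

```rust
/// Stand-in for `rustc_lexer::is_whitespace` (assumption: it mirrors the
/// Unicode `Pattern_White_Space` set, which does not contain U+00A0).
fn is_rust_whitespace(c: char) -> bool {
    matches!(
        c,
        '\u{0009}' | '\u{000A}' | '\u{000B}' | '\u{000C}' | '\u{000D}' | '\u{0020}'
            | '\u{0085}' | '\u{200E}' | '\u{200F}' | '\u{2028}' | '\u{2029}'
    )
}

/// Hypothetical helper with the same shape as the quoted check: are the
/// first `display_col` characters of `source` all whitespace under `pred`?
fn prefix_is_whitespace(source: &str, display_col: usize, pred: impl Fn(char) -> bool) -> bool {
    source.chars().take(display_col).all(pred)
}
```

With a prefix made of a no-break space followed by a regular space, passing `char::is_whitespace` reports the prefix as all whitespace, while passing `is_rust_whitespace` does not. Using one predicate at both call sites keeps the two checks from disagreeing about the same line.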

@bors
Contributor

bors commented Dec 20, 2024

📌 Commit 1e33dd1 has been approved by davidtwco

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 20, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 20, 2024
Rollup of 5 pull requests

Successful merges:

 - rust-lang#134366 (Fix logical error with what text is considered whitespace.)
 - rust-lang#134514 (Improve dependency_format a bit)
 - rust-lang#134519 (ci: use ubuntu `24` instead of `latest`)
 - rust-lang#134551 (coverage: Rename `basic_coverage_blocks` to just `graph`)
 - rust-lang#134553 (add member constraints comment)

r? `@ghost`
`@rustbot` modify labels: rollup
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Dec 20, 2024
…avidtwco

Fix logical error with what text is considered whitespace.

There appears to be a logical issue around what counts as leading whitespace. There is code which does a subtraction assuming that no errors will be reported inside the leading whitespace. However, we compute the length of that whitespace with std::char::is_whitespace and not rustc_lexer::is_whitespace. The former includes a no-break space while the latter excludes it. We can only safely assume that no errors will be reported in whitespace if it is all "Rust Standard" whitespace. Indeed, an error does occur in Unicode whitespace if it contains a no-break space. In that case the subtraction will cause an ICE (for a compiler built with debug assertions) as described in rust-lang#132918.
@bors bors merged commit 1652e3a into rust-lang:master Dec 20, 2024
8 of 12 checks passed
@rustbot rustbot added this to the 1.85.0 milestone Dec 20, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Dec 20, 2024
Rollup merge of rust-lang#134366 - harrisonkaiser:no-break-space, r=davidtwco

Fix logical error with what text is considered whitespace.

There appears to be a logical issue around what counts as leading whitespace. There is code which does a subtraction assuming that no errors will be reported inside the leading whitespace. However, we compute the length of that whitespace with std::char::is_whitespace and not rustc_lexer::is_whitespace. The former includes a no-break space while the latter excludes it. We can only safely assume that no errors will be reported in whitespace if it is all "Rust Standard" whitespace. Indeed, an error does occur in Unicode whitespace if it contains a no-break space. In that case the subtraction will cause an ICE (for a compiler built with debug assertions) as described in rust-lang#132918.
@harrisonkaiser harrisonkaiser deleted the no-break-space branch December 20, 2024 21:41
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
6 participants