chars/bytes confusion in the error emitter #44080

Closed
est31 opened this issue Aug 25, 2017 · 1 comment
Labels
A-diagnostics (Area: Messages for errors, warnings, and lints)
C-bug (Category: This is a bug.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)

Comments

@est31
Member

est31 commented Aug 25, 2017

src/librustc_errors/snippet.rs has a big comment saying that the column info is provided in characters, not in bytes. However, the error emitter ignores that and uses these columns as byte offsets all over the place. This leads to bugs like #44023 and #44078.

As an example, look at how span printing varies with the characters used:

Correct case:

12 |       "B   "";
   |  ___________^

Now add an emoji character:

12 |       "😊   "";
   |  ___________^

Note how it's off by one char now. This can stack up:

12 |       "😊😊😊😊   "";
   |  ______________^

If I didn't use any spaces at all, I'd run into #44078.
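To make the two units concrete, here is a minimal sketch using only the standard library. The helper `char_col_to_byte_offset` is a hypothetical name made up for illustration, not something that exists in the emitter. It shows that the same char column maps to different byte offsets once a multi-byte character appears earlier in the line:

```rust
/// Hypothetical helper: convert a column measured in chars into a byte
/// offset into `line`. Returns `None` if the column lies past the end.
fn char_col_to_byte_offset(line: &str, char_col: usize) -> Option<usize> {
    line.char_indices()
        .map(|(byte_idx, _)| byte_idx)
        // Also allow a column that points just past the last char.
        .chain(std::iter::once(line.len()))
        .nth(char_col)
}

fn main() {
    let ascii = "\"B   \"\";";
    let emoji = "\"😊   \"\";";

    // Both lines contain the same number of chars...
    assert_eq!(ascii.chars().count(), emoji.chars().count());
    // ...but the emoji takes 4 bytes in UTF-8, so the byte lengths differ by 3.
    assert_eq!(emoji.len(), ascii.len() + 3);

    // The final `"` sits at char column 6 in both lines,
    // yet its byte offset differs.
    assert_eq!(char_col_to_byte_offset(ascii, 6), Some(6));
    assert_eq!(char_col_to_byte_offset(emoji, 6), Some(9));

    // A byte offset in the middle of the emoji is not a char boundary,
    // so slicing the string there would panic.
    assert!(!emoji.is_char_boundary(2));
}
```

Any code that takes a char column and indexes the line with it directly is only correct as long as the line is pure ASCII.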

Now this can be fixed by going through the emitter code and looking for all places where the pos is used in a byte position fashion. A more proper fix is to stop trusting that people read comments and to encode this via the type system instead. There is already a mechanism for that inside the compiler: libsyntax_pos::CharPos! Just convert the types of the start_col and end_col members of the MultilineAnnotation and Annotation structs to CharPos, or maybe to BytePos if that's preferred.
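A rough sketch of what the type-based approach could look like follows. The `CharPos`, `BytePos` and `Annotation` definitions here are simplified, hypothetical stand-ins written for this example; the real libsyntax_pos and librustc_errors types have different fields and APIs, and `to_byte_pos`, `slice_line` and `annotated_text` are invented names:

```rust
// Simplified stand-ins, NOT the real libsyntax_pos definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct CharPos(usize);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BytePos(usize);

// Simplified annotation: the unit of the columns is now part of the type.
#[derive(Debug)]
struct Annotation {
    start_col: CharPos,
    end_col: CharPos,
}

impl CharPos {
    /// The only way to obtain a byte offset is an explicit conversion
    /// against the line that the column refers to.
    fn to_byte_pos(self, line: &str) -> Option<BytePos> {
        line.char_indices()
            .map(|(byte_idx, _)| byte_idx)
            .chain(std::iter::once(line.len()))
            .nth(self.0)
            .map(BytePos)
    }
}

/// String slicing works on byte offsets, so this accepts only `BytePos`.
fn slice_line(line: &str, start: BytePos, end: BytePos) -> &str {
    &line[start.0..end.0]
}

/// Returns the text covered by the annotation, converting units explicitly.
fn annotated_text<'a>(line: &'a str, ann: &Annotation) -> Option<&'a str> {
    let start = ann.start_col.to_byte_pos(line)?;
    let end = ann.end_col.to_byte_pos(line)?;
    Some(slice_line(line, start, end))
}

fn main() {
    let line = "\"😊😊😊😊   \"\";";
    let ann = Annotation { start_col: CharPos(9), end_col: CharPos(10) };

    // `slice_line(line, ann.start_col, ann.end_col)` would be a type error,
    // which is the point: the mix-up is caught at compile time.
    assert_eq!(annotated_text(line, &ann), Some("\""));
}
```

With distinct wrapper types, passing a char column where a byte offset is expected no longer compiles, so the comment in snippet.rs stops being the only safeguard.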

@euclio
Contributor

euclio commented Aug 25, 2017

cc #8706

est31 added a commit to est31/rust that referenced this issue Aug 25, 2017
Fixes rust-lang#44078. Fixes rust-lang#44023.

The start_col member is given in chars,
while the code previously assumed it was given in bytes.

The more basic issue rust-lang#44080 doesn't get fixed.
shepmaster added the A-diagnostics, C-bug, and T-compiler labels Aug 25, 2017
alexcrichton added a commit to alexcrichton/rust that referenced this issue Aug 26, 2017
Fix a byte/char confusion issue in the error emitter
bors added a commit that referenced this issue Aug 26, 2017
Fix a byte/char confusion issue in the error emitter