Fix the start/end byte positions in the compiler JSON output #65074

Merged
merged 1 commit into from
Oct 25, 2019

Conversation

Rantanen
Contributor

@Rantanen Rantanen commented Oct 3, 2019

Track the changes made during normalization in the SourceFile and use this information to correct the start_byte and end_byte fields in the JSON output.

This should ensure the start/end byte fields can be used to index the original file, even if Rust normalized the source code for parsing purposes. Both CRLF-to-LF conversion and BOM removal are handled by this change.

The rough plan was discussed with @matklad in rust-lang/rustfix#176, although I ended up going with u32 offset tracking so I wouldn't need to deal with u32 + i32 arithmetic when applying the offset to the span byte positions.

Fixes #65029
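
To make the idea in the description concrete, here is a minimal sketch of offset tracking during normalization. It is not the code from this PR: the `NormalizedPos` layout (a normalized position plus a cumulative `u32` count of removed bytes) and the helper names `normalize` and `original_pos` are assumptions made for illustration only.

```rust
/// Hypothetical stand-in for the PR's `NormalizedPos`.
/// `pos` is a byte position in the normalized text; `diff` is the total
/// number of bytes removed from the original up to and including this point.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NormalizedPos {
    pos: u32,
    diff: u32,
}

/// Strips a leading BOM and rewrites CRLF as LF, recording every removal.
fn normalize(src: &str) -> (String, Vec<NormalizedPos>) {
    let mut out = String::with_capacity(src.len());
    let mut table = Vec::new();
    let mut removed: u32 = 0;

    let mut rest = src;
    if let Some(stripped) = rest.strip_prefix('\u{feff}') {
        // The BOM is removed before the first normalized byte.
        removed += '\u{feff}'.len_utf8() as u32;
        table.push(NormalizedPos { pos: 0, diff: removed });
        rest = stripped;
    }

    let mut chars = rest.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\r' && chars.peek() == Some(&'\n') {
            // Drop the CR; the LF that follows will land at `out.len()`.
            removed += 1;
            table.push(NormalizedPos { pos: out.len() as u32, diff: removed });
            continue;
        }
        out.push(c);
    }
    (out, table)
}

/// Maps a byte position in the normalized text back to the original file.
fn original_pos(table: &[NormalizedPos], pos: u32) -> u32 {
    // Number of entries whose `pos` is at or before the queried position.
    let idx = table.partition_point(|np| np.pos <= pos);
    let diff = if idx == 0 { 0 } else { table[idx - 1].diff };
    pos + diff
}
```

Because normalization in this sketch only ever removes bytes, the original position is always the normalized position plus a non-negative offset, which is why a plain `u32` addition is enough and no signed arithmetic is needed.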

@rust-highfive
Collaborator

r? @petrochenkov

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Oct 3, 2019
@Rantanen
Contributor Author

Rantanen commented Oct 3, 2019

I added the normalized_pos field in SourceFile next to the existing position-based fields (multibyte_chars and non_narrow_chars), as it felt conceptually similar to them. I'm not sure whether I should have added it as the last field for chronological reasons.

This got a bit more confusing when implementing encode/decode for the type. There I didn't want to change the existing field indices (I'm not sure whether those functions have any compatibility requirements), so I added normalized_pos as the last field in field id order.

(I also managed to sneak in a typo with my last refactoring that fails the build. I'll rebase this in the morning once my build goes through.)

@rust-highfive
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2019-10-03T22:45:50.7129455Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-10-03T22:45:50.7361669Z ##[command]git config gc.auto 0
2019-10-03T22:45:50.7487592Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-10-03T22:45:50.7494507Z ##[command]git config --get-all http.proxy
2019-10-03T22:45:50.7619058Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/65074/merge:refs/remotes/pull/65074/merge
---
2019-10-03T22:53:00.2779451Z    Compiling serde_json v1.0.40
2019-10-03T22:53:02.0718808Z    Compiling tidy v0.1.0 (/checkout/src/tools/tidy)
2019-10-03T22:53:13.2074542Z     Finished release [optimized] target(s) in 1m 28s
2019-10-03T22:53:13.2150912Z tidy check
2019-10-03T22:53:14.1294428Z tidy error: /checkout/src/libsyntax/tests.rs: too many trailing newlines (2)
2019-10-03T22:53:15.2836883Z some tidy checks failed
2019-10-03T22:53:15.2843409Z 
2019-10-03T22:53:15.2845972Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor"
2019-10-03T22:53:15.2846148Z 
2019-10-03T22:53:15.2846173Z 
2019-10-03T22:53:15.2854873Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
2019-10-03T22:53:15.2854971Z Build completed unsuccessfully in 0:01:31
2019-10-03T22:53:15.2900159Z == clock drift check ==
2019-10-03T22:53:15.2928628Z   local time: Thu Oct  3 22:53:15 UTC 2019
2019-10-03T22:53:15.4439225Z   network time: Thu, 03 Oct 2019 22:53:15 GMT
2019-10-03T22:53:15.4439311Z == end clock drift check ==
2019-10-03T22:53:16.8120859Z ##[error]Bash exited with code '1'.
2019-10-03T22:53:16.8151552Z ##[section]Starting: Checkout
2019-10-03T22:53:16.8154038Z ==============================================================================
2019-10-03T22:53:16.8154095Z Task         : Get sources
2019-10-03T22:53:16.8154160Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@Rantanen Rantanen force-pushed the json-byte-pos branch 2 times, most recently from 994bda0 to d0dc2a8 Compare October 4, 2019 06:51
@Rantanen
Contributor Author

Rantanen commented Oct 4, 2019

Fixed a couple of naming issues and one spaces-within-parens violation. I think those were all of them. As far as I'm concerned, the code should be good to go. The compiler might disagree: my test run is still in progress, but I wanted to get the style issues pushed early so they wouldn't distract too much in the review.

src/libsyntax_pos/lib.rs (outdated, resolved)
src/libsyntax_pos/lib.rs (outdated, resolved)
src/libsyntax/json.rs (outdated, resolved)
src/libsyntax/json/tests.rs (resolved)
@rust-highfive
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2019-10-04T06:51:57.7474491Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-10-04T06:51:57.7735648Z ##[command]git config gc.auto 0
2019-10-04T06:51:57.7831170Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-10-04T06:51:57.7961166Z ##[command]git config --get-all http.proxy
2019-10-04T06:51:57.8072454Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/65074/merge:refs/remotes/pull/65074/merge
---
2019-10-04T07:55:29.9711995Z .................................................................................................... 1500/9103
2019-10-04T07:55:37.3540834Z .................................................................................................... 1600/9103
2019-10-04T07:55:47.0156763Z .................................................................................................... 1700/9103
2019-10-04T07:55:56.9295896Z .......i...............i............................................................................ 1800/9103
2019-10-04T07:56:04.5117020Z ..................................................................................................ii 1900/9103
2019-10-04T07:56:21.5725080Z iii................................................................................................. 2000/9103
2019-10-04T07:56:30.7917861Z .................................................................................................... 2200/9103
2019-10-04T07:56:33.5712899Z .................................................................................................... 2300/9103
2019-10-04T07:56:40.1274287Z .................................................................................................... 2400/9103
2019-10-04T07:56:45.9673838Z .................................................................................................... 2500/9103
---
2019-10-04T07:59:47.1056504Z .......................................................................................i............ 4700/9103
2019-10-04T07:59:55.3186074Z ...i................................................................................................ 4800/9103
2019-10-04T08:00:06.8267049Z .................................................................................................... 4900/9103
2019-10-04T08:00:12.1970339Z .................................................................................................... 5000/9103
2019-10-04T08:00:24.6165656Z ................................................................................ii.ii............... 5100/9103
2019-10-04T08:00:34.5118217Z .................................................................................................... 5300/9103
2019-10-04T08:00:44.9218881Z .................................................................................................... 5400/9103
2019-10-04T08:00:52.0572405Z ..............................................i..................................................... 5500/9103
2019-10-04T08:00:59.2961273Z .................................................................................................... 5600/9103
2019-10-04T08:01:10.3684320Z .................................................................................................... 5700/9103
2019-10-04T08:01:18.4408987Z ...........................................ii...i..ii...........i................................... 5800/9103
2019-10-04T08:01:45.4940074Z .................................................................................................... 6000/9103
2019-10-04T08:01:55.3044672Z .................................................................................................... 6100/9103
2019-10-04T08:02:08.2088696Z ................................................i..ii............................................... 6200/9103
2019-10-04T08:02:32.1328817Z .................................................................................................... 6400/9103
2019-10-04T08:02:34.5183128Z ........i........................................................................................... 6500/9103
2019-10-04T08:02:36.9351961Z ................................................................................i................... 6600/9103
2019-10-04T08:02:39.8428787Z .................................................................................................... 6700/9103
---
2019-10-04T08:07:27.2366257Z  finished in 5.731
2019-10-04T08:07:27.2558027Z Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:07:27.4521991Z 
2019-10-04T08:07:27.4523145Z running 150 tests
2019-10-04T08:07:30.9910521Z i....iii......iii..iiii....i.............................i..i..................i....i.........ii.i.i 100/150
2019-10-04T08:07:33.0854403Z ..iiii..............i.........iii.i.......ii......
2019-10-04T08:07:33.0855850Z 
2019-10-04T08:07:33.0859806Z  finished in 5.830
2019-10-04T08:07:33.1056059Z Check compiletest suite=codegen-units mode=codegen-units (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:07:33.2744555Z 
---
2019-10-04T08:07:35.5037113Z  finished in 2.397
2019-10-04T08:07:35.5237609Z Check compiletest suite=assembly mode=assembly (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:07:35.7019241Z 
2019-10-04T08:07:35.7020267Z running 9 tests
2019-10-04T08:07:35.7021317Z iiiiiiiii
2019-10-04T08:07:35.7022121Z 
2019-10-04T08:07:35.7026304Z  finished in 0.179
2019-10-04T08:07:35.7249225Z Check compiletest suite=incremental mode=incremental (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:07:35.9272433Z 
---
2019-10-04T08:07:55.4723657Z  finished in 19.747
2019-10-04T08:07:55.4963581Z Check compiletest suite=debuginfo mode=debuginfo-gdb+lldb (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:07:55.6914440Z 
2019-10-04T08:07:55.6914808Z running 123 tests
2019-10-04T08:08:21.5127886Z .iiiii...i.....i..i...i..i.i.i..i.ii..i.i.....i..i....ii..........iiii..........i...ii...i.......ii. 100/123
2019-10-04T08:08:26.5512621Z i.i.i......iii.i.....ii
2019-10-04T08:08:26.5514371Z 
2019-10-04T08:08:26.5519688Z  finished in 31.055
2019-10-04T08:08:26.5530350Z Uplifting stage1 rustc (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:08:26.5531001Z Copying stage2 rustc from stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu)
---
2019-10-04T08:21:47.9998121Z 
2019-10-04T08:21:47.9999033Z    Doc-tests core
2019-10-04T08:21:53.3189640Z 
2019-10-04T08:21:53.3190585Z running 2405 tests
2019-10-04T08:22:05.5716228Z ......iiiii......................................................................................... 100/2405
2019-10-04T08:22:17.0647393Z ...............................................................................ii................... 200/2405
2019-10-04T08:22:44.2507114Z .i.................................................................................................. 400/2405
2019-10-04T08:22:55.5572755Z ................................................i..i.................iiii........................... 500/2405
2019-10-04T08:23:17.3158426Z .................................................................................................... 700/2405
2019-10-04T08:23:28.5415172Z .................................................................................................... 800/2405
2019-10-04T08:23:39.7730405Z .................................................................................................... 900/2405
2019-10-04T08:23:50.8558964Z .................................................................................................... 1000/2405
---
2019-10-04T08:28:18.0756337Z 
2019-10-04T08:28:18.0756603Z running 992 tests
2019-10-04T08:28:40.4411697Z i................................................................................................... 100/992
2019-10-04T08:28:52.7406534Z .................................................................................................... 200/992
2019-10-04T08:29:01.6175689Z .................iii......i......i...i......i....................................................... 300/992
2019-10-04T08:29:07.8043904Z .................................................................................................... 400/992
2019-10-04T08:29:16.2149074Z ...................................i..i.................................ii.......................... 500/992
2019-10-04T08:29:32.1011131Z .................................................................................................... 700/992
2019-10-04T08:29:40.8489504Z ..................iiii.............................................................................. 800/992
2019-10-04T08:29:56.7336797Z .................................................................................................... 900/992
2019-10-04T08:30:04.8305995Z ........................................iiii................................................
2019-10-04T08:30:04.8307227Z 
2019-10-04T08:30:04.8368208Z  finished in 203.177
2019-10-04T08:30:04.8387980Z Testing term stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:30:05.0836200Z    Compiling term v0.0.0 (/checkout/src/libterm)
---
2019-10-04T08:34:34.9413781Z 
2019-10-04T08:34:34.9445779Z  finished in 4.017
2019-10-04T08:34:34.9467113Z Testing syntax_pos stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
2019-10-04T08:34:35.1427045Z    Compiling syntax_pos v0.0.0 (/checkout/src/libsyntax_pos)
2019-10-04T08:34:36.2251932Z error[E0599]: no method named `map` found for type `std::vec::Vec<NormalizedPos>` in the current scope
2019-10-04T08:34:36.2255508Z   --> src/libsyntax_pos/tests.rs:26:58
2019-10-04T08:34:36.2261155Z    |
2019-10-04T08:34:36.2261608Z 26 |         let actual_positions : Vec<_> = actual_positions.map(|nc| nc.pos).collect();
2019-10-04T08:34:36.2262342Z    |                                                          ^^^ method not found in `std::vec::Vec<NormalizedPos>`
2019-10-04T08:34:36.2262906Z    = note: the method `map` exists but the following trait bounds were not satisfied:
2019-10-04T08:34:36.2263193Z            `&mut [NormalizedPos] : std::iter::Iterator`
2019-10-04T08:34:36.2263549Z            `&mut std::vec::Vec<NormalizedPos> : std::iter::Iterator`
2019-10-04T08:34:36.2430227Z error: aborting due to previous error
2019-10-04T08:34:36.2430301Z 
2019-10-04T08:34:36.2430622Z For more information about this error, try `rustc --explain E0599`.
2019-10-04T08:34:36.2693594Z error: could not compile `syntax_pos`.
---
2019-10-04T08:34:36.2786339Z == clock drift check ==
2019-10-04T08:34:36.2804301Z   local time: Fri Oct  4 08:34:36 UTC 2019
2019-10-04T08:34:36.5777186Z   network time: Fri, 04 Oct 2019 08:34:36 GMT
2019-10-04T08:34:36.5778187Z == end clock drift check ==
2019-10-04T08:34:37.2317521Z ##[error]Bash exited with code '1'.
2019-10-04T08:34:37.2375423Z ##[section]Starting: Checkout
2019-10-04T08:34:37.2377485Z ==============================================================================
2019-10-04T08:34:37.2377550Z Task         : Get sources
2019-10-04T08:34:37.2377619Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.


@petrochenkov
Contributor

r? @matklad

@rust-highfive rust-highfive assigned matklad and unassigned petrochenkov Oct 4, 2019
@Rantanen
Contributor Author

Rantanen commented Oct 4, 2019

I'll also try to remember to run the libsyntax_pos unit tests this time. Unfortunately, UDP tests are failing on my machine, so running the full test suite is not possible for me. :|
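
For reference, the E0599 failure in the CI log above comes from calling `map` directly on a `Vec`, which is not itself an iterator. The thread does not show how the test was actually fixed; a minimal sketch of one common fix, using a hypothetical two-field `NormalizedPos` and a hypothetical helper named `positions`, could look like this:

```rust
/// Hypothetical stand-in for the type named in the error message.
#[derive(Debug, Clone, Copy)]
struct NormalizedPos {
    pos: u32,
    diff: u32,
}

/// Collects the `pos` field of every entry.
/// `Vec<T>` does not implement `Iterator`, so an iterator is requested
/// explicitly with `into_iter()` before calling `map`.
fn positions(entries: Vec<NormalizedPos>) -> Vec<u32> {
    entries.into_iter().map(|nc| nc.pos).collect()
}
```

Using `iter()` instead of `into_iter()` would work equally well here when the vector is still needed afterwards.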

@rust-highfive
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2019-10-04T17:56:28.8054148Z ##[command]git remote add origin https://github.com/rust-lang/rust
2019-10-04T17:56:28.8312524Z ##[command]git config gc.auto 0
2019-10-04T17:56:28.8400880Z ##[command]git config --get-all http.https://github.com/rust-lang/rust.extraheader
2019-10-04T17:56:28.8465965Z ##[command]git config --get-all http.proxy
2019-10-04T17:56:28.8618800Z ##[command]git -c http.extraheader="AUTHORIZATION: basic ***" fetch --force --tags --prune --progress --no-recurse-submodules --depth=2 origin +refs/heads/*:refs/remotes/origin/* +refs/pull/65074/merge:refs/remotes/pull/65074/merge
---
2019-10-04T18:03:47.7157516Z    Compiling serde_json v1.0.40
2019-10-04T18:03:50.5272471Z    Compiling tidy v0.1.0 (/checkout/src/tools/tidy)
2019-10-04T18:04:01.2609780Z     Finished release [optimized] target(s) in 1m 33s
2019-10-04T18:04:01.2696648Z tidy check
2019-10-04T18:04:01.5555780Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:1: CR character
2019-10-04T18:04:01.5556628Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:2: CR character
2019-10-04T18:04:01.5557125Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:3: CR character
2019-10-04T18:04:01.5557564Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:4: CR character
2019-10-04T18:04:01.5558038Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:5: CR character
2019-10-04T18:04:01.5558454Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:6: CR character
2019-10-04T18:04:01.5558887Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:7: CR character
2019-10-04T18:04:01.5559310Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:8: CR character
2019-10-04T18:04:01.5559720Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:9: CR character
2019-10-04T18:04:01.5560301Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:10: CR character
2019-10-04T18:04:01.5560712Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:11: CR character
2019-10-04T18:04:01.5561146Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:12: CR character
2019-10-04T18:04:01.5561808Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:13: CR character
2019-10-04T18:04:01.5562305Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:14: CR character
2019-10-04T18:04:01.5562940Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:15: CR character
2019-10-04T18:04:01.5563358Z tidy error: /checkout/src/test/ui/json-bom-plus-crlf.rs:16: CR character
2019-10-04T18:04:02.3036707Z tidy error: /checkout/src/libsyntax/json/tests.rs: too many trailing newlines (2)
2019-10-04T18:04:03.5156055Z some tidy checks failed
2019-10-04T18:04:03.5156226Z 
2019-10-04T18:04:03.5157170Z command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor"
2019-10-04T18:04:03.5157284Z 
2019-10-04T18:04:03.5157312Z 
2019-10-04T18:04:03.5162382Z failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
2019-10-04T18:04:03.5162484Z Build completed unsuccessfully in 0:01:36
2019-10-04T18:04:03.5217091Z == clock drift check ==
2019-10-04T18:04:03.5246201Z   local time: Fri Oct  4 18:04:03 UTC 2019
2019-10-04T18:04:03.8019457Z   network time: Fri, 04 Oct 2019 18:04:03 GMT
2019-10-04T18:04:03.8020054Z == end clock drift check ==
2019-10-04T18:04:05.1954794Z ##[error]Bash exited with code '1'.
2019-10-04T18:04:05.1990377Z ##[section]Starting: Checkout
2019-10-04T18:04:05.1992408Z ==============================================================================
2019-10-04T18:04:05.1992481Z Task         : Get sources
2019-10-04T18:04:05.1992528Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.


@Rantanen
Contributor Author

Rantanen commented Oct 4, 2019

I also found a way to run the tidy tests. All the relevant tests passed on my machine, so I think this is good to go from my end, as long as I didn't break anything in the rebase and it's okay to leave the with_default_globals unit tests in for now.

Member

@matklad matklad left a comment

LGTM, just a couple of nits left

src/libsyntax/json/tests.rs (outdated, resolved)
src/libsyntax/json/tests.rs (outdated, resolved)
src/libsyntax/json/tests.rs (resolved)
src/librustc/ich/impls_syntax.rs (outdated, resolved)
@matklad
Member

matklad commented Oct 8, 2019

@bors r+

Thanks!

@bors
Contributor

bors commented Oct 8, 2019

📌 Commit bbf262d has been approved by matklad

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Oct 8, 2019
@matklad matklad added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Oct 8, 2019
Centril added a commit to Centril/rust that referenced this pull request Oct 8, 2019
Fix the start/end byte positions in the compiler JSON output

Track the changes made during normalization in the `SourceFile` and use this information to correct the `start_byte` and `end_byte` fields in the JSON output.

This should ensure the start/end byte fields can be used to index the original file, even if Rust normalized the source code for parsing purposes. Both CRLF-to-LF conversion and BOM removal are handled by this change.

The rough plan was discussed with @matklad in rust-lang/rustfix#176, although I ended up going with `u32` offset tracking so I wouldn't need to deal with `u32 + i32` arithmetic when applying the offset to the span byte positions.

Fixes rust-lang#65029
@Centril
Contributor

Centril commented Oct 8, 2019

Failed in #65199 (comment), @bors r-

matklad added a commit to matklad/rust that referenced this pull request Oct 10, 2019
…, r=petrochenkov"

This reverts commit ef1ecbe, reversing
changes made to fc8765d.

That change unfortunately broke rustfix on Windows:

rust-lang/rustfix#176

Specifically, what ef1ecbe did was to
enforce normalization of \r\n to \n at file loading time, similarly to
how we deal with Byte Order Mark. Normalization changes raw offsets in
files, which are exposed via `--error-format=json`, and used by rustfix.

The proper solution here (which also handles the latent case with BOM) is

rust-lang#65074

However, since it's somewhat involved, and we are time sensitive, we
prefer to revert the original change on beta.
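
As a small illustration of the offset drift that the commit message describes, the following sketch normalizes a CRLF file naively and then tries to use a span computed on the normalized text to slice the original. The helpers `normalize_crlf` and `find_span` are hypothetical and exist only for this example; they are not compiler or rustfix APIs.

```rust
/// Naive CRLF-to-LF normalization that keeps no record of what was removed.
fn normalize_crlf(src: &str) -> String {
    src.replace("\r\n", "\n")
}

/// Byte range (start, end) of the first occurrence of `needle` in `text`.
fn find_span(text: &str, needle: &str) -> Option<(usize, usize)> {
    text.find(needle).map(|start| (start, start + needle.len()))
}

/// Returns the slice of `original` addressed by a span that was computed
/// on the normalized text, which is what a tool would see if the reported
/// offsets were not corrected.
fn slice_original_with_normalized_span<'a>(original: &'a str, needle: &str) -> Option<&'a str> {
    let normalized = normalize_crlf(original);
    let (start, end) = find_span(&normalized, needle)?;
    original.get(start..end)
}
```

With a two-line CRLF input, a span found on the second line of the normalized text points one byte too early in the original, and the error grows by one byte for every preceding CRLF line ending.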
@matklad
Member

matklad commented Oct 10, 2019

Submitted #65273 which should relieve time pressure here, and increase our confidence that the backport is correct.

@joelpalmer

Ping from Triage: Hi @Rantanen, any updates?

@Rantanen
Contributor Author

Oh, sorry. I was under the impression this was good to go and that the review was taking a while only because people were busy with other things.

Let's squash the commits so that git history is not confusing about relative/absolute offsets?

I squashed the "Fix implementation" commits into the "Fix the start/end byte positions in the compiler JSON output", but still left the logically separate "Add unit tests" commits alone. Didn't know if the wish was to squash them all into one.

I've now done so. Do let me know if there's still something missing.

@matklad
Member

matklad commented Oct 22, 2019

Sorry @Rantanen, I missed the force push. The changes look good to me now!

@bors r+

@bors
Contributor

bors commented Oct 22, 2019

📌 Commit ff1860a has been approved by matklad

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Oct 22, 2019
@bors
Contributor

bors commented Oct 22, 2019

⌛ Testing commit ff1860a with merge 93193b822279b7c5a943dc56de5e4296934a9ea0...

@rust-highfive
Collaborator

Your PR failed (pretty log, raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

2019-10-22T19:11:29.5676683Z do so (now or later) by using -b with the checkout command again. Example:
2019-10-22T19:11:29.5676754Z 
2019-10-22T19:11:29.5676834Z   git checkout -b <new-branch-name>
2019-10-22T19:11:29.5676890Z 
2019-10-22T19:11:29.5677015Z HEAD is now at 93193b822 Auto merge of #65074 - Rantanen:json-byte-pos, r=matklad
2019-10-22T19:11:29.6094268Z ##[section]Starting: Collect CPU-usage statistics in the background
2019-10-22T19:11:29.6228055Z ==============================================================================
2019-10-22T19:11:29.6228152Z Task         : Bash
2019-10-22T19:11:29.6228245Z Description  : Run a Bash script on macOS, Linux, or Windows
---
2019-10-22T19:11:30.6842714Z ==============================================================================
2019-10-22T19:11:32.4133421Z Generating script.
2019-10-22T19:11:32.4196578Z ========================== Starting Command Output ===========================
2019-10-22T19:11:32.4579240Z ##[command]"C:\windows\system32\cmd.exe" /D /E:ON /V:OFF /S /C "CALL "D:\a\_temp\98eab51a-8581-4c15-a65d-9cca954d7520.cmd""
2019-10-22T19:11:44.8694315Z iwr : The remote name could not be resolved: 'rust-lang-ci-mirrors.s3-us-west-1.amazonaws.com'
2019-10-22T19:11:44.8694644Z At line:1 char:43
2019-10-22T19:11:44.8694744Z + ... yContinue'; iwr -outf sccache\sccache.exe https://rust-lang-ci-mirror ...
2019-10-22T19:11:44.8695434Z +                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2019-10-22T19:11:44.8696092Z     + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebExc 
2019-10-22T19:11:44.8696701Z    eption
2019-10-22T19:11:44.8697698Z     + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
2019-10-22T19:11:44.8697978Z  
2019-10-22T19:11:45.0008378Z ##[error]Cmd.exe exited with code '1'.
2019-10-22T19:11:45.0794337Z ##[section]Starting: Upload CPU usage statistics
2019-10-22T19:11:45.0988325Z ==============================================================================
2019-10-22T19:11:45.0988464Z Task         : Bash
2019-10-22T19:11:45.0988546Z Description  : Run a Bash script on macOS, Linux, or Windows
---
2019-10-22T19:11:45.4330899Z ========================== Starting Command Output ===========================
2019-10-22T19:11:45.4337286Z [command]"C:\Program Files\Git\bin\bash.exe" --noprofile --norc /d/a/_temp/e72207d0-9cf0-4fcd-8292-6d34481d4715.sh
2019-10-22T19:11:45.4830315Z /d/a/_temp/e72207d0-9cf0-4fcd-8292-6d34481d4715.sh: line 1: aws: command not found
2019-10-22T19:11:45.4877604Z 
2019-10-22T19:11:45.4900118Z ##[error]Bash exited with code '127'.
2019-10-22T19:11:45.4977285Z ##[section]Starting: Checkout
2019-10-22T19:11:45.5089986Z ==============================================================================
2019-10-22T19:11:45.5090088Z Task         : Get sources
2019-10-22T19:11:45.5090198Z Description  : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.


@bors
Contributor

bors commented Oct 22, 2019

💔 Test failed - checks-azure

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Oct 22, 2019
@Rantanen
Contributor Author

iwr : The remote name could not be resolved: 'rust-lang-ci-mirrors.s3-us-west-1.amazonaws.com'

I'm assuming this is a transient error. If it isn't and I need to do something about it, I'll need a bit more information than that. :)

@matklad
Member

matklad commented Oct 23, 2019

hm, indeed seems spurious

@bors retry

bors added the S-waiting-on-bors label and removed the S-waiting-on-review label Oct 23, 2019
Centril added a commit to Centril/rust that referenced this pull request Oct 25, 2019
Fix the start/end byte positions in the compiler JSON output

Track the changes made during normalization in the `SourceFile` and use this information to correct the `start_byte` and `end_byte` fields in the JSON output.

This should ensure the start/end byte fields can be used to index the original file, even if Rust normalized the source code for parsing purposes. Both CRLF to LF and BOM removal are handled with this one.

The rough plan was discussed with @matklad in rust-lang/rustfix#176 - although I ended up going with `u32` offset tracking so I wouldn't need to deal with `u32 + i32` arithmetic when applying the offset to the span byte positions.

Fixes rust-lang#65029
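
To make the offset-tracking idea above concrete, here is a minimal sketch of one way it could work. It is not the code from this PR: `NormalizedPos`, `normalize`, and `original_pos` are hypothetical names chosen for illustration, and the sketch assumes that each dropped byte is recorded as a running `u32` difference, so mapping a position back only ever needs an addition.

```rust
/// Hypothetical record (assumption, not the PR's type): from normalized byte
/// position `pos` onward, the original file is `diff` bytes further along.
#[derive(Debug, Clone, Copy, PartialEq)]
struct NormalizedPos {
    pos: u32,
    diff: u32,
}

/// Strips a leading UTF-8 BOM and rewrites CRLF to LF, recording every place
/// where a byte was dropped together with the running total of dropped bytes.
fn normalize(src: &str) -> (String, Vec<NormalizedPos>) {
    let mut out = String::with_capacity(src.len());
    let mut table = Vec::new();
    let mut diff = 0u32;

    let mut rest = src;
    if let Some(stripped) = rest.strip_prefix('\u{feff}') {
        diff += 3; // the BOM occupies three bytes in UTF-8
        table.push(NormalizedPos { pos: 0, diff });
        rest = stripped;
    }

    let mut chars = rest.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\r' && chars.peek() == Some(&'\n') {
            // Emit a single '\n' for the pair; positions after it shift by one more byte.
            chars.next();
            out.push('\n');
            diff += 1;
            table.push(NormalizedPos { pos: out.len() as u32, diff });
        } else {
            out.push(c);
        }
    }
    (out, table)
}

/// Maps a byte position in the normalized text back to the original file.
fn original_pos(table: &[NormalizedPos], pos: u32) -> u32 {
    // Count the entries at or before `pos`; the last of them holds the diff to apply.
    let idx = table.partition_point(|np| np.pos <= pos);
    let diff = if idx == 0 { 0 } else { table[idx - 1].diff };
    pos + diff
}
```

For the input `"\u{feff}a\r\nb"`, the normalized text is `"a\nb"`, and the normalized span `2..3` (the `b`) maps back to `6..7`, which indexes `b` in the original bytes. Because the differences only grow, the table stays sorted by `pos` and the lookup can be a binary search.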
bors added a commit that referenced this pull request Oct 25, 2019
Rollup of 9 pull requests

Successful merges:

 - #64639 (Stabilize `#[non_exhaustive]` (RFC 2008))
 - #65074 (Fix the start/end byte positions in the compiler JSON output)
 - #65315 (Intern place projection)
 - #65685 (Fix check of `statx` and handle EPERM)
 - #65731 (Prevent unnecessary allocation in PathBuf::set_extension.)
 - #65740 (Fix default "disable-shortcuts" feature value)
 - #65787 (move panictry! to where it is used.)
 - #65789 (move Attribute::with_desugared_doc to librustdoc)
 - #65790 (move report_invalid_macro_expansion_item to item.rs)

Failed merges:

r? @ghost
@bors bors merged commit ff1860a into rust-lang:master Oct 25, 2019
Mark-Simulacrum pushed a commit to Mark-Simulacrum/rust that referenced this pull request Oct 26, 2019
…, r=petrochenkov"

This reverts commit ef1ecbe, reversing
changes made to fc8765d.

That change unfortunately broke rustfix on Windows:

rust-lang/rustfix#176

Specifically, what ef1ecbe did was to
enforce normalization of \r\n to \n at file loading time, similarly to
how we deal with Byte Order Mark. Normalization changes raw offsets in
files, which are exposed via `--error-format=json`, and used by rustfix.

The proper solution here (which also handles the latent case with BOM) is

rust-lang#65074

However, since it's somewhat involved, and we are time sensitive, we
prefer to revert the original change on beta.
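
The breakage described in this commit message can be reproduced in a few lines. The following is only an illustrative sketch, not compiler or rustfix code: `normalize_crlf` and `find_span` are hypothetical helpers that stand in for the file loader and for a tool that applies byte offsets to the file on disk.

```rust
/// Hypothetical stand-in for normalization at file loading time: CRLF becomes LF.
fn normalize_crlf(src: &str) -> String {
    src.replace("\r\n", "\n")
}

/// Byte range of the first occurrence of `needle` in `text`, if any.
fn find_span(text: &str, needle: &str) -> Option<(usize, usize)> {
    text.find(needle).map(|start| (start, start + needle.len()))
}
```

With the original text `"let a = 1;\r\nlet b = 2;\r\n"`, the span of `b` computed on the normalized text is `15..16`, but in the original bytes `15..16` is the space before `b`; the correct range there is `16..17`. Every CRLF before a span shifts it by one more byte, which is why a tool that indexes the on-disk file with uncorrected offsets edits the wrong bytes.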
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

libsyntax JSON output bytes counts ignore CRLF normalization
8 participants