Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix hashing for windows paths containing a CurDir component #93697

Merged
merged 2 commits into from
Feb 12, 2022

Conversation

the8472
Copy link
Member

@the8472 the8472 commented Feb 6, 2022

  • the logic only checked for / but not for \
  • verbatim paths shouldn't skip items at all since they don't get normalized
  • the extra branches get optimized out on unix since is_sep_byte is a trivial comparison and is_verbatim is always-false
  • tests lacked windows coverage for these cases

That lead to equal paths not having equal hashes and to unnecessary collisions.

* the logic only checked for / but not for \
* verbatim paths shouldn't skip items at all since they don't get normalized
* the extra branches get optimized out on unix since is_sep_byte is a trivial comparison and is_verbatim is always-false
* tests lacked windows coverage for these cases

That lead to equal paths not having equal hashes and to unnecessary collisions.
@the8472 the8472 added O-windows Operating system: Windows T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 6, 2022
@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 6, 2022
@the8472
Copy link
Member Author

the8472 commented Feb 6, 2022

Lacking a windows machine (and too lazy to setup a VM) I haven't actually run the tests on windows.

@Mark-Simulacrum
Copy link
Member

That lead to equal paths not having equal hashes and to unnecessary collisions.

Equal paths not having equal hashes means that things like HashMap etc may be just broken today, right? Is this a recently introduced bug we should be backporting a fix for (IIRC, I may have approved a PR in this area somewhat recently).

@Mark-Simulacrum
Copy link
Member

@bors r+ rollup=iffy

In the meantime, this seems good to me.

@bors
Copy link
Contributor

bors commented Feb 6, 2022

📌 Commit 45082b0 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 6, 2022
@the8472
Copy link
Member Author

the8472 commented Feb 6, 2022

Equal paths not having equal hashes means that things like HashMap etc may be just broken today, right?

Yes.

Is this a recently introduced bug we should be backporting a fix for (IIRC, I may have approved a PR in this area somewhat recently).

#90596 (1.58).

@Mark-Simulacrum Mark-Simulacrum added the beta-nominated Nominated for backporting to the compiler in the beta channel. label Feb 6, 2022
@Mark-Simulacrum
Copy link
Member

I'm going to beta nominate then for backport to 1.59, since it seems plausible that folks could hit it in the wild.

It might be worth considering running a fuzzer or similar against the old and new impls, maybe that could help uncover any other hidden bugs...?

@the8472
Copy link
Member Author

the8472 commented Feb 6, 2022

I'll take a stab at that, would be my first time using one and this looks like a good first test case.

@the8472
Copy link
Member Author

the8472 commented Feb 6, 2022

I'm going to beta nominate then for backport to 1.59, since it seems plausible that folks could hit it in the wild.

Just to be explicit, identical paths will hash fine. It's only some not-identical-but-equal paths, on windows, that would hash incorrectly. Which means one would have to put something under one such path into a hashmap and then construct a different path as lookup key. And one can't use PathBuf::push to construct such a path because push is also normalizing. Not impossible, but probably mitigates the impact.

@m-ou-se m-ou-se added the beta-accepted Accepted for backporting to the compiler in the beta channel. label Feb 9, 2022
@Mark-Simulacrum
Copy link
Member

@bors p=1 -- beta-backport accepted

@bors
Copy link
Contributor

bors commented Feb 11, 2022

⌛ Testing commit 45082b0 with merge 84f41fa6091a72dd04fda915e16a37a4dc56cf60...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Feb 11, 2022

💔 Test failed - checks-actions

@bors bors added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Feb 11, 2022
@the8472
Copy link
Member Author

the8472 commented Feb 11, 2022

The failing test

// run-pass
use std::collections::HashMap;
use std::path::Path;
fn main() {
let mut map = HashMap::new();
map.insert(Path::new("a"), 0);
map.get(Path::new("a"));
}

Running it locally with the wasm32-unknown-unknown target gets a little further and fails with

/home/the8472/workspace/rust/src/etc/wasm32-shim.js:17
let m = new WebAssembly.Module(buffer);
        ^

CompileError: WebAssembly.Module(): Compiling function #3:"_ZN11issue_230364main17hf6fa4aae1344ccadE" failed: Invalid opcode 0xff @+874
    at Object.<anonymous> (/home/the8472/workspace/rust/src/etc/wasm32-shim.js:17:9)
    at Module._compile (node:internal/modules/cjs/loader:1097:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1149:10)
    at Module.load (node:internal/modules/cjs/loader:975:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
    at node:internal/main/run_main_module:17:47

An LLVM bug?

@Mark-Simulacrum
Copy link
Member

Maybe. Could be a NodeJS/V8 parsing bug too. Not really sure what the best way to investigate is -- probably cross-compiling to wasm32 locally or otherwise getting that bytecode to try and run it against a newer NodeJS or perhaps load it up in a disassembler of some kind?

It may be worth simplifying the test too, we probably just need to hash a path...

@the8472
Copy link
Member Author

the8472 commented Feb 11, 2022

My system nodejs is 17.3, that's fairly recent.

It may be worth simplifying the test too, we probably just need to hash a path...

Replacing the test with the following makes it pass.

    let mut hasher = RandomState::new().build_hasher();
    Path::new("a").hash(&mut hasher);
    dbg!(hasher.finish());

@the8472
Copy link
Member Author

the8472 commented Feb 11, 2022

Rebuilt LLVM with asserts enabled, now I get the same error as CI

rustc: /home/the8472/workspace/rust/src/llvm-project/llvm/lib/MC/WasmObjectWriter.cpp:149: void {anonymous}::writePatchableLEB(llvm::raw_pwrite_stream&, uint64_t, uint64_t) [with int W = 5; uint64_t = long unsigned int]: Assertion `SizeLen == W' failed.

@Mark-Simulacrum
Copy link
Member

Hm -- I wonder if we can figure out a small delta on the input code to land this PR and then follow up on the LLVM bug separately? At minimum, a minimal LLVM IR example would be good to poke LLVM folks with.

@the8472
Copy link
Member Author

the8472 commented Feb 11, 2022

The LLVM IR from the failing UI test: repro.ir.txt
Running it through llc repro.ir -march=wasm32 -filetype=obj doesn't trigger the assert, but then passing the output to wasm-dis prints the following error, so I guess llc just doesn't get built with asserts.

[parse exception: LEB dropped bits only valid for signed LEB]

@Mark-Simulacrum
Copy link
Member

Hm. Well, maybe let's file a rust-lang/rust bug with that and ping the LLVM group -- I would like to see this PR land ideally, but it sounds like that might be tough to have happen in a shortish time frame.

I don't really have much expertise myself at digging in LLVM.

@the8472
Copy link
Member Author

the8472 commented Feb 11, 2022

If we want a fix for beta then another option is to revert #90596, file a LLVM bug and then try to reland the version from this PR when it is fixed.

@Mark-Simulacrum
Copy link
Member

Yeah, that's probably the right approach. Let's go ahead and revert #90596 on master to avoid needing to remember beta backports as well, and then we can backport that. Once the LLVM bug is fixed we can re-land the work (in combination with this PR).

@the8472
Copy link
Member Author

the8472 commented Feb 11, 2022

Huh. I tried to selectively revert a083dd6 (on top of master, not this PR of course) which is the subset of #90596 that this PR touches too and it's still failing in that UI test.

Doing some more tests...

@the8472
Copy link
Member Author

the8472 commented Feb 12, 2022

A complete revert of #90596 also triggers this assert. I suspect this bug has been around for a while and is highly responsive to perturbations.

My observations so far

  • master: [X] windows bug [ ] wasm bug
  • this PR: [ ] windows bug [X] wasm bug
  • revert 90596 whole: [ ] windows bug [X] wasm bug
  • revert a083dd6: [ ] windows bug [X] wasm bug
  • revert 7f6e080 a6e0aa2: [X] windows bug [ ] wasm bug

We can conclude that the universe does not allow wasm and windows to exist simultaneously.

Would it be ok to disable the test on wasm? It seems somewhat questionable since miscompilations of Path would be fairly serious. On the other hand the bug seems fragile in the first place.

E.g. changing the test from

     map.insert(Path::new("a"), 0); 
     map.get(Path::new("a")); 

to

    map.insert(Path::new("a"), 0);
    map[Path::new("a")];

or

     map.insert(Path::new("aa/./b"), 0); 
     map.get(Path::new("aa/b")); 

makes it pass.

@Mark-Simulacrum
Copy link
Member

I think we should disable the test - wasm is not a tier 1 platform, windows is more important in that sense.

A fix applied to std::Path::hash triggers a miscompilation/assert in LLVM in this test on wasm32.
The miscompilation appears to pre-existing. Reverting some previous changes done std::Path also trigger it
and slight modifications such as changing the test path from "a" to "ccccccccccc" also make it pass, indicating
it's very flaky.
Since the fix is for a higher-tier platform than wasm it takes precedence.
@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Feb 12, 2022
@the8472
Copy link
Member Author

the8472 commented Feb 12, 2022

Disabled the test. Since the issue appears to be preexisting and just trigger unreliable it shouldn't affect the beta nomination but you might want to reevaluate that anyway.

@Mark-Simulacrum
Copy link
Member

I think the beta backport likely still makes sense, I suspect this assert is not really related to this PR (or even the previous one), just triggering some latent state inside LLVM.

@bors r+

@bors
Copy link
Contributor

bors commented Feb 12, 2022

📌 Commit 1d21ce7 has been approved by Mark-Simulacrum

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 12, 2022
@bors
Copy link
Contributor

bors commented Feb 12, 2022

⌛ Testing commit 1d21ce7 with merge 9c3a3e3...

@bors
Copy link
Contributor

bors commented Feb 12, 2022

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 9c3a3e3 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Feb 12, 2022
@bors bors merged commit 9c3a3e3 into rust-lang:master Feb 12, 2022
@rustbot rustbot added this to the 1.60.0 milestone Feb 12, 2022
@Mark-Simulacrum Mark-Simulacrum removed the beta-nominated Nominated for backporting to the compiler in the beta channel. label Feb 12, 2022
@Mark-Simulacrum Mark-Simulacrum modified the milestones: 1.60.0, 1.59.0 Feb 12, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (9c3a3e3): comparison url.

Summary: This benchmark run did not return any relevant results.

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@rustbot label: -perf-regression

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 12, 2022
…ulacrum

[beta] backports

This backports:

*  Complete removal of #[main] attribute from compiler rust-lang#93753
*  Resolve lifetimes for const generic defaults rust-lang#93669
*  backport llvm fix for issue 91671. rust-lang#93426
*  Fix invalid special casing of the unreachable! macro rust-lang#93179
*  Fix hashing for windows paths containing a CurDir component rust-lang#93697

r? `@Mark-Simulacrum`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
beta-accepted Accepted for backporting to the compiler in the beta channel. merged-by-bors This PR was explicitly merged by bors. O-windows Operating system: Windows S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants