rustc_errors: use perfect hashing for character replacements #128463
Failed to set assignee to
b8da67c to 03906c5
r? @ghost
So I guess it only works when it is present in the first version of the OP. Cool.
Was about to queue this and noticed that the other PR still hadn't landed...
03906c5 to 4108ac4
These commits modify the If this was unintentional then you should revert the changes before this PR is merged.
The list of allowed third-party dependencies may have been modified! You must ensure that any new dependencies have compatible licenses before merging.
Now it has! Could you please queue it for benchmarking? I do not think I have enough rights to do it myself, nor do I really remember how it is done.
@bors try
@rust-timer queue
☀️ Try build successful - checks-actions
Finished benchmarking commit (1de00dc): comparison URL.
Overall result: ❌ regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count
This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)
Results (primary 2.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (primary -4.2%, secondary 0.5%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
This benchmark run did not return any relevant results for this metric.
Bootstrap: 760.99s -> 765.201s (0.55%)
Well, the regressions are on syn again (#128169 (comment)), and there are no improvements whatsoever, at least in instruction count. The number of cycles seems to have gotten a bit better overall, but not significantly, so it is probably not worth all the additional dependencies.
Some `const { }` asserts for rust-lang#128200
The correctness of code in rust-lang#128200 relies on an array being sorted (so that it can be used in binary search later), which is currently enforced with `// tidy-alphabetical` (and characters being written in `\u{XXXX}` form), as well as on the lack of duplicate entries with conflicting keys, which is not currently enforced.
This PR changes it to using a `const { }` assertion (and also checks for duplicate entries). Sadly, we cannot use the recently-stabilized `is_sorted_by_key` here, because it is not const (and it would not allow us to check for uniqueness anyway). Instead, let's write a manual loop.
Alternative approach (perfect hash function): rust-lang#128463
r? `@ghost`
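For readers following along, here is a minimal sketch of what such a manual compile-time check could look like. It is not the code from rust-lang#128465: the table contents and the names `REPLACEMENTS` and `is_strictly_sorted` are made up for illustration, and it uses a `const _` item rather than an inline `const { }` block so that it also builds on somewhat older toolchains.

```rust
// Hypothetical table for illustration only; the real table in rustc_errors
// has different contents and a different name.
const REPLACEMENTS: &[(char, &str)] = &[
    ('\u{0009}', "    "),
    ('\u{200d}', ""),
    ('\u{202a}', "\u{fffd}"),
    ('\u{202b}', "\u{fffd}"),
];

// Returns true if the keys are strictly increasing. "Strictly" matters:
// it rules out duplicate keys as well as out-of-order entries, so one loop
// covers both requirements of the binary search.
const fn is_strictly_sorted(table: &[(char, &str)]) -> bool {
    // Iterator adapters are not usable in const fn, hence the manual loop.
    let mut i = 1;
    while i < table.len() {
        // Compare the code points as integers.
        if (table[i - 1].0 as u32) >= (table[i].0 as u32) {
            return false;
        }
        i += 1;
    }
    true
}

// Evaluated at compile time: a badly ordered or duplicated entry turns into
// a build error instead of a silent lookup bug.
const _: () = assert!(
    is_strictly_sorted(REPLACEMENTS),
    "REPLACEMENTS must be sorted by key and free of duplicates"
);
```

Swapping two entries in the table, or repeating a key, should make compilation fail at the `const _` item.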
The correctness of code in #128200 relies on an array being sorted (so that it can be used in binary search later), which is currently enforced with `// tidy-alphabetical` (and characters being written in `\u{XXXX}` form), as well as lack of duplicate entries with conflicting keys, which is not currently enforced.
A const assert or a test can be added checking that (implemented in #128465).
But this PR tries to use perfect hashing instead.
The performance implications are unclear. Asymptotically it's faster (O(1) lookups versus O(log n) for binary search), but in reality we should just benchmark. Plus, if there are no significant performance wins, this entire thing is probably not even worth the additional dependencies it brings.
UPD: funnily enough, there's a PR optimizing the binary search implementation (#128254) in the queue right now. So I guess we have to wait until that is merged too before benchmarking this.
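To make the comparison in the description concrete, here is a rough, dependency-free sketch of the two lookup strategies. It is not the code from this PR, which pulls in additional dependencies for the hashing; the table contents, the names `REPLACEMENTS`, `SLOTS`, `slot`, and `TABLE`, and the hand-picked modulus are all assumptions chosen so that these four illustrative keys happen not to collide.

```rust
// Hypothetical data for illustration; sorted by key for the binary search.
const REPLACEMENTS: &[(char, &str)] = &[
    ('\u{0009}', "    "),
    ('\u{200d}', ""),
    ('\u{202a}', "\u{fffd}"),
    ('\u{202b}', "\u{fffd}"),
];

// Baseline: O(log n) lookup that silently depends on the array being sorted.
fn lookup_binary_search(c: char) -> Option<&'static str> {
    REPLACEMENTS
        .binary_search_by_key(&c, |&(k, _)| k)
        .ok()
        .map(|i| REPLACEMENTS[i].1)
}

// Hand-picked table size: with these four keys, `code point % 8` yields
// the distinct slots 1, 5, 2 and 3, so the hash is perfect for this set
// (though not minimal, since four slots stay empty).
const SLOTS: usize = 8;

const fn slot(c: char) -> usize {
    (c as u32 as usize) % SLOTS
}

// Built at compile time; a collision aborts the build, so the table can
// never end up with conflicting entries for one slot.
static TABLE: [Option<(char, &str)>; SLOTS] = {
    let mut table = [None; SLOTS];
    let mut i = 0;
    while i < REPLACEMENTS.len() {
        let (k, v) = REPLACEMENTS[i];
        let s = slot(k);
        assert!(table[s].is_none(), "hash collision: pick another SLOTS");
        table[s] = Some((k, v));
        i += 1;
    }
    table
};

// O(1) lookup: one hash, one index, one key comparison. The comparison is
// needed because characters outside the key set also hash to some slot.
fn lookup_perfect_hash(c: char) -> Option<&'static str> {
    match TABLE[slot(c)] {
        Some((k, v)) if k == c => Some(v),
        _ => None,
    }
}
```

With a table this small, both versions do only a handful of operations per lookup, which fits the description's point that the asymptotic difference says little here and only a benchmark can settle it.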