Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding an unreachable branch helps optimize the code when matching on x % N #93514

Open
WaffleLapkin opened this issue Jan 31, 2022 · 12 comments · May be fixed by #134626
Open

Adding an unreachable branch helps optimize the code when matching on x % N #93514

WaffleLapkin opened this issue Jan 31, 2022 · 12 comments · May be fixed by #134626
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@WaffleLapkin
Copy link
Member

WaffleLapkin commented Jan 31, 2022

Consider this example:

pub fn parse(x: usize) -> usize {
    match x % 5 {
        0 => f1(x),
        1 => f2(x),
        2 => f3(x),
        3 => f4(x),
        4 => f5(x),
        // 5 | 6 | 7 => loop{},
        _ => loop{},
    }
}

It currently generates similar LLVM-IR:

start:
  %_2 = urem i64 %x, 5, !dbg !10
  switch i64 %_2, label %bb12 [
    i64 0, label %bb2
    i64 1, label %bb4
    i64 2, label %bb6
    i64 3, label %bb8
    i64 4, label %bb10
  ], !dbg !11
; ...
bb12:
  br label %bb12, !dbg !32

Even though the default branch is unreachable (x % 5 can't be greater than 4) it is still generated.

But, when un-commenting 5 | 6 | 7 => loop{} line (still unreachable branch, that does the same as default) the default branch becomes unreachable:

start:
  %_2 = urem i64 %x, 5, !dbg !10
  switch i64 %_2, label %bb14 [
    i64 0, label %bb2
    i64 1, label %bb4
    i64 2, label %bb6
    i64 3, label %bb8
    i64 4, label %bb10
  ], !dbg !11
; ...
bb14:
  unreachable

So, adding an unreachable branch that does the same as the default helps optimize the code.

Some notes:

  • These differences then propagate to assembler, i.e. it also has an unreachable branch when 5 | 6 | 7 is commented-out
  • godbolt link
  • Range patterns (like 5..=7) do not help
  • Making the default branch 5 | 6 | 7 | _ doesn't help either
  • Similar things happen for any N in x % N that is not a power of two -- adding unreachable branches until the last branch has a power-of-two-minus-one-value helps optimizing the code
  • loop{}s can be replaced with anything else, for example unreachable!() or panic!(), effect is still the same
    • in case of panic-like things not removing the panicking branch seems like a big deal
    • I've used loop{}s to make diffs less noisy
  • if-else-if chains are identical to match here
  • clang seems to handle this situation just fine in any case: godbolt link
    • it also makes the last reachable branch default in llvm-ir, instead of generating an unreachable one
@WaffleLapkin
Copy link
Member Author

@rustbot label +A-codegen +C-enhancement +I-slow +T-compiler

@rustbot rustbot added A-codegen Area: Code generation C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jan 31, 2022
@scottmcm
Copy link
Member

I think there's an LLVM bug here, despite your clang example.

Here's a demonstration using just opt that it doesn't notice the defaultdest is unreachable, and thus leaves the infinite loop there: https://llvm.godbolt.org/z/GcG1dM3sq

@scottmcm scottmcm added the A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. label Jan 31, 2022
@erikdesjardins
Copy link
Contributor

In LLVM, when the optimization fires, the unreachable bb is replaced with .unreachabledefault (https://llvm.godbolt.org/z/jWeeWYKhE), so we can tell it's created by createUnreachableSwitchDefault, called from here: https://github.com/llvm/llvm-project/blob/c25ba3c79020246b24aa658ac02770be07eee0f5/llvm/lib/Transforms/Utils/SimplifyCFG.cpp#L4990.

That code determines whether or not the default condition is reachable based on the number of significant bits in the condition, which is why it doesn't work until you have power-of-2-minus-1 cases. It could be made more precise. (I'm looking into this.)

@nikic
Copy link
Contributor

nikic commented Feb 2, 2022

I believe https://reviews.llvm.org/D106056 has already fixed this. Unfortunately, this exposed some compile-time issues in the backend, which haven't been resolved yet, so the patch is reverted for now.

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
@erikdesjardins
Copy link
Contributor

Looks like this will be handled better in LLVM 18 (after #120055): https://godbolt.org/z/reaf5oKcb

@nikic
Copy link
Contributor

nikic commented Jan 27, 2024

@erikdesjardins I believe the relevant change has been reverted.

@mati865
Copy link
Contributor

mati865 commented Jan 27, 2024

There is a PR that adds unreachable like that: #120268

@DianQK

This comment was marked as duplicate.

@DianQK
Copy link
Member

DianQK commented Jan 27, 2024

I believe https://reviews.llvm.org/D106056 has already fixed this. Unfortunately, this exposed some compile-time issues in the backend, which haven't been resolved yet, so the patch is reverted for now.

Looks like the same issue I'm having.

@DianQK
Copy link
Member

DianQK commented Jan 27, 2024

There is a PR that adds unreachable like that: #120268

It only solves enum scenarios. The enum scenario has been solved, this PR is an extension.

llvm/llvm-project#76669 will partially solve the issue. It was also reverted due to some backend issues.

@erikdesjardins
Copy link
Contributor

erikdesjardins commented Aug 1, 2024

llvm/llvm-project#76669 will partially solve the issue. It was also reverted due to some backend issues.

That change got unreverted and was included in the recent LLVM 19 upgrade, so I believe this is now fixed: https://godbolt.org/z/1MvffeWff

@rustbot label E-needs-test

@rustbot rustbot added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Aug 1, 2024
@DianQK
Copy link
Member

DianQK commented Aug 3, 2024

This is probably fixed by llvm/llvm-project#79993.

Hmm, looks like 1.78 has fixed this: https://rust.godbolt.org/z/n4fbG4xj4.

@veera-sivarajan veera-sivarajan linked a pull request Dec 21, 2024 that will close this issue
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 27, 2024
…r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 12, 2025
…r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
bors added a commit to rust-lang-ci/rust that referenced this issue Jan 19, 2025
…r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Feb 10, 2025
…, r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
Zalathar added a commit to Zalathar/rust that referenced this issue Feb 10, 2025
…, r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Feb 10, 2025
…, r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
bors added a commit to rust-lang-ci/rust that referenced this issue Feb 11, 2025
…r=Mark-Simulacrum

Add Four Codegen Tests

Closes rust-lang#74615
Closes rust-lang#123216
Closes rust-lang#49572
Closes rust-lang#93514

This PR adds four codegen tests. The FileCheck assertions were generated with the help of `update_test_checks.py` and `update_llc_test_checks.py` from the LLVM project.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants