Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llvm][CodeGen] Fixed max cycle calculation with zero-cost instructions for window scheduler #99454

Merged
merged 2 commits into from
Jul 24, 2024

Conversation

kaiyan96
Copy link
Contributor

We discovered some scheduling failures occurring when zero-cost instructions were involved. This issue will be addressed by this patch.

@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2024

@llvm/pr-subscribers-backend-hexagon

Author: Kai Yan (kaiyan96)

Changes

We discovered some scheduling failures occurring when zero-cost instructions were involved. This issue will be addressed by this patch.


Full diff: https://github.com/llvm/llvm-project/pull/99454.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/WindowScheduler.cpp (+10-6)
  • (added) llvm/test/CodeGen/Hexagon/swp-ws-zero-cost.mir (+42)
diff --git a/llvm/lib/CodeGen/WindowScheduler.cpp b/llvm/lib/CodeGen/WindowScheduler.cpp
index 0777480499e55..595f7e2e202dd 100644
--- a/llvm/lib/CodeGen/WindowScheduler.cpp
+++ b/llvm/lib/CodeGen/WindowScheduler.cpp
@@ -437,12 +437,16 @@ int WindowScheduler::calculateMaxCycle(ScheduleDAGInstrs &DAG,
       int PredCycle = getOriCycle(PredMI);
       ExpectCycle = std::max(ExpectCycle, PredCycle + (int)Pred.getLatency());
     }
-    // ResourceManager can be used to detect resource conflicts between the
-    // current MI and the previously inserted MIs.
-    while (!RM.canReserveResources(*SU, CurCycle) || CurCycle < ExpectCycle) {
-      ++CurCycle;
-      if (CurCycle == (int)WindowIILimit)
-        return CurCycle;
+    // Zero cost instructions do not need to check resource.
+    if (!TII->isZeroCost(MI.getOpcode())) {
+      // ResourceManager can be used to detect resource conflicts between the
+      // current MI and the previously inserted MIs.
+      while (!RM.canReserveResources(*SU, CurCycle) || CurCycle < ExpectCycle) {
+        ++CurCycle;
+        if (CurCycle == (int)WindowIILimit)
+          return CurCycle;
+      }
+      RM.reserveResources(*SU, CurCycle);
     }
     RM.reserveResources(*SU, CurCycle);
     OriToCycle[getOriMI(&MI)] = CurCycle;
diff --git a/llvm/test/CodeGen/Hexagon/swp-ws-zero-cost.mir b/llvm/test/CodeGen/Hexagon/swp-ws-zero-cost.mir
new file mode 100644
index 0000000000000..24f227d83ea0d
--- /dev/null
+++ b/llvm/test/CodeGen/Hexagon/swp-ws-zero-cost.mir
@@ -0,0 +1,42 @@
+# REQUIRES: asserts
+# RUN: llc --march=hexagon %s -run-pass=pipeliner -debug-only=pipeliner \
+# RUN: -window-sched=force -filetype=null -verify-machineinstrs 2>&1 \
+# RUN: | FileCheck %s
+
+# CHECK-NOT: Can't find a valid II. Keep searching...
+
+---
+name:            relu
+tracksRegLiveness: true
+body:             |
+  bb.0:
+    successors: %bb.2(0x30000000), %bb.1(0x50000000)
+    liveins: $r0, $r1, $r2
+    %0:intregs = COPY $r2
+    %1:intregs = COPY $r1
+    %2:intregs = COPY $r0
+    %3:predregs = C2_cmpeqi %2, 0
+    J2_jumpt killed %3, %bb.2, implicit-def dead $pc
+    J2_jump %bb.1, implicit-def dead $pc
+  bb.1:
+    successors: %bb.3(0x80000000)
+    %4:hvxvr = V6_vd0
+    %5:intregs = A2_addi %2, 31
+    %6:intregs = S2_lsr_i_r %5, 5
+    %7:intregs = COPY %6
+    J2_loop0r %bb.3, %7, implicit-def $lc0, implicit-def $sa0, implicit-def $usr
+    J2_jump %bb.3, implicit-def dead $pc
+  bb.2:
+    PS_jmpret $r31, implicit-def dead $pc
+  bb.3 (machine-block-address-taken):
+    successors: %bb.3(0x7c000000), %bb.2(0x04000000)
+    %8:intregs = PHI %1, %bb.1, %9, %bb.3
+    %10:intregs = PHI %0, %bb.1, %14, %bb.3
+    %11:hvxvr, %9:intregs = V6_vL32b_pi %8, 128
+    %12:intregs = COPY %10
+    %13:hvxvr = V6_vmaxw killed %11, %4
+    %14:intregs = V6_vS32b_pi %12, 128, killed %13
+    ENDLOOP0 %bb.3, implicit-def $pc, implicit-def $lc0, implicit $sa0, implicit $lc0
+    J2_jump %bb.2, implicit-def dead $pc
+...
+

@huaatian huaatian requested a review from arsenm July 18, 2024 08:45
@huaatian huaatian added the mi-sched machine instruction scheduler label Jul 18, 2024
if (CurCycle == (int)WindowIILimit)
return CurCycle;
// Zero cost instructions do not need to check resource.
if (!TII->isZeroCost(MI.getOpcode())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this hook exist? How is it different from isMetaInstruction?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also copy should need resource checks?

Copy link
Contributor Author

@kaiyan96 kaiyan96 Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this hook exist? How is it different from isMetaInstruction?

  • Metainstructions do not participate in scheduling, but zero-cost instructions do. COPY is a typical example of a zero-cost instruction

Also copy should need resource checks?

  • COPY do not occupy hardware resources so they do not require resource checks, they still need to be handled during instruction scheduling to ensure proper execution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COPY do not occupy hardware resources

This is a very target specific assumption but ok

# RUN: -window-sched=force -filetype=null -verify-machineinstrs 2>&1 \
# RUN: | FileCheck %s

# CHECK-NOT: Can't find a valid II. Keep searching...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A check-not check for a specific message is fragile and practically useless. This needs positive checks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHECK-NEXT and -implicit-check-not are more reliable than putting a CHECK-NOT in one place

# RUN: -window-sched=force -filetype=null -verify-machineinstrs 2>&1 \
# RUN: | FileCheck %s

# CHECK-NOT: Can't find a valid II. Keep searching...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHECK-NEXT and -implicit-check-not are more reliable than putting a CHECK-NOT in one place

@huaatian huaatian merged commit d27ee36 into llvm:main Jul 24, 2024
7 checks passed
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
…ns for window scheduler (#99454)

Summary:
We discovered some scheduling failures occurring when zero-cost
instructions were involved. This issue will be addressed by this patch.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60250685
@nathanchance
Copy link
Member

This change crashes clang when building the Linux kernel for ARCH=hexagon. cvise spits out:

enum { false } div64_u64_rem(long, long long *);
struct dp_tu_calc_input {
  int nlanes;
  int dsc_en;
};
struct tu_algo_data {
  long long pclk_fp;
};
long long dp_panel_update_tu_timings___trans_tmp_5,
    _dp_ctrl_calc_tu___trans_tmp_8, _dp_ctrl_calc_tu_temp2_fp;
long _dp_ctrl_calc_tu_temp1_fp;
__attribute__((__malloc__)) void *kzalloc();
long long drm_fixp_from_fraction(long b) {
  _Bool b_neg = 0;
  long b_abs = b_neg ?: b;
  unsigned long long rem;
  long long res_abs = div64_u64_rem(0, &rem);
  int i = 32;
  do {
    res_abs <<= 1;
    if (rem >= b_abs) {
      res_abs |= 1;
      rem -= b_abs;
    }
  } while (--i);
  return res_abs;
}
void _dp_ctrl_calc_tu(struct dp_tu_calc_input *in) {
  struct tu_algo_data *tu = kzalloc();
  struct dp_tu_calc_input __trans_tmp_1 = *in;
  struct tu_algo_data __trans_tmp_2 = *tu;
  {
    struct dp_tu_calc_input in = __trans_tmp_1;
    __trans_tmp_2;
    int nlanes = in.nlanes;
    long temp3_fp;
    if (in.dsc_en)
      goto fec_check;
    _dp_ctrl_calc_tu_temp2_fp = drm_fixp_from_fraction(1);
    temp3_fp = 0;
    long a = temp3_fp;
    a;
    nlanes;
    tu->pclk_fp = _dp_ctrl_calc_tu_temp2_fp;
  fec_check:
    _dp_ctrl_calc_tu_temp1_fp = drm_fixp_from_fraction(1000);
    long b = _dp_ctrl_calc_tu_temp1_fp;
    unsigned result = b;
    dp_panel_update_tu_timings___trans_tmp_5 = result;
  }
  long long b = tu->pclk_fp, result = b;
  _dp_ctrl_calc_tu___trans_tmp_8 = result;
}
void dp_ctrl_calc_tu_parameters() {
  struct dp_tu_calc_input in;
  _dp_ctrl_calc_tu(&in);
}
$ clang --target=hexagon -O2 -c -o /dev/null dp_ctrl.i
...
clang: /home/nathan/tmp/cvise.pDMTndXeEh/src/llvm/include/llvm/ADT/SmallVector.h:308: const_reference llvm::SmallVectorTemplateCommon<std::pair<llvm::SlotIndex, llvm::SlotIndex>>::operator[](size_type) const [T = std::pair<llvm::SlotIndex, llvm::SlotIndex>]: Assertion `idx < size()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.	Program arguments: clang --target=hexagon -O2 -c -o /dev/null dp_ctrl.i
1.	<eof> parser at end of file
2.	Code generation
3.	Running pass 'Function Pass Manager' on module 'dp_ctrl.i'.
4.	Running pass 'Modulo Software Pipelining' on function '@dp_ctrl_calc_tu_parameters'
 #0 0x0000564b03199086 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x5629086)
 #1 0x0000564b03196b4e llvm::sys::RunSignalHandlers() (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x5626b4e)
 #2 0x0000564b031193cd CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #3 0x00007f8e32ac2ae0 (/usr/lib/libc.so.6+0x3cae0)
 #4 0x00007f8e32b1ae44 (/usr/lib/libc.so.6+0x94e44)
 #5 0x00007f8e32ac2a30 raise (/usr/lib/libc.so.6+0x3ca30)
 #6 0x00007f8e32aaa4c3 abort (/usr/lib/libc.so.6+0x244c3)
 #7 0x00007f8e32aaa3df (/usr/lib/libc.so.6+0x243df)
 #8 0x00007f8e32abac67 (/usr/lib/libc.so.6+0x34c67)
 #9 0x0000564b02694625 llvm::LiveRangeCalc::findReachingDefs(llvm::LiveRange&, llvm::MachineBasicBlock&, llvm::SlotIndex, unsigned int, llvm::ArrayRef<llvm::SlotIndex>) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4b24625)
#10 0x0000564b02693757 llvm::LiveRangeCalc::extend(llvm::LiveRange&, llvm::SlotIndex, unsigned int, llvm::ArrayRef<llvm::SlotIndex>) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4b23757)
#11 0x0000564b026972bc llvm::LiveIntervalCalc::extendToUses(llvm::LiveRange&, llvm::Register, llvm::LaneBitmask, llvm::LiveInterval*) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4b272bc)
#12 0x0000564b02696e19 llvm::LiveIntervalCalc::calculate(llvm::LiveInterval&, bool) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4b26e19)
#13 0x0000564b02685125 llvm::LiveIntervals::repairIntervalsInRange(llvm::MachineBasicBlock*, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::MachineInstrBundleIterator<llvm::MachineInstr, false>, llvm::ArrayRef<llvm::Register>) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4b15125)
#14 0x0000564b02ad6b7f llvm::WindowScheduler::updateLiveIntervals() (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4f66b7f)
#15 0x0000564b02ad31c7 llvm::WindowScheduler::generateTripleMBB() (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4f631c7)
#16 0x0000564b02ad1e2c llvm::WindowScheduler::preProcess() (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4f61e2c)
#17 0x0000564b02ad0af6 llvm::WindowScheduler::run() (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4f60af6)
#18 0x0000564b02749967 llvm::MachinePipeliner::runWindowScheduler(llvm::MachineLoop&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4bd9967)
#19 0x0000564b02747b8b llvm::MachinePipeliner::scheduleLoop(llvm::MachineLoop&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4bd7b8b)
#20 0x0000564b027479eb llvm::MachinePipeliner::runOnMachineFunction(llvm::MachineFunction&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4bd79eb)
#21 0x0000564b027076d2 llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x4b976d2)
#22 0x0000564b02c900e6 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x51200e6)
#23 0x0000564b02c98972 llvm::FPPassManager::runOnModule(llvm::Module&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x5128972)
#24 0x0000564b02c90bc2 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x5120bc2)
#25 0x0000564b039d0e20 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x5e60e20)
#26 0x0000564b039f7f87 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x5e87f87)
#27 0x0000564b04e87f69 clang::ParseAST(clang::Sema&, bool, bool) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x7317f69)
#28 0x0000564b03e8ddbd clang::FrontendAction::Execute() (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x631ddbd)
#29 0x0000564b03df1c9d clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x6281c9d)
#30 0x0000564b03f6ac4c clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x63fac4c)
#31 0x0000564b011eb5d5 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x367b5d5)
#32 0x0000564b011e7c1d ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) driver.cpp:0:0
#33 0x0000564b03c16709 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) Job.cpp:0:0
#34 0x0000564b03119106 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x55a9106)
#35 0x0000564b03c15da3 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x60a5da3)
#36 0x0000564b03bcd957 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x605d957)
#37 0x0000564b03bcdeb7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x605deb7)
#38 0x0000564b03bf0659 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x6080659)
#39 0x0000564b011e70bd clang_main(int, char**, llvm::ToolContext const&) (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x36770bd)
#40 0x0000564b011f8a46 main (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x3688a46)
#41 0x00007f8e32aabc88 (/usr/lib/libc.so.6+0x25c88)
#42 0x00007f8e32aabd4c __libc_start_main (/usr/lib/libc.so.6+0x25d4c)
#43 0x0000564b011e54e5 _start (/home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin/clang-20+0x36754e5)
clang: error: clang frontend command failed with exit code 134 (use -v to see invocation)
ClangBuiltLinux clang version 20.0.0git (https://github.com/llvm/llvm-project.git d27ee36cdef28845b4aba4f438c7d8bef4be4da7)
Target: hexagon
Thread model: posix
InstalledDir: /home/nathan/tmp/cvise.pDMTndXeEh/install/llvm-bad/bin
Build config: +assertions
clang: note: diagnostic msg: Error generating preprocessed source(s) - no preprocessable inputs.

@kaiyan96
Copy link
Contributor Author

Thank you for the bug report. We will take care of it promptly.

Harini0924 pushed a commit to Harini0924/llvm-project that referenced this pull request Aug 1, 2024
…ns for window scheduler (llvm#99454)

We discovered some scheduling failures occurring when zero-cost
instructions were involved. This issue will be addressed by this patch.
kaiyan96 added a commit to kaiyan96/llvm-project that referenced this pull request Aug 12, 2024
…ns for window scheduler (llvm#99454)

We discovered some scheduling failures occurring when zero-cost
instructions were involved. This issue will be addressed by this patch.
tru pushed a commit to kaiyan96/llvm-project that referenced this pull request Aug 20, 2024
…ns for window scheduler (llvm#99454)

We discovered some scheduling failures occurring when zero-cost
instructions were involved. This issue will be addressed by this patch.
tru pushed a commit to kaiyan96/llvm-project that referenced this pull request Sep 1, 2024
…ns for window scheduler (llvm#99454)

We discovered some scheduling failures occurring when zero-cost
instructions were involved. This issue will be addressed by this patch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:Hexagon mi-sched machine instruction scheduler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants