[TailDup] Allow large number of predecessors/successors without phis. #116072
Conversation
This adjusts the threshold logic added in llvm#78582 to only trigger for cases where there are actually phis to duplicate, either in TailBB or in one of its successors. In cases where there are no phis, we only pay the cost of extra edges, with no explosion in PHI-related instructions. This improves the performance of Python on some inputs by 2-3% on Apple Silicon CPUs.
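For readers skimming the diff below, here is a minimal standalone sketch of the adjusted gating logic. The types, field names, and limit values are illustrative only, not LLVM's actual API (the real code walks MachineBasicBlock/MachineInstr and uses the TailDupPredSize/TailDupSuccSize options):

```cpp
#include <cstddef>

// Illustrative stand-in for the CFG shape TailDuplicator inspects.
struct BlockShape {
  std::size_t NumPreds = 0;    // predecessors of the tail block
  std::size_t NumSuccs = 0;    // successors of the tail block
  std::size_t NumPhis = 0;     // PHIs in the tail block itself
  std::size_t NumSuccPhis = 0; // PHIs across all successor blocks
};

// Hypothetical limits standing in for TailDupPredSize/TailDupSuccSize.
constexpr std::size_t PredLimit = 16;
constexpr std::size_t SuccLimit = 16;

// Returns true if tail duplication should be rejected by the threshold.
bool rejectedByThreshold(const BlockShape &Tail) {
  // After this patch, the pred/succ threshold only fires when there are
  // PHIs whose incoming-value lists would multiply after duplication.
  // With no PHIs anywhere, we only pay for extra CFG edges, so allow it.
  if (Tail.NumPreds > PredLimit && Tail.NumSuccs > SuccLimit)
    return Tail.NumPhis != 0 || Tail.NumSuccPhis != 0;
  return false;
}
```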
@llvm/pr-subscribers-backend-x86 Author: Florian Hahn (fhahn). Full diff: https://github.com/llvm/llvm-project/pull/116072.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/TailDuplicator.cpp b/llvm/lib/CodeGen/TailDuplicator.cpp
index d0013923838b35..a333122b2ba035 100644
--- a/llvm/lib/CodeGen/TailDuplicator.cpp
+++ b/llvm/lib/CodeGen/TailDuplicator.cpp
@@ -573,14 +573,6 @@ bool TailDuplicator::shouldTailDuplicate(bool IsSimple,
if (TailBB.isSuccessor(&TailBB))
return false;
- // Duplicating a BB which has both multiple predecessors and successors will
- // result in a complex CFG and also may cause huge amount of PHI nodes. If we
- // want to remove this limitation, we have to address
- // https://github.com/llvm/llvm-project/issues/78578.
- if (TailBB.pred_size() > TailDupPredSize &&
- TailBB.succ_size() > TailDupSuccSize)
- return false;
-
// Set the limit on the cost to duplicate. When optimizing for size,
// duplicate only one, because one branch instruction can be eliminated to
// compensate for the duplication.
@@ -618,6 +610,7 @@ bool TailDuplicator::shouldTailDuplicate(bool IsSimple,
// Check the instructions in the block to determine whether tail-duplication
// is invalid or unlikely to be profitable.
unsigned InstrCount = 0;
+ unsigned NumPhis = 0;
for (MachineInstr &MI : TailBB) {
// Non-duplicable things shouldn't be tail-duplicated.
// CFI instructions are marked as non-duplicable, because Darwin compact
@@ -661,6 +654,20 @@ bool TailDuplicator::shouldTailDuplicate(bool IsSimple,
if (InstrCount > MaxDuplicateCount)
return false;
+ NumPhis += MI.isPHI();
+ }
+
+ // Duplicating a BB which has both multiple predecessors and successors may
+ // cause a huge amount of PHI nodes. If we want to remove this limitation,
+ // we have to address https://github.com/llvm/llvm-project/issues/78578.
+ if (TailBB.pred_size() > TailDupPredSize &&
+ TailBB.succ_size() > TailDupSuccSize) {
+ // If TailBB or any of its successors contains a phi, we may have to add a
+ // large number of additional phis with additional incoming values.
+ if (NumPhis != 0 || any_of(TailBB.successors(), [](MachineBasicBlock *MBB) {
+ return any_of(*MBB, [](MachineInstr &MI) { return MI.isPHI(); });
+ }))
+ return false;
}
// Check if any of the successors of TailBB has a PHI node in which the
diff --git a/llvm/test/CodeGen/X86/tail-dup-pred-succ-size.mir b/llvm/test/CodeGen/X86/tail-dup-pred-succ-size.mir
index 1d17672e2c6bd0..2f1ff76fda76c9 100644
--- a/llvm/test/CodeGen/X86/tail-dup-pred-succ-size.mir
+++ b/llvm/test/CodeGen/X86/tail-dup-pred-succ-size.mir
@@ -538,43 +538,47 @@ body: |
; LIMIT-NEXT: JMP64m $noreg, 8, [[SUBREG_TO_REG]], %jump-table.0, $noreg
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.2:
- ; LIMIT-NEXT: successors: %bb.7(0x80000000)
+ ; LIMIT-NEXT: successors: %bb.9(0x20000000), %bb.10(0x20000000), %bb.11(0x20000000), %bb.12(0x20000000)
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: [[MOV32rm:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: JMP_1 %bb.7
+ ; LIMIT-NEXT: [[SHR32ri1:%[0-9]+]]:gr32 = SHR32ri [[COPY]], 2, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[AND32ri1:%[0-9]+]]:gr32 = AND32ri [[SHR32ri1]], 7, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SUBREG_TO_REG1:%[0-9]+]]:gr64_nosp = SUBREG_TO_REG 0, [[AND32ri1]], %subreg.sub_32bit
+ ; LIMIT-NEXT: JMP64m $noreg, 8, [[SUBREG_TO_REG1]], %jump-table.1, $noreg
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.3:
- ; LIMIT-NEXT: successors: %bb.7(0x80000000)
+ ; LIMIT-NEXT: successors: %bb.9(0x20000000), %bb.10(0x20000000), %bb.11(0x20000000), %bb.12(0x20000000)
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: [[MOV32rm1:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: [[SHR32ri1:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm1]], 1, implicit-def dead $eflags
- ; LIMIT-NEXT: JMP_1 %bb.7
+ ; LIMIT-NEXT: [[SHR32ri2:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm1]], 1, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SHR32ri3:%[0-9]+]]:gr32 = SHR32ri [[COPY]], 2, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[AND32ri2:%[0-9]+]]:gr32 = AND32ri [[SHR32ri3]], 7, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SUBREG_TO_REG2:%[0-9]+]]:gr64_nosp = SUBREG_TO_REG 0, [[AND32ri2]], %subreg.sub_32bit
+ ; LIMIT-NEXT: JMP64m $noreg, 8, [[SUBREG_TO_REG2]], %jump-table.1, $noreg
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.4:
- ; LIMIT-NEXT: successors: %bb.7(0x80000000)
+ ; LIMIT-NEXT: successors: %bb.9(0x20000000), %bb.10(0x20000000), %bb.11(0x20000000), %bb.12(0x20000000)
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: [[MOV32rm2:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: [[SHR32ri2:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm2]], 2, implicit-def dead $eflags
- ; LIMIT-NEXT: JMP_1 %bb.7
+ ; LIMIT-NEXT: [[SHR32ri4:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm2]], 2, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SHR32ri5:%[0-9]+]]:gr32 = SHR32ri [[COPY]], 2, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[AND32ri3:%[0-9]+]]:gr32 = AND32ri [[SHR32ri5]], 7, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SUBREG_TO_REG3:%[0-9]+]]:gr64_nosp = SUBREG_TO_REG 0, [[AND32ri3]], %subreg.sub_32bit
+ ; LIMIT-NEXT: JMP64m $noreg, 8, [[SUBREG_TO_REG3]], %jump-table.1, $noreg
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.5:
- ; LIMIT-NEXT: successors: %bb.7(0x80000000)
+ ; LIMIT-NEXT: successors: %bb.9(0x20000000), %bb.10(0x20000000), %bb.11(0x20000000), %bb.12(0x20000000)
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: [[MOV32rm3:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: [[SHR32ri3:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm3]], 3, implicit-def dead $eflags
- ; LIMIT-NEXT: JMP_1 %bb.7
+ ; LIMIT-NEXT: [[SHR32ri6:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm3]], 3, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SHR32ri7:%[0-9]+]]:gr32 = SHR32ri [[COPY]], 2, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[AND32ri4:%[0-9]+]]:gr32 = AND32ri [[SHR32ri7]], 7, implicit-def dead $eflags
+ ; LIMIT-NEXT: [[SUBREG_TO_REG4:%[0-9]+]]:gr64_nosp = SUBREG_TO_REG 0, [[AND32ri4]], %subreg.sub_32bit
+ ; LIMIT-NEXT: JMP64m $noreg, 8, [[SUBREG_TO_REG4]], %jump-table.1, $noreg
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.6:
; LIMIT-NEXT: successors:
; LIMIT-NEXT: {{ $}}
- ; LIMIT-NEXT: bb.7:
- ; LIMIT-NEXT: successors: %bb.9(0x20000000), %bb.10(0x20000000), %bb.11(0x20000000), %bb.12(0x20000000)
- ; LIMIT-NEXT: {{ $}}
- ; LIMIT-NEXT: [[SHR32ri4:%[0-9]+]]:gr32 = SHR32ri [[COPY]], 2, implicit-def dead $eflags
- ; LIMIT-NEXT: [[AND32ri1:%[0-9]+]]:gr32 = AND32ri [[SHR32ri4]], 7, implicit-def dead $eflags
- ; LIMIT-NEXT: [[SUBREG_TO_REG1:%[0-9]+]]:gr64_nosp = SUBREG_TO_REG 0, killed [[AND32ri1]], %subreg.sub_32bit
- ; LIMIT-NEXT: JMP64m $noreg, 8, [[SUBREG_TO_REG1]], %jump-table.1, $noreg
- ; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.9:
; LIMIT-NEXT: [[MOV32rm4:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[MOV32rm4]] :: (store (s32))
@@ -583,23 +587,23 @@ body: |
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.10:
; LIMIT-NEXT: [[MOV32rm5:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: [[SHR32ri5:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm5]], 1, implicit-def dead $eflags
- ; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[SHR32ri5]] :: (store (s32))
- ; LIMIT-NEXT: $eax = COPY [[SHR32ri5]]
+ ; LIMIT-NEXT: [[SHR32ri8:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm5]], 1, implicit-def dead $eflags
+ ; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[SHR32ri8]] :: (store (s32))
+ ; LIMIT-NEXT: $eax = COPY [[SHR32ri8]]
; LIMIT-NEXT: RET 0, $eax
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.11:
; LIMIT-NEXT: [[MOV32rm6:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: [[SHR32ri6:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm6]], 2, implicit-def dead $eflags
- ; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[SHR32ri6]] :: (store (s32))
- ; LIMIT-NEXT: $eax = COPY [[SHR32ri6]]
+ ; LIMIT-NEXT: [[SHR32ri9:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm6]], 2, implicit-def dead $eflags
+ ; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[SHR32ri9]] :: (store (s32))
+ ; LIMIT-NEXT: $eax = COPY [[SHR32ri9]]
; LIMIT-NEXT: RET 0, $eax
; LIMIT-NEXT: {{ $}}
; LIMIT-NEXT: bb.12:
; LIMIT-NEXT: [[MOV32rm7:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 1, $noreg, 0, $noreg
- ; LIMIT-NEXT: [[SHR32ri7:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm7]], 6, implicit-def dead $eflags
- ; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[SHR32ri7]] :: (store (s32))
- ; LIMIT-NEXT: $eax = COPY [[SHR32ri7]]
+ ; LIMIT-NEXT: [[SHR32ri10:%[0-9]+]]:gr32 = SHR32ri [[MOV32rm7]], 6, implicit-def dead $eflags
+ ; LIMIT-NEXT: MOV32mr [[COPY1]], 1, $noreg, 0, $noreg, [[SHR32ri10]] :: (store (s32))
+ ; LIMIT-NEXT: $eax = COPY [[SHR32ri10]]
; LIMIT-NEXT: RET 0, $eax
;
; NOLIMIT-LABEL: name: foo_no_phis
What's the relationship between this and #114990?
Are these inputs computed gotos or jump tables? If they are computed gotos, I think we can land #114990. In fact, I don't see any fundamental difference between computed gotos and jump tables: both can include or exclude many PHIs. To me, the only difference is that a jump table has one extra jump at the source level compared to a computed goto. So I believe that when users use computed gotos, they expect longer compile times and larger code size.
Ah, I missed the other PR, thanks! This particular case is a computed GOTO, but there are other computed GOTOs where completely removing the cutoff increases the number of instructions by 10% without any gain. In general, I think we should try to avoid cut-offs when there are cases we can handle with reasonable compile times. I am not sure completely ignoring the cutoff for computed GOTOs is the best way forward, as I don't see a fundamental difference between cases coming from jump tables and those from computed GOTOs. Ideally we would allow the cases we can reasonably handle (e.g. because there are no phis to add) and/or address the extra complexity of adding the additional edges to the CFG. For the test case from #106846 (http://www.jikos.cz/~mikulas/testcases/clang/computed-goto.c), tail duplication is applied with the current patch as well, but I think only at the inner level. I don't have a suitable X86 machine to test whether it restores the original performance.
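For context, the computed-goto case discussed here has the shape of an interpreter dispatch loop. A hypothetical sketch (names and opcodes invented, using the GNU labels-as-values extension supported by GCC and Clang): the shared dispatch block has one predecessor per handler and one successor per opcode label, i.e. many preds and many succs but no PHIs to multiply, and tail duplication clones the indirect jump into each handler so the CPU can predict each handler's successor separately.

```cpp
// Hypothetical interpreter core using the GNU &&label / goto* extension.
// The "dispatch" block has many predecessors (every handler jumps back)
// and many successors (every opcode label); tail duplication copies the
// indirect jump into each handler, removing the shared dispatch block.
int run(const unsigned char *code) {
  static void *labels[] = {&&op_add, &&op_sub, &&op_halt};
  int acc = 0;
dispatch:
  goto *labels[*code++]; // jump to the handler for the next opcode
op_add:
  acc += 1;
  goto dispatch;
op_sub:
  acc -= 1;
  goto dispatch;
op_halt:
  return acc;
}
```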
I think this is related to the specifics of CPU branch prediction, which is why we sometimes cannot observe performance improvements.
For me, we can view this transformation as restoring the CFG of the code as the user wrote it. That is the only difference I can see between jump tables (switch) and computed GOTOs. What I'm saying is: when users use computed GOTOs, they accept longer compile times and larger code size.
For me, it makes sense to apply this to jump tables only under some limits.
IMO, the current patch addresses some specific scenarios. In fact, I believe most of the compile time is spent on the PHI nodes added after duplication. Since there is no fundamental difference between jump tables and computed GOTOs, I find it odd that this patch still prevents duplication for some computed GOTOs.
IIUC, #114990 in its current version only allows duplication for computed GOTOs? With this patch, there are no extra restrictions on the constructs we can duplicate; we just skip/relax the aggressive cut-offs when it won't hurt compile time (much). It would be great if we could get this resolved one way or another for the Clang 20 release, as it currently causes a 2-3% performance loss for Python workloads on ARM64 :) I rebased and tested #114990, and unfortunately it doesn't yield the same perf gain (only in the noise, < 0.5%).
LGTM. I believe these two PRs address different issues. I can verify that the PR doesn't resolve #106846 due to phis.
Thanks for double-checking, that's my understanding as well!