[RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW #101152

lukel97 · 2024-07-30T08:46:44Z

As noted in https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we currently fold vmerge.vvms and vmv.v.vs into their ops even if the EEW is different, which leads to an incorrect transform.

This checks the op's EEW via its simple value type for now, since there doesn't seem to be any existing information about the EEW size of instructions. We'll probably need to encode this at some point if we want to be able to access it at the MachineInstr level in #100367.

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

wangpc-pp · 2024-07-30T12:34:40Z

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

+    // the VL or mask.
+    if (Log2_64(True.getScalarValueSizeInBits()) !=
+        N->getConstantOperandVal(
+            RISCVII::getSEWOpNum(TII->get(N->getMachineOpcode())) - 1))


Isn't the opcode known here? If so, I don't think we need to get the SEW operand index from TSFlags; we can hardcode it.

wangpc-pp · 2024-07-30T12:36:19Z

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

    if (mayRaiseFPException(True.getNode()) &&
        !True->getFlags().hasNoFPExcept())
      return false;

+    // If the EEW of True is different from vmerge's SEW, then we cannot change
+    // the VL or mask.
+    if (Log2_64(True.getScalarValueSizeInBits()) !=


Maybe don't use log here; we could shift 1 left by log2sew and compare against that instead. A shift is cheaper than a log, I think.
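
A minimal sketch of the two forms of the comparison, for illustration only. `log2Floor`, `sewMatchesViaLog` and `sewMatchesViaShift` are hypothetical helpers rather than LLVM APIs, and `Log2SEW` stands in for the value read from the SEW operand. The two forms agree whenever the element size is a power of two; the shift form also rejects a non-power-of-two size instead of rounding it down.

```cpp
#include <cstdint>

// Hypothetical stand-in for a floor(log2) helper; X must be non-zero.
constexpr unsigned log2Floor(std::uint64_t X) {
  unsigned Result = 0;
  while (X >>= 1)
    ++Result;
  return Result;
}

// Form used in the snippet under review: take the log of the element size.
constexpr bool sewMatchesViaLog(std::uint64_t EltBits, unsigned Log2SEW) {
  return log2Floor(EltBits) == Log2SEW;
}

// Suggested form: shift 1 left by log2(SEW) and compare the sizes directly.
constexpr bool sewMatchesViaShift(std::uint64_t EltBits, unsigned Log2SEW) {
  return EltBits == (std::uint64_t{1} << Log2SEW);
}

static_assert(sewMatchesViaLog(32, 5) && sewMatchesViaShift(32, 5),
              "e32 matches log2(SEW) == 5 in both forms");
static_assert(!sewMatchesViaLog(64, 5) && !sewMatchesViaShift(64, 5),
              "e64 does not match log2(SEW) == 5 in either form");
```

Whether the shift is measurably cheaper in practice is not something this sketch shows; it only demonstrates that the comparison can be rewritten without changing the result for power-of-two sizes.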

preames · 2024-07-30T15:29:30Z

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

@@ -3855,11 +3855,19 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) {
  // If we end up changing the VL or mask of True, then we need to make sure it
  // doesn't raise any observable fp exceptions, since changing the active
  // elements will affect how fflags is set.
-  if (TrueVL != VL || !IsMasked)
+  if (TrueVL != VL || !IsMasked) {


I don't think this is specific to the change case. Even if we have exactly equal VL values, those values describe a different number of bits. So folding away the vmerge.vvm and vmv.v.v is still illegal.

I think you can also use a much easier check here - the VT of the TrueOp should equal the VT of the vmerge or vmv.v.v. (Really the respective operand, but we don't have widening or narrowing versions of either so that's equivalent.)
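
To make the point about equal VL values concrete, here is a small sketch. `VecType`, `sameType` and `bitsCovered` are hypothetical names for illustration, not LLVM types; `VecType` only models the `<vscale x N x iM>` shape of a scalable vector type.

```cpp
#include <cstdint>

// Toy model of <vscale x MinElts x iEltBits>; not an LLVM type.
struct VecType {
  unsigned MinElts;
  unsigned EltBits;
};

// The "much easier check": the two types must be identical.
constexpr bool sameType(VecType A, VecType B) {
  return A.MinElts == B.MinElts && A.EltBits == B.EltBits;
}

// Number of bits that the first VL elements span at a given element width.
constexpr std::uint64_t bitsCovered(std::uint64_t VL, unsigned EltBits) {
  return VL * EltBits;
}

constexpr VecType NxV1I64{1, 64}; // <vscale x 1 x i64>
constexpr VecType NxV2I32{2, 32}; // <vscale x 2 x i32>

// Same VL, different element width: a different number of bits is covered.
static_assert(bitsCovered(2, 64) == 128, "VL=2 at e64 spans 128 bits");
static_assert(bitsCovered(2, 32) == 64, "VL=2 at e32 spans 64 bits");
// The two types have the same total size but are not equal, so no fold.
static_assert(!sameType(NxV1I64, NxV2I32), "mismatched EEW is rejected");
```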

llvmbot · 2024-07-30T15:39:26Z

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

As noted in https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we currently fold vmerge.vvms and vmv.v.vs into their ops even if the EEW is different. This is incorrect if we end up changing the mask or AVL of the op.

This gets the op's EEW via its simple value type for now, since there doesn't seem to be any existing information about the EEW size of instructions. We'll probably need to encode this at some point if we want to be able to access it at the MachineInstr level in #100367.

Full diff: https://github.com/llvm/llvm-project/pull/101152.diff

3 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+4)
(modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll (+21)
(modified) llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll (+14)

diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 4418905ce21ed..4de38db6e1fe9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3733,6 +3733,10 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) {
   assert(!Mask || cast<RegisterSDNode>(Mask)->getReg() == RISCV::V0);
   assert(!Glue || Glue.getValueType() == MVT::Glue);
 
+  // If the EEW of True is different from vmerge's SEW, then we can't fold.
+  if (True.getSimpleValueType() != N->getSimpleValueType(0))
+    return false;
+
   // We require that either passthru and false are the same, or that passthru
   // is undefined.
   if (Passthru != False && !isImplicitDef(Passthru))
diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
index a08bcae074b9b..259515f160048 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
@@ -1196,3 +1196,24 @@ define <vscale x 2 x i32> @true_mask_vmerge_implicit_passthru(<vscale x 2 x i32>
   )
   ret <vscale x 2 x i32> %b
 }
+
+
+define <vscale x 2 x i32> @unfoldable_mismatched_sew(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, <vscale x 2 x i1> %mask, i64 %avl) {
+; CHECK-LABEL: unfoldable_mismatched_sew:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v9, v9, v10
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, i64 %avl)
+  %a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
+  %b = call <vscale x 2 x i32> @llvm.riscv.vmerge.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> %passthru,
+    <vscale x 2 x i32> %passthru,
+    <vscale x 2 x i32> %a.bitcast,
+    <vscale x 2 x i1> splat (i1 true),
+    i64 %avl
+  )
+  ret <vscale x 2 x i32> %b
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
index 8a589a31a1535..3952e48c5c28f 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
@@ -180,3 +180,17 @@ define <vscale x 2 x i32> @unfoldable_vredsum(<vscale x 2 x i32> %passthru, <vsc
   %b = call <vscale x 2 x i32> @llvm.riscv.vmv.v.v.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a, iXLen 1)
   ret <vscale x 2 x i32> %b
 }
+
+define <vscale x 2 x i32> @unfoldable_mismatched_sew(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, iXLen %avl) {
+; CHECK-LABEL: unfoldable_mismatched_sew:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v9, v9, v10
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, iXLen %avl)
+  %a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
+  %b = call <vscale x 2 x i32> @llvm.riscv.vmv.v.v.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a.bitcast, iXLen %avl)
+  ret <vscale x 2 x i32> %b
+}
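
The effect that the new `unfoldable_mismatched_sew` tests guard against can be sketched with a toy register model. Everything here is an assumption for illustration: a 16-byte register (VLEN=128 at LMUL=1), and the hypothetical helpers `splat`, `writeTU`, `unfolded` and `folded`. `Sum` stands for the result of the e64 `vadd.vv`.

```cpp
#include <cstddef>
#include <cstdint>

// Toy model of one vector register: assumes VLEN=128 and LMUL=1 (16 bytes).
constexpr std::size_t RegBytes = 16;
struct VReg {
  std::uint8_t Bytes[RegBytes];
};

// Register with every byte set to B.
constexpr VReg splat(std::uint8_t B) {
  VReg R{};
  for (std::size_t I = 0; I < RegBytes; ++I)
    R.Bytes[I] = B;
  return R;
}

// Tail-undisturbed write: the first VL elements of EltBytes bytes each are
// taken from Src, and every remaining byte keeps the value it had in Dst.
constexpr VReg writeTU(VReg Dst, const VReg &Src, std::size_t VL,
                       std::size_t EltBytes) {
  for (std::size_t I = 0; I < VL * EltBytes && I < RegBytes; ++I)
    Dst.Bytes[I] = Src.Bytes[I];
  return Dst;
}

// Unfolded sequence: vmv.v.v at e32 (tu) copies VL 32-bit elements of the
// sum into the passthru.
constexpr VReg unfolded(const VReg &Passthru, const VReg &Sum, std::size_t VL) {
  return writeTU(Passthru, Sum, VL, 4);
}

// Folded sequence (the incorrect transform): vadd.vv at e64 (tu) writes VL
// 64-bit elements straight into the passthru.
constexpr VReg folded(const VReg &Passthru, const VReg &Sum, std::size_t VL) {
  return writeTU(Passthru, Sum, VL, 8);
}

// With VL=1, bytes 4..7 keep the passthru when unfolded but are overwritten
// when folded, so the two sequences are not equivalent.
static_assert(unfolded(splat(0xAA), splat(0x55), 1).Bytes[4] == 0xAA,
              "unfolded keeps passthru past the first 32-bit element");
static_assert(folded(splat(0xAA), splat(0x55), 1).Bytes[4] == 0x55,
              "folded overwrites a full 64-bit element");
```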

preames

LGTM

lukel97 · 2024-07-30T16:29:04Z

/cherry-pick d01c051

llvmbot · 2024-07-30T16:33:52Z

/cherry-pick d01c051

Error: Command failed due to missing milestone.

lukel97 · 2024-07-30T16:40:58Z

/cherry-pick d01c051

llvmbot · 2024-07-30T16:45:45Z

Failed to cherry-pick: d01c051

https://github.com/llvm/llvm-project/actions/runs/10166340246

Please manually backport the fix and push it to your GitHub fork. Once this is done, please create a pull request.

lukel97 requested review from preames, topperc, wangpc-pp and yetingk July 30, 2024 08:46

wangpc-pp reviewed Jul 30, 2024

lukel97 added 3 commits July 30, 2024 17:48

Precommit tests

27e82f1

Use getScalarValueSizeInBits

0190737

lukel97 force-pushed the performCombineVMergeAndVOps-check-SEW branch from c48a5c8 to 0190737 Compare July 30, 2024 09:49

wangpc-pp reviewed Jul 30, 2024

preames reviewed Jul 30, 2024

Just check MVTs, bail in every case, remove redundant tests

70395f3

llvmbot added the backend:RISC-V label Jul 30, 2024

preames approved these changes Jul 30, 2024

lukel97 merged commit d01c051 into llvm:main Jul 30, 2024
6 of 7 checks passed

lukel97 added this to the LLVM 19.X Release milestone Jul 30, 2024

llvmbot added the release:cherry-pick-failed label Jul 30, 2024

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jul 31, 2024

Rebase, carry over the EEW check added in llvm#101152

a11e7ff

Since we don't have access to the MVTs, we can also do the same thing by checking the VLMAXs are the same (i.e. the sew/lmul ratio)
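
A rough sketch of what comparing VLMAXs via the SEW/LMUL ratio could look like. This is not the code from #100367; `VTypeModel`, `sameSEWLMulRatio` and `vlmax` are hypothetical names, and LMUL is modelled as a fraction so that fractional LMULs fit. It relies on VLMAX = LMUL * VLEN / SEW.

```cpp
// Toy model of a vtype setting: SEW in bits and LMUL as Num/Den
// (e.g. mf2 is 1/2, m2 is 2/1). Not an LLVM type.
struct VTypeModel {
  unsigned SEW;
  unsigned LMulNum;
  unsigned LMulDen;
};

// SEW_A / LMUL_A == SEW_B / LMUL_B, cross-multiplied to avoid fractions.
constexpr bool sameSEWLMulRatio(VTypeModel A, VTypeModel B) {
  return A.SEW * A.LMulDen * B.LMulNum == B.SEW * B.LMulDen * A.LMulNum;
}

// VLMAX = LMUL * VLEN / SEW for a given VLEN in bits.
constexpr unsigned vlmax(VTypeModel T, unsigned VLEN) {
  return VLEN * T.LMulNum / (T.SEW * T.LMulDen);
}

constexpr VTypeModel E64M1{64, 1, 1};
constexpr VTypeModel E32M1{32, 1, 1};

// The mismatched-SEW test case uses e64,m1 and e32,m1: the ratios differ,
// and so do the VLMAXs, so a check of this shape would reject the fold.
static_assert(!sameSEWLMulRatio(E64M1, E32M1), "ratios 64 and 32 differ");
static_assert(vlmax(E64M1, 128) == 2 && vlmax(E32M1, 128) == 4,
              "VLMAX differs at VLEN=128");
```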

lukel97 mentioned this pull request Aug 1, 2024

[RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW #101464

Merged

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Aug 5, 2024

Rebase, carry over the EEW check added in llvm#101152

a621896

Since we don't have access to the MVTs, we can also do the same thing by checking the VLMAXs are the same (i.e. the sew/lmul ratio)

mgabka mentioned this pull request Aug 22, 2024

Add release note about ABI mgabka/llvm-project#6

Open

thewtex mentioned this pull request Feb 10, 2025

llvmorg 19.1.5 libcxxabi pthread lib name #126605

Closed
