Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RISCV] Fix vmerge.vvm/vmv.v.v getting folded into ops with mismatching EEW #101152

Merged
merged 4 commits into from
Jul 30, 2024

Conversation

lukel97
Copy link
Contributor

@lukel97 lukel97 commented Jul 30, 2024

As noted in https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we currently fold in vmerge.vvms and vmv.v.vs into their ops even if the EEW is different which leads to an incorrect transform.

This checks the op's EEW via its simple value type for now since there doesn't seem to be any existing information about the EEW size of instructions. We'll probably need to encode this at some point if we want to be able to access it at the MachineInstr level in #100367

lukel97 added 3 commits July 30, 2024 17:48
…ng EEW

As noted in https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we currently fold in vmerge.vvms and vmv.v.vs into their ops even if the EEW is different. This is incorrect if we end up changing the mask or AVL of the op.

This gets the op's EEW via its simple value type for now since there doesn't seem to be any existing information about the EEW size of instructions. We'll probably need to encode this at some point if we want to be able to access it at the MachineInstr level in llvm#100367
@lukel97 lukel97 force-pushed the performCombineVMergeAndVOps-check-SEW branch from c48a5c8 to 0190737 Compare July 30, 2024 09:49
// the VL or mask.
if (Log2_64(True.getScalarValueSizeInBits()) !=
N->getConstantOperandVal(
RISCVII::getSEWOpNum(TII->get(N->getMachineOpcode())) - 1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The opcode is known? So I think we don't need to get SEW operand index from TSFlag, we can hardcode it here.

if (mayRaiseFPException(True.getNode()) &&
!True->getFlags().hasNoFPExcept())
return false;

// If the EEW of True is different from vmerge's SEW, then we cannot change
// the VL or mask.
if (Log2_64(True.getScalarValueSizeInBits()) !=
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe don't use log here, we do a shift left to log2sew instead. Shift is cheaper than log I think.

@@ -3855,11 +3855,19 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) {
// If we end up changing the VL or mask of True, then we need to make sure it
// doesn't raise any observable fp exceptions, since changing the active
// elements will affect how fflags is set.
if (TrueVL != VL || !IsMasked)
if (TrueVL != VL || !IsMasked) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is specific to the change case. Even if we have exactly equal VL values, those values describe a different number of bits. So folding away the vmerge.vv and vmv.v.v is still illegal.

I think you can also use a much easier check here - the VT of the TrueOp should equal the VT of the vmerge or vmv.v.v. (Really the respective operand, but we don't have widening or narrowing versions of either so that's equivalent.)

@llvmbot
Copy link
Member

llvmbot commented Jul 30, 2024

@llvm/pr-subscribers-backend-risc-v

Author: Luke Lau (lukel97)

Changes

As noted in https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we currently fold in vmerge.vvms and vmv.v.vs into their ops even if the EEW is different. This is incorrect if we end up changing the mask or AVL of the op.

This gets the op's EEW via its simple value type for now since there doesn't seem to be any existing information about the EEW size of instructions. We'll probably need to encode this at some point if we want to be able to access it at the MachineInstr level in #100367


Full diff: https://github.com/llvm/llvm-project/pull/101152.diff

3 Files Affected:

  • (modified) llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp (+4)
  • (modified) llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll (+21)
  • (modified) llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll (+14)
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
index 4418905ce21ed..4de38db6e1fe9 100644
--- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
@@ -3733,6 +3733,10 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) {
   assert(!Mask || cast<RegisterSDNode>(Mask)->getReg() == RISCV::V0);
   assert(!Glue || Glue.getValueType() == MVT::Glue);
 
+  // If the EEW of True is different from vmerge's SEW, then we can't fold.
+  if (True.getSimpleValueType() != N->getSimpleValueType(0))
+    return false;
+
   // We require that either passthru and false are the same, or that passthru
   // is undefined.
   if (Passthru != False && !isImplicitDef(Passthru))
diff --git a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
index a08bcae074b9b..259515f160048 100644
--- a/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/rvv-peephole-vmerge-vops.ll
@@ -1196,3 +1196,24 @@ define <vscale x 2 x i32> @true_mask_vmerge_implicit_passthru(<vscale x 2 x i32>
   )
   ret <vscale x 2 x i32> %b
 }
+
+
+define <vscale x 2 x i32> @unfoldable_mismatched_sew(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, <vscale x 2 x i1> %mask, i64 %avl) {
+; CHECK-LABEL: unfoldable_mismatched_sew:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v9, v9, v10
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, i64 %avl)
+  %a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
+  %b = call <vscale x 2 x i32> @llvm.riscv.vmerge.nxv2i32.nxv2i32(
+    <vscale x 2 x i32> %passthru,
+    <vscale x 2 x i32> %passthru,
+    <vscale x 2 x i32> %a.bitcast,
+    <vscale x 2 x i1> splat (i1 true),
+    i64 %avl
+  )
+  ret <vscale x 2 x i32> %b
+}
diff --git a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
index 8a589a31a1535..3952e48c5c28f 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vmv.v.v-peephole.ll
@@ -180,3 +180,17 @@ define <vscale x 2 x i32> @unfoldable_vredsum(<vscale x 2 x i32> %passthru, <vsc
   %b = call <vscale x 2 x i32> @llvm.riscv.vmv.v.v.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a, iXLen 1)
   ret <vscale x 2 x i32> %b
 }
+
+define <vscale x 2 x i32> @unfoldable_mismatched_sew(<vscale x 2 x i32> %passthru, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, iXLen %avl) {
+; CHECK-LABEL: unfoldable_mismatched_sew:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    vsetvli zero, a0, e64, m1, ta, ma
+; CHECK-NEXT:    vadd.vv v9, v9, v10
+; CHECK-NEXT:    vsetvli zero, a0, e32, m1, tu, ma
+; CHECK-NEXT:    vmv.v.v v8, v9
+; CHECK-NEXT:    ret
+  %a = call <vscale x 1 x i64> @llvm.riscv.vadd.nxv1i64.nxv1i64(<vscale x 1 x i64> poison, <vscale x 1 x i64> %x, <vscale x 1 x i64> %y, iXLen %avl)
+  %a.bitcast = bitcast <vscale x 1 x i64> %a to <vscale x 2 x i32>
+  %b = call <vscale x 2 x i32> @llvm.riscv.vmv.v.v.nxv2i32(<vscale x 2 x i32> %passthru, <vscale x 2 x i32> %a.bitcast, iXLen %avl)
+  ret <vscale x 2 x i32> %b
+}

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lukel97 lukel97 merged commit d01c051 into llvm:main Jul 30, 2024
6 of 7 checks passed
@lukel97
Copy link
Contributor Author

lukel97 commented Jul 30, 2024

/cherry-pick d01c051

@llvmbot
Copy link
Member

llvmbot commented Jul 30, 2024

/cherry-pick d01c051

Error: Command failed due to missing milestone.

@lukel97 lukel97 added this to the LLVM 19.X Release milestone Jul 30, 2024
@lukel97
Copy link
Contributor Author

lukel97 commented Jul 30, 2024

/cherry-pick d01c051

@llvmbot
Copy link
Member

llvmbot commented Jul 30, 2024

Failed to cherry-pick: d01c051

https://github.com/llvm/llvm-project/actions/runs/10166340246

Please manually backport the fix and push it to your github fork. Once this is done, please create a pull request

lukel97 added a commit to lukel97/llvm-project that referenced this pull request Jul 31, 2024
Since we don't have access to the MVTs, we can also do the same thing by checking the VLMAXs are the same (i.e. the sew/lmul ratio)
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Aug 1, 2024
…ng EEW (llvm#101152)

As noted in
https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we
currently fold in vmerge.vvms and vmv.v.vs into their ops even if the
EEW is different which leads to an incorrect transform.

This checks the op's EEW via its simple value type for now since there
doesn't seem to be any existing information about the EEW size of
instructions. We'll probably need to encode this at some point if we
want to be able to access it at the MachineInstr level in llvm#100367
tru pushed a commit to lukel97/llvm-project that referenced this pull request Aug 2, 2024
…ng EEW (llvm#101152)

As noted in
https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we
currently fold in vmerge.vvms and vmv.v.vs into their ops even if the
EEW is different which leads to an incorrect transform.

This checks the op's EEW via its simple value type for now since there
doesn't seem to be any existing information about the EEW size of
instructions. We'll probably need to encode this at some point if we
want to be able to access it at the MachineInstr level in llvm#100367
tru pushed a commit to lukel97/llvm-project that referenced this pull request Aug 2, 2024
…ng EEW (llvm#101152)

As noted in
https://github.com/llvm/llvm-project/pull/100367/files#r1695448771, we
currently fold in vmerge.vvms and vmv.v.vs into their ops even if the
EEW is different which leads to an incorrect transform.

This checks the op's EEW via its simple value type for now since there
doesn't seem to be any existing information about the EEW size of
instructions. We'll probably need to encode this at some point if we
want to be able to access it at the MachineInstr level in llvm#100367
lukel97 added a commit to lukel97/llvm-project that referenced this pull request Aug 5, 2024
Since we don't have access to the MVTs, we can also do the same thing by checking the VLMAXs are the same (i.e. the sew/lmul ratio)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

4 participants