
[AArch64] Guard against non-vector abd long nodes. #102026

Merged (1 commit, Aug 6, 2024)

Conversation

davemgreen (Collaborator)

This fixes a problem when abd nodes are generated more readily (#92576). The folding of abd nodes into abdl needs to check that the inputs are the correct form of vector. The added test requires vector legalization to occur in order to hit the combine at the wrong time.

@llvmbot (Collaborator) commented Aug 5, 2024

@llvm/pr-subscribers-backend-aarch64

Author: David Green (davemgreen)

Changes



Full diff: https://github.com/llvm/llvm-project/pull/102026.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1)
  • (modified) llvm/test/CodeGen/AArch64/abds.ll (+22)
  • (modified) llvm/test/CodeGen/AArch64/abdu.ll (+26)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 7704321a0fc3a..f0c3afc4f9b5d 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -21769,6 +21769,7 @@ static SDValue performExtendCombine(SDNode *N,
   // helps the backend to decide that an sabdl2 would be useful, saving a real
   // extract_high operation.
   if (!DCI.isBeforeLegalizeOps() && N->getOpcode() == ISD::ZERO_EXTEND &&
+      N->getOperand(0).getValueType().is64BitVector() &&
       (N->getOperand(0).getOpcode() == ISD::ABDU ||
        N->getOperand(0).getOpcode() == ISD::ABDS)) {
     SDNode *ABDNode = N->getOperand(0).getNode();
diff --git a/llvm/test/CodeGen/AArch64/abds.ll b/llvm/test/CodeGen/AArch64/abds.ll
index d4ad33f963ba7..215907c66a6e8 100644
--- a/llvm/test/CodeGen/AArch64/abds.ll
+++ b/llvm/test/CodeGen/AArch64/abds.ll
@@ -571,6 +571,28 @@ define i32 @abd_sub_i32(i32 %a, i32 %b) nounwind {
   ret i32 %abs
 }
 
+define i64 @vector_legalized(i16 %a, i16 %b) {
+; CHECK-LABEL: vector_legalized:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    movi v0.2d, #0000000000000000
+; CHECK-NEXT:    sxth w8, w0
+; CHECK-NEXT:    sub w8, w8, w1, sxth
+; CHECK-NEXT:    addp d0, v0.2d
+; CHECK-NEXT:    cmp w8, #0
+; CHECK-NEXT:    cneg w8, w8, mi
+; CHECK-NEXT:    fmov x9, d0
+; CHECK-NEXT:    add x0, x9, x8
+; CHECK-NEXT:    ret
+  %ea = sext i16 %a to i32
+  %eb = sext i16 %b to i32
+  %s = sub i32 %ea, %eb
+  %ab = call i32 @llvm.abs.i32(i32 %s, i1 false)
+  %e = zext i32 %ab to i64
+  %red = call i64 @llvm.vector.reduce.add.v32i64(<32 x i64> zeroinitializer)
+  %z = add i64 %red, %e
+  ret i64 %z
+}
+
 
 declare i8 @llvm.abs.i8(i8, i1)
 declare i16 @llvm.abs.i16(i16, i1)
diff --git a/llvm/test/CodeGen/AArch64/abdu.ll b/llvm/test/CodeGen/AArch64/abdu.ll
index 983db629e4498..f70f095d7daba 100644
--- a/llvm/test/CodeGen/AArch64/abdu.ll
+++ b/llvm/test/CodeGen/AArch64/abdu.ll
@@ -409,6 +409,32 @@ define i128 @abd_cmp_i128(i128 %a, i128 %b) nounwind {
   ret i128 %sel
 }
 
+;
+; negative tests
+;
+
+define i64 @vector_legalized(i16 %a, i16 %b) {
+; CHECK-LABEL: vector_legalized:
+; CHECK:       // %bb.0:
+; CHECK-NEXT:    movi v0.2d, #0000000000000000
+; CHECK-NEXT:    and w8, w0, #0xffff
+; CHECK-NEXT:    sub w8, w8, w1, uxth
+; CHECK-NEXT:    cmp w8, #0
+; CHECK-NEXT:    addp d0, v0.2d
+; CHECK-NEXT:    cneg w8, w8, mi
+; CHECK-NEXT:    fmov x9, d0
+; CHECK-NEXT:    add x0, x9, x8
+; CHECK-NEXT:    ret
+  %ea = zext i16 %a to i32
+  %eb = zext i16 %b to i32
+  %s = sub i32 %ea, %eb
+  %ab = call i32 @llvm.abs.i32(i32 %s, i1 false)
+  %e = zext i32 %ab to i64
+  %red = call i64 @llvm.vector.reduce.add.v32i64(<32 x i64> zeroinitializer)
+  %z = add i64 %red, %e
+  ret i64 %z
+}
+
 declare i8 @llvm.abs.i8(i8, i1)
 declare i16 @llvm.abs.i16(i16, i1)
 declare i32 @llvm.abs.i32(i32, i1)

@RKSimon (Collaborator) left a comment:

LGTM - cheers

@davemgreen davemgreen merged commit 6e4c580 into llvm:main Aug 6, 2024
9 checks passed
@davemgreen davemgreen deleted the gh-a64-absscalar branch August 6, 2024 06:57
banach-space pushed a commit to banach-space/llvm-project that referenced this pull request Aug 7, 2024
kstoimenov pushed a commit to kstoimenov/llvm-project that referenced this pull request Aug 15, 2024