-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt -indvars with -verify-scev-strict causes Trip Count Changed! error #128
Comments
…812.1 (llvm#128) [release/11.x] Update dependencies from dotnet/arcade
Doesn't reproduce anymore, so I'll assume it is fixed. |
Here's an updated repro for the same kind of failure: target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7"
define void @f(i32 %arg) {
bb:
%i = and i32 %arg, 31
%i1 = icmp ugt i32 %i, 0
br i1 %i1, label %bb2, label %bb8
bb2:
%i3 = phi i32 [ %i6, %bb2 ], [ 0, %bb ]
%i4 = shl i32 %i3, 1
%i5 = call i32 @g(i32 %i4)
%i6 = add i32 %i3, 1
%i7 = icmp eq i32 %i6, %i
br i1 %i7, label %bb8, label %bb2
bb8:
ret void
}
declare i32 @g(i32)
|
@max-quazan FYI |
Do we know the patch when it started? |
I don't know, sorry, but I think it was probably a long time ago. |
FWIW, a |
That's a strange use of "verify". Should it be an Optimization Remark instead? |
verify-scev-strict can fail either because of a bug, or because SCEV cannot determine equivalence of expressions. This is why by default it is not enabled, and instead only the cases that are "definitely wrong" are detected. It's intended as a verification option, but it has too many false positives to be practical for most purposes. As to why this particular issue occurs, this is the usual problem with add -> sub canonicalization. The This looks like a pretty basic case though, so maybe it would be worthwhile to special case it. |
I tried to add support for this like this: diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 424ff1c26c68..07307980e287 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3536,6 +3536,23 @@ const SCEV *ScalarEvolution::getUDivExpr(const SCEV *LHS,
if (Operands.size() == A->getNumOperands())
return getAddExpr(Operands);
}
+
+ // (Op+NegC)/RHSC == (Op-C)/RHSC == (Op/RHSC) - (C/RHSC) if
+ // the subtract is nuw and the divisions are exact. This is handled
+ // separately because SCEV can't represent sub nuw.
+ const auto *NegC = dyn_cast<SCEVConstant>(A->getOperand(0));
+ if (A->getNumOperands() == 2 && NegC && NegC->getAPInt().isNegative()) {
+ const SCEV *Op = A->getOperand(1);
+ APInt C = -NegC->getAPInt();
+ APInt NewC = C.udiv(RHSC->getAPInt());
+ if (C == NewC * RHSC->getAPInt() &&
+ getUnsignedRange(Op).unsignedSubMayOverflow(C) ==
+ ConstantRange::OverflowResult::NeverOverflows) {
+ const SCEV *NewOp = getUDivExpr(Op, RHS);
+ if (!isa<SCEVUDivExpr>(NewOp) && getMulExpr(NewOp, RHS) != Op)
+ return getMinusSCEV(NewOp, getConstant(NewC));
+ }
+ }
}
// Fold if both operands are constant. But this doesn't work because we don't know that |
Looking at this SCEV:
why didn't we combine |
Summary: feature coverage is a useful signal that is available during the merge process, but was not printed previously. Output example: ``` $ ./fuzzer -use_value_profile=1 -merge=1 new_corpus/ seed_corpus/ INFO: Seed: 1676551929 INFO: Loaded 1 modules (2380 inline 8-bit counters): 2380 [0x90d180, 0x90dacc), INFO: Loaded 1 PC tables (2380 PCs): 2380 [0x684018,0x68d4d8), MERGE-OUTER: 180 files, 78 in the initial corpus MERGE-OUTER: attempt 1 INFO: Seed: 1676574577 INFO: Loaded 1 modules (2380 inline 8-bit counters): 2380 [0x90d180, 0x90dacc), INFO: Loaded 1 PC tables (2380 PCs): 2380 [0x684018,0x68d4d8), INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes MERGE-INNER: using the control file '/tmp/libFuzzerTemp.111754.txt' MERGE-INNER: 180 total files; 0 processed earlier; will process 180 files now llvm#1 pulse cov: 134 ft: 330 exec/s: 0 rss: 37Mb llvm#2 pulse cov: 142 ft: 462 exec/s: 0 rss: 38Mb llvm#4 pulse cov: 152 ft: 651 exec/s: 0 rss: 38Mb llvm#8 pulse cov: 152 ft: 943 exec/s: 0 rss: 38Mb llvm#16 pulse cov: 520 ft: 2783 exec/s: 0 rss: 39Mb llvm#32 pulse cov: 552 ft: 3280 exec/s: 0 rss: 41Mb llvm#64 pulse cov: 576 ft: 3641 exec/s: 0 rss: 50Mb llvm#78 LOADED cov: 602 ft: 3936 exec/s: 0 rss: 88Mb llvm#128 pulse cov: 611 ft: 3996 exec/s: 0 rss: 93Mb llvm#180 DONE cov: 611 ft: 4016 exec/s: 0 rss: 155Mb MERGE-OUTER: succesfull in 1 attempt(s) MERGE-OUTER: the control file has 39741 bytes MERGE-OUTER: consumed 0Mb (37Mb rss) to parse the control file MERGE-OUTER: 9 new files with 80 new features added; 9 new coverage edges ``` Reviewers: hctim, morehouse Reviewed By: morehouse Subscribers: delcypher, #sanitizers, llvm-commits, kcc Tags: #llvm, #sanitizers Differential Revision: https://reviews.llvm.org/D66030 llvm-svn: 368617
…2023.08.02 Promote develop branch (as of a118509) to mainline
…sions (llvm#128) Extends the fix-up logic for `trip_count` calculation in `target` regions. Previously, if an array element was used to compute any of the loop bounds, the trip-count calculation ops would extract arra elements from the mapped declaration of the array inside the target region. This commit hanles that situation.
This patch does 3 things: 1. Add support for optimizing the address mode of HVX load/store instructions 2. Reduce the value of Add instruction immediates by replacing with the difference from other Addi instructions that share common base: For Example, If we have the below sequence of instructions: r1 = add(r2,llvm#1024) ... r3 = add(r2,llvm#1152) ... r4 = add(r2,llvm#1280) Where the register r2 has the same reaching definition, They get modified to the below sequence: r1 = add(r2,llvm#1024) ... r3 = add(r1,llvm#128) ... r4 = add(r1,llvm#256) 3. Fixes a bug pass where the addi instructions were modified based on a predicated register definition, leading to incorrect output. Eg: INST-1: if (p0) r2 = add(r13,llvm#128) INST-2: r1 = add(r2,llvm#1024) INST-3: r3 = add(r2,llvm#1152) INST-4: r5 = add(r2,llvm#1280) In the above case, since r2's definition is predicated, we do not want to modify the uses of r2 in INST-3/INST-4 with add(r1,llvm#128/256) 4.Fixes a corner case It looks like we never check whether the offset register is actually live (not clobbered) at optimization site. Add the check whether it is live at MBB entrance. The rest should have already been verified. 5. Fixes a bad codegen For whatever reason we do transformation without checking if the value in register actually reaches the user. This is second identical fix for this pass. Co-authored-by: Anirudh Sundar <quic_sanirudh@quicinc.com> Co-authored-by: Sergei Larin <slarin@quicinc.com>
This patch does 3 things: 1. Add support for optimizing the address mode of HVX load/store instructions 2. Reduce the value of Add instruction immediates by replacing with the difference from other Addi instructions that share common base: For Example, If we have the below sequence of instructions: r1 = add(r2,# 1024) ... r3 = add(r2,# 1152) ... r4 = add(r2,# 1280) Where the register r2 has the same reaching definition, They get modified to the below sequence: r1 = add(r2,# 1024) ... r3 = add(r1,# 128) ... r4 = add(r1,# 256) 3. Fixes a bug pass where the addi instructions were modified based on a predicated register definition, leading to incorrect output. Eg: INST-1: if (p0) r2 = add(r13,# 128) INST-2: r1 = add(r2,# 1024) INST-3: r3 = add(r2,# 1152) INST-4: r5 = add(r2,# 1280) In the above case, since r2's definition is predicated, we do not want to modify the uses of r2 in INST-3/INST-4 with add(r1,#128/256) 4.Fixes a corner case It looks like we never check whether the offset register is actually live (not clobbered) at optimization site. Add the check whether it is live at MBB entrance. The rest should have already been verified. 5. Fixes a bad codegen For whatever reason we do transformation without checking if the value in register actually reaches the user. This is second identical fix for this pass. Co-authored-by: Anirudh Sundar <quic_sanirudh@quicinc.com> Co-authored-by: Sergei Larin <slarin@quicinc.com>
With the attached r.ll (in r.zip) and a debug build of opt, I get:
The text was updated successfully, but these errors were encountered: