Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NFC] [DSE] Refactor DSE #100956

Merged
merged 11 commits into from
Aug 29, 2024
Merged

[NFC] [DSE] Refactor DSE #100956

merged 11 commits into from
Aug 29, 2024

Conversation

haopliu
Copy link
Contributor

@haopliu haopliu commented Jul 29, 2024

Refactor DSE with MemoryDefWrapper and MemoryLocationWrapper.

Normally, one MemoryDef accesses one MemoryLocation. With "initializes" attribute, one MemoryDef (like call instruction) could initialize multiple MemoryLocations.

Refactor DSE as a preparation to apply "initializes" attribute in DSE in a follow-up PR (58dd8a4).

@llvmbot
Copy link
Collaborator

llvmbot commented Jul 29, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Haopeng Liu (haopliu)

Changes

Refactor DSE with MemoryDefWrapper and MemoryLocationWrapper.

Normally, one MemoryDef accesses one MemoryLocation. With "initializes" attribute, one MemoryDef (like call instruction) could initialize multiple MemoryLocations.

Refactor DSE as a preparation to apply "initializes" attribute in DSE in a follow-up PR.


Patch is 26.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100956.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (+245-191)
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 931606c6f8fe1..e64b7da818a98 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -806,6 +806,63 @@ bool canSkipDef(MemoryDef *D, bool DefVisibleToCaller) {
   return false;
 }
 
+/// Returns true if \p I is a memory terminator instruction like
+/// llvm.lifetime.end or free.
+bool isMemTerminatorInst(Instruction *I, const TargetLibraryInfo &TLI) {
+  auto *CB = dyn_cast<CallBase>(I);
+  return CB && (CB->getIntrinsicID() == Intrinsic::lifetime_end ||
+                getFreedOperand(CB, &TLI) != nullptr);
+}
+
+struct DSEState;
+enum class ChangeStateEnum : uint8_t {
+  NoChange,
+  DeleteByMemTerm,
+  CompleteDeleteByNonMemTerm,
+  PartiallyDeleteByNonMemTerm,
+};
+
+// A memory location wrapper that represents a MemoryLocation, `MemLoc`,
+// defined by `MemDef`.
+class MemoryLocationWrapper {
+public:
+  MemoryLocationWrapper(MemoryLocation MemLoc, DSEState &State,
+                        MemoryDef *MemDef)
+      : MemLoc(MemLoc), State(State), MemDef(MemDef) {
+    assert(MemLoc.Ptr && "MemLoc should be not null");
+    UnderlyingObject = getUnderlyingObject(MemLoc.Ptr);
+    DefInst = MemDef->getMemoryInst();
+  }
+
+  // Try to eliminate dead defs killed by this MemoryLocation and return the
+  // change state.
+  ChangeStateEnum eliminateDeadDefs();
+  MemoryAccess *GetDefiningAccess() const {
+    return MemDef->getDefiningAccess();
+  }
+
+  MemoryLocation MemLoc;
+  const Value *UnderlyingObject;
+  DSEState &State;
+  MemoryDef *MemDef;
+  Instruction *DefInst;
+};
+
+// A memory def wrapper that represents a MemoryDef and the MemoryLocation(s)
+// defined by this MemoryDef.
+class MemoryDefWrapper {
+public:
+  MemoryDefWrapper(MemoryDef *MemDef, DSEState &State);
+  // Try to eliminate dead defs killed by this MemoryDef and return the
+  // change state.
+  bool eliminateDeadDefs();
+
+  MemoryDef *MemDef;
+  Instruction *DefInst;
+  DSEState &State;
+  SmallVector<MemoryLocationWrapper, 1> DefinedLocations;
+};
+
 struct DSEState {
   Function &F;
   AliasAnalysis &AA;
@@ -883,7 +940,7 @@ struct DSEState {
 
         auto *MD = dyn_cast_or_null<MemoryDef>(MA);
         if (MD && MemDefs.size() < MemorySSADefsPerBlockLimit &&
-            (getLocForWrite(&I) || isMemTerminatorInst(&I)))
+            (getLocForWrite(&I) || isMemTerminatorInst(&I, TLI)))
           MemDefs.push_back(MD);
       }
     }
@@ -1225,14 +1282,6 @@ struct DSEState {
     return std::nullopt;
   }
 
-  /// Returns true if \p I is a memory terminator instruction like
-  /// llvm.lifetime.end or free.
-  bool isMemTerminatorInst(Instruction *I) const {
-    auto *CB = dyn_cast<CallBase>(I);
-    return CB && (CB->getIntrinsicID() == Intrinsic::lifetime_end ||
-                  getFreedOperand(CB, &TLI) != nullptr);
-  }
-
   /// Returns true if \p MaybeTerm is a memory terminator for \p Loc from
   /// instruction \p AccessI.
   bool isMemTerminator(const MemoryLocation &Loc, Instruction *AccessI,
@@ -1325,17 +1374,15 @@ struct DSEState {
   // (completely) overwrite \p KillingLoc. Currently we bail out when we
   // encounter an aliasing MemoryUse (read).
   std::optional<MemoryAccess *>
-  getDomMemoryDef(MemoryDef *KillingDef, MemoryAccess *StartAccess,
-                  const MemoryLocation &KillingLoc, const Value *KillingUndObj,
+  getDomMemoryDef(MemoryLocationWrapper &KillingLoc, MemoryAccess *StartAccess,
                   unsigned &ScanLimit, unsigned &WalkerStepLimit,
-                  bool IsMemTerm, unsigned &PartialLimit) {
+                  unsigned &PartialLimit) {
     if (ScanLimit == 0 || WalkerStepLimit == 0) {
       LLVM_DEBUG(dbgs() << "\n    ...  hit scan limit\n");
       return std::nullopt;
     }
 
     MemoryAccess *Current = StartAccess;
-    Instruction *KillingI = KillingDef->getMemoryInst();
     LLVM_DEBUG(dbgs() << "  trying to get dominating access\n");
 
     // Only optimize defining access of KillingDef when directly starting at its
@@ -1344,8 +1391,8 @@ struct DSEState {
     // it should be sufficient to disable optimizations for instructions that
     // also read from memory.
     bool CanOptimize = OptimizeMemorySSA &&
-                       KillingDef->getDefiningAccess() == StartAccess &&
-                       !KillingI->mayReadFromMemory();
+                       KillingLoc.GetDefiningAccess() == StartAccess &&
+                       !KillingLoc.DefInst->mayReadFromMemory();
 
     // Find the next clobbering Mod access for DefLoc, starting at StartAccess.
     std::optional<MemoryLocation> CurrentLoc;
@@ -1361,15 +1408,15 @@ struct DSEState {
       // Reached TOP.
       if (MSSA.isLiveOnEntryDef(Current)) {
         LLVM_DEBUG(dbgs() << "   ...  found LiveOnEntryDef\n");
-        if (CanOptimize && Current != KillingDef->getDefiningAccess())
+        if (CanOptimize && Current != KillingLoc.GetDefiningAccess())
           // The first clobbering def is... none.
-          KillingDef->setOptimized(Current);
+          KillingLoc.MemDef->setOptimized(Current);
         return std::nullopt;
       }
 
       // Cost of a step. Accesses in the same block are more likely to be valid
       // candidates for elimination, hence consider them cheaper.
-      unsigned StepCost = KillingDef->getBlock() == Current->getBlock()
+      unsigned StepCost = KillingLoc.MemDef->getBlock() == Current->getBlock()
                               ? MemorySSASameBBStepCost
                               : MemorySSAOtherBBStepCost;
       if (WalkerStepLimit <= StepCost) {
@@ -1390,21 +1437,23 @@ struct DSEState {
       MemoryDef *CurrentDef = cast<MemoryDef>(Current);
       Instruction *CurrentI = CurrentDef->getMemoryInst();
 
-      if (canSkipDef(CurrentDef, !isInvisibleToCallerOnUnwind(KillingUndObj))) {
+      if (canSkipDef(CurrentDef, !isInvisibleToCallerOnUnwind(
+                                     KillingLoc.UnderlyingObject))) {
         CanOptimize = false;
         continue;
       }
 
       // Before we try to remove anything, check for any extra throwing
       // instructions that block us from DSEing
-      if (mayThrowBetween(KillingI, CurrentI, KillingUndObj)) {
+      if (mayThrowBetween(KillingLoc.DefInst, CurrentI,
+                          KillingLoc.UnderlyingObject)) {
         LLVM_DEBUG(dbgs() << "  ... skip, may throw!\n");
         return std::nullopt;
       }
 
       // Check for anything that looks like it will be a barrier to further
       // removal
-      if (isDSEBarrier(KillingUndObj, CurrentI)) {
+      if (isDSEBarrier(KillingLoc.UnderlyingObject, CurrentI)) {
         LLVM_DEBUG(dbgs() << "  ... skip, barrier\n");
         return std::nullopt;
       }
@@ -1413,14 +1462,16 @@ struct DSEState {
       // clobber, bail out, as the path is not profitable. We skip this check
       // for intrinsic calls, because the code knows how to handle memcpy
       // intrinsics.
-      if (!isa<IntrinsicInst>(CurrentI) && isReadClobber(KillingLoc, CurrentI))
+      if (!isa<IntrinsicInst>(CurrentI) &&
+          isReadClobber(KillingLoc.MemLoc, CurrentI))
         return std::nullopt;
 
       // Quick check if there are direct uses that are read-clobbers.
       if (any_of(Current->uses(), [this, &KillingLoc, StartAccess](Use &U) {
             if (auto *UseOrDef = dyn_cast<MemoryUseOrDef>(U.getUser()))
               return !MSSA.dominates(StartAccess, UseOrDef) &&
-                     isReadClobber(KillingLoc, UseOrDef->getMemoryInst());
+                     isReadClobber(KillingLoc.MemLoc,
+                                   UseOrDef->getMemoryInst());
             return false;
           })) {
         LLVM_DEBUG(dbgs() << "   ...  found a read clobber\n");
@@ -1438,32 +1489,33 @@ struct DSEState {
       // AliasAnalysis does not account for loops. Limit elimination to
       // candidates for which we can guarantee they always store to the same
       // memory location and not located in different loops.
-      if (!isGuaranteedLoopIndependent(CurrentI, KillingI, *CurrentLoc)) {
+      if (!isGuaranteedLoopIndependent(CurrentI, KillingLoc.DefInst,
+                                       *CurrentLoc)) {
         LLVM_DEBUG(dbgs() << "  ... not guaranteed loop independent\n");
         CanOptimize = false;
         continue;
       }
 
-      if (IsMemTerm) {
+      if (isMemTerminatorInst(KillingLoc.DefInst, TLI)) {
         // If the killing def is a memory terminator (e.g. lifetime.end), check
         // the next candidate if the current Current does not write the same
         // underlying object as the terminator.
-        if (!isMemTerminator(*CurrentLoc, CurrentI, KillingI)) {
+        if (!isMemTerminator(*CurrentLoc, CurrentI, KillingLoc.DefInst)) {
           CanOptimize = false;
           continue;
         }
       } else {
         int64_t KillingOffset = 0;
         int64_t DeadOffset = 0;
-        auto OR = isOverwrite(KillingI, CurrentI, KillingLoc, *CurrentLoc,
-                              KillingOffset, DeadOffset);
+        auto OR = isOverwrite(KillingLoc.DefInst, CurrentI, KillingLoc.MemLoc,
+                              *CurrentLoc, KillingOffset, DeadOffset);
         if (CanOptimize) {
           // CurrentDef is the earliest write clobber of KillingDef. Use it as
           // optimized access. Do not optimize if CurrentDef is already the
           // defining access of KillingDef.
-          if (CurrentDef != KillingDef->getDefiningAccess() &&
+          if (CurrentDef != KillingLoc.GetDefiningAccess() &&
               (OR == OW_Complete || OR == OW_MaybePartial))
-            KillingDef->setOptimized(CurrentDef);
+            KillingLoc.MemDef->setOptimized(CurrentDef);
 
           // Once a may-aliasing def is encountered do not set an optimized
           // access.
@@ -1496,7 +1548,7 @@ struct DSEState {
     // the blocks with killing (=completely overwriting MemoryDefs) and check if
     // they cover all paths from MaybeDeadAccess to any function exit.
     SmallPtrSet<Instruction *, 16> KillingDefs;
-    KillingDefs.insert(KillingDef->getMemoryInst());
+    KillingDefs.insert(KillingLoc.DefInst);
     MemoryAccess *MaybeDeadAccess = Current;
     MemoryLocation MaybeDeadLoc = *CurrentLoc;
     Instruction *MaybeDeadI = cast<MemoryDef>(MaybeDeadAccess)->getMemoryInst();
@@ -1558,7 +1610,8 @@ struct DSEState {
         continue;
       }
 
-      if (UseInst->mayThrow() && !isInvisibleToCallerOnUnwind(KillingUndObj)) {
+      if (UseInst->mayThrow() &&
+          !isInvisibleToCallerOnUnwind(KillingLoc.UnderlyingObject)) {
         LLVM_DEBUG(dbgs() << "  ... found throwing instruction\n");
         return std::nullopt;
       }
@@ -1582,7 +1635,7 @@ struct DSEState {
       // if it reads the memory location.
       // TODO: It would probably be better to check for self-reads before
       // calling the function.
-      if (KillingDef == UseAccess || MaybeDeadAccess == UseAccess) {
+      if (KillingLoc.MemDef == UseAccess || MaybeDeadAccess == UseAccess) {
         LLVM_DEBUG(dbgs() << "    ... skipping killing def/dom access\n");
         continue;
       }
@@ -1602,7 +1655,7 @@ struct DSEState {
           BasicBlock *MaybeKillingBlock = UseInst->getParent();
           if (PostOrderNumbers.find(MaybeKillingBlock)->second <
               PostOrderNumbers.find(MaybeDeadAccess->getBlock())->second) {
-            if (!isInvisibleToCallerAfterRet(KillingUndObj)) {
+            if (!isInvisibleToCallerAfterRet(KillingLoc.UnderlyingObject)) {
               LLVM_DEBUG(dbgs()
                          << "    ... found killing def " << *UseInst << "\n");
               KillingDefs.insert(UseInst);
@@ -1620,7 +1673,7 @@ struct DSEState {
     // For accesses to locations visible after the function returns, make sure
     // that the location is dead (=overwritten) along all paths from
     // MaybeDeadAccess to the exit.
-    if (!isInvisibleToCallerAfterRet(KillingUndObj)) {
+    if (!isInvisibleToCallerAfterRet(KillingLoc.UnderlyingObject)) {
       SmallPtrSet<BasicBlock *, 16> KillingBlocks;
       for (Instruction *KD : KillingDefs)
         KillingBlocks.insert(KD->getParent());
@@ -2134,180 +2187,181 @@ struct DSEState {
   }
 };
 
-static bool eliminateDeadStores(Function &F, AliasAnalysis &AA, MemorySSA &MSSA,
-                                DominatorTree &DT, PostDominatorTree &PDT,
-                                const TargetLibraryInfo &TLI,
-                                const LoopInfo &LI) {
-  bool MadeChange = false;
+MemoryDefWrapper::MemoryDefWrapper(MemoryDef *MemDef, DSEState &State)
+    : MemDef(MemDef), State(State) {
+  DefInst = MemDef->getMemoryInst();
 
-  DSEState State(F, AA, MSSA, DT, PDT, TLI, LI);
-  // For each store:
-  for (unsigned I = 0; I < State.MemDefs.size(); I++) {
-    MemoryDef *KillingDef = State.MemDefs[I];
-    if (State.SkipStores.count(KillingDef))
+  if (isMemTerminatorInst(DefInst, State.TLI)) {
+    if (auto KillingLoc = State.getLocForTerminator(DefInst)) {
+      DefinedLocations.push_back(
+          MemoryLocationWrapper(KillingLoc->first, State, MemDef));
+    }
+    return;
+  }
+
+  if (auto KillingLoc = State.getLocForWrite(DefInst)) {
+    DefinedLocations.push_back(
+        MemoryLocationWrapper(*KillingLoc, State, MemDef));
+  }
+}
+
+ChangeStateEnum MemoryLocationWrapper::eliminateDeadDefs() {
+  ChangeStateEnum ChangeState = ChangeStateEnum::NoChange;
+  unsigned ScanLimit = MemorySSAScanLimit;
+  unsigned WalkerStepLimit = MemorySSAUpwardsStepLimit;
+  unsigned PartialLimit = MemorySSAPartialStoreLimit;
+  // Worklist of MemoryAccesses that may be killed by KillingDef.
+  SmallSetVector<MemoryAccess *, 8> ToCheck;
+  ToCheck.insert(GetDefiningAccess());
+
+  // Check if MemoryAccesses in the worklist are killed by KillingDef.
+  for (unsigned I = 0; I < ToCheck.size(); I++) {
+    MemoryAccess *Current = ToCheck[I];
+    if (State.SkipStores.count(Current))
       continue;
-    Instruction *KillingI = KillingDef->getMemoryInst();
+    std::optional<MemoryAccess *> MaybeDeadAccess = State.getDomMemoryDef(
+        *this, Current, ScanLimit, WalkerStepLimit, PartialLimit);
 
-    std::optional<MemoryLocation> MaybeKillingLoc;
-    if (State.isMemTerminatorInst(KillingI)) {
-      if (auto KillingLoc = State.getLocForTerminator(KillingI))
-        MaybeKillingLoc = KillingLoc->first;
-    } else {
-      MaybeKillingLoc = State.getLocForWrite(KillingI);
+    if (!MaybeDeadAccess) {
+      LLVM_DEBUG(dbgs() << "  finished walk\n");
+      continue;
     }
-
-    if (!MaybeKillingLoc) {
-      LLVM_DEBUG(dbgs() << "Failed to find analyzable write location for "
-                        << *KillingI << "\n");
+    MemoryAccess *DeadAccess = *MaybeDeadAccess;
+    LLVM_DEBUG(dbgs() << " Checking if we can kill " << *DeadAccess);
+    if (isa<MemoryPhi>(DeadAccess)) {
+      LLVM_DEBUG(dbgs() << "\n  ... adding incoming values to worklist\n");
+      for (Value *V : cast<MemoryPhi>(DeadAccess)->incoming_values()) {
+        MemoryAccess *IncomingAccess = cast<MemoryAccess>(V);
+        BasicBlock *IncomingBlock = IncomingAccess->getBlock();
+        BasicBlock *PhiBlock = DeadAccess->getBlock();
+
+        // We only consider incoming MemoryAccesses that come before the
+        // MemoryPhi. Otherwise we could discover candidates that do not
+        // strictly dominate our starting def.
+        if (State.PostOrderNumbers[IncomingBlock] >
+            State.PostOrderNumbers[PhiBlock])
+          ToCheck.insert(IncomingAccess);
+      }
       continue;
     }
-    MemoryLocation KillingLoc = *MaybeKillingLoc;
-    assert(KillingLoc.Ptr && "KillingLoc should not be null");
-    const Value *KillingUndObj = getUnderlyingObject(KillingLoc.Ptr);
-    LLVM_DEBUG(dbgs() << "Trying to eliminate MemoryDefs killed by "
-                      << *KillingDef << " (" << *KillingI << ")\n");
-
-    unsigned ScanLimit = MemorySSAScanLimit;
-    unsigned WalkerStepLimit = MemorySSAUpwardsStepLimit;
-    unsigned PartialLimit = MemorySSAPartialStoreLimit;
-    // Worklist of MemoryAccesses that may be killed by KillingDef.
-    SmallSetVector<MemoryAccess *, 8> ToCheck;
-    // Track MemoryAccesses that have been deleted in the loop below, so we can
-    // skip them. Don't use SkipStores for this, which may contain reused
-    // MemoryAccess addresses.
-    SmallPtrSet<MemoryAccess *, 8> Deleted;
-    [[maybe_unused]] unsigned OrigNumSkipStores = State.SkipStores.size();
-    ToCheck.insert(KillingDef->getDefiningAccess());
-
-    bool Shortend = false;
-    bool IsMemTerm = State.isMemTerminatorInst(KillingI);
-    // Check if MemoryAccesses in the worklist are killed by KillingDef.
-    for (unsigned I = 0; I < ToCheck.size(); I++) {
-      MemoryAccess *Current = ToCheck[I];
-      if (Deleted.contains(Current))
-        continue;
+    MemoryDefWrapper DeadDefWrapper(cast<MemoryDef>(DeadAccess), State);
+    MemoryLocationWrapper &DeadLoc = DeadDefWrapper.DefinedLocations.front();
+    LLVM_DEBUG(dbgs() << " (" << *DeadDefWrapper.DefInst << ")\n");
+    ToCheck.insert(DeadLoc.GetDefiningAccess());
+    NumGetDomMemoryDefPassed++;
 
-      std::optional<MemoryAccess *> MaybeDeadAccess = State.getDomMemoryDef(
-          KillingDef, Current, KillingLoc, KillingUndObj, ScanLimit,
-          WalkerStepLimit, IsMemTerm, PartialLimit);
-
-      if (!MaybeDeadAccess) {
-        LLVM_DEBUG(dbgs() << "  finished walk\n");
+    if (!DebugCounter::shouldExecute(MemorySSACounter))
+      continue;
+    if (isMemTerminatorInst(DefInst, State.TLI)) {
+      if (!(UnderlyingObject == DeadLoc.UnderlyingObject))
         continue;
+      LLVM_DEBUG(dbgs() << "DSE: Remove Dead Store:\n  DEAD: "
+                        << *DeadDefWrapper.DefInst << "\n  KILLER: " << *DefInst
+                        << '\n');
+      State.deleteDeadInstruction(DeadDefWrapper.DefInst);
+      ++NumFastStores;
+      ChangeState = ChangeStateEnum::DeleteByMemTerm;
+    } else {
+      // Check if DeadI overwrites KillingI.
+      int64_t KillingOffset = 0;
+      int64_t DeadOffset = 0;
+      OverwriteResult OR =
+          State.isOverwrite(DefInst, DeadDefWrapper.DefInst, MemLoc,
+                            DeadLoc.MemLoc, KillingOffset, DeadOffset);
+      if (OR == OW_MaybePartial) {
+        auto Iter = State.IOLs.insert(
+            std::make_pair<BasicBlock *, InstOverlapIntervalsTy>(
+                DeadDefWrapper.DefInst->getParent(), InstOverlapIntervalsTy()));
+        auto &IOL = Iter.first->second;
+        OR = isPartialOverwrite(MemLoc, DeadLoc.MemLoc, KillingOffset,
+                                DeadOffset, DeadDefWrapper.DefInst, IOL);
       }
-
-      MemoryAccess *DeadAccess = *MaybeDeadAccess;
-      LLVM_DEBUG(dbgs() << " Checking if we can kill " << *DeadAccess);
-      if (isa<MemoryPhi>(DeadAccess)) {
-        LLVM_DEBUG(dbgs() << "\n  ... adding incoming values to worklist\n");
-        for (Value *V : cast<MemoryPhi>(DeadAccess)->incoming_values()) {
-          MemoryAccess *IncomingAccess = cast<MemoryAccess>(V);
-          BasicBlock *IncomingBlock = IncomingAccess->getBlock();
-          BasicBlock *PhiBlock = DeadAccess->getBlock();
-
-          // We only consider incoming MemoryAccesses that come before the
-          // MemoryPhi. Otherwise we could discover candidates that do not
-          // strictly dominate our starting def.
-          if (State.PostOrderNumbers[IncomingBlock] >
-              State.PostOrderNumbers[PhiBlock])
-            ToCheck.insert(IncomingAccess);
+      if (EnablePartialStoreMerging && OR == OW_PartialEarlierWithFullLater) {
+        auto *DeadSI = dyn_cast<StoreInst>(DeadDefWrapper.DefInst);
+        auto *KillingSI = dyn_cast<StoreInst>(DefInst);
+        // We are re-using tryToMergePartialOverlappingStores, which requires
+        // DeadSI to dominate KillingSI.
+        // TODO: implement tryToMergeParialOverlappingStores using MemorySSA.
+        if (DeadSI && KillingSI && State.DT.dominates(DeadSI, KillingSI)) {
+          if (Constant *Merged = tryToMergePartialOverlappingStores(
+                  KillingSI, DeadSI, KillingOffset, DeadOffset, State.DL,
+                  State.BatchAA, &State.DT)) {
+
+            // Update stored value of earlier store to merged constant.
+            D...
[truncated]

@tschuett tschuett requested a review from fhahn July 29, 2024 04:56
@nikic nikic self-requested a review July 29, 2024 07:54
@tschuett
Copy link
Member

tschuett commented Jul 29, 2024

Can we promote somebody to code owner to make the assignment automatic?

@haopliu haopliu changed the title Refactor DSE [DSE] Refactor DSE Jul 29, 2024
@aeubanks aeubanks requested a review from alinas August 2, 2024 01:32
@aeubanks
Copy link
Contributor

aeubanks commented Aug 6, 2024

with just this patch, it's still one MemoryLocation per MemoryDef. can we keep that assumption for now in this PR? and then make the change in the initializes DSE patch

@aeubanks
Copy link
Contributor

aeubanks commented Aug 7, 2024

also, the dependency cycle between Memory*Wrapper and DSEState is awkward, is it possible to keep the eliminateDeadDefs() implementations as part of DSEState?

@haopliu
Copy link
Contributor Author

haopliu commented Aug 7, 2024

with just this patch, it's still one MemoryLocation per MemoryDef. can we keep that assumption for now in this PR? and then make the change in the initializes DSE patch

It's correct! With this PR, we still keep one MemoryLocation per MemoryDef.
See assert(DefinedLocations.size() == 1 && "Expected a single defined location");
This PR only refactors and aims to be equivalent to the existing implementation.

also, the dependency cycle between Memory*Wrapper and DSEState is awkward, is it possible to keep the eliminateDeadDefs() implementations as part of DSEState?

We can do that to avoid the dependency cycle. DSEState is a big struct (1300 lines of code). Is it ok to continue adding code in this struct?

@aeubanks
Copy link
Contributor

aeubanks commented Aug 7, 2024

with just this patch, it's still one MemoryLocation per MemoryDef. can we keep that assumption for now in this PR? and then make the change in the initializes DSE patch

It's correct! With this PR, we still keep one MemoryLocation per MemoryDef. See assert(DefinedLocations.size() == 1 && "Expected a single defined location"); This PR only refactors and aims to be equivalent to the existing implementation.

I mean in this PR DefinedLocation shouldn't be a SmallVector, just a MemoryLocationWrapper. The PR that introduces multiple MemoryLocations per MemoryDef should make that change instead.

also, the dependency cycle between Memory*Wrapper and DSEState is awkward, is it possible to keep the eliminateDeadDefs() implementations as part of DSEState?

We can do that to avoid the dependency cycle. DSEState is a big struct (1300 lines of code). Is it ok to continue adding code in this struct?

I thought the code was originally in DSEState?

@haopliu
Copy link
Contributor Author

haopliu commented Aug 7, 2024

I mean in this PR DefinedLocation shouldn't be a SmallVector, just a MemoryLocationWrapper. The PR that introduces multiple MemoryLocations per MemoryDef should make that change instead.

Changed the DefinedLocations (SmallVector) field to DefinedLocation (optional).

I thought the code was originally in DSEState?

The major code was originally in static bool eliminateDeadStores(...), this loop:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp#L2145

@aeubanks
Copy link
Contributor

aeubanks commented Aug 8, 2024

I thought the code was originally in DSEState?

The major code was originally in static bool eliminateDeadStores(...), this loop: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp#L2145

Moving a function from a static function to a DSEState method seems fine, I don't see anything wrong with that

@haopliu
Copy link
Contributor Author

haopliu commented Aug 8, 2024

Moving a function from a static function to a DSEState method seems fine, I don't see anything wrong with that

Okay, moved MemoryDefWrapper and MemoryLocationWrapper to DSEState!

@aeubanks
Copy link
Contributor

aeubanks commented Aug 8, 2024

sorry, I meant that we should keep Memory*Wrapper as simple structs that just hold some data without any complex methods like eliminateDeadDefs, and then DSEState contains all the real logic, using Memory*Wrapper wherever necessary. not to move Memory*Wrapper under DSEState. Memory*Wrapper should probably be outside DSEState like before

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes this is what I was looking for, thanks

I'll review this more in depth soon

MemoryDef *KillingDef = State.MemDefs[I];
if (State.SkipStores.count(KillingDef))
// Try to eliminate dead defs killed by `KillingDefWrapper` and return the
// change state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is returning a bool instead of a change state

};
// Try to eliminate dead defs killed by `KillingLocWrapper` and return the
// change state.
ChangeStateEnum eliminateDeadDefs(MemoryLocationWrapper &KillingLocWrapper);
Copy link
Contributor

@aeubanks aeubanks Aug 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a lot of the enum values of ChangeStateEnum aren't actually checked. following the original code more closely, can we return a std::pair<bool, bool> of [Changed, Shortened]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense. Done.

@aeubanks
Copy link
Contributor

I'd add [NFC] to the commit title to emphasize this is pure refactoring

@haopliu haopliu changed the title [DSE] Refactor DSE [NFC] [DSE] Refactor DSE Aug 12, 2024
const TargetLibraryInfo &TLI,
const LoopInfo &LI) {
bool MadeChange = false;
// Try to eliminate dead defs killed by `KillingLocWrapper` and return the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, Shortened really means "KillingI was deleted" (please correct me if I'm wrong). So maybe change the variable name to DeletedKillingLoc and the comment to

Try to eliminate dead defs killed by `KillingLocWrapper`. Return whether any changes were made, and whether `KillingLocWrapper` was deleted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aside from one comment, lgtm. do you have a followup DSE commit you could reference in this commit's description?

Copy link
Contributor

@jvoung jvoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in reviewing! A few comments

llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp Outdated Show resolved Hide resolved
if (ScanLimit == 0 || WalkerStepLimit == 0) {
LLVM_DEBUG(dbgs() << "\n ... hit scan limit\n");
return std::nullopt;
}

MemoryAccess *Current = StartAccess;
Instruction *KillingI = KillingDef->getMemoryInst();
LLVM_DEBUG(dbgs() << " trying to get dominating access\n");

// Only optimize defining access of KillingDef when directly starting at its
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the comment about KillingDef? There's no longer that variable and it's wrapped in KillingLoc (or make a local var KillingDef = KillingLoc.MemDef, or ...?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed "KillingLoc" parameter to "KillingLocWrapper", and updated the comments in getDomMemoryDef and two eliminateDeadDefs functions:
"KillingLoc" --> "KillingLocWrapper.MemLoc"
"KillingDef" --> "KillingLocWrapper.MemDef"

Let me know if we prefer to make a local var KillingDef = KillingLocWrapper.MemDef.

auto OR = isOverwrite(KillingI, CurrentI, KillingLoc, *CurrentLoc,
KillingOffset, DeadOffset);
auto OR = isOverwrite(KillingLoc.DefInst, CurrentI, KillingLoc.MemLoc,
*CurrentLoc, KillingOffset, DeadOffset);
if (CanOptimize) {
// CurrentDef is the earliest write clobber of KillingDef. Use it as
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: similar here -- update the comment about KillingDef?

I think a few more cases below if you can update too. thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

unsigned ScanLimit = MemorySSAScanLimit;
unsigned WalkerStepLimit = MemorySSAUpwardsStepLimit;
unsigned PartialLimit = MemorySSAPartialStoreLimit;
// Worklist of MemoryAccesses that may be killed by KillingDef.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar: Update comment about KillingDef, or make a local var KillingDef, or ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if (MemLoc.has_value())
DefinedLocation = MemoryLocationWrapper(*MemLoc, MemDef);
}
std::optional<MemoryLocationWrapper> DefinedLocation = std::nullopt;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a later stage this becomes a SmallVector of MemoryLocationWrapper. Is that right?

Do the MemoryLocationWrapper all have the same MemoryDef and DefInst?
I wonder if that's actually worth bundling in MemoryLocationWrapper (and thus repeating), or if the SmallVector should just be the MemoryLocation and UnderlyingObject (the non-repeating parts)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this will become a SmallVector of MemoryLocationWrapper and all have the same MemoryDef and DefInst.

Initially, I bundled them in MemoryDefWrapper but I realized that most accesses are through MemoryLocationWrapper in getDomMemoryDef. It seems that we should either bundle in MemoryLocationWrapper or keep a MemoryDefWrapper reference in each MemoryLocationWrapper. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm my preference is not to keep a reference to MemoryDefWrapper in each MemoryLocationWrapper and not make it circular =)

Sorry, maybe I should have discussed this more before asking about the comment updates, but I didn't notice this until later (already added note about comments).

What do you think of splitting out the MemDef as a separate parameter to getDomMemoryDef as it was before (KillingDef)? And then DefInst can be derived in a local scope as before (KillingI). Does DefInst need to be extracted early?

Can you clarify the benefits of bundling them in a single parameter?

  • Initially you had eliminateDeadDefs as a method of the wrapper, but that's now moved back into DSEState.
  • I can see that there's a lot of parameters for getDomMemoryDef and the relationships are loose and only tied together by the same name prefix "Killing"
  • similar for the "DeadLoc", "DeadDefAccess", ...

But then

  • there is the repetition
  • the "Wrapper" part of MemoryLocationWrapper name doesn't convey much information to the reader (related to the "killed by" comments where it was unclear there's a MemDef in the MemoryLocationWrapper class, and now say "killed by `KillingLocWrapper.MemDef"). Similar for the MemoryDefWrapper but the other way -- it may be unclear there are MemoryLocations in the class. If both Wrappers have access to the Def and some Locations it feels harder to tell the difference from the name.

A middle ground could be a single class "MemDefAndLocations" for bundling purposes (Killing vs Dead, etc.), but then allow splitting off MemDef and MemoryLocation when the execution starts focusing on one of the MemoryLocations related to the MemDef?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that there's a lot of parameters for getDomMemoryDef and the relationships are loose and only tied together by the same name prefix "Killing"

This is the main reason. I tried to make getDomMemoryDef with a bit clearer interface: given a (killing) MemoryLocWrapper, return a MemoryDef that dominates the MemoryLoc or underlying object. This process needs DefInst but mostly for additional checks, I think, so I choose MemoryLocWrapper as a parameter and try to bundle all killing-related stuff in this parameter.

Thanks for raising this concern! I agree this bundling is too surface. May be an ideal way is to refactor/divide the getDomMemoryDef logic then bundle the members and the corresponding logic together.

How about we step back and keep getDomMemoryDef as it was?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we step back and keep getDomMemoryDef as it was?

Sounds good, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Please take another look. Thanks!

const TargetLibraryInfo &TLI,
const LoopInfo &LI) {
bool MadeChange = false;
// Try to eliminate dead defs killed by `KillingLocWrapper` and return the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "eliminate dead defs killed by KillingLocWrapper" I'm not sure killed by KillingLocWrapper is as clear. Which part of KillingLocWrapper are the defs killed by? It's not the MemoryLocation parts right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to "Try to eliminate dead defs that access KillingLocWrapper.MemLoc and are killed by KillingLocWrapper.MemDef".

<< *DefInst << "\n");
return false;
}
LLVM_DEBUG(dbgs() << "Trying to eliminate MemoryDefs killed by " << *MemDef
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe naive question, but I'm having a bit of trouble determining where MemDef and DefInst are coming from in this scope. Are they fields of DSEState? How do they relate to the KillingDefWrapper that is being processed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh, code in LLVM_DEBUG was ignored in my cmake (release mode) then unit test cannot find such mistakes! Fixed these mistakes.

Comment on lines 811 to 812
class MemoryLocationWrapper {
public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class MemoryLocationWrapper {
public:
struct MemoryLocationWrapper {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

Comment on lines 828 to 829
class MemoryDefWrapper {
public:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class MemoryDefWrapper {
public:
struct MemoryDefWrapper {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

// killed by `KillingLocWrapper.MemDef`. Return whether
// any changes were made, and whether `KillingLocWrapper.DefInst` was deleted.
std::pair<bool, bool>
eliminateDeadDefs(MemoryLocationWrapper &KillingLocWrapper);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can KillingLocWrapper be modified? If so, document, if not make const?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marked with const!

if (State.SkipStores.count(KillingDef))
// Try to eliminate dead defs killed by `KillingDefWrapper` and return the
// change state: whether make any change.
bool eliminateDeadDefs(MemoryDefWrapper &KillingDefWrapper);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can KillingLocWrapper be modified? If so, document, if not make const?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marked with const!

Copy link
Contributor

@jvoung jvoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small nit but otherwise lgtm too

if (!DebugCounter::shouldExecute(MemorySSACounter))
continue;
if (isMemTerminatorInst(KillingLocWrapper.DefInst)) {
if (!(KillingLocWrapper.UnderlyingObject ==
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can do LHS != RHS instead ?

@haopliu
Copy link
Contributor Author

haopliu commented Aug 26, 2024

Thank you all! Here is the follow-up DSE commit for reference (58dd8a4).
Will merge this refactoring PR this week if no other comments.

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@haopliu haopliu merged commit 6421dcc into llvm:main Aug 29, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants