[LICM][TBAA] LICM needlessly drops TBAA #53794

wsmoses · 2022-02-12T19:52:17Z

; ModuleID = '<source>'
source_filename = "<source>"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: mustprogress nofree norecurse nosync nounwind uwtable
define void @licm(double** align 8 dereferenceable(8) %_M_start.i, i64 %numElem) {
entry:
  br label %for.cond

for.cond:                                         ; preds = %for.body, %entry
  %k.0 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
  %cmp = icmp ult i64 %k.0, %numElem
  br i1 %cmp, label %for.body, label %for.cond.cleanup

for.body:                                         ; preds = %for.cond
  %0 = load double*, double** %_M_start.i, align 8, !tbaa !3
  %add.ptr.i = getelementptr inbounds double, double* %0, i64 %k.0
  store double 2.000000e+00, double* %add.ptr.i, align 8, !tbaa !8
  %inc = add nuw i64 %k.0, 1
  br label %for.cond, !llvm.loop !10

for.cond.cleanup:                                 ; preds = %for.cond
  ret void
}

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git fc510998f7c287df2bc1304673e0cd8452d50b31)"}
!3 = !{!4, !5, i64 0}
!4 = !{!"_ZTSNSt12_Vector_baseIdSaIdEE17_Vector_impl_dataE", !5, i64 0, !5, i64 8, !5, i64 16}
!5 = !{!"any pointer", !6, i64 0}
!6 = !{!"omnipotent char", !7, i64 0}
!7 = !{!"Simple C++ TBAA"}
!8 = !{!9, !9, i64 0}
!9 = !{!"double", !6, i64 0}
!10 = distinct !{!10, !11, !12}
!11 = !{!"llvm.loop.mustprogress"}
!12 = !{!"llvm.loop.unroll.disable"}

When the %0 load is hoisted, it no longer has TBAA:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @licm(double** align 8 dereferenceable(8) %_M_start.i, i64 %numElem) {
entry:
  %0 = load double*, double** %_M_start.i, align 8
  br label %for.cond

for.cond:                                         ; preds = %for.body, %entry
  %k.0 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
  %cmp = icmp ult i64 %k.0, %numElem
  br i1 %cmp, label %for.body, label %for.cond.cleanup

for.body:                                         ; preds = %for.cond
  %add.ptr.i = getelementptr inbounds double, double* %0, i64 %k.0
  store double 2.000000e+00, double* %add.ptr.i, align 8, !tbaa !3
  %inc = add nuw i64 %k.0, 1
  br label %for.cond, !llvm.loop !7

for.cond.cleanup:                                 ; preds = %for.cond
  ret void
}

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git fc510998f7c287df2bc1304673e0cd8452d50b31)"}
!3 = !{!4, !4, i64 0}
!4 = !{!"double", !5, i64 0}
!5 = !{!"omnipotent char", !6, i64 0}
!6 = !{!"Simple C++ TBAA"}
!7 = distinct !{!7, !8, !9}
!8 = !{!"llvm.loop.mustprogress"}
!9 = !{!"llvm.loop.unroll.disable"}

davidbolvansky · 2022-02-12T20:00:36Z

LICM drops some metadata after hoisting which may not be valid anymore in new location.
cc @nikic

wsmoses · 2022-02-12T20:08:23Z

This pattern appears quite common (as this originally arises in the context of a std::vector).

In this context, I think preserving TBAA should always be correct, even after hoisting

nikic · 2022-02-12T20:10:36Z

It is not legal to preserve TBAA on a speculatively executed load. LICM will preserve the metadata if the load is guaranteed to execute. In your particular example, the load may not be executed if %numElems is zero.

wsmoses · 2022-02-12T20:36:32Z

@nikic This particular behavior presents a performance decay because of the stripped AA (hindering subsequent LICM, as an example). Consider the case here: https://godbolt.org/z/8v3bnhzY6

#include <vector>

void compute(std::vector<int> &data, unsigned long long numElems) {
    for (unsigned long long i=0; i<100; i++) {
        for (unsigned long long j=0; j<numElems; j++) {
            data[j]++;
        }
    }
}

LICM is first run before loop rotation and thus the inner load is speculative. Speculatively hoisting the load outside of the inner loop drops TBAA. This means the data pointer load can't be speculatively hoisted outside of the outer loop (and moreover the store into the data itself now appears to alias with the load of the data pointer due to the dropped TBAA). Allowing the load of the data pointer, and of the actual data to alias can make a significant performance impact.

In contrast, first performing the rotation (making it non-speculative), succeeds at hoisting the data pointer out (though at this point it would still drop the TBAA): https://godbolt.org/z/a44crzPGx

#include <vector>

void compute(std::vector<int> &data, unsigned long long numElems) {
    for (unsigned long long i=0; i<100; i++) {
        if (numElems > 0) {
            for (unsigned long long j=0; ; j++) {
                data[j]++;
                if (j == numElems) break;
            }
        }
    }
}

At minimum this feels like we should always ensure LICM follows loop rotation, though I'd also be happy if we don't speculate loads with alias information.

nikic · 2022-02-12T20:48:05Z

@wsmoses As a matter of general policy, we prefer doing optimizations that require dropping metadata or attributes over not doing them. This is obviously better on average, and avoids awkward situations where inferring more information in one place inhibits later optimizations.

We're certainly not going to inhibit load hoisting to preserve TBAA. We can reconsider doing LICM before LoopRotate though. This is a recent change made in https://reviews.llvm.org/D99249, that is a win in some cases and a loss in others. You could probably present an argument that this is not the right tradeoff.

wsmoses · 2022-02-12T20:59:33Z

Would it be reasonable to make the speculative instruction conditionally executed above the loop in a limited set of circumstances (and thus preserve metadata).

The original code is complex because the condition is not loop-invariant, but suppose the speculation was due to a single loop-invariant condition, we could reproduce that outside the loop.

The loop-dependent condition is a bit more tricky and would require the outside loop condition to be a "any of the inner conditions are true".

wsmoses · 2022-02-12T21:07:48Z

Another less-intrusive option:

Since we now run LICM both before and after Loop rotate, could we add an option to LICM to not speculatively hoist data that would drop information.

Thus we could still run LICM prior to LoopRotate and see those benefits (while preserving information), and the "full" LICM run would still perform all of the speculative information.

nikic · 2022-02-12T21:10:07Z

Would it be reasonable to make the speculative instruction conditionally executed above the loop in a limited set of circumstances (and thus preserve metadata).

The original code is complex because the condition is not loop-invariant, but suppose the speculation was due to a single loop-invariant condition, we could reproduce that outside the loop.

The loop-dependent condition is a bit more tricky and would require the outside loop condition to be a "any of the inner conditions are true".

I don't think that would be a profitable transform. In fact, if you do that, SimplifyCFG is just going to speculate the load out the branch lateron and drop the metadata anyway. Keeping around control flow to preserve TBAA metadata will be non-profitable outside a few very special circumstances.

nikic · 2022-02-12T21:28:13Z

Another less-intrusive option:

Since we now run LICM both before and after Loop rotate, could we add an option to LICM to not speculatively hoist data that would drop information.

Thus we could still run LICM prior to LoopRotate and see those benefits (while preserving information), and the "full" LICM run would still perform all of the speculative information.

I think this is principally viable. I would phrase this more generally in terms of only hoisting guaranteed-to-execute instructions, because any instructions that can be hoisted through speculation can also be speculated post-rotation.

Though I think we'd be better off dropping the first LICM run than complicating things in this direction. I believe the specific problem it was intended to solve was later addressed by emitting necessary alignment attributes in clang.

wsmoses · 2022-02-12T21:35:49Z

I'm okay with either approach, what do you recommend and how do we proceed?

davidbolvansky · 2022-02-12T22:11:11Z

Drop LICM run before LoopRotate in pass manager?

wsmoses · 2022-02-16T19:38:21Z

Added the speculative option in the PR here: https://reviews.llvm.org/D119965.

…otate LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information. This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D119965

…otate LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (llvm#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information. This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D119965 (cherry picked from commit d9da6a5)

…otate LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information. This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D119965 (cherry picked from commit d9da6a5)

…otate LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (llvm/llvm-project#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information. This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D119965

wsmoses added the llvm:core label Feb 12, 2022

nikic closed this as completed Feb 12, 2022

wsmoses reopened this Feb 12, 2022

wsmoses changed the title ~~[LICM][TBAA] LICM needlessly drops TBAA~~ [LICM][TBAA] LICM needlessly drops TBAA on a speculative hoist Feb 12, 2022

wsmoses changed the title ~~[LICM][TBAA] LICM needlessly drops TBAA on a speculative hoist~~ [LICM][TBAA] LICM needlessly drops TBAA Feb 12, 2022

wsmoses closed this as completed Feb 18, 2022

EugeneZelenko added llvm:optimizations and removed llvm:core labels Feb 18, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[LICM][TBAA] LICM needlessly drops TBAA #53794

[LICM][TBAA] LICM needlessly drops TBAA #53794

wsmoses commented Feb 12, 2022 •

edited

Loading

davidbolvansky commented Feb 12, 2022

wsmoses commented Feb 12, 2022 •

edited

Loading

nikic commented Feb 12, 2022

wsmoses commented Feb 12, 2022

nikic commented Feb 12, 2022

wsmoses commented Feb 12, 2022 •

edited

Loading

wsmoses commented Feb 12, 2022

nikic commented Feb 12, 2022

nikic commented Feb 12, 2022

wsmoses commented Feb 12, 2022

davidbolvansky commented Feb 12, 2022

wsmoses commented Feb 16, 2022

[LICM][TBAA] LICM needlessly drops TBAA #53794

[LICM][TBAA] LICM needlessly drops TBAA #53794

Comments

wsmoses commented Feb 12, 2022 • edited Loading

davidbolvansky commented Feb 12, 2022

wsmoses commented Feb 12, 2022 • edited Loading

nikic commented Feb 12, 2022

wsmoses commented Feb 12, 2022

nikic commented Feb 12, 2022

wsmoses commented Feb 12, 2022 • edited Loading

wsmoses commented Feb 12, 2022

nikic commented Feb 12, 2022

nikic commented Feb 12, 2022

wsmoses commented Feb 12, 2022

davidbolvansky commented Feb 12, 2022

wsmoses commented Feb 16, 2022

wsmoses commented Feb 12, 2022 •

edited

Loading

wsmoses commented Feb 12, 2022 •

edited

Loading

wsmoses commented Feb 12, 2022 •

edited

Loading