Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LICM][TBAA] LICM needlessly drops TBAA #53794

Closed
wsmoses opened this issue Feb 12, 2022 · 12 comments
Closed

[LICM][TBAA] LICM needlessly drops TBAA #53794

wsmoses opened this issue Feb 12, 2022 · 12 comments

Comments

@wsmoses
Copy link
Member

wsmoses commented Feb 12, 2022

See: https://godbolt.org/z/bcj3zEz7P

; ModuleID = '<source>'
source_filename = "<source>"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; Function Attrs: mustprogress nofree norecurse nosync nounwind uwtable
define void @licm(double** align 8 dereferenceable(8) %_M_start.i, i64 %numElem) {
entry:
  br label %for.cond

for.cond:                                         ; preds = %for.body, %entry
  %k.0 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
  %cmp = icmp ult i64 %k.0, %numElem
  br i1 %cmp, label %for.body, label %for.cond.cleanup

for.body:                                         ; preds = %for.cond
  %0 = load double*, double** %_M_start.i, align 8, !tbaa !3
  %add.ptr.i = getelementptr inbounds double, double* %0, i64 %k.0
  store double 2.000000e+00, double* %add.ptr.i, align 8, !tbaa !8
  %inc = add nuw i64 %k.0, 1
  br label %for.cond, !llvm.loop !10

for.cond.cleanup:                                 ; preds = %for.cond
  ret void
}

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git fc510998f7c287df2bc1304673e0cd8452d50b31)"}
!3 = !{!4, !5, i64 0}
!4 = !{!"_ZTSNSt12_Vector_baseIdSaIdEE17_Vector_impl_dataE", !5, i64 0, !5, i64 8, !5, i64 16}
!5 = !{!"any pointer", !6, i64 0}
!6 = !{!"omnipotent char", !7, i64 0}
!7 = !{!"Simple C++ TBAA"}
!8 = !{!9, !9, i64 0}
!9 = !{!"double", !6, i64 0}
!10 = distinct !{!10, !11, !12}
!11 = !{!"llvm.loop.mustprogress"}
!12 = !{!"llvm.loop.unroll.disable"}

When the %0 load is hoisted, it no longer has TBAA:

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define void @licm(double** align 8 dereferenceable(8) %_M_start.i, i64 %numElem) {
entry:
  %0 = load double*, double** %_M_start.i, align 8
  br label %for.cond

for.cond:                                         ; preds = %for.body, %entry
  %k.0 = phi i64 [ 0, %entry ], [ %inc, %for.body ]
  %cmp = icmp ult i64 %k.0, %numElem
  br i1 %cmp, label %for.body, label %for.cond.cleanup

for.body:                                         ; preds = %for.cond
  %add.ptr.i = getelementptr inbounds double, double* %0, i64 %k.0
  store double 2.000000e+00, double* %add.ptr.i, align 8, !tbaa !3
  %inc = add nuw i64 %k.0, 1
  br label %for.cond, !llvm.loop !7

for.cond.cleanup:                                 ; preds = %for.cond
  ret void
}

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 7, !"uwtable", i32 1}
!2 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git fc510998f7c287df2bc1304673e0cd8452d50b31)"}
!3 = !{!4, !4, i64 0}
!4 = !{!"double", !5, i64 0}
!5 = !{!"omnipotent char", !6, i64 0}
!6 = !{!"Simple C++ TBAA"}
!7 = distinct !{!7, !8, !9}
!8 = !{!"llvm.loop.mustprogress"}
!9 = !{!"llvm.loop.unroll.disable"}
@davidbolvansky
Copy link
Collaborator

LICM drops some metadata after hoisting which may not be valid anymore in new location.
cc @nikic

@wsmoses
Copy link
Member Author

wsmoses commented Feb 12, 2022

This pattern appears quite common (as this originally arises in the context of a std::vector).

In this context, I think preserving TBAA should always be correct, even after hoisting

@nikic
Copy link
Contributor

nikic commented Feb 12, 2022

It is not legal to preserve TBAA on a speculatively executed load. LICM will preserve the metadata if the load is guaranteed to execute. In your particular example, the load may not be executed if %numElems is zero.

@nikic nikic closed this as completed Feb 12, 2022
@wsmoses
Copy link
Member Author

wsmoses commented Feb 12, 2022

@nikic This particular behavior presents a performance decay because of the stripped AA (hindering subsequent LICM, as an example). Consider the case here: https://godbolt.org/z/8v3bnhzY6

#include <vector>

void compute(std::vector<int> &data, unsigned long long numElems) {
    for (unsigned long long i=0; i<100; i++) {
        for (unsigned long long j=0; j<numElems; j++) {
            data[j]++;
        }
    }
}

LICM is first run before loop rotation and thus the inner load is speculative. Speculatively hoisting the load outside of the inner loop drops TBAA. This means the data pointer load can't be speculatively hoisted outside of the outer loop (and moreover the store into the data itself now appears to alias with the load of the data pointer due to the dropped TBAA). Allowing the load of the data pointer, and of the actual data to alias can make a significant performance impact.

In contrast, first performing the rotation (making it non-speculative), succeeds at hoisting the data pointer out (though at this point it would still drop the TBAA): https://godbolt.org/z/a44crzPGx

#include <vector>

void compute(std::vector<int> &data, unsigned long long numElems) {
    for (unsigned long long i=0; i<100; i++) {
        if (numElems > 0) {
            for (unsigned long long j=0; ; j++) {
                data[j]++;
                if (j == numElems) break;
            }
        }
    }
}

At minimum this feels like we should always ensure LICM follows loop rotation, though I'd also be happy if we don't speculate loads with alias information.

@wsmoses wsmoses reopened this Feb 12, 2022
@wsmoses wsmoses changed the title [LICM][TBAA] LICM needlessly drops TBAA [LICM][TBAA] LICM needlessly drops TBAA on a speculative hoist Feb 12, 2022
@wsmoses wsmoses changed the title [LICM][TBAA] LICM needlessly drops TBAA on a speculative hoist [LICM][TBAA] LICM needlessly drops TBAA Feb 12, 2022
@nikic
Copy link
Contributor

nikic commented Feb 12, 2022

@wsmoses As a matter of general policy, we prefer doing optimizations that require dropping metadata or attributes over not doing them. This is obviously better on average, and avoids awkward situations where inferring more information in one place inhibits later optimizations.

We're certainly not going to inhibit load hoisting to preserve TBAA. We can reconsider doing LICM before LoopRotate though. This is a recent change made in https://reviews.llvm.org/D99249, that is a win in some cases and a loss in others. You could probably present an argument that this is not the right tradeoff.

@wsmoses
Copy link
Member Author

wsmoses commented Feb 12, 2022

Would it be reasonable to make the speculative instruction conditionally executed above the loop in a limited set of circumstances (and thus preserve metadata).

The original code is complex because the condition is not loop-invariant, but suppose the speculation was due to a single loop-invariant condition, we could reproduce that outside the loop.

The loop-dependent condition is a bit more tricky and would require the outside loop condition to be a "any of the inner conditions are true".

@wsmoses
Copy link
Member Author

wsmoses commented Feb 12, 2022

Another less-intrusive option:

Since we now run LICM both before and after Loop rotate, could we add an option to LICM to not speculatively hoist data that would drop information.

Thus we could still run LICM prior to LoopRotate and see those benefits (while preserving information), and the "full" LICM run would still perform all of the speculative information.

@nikic
Copy link
Contributor

nikic commented Feb 12, 2022

Would it be reasonable to make the speculative instruction conditionally executed above the loop in a limited set of circumstances (and thus preserve metadata).

The original code is complex because the condition is not loop-invariant, but suppose the speculation was due to a single loop-invariant condition, we could reproduce that outside the loop.

The loop-dependent condition is a bit more tricky and would require the outside loop condition to be a "any of the inner conditions are true".

I don't think that would be a profitable transform. In fact, if you do that, SimplifyCFG is just going to speculate the load out the branch lateron and drop the metadata anyway. Keeping around control flow to preserve TBAA metadata will be non-profitable outside a few very special circumstances.

@nikic
Copy link
Contributor

nikic commented Feb 12, 2022

Another less-intrusive option:

Since we now run LICM both before and after Loop rotate, could we add an option to LICM to not speculatively hoist data that would drop information.

Thus we could still run LICM prior to LoopRotate and see those benefits (while preserving information), and the "full" LICM run would still perform all of the speculative information.

I think this is principally viable. I would phrase this more generally in terms of only hoisting guaranteed-to-execute instructions, because any instructions that can be hoisted through speculation can also be speculated post-rotation.

Though I think we'd be better off dropping the first LICM run than complicating things in this direction. I believe the specific problem it was intended to solve was later addressed by emitting necessary alignment attributes in clang.

@wsmoses
Copy link
Member Author

wsmoses commented Feb 12, 2022

I'm okay with either approach, what do you recommend and how do we proceed?

@davidbolvansky
Copy link
Collaborator

Drop LICM run before LoopRotate in pass manager?

@wsmoses
Copy link
Member Author

wsmoses commented Feb 16, 2022

Added the speculative option in the PR here: https://reviews.llvm.org/D119965.

wsmoses added a commit that referenced this issue Feb 18, 2022
…otate

LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information.

This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D119965
@wsmoses wsmoses closed this as completed Feb 18, 2022
llvmbot pushed a commit to llvmbot/llvm-project that referenced this issue Mar 3, 2022
…otate

LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (llvm#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information.

This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D119965

(cherry picked from commit d9da6a5)
tstellar pushed a commit that referenced this issue Mar 8, 2022
…otate

LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information.

This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D119965

(cherry picked from commit d9da6a5)
mem-frob pushed a commit to draperlaboratory/hope-llvm-project that referenced this issue Oct 7, 2022
…otate

LICM will speculatively hoist code outside of loops. This requires removing information, like alias analysis (llvm/llvm-project#53794), range information (https://bugs.llvm.org/show_bug.cgi?id=50550), among others. Prior to https://reviews.llvm.org/D99249 , LICM would only be run after LoopRotate. Running Loop Rotate prior to LICM prevents a instruction hoist from being speculative, if it was conditionally executed by the iteration (as is commonly emitted by clang and other frontends). Adding the additional LICM pass first, however, forces all of these instructions to be considered speculative, even if they are not speculative after LoopRotate. This destroys information, resulting in performance losses for discarding this additional information.

This PR modifies LICM to accept a ``speculative'' parameter which allows LICM to be set to perform information-loss speculative hoists or not. Phase ordering is then modified to not perform the information-losing speculative hoists until after loop rotate is performed, preserving this additional information.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D119965
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants