Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CodeGen] Simplify codegen for array initialization #93956

Merged
merged 1 commit into from
Jun 10, 2024

Conversation

nikic
Copy link
Contributor

@nikic nikic commented May 31, 2024

This makes codegen for array initialization simpler in two ways:

  1. Drop the zero-index GEP at the start, which is no longer needed with opaque pointers.
  2. Emit GEPs directly to the correct element, instead of having a long chain of +1 GEPs. This is more canonical, and also avoids regressions in unoptimized builds from [ConstantFold] Remove non-trivial gep-of-gep fold #93823.

This makes codegen for array initialization simpler in two ways:
1. Drop the zero-index GEP at the start, which is no longer
needed with opaque pointers.
2. Emit GEPs directly to the correct element, instead of having a
long chain of +1 GEPs. This is more canonical, and also avoids
regressions in unoptimized builds from llvm#93823.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen coroutines C++20 coroutines clang:openmp OpenMP related changes to Clang labels May 31, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented May 31, 2024

@llvm/pr-subscribers-clang

Author: Nikita Popov (nikic)

Changes

This makes codegen for array initialization simpler in two ways:

  1. Drop the zero-index GEP at the start, which is no longer needed with opaque pointers.
  2. Emit GEPs directly to the correct element, instead of having a long chain of +1 GEPs. This is more canonical, and also avoids regressions in unoptimized builds from #93823.

Patch is 407.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93956.diff

76 Files Affected:

  • (modified) clang/lib/CodeGen/CGExprAgg.cpp (+9-20)
  • (modified) clang/test/CodeGen/paren-list-agg-init.cpp (+18-26)
  • (modified) clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp (+15-17)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-references.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp (+18-24)
  • (modified) clang/test/CodeGenCXX/partial-destruction.cpp (+17-21)
  • (modified) clang/test/CodeGenCXX/temporaries.cpp (+2-3)
  • (modified) clang/test/CodeGenCXX/value-init.cpp (+12-11)
  • (modified) clang/test/CodeGenCoroutines/coro-suspend-cleanups.cpp (+7-10)
  • (modified) clang/test/CodeGenObjC/arc-ternary-op.m (+4-6)
  • (modified) clang/test/CodeGenObjC/arc.m (+2-3)
  • (modified) clang/test/CodeGenObjCXX/arc-exceptions.mm (+8-11)
  • (modified) clang/test/OpenMP/distribute_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_firstprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_private_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_simd_firstprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_simd_private_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/for_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/for_reduction_codegen.cpp (+6-8)
  • (modified) clang/test/OpenMP/for_reduction_codegen_UDR.cpp (+12-16)
  • (modified) clang/test/OpenMP/parallel_copyin_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_firstprivate_codegen.cpp (+12-18)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_lastprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_simd_firstprivate_codegen.cpp (+12-18)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_simd_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/parallel_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_reduction_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/sections_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/sections_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/sections_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/sections_reduction_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/simd_private_taskloop_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/single_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/single_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/teams_distribute_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/threadprivate_codegen.cpp (+228-240)
  • (modified) clang/test/PCH/cxx_paren_init.cpp (+4-5)
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index 5b2039af6128b..b2a5ceeeae08b 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -513,15 +513,6 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
 
   QualType elementType =
       CGF.getContext().getAsArrayType(ArrayQTy)->getElementType();
-
-  // DestPtr is an array*.  Construct an elementType* by drilling
-  // down a level.
-  llvm::Value *zero = llvm::ConstantInt::get(CGF.SizeTy, 0);
-  llvm::Value *indices[] = { zero, zero };
-  llvm::Value *begin = Builder.CreateInBoundsGEP(DestPtr.getElementType(),
-                                                 DestPtr.emitRawPointer(CGF),
-                                                 indices, "arrayinit.begin");
-
   CharUnits elementSize = CGF.getContext().getTypeSizeInChars(elementType);
   CharUnits elementAlign =
     DestPtr.getAlignment().alignmentOfArrayElement(elementSize);
@@ -562,6 +553,7 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
   Address endOfInit = Address::invalid();
   CodeGenFunction::CleanupDeactivationScope deactivation(CGF);
 
+  llvm::Value *begin = DestPtr.emitRawPointer(CGF);
   if (dtorKind) {
     CodeGenFunction::AllocaTrackerRAII allocaTracker(CGF);
     // In principle we could tell the cleanup where we are more
@@ -585,19 +577,13 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
 
   llvm::Value *one = llvm::ConstantInt::get(CGF.SizeTy, 1);
 
-  // The 'current element to initialize'.  The invariants on this
-  // variable are complicated.  Essentially, after each iteration of
-  // the loop, it points to the last initialized element, except
-  // that it points to the beginning of the array before any
-  // elements have been initialized.
-  llvm::Value *element = begin;
-
   // Emit the explicit initializers.
   for (uint64_t i = 0; i != NumInitElements; ++i) {
-    // Advance to the next element.
+    llvm::Value *element = begin;
     if (i > 0) {
-      element = Builder.CreateInBoundsGEP(
-          llvmElementType, element, one, "arrayinit.element");
+      element = Builder.CreateInBoundsGEP(llvmElementType, begin,
+                                          llvm::ConstantInt::get(CGF.SizeTy, i),
+                                          "arrayinit.element");
 
       // Tell the cleanup that it needs to destroy up to this
       // element.  TODO: some of these stores can be trivially
@@ -624,9 +610,12 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
     //   do { *array++ = filler; } while (array != end);
 
     // Advance to the start of the rest of the array.
+    llvm::Value *element = begin;
     if (NumInitElements) {
       element = Builder.CreateInBoundsGEP(
-          llvmElementType, element, one, "arrayinit.start");
+          llvmElementType, element,
+          llvm::ConstantInt::get(CGF.SizeTy, NumInitElements),
+          "arrayinit.start");
       if (endOfInit.isValid()) Builder.CreateStore(element, endOfInit);
     }
 
diff --git a/clang/test/CodeGen/paren-list-agg-init.cpp b/clang/test/CodeGen/paren-list-agg-init.cpp
index 94d42431d125d..88b1834d42d87 100644
--- a/clang/test/CodeGen/paren-list-agg-init.cpp
+++ b/clang/test/CodeGen/paren-list-agg-init.cpp
@@ -271,14 +271,13 @@ const int* foo10() {
 // CHECK-NEXT: [[ARR_2:%.*]] = alloca [4 x i32], align 16
 // CHECK-NEXT: store i32 [[A:%.*]], ptr [[A_ADDR]], align 4
 // CHECK-NEXT: store i32 [[B:%.*]], ptr [[B_ADDR]], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [4 x i32], ptr [[ARR_2]], i64 0, i64 0
 // CHECK-NEXT: [[TMP_0:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_ELEM:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
+// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARR_2]], align 4
+// CHECK-NEXT: [[ARRINIT_ELEM:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 1
 // CHECK-NEXT: [[TMP_1:%.*]] = load i32, ptr [[B_ADDR]], align 4
 // CHECK-NEXT: store i32 [[TMP_1]], ptr [[ARRINIT_ELEM]], align 4
-// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_ELEM]], i64 1
-// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 4
+// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 2
+// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 4
 // CHECK-NEXT: br label [[ARRINIT_BODY:%.*]]
 // CHECK: [[ARRINIT_CUR:%.*]] = phi ptr [ [[ARRINIT_START]], %entry ], [ [[ARRINIT_NEXT:%.*]], [[ARRINIT_BODY]] ]
 // CHECK-NEXT: store i32 0, ptr [[ARRINIT_CUR]], align 4
@@ -297,10 +296,9 @@ void foo11(int a, int b) {
 // CHECK-NEXT: [[ARR_3:%.*]] = alloca [2 x i32], align 4
 // CHECK-NEXT: store i32 [[A:%.*]], ptr [[A_ADDR]], align 4
 // CHECK-NEXT: store i32 [[B:%.*]], ptr [[B_ADDR]], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARR_3]], i64 0, i64 0
 // CHECK-NEXT: [[TMP_0:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
+// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARR_3]], align 4
+// CHECK-NEXT: [[ARRINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARR_3]], i64 1
 // CHECK-NEXT: [[TMP_1:%.*]] = load i32, ptr [[B_ADDR]], align 4
 // CHECK-NEXT: store i32 [[TMP_1]], ptr [[ARRINIT_ELEMENT]], align 4
 // CHECK-NEXT: ret void
@@ -336,8 +334,7 @@ const int* foo15() {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[ARR_6:%.*arr6.*]] = alloca ptr, align 8
 // CHECK-NEXT: [[REF_TMP:%.*]] = alloca [1 x i32], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [1 x i32], ptr [[REF_TMP]], i64 0, i64 0
-// CHECK-NEXT: store i32 3, ptr [[ARRINIT_BEGIN]], align 4
+// CHECK-NEXT: store i32 3, ptr [[REF_TMP]], align 4
 // CHECK-NEXT: store ptr [[REF_TMP]], ptr [[ARR_6]], align 8
 // CHECK-NEXT: ret void
 void foo16() {
@@ -348,10 +345,9 @@ void foo16() {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[ARR_7:%.*arr7.*]] = alloca ptr, align 8
 // CHECK-NEXT: [[REF_TMP:%.*]] = alloca [2 x i32], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[REF_TMP]], i64 0, i64 0
-// CHECK-NEXT: store i32 4, ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
-// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 2
+// CHECK-NEXT: store i32 4, ptr [[REF_TMP]], align 4
+// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[REF_TMP]], i64 1
+// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[REF_TMP]], i64 2
 // CHECK-NEXT: br label [[ARRINIT_BODY]]
 // CHECK: [[ARRINIT_CUR:%.*]] = phi ptr [ [[ARRINIT_START]], %entry ], [ [[ARRINIT_NEXT:%.*]], [[ARRINIT_BODY]] ]
 // CHECK-NEXT: store i32 0, ptr [[ARRINIT_CUR]], align 4
@@ -533,14 +529,12 @@ namespace gh68198 {
   // CHECK-NEXT: entry
   // CHECK-NEXT: [[ARR_10:%.*arr9.*]] = alloca ptr, align 8
   // CHECK-NEXT: [[CALL_PTR]] = call noalias noundef nonnull ptr @_Znam(i64 noundef 16)
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[CALL]], i64 0, i64 0
-  // CHECK-NEXT: store i32 1, ptr [[ARRAYINIT_BEGIN]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN]], i64 1
+  // CHECK-NEXT: store i32 1, ptr [[CALL]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[CALL]], i64 1
   // CHECK-NEXT: store i32 2, ptr [[ARRAYINIT_ELEMENT]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT:%.*]] = getelementptr inbounds [2 x i32], ptr %call, i64 1
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN1:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 0, i64 0
-  // CHECK-NEXT: store i32 3, ptr [[ARRAYINIT_BEGIN1]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN1]], i64 1
+  // CHECK-NEXT: store i32 3, ptr [[ARRAY_EXP_NEXT]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store i32 4, ptr [[ARRAYINIT_ELEMENT2]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT3:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store ptr [[CALL_PTR]], ptr [[ARR_10]], align 8
@@ -553,14 +547,12 @@ namespace gh68198 {
   // CHECK-NEXT: entry
   // CHECK-NEXT: [[ARR_10:%.*arr10.*]] = alloca ptr, align 8
   // CHECK-NEXT: [[CALL_PTR]] = call noalias noundef nonnull ptr @_Znam(i64 noundef 32)
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[CALL]], i64 0, i64 0
-  // CHECK-NEXT: store i32 5, ptr [[ARRAYINIT_BEGIN]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN]], i64 1
+  // CHECK-NEXT: store i32 5, ptr [[CALL]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[CALL]], i64 1
   // CHECK-NEXT: store i32 6, ptr [[ARRAYINIT_ELEMENT]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT:%.*]] = getelementptr inbounds [2 x i32], ptr %call, i64 1
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN1:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 0, i64 0
-  // CHECK-NEXT: store i32 7, ptr [[ARRAYINIT_BEGIN1]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN1]], i64 1
+  // CHECK-NEXT: store i32 7, ptr [[ARRAY_EXP_NEXT]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store i32 8, ptr [[ARRAYINIT_ELEMENT2]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT3:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 4 [[ARRAY_EXP_NEXT3]], i8 0, i64 16, i1 false)
diff --git a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
index ac466ee5bba40..4eafa720e0cb4 100644
--- a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
+++ b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
@@ -158,16 +158,15 @@ void ArrayInit() {
   // CHECK-LABEL: define dso_local void @_Z9ArrayInitv()
   // CHECK: %arrayinit.endOfInit = alloca ptr, align 8
   // CHECK: %cleanup.dest.slot = alloca i32, align 4
-  // CHECK: %arrayinit.begin = getelementptr inbounds [4 x %struct.Printy], ptr %arr, i64 0, i64 0
-  // CHECK: store ptr %arrayinit.begin, ptr %arrayinit.endOfInit, align 8
+  // CHECK: store ptr %arr, ptr %arrayinit.endOfInit, align 8
   Printy arr[4] = {
     Printy("a"),
-    // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) %arrayinit.begin, ptr noundef @.str)
-    // CHECK: [[ARRAYINIT_ELEMENT1:%.+]] = getelementptr inbounds %struct.Printy, ptr %arrayinit.begin, i64 1
+    // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) %arr, ptr noundef @.str)
+    // CHECK: [[ARRAYINIT_ELEMENT1:%.+]] = getelementptr inbounds %struct.Printy, ptr %arr, i64 1
     // CHECK: store ptr [[ARRAYINIT_ELEMENT1]], ptr %arrayinit.endOfInit, align 8
     Printy("b"),
     // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) [[ARRAYINIT_ELEMENT1]], ptr noundef @.str.1)
-    // CHECK: [[ARRAYINIT_ELEMENT2:%.+]] = getelementptr inbounds %struct.Printy, ptr [[ARRAYINIT_ELEMENT1]], i64 1
+    // CHECK: [[ARRAYINIT_ELEMENT2:%.+]] = getelementptr inbounds %struct.Printy, ptr %arr, i64 2
     // CHECK: store ptr [[ARRAYINIT_ELEMENT2]], ptr %arrayinit.endOfInit, align 8
     ({
     // CHECK: br i1 {{.*}}, label %if.then, label %if.end
@@ -180,7 +179,7 @@ void ArrayInit() {
       // CHECK:       if.end:
       Printy("c");
       // CHECK-NEXT:    call void @_ZN6PrintyC1EPKc
-      // CHECK-NEXT:    %arrayinit.element2 = getelementptr inbounds %struct.Printy, ptr %arrayinit.element1, i64 1
+      // CHECK-NEXT:    %arrayinit.element2 = getelementptr inbounds %struct.Printy, ptr %arr, i64 3
       // CHECK-NEXT:    store ptr %arrayinit.element2, ptr %arrayinit.endOfInit, align 8
     }),
     ({
@@ -212,14 +211,14 @@ void ArrayInit() {
 
   // CHECK:       cleanup:
   // CHECK-NEXT:    %1 = load ptr, ptr %arrayinit.endOfInit, align 8
-  // CHECK-NEXT:    %arraydestroy.isempty = icmp eq ptr %arrayinit.begin, %1
+  // CHECK-NEXT:    %arraydestroy.isempty = icmp eq ptr %arr, %1
   // CHECK-NEXT:    br i1 %arraydestroy.isempty, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2:.+]]
 
   // CHECK:       [[ARRAY_DESTROY_BODY2]]:
   // CHECK-NEXT:    %arraydestroy.elementPast = phi ptr [ %1, %cleanup ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY2]] ]
   // CHECK-NEXT:    %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1
   // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element)
-  // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arrayinit.begin
+  // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr
   // CHECK-NEXT:    br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE2]], label %[[ARRAY_DESTROY_BODY2]]
 
   // CHECK:       [[ARRAY_DESTROY_DONE2]]:
@@ -238,8 +237,7 @@ void ArraySubobjects() {
       // CHECK: call void @_ZN6PrintyC1EPKc
       // CHECK: call void @_ZN6PrintyC1EPKc
       {Printy("a"),
-      // CHECK: [[ARRAYINIT_BEGIN:%.+]] = getelementptr inbounds [2 x %struct.Printy]
-      // CHECK: store ptr [[ARRAYINIT_BEGIN]], ptr %arrayinit.endOfInit, align 8
+      // CHECK: store ptr %arr2, ptr %arrayinit.endOfInit, align 8
       // CHECK: call void @_ZN6PrintyC1EPKc
       // CHECK: [[ARRAYINIT_ELEMENT:%.+]] = getelementptr inbounds %struct.Printy
       // CHECK: store ptr [[ARRAYINIT_ELEMENT]], ptr %arrayinit.endOfInit, align 8
@@ -248,7 +246,7 @@ void ArraySubobjects() {
            return;
            // CHECK:      if.then:
            // CHECK-NEXT:   [[V0:%.+]] = load ptr, ptr %arrayinit.endOfInit, align 8
-           // CHECK-NEXT:   %arraydestroy.isempty = icmp eq ptr [[ARRAYINIT_BEGIN]], [[V0]]
+           // CHECK-NEXT:   %arraydestroy.isempty = icmp eq ptr %arr2, [[V0]]
            // CHECK-NEXT:   br i1 %arraydestroy.isempty, label %[[ARRAY_DESTROY_DONE:.+]], label %[[ARRAY_DESTROY_BODY:.+]]
          }
          Printy("b");
@@ -268,7 +266,7 @@ void ArraySubobjects() {
     // CHECK-NEXT:    %arraydestroy.elementPast = phi ptr [ %0, %if.then ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY]] ]
     // CHECK-NEXT:    %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1
     // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element)
-    // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, [[ARRAYINIT_BEGIN]]
+    // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr2
     // CHECK-NEXT:    br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE]], label %[[ARRAY_DESTROY_BODY]]
 
     // CHECK:       [[ARRAY_DESTROY_DONE]]
@@ -277,11 +275,11 @@ void ArraySubobjects() {
     // CHECK-NEXT:    br label %[[ARRAY_DESTROY_BODY2:.+]]
 
     // CHECK:       [[ARRAY_DESTROY_BODY2]]:
-    // CHECK-NEXT:    %arraydestroy.elementPast5 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element6, %[[ARRAY_DESTROY_BODY2]] ]
-    // CHECK-NEXT:    %arraydestroy.element6 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast5, i64 -1
-    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element6)
-    // CHECK-NEXT:    %arraydestroy.done7 = icmp eq ptr %arraydestroy.element6, [[ARRAY_BEGIN]]
-    // CHECK-NEXT:    br i1 %arraydestroy.done7, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]]
+    // CHECK-NEXT:    %arraydestroy.elementPast4 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element5, %[[ARRAY_DESTROY_BODY2]] ]
+    // CHECK-NEXT:    %arraydestroy.element5 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast4, i64 -1
+    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element5)
+    // CHECK-NEXT:    %arraydestroy.done6 = icmp eq ptr %arraydestroy.element5, [[ARRAY_BEGIN]]
+    // CHECK-NEXT:    br i1 %arraydestroy.done6, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]]
 
 
     // CHECK:     [[ARRAY_DESTROY_DONE2]]:
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
index b19ca1d861b11..419525d0544d5 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
@@ -52,11 +52,10 @@ namespace reference {
     // CHECK-NEXT: store ptr %{{.*}}, ptr %{{.*}}, align
     const A &ra1{1, i};
 
-    // CHECK-NEXT: getelementptr inbounds [3 x i32], ptr %{{.*}}, i{{32|64}} 0, i{{32|64}} 0
     // CHECK-NEXT: store i32 1
     // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 1
     // CHECK-NEXT: store i32 2
-    // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 1
+    // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 2
     // CHECK-NEXT: %[[I2:.*]] = load i32, ptr
     // CHECK-NEXT: store i32 %[[I2]]
     // CHECK-NEXT: store ptr %{{.*}}, ptr %{{.*}}, align
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
index d9e4c5d5403c2..e2d566121331f 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
@@ -40,8 +40,7 @@ void fn1(int i) {
   // CHECK-LABEL: define{{.*}} void @_Z3fn1i
   // temporary array
   // CHECK: [[array:%[^ ]+]] = alloca [3 x i32]
-  // CHECK: getelementptr inbounds [3 x i32], ptr [[array]], i{{32|64}} 0
-  // CHECK-NEXT: store i32 1, ptr
+  // CHECK:      store i32 1, ptr
   // CHECK-NEXT: getelementptr
   // CHECK-NEXT: store
   // CHECK-NEXT: getelementptr
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
index aa2f078a5fb0c..3d0cf968bfd54 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -122,8 +122,7 @@ void fn1(int i) {
   // X86: [[array:%[^ ]+]] = alloca [3 x i32]
   // AMDGCN: [[alloca:%[^ ]+]] = alloca [3 x i32], align 4, addrspace(5)
   // AMDGCN: [[array:%[^ ]+]] ={{.*}} addrspacecast ptr addrspace(5) [[alloca]] to ptr
-  // CHECK: getelementptr inbounds [3 x i32], ptr [[array]], i{{32|64}} 0
-  // CHECK-NEXT: store i32 1, ptr
+  // CHECK:      store i32 1, ptr
   // CHECK-NEXT: getelementptr
   // CHECK-NEXT: store
   // CHECK-NEXT: getelementptr
diff --git a/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp b/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
index 48ee019a55f55..d86c712372cd8 100644
--- a/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
+++ b/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
@@ -16,22 +16,20 @@ void *p = new S[2][3]{ { 1, 2, 3 }, { 4, 5, 6 } };
 // { 1, 2, 3 }
 //
 //
-// CHECK: %[[S_0_0:.*]] = getelementptr inbounds [3 x %[[S:.*]]], ptr %[[START_AS_i8]], i64 0, i64 0
-// CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_0]], i32 noundef 1)
-// CHECK: %[[S_0_1:.*]] = getelementptr inbounds %[[S]], ptr %[[S_0_0]], i64 1
+// CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[START_AS_i8]], i32 noundef 1)
+// CHECK: %[[S_0_1:.*]] = getelementptr inbounds %[[S:.+]], ptr %[[START_AS_i8]], i64 1
 // CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_1]], i32 noundef 2)
-// CHECK: %[[S_0_2:.*]] = getelementptr inbounds %[[S]], ptr %[[S_0_1]], i64 1
+// CHECK: %[[S_0_2:.*]] = getelementptr inbounds %[[S]], ptr %[[START_AS_i8]], i64 2
 // CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_2...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented May 31, 2024

@llvm/pr-subscribers-coroutines

Author: Nikita Popov (nikic)

Changes

This makes codegen for array initialization simpler in two ways:

  1. Drop the zero-index GEP at the start, which is no longer needed with opaque pointers.
  2. Emit GEPs directly to the correct element, instead of having a long chain of +1 GEPs. This is more canonical, and also avoids regressions in unoptimized builds from #93823.

Patch is 407.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93956.diff

76 Files Affected:

  • (modified) clang/lib/CodeGen/CGExprAgg.cpp (+9-20)
  • (modified) clang/test/CodeGen/paren-list-agg-init.cpp (+18-26)
  • (modified) clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp (+15-17)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-references.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp (+18-24)
  • (modified) clang/test/CodeGenCXX/partial-destruction.cpp (+17-21)
  • (modified) clang/test/CodeGenCXX/temporaries.cpp (+2-3)
  • (modified) clang/test/CodeGenCXX/value-init.cpp (+12-11)
  • (modified) clang/test/CodeGenCoroutines/coro-suspend-cleanups.cpp (+7-10)
  • (modified) clang/test/CodeGenObjC/arc-ternary-op.m (+4-6)
  • (modified) clang/test/CodeGenObjC/arc.m (+2-3)
  • (modified) clang/test/CodeGenObjCXX/arc-exceptions.mm (+8-11)
  • (modified) clang/test/OpenMP/distribute_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_firstprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_private_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_simd_firstprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_simd_private_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/for_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/for_reduction_codegen.cpp (+6-8)
  • (modified) clang/test/OpenMP/for_reduction_codegen_UDR.cpp (+12-16)
  • (modified) clang/test/OpenMP/parallel_copyin_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_firstprivate_codegen.cpp (+12-18)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_lastprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_simd_firstprivate_codegen.cpp (+12-18)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_simd_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/parallel_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_reduction_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/sections_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/sections_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/sections_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/sections_reduction_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/simd_private_taskloop_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/single_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/single_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/teams_distribute_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/threadprivate_codegen.cpp (+228-240)
  • (modified) clang/test/PCH/cxx_paren_init.cpp (+4-5)
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index 5b2039af6128b..b2a5ceeeae08b 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -513,15 +513,6 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
 
   QualType elementType =
       CGF.getContext().getAsArrayType(ArrayQTy)->getElementType();
-
-  // DestPtr is an array*.  Construct an elementType* by drilling
-  // down a level.
-  llvm::Value *zero = llvm::ConstantInt::get(CGF.SizeTy, 0);
-  llvm::Value *indices[] = { zero, zero };
-  llvm::Value *begin = Builder.CreateInBoundsGEP(DestPtr.getElementType(),
-                                                 DestPtr.emitRawPointer(CGF),
-                                                 indices, "arrayinit.begin");
-
   CharUnits elementSize = CGF.getContext().getTypeSizeInChars(elementType);
   CharUnits elementAlign =
     DestPtr.getAlignment().alignmentOfArrayElement(elementSize);
@@ -562,6 +553,7 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
   Address endOfInit = Address::invalid();
   CodeGenFunction::CleanupDeactivationScope deactivation(CGF);
 
+  llvm::Value *begin = DestPtr.emitRawPointer(CGF);
   if (dtorKind) {
     CodeGenFunction::AllocaTrackerRAII allocaTracker(CGF);
     // In principle we could tell the cleanup where we are more
@@ -585,19 +577,13 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
 
   llvm::Value *one = llvm::ConstantInt::get(CGF.SizeTy, 1);
 
-  // The 'current element to initialize'.  The invariants on this
-  // variable are complicated.  Essentially, after each iteration of
-  // the loop, it points to the last initialized element, except
-  // that it points to the beginning of the array before any
-  // elements have been initialized.
-  llvm::Value *element = begin;
-
   // Emit the explicit initializers.
   for (uint64_t i = 0; i != NumInitElements; ++i) {
-    // Advance to the next element.
+    llvm::Value *element = begin;
     if (i > 0) {
-      element = Builder.CreateInBoundsGEP(
-          llvmElementType, element, one, "arrayinit.element");
+      element = Builder.CreateInBoundsGEP(llvmElementType, begin,
+                                          llvm::ConstantInt::get(CGF.SizeTy, i),
+                                          "arrayinit.element");
 
       // Tell the cleanup that it needs to destroy up to this
       // element.  TODO: some of these stores can be trivially
@@ -624,9 +610,12 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
     //   do { *array++ = filler; } while (array != end);
 
     // Advance to the start of the rest of the array.
+    llvm::Value *element = begin;
     if (NumInitElements) {
       element = Builder.CreateInBoundsGEP(
-          llvmElementType, element, one, "arrayinit.start");
+          llvmElementType, element,
+          llvm::ConstantInt::get(CGF.SizeTy, NumInitElements),
+          "arrayinit.start");
       if (endOfInit.isValid()) Builder.CreateStore(element, endOfInit);
     }
 
diff --git a/clang/test/CodeGen/paren-list-agg-init.cpp b/clang/test/CodeGen/paren-list-agg-init.cpp
index 94d42431d125d..88b1834d42d87 100644
--- a/clang/test/CodeGen/paren-list-agg-init.cpp
+++ b/clang/test/CodeGen/paren-list-agg-init.cpp
@@ -271,14 +271,13 @@ const int* foo10() {
 // CHECK-NEXT: [[ARR_2:%.*]] = alloca [4 x i32], align 16
 // CHECK-NEXT: store i32 [[A:%.*]], ptr [[A_ADDR]], align 4
 // CHECK-NEXT: store i32 [[B:%.*]], ptr [[B_ADDR]], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [4 x i32], ptr [[ARR_2]], i64 0, i64 0
 // CHECK-NEXT: [[TMP_0:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_ELEM:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
+// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARR_2]], align 4
+// CHECK-NEXT: [[ARRINIT_ELEM:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 1
 // CHECK-NEXT: [[TMP_1:%.*]] = load i32, ptr [[B_ADDR]], align 4
 // CHECK-NEXT: store i32 [[TMP_1]], ptr [[ARRINIT_ELEM]], align 4
-// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_ELEM]], i64 1
-// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 4
+// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 2
+// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 4
 // CHECK-NEXT: br label [[ARRINIT_BODY:%.*]]
 // CHECK: [[ARRINIT_CUR:%.*]] = phi ptr [ [[ARRINIT_START]], %entry ], [ [[ARRINIT_NEXT:%.*]], [[ARRINIT_BODY]] ]
 // CHECK-NEXT: store i32 0, ptr [[ARRINIT_CUR]], align 4
@@ -297,10 +296,9 @@ void foo11(int a, int b) {
 // CHECK-NEXT: [[ARR_3:%.*]] = alloca [2 x i32], align 4
 // CHECK-NEXT: store i32 [[A:%.*]], ptr [[A_ADDR]], align 4
 // CHECK-NEXT: store i32 [[B:%.*]], ptr [[B_ADDR]], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARR_3]], i64 0, i64 0
 // CHECK-NEXT: [[TMP_0:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
+// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARR_3]], align 4
+// CHECK-NEXT: [[ARRINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARR_3]], i64 1
 // CHECK-NEXT: [[TMP_1:%.*]] = load i32, ptr [[B_ADDR]], align 4
 // CHECK-NEXT: store i32 [[TMP_1]], ptr [[ARRINIT_ELEMENT]], align 4
 // CHECK-NEXT: ret void
@@ -336,8 +334,7 @@ const int* foo15() {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[ARR_6:%.*arr6.*]] = alloca ptr, align 8
 // CHECK-NEXT: [[REF_TMP:%.*]] = alloca [1 x i32], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [1 x i32], ptr [[REF_TMP]], i64 0, i64 0
-// CHECK-NEXT: store i32 3, ptr [[ARRINIT_BEGIN]], align 4
+// CHECK-NEXT: store i32 3, ptr [[REF_TMP]], align 4
 // CHECK-NEXT: store ptr [[REF_TMP]], ptr [[ARR_6]], align 8
 // CHECK-NEXT: ret void
 void foo16() {
@@ -348,10 +345,9 @@ void foo16() {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[ARR_7:%.*arr7.*]] = alloca ptr, align 8
 // CHECK-NEXT: [[REF_TMP:%.*]] = alloca [2 x i32], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[REF_TMP]], i64 0, i64 0
-// CHECK-NEXT: store i32 4, ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
-// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 2
+// CHECK-NEXT: store i32 4, ptr [[REF_TMP]], align 4
+// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[REF_TMP]], i64 1
+// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[REF_TMP]], i64 2
 // CHECK-NEXT: br label [[ARRINIT_BODY]]
 // CHECK: [[ARRINIT_CUR:%.*]] = phi ptr [ [[ARRINIT_START]], %entry ], [ [[ARRINIT_NEXT:%.*]], [[ARRINIT_BODY]] ]
 // CHECK-NEXT: store i32 0, ptr [[ARRINIT_CUR]], align 4
@@ -533,14 +529,12 @@ namespace gh68198 {
   // CHECK-NEXT: entry
   // CHECK-NEXT: [[ARR_10:%.*arr9.*]] = alloca ptr, align 8
   // CHECK-NEXT: [[CALL_PTR]] = call noalias noundef nonnull ptr @_Znam(i64 noundef 16)
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[CALL]], i64 0, i64 0
-  // CHECK-NEXT: store i32 1, ptr [[ARRAYINIT_BEGIN]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN]], i64 1
+  // CHECK-NEXT: store i32 1, ptr [[CALL]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[CALL]], i64 1
   // CHECK-NEXT: store i32 2, ptr [[ARRAYINIT_ELEMENT]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT:%.*]] = getelementptr inbounds [2 x i32], ptr %call, i64 1
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN1:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 0, i64 0
-  // CHECK-NEXT: store i32 3, ptr [[ARRAYINIT_BEGIN1]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN1]], i64 1
+  // CHECK-NEXT: store i32 3, ptr [[ARRAY_EXP_NEXT]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store i32 4, ptr [[ARRAYINIT_ELEMENT2]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT3:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store ptr [[CALL_PTR]], ptr [[ARR_10]], align 8
@@ -553,14 +547,12 @@ namespace gh68198 {
   // CHECK-NEXT: entry
   // CHECK-NEXT: [[ARR_10:%.*arr10.*]] = alloca ptr, align 8
   // CHECK-NEXT: [[CALL_PTR]] = call noalias noundef nonnull ptr @_Znam(i64 noundef 32)
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[CALL]], i64 0, i64 0
-  // CHECK-NEXT: store i32 5, ptr [[ARRAYINIT_BEGIN]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN]], i64 1
+  // CHECK-NEXT: store i32 5, ptr [[CALL]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[CALL]], i64 1
   // CHECK-NEXT: store i32 6, ptr [[ARRAYINIT_ELEMENT]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT:%.*]] = getelementptr inbounds [2 x i32], ptr %call, i64 1
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN1:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 0, i64 0
-  // CHECK-NEXT: store i32 7, ptr [[ARRAYINIT_BEGIN1]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN1]], i64 1
+  // CHECK-NEXT: store i32 7, ptr [[ARRAY_EXP_NEXT]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store i32 8, ptr [[ARRAYINIT_ELEMENT2]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT3:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 4 [[ARRAY_EXP_NEXT3]], i8 0, i64 16, i1 false)
diff --git a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
index ac466ee5bba40..4eafa720e0cb4 100644
--- a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
+++ b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
@@ -158,16 +158,15 @@ void ArrayInit() {
   // CHECK-LABEL: define dso_local void @_Z9ArrayInitv()
   // CHECK: %arrayinit.endOfInit = alloca ptr, align 8
   // CHECK: %cleanup.dest.slot = alloca i32, align 4
-  // CHECK: %arrayinit.begin = getelementptr inbounds [4 x %struct.Printy], ptr %arr, i64 0, i64 0
-  // CHECK: store ptr %arrayinit.begin, ptr %arrayinit.endOfInit, align 8
+  // CHECK: store ptr %arr, ptr %arrayinit.endOfInit, align 8
   Printy arr[4] = {
     Printy("a"),
-    // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) %arrayinit.begin, ptr noundef @.str)
-    // CHECK: [[ARRAYINIT_ELEMENT1:%.+]] = getelementptr inbounds %struct.Printy, ptr %arrayinit.begin, i64 1
+    // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) %arr, ptr noundef @.str)
+    // CHECK: [[ARRAYINIT_ELEMENT1:%.+]] = getelementptr inbounds %struct.Printy, ptr %arr, i64 1
     // CHECK: store ptr [[ARRAYINIT_ELEMENT1]], ptr %arrayinit.endOfInit, align 8
     Printy("b"),
     // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) [[ARRAYINIT_ELEMENT1]], ptr noundef @.str.1)
-    // CHECK: [[ARRAYINIT_ELEMENT2:%.+]] = getelementptr inbounds %struct.Printy, ptr [[ARRAYINIT_ELEMENT1]], i64 1
+    // CHECK: [[ARRAYINIT_ELEMENT2:%.+]] = getelementptr inbounds %struct.Printy, ptr %arr, i64 2
     // CHECK: store ptr [[ARRAYINIT_ELEMENT2]], ptr %arrayinit.endOfInit, align 8
     ({
     // CHECK: br i1 {{.*}}, label %if.then, label %if.end
@@ -180,7 +179,7 @@ void ArrayInit() {
       // CHECK:       if.end:
       Printy("c");
       // CHECK-NEXT:    call void @_ZN6PrintyC1EPKc
-      // CHECK-NEXT:    %arrayinit.element2 = getelementptr inbounds %struct.Printy, ptr %arrayinit.element1, i64 1
+      // CHECK-NEXT:    %arrayinit.element2 = getelementptr inbounds %struct.Printy, ptr %arr, i64 3
       // CHECK-NEXT:    store ptr %arrayinit.element2, ptr %arrayinit.endOfInit, align 8
     }),
     ({
@@ -212,14 +211,14 @@ void ArrayInit() {
 
   // CHECK:       cleanup:
   // CHECK-NEXT:    %1 = load ptr, ptr %arrayinit.endOfInit, align 8
-  // CHECK-NEXT:    %arraydestroy.isempty = icmp eq ptr %arrayinit.begin, %1
+  // CHECK-NEXT:    %arraydestroy.isempty = icmp eq ptr %arr, %1
   // CHECK-NEXT:    br i1 %arraydestroy.isempty, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2:.+]]
 
   // CHECK:       [[ARRAY_DESTROY_BODY2]]:
   // CHECK-NEXT:    %arraydestroy.elementPast = phi ptr [ %1, %cleanup ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY2]] ]
   // CHECK-NEXT:    %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1
   // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element)
-  // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arrayinit.begin
+  // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr
   // CHECK-NEXT:    br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE2]], label %[[ARRAY_DESTROY_BODY2]]
 
   // CHECK:       [[ARRAY_DESTROY_DONE2]]:
@@ -238,8 +237,7 @@ void ArraySubobjects() {
       // CHECK: call void @_ZN6PrintyC1EPKc
       // CHECK: call void @_ZN6PrintyC1EPKc
       {Printy("a"),
-      // CHECK: [[ARRAYINIT_BEGIN:%.+]] = getelementptr inbounds [2 x %struct.Printy]
-      // CHECK: store ptr [[ARRAYINIT_BEGIN]], ptr %arrayinit.endOfInit, align 8
+      // CHECK: store ptr %arr2, ptr %arrayinit.endOfInit, align 8
       // CHECK: call void @_ZN6PrintyC1EPKc
       // CHECK: [[ARRAYINIT_ELEMENT:%.+]] = getelementptr inbounds %struct.Printy
       // CHECK: store ptr [[ARRAYINIT_ELEMENT]], ptr %arrayinit.endOfInit, align 8
@@ -248,7 +246,7 @@ void ArraySubobjects() {
            return;
            // CHECK:      if.then:
            // CHECK-NEXT:   [[V0:%.+]] = load ptr, ptr %arrayinit.endOfInit, align 8
-           // CHECK-NEXT:   %arraydestroy.isempty = icmp eq ptr [[ARRAYINIT_BEGIN]], [[V0]]
+           // CHECK-NEXT:   %arraydestroy.isempty = icmp eq ptr %arr2, [[V0]]
            // CHECK-NEXT:   br i1 %arraydestroy.isempty, label %[[ARRAY_DESTROY_DONE:.+]], label %[[ARRAY_DESTROY_BODY:.+]]
          }
          Printy("b");
@@ -268,7 +266,7 @@ void ArraySubobjects() {
     // CHECK-NEXT:    %arraydestroy.elementPast = phi ptr [ %0, %if.then ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY]] ]
     // CHECK-NEXT:    %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1
     // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element)
-    // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, [[ARRAYINIT_BEGIN]]
+    // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr2
     // CHECK-NEXT:    br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE]], label %[[ARRAY_DESTROY_BODY]]
 
     // CHECK:       [[ARRAY_DESTROY_DONE]]
@@ -277,11 +275,11 @@ void ArraySubobjects() {
     // CHECK-NEXT:    br label %[[ARRAY_DESTROY_BODY2:.+]]
 
     // CHECK:       [[ARRAY_DESTROY_BODY2]]:
-    // CHECK-NEXT:    %arraydestroy.elementPast5 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element6, %[[ARRAY_DESTROY_BODY2]] ]
-    // CHECK-NEXT:    %arraydestroy.element6 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast5, i64 -1
-    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element6)
-    // CHECK-NEXT:    %arraydestroy.done7 = icmp eq ptr %arraydestroy.element6, [[ARRAY_BEGIN]]
-    // CHECK-NEXT:    br i1 %arraydestroy.done7, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]]
+    // CHECK-NEXT:    %arraydestroy.elementPast4 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element5, %[[ARRAY_DESTROY_BODY2]] ]
+    // CHECK-NEXT:    %arraydestroy.element5 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast4, i64 -1
+    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element5)
+    // CHECK-NEXT:    %arraydestroy.done6 = icmp eq ptr %arraydestroy.element5, [[ARRAY_BEGIN]]
+    // CHECK-NEXT:    br i1 %arraydestroy.done6, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]]
 
 
     // CHECK:     [[ARRAY_DESTROY_DONE2]]:
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
index b19ca1d861b11..419525d0544d5 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
@@ -52,11 +52,10 @@ namespace reference {
     // CHECK-NEXT: store ptr %{{.*}}, ptr %{{.*}}, align
     const A &ra1{1, i};
 
-    // CHECK-NEXT: getelementptr inbounds [3 x i32], ptr %{{.*}}, i{{32|64}} 0, i{{32|64}} 0
     // CHECK-NEXT: store i32 1
     // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 1
     // CHECK-NEXT: store i32 2
-    // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 1
+    // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 2
     // CHECK-NEXT: %[[I2:.*]] = load i32, ptr
     // CHECK-NEXT: store i32 %[[I2]]
     // CHECK-NEXT: store ptr %{{.*}}, ptr %{{.*}}, align
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
index d9e4c5d5403c2..e2d566121331f 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
@@ -40,8 +40,7 @@ void fn1(int i) {
   // CHECK-LABEL: define{{.*}} void @_Z3fn1i
   // temporary array
   // CHECK: [[array:%[^ ]+]] = alloca [3 x i32]
-  // CHECK: getelementptr inbounds [3 x i32], ptr [[array]], i{{32|64}} 0
-  // CHECK-NEXT: store i32 1, ptr
+  // CHECK:      store i32 1, ptr
   // CHECK-NEXT: getelementptr
   // CHECK-NEXT: store
   // CHECK-NEXT: getelementptr
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
index aa2f078a5fb0c..3d0cf968bfd54 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -122,8 +122,7 @@ void fn1(int i) {
   // X86: [[array:%[^ ]+]] = alloca [3 x i32]
   // AMDGCN: [[alloca:%[^ ]+]] = alloca [3 x i32], align 4, addrspace(5)
   // AMDGCN: [[array:%[^ ]+]] ={{.*}} addrspacecast ptr addrspace(5) [[alloca]] to ptr
-  // CHECK: getelementptr inbounds [3 x i32], ptr [[array]], i{{32|64}} 0
-  // CHECK-NEXT: store i32 1, ptr
+  // CHECK:      store i32 1, ptr
   // CHECK-NEXT: getelementptr
   // CHECK-NEXT: store
   // CHECK-NEXT: getelementptr
diff --git a/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp b/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
index 48ee019a55f55..d86c712372cd8 100644
--- a/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
+++ b/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
@@ -16,22 +16,20 @@ void *p = new S[2][3]{ { 1, 2, 3 }, { 4, 5, 6 } };
 // { 1, 2, 3 }
 //
 //
-// CHECK: %[[S_0_0:.*]] = getelementptr inbounds [3 x %[[S:.*]]], ptr %[[START_AS_i8]], i64 0, i64 0
-// CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_0]], i32 noundef 1)
-// CHECK: %[[S_0_1:.*]] = getelementptr inbounds %[[S]], ptr %[[S_0_0]], i64 1
+// CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[START_AS_i8]], i32 noundef 1)
+// CHECK: %[[S_0_1:.*]] = getelementptr inbounds %[[S:.+]], ptr %[[START_AS_i8]], i64 1
 // CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_1]], i32 noundef 2)
-// CHECK: %[[S_0_2:.*]] = getelementptr inbounds %[[S]], ptr %[[S_0_1]], i64 1
+// CHECK: %[[S_0_2:.*]] = getelementptr inbounds %[[S]], ptr %[[START_AS_i8]], i64 2
 // CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_2...
[truncated]

@llvmbot
Copy link
Collaborator

llvmbot commented May 31, 2024

@llvm/pr-subscribers-clang-codegen

Author: Nikita Popov (nikic)

Changes

This makes codegen for array initialization simpler in two ways:

  1. Drop the zero-index GEP at the start, which is no longer needed with opaque pointers.
  2. Emit GEPs directly to the correct element, instead of having a long chain of +1 GEPs. This is more canonical, and also avoids regressions in unoptimized builds from #93823.

Patch is 407.01 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/93956.diff

76 Files Affected:

  • (modified) clang/lib/CodeGen/CGExprAgg.cpp (+9-20)
  • (modified) clang/test/CodeGen/paren-list-agg-init.cpp (+18-26)
  • (modified) clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp (+15-17)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-references.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp (+1-2)
  • (modified) clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp (+18-24)
  • (modified) clang/test/CodeGenCXX/partial-destruction.cpp (+17-21)
  • (modified) clang/test/CodeGenCXX/temporaries.cpp (+2-3)
  • (modified) clang/test/CodeGenCXX/value-init.cpp (+12-11)
  • (modified) clang/test/CodeGenCoroutines/coro-suspend-cleanups.cpp (+7-10)
  • (modified) clang/test/CodeGenObjC/arc-ternary-op.m (+4-6)
  • (modified) clang/test/CodeGenObjC/arc.m (+2-3)
  • (modified) clang/test/CodeGenObjCXX/arc-exceptions.mm (+8-11)
  • (modified) clang/test/OpenMP/distribute_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_firstprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_parallel_for_simd_private_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/distribute_simd_firstprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/distribute_simd_private_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/for_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/for_reduction_codegen.cpp (+6-8)
  • (modified) clang/test/OpenMP/for_reduction_codegen_UDR.cpp (+12-16)
  • (modified) clang/test/OpenMP/parallel_copyin_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_firstprivate_codegen.cpp (+12-18)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_lastprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_simd_firstprivate_codegen.cpp (+12-18)
  • (modified) clang/test/OpenMP/parallel_master_taskloop_simd_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/parallel_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/parallel_reduction_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/sections_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/sections_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/sections_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/sections_reduction_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/simd_private_taskloop_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/single_firstprivate_codegen.cpp (+2-3)
  • (modified) clang/test/OpenMP/single_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_parallel_for_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/target_teams_distribute_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_firstprivate_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_lastprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/teams_distribute_parallel_for_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_distribute_simd_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_distribute_simd_lastprivate_codegen.cpp (+16-24)
  • (modified) clang/test/OpenMP/teams_distribute_simd_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_firstprivate_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp (+4-6)
  • (modified) clang/test/OpenMP/teams_private_codegen.cpp (+8-12)
  • (modified) clang/test/OpenMP/threadprivate_codegen.cpp (+228-240)
  • (modified) clang/test/PCH/cxx_paren_init.cpp (+4-5)
diff --git a/clang/lib/CodeGen/CGExprAgg.cpp b/clang/lib/CodeGen/CGExprAgg.cpp
index 5b2039af6128b..b2a5ceeeae08b 100644
--- a/clang/lib/CodeGen/CGExprAgg.cpp
+++ b/clang/lib/CodeGen/CGExprAgg.cpp
@@ -513,15 +513,6 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
 
   QualType elementType =
       CGF.getContext().getAsArrayType(ArrayQTy)->getElementType();
-
-  // DestPtr is an array*.  Construct an elementType* by drilling
-  // down a level.
-  llvm::Value *zero = llvm::ConstantInt::get(CGF.SizeTy, 0);
-  llvm::Value *indices[] = { zero, zero };
-  llvm::Value *begin = Builder.CreateInBoundsGEP(DestPtr.getElementType(),
-                                                 DestPtr.emitRawPointer(CGF),
-                                                 indices, "arrayinit.begin");
-
   CharUnits elementSize = CGF.getContext().getTypeSizeInChars(elementType);
   CharUnits elementAlign =
     DestPtr.getAlignment().alignmentOfArrayElement(elementSize);
@@ -562,6 +553,7 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
   Address endOfInit = Address::invalid();
   CodeGenFunction::CleanupDeactivationScope deactivation(CGF);
 
+  llvm::Value *begin = DestPtr.emitRawPointer(CGF);
   if (dtorKind) {
     CodeGenFunction::AllocaTrackerRAII allocaTracker(CGF);
     // In principle we could tell the cleanup where we are more
@@ -585,19 +577,13 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
 
   llvm::Value *one = llvm::ConstantInt::get(CGF.SizeTy, 1);
 
-  // The 'current element to initialize'.  The invariants on this
-  // variable are complicated.  Essentially, after each iteration of
-  // the loop, it points to the last initialized element, except
-  // that it points to the beginning of the array before any
-  // elements have been initialized.
-  llvm::Value *element = begin;
-
   // Emit the explicit initializers.
   for (uint64_t i = 0; i != NumInitElements; ++i) {
-    // Advance to the next element.
+    llvm::Value *element = begin;
     if (i > 0) {
-      element = Builder.CreateInBoundsGEP(
-          llvmElementType, element, one, "arrayinit.element");
+      element = Builder.CreateInBoundsGEP(llvmElementType, begin,
+                                          llvm::ConstantInt::get(CGF.SizeTy, i),
+                                          "arrayinit.element");
 
       // Tell the cleanup that it needs to destroy up to this
       // element.  TODO: some of these stores can be trivially
@@ -624,9 +610,12 @@ void AggExprEmitter::EmitArrayInit(Address DestPtr, llvm::ArrayType *AType,
     //   do { *array++ = filler; } while (array != end);
 
     // Advance to the start of the rest of the array.
+    llvm::Value *element = begin;
     if (NumInitElements) {
       element = Builder.CreateInBoundsGEP(
-          llvmElementType, element, one, "arrayinit.start");
+          llvmElementType, element,
+          llvm::ConstantInt::get(CGF.SizeTy, NumInitElements),
+          "arrayinit.start");
       if (endOfInit.isValid()) Builder.CreateStore(element, endOfInit);
     }
 
diff --git a/clang/test/CodeGen/paren-list-agg-init.cpp b/clang/test/CodeGen/paren-list-agg-init.cpp
index 94d42431d125d..88b1834d42d87 100644
--- a/clang/test/CodeGen/paren-list-agg-init.cpp
+++ b/clang/test/CodeGen/paren-list-agg-init.cpp
@@ -271,14 +271,13 @@ const int* foo10() {
 // CHECK-NEXT: [[ARR_2:%.*]] = alloca [4 x i32], align 16
 // CHECK-NEXT: store i32 [[A:%.*]], ptr [[A_ADDR]], align 4
 // CHECK-NEXT: store i32 [[B:%.*]], ptr [[B_ADDR]], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [4 x i32], ptr [[ARR_2]], i64 0, i64 0
 // CHECK-NEXT: [[TMP_0:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_ELEM:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
+// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARR_2]], align 4
+// CHECK-NEXT: [[ARRINIT_ELEM:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 1
 // CHECK-NEXT: [[TMP_1:%.*]] = load i32, ptr [[B_ADDR]], align 4
 // CHECK-NEXT: store i32 [[TMP_1]], ptr [[ARRINIT_ELEM]], align 4
-// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_ELEM]], i64 1
-// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 4
+// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 2
+// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARR_2]], i64 4
 // CHECK-NEXT: br label [[ARRINIT_BODY:%.*]]
 // CHECK: [[ARRINIT_CUR:%.*]] = phi ptr [ [[ARRINIT_START]], %entry ], [ [[ARRINIT_NEXT:%.*]], [[ARRINIT_BODY]] ]
 // CHECK-NEXT: store i32 0, ptr [[ARRINIT_CUR]], align 4
@@ -297,10 +296,9 @@ void foo11(int a, int b) {
 // CHECK-NEXT: [[ARR_3:%.*]] = alloca [2 x i32], align 4
 // CHECK-NEXT: store i32 [[A:%.*]], ptr [[A_ADDR]], align 4
 // CHECK-NEXT: store i32 [[B:%.*]], ptr [[B_ADDR]], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARR_3]], i64 0, i64 0
 // CHECK-NEXT: [[TMP_0:%.*]] = load i32, ptr [[A_ADDR]], align 4
-// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
+// CHECK-NEXT: store i32 [[TMP_0]], ptr [[ARR_3]], align 4
+// CHECK-NEXT: [[ARRINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARR_3]], i64 1
 // CHECK-NEXT: [[TMP_1:%.*]] = load i32, ptr [[B_ADDR]], align 4
 // CHECK-NEXT: store i32 [[TMP_1]], ptr [[ARRINIT_ELEMENT]], align 4
 // CHECK-NEXT: ret void
@@ -336,8 +334,7 @@ const int* foo15() {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[ARR_6:%.*arr6.*]] = alloca ptr, align 8
 // CHECK-NEXT: [[REF_TMP:%.*]] = alloca [1 x i32], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [1 x i32], ptr [[REF_TMP]], i64 0, i64 0
-// CHECK-NEXT: store i32 3, ptr [[ARRINIT_BEGIN]], align 4
+// CHECK-NEXT: store i32 3, ptr [[REF_TMP]], align 4
 // CHECK-NEXT: store ptr [[REF_TMP]], ptr [[ARR_6]], align 8
 // CHECK-NEXT: ret void
 void foo16() {
@@ -348,10 +345,9 @@ void foo16() {
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[ARR_7:%.*arr7.*]] = alloca ptr, align 8
 // CHECK-NEXT: [[REF_TMP:%.*]] = alloca [2 x i32], align 4
-// CHECK-NEXT: [[ARRINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[REF_TMP]], i64 0, i64 0
-// CHECK-NEXT: store i32 4, ptr [[ARRINIT_BEGIN]], align 4
-// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 1
-// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[ARRINIT_BEGIN]], i64 2
+// CHECK-NEXT: store i32 4, ptr [[REF_TMP]], align 4
+// CHECK-NEXT: [[ARRINIT_START:%.*]] = getelementptr inbounds i32, ptr [[REF_TMP]], i64 1
+// CHECK-NEXT: [[ARRINIT_END:%.*]] = getelementptr inbounds i32, ptr [[REF_TMP]], i64 2
 // CHECK-NEXT: br label [[ARRINIT_BODY]]
 // CHECK: [[ARRINIT_CUR:%.*]] = phi ptr [ [[ARRINIT_START]], %entry ], [ [[ARRINIT_NEXT:%.*]], [[ARRINIT_BODY]] ]
 // CHECK-NEXT: store i32 0, ptr [[ARRINIT_CUR]], align 4
@@ -533,14 +529,12 @@ namespace gh68198 {
   // CHECK-NEXT: entry
   // CHECK-NEXT: [[ARR_10:%.*arr9.*]] = alloca ptr, align 8
   // CHECK-NEXT: [[CALL_PTR]] = call noalias noundef nonnull ptr @_Znam(i64 noundef 16)
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[CALL]], i64 0, i64 0
-  // CHECK-NEXT: store i32 1, ptr [[ARRAYINIT_BEGIN]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN]], i64 1
+  // CHECK-NEXT: store i32 1, ptr [[CALL]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[CALL]], i64 1
   // CHECK-NEXT: store i32 2, ptr [[ARRAYINIT_ELEMENT]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT:%.*]] = getelementptr inbounds [2 x i32], ptr %call, i64 1
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN1:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 0, i64 0
-  // CHECK-NEXT: store i32 3, ptr [[ARRAYINIT_BEGIN1]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN1]], i64 1
+  // CHECK-NEXT: store i32 3, ptr [[ARRAY_EXP_NEXT]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store i32 4, ptr [[ARRAYINIT_ELEMENT2]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT3:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store ptr [[CALL_PTR]], ptr [[ARR_10]], align 8
@@ -553,14 +547,12 @@ namespace gh68198 {
   // CHECK-NEXT: entry
   // CHECK-NEXT: [[ARR_10:%.*arr10.*]] = alloca ptr, align 8
   // CHECK-NEXT: [[CALL_PTR]] = call noalias noundef nonnull ptr @_Znam(i64 noundef 32)
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN:%.*]] = getelementptr inbounds [2 x i32], ptr [[CALL]], i64 0, i64 0
-  // CHECK-NEXT: store i32 5, ptr [[ARRAYINIT_BEGIN]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN]], i64 1
+  // CHECK-NEXT: store i32 5, ptr [[CALL]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT:%.*]] = getelementptr inbounds i32, ptr [[CALL]], i64 1
   // CHECK-NEXT: store i32 6, ptr [[ARRAYINIT_ELEMENT]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT:%.*]] = getelementptr inbounds [2 x i32], ptr %call, i64 1
-  // CHECK-NEXT: [[ARRAYINIT_BEGIN1:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 0, i64 0
-  // CHECK-NEXT: store i32 7, ptr [[ARRAYINIT_BEGIN1]], align 4
-  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAYINIT_BEGIN1]], i64 1
+  // CHECK-NEXT: store i32 7, ptr [[ARRAY_EXP_NEXT]], align 4
+  // CHECK-NEXT: [[ARRAYINIT_ELEMENT2:%.*]] = getelementptr inbounds i32, ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: store i32 8, ptr [[ARRAYINIT_ELEMENT2]], align 4
   // CHECK-NEXT: [[ARRAY_EXP_NEXT3:%.*]] = getelementptr inbounds [2 x i32], ptr [[ARRAY_EXP_NEXT]], i64 1
   // CHECK-NEXT: call void @llvm.memset.p0.i64(ptr align 4 [[ARRAY_EXP_NEXT3]], i8 0, i64 16, i1 false)
diff --git a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
index ac466ee5bba40..4eafa720e0cb4 100644
--- a/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
+++ b/clang/test/CodeGenCXX/control-flow-in-stmt-expr.cpp
@@ -158,16 +158,15 @@ void ArrayInit() {
   // CHECK-LABEL: define dso_local void @_Z9ArrayInitv()
   // CHECK: %arrayinit.endOfInit = alloca ptr, align 8
   // CHECK: %cleanup.dest.slot = alloca i32, align 4
-  // CHECK: %arrayinit.begin = getelementptr inbounds [4 x %struct.Printy], ptr %arr, i64 0, i64 0
-  // CHECK: store ptr %arrayinit.begin, ptr %arrayinit.endOfInit, align 8
+  // CHECK: store ptr %arr, ptr %arrayinit.endOfInit, align 8
   Printy arr[4] = {
     Printy("a"),
-    // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) %arrayinit.begin, ptr noundef @.str)
-    // CHECK: [[ARRAYINIT_ELEMENT1:%.+]] = getelementptr inbounds %struct.Printy, ptr %arrayinit.begin, i64 1
+    // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) %arr, ptr noundef @.str)
+    // CHECK: [[ARRAYINIT_ELEMENT1:%.+]] = getelementptr inbounds %struct.Printy, ptr %arr, i64 1
     // CHECK: store ptr [[ARRAYINIT_ELEMENT1]], ptr %arrayinit.endOfInit, align 8
     Printy("b"),
     // CHECK: call void @_ZN6PrintyC1EPKc(ptr noundef nonnull align 8 dereferenceable(8) [[ARRAYINIT_ELEMENT1]], ptr noundef @.str.1)
-    // CHECK: [[ARRAYINIT_ELEMENT2:%.+]] = getelementptr inbounds %struct.Printy, ptr [[ARRAYINIT_ELEMENT1]], i64 1
+    // CHECK: [[ARRAYINIT_ELEMENT2:%.+]] = getelementptr inbounds %struct.Printy, ptr %arr, i64 2
     // CHECK: store ptr [[ARRAYINIT_ELEMENT2]], ptr %arrayinit.endOfInit, align 8
     ({
     // CHECK: br i1 {{.*}}, label %if.then, label %if.end
@@ -180,7 +179,7 @@ void ArrayInit() {
       // CHECK:       if.end:
       Printy("c");
       // CHECK-NEXT:    call void @_ZN6PrintyC1EPKc
-      // CHECK-NEXT:    %arrayinit.element2 = getelementptr inbounds %struct.Printy, ptr %arrayinit.element1, i64 1
+      // CHECK-NEXT:    %arrayinit.element2 = getelementptr inbounds %struct.Printy, ptr %arr, i64 3
       // CHECK-NEXT:    store ptr %arrayinit.element2, ptr %arrayinit.endOfInit, align 8
     }),
     ({
@@ -212,14 +211,14 @@ void ArrayInit() {
 
   // CHECK:       cleanup:
   // CHECK-NEXT:    %1 = load ptr, ptr %arrayinit.endOfInit, align 8
-  // CHECK-NEXT:    %arraydestroy.isempty = icmp eq ptr %arrayinit.begin, %1
+  // CHECK-NEXT:    %arraydestroy.isempty = icmp eq ptr %arr, %1
   // CHECK-NEXT:    br i1 %arraydestroy.isempty, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2:.+]]
 
   // CHECK:       [[ARRAY_DESTROY_BODY2]]:
   // CHECK-NEXT:    %arraydestroy.elementPast = phi ptr [ %1, %cleanup ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY2]] ]
   // CHECK-NEXT:    %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1
   // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element)
-  // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arrayinit.begin
+  // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr
   // CHECK-NEXT:    br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE2]], label %[[ARRAY_DESTROY_BODY2]]
 
   // CHECK:       [[ARRAY_DESTROY_DONE2]]:
@@ -238,8 +237,7 @@ void ArraySubobjects() {
       // CHECK: call void @_ZN6PrintyC1EPKc
       // CHECK: call void @_ZN6PrintyC1EPKc
       {Printy("a"),
-      // CHECK: [[ARRAYINIT_BEGIN:%.+]] = getelementptr inbounds [2 x %struct.Printy]
-      // CHECK: store ptr [[ARRAYINIT_BEGIN]], ptr %arrayinit.endOfInit, align 8
+      // CHECK: store ptr %arr2, ptr %arrayinit.endOfInit, align 8
       // CHECK: call void @_ZN6PrintyC1EPKc
       // CHECK: [[ARRAYINIT_ELEMENT:%.+]] = getelementptr inbounds %struct.Printy
       // CHECK: store ptr [[ARRAYINIT_ELEMENT]], ptr %arrayinit.endOfInit, align 8
@@ -248,7 +246,7 @@ void ArraySubobjects() {
            return;
            // CHECK:      if.then:
            // CHECK-NEXT:   [[V0:%.+]] = load ptr, ptr %arrayinit.endOfInit, align 8
-           // CHECK-NEXT:   %arraydestroy.isempty = icmp eq ptr [[ARRAYINIT_BEGIN]], [[V0]]
+           // CHECK-NEXT:   %arraydestroy.isempty = icmp eq ptr %arr2, [[V0]]
            // CHECK-NEXT:   br i1 %arraydestroy.isempty, label %[[ARRAY_DESTROY_DONE:.+]], label %[[ARRAY_DESTROY_BODY:.+]]
          }
          Printy("b");
@@ -268,7 +266,7 @@ void ArraySubobjects() {
     // CHECK-NEXT:    %arraydestroy.elementPast = phi ptr [ %0, %if.then ], [ %arraydestroy.element, %[[ARRAY_DESTROY_BODY]] ]
     // CHECK-NEXT:    %arraydestroy.element = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast, i64 -1
     // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element)
-    // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, [[ARRAYINIT_BEGIN]]
+    // CHECK-NEXT:    %arraydestroy.done = icmp eq ptr %arraydestroy.element, %arr2
     // CHECK-NEXT:    br i1 %arraydestroy.done, label %[[ARRAY_DESTROY_DONE]], label %[[ARRAY_DESTROY_BODY]]
 
     // CHECK:       [[ARRAY_DESTROY_DONE]]
@@ -277,11 +275,11 @@ void ArraySubobjects() {
     // CHECK-NEXT:    br label %[[ARRAY_DESTROY_BODY2:.+]]
 
     // CHECK:       [[ARRAY_DESTROY_BODY2]]:
-    // CHECK-NEXT:    %arraydestroy.elementPast5 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element6, %[[ARRAY_DESTROY_BODY2]] ]
-    // CHECK-NEXT:    %arraydestroy.element6 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast5, i64 -1
-    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element6)
-    // CHECK-NEXT:    %arraydestroy.done7 = icmp eq ptr %arraydestroy.element6, [[ARRAY_BEGIN]]
-    // CHECK-NEXT:    br i1 %arraydestroy.done7, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]]
+    // CHECK-NEXT:    %arraydestroy.elementPast4 = phi ptr [ %1, %[[ARRAY_DESTROY_DONE]] ], [ %arraydestroy.element5, %[[ARRAY_DESTROY_BODY2]] ]
+    // CHECK-NEXT:    %arraydestroy.element5 = getelementptr inbounds %struct.Printy, ptr %arraydestroy.elementPast4, i64 -1
+    // CHECK-NEXT:    call void @_ZN6PrintyD1Ev(ptr noundef nonnull align 8 dereferenceable(8) %arraydestroy.element5)
+    // CHECK-NEXT:    %arraydestroy.done6 = icmp eq ptr %arraydestroy.element5, [[ARRAY_BEGIN]]
+    // CHECK-NEXT:    br i1 %arraydestroy.done6, label %[[ARRAY_DESTROY_DONE2:.+]], label %[[ARRAY_DESTROY_BODY2]]
 
 
     // CHECK:     [[ARRAY_DESTROY_DONE2]]:
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
index b19ca1d861b11..419525d0544d5 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-references.cpp
@@ -52,11 +52,10 @@ namespace reference {
     // CHECK-NEXT: store ptr %{{.*}}, ptr %{{.*}}, align
     const A &ra1{1, i};
 
-    // CHECK-NEXT: getelementptr inbounds [3 x i32], ptr %{{.*}}, i{{32|64}} 0, i{{32|64}} 0
     // CHECK-NEXT: store i32 1
     // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 1
     // CHECK-NEXT: store i32 2
-    // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 1
+    // CHECK-NEXT: getelementptr inbounds i32, ptr %{{.*}}, i{{32|64}} 2
     // CHECK-NEXT: %[[I2:.*]] = load i32, ptr
     // CHECK-NEXT: store i32 %[[I2]]
     // CHECK-NEXT: store ptr %{{.*}}, ptr %{{.*}}, align
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
index d9e4c5d5403c2..e2d566121331f 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist-startend.cpp
@@ -40,8 +40,7 @@ void fn1(int i) {
   // CHECK-LABEL: define{{.*}} void @_Z3fn1i
   // temporary array
   // CHECK: [[array:%[^ ]+]] = alloca [3 x i32]
-  // CHECK: getelementptr inbounds [3 x i32], ptr [[array]], i{{32|64}} 0
-  // CHECK-NEXT: store i32 1, ptr
+  // CHECK:      store i32 1, ptr
   // CHECK-NEXT: getelementptr
   // CHECK-NEXT: store
   // CHECK-NEXT: getelementptr
diff --git a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
index aa2f078a5fb0c..3d0cf968bfd54 100644
--- a/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
+++ b/clang/test/CodeGenCXX/cxx0x-initializer-stdinitializerlist.cpp
@@ -122,8 +122,7 @@ void fn1(int i) {
   // X86: [[array:%[^ ]+]] = alloca [3 x i32]
   // AMDGCN: [[alloca:%[^ ]+]] = alloca [3 x i32], align 4, addrspace(5)
   // AMDGCN: [[array:%[^ ]+]] ={{.*}} addrspacecast ptr addrspace(5) [[alloca]] to ptr
-  // CHECK: getelementptr inbounds [3 x i32], ptr [[array]], i{{32|64}} 0
-  // CHECK-NEXT: store i32 1, ptr
+  // CHECK:      store i32 1, ptr
   // CHECK-NEXT: getelementptr
   // CHECK-NEXT: store
   // CHECK-NEXT: getelementptr
diff --git a/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp b/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
index 48ee019a55f55..d86c712372cd8 100644
--- a/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
+++ b/clang/test/CodeGenCXX/cxx11-initializer-array-new.cpp
@@ -16,22 +16,20 @@ void *p = new S[2][3]{ { 1, 2, 3 }, { 4, 5, 6 } };
 // { 1, 2, 3 }
 //
 //
-// CHECK: %[[S_0_0:.*]] = getelementptr inbounds [3 x %[[S:.*]]], ptr %[[START_AS_i8]], i64 0, i64 0
-// CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_0]], i32 noundef 1)
-// CHECK: %[[S_0_1:.*]] = getelementptr inbounds %[[S]], ptr %[[S_0_0]], i64 1
+// CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[START_AS_i8]], i32 noundef 1)
+// CHECK: %[[S_0_1:.*]] = getelementptr inbounds %[[S:.+]], ptr %[[START_AS_i8]], i64 1
 // CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_1]], i32 noundef 2)
-// CHECK: %[[S_0_2:.*]] = getelementptr inbounds %[[S]], ptr %[[S_0_1]], i64 1
+// CHECK: %[[S_0_2:.*]] = getelementptr inbounds %[[S]], ptr %[[START_AS_i8]], i64 2
 // CHECK: call void @_ZN1SC1Ei(ptr {{[^,]*}} %[[S_0_2...
[truncated]

@nikic
Copy link
Contributor Author

nikic commented Jun 7, 2024

ping

Copy link
Contributor

@rjmccall rjmccall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, LGTM.

@nikic nikic merged commit 12d24e0 into llvm:main Jun 10, 2024
12 checks passed
@nikic nikic deleted the clang-array-init-rework branch June 10, 2024 07:19
nikic added a commit that referenced this pull request Jun 10, 2024
Reapply after #93956, which
changed clang array initialization codegen to avoid size regressions
for unoptimized builds.

-----

This fold is subtly incorrect, because DL-unaware constant folding does
not know the correct index type to use, and just performs the addition
in the type that happens to already be there. This is incorrect, since
sext(X)+sext(Y) is generally not the same as sext(X+Y). See the
`@constexpr_gep_of_gep_with_narrow_type()` for a miscompile with the
current implementation.

One could try to restrict the fold to cases where no overflow occurs,
but I'm not bothering with that here, because the DL-aware constant
folding will take care of this anyway. I've only kept the
straightforward zero-index case, where we just concatenate two GEPs.
@HerrCai0907 HerrCai0907 mentioned this pull request Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category coroutines C++20 coroutines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants