-
Notifications
You must be signed in to change notification settings - Fork 12.4k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
RegisterCoalescer: Add implicit-def of super register when coalescing…
… SUBREG_TO_REG Currently coalescing with SUBREG_TO_REG introduces an invisible load bearing undef. There is liveness for the super register not represented in the MIR. This is part 1 of a fix for regressions that appeared after b7836d8. The allocator started recognizing undef-def subregister MOVs as copies. Since there was no representation for the dependency on the high bits, different undef segments of the super register ended up disconnected and downstream users ended up observing different undefs than they did previously. This does not yet fix the regression. The isCopyInstr handling needs to start handling implicit-defs on any instruction. I wanted to include an end to end IR test since the actual failure only appeared with an interaction between the coalescer and the allocator. It's a bit bigger than I'd like but I'm having a bit of trouble reducing it to something which definitely shows a diff that's meaningful. The same problem likely exists everywhere trying to do anything with SUBREG_TO_REG. I don't understand how this managed to be broken for so long. This needs to be applied to the release branch. https://reviews.llvm.org/D156345
- Loading branch information
Showing
5 changed files
with
623 additions
and
10 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
185 changes: 185 additions & 0 deletions
185
llvm/test/CodeGen/X86/coalescer-breaks-subreg-to-reg-liveness.ll
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,185 @@ | ||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 2 | ||
; RUN: llc -mtriple=x86_64-grtev4-linux-gnu < %s | FileCheck %s | ||
|
||
%struct.wibble = type { %struct.wombat } | ||
%struct.wombat = type { %struct.ham, [3 x i8] } | ||
%struct.ham = type { %struct.zot } | ||
%struct.zot = type { %struct.blam } | ||
%struct.blam = type { %struct.ham.0 } | ||
%struct.ham.0 = type { %struct.bar } | ||
%struct.bar = type { %struct.bar.1 } | ||
%struct.bar.1 = type { %struct.baz, i8 } | ||
%struct.baz = type { %struct.snork } | ||
%struct.snork = type <{ %struct.spam, i8, [3 x i8] }> | ||
%struct.spam = type { %struct.snork.2, %struct.snork.2 } | ||
%struct.snork.2 = type { i32 } | ||
%struct.snork.3 = type { %struct.baz, i8, [3 x i8] } | ||
|
||
define void @foo(ptr %arg, ptr %arg1, i40 %arg2, ptr %arg3, i32 %arg4) #0 { | ||
; CHECK-LABEL: foo: | ||
; CHECK: # %bb.0: # %bb | ||
; CHECK-NEXT: pushq %rbp | ||
; CHECK-NEXT: .cfi_def_cfa_offset 16 | ||
; CHECK-NEXT: .cfi_offset %rbp, -16 | ||
; CHECK-NEXT: movq %rsp, %rbp | ||
; CHECK-NEXT: .cfi_def_cfa_register %rbp | ||
; CHECK-NEXT: pushq %r15 | ||
; CHECK-NEXT: pushq %r14 | ||
; CHECK-NEXT: pushq %r13 | ||
; CHECK-NEXT: pushq %r12 | ||
; CHECK-NEXT: pushq %rbx | ||
; CHECK-NEXT: subq $24, %rsp | ||
; CHECK-NEXT: .cfi_offset %rbx, -56 | ||
; CHECK-NEXT: .cfi_offset %r12, -48 | ||
; CHECK-NEXT: .cfi_offset %r13, -40 | ||
; CHECK-NEXT: .cfi_offset %r14, -32 | ||
; CHECK-NEXT: .cfi_offset %r15, -24 | ||
; CHECK-NEXT: movl %r8d, %r14d | ||
; CHECK-NEXT: movq %rcx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill | ||
; CHECK-NEXT: movq %rdx, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill | ||
; CHECK-NEXT: movq %rsi, %r13 | ||
; CHECK-NEXT: movq %rdi, %r15 | ||
; CHECK-NEXT: incl %r14d | ||
; CHECK-NEXT: xorl %ebx, %ebx | ||
; CHECK-NEXT: # implicit-def: $r12 | ||
; CHECK-NEXT: movq %rsi, {{[-0-9]+}}(%r{{[sb]}}p) # 8-byte Spill | ||
; CHECK-NEXT: jmp .LBB0_3 | ||
; CHECK-NEXT: .p2align 4, 0x90 | ||
; CHECK-NEXT: .LBB0_1: # %bb17 | ||
; CHECK-NEXT: # in Loop: Header=BB0_3 Depth=1 | ||
; CHECK-NEXT: movq %r15, %r13 | ||
; CHECK-NEXT: xorl %r15d, %r15d | ||
; CHECK-NEXT: testq %rbx, %rbx | ||
; CHECK-NEXT: sete %r15b | ||
; CHECK-NEXT: xorl %edi, %edi | ||
; CHECK-NEXT: callq _Znwm@PLT | ||
; CHECK-NEXT: shlq $4, %r15 | ||
; CHECK-NEXT: addq {{[-0-9]+}}(%r{{[sb]}}p), %r15 # 8-byte Folded Reload | ||
; CHECK-NEXT: movq %r12, %rcx | ||
; CHECK-NEXT: shrq $32, %rcx | ||
; CHECK-NEXT: movb %cl, 12(%rax) | ||
; CHECK-NEXT: movl %r12d, 8(%rax) | ||
; CHECK-NEXT: movq %r15, %rbx | ||
; CHECK-NEXT: movq %r13, %r15 | ||
; CHECK-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %r13 # 8-byte Reload | ||
; CHECK-NEXT: decl %r14d | ||
; CHECK-NEXT: je .LBB0_8 | ||
; CHECK-NEXT: .LBB0_3: # %bb7 | ||
; CHECK-NEXT: # =>This Inner Loop Header: Depth=1 | ||
; CHECK-NEXT: callq widget@PLT | ||
; CHECK-NEXT: cmpb $-5, (%r13) | ||
; CHECK-NEXT: jae .LBB0_5 | ||
; CHECK-NEXT: # %bb.4: # in Loop: Header=BB0_3 Depth=1 | ||
; CHECK-NEXT: movl %r12d, %r12d | ||
; CHECK-NEXT: cmpq %r15, %rbx | ||
; CHECK-NEXT: jbe .LBB0_1 | ||
; CHECK-NEXT: jmp .LBB0_7 | ||
; CHECK-NEXT: .p2align 4, 0x90 | ||
; CHECK-NEXT: .LBB0_5: # %bb12 | ||
; CHECK-NEXT: # in Loop: Header=BB0_3 Depth=1 | ||
; CHECK-NEXT: movq 0, %rax | ||
; CHECK-NEXT: movq 8, %rax | ||
; CHECK-NEXT: movq {{[-0-9]+}}(%r{{[sb]}}p), %r12 # 8-byte Reload | ||
; CHECK-NEXT: cmpq %r15, %rbx | ||
; CHECK-NEXT: jbe .LBB0_1 | ||
; CHECK-NEXT: .LBB0_7: # in Loop: Header=BB0_3 Depth=1 | ||
; CHECK-NEXT: xorl %eax, %eax | ||
; CHECK-NEXT: xorl %ebx, %ebx | ||
; CHECK-NEXT: decl %r14d | ||
; CHECK-NEXT: jne .LBB0_3 | ||
; CHECK-NEXT: .LBB0_8: # %bb21 | ||
; CHECK-NEXT: cmpb $0, 12(%rax) | ||
; CHECK-NEXT: jne .LBB0_10 | ||
; CHECK-NEXT: # %bb.9: # %bb26 | ||
; CHECK-NEXT: addq $24, %rsp | ||
; CHECK-NEXT: popq %rbx | ||
; CHECK-NEXT: popq %r12 | ||
; CHECK-NEXT: popq %r13 | ||
; CHECK-NEXT: popq %r14 | ||
; CHECK-NEXT: popq %r15 | ||
; CHECK-NEXT: popq %rbp | ||
; CHECK-NEXT: .cfi_def_cfa %rsp, 8 | ||
; CHECK-NEXT: retq | ||
; CHECK-NEXT: .LBB0_10: # %bb25 | ||
; CHECK-NEXT: .cfi_def_cfa %rbp, 16 | ||
; CHECK-NEXT: movq %r15, %rdi | ||
; CHECK-NEXT: callq pluto@PLT | ||
bb: | ||
br label %bb7 | ||
|
||
bb5: ; preds = %bb17, %bb14 | ||
%phi = phi ptr [ %call19, %bb17 ], [ null, %bb14 ] | ||
%phi6 = phi ptr [ %getelementptr, %bb17 ], [ null, %bb14 ] | ||
%add = add i32 %phi9, 1 | ||
%icmp = icmp eq i32 %phi9, %arg4 | ||
br i1 %icmp, label %bb21, label %bb7 | ||
|
||
bb7: ; preds = %bb5, %bb | ||
%phi8 = phi ptr [ null, %bb ], [ %phi6, %bb5 ] | ||
%phi9 = phi i32 [ 0, %bb ], [ %add, %bb5 ] | ||
%phi10 = phi i40 [ undef, %bb ], [ %phi15, %bb5 ] | ||
%call = call ptr @widget() | ||
%load = load i8, ptr %arg1, align 8 | ||
%icmp11 = icmp ult i8 %load, -5 | ||
%and = and i40 %phi10, 4294967295 | ||
br i1 %icmp11, label %bb14, label %bb12 | ||
|
||
bb12: ; preds = %bb7 | ||
%load13 = load volatile { i64, i64 }, ptr null, align 4294967296 | ||
br label %bb14 | ||
|
||
bb14: ; preds = %bb12, %bb7 | ||
%phi15 = phi i40 [ %and, %bb7 ], [ %arg2, %bb12 ] | ||
%icmp16 = icmp ugt ptr %phi8, %arg | ||
br i1 %icmp16, label %bb5, label %bb17 | ||
|
||
bb17: ; preds = %bb14 | ||
%icmp18 = icmp eq ptr %phi8, null | ||
%zext = zext i1 %icmp18 to i64 | ||
%call19 = call ptr @_Znwm(i64 0) | ||
%getelementptr = getelementptr %struct.wibble, ptr %arg3, i64 %zext | ||
%getelementptr20 = getelementptr i8, ptr %call19, i64 8 | ||
store i40 %phi15, ptr %getelementptr20, align 4 | ||
br label %bb5 | ||
|
||
bb21: ; preds = %bb5 | ||
%getelementptr22 = getelementptr %struct.snork.3, ptr %phi, i64 0, i32 1 | ||
%load23 = load i8, ptr %getelementptr22, align 4 | ||
%icmp24 = icmp eq i8 %load23, 0 | ||
br i1 %icmp24, label %bb26, label %bb25 | ||
|
||
bb25: ; preds = %bb21 | ||
call void @pluto(ptr %arg) | ||
unreachable | ||
|
||
bb26: ; preds = %bb21 | ||
ret void | ||
} | ||
|
||
define void @eggs(ptr %arg, ptr %arg1) { | ||
; CHECK-LABEL: eggs: | ||
; CHECK: # %bb.0: # %bb | ||
; CHECK-NEXT: pushq %rax | ||
; CHECK-NEXT: .cfi_def_cfa_offset 16 | ||
; CHECK-NEXT: movq %rdi, %rax | ||
; CHECK-NEXT: movq %rsi, %rdi | ||
; CHECK-NEXT: movq %rax, %rsi | ||
; CHECK-NEXT: xorl %edx, %edx | ||
; CHECK-NEXT: xorl %ecx, %ecx | ||
; CHECK-NEXT: xorl %r8d, %r8d | ||
; CHECK-NEXT: callq foo@PLT | ||
; CHECK-NEXT: popq %rax | ||
; CHECK-NEXT: .cfi_def_cfa_offset 8 | ||
; CHECK-NEXT: retq | ||
bb: | ||
call void @foo(ptr %arg1, ptr %arg, i40 0, ptr null, i32 0) | ||
ret void | ||
} | ||
|
||
declare ptr @widget() | ||
|
||
declare void @pluto(ptr) | ||
|
||
declare ptr @_Znwm(i64) | ||
|
||
attributes #0 = { noinline "frame-pointer"="all" } |
47 changes: 47 additions & 0 deletions
47
llvm/test/CodeGen/X86/coalescing-subreg-to-reg-requires-subrange-update.mir
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3 | ||
# RUN: llc -mtriple=x86_64-- -run-pass=register-coalescer -enable-subreg-liveness -verify-coalescing -o - %s | FileCheck %s | ||
|
||
|
||
# FIXME: Need to handle subrange updates when coalescing with subreg_to_reg | ||
# This will fail if x86 enables subregister liveness. | ||
--- | ||
name: requires_new_subrange_coalesce_subreg_to_reg | ||
tracksRegLiveness: true | ||
body: | | ||
; CHECK-LABEL: name: requires_new_subrange_coalesce_subreg_to_reg | ||
; CHECK: bb.0: | ||
; CHECK-NEXT: successors: %bb.2(0x40000000), %bb.1(0x40000000) | ||
; CHECK-NEXT: liveins: $eax | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: undef %a.sub_32bit:gr64_with_sub_8bit = COPY $eax | ||
; CHECK-NEXT: %b:gr32 = IMPLICIT_DEF | ||
; CHECK-NEXT: %c:gr64 = INSERT_SUBREG %a, %b, %subreg.sub_32bit | ||
; CHECK-NEXT: JCC_1 %bb.2, 4, implicit undef $eflags | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: bb.1: | ||
; CHECK-NEXT: successors: %bb.2(0x80000000) | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: undef %a.sub_32bit:gr64_with_sub_8bit = MOV32r0 implicit-def dead $eflags | ||
; CHECK-NEXT: %c.sub_32bit:gr64 = COPY %a | ||
; CHECK-NEXT: {{ $}} | ||
; CHECK-NEXT: bb.2: | ||
; CHECK-NEXT: %c.sub_32bit:gr64 = SUBREG_TO_REG %a, %b, %subreg.sub_32bit | ||
; CHECK-NEXT: RET 0, implicit %c | ||
bb.0: | ||
liveins: $eax | ||
%init_eax:gr32 = COPY $eax | ||
%a:gr64 = SUBREG_TO_REG 0, %init_eax, %subreg.sub_32bit | ||
%b:gr32 = IMPLICIT_DEF | ||
%c:gr64 = INSERT_SUBREG %a, %b, %subreg.sub_32bit | ||
JCC_1 %bb.2, 4, implicit undef $eflags | ||
bb.1: | ||
%imm0:gr32 = MOV32r0 implicit-def dead $eflags | ||
%a = SUBREG_TO_REG 0, %imm0, %subreg.sub_32bit | ||
%c.sub_32bit = COPY %a | ||
bb.2: | ||
%c.sub_32bit = SUBREG_TO_REG %a, %b, %subreg.sub_32bit | ||
RET 0, implicit %c | ||
... |
Oops, something went wrong.
414ff81
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a failure affecting sanitizer bots: https://lab.llvm.org/buildbot/#/builders/168/builds/16024/steps/9/logs/stdio
.......
#9 0x00005574558c16a5 (anonymous namespace)::RegisterCoalescer::reMaterializeTrivialDef(llvm::CoalescerPair const&, llvm::MachineInstr*, bool&) RegisterCoalescer.cpp:0:0
#10 0x00005574558bcb72 (anonymous namespace)::RegisterCoalescer::joinCopy(llvm::MachineInstr*, bool&) RegisterCoalescer.cpp:0:0
#11 0x00005574558bac5d (anonymous namespace)::RegisterCoalescer::copyCoalesceWorkList(llvm::MutableArrayRefllvm::MachineInstr*) RegisterCoalescer.cpp:0:0
.......
I am not sure if it is this PR or some of your other related work in the RegisterCoalescer code, but it looks related. Could you please take a look?
Thanks.
414ff81
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have confirmed that this is the culprit patch and I have reverted it.