Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AVR] Branch offset off by two in llvm-mc's output #60019

Closed
gergoerdi opened this issue Jan 14, 2023 · 14 comments
Closed

[AVR] Branch offset off by two in llvm-mc's output #60019

gergoerdi opened this issue Jan 14, 2023 · 14 comments
Assignees

Comments

@gergoerdi
Copy link

gergoerdi commented Jan 14, 2023

Jump straight to #60019 (comment) to avoid red herrings and dead ends...

I started digging into #59962 and #47093 to figure out what exactly is broken, and it seems that as of 5fcdf76, branch offsets are in general mis-assembled. This can be demonstrated with very simple, very small inputs.

First, assembly with llvm-mc from hand-written code:

	.text
        cpi r24, 42
	brne .FOO
        ret
.FOO:
        rjmp .FOO

I have tried assembling this with llvm-mc thus:

$ llvm-mc --arch=avr --mcpu=atmega32u4 -filetype=obj foo.s -o foo.s.o
$ avr-objdump -d foo.s.o

foo.s.o:     file format elf32-avr


Disassembly of section .text:

00000000 <.FOO-0x6>:
   0:	8a 32       	cpi	r24, 0x2A	; 42
   2:	01 f4       	brne	.+0      	; 0x4 <.FOO-0x2>          <--- HERE
   4:	08 95       	ret

00000006 <.FOO>:
   6:	00 c0       	rjmp	.+0      	; 0x8 <.FOO+0x2>

At the marked line, we see that brne .+2 (pointing to .FOO) has been mis-assembled ito brne .+0 (a nop).

For the second test, we can write LLVM IR and try compiling that with llc:

target datalayout = "e-P1-p:16:8-i8:8-i16:8-i32:8-i64:8-f32:8-f64:8-n8-a:8"
target triple = "avr-unknown-unknown"

define internal fastcc i8 @foo(i8 %x) {
start:
  %b = icmp eq i8 %x, 42
  br i1 %b, label %bb1, label %bb2
bb1:
  ret i8 12
bb2:
  ret i8 34
}

If we ask llc to produce an assembly file, that one looks correct:

$ llc -march=avr -mcpu=atmega32u4 -filetype=asm bar.ll -o -
	.text
.set __tmp_reg__, 0
.set __zero_reg__, 1
.set __SREG__, 63
.set __SP_H__, 62
.set __SP_L__, 61
	.file	"bar.ll"
	.p2align	1                               ; -- Begin function foo
	.type	foo,@function
foo:                                    ; @foo
; %bb.0:                                ; %start
	cpi	r24, 42
	brne	.LBB0_2
; %bb.1:                                ; %bb1
	ldi	r24, 12
	ret
.LBB0_2:                                ; %bb2
	ldi	r24, 34
	ret
.Lfunc_end0:
	.size	foo, .Lfunc_end0-foo
                                        ; -- End function

However, if we go to object file, we have the same problem that the branch offset in %bb.0 becomes zero:

$ llc -march=avr -mcpu=atmega32u4 -filetype=obj bar.ll -o bar.ll.o
$ avr-objdump -d bar.ll.o

bar.ll.o:     file format elf32-avr


Disassembly of section .text:

00000000 <foo>:
   0:	8a 32       	cpi	r24, 0x2A	; 42
   2:	01 f4       	brne	.+0      	; 0x4 <foo+0x4>       <---- HERE
   4:	8c e0       	ldi	r24, 0x0C	; 12
   6:	08 95       	ret
   8:	82 e2       	ldi	r24, 0x22	; 34
   a:	08 95       	ret
@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

The value 0 seems to come from the following function, when the branch offset is an expression (such as the label .FOO in our example):

template <AVR::Fixups Fixup>
unsigned
AVRMCCodeEmitter::encodeRelCondBrTarget(const MCInst &MI, unsigned OpNo,
SmallVectorImpl<MCFixup> &Fixups,
const MCSubtargetInfo &STI) const {
const MCOperand &MO = MI.getOperand(OpNo);
if (MO.isExpr()) {
Fixups.push_back(
MCFixup::create(0, MO.getExpr(), MCFixupKind(Fixup), MI.getLoc()));
return 0;
}

If, instead, we change the input to contain an explicit branch offset:

	.text
        cpi r24, 42
	brne 2
        ret
.FOO:
        rjmp .FOO

Then we get correct output.

Changing the return 0 to some other value changes the branch offset emitted for branches to labels, suggesting that the problem is that the expression is not evaluated / the placeholder 0 is not replaced with the result of the evaluation.

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

encodeRelCondBrTarget is instantiated for fixup_7_pcrel in our case, and printing from it, I can see that it is indeed called (so there is some effort to process MO.getExpr()), but it is called with a Value of 0 instead of what I would expect the evaluated value of .FOO.

A natural next question is, which value ultimately ends up in the object file: the placeholder returned by encodeRelCondBrTarget, or the value from fixup_7_pcrel? It turns out the eventual value is the sum of the two values.

@gergoerdi
Copy link
Author

On a debug build, dump()ing MO.getExpr() in encodeRelCondBrTarget we can at least verify that the expression passed in MO is correct (i.e. .FOO). Why it is evaluated to 0 before calling fixup_7_pcrel is still a mystery.

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

I've added a lot of tracing to MCAssembler::evaluateFixup and related functions, and it seems that it computes the correct branch offset into Value.

At the end of evaluateFixup, we have a correct Value, !IsResolved, and WasForced. I don't know yet what IsResolved is supposed to mean; I would have guessed that we have indeed resolved .FOO into an offset. IsResolved is set to false in the following code:

// Let the backend force a relocation if needed.
if (IsResolved && getBackend().shouldForceRelocation(*this, Fixup, Target)) {
IsResolved = false;
WasForced = true;
}

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

This is the point where the correct offset in FixedValue becomes overwritten with the incorrect 0 value, in line 804

if (!IsResolved) {
// The fixup was unresolved, we need a relocation. Inform the object
// writer of the relocation, and give it an opportunity to adjust the
// fixup value if need be.
getWriter().recordRelocation(*this, Layout, &F, Fixup, Target, FixedValue);
}

In ELFObjectWriter::recordRelocation, our FixedValue is decomposed into an Addend and a FixedValue of 0.

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

If I understand this right, AVRAsmBackend::shouldForceRelocation is behaving as if this was an external reference, requiring link-time relocation.

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

The relocation does make it way into the resulting .o file:

$ avr-objdump -d -r foo.s.o

foo.s.o:     file format elf32-avr


Disassembly of section .text:

00000000 <.FOO-0x6>:
   0:	8a 32       	cpi	r24, 0x2A	; 42
   2:	01 f4       	brne	.+0      	; 0x4 <.FOO-0x2>
			2: R_AVR_7_PCREL	.text+0x6
   4:	08 95       	ret

00000006 <.FOO>:
   6:	00 c0       	rjmp	.+0      	; 0x8 <.FOO+0x2>
			6: R_AVR_13_PCREL	.text+0x6

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

Oh crap, maybe this is all PEBKAC, and I should have always looked at the output with relocations! If I link it into an ELF file, indeed the branch offsets look correct:

000000c4 <main>:
  c4:	8a 32       	cpi	r24, 0x2A	; 42
  c6:	09 f4       	brne	.+2      	; 0xca <.FOO>
  c8:	08 95       	ret

000000ca <.FOO>:
  ca:	ff cf       	rjmp	.-2      	; 0xca <.FOO>

@benshi001 benshi001 self-assigned this Jan 14, 2023
@gergoerdi
Copy link
Author

Something's still not right though, because if I jump back to my real program, the one built by llc directly from .ll works, and the one built via llvm-mc doesn't.

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

OK turns out I'm not crazy. But all the analysis above is garbage: because I was looking at un-relocated object files, I was chasing a red herring of all-zero branch offsets. In reality, what happens is that the branch offsets are not all 0, but rather, off by 2 bytes (!) in the output of llvm-mc:

--- /dev/fd/63	2023-01-14 14:30:47.719017929 +0800
+++ /dev/fd/62	2023-01-14 14:30:47.719017929 +0800
@@ -1,5 +1,5 @@
 
-worduino_avr-b35d7970ef451ba9.ll.o.elf:     file format elf32-avr
+worduino_avr-b35d7970ef451ba9.ll.s.o.elf:     file format elf32-avr
 
 
 Disassembly of section .text:
@@ -216,7 +216,7 @@
      28e:	45 91       	lpm	r20, Z+
      290:	4d 93       	st	X+, r20
      292:	31 50       	subi	r19, 0x01	; 1
-     294:	e1 f7       	brne	.-8      	; 0x28e <_ZN15worduino_engine4draw11draw_sprite17hc7657142962e61a4E+0x96>
+     294:	d9 f7       	brne	.-10     	; 0x28c <_ZN15worduino_engine4draw11draw_sprite17hc7657142962e61a4E+0x94>
      296:	38 2f       	mov	r19, r24
      298:	32 0f       	add	r19, r18
      29a:	a3 2f       	mov	r26, r19
@@ -302,7 +302,7 @@
      33a:	d5 90       	lpm	r13, Z+
      33c:	dd 92       	st	X+, r13
      33e:	51 50       	subi	r21, 0x01	; 1
-     340:	e1 f7       	brne	.-8      	; 0x33a <_ZN15worduino_engine4draw11draw_sprite17hc7657142962e61a4E+0x142>
+     340:	d9 f7       	brne	.-10     	; 0x338 <_ZN15worduino_engine4draw11draw_sprite17hc7657142962e61a4E+0x140>
      342:	f9 8d       	ldd	r31, Y+25	; 0x19
      344:	f8 0f       	add	r31, r24
      346:	ef 2f       	mov	r30, r31
@@ -601,7 +601,7 @@
      590:	45 91       	lpm	r20, Z+
      592:	4d 93       	st	X+, r20
      594:	31 50       	subi	r19, 0x01	; 1
-     596:	e1 f7       	brne	.-8      	; 0x590 <__EEPROM_REGION_LENGTH__+0x190>
+     596:	d9 f7       	brne	.-10     	; 0x58e <__EEPROM_REGION_LENGTH__+0x18e>
      598:	38 2f       	mov	r19, r24
      59a:	32 0f       	add	r19, r18
      59c:	a3 2f       	mov	r26, r19
@@ -711,7 +711,7 @@
      66c:	d5 90       	lpm	r13, Z+
      66e:	dd 92       	st	X+, r13
      670:	51 50       	subi	r21, 0x01	; 1
-     672:	e1 f7       	brne	.-8      	; 0x66c <__EEPROM_REGION_LENGTH__+0x26c>
+     672:	d9 f7       	brne	.-10     	; 0x66a <__EEPROM_REGION_LENGTH__+0x26a>
      674:	f9 8d       	ldd	r31, Y+25	; 0x19
      676:	f8 0f       	add	r31, r24
      678:	ef 2f       	mov	r30, r31
@@ -1216,7 +1216,7 @@
      a60:	d5 90       	lpm	r13, Z+
      a62:	dd 92       	st	X+, r13
      a64:	01 50       	subi	r16, 0x01	; 1
-     a66:	e1 f7       	brne	.-8      	; 0xa60 <__DATA_REGION_LENGTH__+0x60>
+     a66:	d9 f7       	brne	.-10     	; 0xa5e <__DATA_REGION_LENGTH__+0x5e>
      a68:	f8 2f       	mov	r31, r24
      a6a:	f2 0f       	add	r31, r18
      a6c:	cf 2e       	mov	r12, r31
@@ -1590,7 +1590,7 @@
      d4c:	d5 90       	lpm	r13, Z+
      d4e:	dd 92       	st	X+, r13
      d50:	01 50       	subi	r16, 0x01	; 1
-     d52:	e1 f7       	brne	.-8      	; 0xd4c <__stack+0x24d>
+     d52:	d9 f7       	brne	.-10     	; 0xd4a <__stack+0x24b>
      d54:	f8 2f       	mov	r31, r24
      d56:	f2 0f       	add	r31, r18
      d58:	cf 2e       	mov	r12, r31
@@ -1905,7 +1905,7 @@
      fc8:	35 91       	lpm	r19, Z+
      fca:	3d 93       	st	X+, r19
      fcc:	21 50       	subi	r18, 0x01	; 1
-     fce:	e1 f7       	brne	.-8      	; 0xfc8 <__stack+0x4c9>
+     fce:	d9 f7       	brne	.-10     	; 0xfc6 <__stack+0x4c7>
      fd0:	f3 cf       	rjmp	.-26     	; 0xfb8 <__stack+0x4b9>
      fd2:	e6 0f       	add	r30, r22
      fd4:	f7 1f       	adc	r31, r23
@@ -1916,7 +1916,7 @@
      fde:	35 91       	lpm	r19, Z+
      fe0:	3d 93       	st	X+, r19
      fe2:	21 50       	subi	r18, 0x01	; 1
-     fe4:	e1 f7       	brne	.-8      	; 0xfde <__stack+0x4df>
+     fe4:	d9 f7       	brne	.-10     	; 0xfdc <__stack+0x4dd>
      fe6:	e8 cf       	rjmp	.-48     	; 0xfb8 <__stack+0x4b9>
      fe8:	81 2d       	mov	r24, r1
      fea:	0b c0       	rjmp	.+22     	; 0x1002 <__stack+0x503>
@@ -2695,7 +2695,7 @@
     15f0:	95 91       	lpm	r25, Z+
     15f2:	9d 93       	st	X+, r25
     15f4:	81 50       	subi	r24, 0x01	; 1
-    15f6:	e1 f7       	brne	.-8      	; 0x15f0 <main+0x86>
+    15f6:	d9 f7       	brne	.-10     	; 0x15ee <main+0x84>
     15f8:	0f b6       	in	r0, 0x3f	; 63
     15fa:	c1 53       	subi	r28, 0x31	; 49
     15fc:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2716,7 +2716,7 @@
     161a:	95 91       	lpm	r25, Z+
     161c:	9d 93       	st	X+, r25
     161e:	81 50       	subi	r24, 0x01	; 1
-    1620:	e1 f7       	brne	.-8      	; 0x161a <main+0xb0>
+    1620:	d9 f7       	brne	.-10     	; 0x1618 <main+0xae>
     1622:	0f b6       	in	r0, 0x3f	; 63
     1624:	c1 53       	subi	r28, 0x31	; 49
     1626:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2737,7 +2737,7 @@
     1644:	95 91       	lpm	r25, Z+
     1646:	9d 93       	st	X+, r25
     1648:	81 50       	subi	r24, 0x01	; 1
-    164a:	e1 f7       	brne	.-8      	; 0x1644 <main+0xda>
+    164a:	d9 f7       	brne	.-10     	; 0x1642 <main+0xd8>
     164c:	0f b6       	in	r0, 0x3f	; 63
     164e:	c1 53       	subi	r28, 0x31	; 49
     1650:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2758,7 +2758,7 @@
     166e:	95 91       	lpm	r25, Z+
     1670:	9d 93       	st	X+, r25
     1672:	81 50       	subi	r24, 0x01	; 1
-    1674:	e1 f7       	brne	.-8      	; 0x166e <main+0x104>
+    1674:	d9 f7       	brne	.-10     	; 0x166c <main+0x102>
     1676:	0f b6       	in	r0, 0x3f	; 63
     1678:	c1 53       	subi	r28, 0x31	; 49
     167a:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2779,7 +2779,7 @@
     1698:	95 91       	lpm	r25, Z+
     169a:	9d 93       	st	X+, r25
     169c:	81 50       	subi	r24, 0x01	; 1
-    169e:	e1 f7       	brne	.-8      	; 0x1698 <main+0x12e>
+    169e:	d9 f7       	brne	.-10     	; 0x1696 <main+0x12c>
     16a0:	0f b6       	in	r0, 0x3f	; 63
     16a2:	c1 53       	subi	r28, 0x31	; 49
     16a4:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2800,7 +2800,7 @@
     16c2:	95 91       	lpm	r25, Z+
     16c4:	9d 93       	st	X+, r25
     16c6:	81 50       	subi	r24, 0x01	; 1
-    16c8:	e1 f7       	brne	.-8      	; 0x16c2 <main+0x158>
+    16c8:	d9 f7       	brne	.-10     	; 0x16c0 <main+0x156>
     16ca:	0f b6       	in	r0, 0x3f	; 63
     16cc:	c1 53       	subi	r28, 0x31	; 49
     16ce:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2821,7 +2821,7 @@
     16ec:	95 91       	lpm	r25, Z+
     16ee:	9d 93       	st	X+, r25
     16f0:	81 50       	subi	r24, 0x01	; 1
-    16f2:	e1 f7       	brne	.-8      	; 0x16ec <main+0x182>
+    16f2:	d9 f7       	brne	.-10     	; 0x16ea <main+0x180>
     16f4:	0f b6       	in	r0, 0x3f	; 63
     16f6:	c1 53       	subi	r28, 0x31	; 49
     16f8:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2842,7 +2842,7 @@
     1716:	95 91       	lpm	r25, Z+
     1718:	9d 93       	st	X+, r25
     171a:	81 50       	subi	r24, 0x01	; 1
-    171c:	e1 f7       	brne	.-8      	; 0x1716 <main+0x1ac>
+    171c:	d9 f7       	brne	.-10     	; 0x1714 <main+0x1aa>
     171e:	0f b6       	in	r0, 0x3f	; 63
     1720:	c1 53       	subi	r28, 0x31	; 49
     1722:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2863,7 +2863,7 @@
     1740:	95 91       	lpm	r25, Z+
     1742:	9d 93       	st	X+, r25
     1744:	81 50       	subi	r24, 0x01	; 1
-    1746:	e1 f7       	brne	.-8      	; 0x1740 <main+0x1d6>
+    1746:	d9 f7       	brne	.-10     	; 0x173e <main+0x1d4>
     1748:	0f b6       	in	r0, 0x3f	; 63
     174a:	c1 53       	subi	r28, 0x31	; 49
     174c:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2884,7 +2884,7 @@
     176a:	95 91       	lpm	r25, Z+
     176c:	9d 93       	st	X+, r25
     176e:	81 50       	subi	r24, 0x01	; 1
-    1770:	e1 f7       	brne	.-8      	; 0x176a <main+0x200>
+    1770:	d9 f7       	brne	.-10     	; 0x1768 <main+0x1fe>
     1772:	0f b6       	in	r0, 0x3f	; 63
     1774:	c1 53       	subi	r28, 0x31	; 49
     1776:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2905,7 +2905,7 @@
     1794:	95 91       	lpm	r25, Z+
     1796:	9d 93       	st	X+, r25
     1798:	81 50       	subi	r24, 0x01	; 1
-    179a:	e1 f7       	brne	.-8      	; 0x1794 <main+0x22a>
+    179a:	d9 f7       	brne	.-10     	; 0x1792 <main+0x228>
     179c:	0f b6       	in	r0, 0x3f	; 63
     179e:	c1 53       	subi	r28, 0x31	; 49
     17a0:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2926,7 +2926,7 @@
     17be:	95 91       	lpm	r25, Z+
     17c0:	9d 93       	st	X+, r25
     17c2:	81 50       	subi	r24, 0x01	; 1
-    17c4:	e1 f7       	brne	.-8      	; 0x17be <main+0x254>
+    17c4:	d9 f7       	brne	.-10     	; 0x17bc <main+0x252>
     17c6:	0f b6       	in	r0, 0x3f	; 63
     17c8:	c1 53       	subi	r28, 0x31	; 49
     17ca:	df 4f       	sbci	r29, 0xFF	; 255
@@ -2948,7 +2948,7 @@
     17ea:	95 91       	lpm	r25, Z+
     17ec:	9d 93       	st	X+, r25
     17ee:	81 50       	subi	r24, 0x01	; 1
-    17f0:	e1 f7       	brne	.-8      	; 0x17ea <main+0x280>
+    17f0:	d9 f7       	brne	.-10     	; 0x17e8 <main+0x27e>
     17f2:	0f b6       	in	r0, 0x3f	; 63
     17f4:	c1 53       	subi	r28, 0x31	; 49
     17f6:	df 4f       	sbci	r29, 0xFF	; 255
@@ -4972,7 +4972,7 @@
     27d4:	95 91       	lpm	r25, Z+
     27d6:	9d 93       	st	X+, r25
     27d8:	81 50       	subi	r24, 0x01	; 1
-    27da:	e1 f7       	brne	.-8      	; 0x27d4 <main+0x126a>
+    27da:	d9 f7       	brne	.-10     	; 0x27d2 <main+0x1268>
     27dc:	0f b6       	in	r0, 0x3f	; 63
     27de:	c6 5a       	subi	r28, 0xA6	; 166
     27e0:	df 4f       	sbci	r29, 0xFF	; 255
@@ -5581,7 +5581,7 @@
     2ca0:	35 91       	lpm	r19, Z+
     2ca2:	3d 93       	st	X+, r19
     2ca4:	21 50       	subi	r18, 0x01	; 1
-    2ca6:	e1 f7       	brne	.-8      	; 0x2ca0 <main+0x1736>
+    2ca6:	d9 f7       	brne	.-10     	; 0x2c9e <main+0x1734>
     2ca8:	2a e0       	ldi	r18, 0x0A	; 10
     2caa:	0f b6       	in	r0, 0x3f	; 63
     2cac:	cb 53       	subi	r28, 0x3B	; 59
@@ -6010,7 +6010,7 @@
     3002:	95 91       	lpm	r25, Z+
     3004:	9d 93       	st	X+, r25
     3006:	81 50       	subi	r24, 0x01	; 1
-    3008:	e1 f7       	brne	.-8      	; 0x3002 <main+0x1a98>
+    3008:	d9 f7       	brne	.-10     	; 0x3000 <main+0x1a96>
     300a:	8a e0       	ldi	r24, 0x0A	; 10
     300c:	0f b6       	in	r0, 0x3f	; 63
     300e:	c6 55       	subi	r28, 0x56	; 86
@@ -10438,7 +10438,7 @@
     52ce:	45 91       	lpm	r20, Z+
     52d0:	4d 93       	st	X+, r20
     52d2:	21 50       	subi	r18, 0x01	; 1
-    52d4:	e1 f7       	brne	.-8      	; 0x52ce <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x19e>
+    52d4:	d9 f7       	brne	.-10     	; 0x52cc <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x19c>
     52d6:	4a 85       	ldd	r20, Y+10	; 0x0a
     52d8:	5b 85       	ldd	r21, Y+11	; 0x0b
     52da:	60 30       	cpi	r22, 0x00	; 0
@@ -10473,7 +10473,7 @@
     5314:	55 91       	lpm	r21, Z+
     5316:	5d 93       	st	X+, r21
     5318:	21 50       	subi	r18, 0x01	; 1
-    531a:	e1 f7       	brne	.-8      	; 0x5314 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x1e4>
+    531a:	d9 f7       	brne	.-10     	; 0x5312 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x1e2>
     531c:	2c 85       	ldd	r18, Y+12	; 0x0c
     531e:	40 30       	cpi	r20, 0x00	; 0
     5320:	d9 f0       	breq	.+54     	; 0x5358 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x228>
@@ -10595,7 +10595,7 @@
     5408:	25 91       	lpm	r18, Z+
     540a:	2d 93       	st	X+, r18
     540c:	41 50       	subi	r20, 0x01	; 1
-    540e:	e1 f7       	brne	.-8      	; 0x5408 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x2d8>
+    540e:	d9 f7       	brne	.-10     	; 0x5406 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x2d6>
     5410:	4a 85       	ldd	r20, Y+10	; 0x0a
     5412:	5b 85       	ldd	r21, Y+11	; 0x0b
     5414:	60 30       	cpi	r22, 0x00	; 0
@@ -10635,7 +10635,7 @@
     5458:	45 91       	lpm	r20, Z+
     545a:	4d 93       	st	X+, r20
     545c:	31 50       	subi	r19, 0x01	; 1
-    545e:	e1 f7       	brne	.-8      	; 0x5458 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x328>
+    545e:	d9 f7       	brne	.-10     	; 0x5456 <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x326>
     5460:	3c 85       	ldd	r19, Y+12	; 0x0c
     5462:	20 30       	cpi	r18, 0x00	; 0
     5464:	e1 f0       	breq	.+56     	; 0x549e <_ZN15worduino_engine6engine16blocked_by_walls17h2cd2afaac03476e8E+0x36e>

@gergoerdi gergoerdi changed the title [AVR] Branch offset zero in assembler output [AVR] Branch offset off by two in llvm-mc's output Jan 14, 2023
@gergoerdi
Copy link
Author

Annoyingly, I can't trigger this with a small .ll file. I've uploaded my real input to https://gist.github.com/gergoerdi/558a28630013d091d4e1cd22aa32b349

@gergoerdi
Copy link
Author

gergoerdi commented Jan 14, 2023

Note that off by 2 basically means that the relocation entry stored in the object file is computed from the start of the given rjmp instruction, but it is applied from its end.

So let's say we have this example:

	.text
	.globl	main
main:
        nop
        nop
	rjmp	.+6

Here, because the two nops each take up two bytes, the rjmp starts at 0x0004. LLVM puts 0x0004 + 6 = 0x000a in the relocation table:

00000004 R_AVR_13_PCREL    .text+0x0000000a

But right after the rjmp .+6, we are at address 0x0006 already, so a jump forward by +6 should result in 0x00c. Accordingly, the relocation should point to .text+0x0000000c.

@benshi001
Copy link
Member

fixed by https://reviews.llvm.org/D143901

@benshi001
Copy link
Member

fixed by 697a162

CarlosAlbertoEnciso pushed a commit to SNSystems/llvm-debuginfo-analyzer that referenced this issue Feb 15, 2023
In avr-gcc, the destination of "rjmp label + offset" is address
'label + offset', while destination of "rjmp . + offset" is
'address_of_rjmp + offset + 2'.

Clang is in accordance with avr-gcc for "rjmp label + offset", but
emits incorrect destination of "rjmp . + offset" to
'address_of_rjmp + offset', in which the expected offset 2 is missing.

This patch fixes the above issue.

Fixes llvm/llvm-project#60019

Reviewed By: jacquesguan, aykevl

Differential Revision: https://reviews.llvm.org/D143901
veselypeta pushed a commit to veselypeta/cherillvm that referenced this issue Aug 12, 2024
In avr-gcc, the destination of "rjmp label + offset" is address
'label + offset', while destination of "rjmp . + offset" is
'address_of_rjmp + offset + 2'.

Clang is in accordance with avr-gcc for "rjmp label + offset", but
emits incorrect destination of "rjmp . + offset" to
'address_of_rjmp + offset', in which the expected offset 2 is missing.

This patch fixes the above issue.

Fixes llvm/llvm-project#60019

Reviewed By: jacquesguan, aykevl

Differential Revision: https://reviews.llvm.org/D143901
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants