
aarch64: enhance main makefile to support building from host for aarch64 #1106

Closed
wkozaczuk opened this issue Nov 2, 2020 · 15 comments

@wkozaczuk (Collaborator)

Currently, we only support cross-compiling the aarch64 version of the OSv kernel on an x64 Fedora build machine/container. Given that cheaper yet powerful arm64 hardware like the RPI 4 is available these days, it would be nice to support building natively on an aarch64 host.

@punitagrawal (Contributor)

As a step towards enabling native builds, I'd like to get rid of the dependency on locally unpacked RPM packages.

To scope out the problem, I updated the Makefile to drop references to the downloaded files; instead using the distro provided library and headers. See the work in progress changes in the linked commit for the actual changes.

With this, I was able to build stage 1 (make stage1) on the RockPro64 running Debian Bullseye.

Can you try using the linked commit in your environment? Does the approach make sense to you?

@punitagrawal (Contributor)

I was able to test the above changes to compile natively for x86_64 on Ubuntu 18.04 and it seems like things still work. whew

Thinking about build use cases, the following are common ones that would be good to support -

  • Native compilation on x86
    • Ideally independent of distribution and using distro native compiler / libraries
  • Cross compilation on arm64
  • Native compilation on arm64
    • Independent of distribution and using distro native compiler / libraries

Thoughts?

@wkozaczuk (Collaborator, Author)

If the second use case is "Cross-compilation to arm64 on x86_64", then you are right that these are the 3 use cases we want to support. Now, the commit in your clone that you were referring to above seems to indicate that you are removing support for the 2nd use case; am I wrong?

Also, the difficulty with use case 2 is that cross-compiling to arm64 on Ubuntu, Fedora, and possibly other Linux distributions would differ because of the different locations of aarch64 headers and libraries on the x64 host, which would have to be accounted for in the main makefile, I think (but I might be wrong). Also, we would need to modify setup.py to make sure the correct packages are installed for a given distribution.

But yes ideally we want to support all 3 use cases.

BTW were you able to compile OSv natively on RockPro64 and run it using your modified makefile?

@punitagrawal (Contributor)

If the second use case is "Cross-compilation to arm64 on x86_64", then you are right that these are the 3 use cases we want to support.

Indeed - that's what I meant.

Now the commit in your clone you were referring to above seems to indicate that you are removing support for the 2nd use case, am I wrong?

The branch is still work-in-progress and needs more updates before it's done. More below.

Also, the difficulty with use case 2 is that cross-compiling to arm64 on Ubuntu, Fedora, and possibly other Linux distributions would differ because of different locations of aarch64 headers and libraries on the x64 host and have to be accounted for in the main makefile I think (but I might be wrong).

I am not too familiar with Fedora but I had naively assumed that it would have something similar to Debian multi-arch support - where it's possible to install packages (including libraries and development headers) from multiple architecture (link) on the same filesystem. Once the appropriate compiler for the target architecture is chosen, it can be queried for the library locations. With the multiarch support, there's no difference between cross and native compiling - which is a big win in terms of build system complexity. Following this, I started making the changes to the Makefile in the linked branch.
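The "query the compiler for the library locations" step can be illustrated with a small sketch. The `lib_found` helper below is a made-up name for illustration; the actual Makefile performs the equivalent check with `$(filter /%,...)` on the output of `-print-file-name`:

```shell
#!/bin/sh
# When a library is installed, `gcc -print-file-name=libfoo.a` prints an
# absolute path to it; when it is not, gcc simply echoes the bare name back.
# This helper keys off that distinction, like $(filter /%,...) in the Makefile.
lib_found() {
    case "$1" in
        /*) return 0 ;;   # absolute path: the compiler located the library
        *)  return 1 ;;   # bare name echoed back: not installed
    esac
}

# Live usage would be something like:
#   lib_found "$(aarch64-linux-gnu-gcc -print-file-name=libstdc++.a)"
lib_found "/usr/lib/gcc/aarch64-linux-gnu/9/libstdc++.a" && echo "found"    # prints "found"
lib_found "libstdc++.a" || echo "missing"                                   # prints "missing"
```

With multiarch, the same check works whether the queried compiler is the native `gcc` or a cross `aarch64-linux-gnu-gcc`, which is what makes the two build flavors converge.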

On further investigation, it seems that Fedora does not support cross-installing userspace libraries; only the compiler is available. This is somewhat OK for Linux kernel (or bare-metal) development but doesn't work when cross-compiling applications that need userspace libraries, without resorting to tricks like what's being done for OSv.

On Fedora-like systems, one way forward would be to create a unified sysroot with all the required cross dependencies, instead of one per package as is done right now.

BTW were you able to compile OSv natively on RockPro64 and run it using your modified makefile?

Unfortunately, the build is broken on the RockPro64 due to an unrelated issue with gcc-10 and the need for __getauxval() when linking against libgcc. Cross-compiling on Ubuntu succeeds, but the resulting image doesn't show any output when run.

What would be the simplest test for building / running the kernel? I need to see where things are stuck once I figure out how to look at this through gdb (or something).

@wkozaczuk (Collaborator, Author) commented Nov 10, 2020

I think the easiest way to support option 2 would be to replace all conditionals ifeq ($(arch),aarch64) with

ifeq ($(host_arch),x64)
ifeq ($(arch),aarch64) 

and let native aarch64 be handled mostly (more about this below) as x64 is. For completeness, we should catch host_arch == aarch64 and arch == x64 and fail, as I doubt we will ever support that combination.
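The proposed nested conditionals boil down to a small (host_arch, arch) decision table. Here is a hypothetical sketch of that matrix; the `build_mode` function name and the outcome labels are mine, not actual build-system code:

```shell
#!/bin/sh
# Hypothetical sketch of the (host_arch, arch) matrix described above:
# native builds when the arches match, cross-compiling only for x64 -> aarch64,
# and an early failure for aarch64 -> x64, which is unlikely to be supported.
build_mode() {
    host_arch="$1"; target_arch="$2"
    if [ "$host_arch" = "$target_arch" ]; then
        echo native
    elif [ "$host_arch" = "x64" ] && [ "$target_arch" = "aarch64" ]; then
        echo cross
    else
        echo unsupported
    fi
}

build_mode x64 aarch64       # prints "cross"
build_mode aarch64 aarch64   # prints "native"
build_mode aarch64 x64       # prints "unsupported"
```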

As far as cross-building aarch64 on x64 goes, for now I would just stick to Fedora and tackle that separately (in the worst case, one can always cross-build in a Fedora container).

On a similar note, I actually took a stab at building OSv on Ubuntu 20.04.1 on a Raspberry PI 4B by using your updated Makefile and slightly adjusting it and other scripts. Here is the complete diff:

diff --git a/Makefile b/Makefile
index d1597263..cf017fab 100644
--- a/Makefile
+++ b/Makefile
@@ -192,25 +192,6 @@ local-includes =
 INCLUDES = $(local-includes) -Iarch/$(arch) -I. -Iinclude  -Iarch/common
 INCLUDES += -isystem include/glibc-compat
 
-aarch64_gccbase = build/downloaded_packages/aarch64/gcc/install
-aarch64_boostbase = build/downloaded_packages/aarch64/boost/install
-
-ifeq ($(arch),aarch64)
-ifeq (,$(wildcard $(aarch64_gccbase)))
-    $(error Missing $(aarch64_gccbase) directory. Please run "./scripts/download_fedora_aarch64_packages.py")
-endif
-ifeq (,$(wildcard $(aarch64_boostbase)))
-    $(error Missing $(aarch64_boostbase) directory. Please run "./scripts/download_fedora_aarch64_packages.py")
-endif
-endif
-
-ifeq ($(arch),aarch64)
-  gcc-inc-base := $(dir $(shell find $(aarch64_gccbase)/ -name vector | grep -v -e debug/vector$$ -e profile/vector$$ -e experimental/vector$$))
-  gcc-inc-base3 := $(dir $(shell dirname `find $(aarch64_gccbase)/ -name c++config.h | grep -v /32/`))
-  INCLUDES += -isystem $(gcc-inc-base)
-  INCLUDES += -isystem $(gcc-inc-base3)
-endif
-
 ifeq ($(arch),x64)
 INCLUDES += -isystem external/$(arch)/acpica/source/include
 endif
@@ -221,24 +202,23 @@ INCLUDES += -isystem $(libfdt_base)
 endif
 
 INCLUDES += $(boost-includes)
-ifeq ($(arch),x64)
 # Starting in Gcc 6, the standard C++ header files (which we do not change)
 # must precede in the include path the C header files (which we replace).
 # This is explained in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70722.
 # So we are forced to list here (before include/api) the system's default
 # C++ include directories, though they are already in the default search path.
 INCLUDES += $(shell $(CXX) -E -xc++ - -v </dev/null 2>&1 | awk '/^End/ {exit} /^ .*c\+\+/ {print "-isystem" $$0}')
-endif
 INCLUDES += $(pre-include-api)
 INCLUDES += -isystem include/api
 INCLUDES += -isystem include/api/$(arch)
-ifeq ($(arch),aarch64)
-  gcc-inc-base2 := $(dir $(shell find $(aarch64_gccbase)/ -name unwind.h))
-  # must be after include/api, since it includes some libc-style headers:
-  INCLUDES += -isystem $(gcc-inc-base2)
-endif
 INCLUDES += -isystem $(out)/gen/include
 INCLUDES += $(post-includes-bsd)
+# must be after include/api, since it includes some libc-style headers:
+#INCLUDES += -isystem /usr/include -isystem /usr/include/$(arch)-linux-gnu -isystem /usr/lib/gcc/aarch64-linux-gnu/9/include
+INCLUDES += -isystem /usr/lib/gcc/aarch64-linux-gnu/9/include \
+-isystem /usr/local/include \
+-isystem /usr/include/aarch64-linux-gnu \
+-isystem /usr/include
 
 post-includes-bsd += -isystem bsd/sys
 # For acessing machine/ in cpp xen drivers
@@ -266,8 +246,6 @@ $(out)/musl/%.o: source-dialects =
 
 kernel-defines = -D_KERNEL $(source-dialects)
 
-gcc-sysroot = $(if $(CROSS_PREFIX), --sysroot $(aarch64_gccbase)) \
-
 # This play the same role as "_KERNEL", but _KERNEL unfortunately is too
 # overloaded. A lot of files will expect it to be set no matter what, specially
 # in headers. "userspace" inclusion of such headers is valid, and lacking
@@ -289,7 +267,7 @@ COMMON = $(autodepend) -g -Wall -Wno-pointer-arith $(CFLAGS_WERROR) -Wformat=0 -
        $(kernel-defines) \
        -fno-omit-frame-pointer $(compiler-specific) \
        -include compiler/include/intrinsics.hh \
-       $(arch-cflags) $(conf-opt) $(acpi-defines) $(tracing-flags) $(gcc-sysroot) \
+       $(arch-cflags) $(conf-opt) $(acpi-defines) $(tracing-flags) \
        $(configuration) -D__OSV__ -D__XEN_INTERFACE_VERSION__="0x00030207" -DARCH_STRING=$(ARCH_STR) $(EXTRA_FLAGS)
 ifeq ($(arch),aarch64)
   COMMON += -nostdinc
@@ -1798,56 +1776,41 @@ objects += $(addprefix fs/, $(fs_objs))
 objects += $(addprefix libc/, $(libc))
 objects += $(addprefix musl/src/, $(musl))
 
-ifeq ($(arch),x64)
-    libstdc++.a := $(shell $(CXX) -print-file-name=libstdc++.a)
-    ifeq ($(filter /%,$(libstdc++.a)),)
-        $(error Error: libstdc++.a needs to be installed.)
-    endif
+libstdc++.a := $(shell $(CXX) -print-file-name=libstdc++.a)
+ifeq ($(filter /%,$(libstdc++.a)),)
+    $(error Error: libstdc++.a needs to be installed.)
+endif
 
-    libsupc++.a := $(shell $(CXX) -print-file-name=libsupc++.a)
-    ifeq ($(filter /%,$(libsupc++.a)),)
-        $(error Error: libsupc++.a needs to be installed.)
-    endif
-else
-    libstdc++.a := $(shell find $(aarch64_gccbase)/ -name libstdc++.a)
-    libsupc++.a := $(shell find $(aarch64_gccbase)/ -name libsupc++.a)
+libsupc++.a := $(shell $(CXX) -print-file-name=libsupc++.a)
+ifeq ($(filter /%,$(libsupc++.a)),)
+    $(error Error: libsupc++.a needs to be installed.)
 endif
 
-ifeq ($(arch),x64)
-    libgcc.a := $(shell $(CC) -print-libgcc-file-name)
-    ifeq ($(filter /%,$(libgcc.a)),)
-        $(error Error: libgcc.a needs to be installed.)
-    endif
+libgcc.a := $(shell $(CC) -print-libgcc-file-name)
+ifeq ($(filter /%,$(libgcc.a)),)
+    $(error Error: libgcc.a needs to be installed.)
+endif
 
-    libgcc_eh.a := $(shell $(CC) -print-file-name=libgcc_eh.a)
-    ifeq ($(filter /%,$(libgcc_eh.a)),)
-        $(error Error: libgcc_eh.a needs to be installed.)
-    endif
-else
-    libgcc.a := $(shell find $(aarch64_gccbase)/ -name libgcc.a |  grep -v /32/)
-    libgcc_eh.a := $(shell find $(aarch64_gccbase)/ -name libgcc_eh.a |  grep -v /32/)
+libgcc_eh.a := $(shell $(CC) -print-file-name=libgcc_eh.a)
+ifeq ($(filter /%,$(libgcc_eh.a)),)
+    $(error Error: libgcc_eh.a needs to be installed.)
 endif
 
-ifeq ($(arch),x64)
-    # link with -mt if present, else the base version (and hope it is multithreaded)
-    boost-mt := -mt
+
+# link with -mt if present, else the base version (and hope it is multithreaded)
+boost-mt := -mt
+boost-lib-dir := $(dir $(shell $(CC) --print-file-name libboost_system$(boost-mt).a))
+ifeq ($(filter /%,$(boost-lib-dir)),)
+    boost-mt :=
     boost-lib-dir := $(dir $(shell $(CC) --print-file-name libboost_system$(boost-mt).a))
     ifeq ($(filter /%,$(boost-lib-dir)),)
-        boost-mt :=
-        boost-lib-dir := $(dir $(shell $(CC) --print-file-name libboost_system$(boost-mt).a))
-        ifeq ($(filter /%,$(boost-lib-dir)),)
-            $(error Error: libboost_system.a needs to be installed.)
-        endif
+        $(error Error: libboost_system.a needs to be installed.)
     endif
-    # When boost_env=host, we won't use "-nostdinc", so the build machine's
-    # header files will be used normally. So we don't need to add anything
-    # special for Boost.
-    boost-includes =
-else
-    boost-lib-dir := $(firstword $(dir $(shell find $(aarch64_boostbase)/ -name libboost_system*.a)))
-    boost-mt := $(if $(filter %-mt.a, $(wildcard $(boost-lib-dir)/*.a)),-mt)
-    boost-includes = -isystem $(aarch64_boostbase)/usr/include
 endif
+# When boost_env=host, we won't use "-nostdinc", so the build machine's
+# header files will be used normally. So we don't need to add anything
+# special for Boost.
+boost-includes =
 
 boost-libs := $(boost-lib-dir)/libboost_system$(boost-mt).a
 
@@ -1928,11 +1891,7 @@ $(bootfs_manifest_dep): phony
                echo -n $(bootfs_manifest) > $(bootfs_manifest_dep) ; \
        fi
 
-ifeq ($(arch),x64)
 libgcc_s_dir := $(dir $(shell $(CC) -print-file-name=libgcc_s.so.1))
-else
-libgcc_s_dir := ../../$(aarch64_gccbase)/lib64
-endif
 
 $(out)/bootfs.bin: scripts/mkbootfs.py $(bootfs_manifest) $(bootfs_manifest_dep) $(tools:%=$(out)/%) \
                $(out)/zpool.so $(out)/zfs.so $(out)/libenviron.so $(out)/libvdso.so
diff --git a/scripts/build b/scripts/build
index fed78fd3..c71e78ce 100755
--- a/scripts/build
+++ b/scripts/build
@@ -262,11 +262,11 @@ kernel_end=$(($loader_size+2097151 & ~2097151))
 # the case in our old build.mk).
 cd $OUT
 
-if [[ "$arch" == 'aarch64' ]]; then
-       libgcc_s_dir=$(readlink -f ../downloaded_packages/aarch64/gcc/install/lib64)
-else
+#if [[ "$arch" == 'aarch64' ]]; then
+#      libgcc_s_dir=$(readlink -f ../downloaded_packages/aarch64/gcc/install/lib64)
+#else
        libgcc_s_dir=$(dirname $(readlink -f $(gcc -print-file-name=libgcc_s.so.1)))
-fi
+#fi
 
 if [ "$export" != "none" ]; then
        export_dir=${vars[export_dir]-$SRC/build/export}
diff --git a/scripts/setup.py b/scripts/setup.py
index 80ae31bf..f77bb89c 100755
--- a/scripts/setup.py
+++ b/scripts/setup.py
@@ -228,7 +228,7 @@ class Ubuntu(object):
                 'build-essential',
                 'curl',
                 'flex',
-                'g++-multilib',
+                #'g++-multilib',
                 'gawk',
                 'gdb',
                 'genromfs',
@@ -246,7 +246,7 @@ class Ubuntu(object):
                 'openssl',
                 'p11-kit',
                 'python3-requests',
-                'qemu-system-x86',
+                'qemu-system-aarch64',
                 'qemu-utils',
                 'tcpdump',
                 'unzip',

Please note these critical lines are different from your version of the makefile:

+#INCLUDES += -isystem /usr/include -isystem /usr/include/$(arch)-linux-gnu
+INCLUDES += -isystem /usr/lib/gcc/aarch64-linux-gnu/9/include \
+-isystem /usr/local/include \
+-isystem /usr/include/aarch64-linux-gnu \
+-isystem /usr/include

Without /usr/lib/gcc/aarch64-linux-gnu/9/include and /usr/local/include (the latter I am not sure is critical) my build would fail with the compiler unable to find unwind.h. Interestingly, natively building x64 on both Fedora and Ubuntu does not require those paths.

Now, when I compare the output of gcc -E -xc++ - -v </dev/null 2>&1 between Ubuntu on the RPI 4 and on x64, I see similar output:

(only relevant lines):
#include <...> search starts here:
 /usr/include/c++/9
 /usr/include/aarch64-linux-gnu/c++/9
 /usr/include/c++/9/backward
 /usr/lib/gcc/<arch>-linux-gnu/9/include
 /usr/local/include
 /usr/include/<arch>-linux-gnu
 /usr/include
End of search list.

But I do not understand why we have to explicitly add these 4 paths on aarch64. Any ideas?
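One way to see what is going on is to run the Makefile's awk filter against a canned copy of the search list above. The filter keeps only directories whose path mentions c++, so the last four paths never come from it; on x64 they are picked up through the compiler's default search, while the aarch64 build passes -nostdinc (visible in the diff above) and discards those defaults, which may be why they have to be re-added by hand. This is a sketch on canned data, not a live compiler query:

```shell
#!/bin/sh
# The include search list quoted above, canned so the example is deterministic.
search_list='#include <...> search starts here:
 /usr/include/c++/9
 /usr/include/aarch64-linux-gnu/c++/9
 /usr/include/c++/9/backward
 /usr/lib/gcc/aarch64-linux-gnu/9/include
 /usr/local/include
 /usr/include/aarch64-linux-gnu
 /usr/include
End of search list.'

# The same awk program the Makefile pipes the compiler output through:
# stop at "End of search list", keep only the C++ directories.
printf '%s\n' "$search_list" |
    awk '/^End/ {exit} /^ .*c\+\+/ {print "-isystem" $0}'
# prints exactly three -isystem lines; the gcc include dir, /usr/local/include,
# the multiarch dir, and /usr/include are all filtered out.
```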

Now, with my Makefile and the slightly adjusted build script, this command succeeds on the RPI 4 (it takes around 12 minutes versus ~4 minutes on my x64 machine, which is not bad):

./scripts/build image=native-example fs=rofs -j4 arch=aarch64 --create-disk

Unfortunately, trying to run it on the same RPI 4 leads to this crash:

./scripts/run.py -c 1
error: kvm run failed Function not implemented
 PC=0000000040205afc X00=0000000040723f4c X01=0000000000000010
X02=0000000000000000 X03=0000000000000001 X04=0000000000000021
X05=0000000000000040 X06=00000000000000f4 X07=0000000000001af4
X08=0000000000000009 X09=0000000048000000 X10=0000000040723e9c
X11=0000000000001af0 X12=00000000ffffffff X13=0000000000000000
X14=0000000048000000 X15=0000000040723e9c X16=0000000000000000
X17=0000000000000000 X18=0000000000000000 X19=0000000000000000
X20=0000000000000000 X21=0000000040723f4c X22=0000000000000000
X23=0000000000000000 X24=0000000000000000 X25=0000000000000000
X26=0000000000000000 X27=0000000000000000 X28=0000000000000000
X29=0000000040723f10 X30=0000000040205ad0  SP=0000000040723f10
PSTATE=600003c5 -ZC- EL1h

After connecting with gdb I see this:

(gdb) bt
#0  0x0000000040205afc in arch_init_early_console () at /usr/include/c++/9/new:174
#1  0x00000000400d7464 in premain () at loader.cc:99
#2  0x00000000400c004c in start_elf () at arch/aarch64/boot.S:37
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) disas 0x0000000040205afc-32,0x0000000040205afc+16
Dump of assembler code from 0x40205adc to 0x40205b0c:
   0x0000000040205adc <arch_init_early_console()+116>:	adrp	x2, 0x4009b000
   0x0000000040205ae0 <arch_init_early_console()+120>:	mov	w4, #0x21                  	// #33
   0x0000000040205ae4 <arch_init_early_console()+124>:	ldr	x19, [x19, #3032]
   0x0000000040205ae8 <arch_init_early_console()+128>:	mov	w3, #0x1                   	// #1
   0x0000000040205aec <arch_init_early_console()+132>:	ldr	x1, [x1, #2992]
   0x0000000040205af0 <arch_init_early_console()+136>:	mov	x0, x21
   0x0000000040205af4 <arch_init_early_console()+140>:	ldr	x2, [x2, #2072]
   0x0000000040205af8 <arch_init_early_console()+144>:	add	x1, x1, #0x10
=> 0x0000000040205afc <arch_init_early_console()+148>:	stp	x1, xzr, [x19]
   0x0000000040205b00 <arch_init_early_console()+152>:	str	w4, [x19, #16]
   0x0000000040205b04 <arch_init_early_console()+156>:	str	xzr, [x19, #24]
   0x0000000040205b08 <arch_init_early_console()+160>:	str	xzr, [x20, #8]
End of assembler dump.

And it always crashes the same way.

Based on what I see in the host /var/log/kern.log:

Nov 10 21:16:24 ubuntu kernel: [703172.159627] kvm [67476]: load/store instruction decoding not implemented

and the info in this link and the value of the x19 register (= 0), it seems like we are trying to write to an invalid part of memory at address 0. Interestingly, the same source code cross-built on a Fedora x64 machine works just fine on the same RPI 4. So there is some issue with building on Ubuntu (maybe the same issue regardless of whether it is native or cross-built): it must be producing a different binary.

From what I thought I saw in this issue conversation, you reported that you were able to boot an OSv image on the RockPro64 that was cross-built in a Fedora container on some x64 machine, right?

PS. BTW, I am not sure how well Debian is supported on x64 (per setup.py it seems to be supported, but maybe that was true a long time ago), but I have never tried it.

@punitagrawal (Contributor)

Many thanks for trying out the patch and your input.

I think the easiest way to support option 2 would be to replace all conditionals ifeq ($(arch),aarch64) with

ifeq ($(host_arch),x64)
ifeq ($(arch),aarch64) 

If you look more closely at the patch, on x86 (but also natively on arm64), instead of hard-coding the library and include paths, they are queried from the compiler. Why do we need the special casing?

As far as cross-building aarch64 on x64, for now, I would just stick to Fedora and tackle that separately (in the worst case one can always cross-built in a Fedora container).

I agree, for simplicity let's focus on enabling / fixing the native build for arm64 now.

As outlined below, there's a good chance we'll get the cross builds for free once we fix the native build issues.

On Debian (and also Ubuntu), this is true also for the cross-compile option (Option 2), as long as we query using the right compiler (aarch64-linux-gnu-gcc instead of gcc). As a consequence, there isn't much difference between native and cross compilation once the appropriate compiler is selected and the flags (via configs/$(arch).mk) are set up. I verified that the right include / library paths are used (V=1 during the build) by using the patch to cross-build for arm64 on Ubuntu 20.04 and also natively on the RockPro64 running Debian.
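The "query using the right compiler" idea can be sketched as a tiny prefix-selection helper. The `cross_prefix` name and the hard-coded `-linux-gnu-` triplet are illustrative assumptions (real triplets vary by distro), not actual build-system code:

```shell
#!/bin/sh
# Illustrative helper: with multiarch, the only thing that changes between a
# native and a cross build is which compiler gets queried for paths.
cross_prefix() {
    host="$1"; target="$2"
    if [ "$host" = "$target" ]; then
        echo ""                       # native build: plain gcc/g++
    else
        echo "${target}-linux-gnu-"   # cross build: e.g. aarch64-linux-gnu-gcc
    fi
}

CC="$(cross_prefix x86_64 aarch64)gcc"
echo "$CC"   # prints "aarch64-linux-gnu-gcc"
```

Once `$CC` is chosen this way, the same `-print-file-name` queries work unchanged for both flavors.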

For Fedora, we may be able to achieve something similar, but we would have to pass --sysroot <path to unpacked headers / libraries>. I still need to verify this though.

and let native aarch64 be handled mostly (more about this below) as for x64. For completeness, we should catch host_arch == aarch64 and arch == x64 and fail it as I doubt we would ever support it.

If the above works out, then support for cross-compiling for x86 on arm64 should fall out automatically, assuming the distribution provides packages for the cross headers / libraries.

(I'll split the thread here for ease of discussion)

@wkozaczuk (Collaborator, Author)

I think I have found the culprit causing a kernel built on Ubuntu to hang like (or similarly to) what I described above: the error: kvm run failed Function not implemented issue.

Here is a snippet of the makefile diff that fixes it:

@@ -772,10 +750,15 @@ libtsm += drivers/libtsm/tsm_vte_charsets.o
 drivers := $(bsd) $(solaris)
 drivers += core/mmu.o
 drivers += arch/$(arch)/early-console.o
+$(out)/arch/$(arch)/early-console.o: CXXFLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
 drivers += drivers/console.o
+$(out)/drivers/console.o: CXXFLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
 drivers += drivers/console-multiplexer.o
+$(out)/drivers/console-multiplexer.o: CXXFLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
 drivers += drivers/console-driver.o
+$(out)/drivers/console-driver.o: CXXFLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
 drivers += drivers/line-discipline.o
+$(out)/drivers/line-discipline.o: CXXFLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
 drivers += drivers/clock.o
 drivers += drivers/clock-common.o
 drivers += drivers/clockevent.o
@@ -822,6 +805,7 @@ endif # x64
 ifeq ($(arch),aarch64)
 drivers += drivers/mmio-isa-serial.o
 drivers += drivers/pl011.o
+$(out)/drivers/pl011.o: CXXFLAGS += -fvisibility=hidden -fvisibility-inlines-hidden
 drivers += drivers/xenconsole.o
 drivers += drivers/virtio.o
 drivers += drivers/virtio-pci-device.o
@@ -1798,56 +1782,41 @@ objects += $(addprefix fs/, $(fs_objs))

I will open a proper issue for it, but in short, the problem is caused by the arch_init_early_console code being compiled in a way that relies on the kernel's GOT, and on the specific R_AARCH64_GLOB_DAT entries for the related early-console fields being processed before this code runs. Unfortunately, right now this processing is done in core/elf.cc (init_table get_init(Elf64_Ehdr* header)), which is called AFTER the arch_init_early_console() call in loader.cc. Interestingly, the same code compiled by gcc on Fedora does not have this issue.

There are probably at least 2 ways of solving this:

  1. Hide the related symbols (which eventually should be hidden anyway) and thus force the compiler not to use the GOT.
  2. Call get_init(Elf64_Ehdr* header) before arch_init_early_console(), which might be even easier but might have other unforeseen consequences.

Please try making similar changes to your Makefile to see if you can build AArch64 OSv on your Ubuntu machine, and whether it runs successfully.

@wkozaczuk (Collaborator, Author)

Actually, after a closer examination of the generated assembly:

(gdb) disas arch_init_early_console
Dump of assembler code for function arch_init_early_console():
=> 0x0000000040205a68 <+0>:	stp	x29, x30, [sp, #-48]!
   0x0000000040205a6c <+4>:	adrp	x0, 0x4009b000
   0x0000000040205a70 <+8>:	adrp	x1, 0x4009a000
   0x0000000040205a74 <+12>:	mov	x29, sp
   0x0000000040205a78 <+16>:	ldr	x0, [x0, #936]
   0x0000000040205a7c <+20>:	str	x19, [sp, #16]
   0x0000000040205a80 <+24>:	adrp	x19, 0x4009b000
   0x0000000040205a84 <+28>:	adrp	x2, 0x4009b000
   0x0000000040205a88 <+32>:	ldr	x1, [x1, #2992]
   0x0000000040205a8c <+36>:	mov	w5, #0x21                  	// #33
   0x0000000040205a90 <+40>:	ldr	x19, [x19, #3016]
   0x0000000040205a94 <+44>:	add	x1, x1, #0x10
   0x0000000040205a98 <+48>:	ldr	x4, [x0]
   0x0000000040205a9c <+52>:	mov	w3, #0x1                   	// #1
   0x0000000040205aa0 <+56>:	ldr	x2, [x2, #2056]
   0x0000000040205aa4 <+60>:	stp	x1, xzr, [x19]
   0x0000000040205aa8 <+64>:	add	x0, sp, #0x2c
   0x0000000040205aac <+68>:	str	w5, [x19, #16]
   0x0000000040205ab0 <+72>:	str	xzr, [x19, #24]
   0x0000000040205ab4 <+76>:	str	xzr, [x4, #8]
   0x0000000040205ab8 <+80>:	strb	w3, [x2]
   0x0000000040205abc <+84>:	bl	0x4020d340 <dtb_get_uart(int*)>
   0x0000000040205ac0 <+88>:	cbz	x0, 0x40205adc <arch_init_early_console()+116>
   0x0000000040205ac4 <+92>:	mov	x1, x0
   0x0000000040205ac8 <+96>:	mov	x0, x19
   0x0000000040205acc <+100>:	bl	0x401f8378 <console::PL011_Console::set_base_addr(unsigned long)>
   0x0000000040205ad0 <+104>:	ldr	w1, [sp, #44]
   0x0000000040205ad4 <+108>:	mov	x0, x19
   0x0000000040205ad8 <+112>:	bl	0x401f8388 <console::PL011_Console::set_irqid(int)>
   0x0000000040205adc <+116>:	ldr	x19, [sp, #16]
   0x0000000040205ae0 <+120>:	ldp	x29, x30, [sp], #48
   0x0000000040205ae4 <+124>:	ret
End of assembler dump.

It looks like it is the same or similar code (see 0x0000000040205aa4 <+60>: stp x1, xzr, [x19]), but the difference is that the memory entry at 0x4009b000 + 3016 has a non-zero address (populated by the linker).

The bottom line is the same.

@wkozaczuk (Collaborator, Author)

Hi @punitagrawal,

With this commit addressing the __getauxval linker issue, and this patch I sent to the mailing list to make the early console work without symbol relocation, you should be able to build a bootable AArch64 kernel on Ubuntu with gcc 10.3. I tested it on my RPI 4.

@punitagrawal (Contributor)

Hi @wkozaczuk, apologies for the silence over the past few days - I've been distracted by a few other things.

I had a quick peek at the patch but won't have a chance to test it, and your other suggestion about hiding symbols, until the weekend. Instead of pulling in __getauxval, I was considering adding -mno-outline-atomics to the architecture CFLAGS in aarch64.mk.

ATM, I am not sure there's a strong use case for supporting LSE atomics (they are typically useful for multi-processor workloads on systems with a large number of cores).

@wkozaczuk (Collaborator, Author)

Based on my tests, even if you compile with the -mno-outline-atomics flag, libgcc.a still wants __getauxval. This is actually not a big deal, as __getauxval is simply an alias for the existing getauxval.

@punitagrawal (Contributor)

... this patch I sent to the mailing list to make the early console work without symbol relocation, you should be able to build bootable AArch64 kernel on Ubuntu with gcc 10.3. I tested it on my RPI 4.

I applied the patch and am able to successfully run native-example about half the time. The other half of the time, it crashes with -

...
getauxval() stubbed
smp_launch ENTERED, lr=00000000400da784
page_fault ENTERED, lr=000000004020e828
faulting address ffff80000a000000
elr exception ra 00000000402eb080
page fault outside application, addr: 0xffff80000a000000
[registers]
PC: 0x00000000402eb080 <mmio_getl(void volatile*)+0>
X00: 0xffff80000a000000 X01: 0x008000000a000713 X02: 0xffff80000a001000
X03: 0x008000000a000713 X04: 0xffff80000a000000 X05: 0xffff80000a000fff
X06: 0xffff80000a000000 X07: 0x0000000000000000 X08: 0x0000000000000001
X09: 0xffff800040c87008 X10: 0x0000000000000713 X11: 0xfffffffffffff8a3
X12: 0x0080000000000003 X13: 0x0000200000100a50 X14: 0x0000000000000703
X15: 0x0000000000000000 X16: 0x0000000000000000 X17: 0x0000000000000000
X18: 0x0000000000000000 X19: 0xffffa00040c89180 X20: 0xffffa000407d71c0
X21: 0x0000000000000030 X22: 0x0000000000000200 X23: 0x00000000406d4e00
X24: 0xffffa000407d7280 X25: 0xffffa00040955d00 X26: 0x0000000040200eb0
X27: 0x000000000a000000 X28: 0x0000000040723e98 X29: 0x0000200000100ab0
X30: 0x0000000040200bd4 SP:  0x0000200000100ab0 ESR: 0x0000000096000007
PSTATE: 0x0000000080000345
Aborted

[backtrace]
0x00000000401dc6fc <mmu::vm_fault(unsigned long, exception_frame*)+748>
0x000000004020ea28 <page_fault+244>
0x000000004020e824 <???+1075898404>
0x0000000040200d44 <virtio::register_mmio_devices(hw::device_manager*)+180>
0x000000004020be2c <arch_init_drivers()+152>
0x00000000400db00c <do_main_thread(void*)+108>
0x0000000040349d50 <???+1077189968>
0x00000000402e8760 <thread_main_c+32>
0x00000000402e6060 <sched::cpu::reschedule_from_interrupt(bool, std::chrono::duration<long, std::ratio<1l, 1000000000l> >)+812>

@wkozaczuk (Collaborator, Author) commented Dec 1, 2020 via email

@punitagrawal (Contributor)

It seems to be failing in trying to initialize virtio over mmio devices. Do you expect any? Are you trying to run it on QEMU with KVM? Can you execute run.py with --dry-run and send it to me?

I am indeed running it with QEMU on KVM. Are you normally running it straight from the bootloader?

Here's the build and qemu command line (ignore the taskset; I need that due to big.LITTLE on the RockPro64) -

% ./scripts/build image=native-example fs=rofs -j4 arch=aarch64 --create-disk
...
% taskset -c 0-3 ./scripts/run.py --arch=aarch64 -c 1 --dry-run
qemu-system-aarch64 \
-m 2G \
-smp 1 \
--nographic \
-gdb tcp::1234,server,nowait \
-kernel /home/punit/src/osv/build/last/loader.img \
-append "--disable_rofs_cache --rootfs=rofs /hello" \
-machine virt \
-device virtio-blk-pci,id=blk0,drive=hd0,scsi=off \
-drive file=/home/punit/src/osv/build/last/disk.img,if=none,id=hd0,cache=none,aio=native \
-netdev user,id=un0,net=192.168.122.0/24,host=192.168.122.1 \
-device virtio-net-pci,netdev=un0 \
-device virtio-rng-pci \
-enable-kvm \
-cpu host

@wkozaczuk (Collaborator, Author)

Can you comment out the line virtio::register_mmio_devices(device_manager::instance()); in arch/aarch64/arch-setup.cc?

I wonder if we have some bug either in parsing the DTB or in memory-mapping the areas where these devices should be mapped.

Can you also add some debugging statements around the logic that parses the DTB (dtb_parse_mmio_virtio_devices())? Interestingly, I do not see this problem when running on the RPI 4.
