[GSoC] coreboot for ARM64 Qemu – Week #7

This was a tough week. After having passed the coreboot building stage, I thought my work would be easier now. But I had another thing coming.

As I had mentioned in my last post, I didn’t have any output while booting on qemu. So, the first aim was to get qemu monitor working. After some debug, I was able to get qemu monitor working to print onto my terminal (stdio)
This gave me then following :

qemu: fatal: Trying to execute code outside RAM or ROM at 0x0000000008000000

R00=00000950 R01=ffffffff R02=44000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=00000000 R14=40010010 R15=08000000
PSR=400001db -Z– A und32
s00=00000000 s01=00000000 d00=0000000000000000
s02=00000000 s03=00000000 d01=0000000000000000
s04=00000000 s05=00000000 d02=0000000000000000
s06=00000000 s07=00000000 d03=0000000000000000
s08=00000000 s09=00000000 d04=0000000000000000
s10=00000000 s11=00000000 d05=0000000000000000
s12=00000000 s13=00000000 d06=0000000000000000
s14=00000000 s15=00000000 d07=0000000000000000
s16=00000000 s17=00000000 d08=0000000000000000
s18=00000000 s19=00000000 d09=0000000000000000
s20=00000000 s21=00000000 d10=0000000000000000
s22=00000000 s23=00000000 d11=0000000000000000
s24=00000000 s25=00000000 d12=0000000000000000
s26=00000000 s27=00000000 d13=0000000000000000
s28=00000000 s29=00000000 d14=0000000000000000
s30=00000000 s31=00000000 d15=0000000000000000
s32=00000000 s33=00000000 d16=0000000000000000
s34=00000000 s35=00000000 d17=0000000000000000
s36=00000000 s37=00000000 d18=0000000000000000
s38=00000000 s39=00000000 d19=0000000000000000
s40=000000 Abort trap: 6

I did some searching, this meant that the bootloader could not be loaded. And realised maybe the ROM qemu is being allotted is not sufficient. The ‘execute outside ram or rom’ is usually a jump to somewhere that qemu does not recognize as ROM/RAM.
Since we expect
CONFIG_BOOTBLOCK_BASE is 0x10000
CONFIG_ROMSTAGE_BASE  is 0x20000
CONFIG_SYS_SDRAM_BASE is 0x1000000
i.e ROM to start at 64k. So I ran qemu by giving a -m 2048M (for testing) and got over this fatal qemu error, but still wasn’t able to get coreboot to boot (no output on serial). This meant, some more debugging was needed.

I started to debug this using gdb. I created a gdb stub in the qemu boot (by using -s -S), but running gdb to connect to it gave me :
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
warning: Architecture rejected target-supplied description
0x40080000 in ?? ()

Which probably meant I will have to have to build a cross gdb (aarch64-linux-gnu-gdb) and use that.
For this, on linux we could have something called gdb-multiarch, but this is not available for macOSX.

I then turned to using Valgrind. There are Valgrind tools available that can help detect many memory management bugs.

This is what I got on valgrind,

==2070== Memcheck, a memory error detector
==2070== Copyright (C) 2002-2013, and GNU GPL’d, by Julian Seward et al.
==2070== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==2070== Command: aarch64-softmmu/qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -nographic -m 2048 -kernel /Users/naman/gsoc/coreboot2.0/coreboot/build/coreboot.rom
==2070==
–2070– aarch64-softmmu/qemu-system-aarch64:
–2070– dSYM directory is missing; consider using –dsymutil=yes
UNKNOWN __pthread_sigmask is unsupported. This warning will not be repeated.
–2070– WARNING: unhandled syscall: unix:330
–2070– You may be able to write your own handler.
–2070– Read the file README_MISSING_SYSCALL_OR_IOCTL.
–2070– Nevertheless we consider this a bug.  Please report
–2070– it at http://valgrind.org/support/bug_reports.html.
==2070== Warning: set address range perms: large range [0x1053c5040, 0x1253c5040) (undefined)
==2070== Warning: set address range perms: large range [0x239e56040, 0x255e55cc0) (undefined)
==2070== Warning: set address range perms: large range [0x255e56000, 0x2d5e56000) (defined)

2070 set address range perms means that the permissions changed on a particularly large block of memory.
That can happen because when a very large memory allocation or deallocation occurs – a mmap or umap call. Which meant we are leaking some memory, but we need to find where. I read some documentations and believe something called a massif tool (in valgrind) could be used. I am now looking at how to find where this memory gets eaten.

On the target now is getting some answers on valgrind if possible. But if I dont get sufficient leads, I would have to switch to gdb (aarch64 on macOSX) and continue my debugging.