[GSoC] coreboot for ARM64 Qemu – Week #9 #10

In the last post I talked about using aarch64-linux-gnu-gdb and debugging in qemu. In these two weeks I was intensely involved in stepping through gdb, disassembly and in-turn debugging the qemu port. I summarise the major highlights below.

Firstly, the correct instruction to invoke qemu is as follows

./aarch64-softmmu/qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -nographic -smp 1 -m 2048 -bios ~/coreboot/build coreboot.rom -s -S

After invoking gdb, I moved onto tracing the execution of the instructions step by step to determine where and how the code fails. A compendium of the code execution is as follows

gdb) target remote :1234
Remote debugging using :1234
(gdb) set disassemble-next-line on
(gdb) stepi
0x0000000000000980 in ?? ()
=> 0x0000000000000980: 02 00 00 14 b 0x988
0x0000000000000988 in ?? ()
=> 0x0000000000000988: 1a 00 80 d2 mov x26, #0x0                    // #0
0x000000000000098c in ?? ()
=> 0x000000000000098c: 02 00 00 14 b 0x994
(gdb) c
Program received signal SIGINT, Interrupt.
0x0000000000000750 in ?? ()
=> 0x0000000000000750: 3f 08 00 71 cmp w1, #0x2

The detailed version can be seen here.

The first sign of error can be seen here, where the instruction is 0 and the address is way off.

0x64672d3337303031 in ?? ()
=> 0x64672d3337303031: 00 00 00 00 .inst 0x00000000 ; undefined

To find insights as to why this is happening, I resorted to tracing in gdb. This can be done by adding the following in the qemu invoke command. This creates a log file in /tmp which can be read to determine suitable information.

-d out_asm,in_asm,exec,cpu,int,guest_errors -D /tmp/qemu.log

Looking at the disassembly, it can be seen that execution of instructions till 0x784 is correct and it goes bonkers immediately after it. Looking at the trace, this is where the code hangs

0x0000000000000784:  d65f03c0      ret
The ret goes to somewhere bad. So the stack has been blown or it has executed into an area it should have prior to this. Next, I did a objdump on the bootblock.debug file. Relating to the code at this address, it could be determined that the code fails at “ret in 0000000000010758 <raw_write_sctlr>:”
Next up was determining where the stack gets blown or corrupt. For this, while stepping through each instruction, I looked at the stack pointer. The output here shows the details. Everything seems to function correctly till 0x00000000000007a0 (0x00000000000007a0: f3 7b 40 a9 ldp x19, x30, [sp] ), then next is 0x00000000000007a4: ff 43 00 91 add sp, sp, #0x10 . This is where saved pc goes corrupt. This code gets called in the “raw_write_sctlr_current” (using objdump)
From the trace, we have the following information : The ret goes bad at 0000000000010758 <raw_write_sctlr>:
0x0000000000000908:  97fffe06      bl #-0x7e8 (addr 0x120)
0x0000000000000120:  3800a017      sturb w23, [x0, #10]
0x0000000000000124:  001c00d5      unallocated (Unallocated)
Taking exception 1 [Undefined Instruction]
…from EL1
…with ESR 0x2000000
Which is here:
0000000000010908 <arm64_c_environment>:
   10908: 97fffe06  bl 10120 <loop3_csw+0x1b>
   1090c: aa0003f8  mov x24, x0
This finally gave some leads in the qemu debug. There seems be some misalignment in smp_processor_id.
While tracing in gdb, we have
0x0000000000000908 in ?? ()
=> 0x0000000000000908: 06 fe ff 97   bl  0x120
(which is actually bl smp_processor_id (from src/arch/arm64/stage_entry.S))
Under arm64_c_environment (in objdump) we have;
10908:       97fffe06        bl      10120 <loop3_csw+0x1b>
Also in the trace we have
0x0000000000000908:  97fffe06      bl #-0x7e8 (addr 0x120)

Now loop3_csw is defined at (from objdump)
0000000000010105 <loop3_csw>:

So this + 0x1b = 10120

Thus it wants to branch and link to 0x120 but smp_processor_id is at 121.

smp_processor_id is at (from objdump)
0000000000010121 <smp_processor_id>:

This gives us where the code is failing. Next up is finding out the reason for this misalignment and rectifying it.




[GSoC] coreboot for ARM64 Qemu – Week #8

As I had discussed in my last blog post, currently I am onto the debug of the qemu boot. I was intending to use Valgrind tools to detect various memory managements bugs and use that information for my debug. But sadly the information provided by Valgrind was not of much use since it didn’t deal with the execution stream of the coreboot code in qemu. I ultimately had to turn to gdb and use it for further debug.

This was an initial hiccup, since, as in my last post, building aarch64-linux-gnu-gdb on MacOSX was not straightforward, since there was no direct replacement for the “gdb-multiarch”. I was able to get this done. I discuss some of the basic steps of how to set it up below.

First, we need a couple to packages to build gdb. They are listed below:

expat guile texinfo

Next, download the aarch64-gdb from here. Now, you need to configure CC to gcc (GNU gcc and not the innate symlink to clang). Then proceed to,

$ ./configure --target=aarch64-linux-gnu
$ make
$ make install

If this completes successfully, you would have aarch64-gdb installed on your system correctly. The important thing to remember is to use GNU gcc (>=4.9) and not the innate MacOS gcc.

To run gdb you must

$ aarch64-linux-gnu-gdb

The output looks like this :

GNU gdb (GDB) 7.9
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-apple-darwin13.3.0 --target=aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
Find the GDB manual and other documentation resources online at:
For help, type "help".
Type "apropos word" to search for commands related to "word".

Now I had gdb working. Then I did started the debug by giving a “-s -S” while invoking qemu. After this, I need to connect to gdb remotely using

(gdb) target remote : 1234

Some of the debug information I received was this :

(gdb) target remote :1234
Remote debugging using :1234
0x0000000000000200 in ?? ()
(gdb) run
The “remote” target does not support “run”.  Try “help target” or “continue”.
(gdb) continue
Program received signal SIGINT, Interrupt.
0x0000000000000200 in ?? ()

On trying single-step execution on gdb, I received :

(gdb) step
Cannot find bounds of current function
An error like this usually seen when we overflow a buffer and corrupt the stack, the proper return address is destroyed. When the debugger tries to figure out which function the address is in, it fails, because the address is not in any of the functions in the program.
On running the simple where on gdb I get [where displays the current line and function and the stack of calls that got you there]
(gdb) where
#0  0x0000000000000200 in ?? ()

After some unscrambling of the source code using information from gdb, we were pointed to some issues under the stage_entry in src/arch/arm64/stage_entry.S. I am onto re-setting those and continuing the debug further now.  



[GSoC] EC/H8S firmware week #7|#8

Week #7 was is little bit frustrating, because of no real progress, only more unfinished things which aren’t working. Week #8 was a lot better.

1. Sniffing the communication between the 2 embedded controllers H8S and PMH4.

I’ve tried to build an protocol analyser with the msp430, but the data output was somehow strange. For testing purpose I used my H8S firmware to produce testing data. But the msp430 decoded only wrong data. I’m using IRQs on the clock to do the magic and writing it to a buffer before transmitting it via UART. Maybe the msp430 is too slow for that? Possible. Set a GPIO to high when the IRQ routing start and to low when it ends. Visualize the clock signal and connect the  IRQ measure pin to an oscilloscope. The msp430 is far too slow. I’m using memory dereference in the IRQ routine, which takes a lot of time. Maybe the msp430 is fast enough, when using asm routine and registers to buffer the 3 byte transmission. But a logic analyser would definitely work. So I borrowed two logic analyser. An OLS (Openbench Logic Sniffer) and a Saleae Logic16.

There isn’t so much data on the lines. Every 50 ms there is a short transmission of 3 byte. But I don’t want to decode the data by hand. So it needs a decoder for the logic analyser. sigrok looks like the best start point and both analyser are supported.

I’ve started with the Openbench Logic Sniffer, but unfortunately it doesn’t have enough RAM to buffer the input long enough. Maybe the external trigger input can be used. But before doing additional things I would like to test with the Logic16.

The Logic16 doesn’t support any triggers but it can stream all data over USB even with multiple MHz. Good enough to capture all data. I found out that the best samplerate is 2 MHz. Otherwise the LE signal isn’t captured, because it’s a lot shorter than a clock change. In the end I created a decoder with libsigrokdecode.

sigrok-cli -i boots_and_shutdown_later_because_too_hot.sr –channels 0-3 -P ec_xp:clk=2:data=3:le=1:oe=0 | uniq -c 

67 0x01 0x07 0xc8
3 0x01 0x04 0xc8 
4 0x01 0x10 0x48
1120 0x01 0x17 0x48
67 0x01 0x07 0xc8

0x01 0x07 0xc8 is called when only power is plugged in, like a watchdog(every 500ms)
0x01 0x17 0x48 is called when the device is powered on, like a watchdog (every 50ms)
0x01 0x04 0xc8 around the time power button pressed
0x01 0x10 0x48 around the time power button pressed

2. Flash back the OEM H8S firmare

The OEM H8S firmware is included in the bios updates. cabextract and strings is enough for extracting it out of the update. Look for SREC lines. Put the SREC lines into a separate file and flash them back via UART bootloader and the renesas flash tool. The display powers up and it’s booting again with OEM BIOS.
I could imagine they are using a similar update method like the UART bootloader. First transfer a flasher application into RAM and afterwards communicate with the flasher to transfer the new firmware, but the communication works over LPC instead of UART.

3. Progress on the bootloader

I’ve implemented the ADC converter to enable the speaker amp and the display backlight brightness.

Written down LPC registers and just enable the Interface in order to get GateA20 working. Still unclear how far this works.

4. How to break into the bootloader?

The idea of the bootloader is providing a brick free environment for further development. The bootloader loads the application which adds full support for everything. It should be possible to stop the loading application and flash a new application into the EC flash. When starting development on the x60 or x201 I want to use I2C line as debug interface. I2C chips have a big footstep and are easy to access. But there must be a way to abort the loading. I will use the function key in combination with the leds.

  1. Remove the battery and power plug.
  2. Press the function key
  3. Put the power plug in
  4. Wait until leds blinking
  5. release the function key within 5 seconds after the leds starting to blink to enter the bootloader.

The H8S will become I2C slave on a specific address.

What next?

  • Add new PMH4 commands to the H8S
  • solder additional pins to MAINOFF PWRSW_H8 A20 KBRC
  • use the logic analyser to put the communication in relation with these signals
  • UART shell
  • I2C master & client
  • solder LPC pins to analyse firmware update process
  • test T40 board with new PMH4 commands and look if all power rails are on

[GSoC] coreboot for ARM64 Qemu – Week #7

This was a tough week. After having passed the coreboot building stage, I thought my work would be easier now. But I had another thing coming.

As I had mentioned in my last post, I didn’t have any output while booting on qemu. So, the first aim was to get qemu monitor working. After some debug, I was able to get qemu monitor working to print onto my terminal (stdio)
This gave me then following :

qemu: fatal: Trying to execute code outside RAM or ROM at 0x0000000008000000

R00=00000950 R01=ffffffff R02=44000000 R03=00000000
R04=00000000 R05=00000000 R06=00000000 R07=00000000
R08=00000000 R09=00000000 R10=00000000 R11=00000000
R12=00000000 R13=00000000 R14=40010010 R15=08000000
PSR=400001db -Z– A und32
s00=00000000 s01=00000000 d00=0000000000000000
s02=00000000 s03=00000000 d01=0000000000000000
s04=00000000 s05=00000000 d02=0000000000000000
s06=00000000 s07=00000000 d03=0000000000000000
s08=00000000 s09=00000000 d04=0000000000000000
s10=00000000 s11=00000000 d05=0000000000000000
s12=00000000 s13=00000000 d06=0000000000000000
s14=00000000 s15=00000000 d07=0000000000000000
s16=00000000 s17=00000000 d08=0000000000000000
s18=00000000 s19=00000000 d09=0000000000000000
s20=00000000 s21=00000000 d10=0000000000000000
s22=00000000 s23=00000000 d11=0000000000000000
s24=00000000 s25=00000000 d12=0000000000000000
s26=00000000 s27=00000000 d13=0000000000000000
s28=00000000 s29=00000000 d14=0000000000000000
s30=00000000 s31=00000000 d15=0000000000000000
s32=00000000 s33=00000000 d16=0000000000000000
s34=00000000 s35=00000000 d17=0000000000000000
s36=00000000 s37=00000000 d18=0000000000000000
s38=00000000 s39=00000000 d19=0000000000000000
s40=000000 Abort trap: 6

I did some searching, this meant that the bootloader could not be loaded. And realised maybe the ROM qemu is being allotted is not sufficient. The ‘execute outside ram or rom’ is usually a jump to somewhere that qemu does not recognize as ROM/RAM.
Since we expect
i.e ROM to start at 64k. So I ran qemu by giving a -m 2048M (for testing) and got over this fatal qemu error, but still wasn’t able to get coreboot to boot (no output on serial). This meant, some more debugging was needed.

I started to debug this using gdb. I created a gdb stub in the qemu boot (by using -s -S), but running gdb to connect to it gave me :
(gdb) target remote localhost:1234
Remote debugging using localhost:1234
warning: Architecture rejected target-supplied description
0x40080000 in ?? ()

Which probably meant I will have to have to build a cross gdb (aarch64-linux-gnu-gdb) and use that.
For this, on linux we could have something called gdb-multiarch, but this is not available for macOSX.

I then turned to using Valgrind. There are Valgrind tools available that can help detect many memory management bugs.

This is what I got on valgrind,

==2070== Memcheck, a memory error detector
==2070== Copyright (C) 2002-2013, and GNU GPL’d, by Julian Seward et al.
==2070== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==2070== Command: aarch64-softmmu/qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -nographic -m 2048 -kernel /Users/naman/gsoc/coreboot2.0/coreboot/build/coreboot.rom
–2070– aarch64-softmmu/qemu-system-aarch64:
–2070– dSYM directory is missing; consider using –dsymutil=yes
UNKNOWN __pthread_sigmask is unsupported. This warning will not be repeated.
–2070– WARNING: unhandled syscall: unix:330
–2070– You may be able to write your own handler.
–2070– Nevertheless we consider this a bug.  Please report
–2070– it at http://valgrind.org/support/bug_reports.html.
==2070== Warning: set address range perms: large range [0x1053c5040, 0x1253c5040) (undefined)
==2070== Warning: set address range perms: large range [0x239e56040, 0x255e55cc0) (undefined)
==2070== Warning: set address range perms: large range [0x255e56000, 0x2d5e56000) (defined)

2070 set address range perms means that the permissions changed on a particularly large block of memory.
That can happen because when a very large memory allocation or deallocation occurs – a mmap or umap call. Which meant we are leaking some memory, but we need to find where. I read some documentations and believe something called a massif tool (in valgrind) could be used. I am now looking at how to find where this memory gets eaten.

On the target now is getting some answers on valgrind if possible. But if I dont get sufficient leads, I would have to switch to gdb (aarch64 on macOSX) and continue my debugging.


[GSoC] coreboot for ARM64 Qemu – Week #6

This week I worked on completing the build and sorting all complications imposed by it. As I talked in the last post, I was facing some issues regarding setting up smp for this port. I solved this issue by adding an assembly file which declared smp_processor_id and then defined it by setting the right registers. I had to do some background reading on arm64 details. This provided me with the information I needed.

Next up, was another hitch. During the build, ‘mmu_enable()’ and ‘arch_secondary_cpu_init()’ function calls are happening for all stages but the definitions for these functions are getting compiled only for ramstage. So this gave recurrent errors since the compiler couldn’t find these definitions. While attempting to sort this, I stumbled across something on the chromium tree. There was a patch which dealt with some of the issues, similar to mine. I had to cherry pick and apply this change.

After debugging and sorting through some more errors, I was finally able to get it to build successfully.

coreboot.rom: 4096 kB, bootblocksize 37008, romsize 4194304, offset 0x90c0 alignment: 64 bytes, architecture: arm64

Name                                       Offset     Type         Size
fallback/romstage               0x90c0     stage        12108
fallback/ramstage               0xc080     stage        17768
config                                    0x10640    raw          2034
revision                                0x10e80    raw          577
(empty)                                 0x11100    null         4124312
      HOSTCC     cbfstool/rmodtool.o
     HOSTCC     cbfstool/rmodule.o
     HOSTCC     cbfstool/rmodtool (link)
The complete build can be found here.
I attempted to boot off on qemu after this,
$qemu-system-aarch64 -machine type=virt -nographic -bios ~/coreboot/build/coreboot.rom
This did not give any output, which meant probably I had to make some changes in the uart set up. I attempted to debug this by adding few printks early in bootblock once the console_init is done. This process ongoing, I hope to get through. Another aspect in question is the bootblock initialisation. The src/arch/arm64/armv8/bootblock_simple.c calls for an appropriate bootblock_cpu_init(). This is another thing I will be working on in the coming days.

[GSoC] EC/H8S firmware week #6

This week I looked at the communication between the EC H8S and the PMH4. The PMH4 (likely power management hub) is an ASIC which takes care of the power control. It controls who get’s power and who not. It doesn’t do any high level work, more like a big logic gatter. The PMH4 has inputs from several power good pins from different power rails and chips. On the output side it controls some power rails. It can also reset the H8S. The PMH4 also knows over some pins in which power state (ACPI S0,S4,S5) the board is. It doesn’t do any high level work. It’s more like a big logic gatter. There are no ADC on any power lines.

The PMH4 is connected to the H8S via 4 Pins. ~OE LE DATA CLK.

gsoc 2015 pmh4 connector t40

I connected a buspirate in SPI sniffer mode to debug the protocol. But the output looked a little bit strange. There was no data from the PMH4 to H8S (MISO) and the data comes in burst. To get more knowledge on the protocol I used a digital oscilloscope.

pmh4 oscilloscope

The protocol doesn’t look like SPI. LE get’s low after every transmission, ~OE is just high, clock and data just transfer the data. Sometimes when the H8S gets an interupt the Clock pause for some time and continues with the data afterwards. The clock is around ~400kHz.

I confirmed the protocol via the oscilloscope, but still I don’t get any sign from the board. No fan, nothing else. There must be more than this single transmisison. Maybe the board is to much damaged. My modified board was already broken when I got it. There is a loose connection related to the cardbus. Maybe this is my problem I don’t know.

I’ve two board with two connectors for the PMH4 here. Why not using the OEM one as starter help for the other one?

t42 gives some starting help

I think the PMH4 does what it should do. The H8S has an digital-analog-converter pin connected to the video brightness. But I haven’t implemented it yet. But I don’t think the device booted, because neither the CPU nor the chipset produce any heat. Ok, maybe it does, I only used my finger as thermometer. A thermal camera would help here. I’ll borrow a thermal camera for that.

There are lot of pins which I ignore atm. E.g. A20 pin. Is there something to do in a specific time serie?

What’s next?

  • build a small protocol sniffer for the PMH4 XP using a msp430 or stellaris arm
  • make progress on the bootloader
  • find a way to flash back the OEM H8S firmware
  • find a way to flash my bootloader via OEM flash tools

My requirements to the bootloader are

  • UART flashing via XMODEM
  • a simple UART shell
  • I2C as recovery and shell as well

I2C pins are a lot easier to find and modify than the H8S UART. I’m not yet sure if the H8S should be the master or the slave and on what address he should use? Multiple? UART tx is working. Rx is a task to do.

PMH4 / PMH7 / Thinker communication

On newer board the PMH interfaces changed (>= x60, t60, …). They merge the LPC interface and the XP interface into an protocol over SPI. And the new PMH is used as GPIO expander as well.

pmh4 pmh7 thinker communication


[GSoC] EC/H8S firmware week #5

The T40 is flashing leds! The toolchain is still a little bit tricky. I’m using the debian package gcc-h8300-hms, written a small linker script and took the startup assembly routine from Johann Gysin’s led radiator.

Now I can flash leds. But what about booting the board? I would say it’s enough to put

  • (!MAINOFF) = high
  • FAN ON = high
  • pulse high on (!PWRSW_H8)

But it’s not enough. Also the FAN isn’t starting to rotate. I’ll try to debug every pin this week and solder some debug pins for the 2nd EC (PMHx) to the my modified T40 as well as to an unmodified T42p. The H8S is talking to the PMHx via SPI, while the H8S is the master and is doing bit banging SPI in software, because it doesn’t have a hardware unit for that. I’ll also use these pins for testing my SPI implementation. I’ll try to reuse an open source SPI implementation.

I also asked me if it’s a good idea to port coreboot for the T40 before continuing any efforts to the EC, but it’s a little bit harder, because the T40 uses a LPC/FWH flash in a TSOP40 case. Another option is changing the hardware to a board which is already supported by coreboot like a x60/t60 or x201. But it’s much more harder to access the 8 pins for flashing the EC on these boards.

Before switching to another board, the powersequencing must work and I need a robust recovery way, because when you kill the EC by flashing a new firmware, you don’t get a second chance, unless you solder a lot. Chrome EC fix this problem by splitting the EC firmware into 2 parts. One read-only part and one read-write’able part. Only the second part gets updated and the read-only part can at least boots the device.

Before starting the H8S port for Chrome EC I want to have a bootloader. Because it would improve developing speed. I think implement this is much faster than doing the full Chrome EC support and most of the bootloader code can be re-used for Chrome EC.

I’m also not perfectly sure Chrome EC is the best solution. It’s special use-case is EC, which is perfect. But neither the documentation (I think there is more than one page) nor the bugtracker is public. Thus it makes difficult to use. I’m also not sure if Chrome EC would apply my H8S port into their repository.

[GSoC] coreboot for ARM64 Qemu – Week #4 and #5

From this week I started dealing with the core aspects of aarch64 design. I continued with the process of building the armv8, along with handling the required patching-up, interfacing, hook-ups. In my last post, I had talked about the toolchain building error (in binutils) for arm64 which I was facing on OSX. I had to remove the –enable-gold flag from the binutils. After making this small update, the build_BINUTILS looked like this, and I was able to get the toolchain working.

build_BINUTILS() {
 if [ $TARGETARCH == "x86_64-elf" ]; then
 CC="$CC" ../binutils-${BINUTILS_VERSION}/configure --prefix=$TARGETDIR \
 --target=${TARGETARCH} --enable-targets=${TARGETARCH}${ADDITIONALTARGET} \
 --disable-werror --disable-nls --enable-lto \
 --enable-plugins --enable-multilibs CFLAGS="$HOSTCFLAGS" || touch .failed
 $MAKE $JOBS || touch .failed
 $MAKE install DESTDIR=$DESTDIR || touch .failed

The work had just began after fixing the toolchain. On attempting to building, the faced error I got was :

toolchain.inc:137: *** building bootblock without the required toolchain. Stop.

This was due to certain wrongly configured CONFIG options in the Kconfig. After this stage the initial bring-up of arm64 looked stable. Moving forward, I was met with an error in src/arch/arm64/c_entry.c

src/arch/arm64/c_entry.c: In function ‘arm64_init’: src/arch/arm64/c_entry.c:52:2: error: implicit declaration of function ‘cpu_set_bsp’ [-Werror=implicit-function-declaration] cpu_set_bsp();

The inclusion of necessary files and structures were correct, and I kept getting this error. Furquan ultimately pointed to change 257370  following which, I could get past this. After this, I had to solve another BSD/OSX issue about date in genbuild_h.sh to get my build progressing.

Subsequently, some architectural decisions had to be made for the armv8. In the initial version, I had been banking on cbfs_media based structure in media.c, for creating functions for read, write and map. But the older formulation (in cbfs_core.c and cbfs_core.h) is changed now. In order to keep up, and to maintain uniformity, we decided to handle this as it is handled in armv7, i.e by creating a mapping to the qemu memory mapped space. Another point of discussion for stage loading. It is being brought up similar to armv7 for now. This might change in the future. Also the organisation for UART was finalised. plo11.c is included, as in  src/drivers/uart/Makefile.inc. by setting the DRIVERS_UART_PL011 in armv8 Kconfig.

Next hitch was dealing with SMP. In my proposal, I had suggested that incorporating smp into the emulation could be a long term goal. But since smp is a part of core arm64 logic, this cannot be completely ignored at this stage. I am met with this

coreboot/src/arch/arm64/stage_entry.S:94: undefined reference to `smp_processor_id’
build/arch/arm64/c_entry.romstage.o: In function `seed_stack’:


A cpu file which adds smp_processor_id() is needed at this moment, which I am currently working on. Next week’s plan is to get past these (and meet new unforeseen issues 😛 ) and boot off on qemu.




[GSoC] EC/H8S firmware week #3|4

In the last 2 weeks I managed to flash the H8S on the T40 using the OEM Renesas Flash Tool including their flash application. Flashing works in 2 steps, first upload a flash application into the H8S. Second this flash application will receive the firmware (via serial) and write it into the flash. Thanks to Renesas this application is available in source code. I would like to write an own flasher later.

But I wasn’t able to create a proper application yet. I could write the led programm in assembly, but having a working c compile is needed anyway.

I built a toolchain with gcc 4.9.2. The toolchain buildscript is very simple and can be found on github. I stopped my building efforts for now (building one based on gcc 4.4.6). There’s also a debian package for h8300 (based on gcc 3.4.6) which may be a good alternative. Before continuing in building toolchains and my led application, I’m reading me into linkerscipts and take a look how the compiler is working (e.g. what must a crt0 do?).

At the moment I know how the application should be compiled, where the reset vectors are and where the entrypoint. But putting these things together into a binary image is my task now.

The dev board I mentioned in my last post was stuck by the german post for the last 2 weeks, because there were on strike. The board is now in the custom office and I’ll collect in the next days, which will takes severals hours in Berlin.

[GSoC] Thinking about Logging

Hey coreboot community!

It’s been just under a month of working on coreboot. I’ve not been keeping up with my blog posts. So here is the first of three particularly long posts detailing what I’ve been up to, what decisions I’ve been making, and what problems I’ve faced. This post is a background on logging, the problems it has as of now, and how not to fix them.

Tomorrow I will be posting a blog post giving some background on a change I made to solve these problems in a fair way. I will also be posting a blog post about what’s up next for me.

So how does logging work?

For most programs, logging works simply through slinging text at some interface. A developer wanting to leak some information uses a function to print some information. You can then look at stdout/stderr, or an output file, or syslog, or whatever, and get that leaked run-time information.

Where coreboot differs from most programs is that coreboot is pre-OS. The abstractions that our operating systems provide aren’t available for coreboot, and not just the higher abstractions like syslog. Coreboot is self contained. Where a Hello World using stdio might be something like:

fprintf(stdout, "Hello World\n")

coreboot doesn’t have that privilege. In theory, we could provide all the same familiar functions, but why? We really don’t need them. This information ultimately ends up going out on some UART, so why bother with needless abstractions? We’re constrained to fitting coreboot and a payload and a video option rom and a TXE and so on into 8MB or less… It’s important that we are prudent in what we make available.

So what does the coreboot project do? We have printk(), which you might associate with the linux kernel (‘print kernel’). In fact, this is exactly the case. If you go into src/console/printk.c, you’ll see a humorous message in the copyright notice:

* blatantly copied from linux/kernel/printk.c
* Copyright (C) 1991, 1992 Linus Torvalds

Why? Recall that coreboot used to be linuxbios, and linuxbios was largely an effort to bootstrap linux. For a really awesome background on coreboot and its origins, and particularly for an interesting look at just how much coreboot has changed over time, you might be interested in watching the first few minutes of this video

But just because we’re inspired by linux doesn’t mean we’re inside linux. In spite of the copyright, coreboot’s printk() does not resemble linux at all anymore besides in name. We’re fully responsible for this function, so how does it work?

Let’s follow a debugging print statement, for example, one at src/lib/cbfs.c:102

DEBUG("Checking offset %zx\n", offset);

Already we have something interesting. What’s 'DEBUG()'? I thought it was printk()?

Go to the top of the file and you’ll see some preprocessor

#define DEBUG(x...) printk(BIOS_SPEW, "CBFS: " x)
#define DEBUG(x...)

Oh, okay. So depending on DEBUG_CBFS, the above DEBUG() call either means an empty statement, or

printk(BIOS_SPEW, "CBFS: " "Checking offset %zx\n", offset);

Firstly, what’s a 'BIOS_SPEW'? Just a constant, as defined in src/include/console/loglevels.h. You’ll notice again that these loglevels mirror the linux loglevels exactly, replacing 'KERNEL_' with 'BIOS_'.

What’s the point of it? If you want to control the verbosity of your logs, you set CONFIG_DEFAULT_CONSOLE_LOGLEVEL to the loglevel you want and you’ll get more or less verbosity as a result. If your configured log level is greater than the loglevel the printk statement is tagged with, it prints.

Why is this file doing this alias of printk? Is it done elsewhere? Why is there a separate CONFIG_DEBUG_CBFS variable? I’ll get to that in a second. This is actually some of the problem I’m setting out to fix. For now, let’s continue by assuming that our configuration is set such that it expands out to printk. Where, then, is printk defined?

Go into src/include/console/console.h and you’ll see what printk() is about:

static inline void printk(int LEVEL, const char *fmt, ...) {}


#define printk(LEVEL, fmt, args...) \
do { do_printk(LEVEL, fmt, ##args); } while(0)

Hmm, why the wrapper with preprocessor? If we are producing a final version of coreboot to go into a production setting, there’s an advantage to muting the logging information at compile time, because at aggregate, debugging statements do have a meaningful performance penalty, and we can increase the speed of our boot by making them empty statements.

So it’s really do_printk. It comes from src/console/printk.c. I won’t post the full contents, but I’ll explain briefly what it does:

  1. checks LEVEL with the configured log level, muting statements if they’re more verbose than what is requested
  2. when configured as such, in pre-ram settings, mute statements not originating from the bootstrap processor.
  3. Temporarily stop function tracing (a configuration option to watch functions and their callsites)
  4. temporarily lock the console resources to the a single thread, ensuring writes do not overlap.
  5. deal with varargs loading into its type via va_start().
    (notice the 'args...' in the signature? That allows printk to take
    a flexible amount of variables. It places n arguments onto the stack
    and so that they can be popped off later. Functions interact with
    variable arguments via va_arg().)
  6. calls vtxprintf with wrap_putchar, the format string, the variable argument list, and NULL. This interpolates the varargs into the format string and pushes it to the outputs.
  7. clean up variable arguments with va_end()
  8. flush the tx (not all output methods require it, but it’s harmless in those cases)
  9. Unlock the console resources
  10. re-enable tracing
  11. return the number of bytes of the output.

One cool thing to take away from this is how we abstract the factor of where console output is going from do_printk. wrap_putchar is another (aptly named) wrapper for do_putchar. do_putchar calls src/console/console.c‘s console_tx_byte(), which contains a bunch of functions like '__uart_tx_byte()' or '__usb_tx_byte()' or '__cbmemc_tx_byte()' which might be defined or left empty via preprocessor.

This allows us to not have to consider their differences or implementation in do_printk(), just in configuration and in their own implementations of init(), tx_byte(), and tx_flush()

The entirety of that is interesting, which is why I wanted to talk about it, but it’s not what I’m touching. Now we know how logging works, we know where we can influence it in do_printk(), and we’ve seen that it’s influenced elsewhere in preprocessor, too.

Let’s go back to the proprocessor 'DEBUG()' in src/lib/cbfs.c

#define DEBUG(x...) printk(BIOS_SPEW, "CBFS: " x)
#define DEBUG(x...)

If loglevels control verbosity, why are there additional CONFIG_DEBUG_CBFS variables? The gist of it is that mapping all statements to a level of 1-8 really isn’t granular enough for coreboot anymore. Sometimes you want to see most everything, but you’re not working on changes for CBFS and don’t need to see it dump its tables. Other times you’re working on a bug in CBFS and need all the information you can get, so you set the variable and get more output.

CONFIG_DEBUG_CBFS is not the only variable like this, nor should it be. There’s many different parts of coreboot, some are relevant all the time in your console output, some are rarely relevant, some are relevant when you need them to be, and if we can get more granularity without it being at the expense of performance or of making a complex mess, we should do it.

But there’s more than just CBFS and SMM and the things already defined with CONFIG_DEBUG_* variables. What that means is this will require quite a bit of grokking the source of coreboot, and classifying things. (Can we do this in a programmatic way?)

The first way I tackled this problem was to think about loglevels. What did they mean? The descriptions we had in the header were the same as those in the linux kernel:

#define KERN_EMERG "<0>" /* system is unusable*/
#define KERN_ALERT "<1>" /* action must be taken immediately*/
#define KERN_CRIT "<2>" /* critical conditions*/
#define KERN_ERR "<3>" /* error conditions*/
#define KERN_WARNING "<4>" /* warning conditions*/
#define KERN_NOTICE "<5>" /* normal but significant condition*/
#define KERN_INFO "<6>" /* informational*/
#define KERN_DEBUG "<7>" /* debug-level messages*/

I think this is poorly defined, particularly for coreboot. What’s the difference between a “normal but significant condition” and an “informational” message? What’s critical, what’s an error? In the context of the linux kernel these are clearly distinct but coreboot is do-or-die in a lot of cases. It stands to reason that a kernel can take many more steps than coreboot to recover.

I made a contribution to the header to try to define these in a more clear way. I didn’t get a whole lot of feedback, but it made it in. If you’re reading this and you didn’t see it, please look it over. If you think anything is wrong here, let’s fix it. I felt pretty confident that this mapped well to their current usage within coreboot.


But that’s really none of the battle. Shockingly, the granularity problem still exists even though I put some comments in a file!

My first thought was, “Well let’s just make a bunch of subsystems, each with their own loglevel 1-8. We can have a CBFS subsystem and an SMM subsystem and a POST code subsystem and so on.”

Let’s explore this idea now. I didn’t. I jumped in and implemented it and before thinking and realized some problems with this idea, quickly putting me back to square one.

When would a POST code qualify as BIOS_SPEW and when would it be BIOS_CRIT? It’s a laughable question. Some of these things are more fitting for toggle rather than a varying level of verbosity, and the current system for doing so is genuinely the best way to make decisions about printing such information… (Though, that doesn’t mean it’s implemented in the best way.)

Meanwhile CBFS probably does have enough of a variation that we can give it its own loglevels. But does it need to be mapped over 8 values? Maybe 4 would be more fitting.

So we want POST mapped to two values. We want CBFS mapped to 4. We want X mapped to N. How can we make a framework for this without creating a mess of Kconfig? How can we make sure that developers know what values they can use?

How are we going to do this in such a way that we don’t mess with preexisting systems? People can set their debug level in CMOS, how is this going to play into that system, where space is limited? We can’t afford to put unlimited configuration values into there. How are we going to do this in a way that extends printk, rather than re-implements it, without breaking everything that exists now? Remember above when I wondered about if other places do preprocessor definitions of printk like 'DEBUG()'? The answer is yes. Nearly everywhere.

It’s a problem of comparing of compile-time configuration against runtime configuration, of weighing preprocessor against implementing things in code, and of weighing simplicity against ease.

So in summation:

  • We need more granularity in logging. Right now we’re using putting the preprocessor to work to accomplish that, and while I first thought this was a bad practice, I came to feel feel that it actually isn’t so bad.
  • Outside of the additional variables to accomplish extra verbosity, there is little organization of printk statements besides by their log
    level. If log messages were properly tagged, logs would be much cleaner and easier to parse. This needs to be accomplished as well.
  • Logging works through a interface called do_printk(), which is separated out from the lower levels of printing, giving us a lot of freedom to work.
  • We must stay conscious of problems that we can introduce through creating too much new configuration.

Check in tomorrow for how I chose to solve this.