[GSoC] Better RISC-V Support, week #10

This past week, I worked on the virtual memory initialization code of coreboot on RISC-V. The first part of this was to update encoding.h a file that defines constants such as bit masks which are necessary for interacting with RISC-V’s Control and Status Registers. As a result, I also had to change a few files that relied on outdated constants. Then I wrote some code to walk the page table structures, and fixed one or two bugs in the page table setup code. Unfortunately my patches aren’t as finished as I would like them to be.

When I tested my changes with a Linux payload (which I had to edit the ELF headers of, because Linux uses virtual addresses, while cbfstool and the SELF loader use physical addresses), I stumbled upon another strange error: I get a store access fault in the instruction of the trap handler that saves the first register, even if I initialize the stack pointer to a value that’s within the RAM. When the trap handler runs again to handle this store access fault, it runs without any problems. This fault is especially confusing, because machine mode should always be able to access RAM through its physical addresses.

What’s next?

During the next week, I’ll be traveling and won’t be able to work on coreboot. When I return, I will rework my patches so they can be merged, and hopefully understand the aforementioned access fault problem. Properly set-up page tables should bring me a step closer to running Linux on coreboot/RISC-V (without bbl in the middle).

If time permits, I will start porting coreboot to the Nexys 4 DDR devboard.

[GSOC] Panic Room, week #7

What have you procrastinated on this past week?

As usual, I’m glad you asked.

During the past week I worked every day(/night) on the porting of the region_device API (commit) as I probably mentioned last week. The port is now complete and it should be bug-free (?).
It’s been tested pretty thoroughly since I immediately used it to adapt Bayou from the old LAR/coreboot-v3 paradigm to the new CBFS/coreboot-v4 one.
A bunch of bugs cropped up with the new API during the transition (debugging the API wasn’t fun at all) but they have been solved and unless for some corner-cases that I missed it should just work.

The second part of the week, as you might have guessed read, has been devoted to fixing Bayou (commit). This has been quite fun, almost everything was broken or relied on some old function that has since been dropped from libpayload.
It now works quite well and supports all of the original features.
The only problem now are with the other payloads since they aren’t all able to exit cleanly and give back control to bayou (i.e. coreinfo reboots before returning, tint halts, etc).
If you want to know more details take a look at the commit message.

Apart from this two items I wrote a couple of patches concerned with libpayload and a patch that adds support for sequential lz4 decompression during the boot process (Thank you, Julius Werner, for pointing that out).
I also just finished updating the wiki entry for Bayou, so if someone wants to test it out… 🙂

What’s next?

With less than a month left of my GSOC time I plan to dedicate solely to maintain the patches that I currently have and work on the coreboot/serialice integration (so as to finally do something pertinent to the original proposal).

Have a nice weekend!

 

P.S.

I had to drop the idea of working on the FILO (flashupdate) project since it relies on a series of patches that have not been finished yet (libflashrom). It will be a project for another time perhaps.

P.S.S.

I think I lost count of a few weeks here and there…

[GSoC] Better RISC-V support, week #6/7/8/9

Since I haven’t posted an update of my coreboot-on-RISC-V work in a while, this will be a slightly longer post.

Week 6

In week 6, I started documenting how to build and boot coreboot on RISC-V, in the coreboot wiki.
It is now a bit outdated, because we’re moving away from using bbl to boot Linux.

Week 7

I wrote some patches: I removed code that used the old Host-Target Interface (HTIF), because it’s deprecated. I submitted an improved version of my workaround for the bug that causes Spike to only execute 5000 instructions in some cases. I informed the coreboot resource management subsystem about the position of the RAM in physical address space, so that the program loader wouldn’t refuse to load segments into RAM. I submitted two patches to fix compiler errors with the new toolchain.

Meanwhile, there were some good news in the RISC-V world:

Week 8

I submitted a few more patches and started to explore the Nexys4 board. The precompiled bitstream and kernel from the lowRISC version 0.3 tutorial worked without any problems, and after a few days and some help from the lowRISC mailing list, I was able to recompile the lowRISC bitstream.

This week

I discussed the choice of boot medium with the lowRISC developers, and they agreed that a memory-mapped flash would be useful. Once it is implemented, I
can start porting coreboot to lowRISC on the Nexys 4 DDR board. Luckily the Nexys4 already has large enough flash to use for this purpose.

My mentors and I agreed that the switch from machine mode to supervisor mode should be left completely to the payload.

What’s next?

I will continue to work on running coreboot on the Spike emulator. Currently I’m facing the following problems and tasks:

  • Linux, when compiled to an ELF file (vmlinux) specifies that it wants to be loaded at the physical address 0x0 and at the virtual address 0xffffffff80000000. Since coreboot’s ELF loading code only looks at the physical address, it refuses to load Linux, since RAM starts at 0x80000000 on RISC-V.
  • Low level platform information (most importantly the memory layout) is passed to the firmware (coreboot in this case) as a configuration string, which is dynamically generated by the emulator, in the case of Spike. I still need to implemented a parser for this format, so coreboot can know how much memory is available.
  • The RISC-V Privileged Architecture Specification 1.9 specifies that there shall be a page at the top of the virtual address space where the operating system can call a few functions exposed by the firmware (this is the Supervisor Binary Interface).

[GSOC] Panic Room, week #4/5/6

What have you been up to these past few weeks?

My facetious alter ego, I must admit I sinned, I strayed oh so much from my original timeline. (Quite obvious if you read the previous posts)

Since my last post I mainly kept focusing on the region_device API, I attempted to go through with my week #3 plan to continue the phasing out of rdev_mmap_full() from the selfload code (described in previous post).

I think I overestimated what I could accomplish regarding that project, I tried to modify the current lzma API used inside to coreboot to make it possible to easily load each chunk of data from memory/chip and decompress it gradually.
Theoretically it shouldn’t be too difficult, in practice the decompression code is particularly nasty and it seems it is a customised version of the LZMA SDK code.
I, not so quickly, realised that I didn’t have the time to delve into that, so I momentarily dropped the idea. (It could be a nice project to do after the GSOC ends.)

I also spent a chunk of my time in porting the time stamps reading functionality from the cbmem utility into the coreinfo payload (commit).
It is now possible to read how much time the coreboot boot sequence takes without having a working OS environment.
I needed to measure if there had been any boot time improvements after the rdev_mmap_full commit. (Still haven’t got around to doing that though…)

What are you working on now then?

I’m currently working on porting the region_device API from coreboot to libpayload, meanwhile replacing all the functionality that was provided by its predecessor: cbfs_media.

Today I finished (in before my code has to be reworked completely ) replacing the old api inside libpayload and started converting the only payload that used cbfs_media: depthcharge, the payload used to boot ChromeOS.
Hopefully it will be a painless process.

What’s next?

So, after I finish these current patch (possibly for the weekend), I need to re-focus my attention on my actual timeline.
I plan to test the current FILO master branch to check for possible problems and show-stoppers before an eventual new release in the (near?) future.
It would be quite helpful in that the current stable is not stable at all, it doesn’t even compile and neither does the master tag (unless you apply this patch which is still pending approval).
It would also be the first release that includes the new flashupdate command and would allow for some interesting things to be done (i.e. path to the recovery from a bad coreboot flash/configuration).

Furthermore I would like to get the SerialICE firmware-side code to be merged into the main coreboot codebase (commit), so I plan to work out the current problems with the time that I still have at my disposal. (Uh, sounds a bit cliche’ by now)

See you next week time! (Just to be on the safe side 🙂 )

[GSoC] Better RISC-V support, week #4/5

Week 4

In week 4, I tracked down why coreboot halted after about one line of output. It turned out to be a spike bug, that I wrote up in this bug report, and affect any program that doesn’t have a tohost symbol. As a workaround, I extended my script that turns coreboot.rom into a ELF to also include this symbol.

After some more patches I could run coreboot in spike and get the familiar “Payload not loaded” line.

Week 5

I was now clearly moving towards being able to run linux on spike/coreboot. But there was a problem: The RISC-V linux port requires a working implementation of the Supervisor Binary Interface (SBI), which is a collection of functions that the supervisor (i.e. the linux kernel) can call in the system firmware.

Coreboot has an implementation of the SBI, but it’s probably outdated by now. To get an up-to-date SBI implementation, I decided to use bbl as a payload. When I built bbl with coreboot’s RISC-V toolchain, I noticed that it depends on a libc being installed in two ways:

  • The autoconf-generated configure script checks that the C compiler can compile and link a program, which only succeeds if it finds a linker script (riscv.ld) and a crt0.o in the right place.
  • bbl relies on the libc headers to declare some common functions and types (it doesn’t use any of the implementations in the libc, though).

The coreboot toolchain script doesn’t, however, install a libc, because coreboot doesn’t need one.

I tweaked the bbl source code until it didn’t need the libc headers, changed the implementation of mcall_console_putchar to use my 8250 UART, got the payload section of bbl (where linux is stored before it’s loaded) out of the way of the CBFS by moving it to 0x81000000 (bbl/bbl.lds is the relevant file for this change), and could finally observe Linux booting in spike, on top of coreboot and bbl. It stops with a kernel panic, though, because it doesn’t have a root filesystem.

Plans for this week

This week I will document my work on the Spike wiki page in the coreboot wiki, so others can run coreboot on spike, too.

[GSoC] Multiple status registers, block protection and OTP support, week #3 and #4

The most important accomplishment during these weeks was developing a generic algorithm which would generate the block protect range table for a given flash chip. This algorithm can be applied to a majority of chips.

The previous solution that I had developed was to have range tables stored in a struct and a pointer to it would be in struct flashchips for chips that share the range definition. This array of struct range would be indexed by block protect (BP) bitfield, so essesntially we could get the protected range for a chip given the bitfield in O(1) time. While it looked neat, it was only marginally an improvement over existing solution in ChromiumOS.

I really wanted to try to have a more elegant and robust solution. So after getting back some feedback from my mentor David, I decided to investigated further. I started with around 80 datasheets from AMIC, Eon, GigaDevice, Macronix, Winbond and others with objective to find whether a generic pattern could be observed such that given a chip and its properties can we compute its BP ranges. After a few attempts, I settled on a model that fits all GigaDevice and Winbond chips, and to some AMIC and Eon chips, which together account for around 50% of the chips I investigated. Having the layout of the status register as a member of struct flashchip came in very handy. For the algorithm itself and its demo, please have a look at this mail in the discussion thread on the mailing list. Once this was in place, I went through another iteration to allow the code to automatically handle presence (or absence) of SEC/TB/CMP bits.

One could wonder that since this model generates range tables automatically for only about half of the chips, what about the remaining ones? As I like to put it slightly dramatically, the “grand” design is very flexible. The struct wp has function pointers that can be assigned chip specific functions, but if they are unassigned then the code falls back to using the generic functions. Quite a few chips that are not included in the 50% collection above, have ranges that can be computed with slight modification after the generic pattern has been applied, allowing maximum re-use of code. This idea is also part of the mail and demo linked above.

The investigation and subsequent pattern deduction took a fair amount of effort. Although I was very happy with the results, nonetheless they were at a slight tangent to what I had planned. The last week had not been as productive as I would have it. So as it stands, I am slightly amiss from the pre-midterm timeline I had proposed. However, I have taken corrective measures, and will ensure that post-midterm timeline is not affected.

Currently I am finalising the OTP support model. The functionality for reading/writing multiple status registers and block protection are in place. I am also simultaneously adding support for new infrastructure to existing chips. Later this week I will work on CLI to expose this functionality and wire up existing flashrom code to make use of the new infrastructure. Following that, I will be testing the new infrastructure on a few GigaDevice chips that I have. I have also ordered some chips from a few other manufacturers so hopefully they should arrive in time by then.

Thanks for your time. In case you have any feedback, I’d love to hear. See you later!

[GSOC] Panic Room, week #3

What happened during the past week ?

After many iteration of patches and bug hunting I finished writing and testing the code that added to cbfstool allows to convert between SELF and ELF.
The code has been now merged.

One of the most problematic things has been to get GRUB to work after the conversion to ELF whereas all of the other payloads were working wonderfully.
It turned out it is the only payload (that I tested) that used more than two segments to describe the memory image of  the program.
This also uncovered a bug contained inside the elf_writer code that was probably never triggered given that the majority of payloads only contain one segment (commit).

I also received the replacement mobo for my Lenovo X60 target, so I can get back on track with the SerialICE part of the project.

What are your plans for next week?

I am currently investigating a bug in the serial communication between QEMU and the target while using the most recent version of the patch that integrates SerialICE into coreboot.
I am also looking into some work related to selfboot.c and the region api; the objective is to avoid loading the payload all at once while it is being executed and allow for the various parts of the payload to be loaded when needed.
Hopefully I’ll manage to finish all of this before next week. (Sometimes I definitely feel too optimistic)

What?! Didn’t you have any mishaps© this week?

It’s indeed quite fascinating how my equipment keeps breaking… this week was my Raspberry Pi’s turn. It won’t boot anymore.
Fortunately I have a Bus Pirate and a BeagleBone Black to use for SPI flashing, so it’s all good.
EDIT: Scratch that… apparently you just need to wait for a while for it to reset… strange.

See you next week!

[GSoC] Better RISC-V support, week #3

Last week, after updating GCC (by applying Iru Cai’s patch) and commenting out uses of outdated instructions and CSRs (most notably eret and the HTIF CSRs), I noticed that coreboot crashed when it tried to access any global variables. This was because the coreboot build system thought coreboot would live near the start of the address space.

I found spike-riscv/memlayout.ld, and adjusted the starting offset. But then I got a linker error:

build/bootblock/arch/riscv/rom_media.o: In function `boot_device_ro': [...]/src/arch/riscv/rom_media.c:26:(.text.boot_device_ro+0x0): relocation truncated to fit: _RISCV_HI20 against `.LANCHOR0'

I played around with the start address and noticed that addresses below 0x78000000 worked, but if I chose an address that was too close to 0x80000000, it broke. This is, in fact, because pointers to global variables were determined with an instruction sequence like lui a0, 0xNNNNN; addi a0, a0, 0xNNN. On 32-bit RISC-V, the LUI instruction loads its argument into the upper 20 bits of a register, and ADDI adds a 12-bit number. On a 64-bit RISC-V system, lui a0, 0x80000 loads 0xffffffff80000000 into a0, because the number is sign extended.

After disassembling some .o files of coreboot and the RISC-V proxy kernel, I finally noticed that I had to use the -mcmodel=medany compiler option, which makes data accesses pc-relative.

Now that coreboot finally ran and could access its data section, I finished debugging the UART block that I promised last week. Coreboot can now print stuff, but it stops running pretty soon.

Plans for this week

This week I will debug why coreboot hangs, and will hopefully get it to boot until the “Payload not found” line again, which worked with an older version of Spike.

Also, Ron Minnich will be giving a talk about coreboot on RISC-V at the coreboot convention in San Francisco, in a few hours.

[GSOC] Panic Room, week #2

How was your last week?

Let’s say that it was a bit unexpected.

I spent the majority of it trying to wrap my head around the ELF (Executable Linkable Format) specification.
I used this new acquired knowledge to improve the utility cbfstool and allow it to extract payloads contained inside a CBFS directly into ELF instead of SELF (commit).

In order to achieve this cbfstool has to do a few things:

  • Extract the payload from the coreboot image
  • Parse the segment table contained inside the SELF payload in order to find out how many and which segments are present.
  • Using the elf_writer API generate a compliant ELF header
  • Take the content from each segment and copy it to the correspondent ELF section header and configure it accordingly
  • Once the section table is filled out, use elf_writer to generate the program header table and write out the final ELF

The final results would allow to, for example, easily move payloads from a CBFS to another one without having to re-build the payload, coreboot rom or mess with the build system configuration.
Right now the implementation it’s not complete yet but it works quite well with a good chunk of the payloads commonly used with coreboot such as SeaBIOS, coreinfo, nvramcui and others.
The major hurdles right now are to get the GRUB payload to work and add a way to handle the extraction of a compressed payload.

Wait a minute! Weren’t you working on SerialICE?

You are quite the inquisitive type, aren’t you?

Yes, my main goal is still to continue integrating SerialICE and coreboot.
Unfortunately there have been a few showstoppers this week, first my only test clip broke and now my target, Lenovo x60, stopped working and I am no longer able to flash its BIOS chip.
I already ordered a replacement but it’ll probably take a bit more than a week to arrive.

In the meantime my mentor (adurbin) kindly pointed out the task above to keep me busy while waiting.

What are your plans for the next week?

I plan to finish implementing the functionality described above and test all the remaining payloads.
Hopefully I will also be able to start looking at some of the other tasks that have been suggested to me by my mentors.

That’s it for today, see you next week!

[GSoC] Better RISC-V support, week #2

Last week, I updated my copy of spike (to commit 2fe8a17a), and familiarized myself with the differences between the old and the new version:

  • The Host-Target Interface (HTIF) isn’t accessed through the mtohost and mfromhost CSRs anymore. Instead, you have to define two ELF symbols (tohost and fromhost). Usually this is done by declaring two global variables with these names, but since the coreboot build system doesn’t natively produce an ELF file, it would get a little tricky.
  • Spike doesn’t implement a classic UART.
  • The memory layout is different. The default entry point is now at 0x1000, where spike puts a small ROM, which jumps to the start of the emulated RAM, at 0x80000000. One way to run coreboot is to load it at 0x80000000, but then it can’t catch exceptions: The exception vector is at 0x1010.
  • Within spike’s boot ROM, there’s also a text-based “platform tree”, which describes the installed peripherals.

“Why does coreboot need a serial console?”, you may ask. Coreboot uses it to log everything it does (at a configurable level of detail), and that’s quite useful for debugging and development.

Instead of working around the problems with HTIF, I decided to implement a minimal, 8250-compatible UART. I’m not done yet, but the goal is to use coreboot’s existing 8250 driver.

Plans for this week

This week, I will rewrite the bootblock and CBFS code to work with RISC-V’s new memory layout, and make sure that the spike UART works with coreboot’s 8250 UART driver. Booting Linux probably still takes some time.