[GSoC-2014] [cbfs_media] Stage 2

As per the plan, we set out to investigate the decompression algorithm that is being employed. Mid-way through, we found something interesting that appealed to us. That was the payload loading process. In our existing architecture, we memory map the entire payload. The selfload()’s current API assumes the payload has already been memory mapped. That’s the bad assumption that needs to change. Even if we investigate and resolve the decompression algorithm, and get a pipelined architecture relying on smaller buffer size, still this mapping will cost resources. Hence this needs to be rectified before we go on the the decompression thingy. So we decided this is what we will be targeting. 

I then spent some time trying to break the payload loading and see where we can put some map() saving efforts. Below shows some details of the process:
1. First we locate the payload. Here the process for stage loading happens where we have the following:
default_media->open()
Reading done. size = 24 bytes
Load entry 0x14440 file name (32 bytes)…
Mapping size is equal to 32
Found file:offset = 0x14478, len=95716
CBFS: Found file.
default_media->map(0x14478, 0x1761c)
Mapping size is equal to 95804
CBFS: located payload @ 7ec14298, 95716 bytes.
Thus we map the all the segments with this one big mapping.
2. Loading procedure begins by building the segment list (build_self_segment_list() does this) Here w.r.t. payload_segment_types, we check for proper destination address, file_size etc. After checking for the segment; we do simple aligning by pointing the prev and next pointers appropriately; to reach to the further places where to place the segment; and so on.
3. In load_self_segments() we run a simple for loop covering all segments. The loading that happens is
(i) A PAYLOAD_SEGMENT_CODE
–> Loading segment from rom address 0x7ec14298
code (compression=1)
New segment dstaddr 0x4a000000 memsize 0x39929 srcaddr 0x7ec142d0
filesize 0x175ac
(cleaned up) New segment addr 0x4a000000 size 0x39929 offset 0x7ec142d0 filesize 0x175ac
(ii) Next a PAYLOAD_SEGMENT_ENTRY
–> Loading segment from rom address 0x7ec142b4
Entry Point 0x4a000000
(iii) After this we come to load_self_segments()
First a bounce buffer is created:  Bounce Buffer at 7ffcf000, 186192 bytes.
We have one segment that is worked upon; which is compressed hence ulzma(src , dest) reads it.
–>Loading Segment: addr: 0x000000004a000000 memsz: 0x0000000000039929 filesz: 0x00000000000175ac
Post relocation: addr: 0x000000004a000000 memsz: 0x0000000000039929 filesz: 0x00000000000175ac
using LZMA
After that one segment that we see on the logs it says
–> Loaded segments
Hence process complete. In essence we had only 3 segments.
Currently, I am working on a strategy on deciding how to modify the architecture of the API so as to conserve as much sram memory consumption as possible.

[GSoC 2014][cbfs_media] Stage 1 : Mission Accomplished

Firstly, sorry for the delay in posting update on the work. I had been busy getting the design to code and wanted to post after its successful completion.

As I had talked about in the previous post, we did a detailed analysis on the existing read() and map() calls. The original log; with all the extra gibberish removed can be seen here. The first design modification that was done was to remove the mapping done for getting cbfs_header. These were the  0x20 size mappings we see in the log. These were unnecessary and could be done away with. And we did! 😛 This log shows the first optimized build; Stage 1 -> Part 1 ->done.

Now we moved on to the more complex and colossal mappings. A function cbfs_find_file() was created, which returned the absolute data_offset of the file based on the name and type we ask for. Once we have the whereabouts of the file; modifications were made in cbfs_load_stage() to appropriately read() and/or map() various files.

The files are arranged as  -> [  cbfs_file  ] [  cbfs_stage  ] [  data  ] <Thanks Aaron for this visualization >

cbfs_find_file() : worked with the cbfs_file to get details about the whereabouts of the file

cbfs_load_stage() : we first read fundamental information about the stage; and then do corresponding map() or read()

Voila!! Stage 1 Complete! 😀

Now, the major issue we have persisting is that the decompression of file data assumes memory mapped access to its contents, and hence is quite inefficient due the that ‘one’ large buffer. SO  this is what we tackle next, to be more precise, have a pipelined decompression strategy which would eliminate the need for one large data buffer.

Its getting fascinating to work on the project by the day! Until the next post, signing off.

P.S.  Thanks Aaron for helping out with any and every issue I face, and always finding the time to reply, even on sundays! 😀

GSoC 2014 [cbfs_media] Updates

This past week went into looking at the internal working of the cbfs_media interface. Some of the major observations were:

Locations of map() and read() calls
No read() calls at all. Also for the map() calls that were made, there weren’t any unmap() calls.

Size of mappings
The entire cbfs is pulled into the iram. There is a map call which puts about 28KB into the sram, to load romstage. The a10 has an sram of 32KB, hence we are using up most of the necessary ram.
The sequence followed is open() -> map()’s -> close().

Total Resources
Is just the sum of all the mappings, since there are no unmaps to subtract. This gives a benchmark to work upon. Now resource utilization is calculated each time coreboot loads, automatically and progress can tracked.
Now we are giving some thoughts on how to reduce the size of the mappings, one possibility being defining a limit (bound) on its size. What is happening currently, is the size is determined dynamically and hence some mappings are quite large. If we define a bound on it, and then repeat call ‘smaller’ map()s instead of one big one, that could do the job. But this wont always work as the decompression algorithm (LZMA) expects memory-mapped access to the entire compressed buffer. By the end of this week, we hope to strike a workaround this and get a more resource-efficient cbfs interface.

GSoC (board-)status update

Two weeks went past fast. Our industry partners have been kind and I  received some new boards  for coreboot testing.  I think I spent three  workdays on just the initial hardware setups and SPI recovery tools for the boards.

You should already see a posted board status report for Gizmosphere/Gizmo built from coreboot.org toolchain and source tree.  I have Adlink/CoreModule2-GF (aka LiPPERT/Frontrunner-AF) and PCEngines/apu1c4 next on my list to report. Latter board also requires it coreboot sources to be rebased and published.

These are all Agesa Family14 boards, more or less intended for embedded application use. When applicable, I will try to do some demonstration of the capabilities of Sage EDK and SmartProbe as those seem to be somewhat standard commercially available toolkit for AMD Embedded Partners.  It is clear we cannot beat a JTAG based in-circuit debugger with an alternative running over USB, but we may get surprisingly close by improving the command set of the GDB stub we run in ramstage.

Most of my time is still on solving some early memory space mapping issues for AMD platforms. As expected, reviews and test reports there are progressing slowly. My goal with these is to have CBMEM and USB consoles work in romstage too and additionally some boards currently lack S3 resume support and/or have PCI-e reset problems. I expect most of my development to work unmodified for fam15, fam15tn and fam16kb.

As a sidedish there has been more than the typical amount of review work for me on the table. Edward has done an amazing job sorting out the superio spaghetti romcc left us with. There was also a fair amount of discussion around IRQ routing and new Intel FSP ports to look at.

cbfs_media [Week 1]

This post covers the complete set-up and building coreboot for cubieboard. Due to lack of documentation for this, I had to spend sometime figuring out the details; hence decided to write it myself to help others in the future.

  1. Step 1: Build Payload

As mentioned here, what we have to do first is to build a payload to use later for coreboot.rom. A suitable ARM payload is the sunxi/uboot. Now, there are two ways to build uboot: natively or from another system. To build from another system, we need to get a suitable toolchain. For that you need to do:

apt-get install gcc-arm-linux-gnueabihf

Too many issues are faced to get this toolchain set up right 😐  A more suitable and convenient method is to build uboot natively from the cubieboard itself (thanks #cubieboard for the tip :P).  For this follow the instructions here. tl;dr Clone repository, choose you target board, make (without CROSS_COMPILE). This completes building the payload.  NOTE: The correct file to use as the payload is “u-boot”, not u-boot.bin. The “u-boot” file is the non-SPL part of uboot in elf format. The log for successful build can be seen here.

file u-boot
u-boot: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked (uses shared libs), not stripped

2.  Step 2: Build coreboot

Before following the instructions on the coreboot/Build_HOWTO, you first need the latest development code; which contains the mmc driver needed to load romstage, etc.

You need to make crossgcc first. This might take a lot of time: be patient 😛 Some missing toolchain errors can arise. Get past them by:

apt-get install bison flex patch
add-apt-repository ppa:linaro-maintainers/toolchain

Once this is done, set your suitable configuration in make menuconfig. Make sure to disable CONFIG_VGA_ROM_RUN (set by default), since it doesnt work for ARM boards. Just make and wait. Your coreboot.rom is ready. :)

This image needs to be placed on the SD card:

dd if=build/BOOT0 of=/path/to/sdcard/blockdev bs=1024 seek=8

Now pfff! this is enough to get coreboot up and running! 😀

For the next week; our plan is to Identify locations of the map() and read() calls; and to determine size of each map(). The driver is currently configured to pull the entire cbfs into ram; so we work to reduce size of these mappings. 

GSoC [early debugging] There is so much source

Not much new or fancy development yet. A time-consuming dive into sources of AMD platforms took place during the last two weeks as I need to fully understand what is going on with the memory management there. We need to find solution for CAR migration on these platforms, and while at it, let’s evolve these boards too for DYNAMIC_CBMEM.

Originally I had scheduled improvements for these platforms for July, but coordination of making changes to vendorcode might become a bottleneck, so I thought I better start early. In this process fifteen (15) or so copies of heap management sources of AGESA boards got combined into one. As a bonus, some family14 and family15 boards are now a bit closer for having S3 suspend/resume support.

I am expecting some more gear for coreboot development arrive later this week, I should then be fully equipped for my GSoC tasks.

GSoC [early debugging] The very short introduction

Oh dear, what did I get into again. My GSoC 2014 project page  gives you an idea of things to expect during this summer of code and seems like I have promised to deliver a lot this time. More like a complete in-circuit-debugging solution of x86 boot firmware over some readily available and low-cost USB hardware. I only scratched the surface with my GSoC 2013 when I did not get much further than a working usbdebug and some intense clean-up and preparation on the CBMEM side.

There has been serious use of usbdebug combined with SerialICE to troubleshoot and/or reverse-engineer proprietary firmwares. Tests have been done to connect GDB stub built into coreboot over BeagleBone even before debug target has initialized RAM.  Also other pieces of my project plan have already seen proof-of-concepts but the quality or the flexibility have not reached the requirements to see them in widespread use in coreboot community.

We can see more practical uses for SerialICE if we could connect QEMU, GDB, SerialIce and radare together, and visualize some of the system bus topologies at runtime. With the amount of support CPU and chipset vendors have shown towards open-source firmware development the last years, I consider this as a key part for any further community-driven mainboard ports on coreboot.

GSoC 2014 Projects

Congratulations coreboot’s GSoC students.  coreboot has three students working on coreboot for GSoC 2014 .

 

Title: Enhance early coreboot debugging

Student: Kyösti Mälkki

Mentors: Martin Roth and Rudolf Marek

 

Title: Generic Interface using alternate CBFS access patterns for ARM SoCs

Student: Naman Govil

Mentors Aaron Durbin and Marc Jones

 

Title: The yearly flashrom maintenance and enhancement proposal

Student: Stefan Tauner

Mentors Carl-Daniel Hailfinger and David Hendricks

 

Each student is required to post progress to this blog.  We expect first posts later this week. Stay tuned for progress!

coreboot GSoC 2014

We are excited to announce that coreboot has been selected for Google Summer of Code 2014.  As with previous years, coreboot will also support flashrom and SerialICE projects.

All the information you need is on the coreboot GSOC page and the following project ideas pages:

Students
We are in the Get Familiar period before the Application Period opens. Join the mailing list and IRC #coreboot. Start to get familiar with coreboot, our development process, and with the coreboot community. Discuss project ideas and scope with coreboot developers.
Student Application Period: March 10 – March 21

Mentors
To be a mentor, you must be a coreboot contributor in good standing (review and commit rights). If you would like to be a member, please add your name to the coreboot GSoC mentors section and signup and connect with coreboot in melange.

Feel free to contact me (IRC:marcj) if you have any questions.

Pencils down

This is it. The beginning for QiProg has ended. It has been a long and tedious journey to equip the Stellaris Launchpad for Low Pin Count mastering. The hardware is built, and it works. The software is there, although its contribution to the ecosystem is somewhat minimal — it is more of a bridge than a road in itself. The real value lies in the firmware. To the best of my knowledge, LPC bus mastering has not been implemented in a microcontroller without using ASICs or FPGAs. The vessels are here, and now it is time for the explorers to start their journeys.

Why do I talk like this is project is not over? It is not over. Although I have attained the minimal goals, there is a whole new world to explore. Integration in flashrom was one of the targets I had really hoped for, but was unable to complete. Is writing of the chip working? Yes. Is it reliable? Writing and verifying predetermined patterns works. Disturbingly, when I plugged in output from /dev/urand, the write did not verify. Some bits were simply not programmed, although the majority of the chip matched. Is it the LPC bus mastering that is at fault, is it the command sequence, or are we not giving the chip enough time to complete the operation? Will chips other than the SST 49LF080A work? Can we write a complete ROM image and boot a machine? This is the exploration phase: we have built it, now let’s make the most of it.

Time has been tight for the last month. I had forgotten to plan for the start of the fall semester, and my time since this event has been severely crippled. I wish I could have achieved more. I took the last two days off from the University to organize the last few weeks’ salad of stashed and uncommitted changes into readable patches.

I don’t feel sad, just tired. Exhausted, I have nothing left to keep me entertained but an Alec Bradley Presando cigar, which I have been saving for a special occasion. This is that occasion: QiProg has stopped being a commitment, and became a hobby, a child, something to care for. Now is the time to build your VultureProg, get the software, and start reporting those issues you know you will encounter. I’m in Houston, and I don’t have a problem.