[GSoC-2014] [cbfs_media] Stage 2

As per the plan, we set out to investigate the decompression algorithm currently in use. Midway through, something else caught our attention: the payload loading process. In the existing architecture, we memory-map the entire payload, and the current API of selfload() assumes the payload has already been memory-mapped. That is the assumption that needs to change. Even if we rework the decompression algorithm into a pipelined design that relies on a smaller buffer, this one mapping would still cost resources. So it needs to be fixed before we move on to decompression, and we decided to target it first.

I then spent some time breaking down the payload loading process to see where we can save on map() calls. The details of the process are as follows:
1. First we locate the payload. The same lookup that is used for stage loading runs here, giving the following:
default_media->open()
Reading done. size = 24 bytes
Load entry 0x14440 file name (32 bytes)…
Mapping size is equal to 32
Found file:offset = 0x14478, len=95716
CBFS: Found file.
default_media->map(0x14478, 0x1761c)
Mapping size is equal to 95804
CBFS: located payload @ 7ec14298, 95716 bytes.
Thus we map all the segments with this one big mapping.
2. The loading procedure begins by building the segment list (build_self_segment_list() does this). Here, for each of the payload_segment_types, we check for a proper destination address, file_size, etc. After checking a segment, we link it into the list by setting its prev and next pointers appropriately, then move on to where the next segment is to be placed, and so on.
3. In load_self_segments() we run a simple for loop covering all segments. The loading that happens is
(i) A PAYLOAD_SEGMENT_CODE
–> Loading segment from rom address 0x7ec14298
code (compression=1)
New segment dstaddr 0x4a000000 memsize 0x39929 srcaddr 0x7ec142d0
filesize 0x175ac
(cleaned up) New segment addr 0x4a000000 size 0x39929 offset 0x7ec142d0 filesize 0x175ac
(ii) Next a PAYLOAD_SEGMENT_ENTRY
–> Loading segment from rom address 0x7ec142b4
Entry Point 0x4a000000
(iii) After this we come to load_self_segments()
First a bounce buffer is created: Bounce Buffer at 7ffcf000, 186192 bytes.
One segment is worked upon; it is compressed, so ulzma(src, dest) decompresses it.
–>Loading Segment: addr: 0x000000004a000000 memsz: 0x0000000000039929 filesz: 0x00000000000175ac
Post relocation: addr: 0x000000004a000000 memsz: 0x0000000000039929 filesz: 0x00000000000175ac
using LZMA
After that one segment, the log says
–> Loaded segments
Hence the process is complete. In essence we had only 3 segments.
Currently, I am working out how to modify the architecture of the API so that it uses as little SRAM as possible.
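
One possible direction, sketched here under assumptions and not taken from the coreboot tree: read the segment headers one at a time with small read() calls instead of relying on the big mapping. The names fake_media, media_read and walk_segments are hypothetical, and the 28-byte big-endian header layout and the two type constants are assumptions about the payload format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed header layout: type, compression, offset (32-bit each),
 * load_addr (64-bit), len, mem_len (32-bit each), all big-endian. */
#define SEG_HDR_SIZE   28u
#define SEG_TYPE_CODE  0x45444F43u /* assumed constant */
#define SEG_TYPE_ENTRY 0x52544E45u /* assumed constant */

struct seg_hdr {
	uint32_t type, compression, offset;
	uint64_t load_addr;
	uint32_t len, mem_len;
};

/* Hypothetical stand-in for a cbfs_media backend: a flat byte array. */
struct fake_media {
	const uint8_t *base;
	size_t size;
};

/* Copy n bytes at offset off into dst; returns n, or 0 if out of range. */
static size_t media_read(const struct fake_media *m, void *dst,
			 size_t off, size_t n)
{
	if (off > m->size || n > m->size - off)
		return 0;
	memcpy(dst, m->base + off, n);
	return n;
}

static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read one header at a time until the ENTRY segment is seen.
 * Returns the number of headers parsed, or -1 on error. */
int walk_segments(const struct fake_media *m, size_t payload_off,
		  struct seg_hdr *out, int max)
{
	uint8_t raw[SEG_HDR_SIZE]; /* the only buffer needed for the walk */
	int i;

	assert(m != NULL && out != NULL);
	for (i = 0; i < max; i++) {
		size_t off = payload_off + (size_t)i * SEG_HDR_SIZE;

		if (media_read(m, raw, off, SEG_HDR_SIZE) != SEG_HDR_SIZE)
			return -1;
		out[i].type = be32(raw);
		out[i].compression = be32(raw + 4);
		out[i].offset = be32(raw + 8);
		out[i].load_addr = ((uint64_t)be32(raw + 12) << 32) | be32(raw + 16);
		out[i].len = be32(raw + 20);
		out[i].mem_len = be32(raw + 24);
		if (out[i].type == SEG_TYPE_ENTRY)
			return i + 1;
	}
	return -1; /* no ENTRY segment within max headers */
}
```

In the log above, the second header is loaded from 0x7ec142b4, which is 0x1c = 28 bytes after the first one at 0x7ec14298, so the assumed header size is at least consistent with what we see. With a walk like this, the header pass would need a 28-byte buffer rather than the 95716-byte mapping; the segment data itself would still have to be handled separately.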

[GSoC 2014][cbfs_media] Stage 1 : Mission Accomplished

Firstly, sorry for the delay in posting an update on the work. I had been busy turning the design into code and wanted to post only after it was successfully completed.

As I mentioned in the previous post, we did a detailed analysis of the existing read() and map() calls. The original log, with all the extra gibberish removed, can be seen here. The first design modification was to remove the mappings done to get the cbfs_header. These are the 0x20-size mappings we see in the log. They were unnecessary and could be done away with. And we did! 😛 This log shows the first optimized build: Stage 1 -> Part 1 -> done.

Next we moved on to the larger and more complex mappings. A function cbfs_find_file() was created, which returns the absolute data_offset of the file based on the name and type we ask for. Once we had the whereabouts of the file, modifications were made in cbfs_load_stage() to read() and/or map() each file as appropriate.

The files are arranged as  -> [  cbfs_file  ] [  cbfs_stage  ] [  data  ] <Thanks Aaron for this visualization >

cbfs_find_file() : worked with the cbfs_file to get details about the whereabouts of the file

cbfs_load_stage() : we first read fundamental information about the stage; and then do corresponding map() or read()
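
To make the header step concrete, here is a minimal sketch of how a data offset could be derived from a file header with a single small read(). It is not the actual cbfs_find_file(): the names fake_media, media_read and read_file_header are hypothetical, and the 24-byte layout (8-byte "LARCHIVE" magic followed by big-endian len, type, checksum and offset fields) is an assumption about the cbfs_file format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILE_HDR_SIZE 24u /* assumed size of the fixed cbfs_file header */

struct file_info {
	uint32_t len;       /* size of the data that follows the header */
	uint32_t type;      /* file type as stored in the header */
	size_t data_offset; /* absolute offset of the file contents */
};

/* Hypothetical stand-in for a cbfs_media backend: a flat byte array. */
struct fake_media {
	const uint8_t *base;
	size_t size;
};

/* Copy n bytes at offset off into dst; returns n, or 0 if out of range. */
static size_t media_read(const struct fake_media *m, void *dst,
			 size_t off, size_t n)
{
	if (off > m->size || n > m->size - off)
		return 0;
	memcpy(dst, m->base + off, n);
	return n;
}

static uint32_t be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read the fixed header at entry_off and fill in *out.
 * Returns 0 on success, -1 on a short read or a malformed header. */
int read_file_header(const struct fake_media *m, size_t entry_off,
		     struct file_info *out)
{
	uint8_t raw[FILE_HDR_SIZE];
	uint32_t hdr_offset;

	assert(m != NULL && out != NULL);
	if (media_read(m, raw, entry_off, FILE_HDR_SIZE) != FILE_HDR_SIZE)
		return -1;
	if (memcmp(raw, "LARCHIVE", 8) != 0)
		return -1;
	hdr_offset = be32(raw + 20); /* distance from entry start to data */
	if (hdr_offset < FILE_HDR_SIZE)
		return -1;
	out->len = be32(raw + 8);
	out->type = be32(raw + 12);  /* raw + 16 holds the checksum, unused here */
	out->data_offset = entry_off + hdr_offset;
	return 0;
}
```

A name-based lookup would repeat this step for each entry and compare the file name stored between the header and the data. Once data_offset is known, the caller can decide per file whether a read() into a small buffer or a map() of just the data is the better fit.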

Voila!! Stage 1 Complete! 😀

Now, the major remaining issue is that decompression of file data assumes memory-mapped access to its contents, and hence is quite inefficient due to that ‘one’ large buffer. So this is what we tackle next: to be more precise, a pipelined decompression strategy that would eliminate the need for one large data buffer.
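
To show the shape of what “pipelined” means here, below is a rough sketch that feeds a decoder in small chunks while keeping the decoder state between chunks. It deliberately does not use LZMA: a trivial run-length code (count byte, value byte) stands in for the real algorithm, and fake_media, media_read, rle_feed and decompress_pipelined are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_MAX 64u /* upper bound on the source buffer, chosen arbitrarily */

/* Hypothetical stand-in for a cbfs_media backend: a flat byte array. */
struct fake_media {
	const uint8_t *base;
	size_t size;
};

/* Copy n bytes at offset off into dst; returns n, or 0 if out of range. */
static size_t media_read(const struct fake_media *m, void *dst,
			 size_t off, size_t n)
{
	if (off > m->size || n > m->size - off)
		return 0;
	memcpy(dst, m->base + off, n);
	return n;
}

/* Decoder state that survives across chunk boundaries. */
struct rle_state {
	int have_count; /* 1 if a count byte is pending its value byte */
	uint8_t count;
};

/* Consume one chunk; returns bytes written, or (size_t)-1 on overflow. */
static size_t rle_feed(struct rle_state *s, const uint8_t *in, size_t n,
		       uint8_t *out, size_t out_cap)
{
	size_t i, w = 0;

	for (i = 0; i < n; i++) {
		if (!s->have_count) {
			s->count = in[i];
			s->have_count = 1;
			continue;
		}
		if (s->count > out_cap - w)
			return (size_t)-1;
		memset(out + w, in[i], s->count);
		w += s->count;
		s->have_count = 0;
	}
	return w;
}

/* Decode len source bytes starting at off, reading at most 'chunk' bytes
 * at a time. Returns the decoded size, or -1 on any error. */
long decompress_pipelined(const struct fake_media *m, size_t off, size_t len,
			  uint8_t *dst, size_t dst_cap, size_t chunk)
{
	uint8_t buf[CHUNK_MAX]; /* the only source-side buffer */
	struct rle_state st = { 0, 0 };
	size_t done = 0, written = 0;

	assert(m != NULL && dst != NULL);
	if (chunk == 0 || chunk > CHUNK_MAX)
		return -1;
	while (done < len) {
		size_t n = len - done < chunk ? len - done : chunk;
		size_t w;

		if (media_read(m, buf, off + done, n) != n)
			return -1;
		w = rle_feed(&st, buf, n, dst + written, dst_cap - written);
		if (w == (size_t)-1)
			return -1;
		written += w;
		done += n;
	}
	return st.have_count ? -1 : (long)written; /* pending count: truncated input */
}
```

The point of the sketch is only the structure: the source side never needs more than CHUNK_MAX bytes at once, whatever the size of the compressed file. Whether the LZMA decoder used in coreboot can be driven this way is exactly the open question for the next stage.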

It’s getting more fascinating to work on the project by the day! Until the next post, signing off.

P.S. Thanks Aaron for helping out with any and every issue I face, and always finding the time to reply, even on Sundays! 😀

GSoC 2014 [cbfs_media] Updates

This past week went into looking at the internal working of the cbfs_media interface. Some of the major observations were:

Locations of map() and read() calls
There are no read() calls at all. Also, none of the map() calls that were made had a matching unmap() call.

Size of mappings
The entire cbfs is pulled into the iram. There is a map call which puts about 28KB into the sram, to load romstage. The a10 has an sram of 32KB, hence we are using up most of the available ram.
The sequence followed is open() -> map()’s -> close().

Total Resources
This is just the sum of all the mappings, since there are no unmaps to subtract. This gives a benchmark to work against. Resource utilization is now calculated automatically each time coreboot loads, so progress can be tracked.
We are now giving some thought to how to reduce the size of the mappings, one possibility being to define a limit (bound) on their size. Currently, the size is determined dynamically, and hence some mappings are quite large. If we define a bound and then repeatedly call ‘smaller’ map()s instead of one big one, that could do the job. But this won’t always work, as the decompression algorithm (LZMA) expects memory-mapped access to the entire compressed buffer. By the end of this week, we hope to find a workaround for this and get a more resource-efficient cbfs interface.
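
The bookkeeping and the bounded-window idea can be sketched together as follows. This is only an illustration under assumptions: map_stats, stats_map, stats_unmap and map_in_windows are hypothetical names, and no real map() or unmap() call is made; the comments mark where a backend would do so.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for map()/unmap() traffic. */
struct map_stats {
	size_t current; /* bytes mapped right now */
	size_t peak;    /* largest value 'current' ever reached */
	size_t total;   /* sum of all mapping sizes */
	size_t calls;   /* number of map() calls */
};

static void stats_map(struct map_stats *s, size_t n)
{
	s->current += n;
	s->total += n;
	s->calls++;
	if (s->current > s->peak)
		s->peak = s->current;
}

static void stats_unmap(struct map_stats *s, size_t n)
{
	assert(n <= s->current);
	s->current -= n;
}

/* Cover [off, off + len) with windows of at most 'bound' bytes, mapping
 * and unmapping one window at a time. 'visit' may be NULL.
 * Returns 0 on success, -1 on a zero bound or a failing visitor. */
int map_in_windows(size_t off, size_t len, size_t bound, struct map_stats *s,
		   int (*visit)(size_t off, size_t n, void *arg), void *arg)
{
	size_t done = 0;

	assert(s != NULL);
	if (bound == 0)
		return -1;
	while (done < len) {
		size_t n = len - done < bound ? len - done : bound;
		int rc;

		stats_map(s, n);   /* a real backend would call map() here */
		rc = visit ? visit(off + done, n, arg) : 0;
		stats_unmap(s, n); /* ... and unmap() here */
		if (rc != 0)
			return -1;
		done += n;
	}
	return 0;
}
```

For a mapping of about 28KB, one unbounded window gives a peak of 28672 bytes, while a bound of 4096 gives seven windows and a peak of 4096 bytes. The total stays the same in both cases, so peak usage is the number that the bound actually improves. With no unmaps at all, as in the current code, current and total are identical, which is the “sum of all the mappings” figure described above.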

cbfs_media [Week 1]

This post covers the complete set-up and build of coreboot for the cubieboard. Due to the lack of documentation for this, I had to spend some time figuring out the details, and hence decided to write it up myself to help others in the future.

  1. Step 1: Build Payload

As mentioned here, what we have to do first is to build a payload to use later for coreboot.rom. A suitable ARM payload is the sunxi/uboot. Now, there are two ways to build uboot: natively or from another system. To build from another system, we need to get a suitable toolchain. For that you need to do:

apt-get install gcc-arm-linux-gnueabihf

I faced too many issues getting this toolchain set up right 😐 A more suitable and convenient method is to build uboot natively on the cubieboard itself (thanks #cubieboard for the tip :P). For this follow the instructions here. tl;dr: clone the repository, choose your target board, and make (without CROSS_COMPILE). This completes building the payload. NOTE: The correct file to use as the payload is “u-boot”, not u-boot.bin. The “u-boot” file is the non-SPL part of uboot in ELF format. The log for a successful build can be seen here.

file u-boot
u-boot: ELF 32-bit LSB shared object, ARM, version 1 (SYSV), dynamically linked (uses shared libs), not stripped

2.  Step 2: Build coreboot

Before following the instructions on the coreboot/Build_HOWTO, you first need the latest development code, which contains the mmc driver needed to load romstage, etc.

You need to make crossgcc first. This might take a lot of time: be patient 😛 Some missing toolchain errors can arise. Get past them by:

apt-get install bison flex patch
add-apt-repository ppa:linaro-maintainers/toolchain

Once this is done, set a suitable configuration in make menuconfig. Make sure to disable CONFIG_VGA_ROM_RUN (set by default), since it doesn’t work for ARM boards. Then just run make and wait. Your coreboot.rom is ready. 🙂

This image needs to be placed on the SD card:

dd if=build/BOOT0 of=/path/to/sdcard/blockdev bs=1024 seek=8

Now pfff! this is enough to get coreboot up and running! 😀

For the next week, our plan is to identify the locations of the map() and read() calls, and to determine the size of each map(). The driver is currently configured to pull the entire cbfs into ram, so we will work to reduce the size of these mappings.

[Pre-GSoC] Set up Phase – II

Hi All! This continues the previous blog post regarding set-up. This post contains a brief walk-through of the process of setting up the cubieboard.

The first step is to burn the linux image on the micro SD card, and then use this micro SD card to boot the board. The BerryBoot interface is quite user-friendly and we can follow the instructions on it to successfully complete the process. Details for this can be found here.

The latest set of linux distributions that are available can be seen when we power on the board with the micro SD card inserted. This image depicts the screen as visible:

After completing the installation process (takes about 10 minutes), we have successfully installed Linaro Ubuntu 2012.11.

The Home screen looks like this:

Thus, the set-up process is completed. A schematic for the cubieboard can be found here. Also, a cool video with the whole process can be seen here.

It is 20th May, and now it’s time to get cracking on the summer of code! 😀 My abode for the next few months:

[Photo]

[Pre-GSoC] Set-up Phase-I

Hi! This past week went into gearing up and getting ready for the coding phase to start. I intend to use the community bonding period to set up the necessary hardware and study the 1000 SLOC of the existing cbfs_media interface.

I had ordered my cubieboard, the backbone of this project, last week, to avoid any hassle due to unavailability of hardware when the coding phase actually starts. I received it this week. Along with it, I gathered the other accessories such as micro-SD cards, a card reader, an HDMI cable, etc. Hardware setup: Check! 🙂

The Cubieboard has 4GB of internal memory (NAND Flash), which comes pre-loaded with Android 4.0.4; on power-on we can see it running. For our purpose, we need Linux on it, so the next step is to run Linux on this board. BerryBoot makes this very easy. We use the manual installation from an SD card image, found at that link. A very good article that describes the set-up really well can be found here.

I will write another post next week showing some of the results of this installation.

Also, I am spending some time reading the existing cbfs_media interface, which can be found here. The regions patch by Aaron highlights the direction we will be taking, with the aim of reducing the buffer size for mappings, i.e. the aim of the project, as explained in the previous post.

This Bonding Period =  Spending time on the IRC + Initial Set-up + Slogging through source codes 😛 + Writing blogs (a first for me 😀 )

Moin!: A new beginning

Hi All, I am Naman Govil from India and I will start working with coreboot this summer as a part of Google Summer of Code 2014. I am a junior-year undergrad at the International Institute of Information Technology, Hyderabad (IIIT-H), majoring in Electronics and Communication Engineering. I have been actively dealing with programmable devices ever since the start of my program. I am a hardware designer at heart, and getting a chance to combine hardware with programming makes for an interesting and gripping combo for me. Enter coreboot! 😀

While searching for appropriate orgs to pursue a GSoC project, I stumbled upon coreboot. Open BIOS had a ring to it and so I decided to see the projects done here.  It gripped me instantly. I was pretty confident to pursue my project here and got to work early.

This summer, I will be doing a project on providing a generic interface based on CBFS access patterns for ARM SoCs. The aim is to optimize and enhance the CBFS_media interface used for accessing data, currently in x86 systems, to suit low-end ARM SoCs. An ARM-specific CBFS access pattern would enable coreboot to load its stages efficiently, which will form the basis for establishing full support for coreboot on SoCs, and in turn help bridge the gap between coreboot and ARM mainboards. The chosen ARM board for this project is a cubieboard 1.0, due to its existing support for coreboot.

There will be two main targets for improving CBFS access. The first is to reduce the size of the buffers used to read data, thus reducing wastage of ram. The second, and more important, concerns decompression: the current API for uncompressing a file to a location requires the entire compressed source to be in memory, so the objective is to bring in a pipelined decompression strategy.

The last part of the project will be to manage the resources being used, i.e. avoiding any memory leaks while executing commands.

Project Deliverables would include:

  • A CBFS access mechanism (for when the underlying medium is not memory mapped, unlike for x86 systems), which will allow ARM-based SoCs, for example: the Cubieboard, BeagleBoard, etc to boot efficiently.
  • The API will be tested with a back-end, like the MultiMedia Card, to demonstrate and debug the generic interface.
  • The verification step is to compare the size of the cbfs cache required for the cubieboard using this access method with the old one. In the end it should be dramatically reduced.
  • By the end, we will have a reliable method to boot coreboot on ARM SoCs.

I hope I can complete the project satisfactorily and have a great learning experience with the community. Waiting for an awesome summer ahead! 😀