Progress GSoC week #1

The first task of my project is a working development board. A development board means that I have serial communication and I can flash new firmwares the chip and whole mainboard isn’t booting. The chip is a H8S 16-bit microcontroller with 64kb to 128kb EEPROM and is available in different packages. BGA and TQFP. BGA means the pins are under the chip, TQFP has pins on the side. TQFP is nice to hack, but most modern Thinkpads use the BGAs. But a T40 or T42 use a TQFP package. A friend donated his old T42 to me! Thanks a lot! Now with a hackable T42 I can start to create a development board out of the T42 mainboard. Like most other microcontroller this chip has a programmable bootloader in a ROM (called rom loader). The bootloader can boot to different states, configurable via 5 pins (MD0 MD1 P90 P91 P92).
P90 to P92 are only read when MD0 and MD1 are in a special bootstate.
After reading the documentation I found that the pins must match the following volatage levels to select the flash boot mode:

MD0-MD1 = 0V, P90-P92 = 3.3V.

Besides these configuration pins we need some additional wires to the following pins:
/RES – reset active low
UART RX – serial communication

Now it gets interesting. The MCU (microcontroller unit) can use a pin for different purposes depending on the PCB designer. Those pins called multifunction pins. Hopefully we don’t get blocked by unaccessible pins. After reading more documentation and using a Multimeter on the board I found out that /RES, RX, TX, MD1 require soldering, but are easy accessible. MD0 is already in a good state.
P90 is connected via a resistor to ground, but we need it to 3.3V.
Let’s find the resistor to solder 3.3V to it… Mhh. tricky! 3h later I found it on the
board hidden under the PMH4 (2nd EC/GPIO expander). Very uncommon.

P91 is named /SUS. Suspended active low, but can be driven by multiple controllers (chipset + h8s).
Because we want to boot linux on the main cpu later in the project we should not kill the chipset. I added a pin connector to this pin.

And the last pin P92 was connected to the SuperIO UART’s level shifter (MAX3242). I had to desolder the chip because P92 was driven by the level shifter.

Near the EC are 2 testpoints which are connected to an I2C bus. I soldered these too, because an I2C could be useful.

1 P91
3 md1
5 RX
6 TX
1 patch cable with a 3.3V + 1k Resistor (for P91).

So far so good. But somehow it doesn’t work. Some pins doesn’t have the right level. P92 doesnt have 3.3V. Why not?
P92 is pulled up via a resistor to VCC of the TTL shifter. The VCC isn’t powered. I need to resolder it to another 3.3V pin somewhere and take another look
on the other levels too.

PS. Some work was already done before GSoC started. I posted the first part of soldering on my blog

coreboot GSOC 2015 projects

Welcome the coreboot GSOC 2015 students!

Łukasz Dmitrowski – End user flash tool

lynxislazus – H8S EC firmware

Naman – Qemu Support for ARM64 Architecture

n1cky –  Improve/automate reporting and testing


A special thanks to this years mentors: adurbin, furquan, martinr, carldani, stepan, pgeorgi, rminnich, ruik, stefanct, and carebear!

coreboot GSoC 2015

coreboot has been accepted as a mentoring organization for GSoC 2015! We are accepting student applications for coreboot, flashrom, and SerialICE projects.

Students:  Welcome! Please review the coreboot GSoC page and Project Ideas page. Student applications formally open on 16 March at 19:00 UTC.

Mentors:  Please signup in Melange and request to join the coreboot project. You must signup for 2015, even if you already have a Melange account. You may also find the GSoC Mentor Guide helpful.

Community:  Please recruit and welcome prospective students to the organization. You may provide ideas on the Projects Page.

GSoC [infrastructure] : Along the way, something went terribly wrong

I started working with AMD platforms’ infrastructure with high hopes of being able  to better manage the CBMEM setup. While I have a selection of family14 boards to work with, things did not continue so well:

First off, tree had literally thousands of lines of copy-pasted or misplaced AGESA interface code remaining in the tree. A lot of that should have been caught in the reviews, but it appears a few years back the attitude was that if coreboot project was lucky enough to get some patches from an industry partner, the code must be good (as the development was paid for!) and just got rubber-stamped and committed.

Second, the agreement I have for chipset documentation is not open-source friendly. It contains a clause saying all documentation behind the site login is to be used for internal evaluation only. I was well aware of this at the beginning of GSoC and at that time I expected my mentor organisation would be able to get me in contact with right people at AMD to get this fixed. But that never happened. I am also concerned of the little amount of feedback received as essentially nobody in community has fam16kb to test.

So currently I am balancing which parts of my work on AGESA I should and can publish and what I cannot. Furthermore, first evidences that vendor has decided to withdraw from releasing  AGESA sources have appeared for review. In practice there has been zero communication with the coreboot community on this so I anticipate the mistakes that were done with FSP binary blobs will get repeated.

Needless to say the impact this has had on my motivation to further work on AGESA as my efforts are likely to go wasted with any new boards using blobs. I guess this leaves pleanty of opportunities for future positive surprises once we have things like complete timestamps, CBMEM and USBDEBUG consoles and generally any working debug output from AGESA implemented. I attempted to initiate communication around these topics already in fall 2013 without success and it is sad to see the communications between different parties interested in overall tree maintenance have not improved at all since then.

GSoC 2014 [flashrom] Support for Intel Bay Trail, Rangeley/Avoton and Wildcat Point

While we were busy updating our AMD driver code to accommodate the new SPI controller found in Kabini and Temash, Intel has also changed their SPI interface(s) in a way that required quite some effort to support it in flashrom. A pending patch set is the result of the work of a number of parties and I will shortly explain some details below.

All started with a patch for ChromiumOS’s flashrom fork in fall 2013 that introduced support for Intel’s Bay Trail SoCs which are used in a number of currently shipping or announced Chromebooks. Bay Trail is part of the Silvermont architecture also in other SoCs intended for different use cases like mobile phones (Merrifield and Moorefield) or special-purpose servers (Avoton and Rangeley). At least for the latter two the SPI interface is equivalent to Bay Trail’s. This was handy for Sage when they were developing their support package for Intel’s Mohon Peak (Rangeley reference board) which was upstreamed to the coreboot repository shortly before this blog post was written. They ported the patch to vanilla flashrom, added the necessary PCI IDs and submitted the result to our mailing list at the end of May.

Because we, the flashrom maintainers, are very picky, the code could not be incorporated as is. I took the patch and completely reworked and refactored it so that more code could be shared and we are hopefully better prepared for future variations of similar changes. Additionally, I have also backported the Intel Wildcat Point support that ChromiumOS got already in May.

The major part of my contribution was not simply integrating foreign code, but refactoring and refining it where possible as well as verifying it against datasheets. While digging these numerous datasheets and SPI programming guides I have also fixed the problem of hardware sequencing not working on Lynx Point (and Wildcat Point) as it was reported in March when I had no time to correct it.

All of this is not committed to the main repository yet, but will be soon. It is mostly untested so far and I would very welcome any testers with the respective hardware.

[GSoC-2014] Payload Loading – Success!

This post marks the completion of the payload re-structuring that I did as a second part of the project. The following summarizes the major highlights of the work:

  • We change the ‘struct payload’ ( in the src/include/payload_loader.h) to have a ‘struct cbfs_media’ and cbfs_file_handle. That way we have the two thing we need to read the contents of the file.
  • In the build_self_segment_lists() we use the above data structures to media->read() the payload metadata. 
  • We use the metadata to segregate segments on the basis of their types and form a linked list of segments for reference later.
  •  Then we do mapping and then decompression for the compressed segments and direct reading for the uncompressed segments.

Some of the major issues were: playing with the data_offset and segment offsets; to get src_address to read to.

I was able to get past all the issues and finally got a successful working boot This completes the revamp of stage and payload loading 😀

During the past week I also worked with gerrit; wherein I submitted my patches and received feedback. I am more involved with the development process of coreboot now and leaving aside some minor glitches; the process was very smooth.


[GSoC-2014] [cbfs_media] Stage 2

As per the plan, we set out to investigate the decompression algorithm that is being employed. Mid-way through, we found something interesting that appealed to us. That was the payload loading process. In our existing architecture, we memory map the entire payload. The selfload()’s current API assumes the payload has already been memory mapped. That’s the bad assumption that needs to change. Even if we investigate and resolve the decompression algorithm, and get a pipelined architecture relying on smaller buffer size, still this mapping will cost resources. Hence this needs to be rectified before we go on the the decompression thingy. So we decided this is what we will be targeting. 

I then spent some time trying to break the payload loading and see where we can put some map() saving efforts. Below shows some details of the process:
1. First we locate the payload. Here the process for stage loading happens where we have the following:
Reading done. size = 24 bytes
Load entry 0x14440 file name (32 bytes)…
Mapping size is equal to 32
Found file:offset = 0x14478, len=95716
CBFS: Found file.
default_media->map(0x14478, 0x1761c)
Mapping size is equal to 95804
CBFS: located payload @ 7ec14298, 95716 bytes.
Thus we map the all the segments with this one big mapping.
2. Loading procedure begins by building the segment list (build_self_segment_list() does this) Here w.r.t. payload_segment_types, we check for proper destination address, file_size etc. After checking for the segment; we do simple aligning by pointing the prev and next pointers appropriately; to reach to the further places where to place the segment; and so on.
3. In load_self_segments() we run a simple for loop covering all segments. The loading that happens is
–> Loading segment from rom address 0x7ec14298
code (compression=1)
New segment dstaddr 0x4a000000 memsize 0x39929 srcaddr 0x7ec142d0
filesize 0x175ac
(cleaned up) New segment addr 0x4a000000 size 0x39929 offset 0x7ec142d0 filesize 0x175ac
–> Loading segment from rom address 0x7ec142b4
Entry Point 0x4a000000
(iii) After this we come to load_self_segments()
First a bounce buffer is created:  Bounce Buffer at 7ffcf000, 186192 bytes.
We have one segment that is worked upon; which is compressed hence ulzma(src , dest) reads it.
–>Loading Segment: addr: 0x000000004a000000 memsz: 0x0000000000039929 filesz: 0x00000000000175ac
Post relocation: addr: 0x000000004a000000 memsz: 0x0000000000039929 filesz: 0x00000000000175ac
using LZMA
After that one segment that we see on the logs it says
–> Loaded segments
Hence process complete. In essence we had only 3 segments.
Currently, I am working on a strategy on deciding how to modify the architecture of the API so as to conserve as much sram memory consumption as possible.

[GSoC 2014][cbfs_media] Stage 1 : Mission Accomplished

Firstly, sorry for the delay in posting update on the work. I had been busy getting the design to code and wanted to post after its successful completion.

As I had talked about in the previous post, we did a detailed analysis on the existing read() and map() calls. The original log; with all the extra gibberish removed can be seen here. The first design modification that was done was to remove the mapping done for getting cbfs_header. These were the  0x20 size mappings we see in the log. These were unnecessary and could be done away with. And we did! 😛 This log shows the first optimized build; Stage 1 -> Part 1 ->done.

Now we moved on to the more complex and colossal mappings. A function cbfs_find_file() was created, which returned the absolute data_offset of the file based on the name and type we ask for. Once we have the whereabouts of the file; modifications were made in cbfs_load_stage() to appropriately read() and/or map() various files.

The files are arranged as  -> [  cbfs_file  ] [  cbfs_stage  ] [  data  ] <Thanks Aaron for this visualization >

cbfs_find_file() : worked with the cbfs_file to get details about the whereabouts of the file

cbfs_load_stage() : we first read fundamental information about the stage; and then do corresponding map() or read()

Voila!! Stage 1 Complete! 😀

Now, the major issue we have persisting is that the decompression of file data assumes memory mapped access to its contents, and hence is quite inefficient due the that ‘one’ large buffer. SO  this is what we tackle next, to be more precise, have a pipelined decompression strategy which would eliminate the need for one large data buffer.

Its getting fascinating to work on the project by the day! Until the next post, signing off.

P.S.  Thanks Aaron for helping out with any and every issue I face, and always finding the time to reply, even on sundays! 😀

GSoC 2014 [cbfs_media] Updates

This past week went into looking at the internal working of the cbfs_media interface. Some of the major observations were:

Locations of map() and read() calls
No read() calls at all. Also for the map() calls that were made, there weren’t any unmap() calls.

Size of mappings
The entire cbfs is pulled into the iram. There is a map call which puts about 28KB into the sram, to load romstage. The a10 has an sram of 32KB, hence we are using up most of the necessary ram.
The sequence followed is open() -> map()’s -> close().

Total Resources
Is just the sum of all the mappings, since there are no unmaps to subtract. This gives a benchmark to work upon. Now resource utilization is calculated each time coreboot loads, automatically and progress can tracked.
Now we are giving some thoughts on how to reduce the size of the mappings, one possibility being defining a limit (bound) on its size. What is happening currently, is the size is determined dynamically and hence some mappings are quite large. If we define a bound on it, and then repeat call ‘smaller’ map()s instead of one big one, that could do the job. But this wont always work as the decompression algorithm (LZMA) expects memory-mapped access to the entire compressed buffer. By the end of this week, we hope to strike a workaround this and get a more resource-efficient cbfs interface.

GSoC 2014 [flashrom] Rise like a phoenix

The best plan is no plan at all, then everything goes according to the plan.  (Jacek Bukowski)

I am 142% on track according to my proposal so far if you take the quote above seriously. But even if you don’t the progress so far is way better than I would have hoped for when I applied. I made 22 commits so far during the official coding period (starting with 2014-05-19 or within ~16 days), and 49 if you count from the date of project acceptance (2014-04-21, i.e. ~37 days). That’s pretty significant for a project that normally does 100-150 commits per year. Not all of the patches were authored by myself, and some of them were bitrotting for an extended period of time before I revived them. I have also reviewed, rebased and merged patches sent in by foreigners as well as long-term contributors. Even Carl-Daniel awakes periodically from hiatus and helps where he can. So flashrom seems to be pretty vivid again, yay.

Two of the most interesting sets of changes are the eventually committed support for AMD’s Yangtze-based SPI controller (found in Kabini and Tamesh), and finer-grained display of support/test status of hardware. For example we can now clearly indicate if a flash chip is actually a ROM that can only be probed and read. This has become necessary when we added our first ROM chip, Macronix MX23L3254. The results can be seen in the wiki as well as flashrom’s -L output. Besides that there were also various new chips and even two programmers committed with another 1-3 waiting.

It is still a bit early to nail down any fundamental changes I want to tackle, but I think I’ll manage to rewrite the probing algorithm completely this year. I have made one huge step towards that by getting rid of the .probe_timing field of struct flashchip. It was used by some probing functions of parallel chips but there were only about 5 concrete values so they could easily be wrapped into stub functions. This is – as Carl-Daniel correctly stated – not what we normally do… replacing data in tables with code, but in this case I think it is completely justified. The next big step will be to turn the probing loop inside-out. Currently we iterate over all flash chips, filter the ones compatible with the selected programmer, and execute the respective probing function stored. Naturally this creates the same results over and over again because there is only about a dozen different probing functions. That does not only take a lot of unnecessary time but also makes it hard to support more than one probing function. It should be the other way round: we should iterate over all possible probing functions and match the results with the stored flash chips. I will report back how far I got with that in my next post.