Google Summer of Code

GSoC [early debugging] closure

The time to stop coding came last week, and I feel I have barely scratched the surface of my original project proposal.

My biggest contribution comes in the form of usbdebug patches: most mainboards with a USB 2.0 EHCI controller should now be able to produce console logging on a USB port. More importantly, one now has the option of using some low-cost ARM development kit boards as a replacement for the Net20DC dongle device. Digging into ARM was not really in my original plans, and setting up the required environment took more time than I had hoped for.

I had the idea of collecting a trace of PCI configuration in CBMEM. It turned out I first had to do some cleanup on both PCI and CBMEM, and while that cleanup has been submitted (with a minimal amount of testing), I did not get to start the tracing part at all. On the other hand, the CBMEM cleanup has enabled timestamp collection and the CBMEM console in ramstage for most mainboards, and I have a fairly clear vision of what needs to be done to extend these to romstage as well. I think there would be interest in having these features on ARMv7 too; it just takes a bit of coordination and access to the platform.

On the so-called “coreboot panic room” tasks I did not submit any code. I liked the idea of using a BeagleBone board as a proxy, with the possibility to switch between SerialICE debugging, GDB, and a pre-OS flash programmer. If time permits, there might even be some integration or interaction between SerialICE and radare coming up in the next few months.


GSoC 2013 [flashrom] blog post #8

After my last post I continued to improve our flashrom build bot: it now uses VirtualBox’s saved state feature, which allows bringing VMs up and down in a matter of seconds, and I was also able to fully parallelize the builds. Using the saved state feature alone cuts the duration of a full build from 4m55.930s to 2m10.496s. Fully parallel execution gets it down to 25.807s, yay.
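
For reference, parking a VM in a saved state and resuming it later looks roughly like this; the VM name is just a placeholder for whatever the build bot actually uses:

$ VBoxManage controlvm "bsd-builder" savestate        # freeze the VM and write its state to disk
$ VBoxManage startvm "bsd-builder" --type headless    # resume from the saved state in a few seconds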

This week I have merged a mix of old and new patches that fix a few things, add support for other things, and most importantly… get rid of the dependency on dmidecode: I have finally committed my internal DMI decoder. Compiling for DOS and under various BSDs should be easier now, and the AMD patches are slowly getting reviewed… clearly there is some progress, but obviously not enough to reach my original goals for GSoC…

GSoC (coreboot): Test interface board complete

Apologies for the late update. The design that I posted last time was more challenging than I had thought. However, I am happy to announce that my test hardware, which I call the ‘coreboot test interface board’ (TIB), is now complete. Only part of the software interface remains to be done in the project. So let me share a very quick update on the last month.

GSoC [early debugging] More connectivity

A substantial cleanup of CBMEM initialisation is now under review. The goal is to get timestamps and the CBMEM console supported on more (ideally most) mainboards, but I do not expect to complete this during the official GSoC period, or even within the next two weeks.

One of the goals I originally set was to have a means of re-programming the flash chip from a pre-OS environment. It is now clear I will not have time to finish (or even start) this part of the project. The decision to delay this part was made quite early on, actually. I learned that similar work had already been done as a combination of FILO+libflashrom, and I hope Stefan’s efforts on another GSoC project will help get this code published in the near future. I might still try to get the FILO console to appear over usbdebug; adding support for the USB communications device class (CDC ACM) in libpayload should not be very difficult.

I have saved some of the most interesting and challenging parts for last: getting SerialICE and GDB to run over usbdebug. Hopefully I get to report on those next week.

GSoC 2013 [flashrom] blog post #7

Nothing too fancy to report this week. I have added NetBSD and DragonFlyBSD support to our build bot, which took quite a while: I had never set up a VirtualBox VM on a headless machine before. It is quite easy actually, but there are so many options one can configure with the CLI that the most important/unconditional ones are not obvious at all. For example, storage controllers need to be given a user-defined name so they can be addressed in later commands, and one has to specify the bus and device number when attaching a hard disk to an IDE storage controller (even if you don’t care at all), etc. It does not sound very problematic, but…
$ VBoxManage | wc -l
495

That’s quite a wall of text. Luckily there is good documentation and lots of howtos online.
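
To make the storage example above concrete, creating a named IDE controller and attaching a disk image to it goes roughly like this; the VM and disk names are placeholders:

$ VBoxManage storagectl "bsd-builder" --name "IDE" --add ide
$ VBoxManage storageattach "bsd-builder" --storagectl "IDE" \
      --port 0 --device 0 --type hdd --medium netbsd.vdi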

In general it wasn’t too bad, and the RDP console support of VirtualBox in conjunction with Remmina made up for it: I did not want to set up port forwarding on the remote host or SSH tunnels myself, but Remmina can set up the latter automatically on connect, which is really handy.

Configuring the BSDs was way worse, mostly because I was completely unfamiliar with them and with pkgsrc, but also because they have a long way to go regarding usability. Not only does the DragonFly installer not even try to set a correct keymap, I could not get that to work even after installation, and the installer has a few other quirks that thwart some functionality completely.

Anyway, build testing works on these hosts now and will hopefully prevent future breakage when tinkering with OS-dependent preprocessor #ifdefs (unlike before…). Next: some refactoring of the build bot and fixing issues with getrevision’s use of date on BSDs. Then back again to layout patches and libflashrom.

GSoC [early debugging] Bridging the gap

ARM is now on the table, as I am bridging the coreboot console from USB to gigabit ethernet using a BeagleBone board running the USB debug device gadget driver. At first I was a bit concerned about all the latency a software USB-to-ethernet bridge would add to the communication, so it was a good time to do some measurements.

I used a daemon called ser2net to redirect the coreboot console TTY to TCP (telnet) and enabled timestamping to estimate the time it takes from power-on to entering the payload on amd/persimmon with maximum logging (level=spew).
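
For the record, a single redirect of this kind needs only one ser2net configuration line, which can also be passed on the command line. The TCP port and TTY device below are placeholders for whatever the debug gadget exposes on the BeagleBone, and the timestamping option is left out since it depends on the ser2net version:

$ ser2net -n -C "4000:telnet:600:/dev/ttyGS0:115200 8DATABITS NONE 1STOPBIT"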

First, connecting over a serial UART at 115200 bps on an x86 host, the total was 15 seconds. Of this, 4 seconds were spent in AGESA doing some SPI flash operations, during which there was no console output.

Repeating the same test using usbdebug on the BeagleBone OTG port, the total time was 9 seconds, and again 4 seconds of that was waiting for SPI completion. That leaves roughly 11 seconds versus 5 seconds of actual console output, so as a rough figure, console output over usbdebug is about twice as fast as what a typical super-IO UART can offer.

Driver compilation and patches are updated here: http://www.coreboot.org/EHCI_Gadget_Debug

To use the BeagleBone, some additional work was needed on the coreboot side, but the BeagleBone Black and other ARM boards where there is no hub between the OTG connector and the controller should already work.

A brief progress sheet

Last week, I laid out a list of things to do in order to get more of the protocol finalized. Most of the items are crossed off; however, one little item remains standing. This item, while innocent and seemingly harmless, is more painful than falling on your buttocks onto solid granite from a 10-story building. Let’s have a look at why this small item is of such significance.

  • new API call set_chip_size()

When people think of QiProg, they think of one gadget with one flash chip connected. This is the common case and, for the foreseeable future, will be the de facto way of using QiProg. However, the original USB specification was intended for a broader use case: a programmer with several, individually addressable chips connected. Anyone who looks at the qiprog_read_chip_id() call will notice that it translates to a READ_CHIP_ID request over USB. This request returns identification data for up to nine chips. Aye, there’s the rub.

How does this play into set_chip_size()? Simple: set_chip_size() for which chip? Do we send a flat list of nine uint32_t sizes, thus needing only one round-trip (control request) for all chips? Or do we use the wIndex field of the request, at the cost of needing one such trip for each chip? Once this question is answered, it will determine the answer for the set_[erase/write]_[size/command] calls and their respective USB round-trips, thus completing the USB protocol and bringing QiProg to a usable state.

It’s easy to see why this one little detail is a blocker for all the other remaining issues. I am leaning towards using wIndex (not the glass cleaner). Implementing a new control request in software and firmware is a matter of minutes. Testing it and making sure it works properly is, at most, a two-hour endeavor. Getting the design right: priceless.
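
To make the wIndex option more concrete, here is a rough host-side sketch using libusb. The request number and field layout are purely illustrative assumptions, since defining them is exactly the open question:

#include <stdint.h>
#include <libusb.h>

/* Hypothetical request number -- the real value is what remains to be defined. */
#define QIPROG_SET_CHIP_SIZE	0x42

/*
 * One control round-trip per chip: wIndex selects which of the (up to nine)
 * connected chips the size refers to, and the 32-bit size travels in the
 * data stage, since wValue and wIndex are only 16 bits wide.
 */
static int set_chip_size(libusb_device_handle *h, uint16_t chip_idx, uint32_t size)
{
	uint8_t data[4];

	data[0] = size & 0xff;
	data[1] = (size >> 8) & 0xff;
	data[2] = (size >> 16) & 0xff;
	data[3] = (size >> 24) & 0xff;

	return libusb_control_transfer(h,
			LIBUSB_REQUEST_TYPE_VENDOR | LIBUSB_RECIPIENT_DEVICE |
			LIBUSB_ENDPOINT_OUT,
			QIPROG_SET_CHIP_SIZE,
			0,		/* wValue: unused in this sketch */
			chip_idx,	/* wIndex: which chip */
			data, sizeof(data), 1000);
}

The flat-list alternative would instead send all nine sizes in a single 36-byte data stage with wIndex unused.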

GSoC [early debugging] USB submission

It seems I am reaching one of the goals of my original GSoC proposal: making usbdebug available as a compile-time option for most of the mainboards with a compatible chipset.

As I wrote in an earlier blog post, it has been more of a refactoring job on existing code than creating something entirely new. There are a number of protocol details where I spotted that the implementation took some shortcuts, and I have attempted to fix those. I also made improvements to better control how a possible debug dongle connection is probed. More testing is needed there, and it also needs to be fixed so that it does not become excessively slow when the dongle is not connected, or if it is disconnected before the OS is loaded.

At the time of writing, my patches have not yet been submitted to master but are available on the gerrit review board. It is likely there will be some minor fixes, so I will not give exact commit hashes to check out and merge. In short: check out and merge the two topic branches usbdebug-cfg and usbdebug-lib. In addition, for AMD AGESA boards one also needs “AMD AGESA: Place CAR_GLOBAL in BSP stack”.

Now if only you had something to connect it with, I could ask for your help actually testing these changes and finding out whether they work! I am still waiting for my BeagleBone Black to arrive so I can make some fixes to the kernel EHCI debug gadget driver; the situation with the choice of debug dongles should then improve quite radically. I do have the older BeagleBone, have built custom kernel modules for it, and have started to study how the g_dbgp driver interacts with the gadget serial port framework.

What I discovered after picking up a second-hand original BeagleBone was that its USB port is not directly connected to the ARM chip; there is a USB hub chip in between. It might be possible to configure that hub even though our USB requests are limited to a maximum of 8 bytes per transaction. The EHCI debug specification does not allow a hub there, but if we can make it work, why not do it?


Experiments of mind

The time for writing code is over. The time to design hardware is over. After seven weeks, the beginning has come to an abrupt end. I am severely behind schedule. In week seven I was supposed to implement erase functionality — tell the programmer how to erase the chip. This is not done. On the other hand, I have had the code for weeks 8 and 9 almost ready for a while, and just merged most of it last week. So, where am I? Am I ahead of or behind schedule?

The fallacy of preemption

One of the requirements for applying as a GSoC coreboot student was to have a fully established schedule from day[-1]. Establishing this schedule was a great experience, and it allowed me to think in depth about the problem and possible solutions — to a certain degree. I picked the steps I considered logical, in the order that seemed logical. Development, however, is never about writing code in the order in which it will be executed. In this particular case, it was much easier to implement bulk writing without a predefined erase/write strategy, opting instead for a default just-do-it approach.

Why is this approach better than following the schedule, from a development point of view? We have had bulk read partially working for a while now. From the host’s point of view, reading and writing are symmetrical operations. The bulk of the code (pun definitely intended) is shared between the read and write operations: they both juggle data on the same endpoint, and the only difference is the endpoint direction bit. It therefore made sense, once bulk reading was fixed for corner cases, to use the same code to send data to the programmer. Making the programmer write that data was a matter of a couple of hours. There was no sensible reason to wait an additional two weeks before implementing this last bit.
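
On the host side the symmetry is easy to picture with libusb: the same bulk transfer call moves data in either direction, and only the direction bit of the endpoint address changes. The endpoint number below is an arbitrary example, not necessarily QiProg’s:

#include <stdint.h>
#include <libusb.h>

/* Example endpoint number; the real QiProg endpoint may differ. */
#define BULK_EP_NUM	0x01

/* One helper covers both read and write: only the direction bit changes. */
static int bulk_xfer(libusb_device_handle *h, int to_device,
		     unsigned char *buf, int len, int *transferred)
{
	uint8_t ep = BULK_EP_NUM |
		     (to_device ? LIBUSB_ENDPOINT_OUT : LIBUSB_ENDPOINT_IN);

	return libusb_bulk_transfer(h, ep, buf, len, transferred, 1000);
}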

Software development work is as much about making things work as it is about applying programming principles with unquestionable moral authority and correctness. In this case, implementing a trivial extension while the code was fresh in my mind was the preferred approach. Not only did it save me the time of having to re-examine the situation a few weeks from now, it also allows me to have a working program/verify scenario when implementing the erase strategies. As one might imagine, this makes the problem a lot easier. Attempting to preempt and enforce a schedule before the problem is thoroughly explored occasionally conflicts with development best practices. With this in mind, I am neither behind nor ahead of schedule. I am precisely where I need to be.

A matter of experimentation

Most of the infrastructure and code is already in place. Bringing QiProg to completion is no longer a matter of adding functionality through new code, but rather of completing functionality by connecting the existing code. One issue I discovered after testing the bulk program code was a terrible race condition between read prefetching and the write loop: the prefetch logic incremented the internal address before the write data arrived, so the new data would get written to the wrong address. Choosing the best solution to the problem is a matter of experimentation.

The “this won’t work because of that” and “what if this” questions turned into a series of exhausting thought experiments. I have been bugging Peter a lot in the past few days about a series of potential issues. Through tiring thought experimentation, we eventually agreed that the best way forward is to abstract a lot more through the API. This is a non-exhaustive list of the decisions we’ve made in the past week:

  • set_address() is hidden from the API
  • the internal address range is not exhausted once read or written
  • read and write operations must not be interdependent: the internal read and write pointers will be distinct (as a side effect, this change also eliminates the race condition described above)
  • set_address() + readn() turns into read(dev, where, n)
  • All API addresses begin at 0. The programmer translates that into an absolute address
  • new API call set_chip_size()
  • new API call to explicitly erase blocks or sectors (to be defined)
  • implicit erase on write can be enabled or disabled (to be defined)
  • implicit erase will erase the sector/block right before the first byte of the sector is written
  • exposing any USB specific dependencies in the API is strictly forbidden

My focus for the remainder of this week will be to shorten this list as much as possible. Once the dependency between read and write is unshackled, I will be able to erase/program/verify my faithful SST 49LF080A. From here, it will be a matter of finalizing and implementing the last obscure bits of the specification.
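
To illustrate what the new read semantics from the list above might look like, here is a rough sketch of the call that replaces set_address() + readn(). The exact name and signature are an assumption, not the final libqiprog interface:

#include <stdint.h>

struct qiprog_device;	/* opaque device handle */

/*
 * 'where' is a chip-relative address starting at 0; the programmer
 * translates it to an absolute address. A corresponding write call keeps
 * its own, independent internal pointer, so reads and writes no longer
 * interfere with each other.
 */
int qiprog_read(struct qiprog_device *dev, uint32_t where, void *dest, uint32_t n);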

The state of QiProg for flashrom

As QiProg is still being finalized, implementing it as a flashrom programmer is still a long way off. I do estimate that weeks 11 and 12 will provide ample time to integrate everything into flashrom, hopefully in time for the 0.9.8 release.