GSoC [early debugging] AGESA woes

I took some days off the project for a holiday in mid-July. Since then I have spent a fair amount of time problem solving (with the headaches that go with it) on my main development platform (samsung/lumpy), as the upstream coreboot tree still appears to be broken. There is an incompatibility with a recent binary blob, and even after some local fixes I still lose some early debugging with usbdebug.

SAGE Electronic Engineering has kindly provided me with an AMD Persimmon board for coreboot development. It took a while for me to get it up and running, as it turned out that even the very basic documentation of the board connectors and jumpers was behind registration and login on the AMD website. The board arrived with a coreboot + SeaBIOS combination pre-installed. The flashrom utility seems to work, and I also have an SPI header connector for recovery purposes. So I should be all equipped for my own build, and I hope I get lucky with usbdebug on this hardware.

Now I must say I am not a big fan of the AMD vendorcode named AGESA. Things like CBMEM init and CAR setup are done in a very different fashion compared to the implementation without vendorcode wrappers. It is heavy reading, and to get early logging via either CBMEM or usbdebug, I will need to master and possibly modify that code too.

On the SerialICE parts I have not made progress the way I had originally planned. Meanwhile, some cleanups on low-level PCI configuration and minor usbdebug fixes have been merged. There is more to come on that front. I also have a fairly complex patchset draft on CBMEM; finalizing it should make it possible to enable the CBMEM console for all boards, for ramstage at least.

 

VultureProg: Equipped for galactic travel

vultureprog_ready_for_launch

It’s here! And it’s ready for takeoff. The VultureProg PCBs have finally arrived, and it is time to turn VultureProg from a proof-of-concept toy into a serious galactic tool. My major concern was that I could have misrouted one or two connections. The LAD pins are particularly sensitive, as they need to be mapped to sequential GPIO pins, starting from GPIO0; otherwise we need to do bitshifts in every LPC cycle, killing any hope of decent performance. I was also worried that the particularly tight tolerances could be problematic during manufacture. Everything works as expected. Enough words, let’s see the porn.

vultureprog_hull

The bare PCBs

vultureprog_shuttle

Brings back memories, doesn’t it?

vultureprog_shuttle_angle

The world’s first fully assembled VultureProg board

From bitstream to reality

I found it interesting to look at the transformation from a spaghetti on the screen to something real, something tangible.

vultureprog_shuttle_sbs

From conception to birth: the complete transformation

No thorough and thoughtful post today. I have a new toy.

 

 

QiProg: Expanding the flight range


Today is disappointing. I was expecting to have received the first batch of VultureProg PCBs. With the package having arrived in the US on Tuesday, I did not expect it to be hovering in New York for most of this week. The need for eating the spaghetti and eliminating the hanging wires is growing more and more urgent with each speed increase. The long wires and insane inductance are already getting in the way of the signals, the lack of physical portability of the test setup is annoying, and the pain of accidentally knocking out a few wires is unbearable. Where are my PCBs?!!!

fast_faster_vultureprog

High speed tests

The Stellaris is getting faster, much faster than initially predicted. From the pathetic 23 KiB/s read speed during the first tests, it now comfortably does over 450 KiB/s. The emulated LPC bus goes so fast that I need to lower the CPU’s core clock down to a measly 16 MHz just to be able to capture all the details of the waveforms.

Too much impedance

24MHz logic analyzer + Nyquist theorem = 12+MHz LPC clock

We’re pushing a 12MHz signal through 20+ centimeter hanging wires, routing them through a solder-less breadboard, then a socket, with the added load of probe wires. I was getting random errors or bad data, yet as soon as I disconnected the probe wires, everything magically worked. It must have had something to do with signal rise and fall times. On the Stellaris, this was relatively easy to cheat and fix by increasing the drive strength.

Where’s my bandwidth?

450 KiB/s * 1024 B/KiB * 17 clocks/byte ≈ 8 MHz LPC clock

Something does not add up. We know we’re running at over 12 MHz because we cannot sample the signal well, yet the throughput is much smaller. The USB on the Stellaris is easily capable of 1 MiB/s. To quote Seconds from Disaster, “when investigators looked at [...], what they found shocked them”:

vultureprog_idle_bus

 

The LPC bus is idle 30% of the time. We’re driving the bus so fast that loop overhead is not only noticeable, but significant. Killing this overhead has the potential to bring the speed to 600 KiB/s.
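The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the breakdown of the 17 clocks into individual fields is an assumption based on a typical LPC memory read cycle with a single SYNC clock, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the LPC throughput figures quoted above.
# ASSUMPTION: the per-field clock counts describe a typical LPC memory
# read with one SYNC clock; only the total of 17 comes from the post.
LPC_READ_CLOCKS = {
    "START": 1,
    "CYCTYPE_DIR": 1,
    "ADDR": 8,              # 32-bit address, one nibble per clock
    "TAR_host_to_chip": 2,  # bus turn-around
    "SYNC": 1,
    "DATA": 2,              # one byte, two nibbles
    "TAR_chip_to_host": 2,
}

KIB = 1024


def clocks_per_byte(fields=LPC_READ_CLOCKS):
    """Total LPC clocks needed to read a single byte."""
    return sum(fields.values())


def effective_clock_hz(throughput_kib_s, clocks=None):
    """LPC clock implied by a throughput if the bus were never idle."""
    if clocks is None:
        clocks = clocks_per_byte()
    return throughput_kib_s * KIB * clocks


def throughput_without_idle(throughput_kib_s, idle_fraction):
    """Upper bound on throughput if all idle time were removed."""
    return throughput_kib_s / (1.0 - idle_fraction)


if __name__ == "__main__":
    print(clocks_per_byte())                    # clocks per byte
    print(effective_clock_hz(450) / 1e6)        # implied clock in MHz
    print(throughput_without_idle(450, 0.30))   # KiB/s with no idle time
```

With 30% idle time removed entirely, the upper bound comes out a little above 640 KiB/s, which is in the same ballpark as the 600 KiB/s estimate once some loop overhead is allowed to remain.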

Insect season

It’s a hot summer in Houston, with air conditioning running around the clock. Stepping into the hot, wet weather outside results in an instant cascade of sweat. Insects are crawling from every nook and crevice, fire ants are spawning from the underground in huge mounds, and mosquitoes are raising their own deadly army of high-pitched buzzers. The QiProg and VultureProg trees are no different.

I’ve made the executive decision to fix bugs as soon as they are getting in the way. I intentionally avoid the use of the term found. I can find my own damn bugs, however, it’s fixing them that is the problem. I prefer to have all the pieces in place before I polish and shine them.

A ripe testing ground

Out of curiosity, I wanted to see if VultureProg would work on Install’n’Pray operating systems (InP). InP are known for their excellent ability to expose even the smallest, most innocent problems, and to expose problems where there are none. If VultureProg can work in an InP environment, it most certainly will be stable in unix-like environments. Although it did not work at first, testing on InP has led to a number of fixes.

promiscuous_vulture

Where next?

We have bulk reading completed, which was initially scheduled for week 6. Weeks 7 and 8 look scary, with a lot of goodies, including program and erase functionality. I’ve decided to peek early into how to program and erase LPC chips. I found flashrom’s jedec.c to be of great help: QiProg knows how to byte-program and erase JEDEC-compliant chips. A lot is still scattered in topic branches here and there, waiting for one’s mercy to merge. Somebody please send me some coffee.
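For readers who have not met them, the JEDEC software command sequences behind byte-program and erase can be sketched as below. This is not QiProg or flashrom code: `FakeBus` is a hypothetical stand-in that only records writes, and the addresses and opcodes are the commonly used JEDEC values as I understand them. Individual chips may mask the unlock addresses differently or use other erase opcodes, so the datasheet has the final word.

```python
# Sketch of the common JEDEC command sequences for byte-program and erase.
# ASSUMPTION: FakeBus is a hypothetical recorder, not a real programmer API.
# Real code must also poll the chip (e.g. toggle bit) until each operation
# completes; that part is left out here.

UNLOCK_1 = (0x5555, 0xAA)
UNLOCK_2 = (0x2AAA, 0x55)


class FakeBus:
    """Records (address, data) write cycles instead of driving hardware."""

    def __init__(self):
        self.writes = []

    def write(self, addr, data):
        self.writes.append((addr, data))


def byte_program(bus, addr, value):
    """Unlock, issue the program command, then write the data byte."""
    for a, d in (UNLOCK_1, UNLOCK_2, (0x5555, 0xA0)):
        bus.write(a, d)
    bus.write(addr, value)


def sector_erase(bus, sector_addr):
    """Unlock, erase setup, unlock again, then sector-erase confirm."""
    for a, d in (UNLOCK_1, UNLOCK_2, (0x5555, 0x80), UNLOCK_1, UNLOCK_2):
        bus.write(a, d)
    bus.write(sector_addr, 0x30)


def chip_erase(bus):
    """Same as sector erase, but the final cycle selects the whole chip."""
    for a, d in (UNLOCK_1, UNLOCK_2, (0x5555, 0x80),
                 UNLOCK_1, UNLOCK_2, (0x5555, 0x10)):
        bus.write(a, d)


if __name__ == "__main__":
    bus = FakeBus()
    byte_program(bus, 0x1234, 0x5A)
    for addr, data in bus.writes:
        print(f"write 0x{addr:05X} <- 0x{data:02X}")
```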

GSoC (coreboot): Week 3 and 4

In the past two weeks I was on vacation and I have been working on what I call the “test interface board”. Before I go on to elaborate on this, I feel there’s a need to discuss the big picture of this project, because a lot of things have changed for good reasons and the old terminology no longer makes sense.

System Topology

As a reminder, my project is centered on building an inexpensive and flexible test rig for the Automated Distributed Firmware Test System described in the Quality Assurance talk by Stefan Reinauer.

A centralized Test Management Server generates test sequences for remotely located systems under test (SUTs). This includes controlling and monitoring the SUTs and flashing different firmware builds on them. The Test Management Server coordinates with a repository for accessing test builds and for storing test reports. Test reports are the final and useful output of the whole system, and clients may access them over the internet using a browser.

A Test Supervision Server is a low-power computer that acts as a local housekeeper of SUTs for a given physical location. It connects to the Test Management Server using SSH over the internet and executes given test sequences by coordinating closely with the SUTs using a Test Interface Board. Programmable power strips are provided to control the power supply to the SUTs from the Test Supervision Server.

My work will be confined to the distributed components for now. I have completed the programmable power-strip block. A future add-on to this block could be integrating active power & energy measurement of an SUT for energy efficiency benchmarking. If this is really desirable it could be done after I finish doing the other parts. Right now I’m working on the Test Interface Board.

Test Interface Board: Behaviour

The Test Interface Board provides the necessary hardware interface for connecting the Test Supervision Server to an SUT. This is necessary to flash firmware to the ROM, to control the power/reset switches, to measure PSU voltages and the surface temperature of ICs, and to take POST feedback if available. Let’s dive into more details to see how this can be done.

Test Interface Board: Detailed box diagrams

The FT232H has a multipurpose serial engine that can be configured as an SPI master. It has additional pins that may be used as GPIOs, so a GPIO expander may not be needed. The FT232H datasheet states that it offers up to 30 Mbps throughput in synchronous serial mode, which makes it a fast flashing solution for the given price point ($3). Slave Select (SS) pins can be used to switch between other devices, such as an ADC that gives voltage and temperature measurements and an optional Feedback microcontroller, configured as an SPI slave, that gives more information about the SUT. A few GPIO pins can be used to configure the Logic Level Translator to ensure compatibility with serial flash chips of different voltages ranging from 1.8 V to 5 V, and a GPIO pin will also be used to configure a FET toggle switch (MUX) that electrically detaches the serial flash for programming and connects it back to the motherboard when done.
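The detach, flash, reattach sequence described above could be wrapped so that the flash is always handed back to the motherboard, even if flashing fails. The sketch below is purely illustrative: `FakeGpio`, the pin names and the level-select encoding are all hypothetical, and no real FT232H library is involved.

```python
from contextlib import contextmanager

# ASSUMPTION: pin names and the voltage-select encoding are invented for
# this sketch; the real board wiring may look nothing like this.
MUX_PIN = "mux_select"
LEVEL_SELECT = {1.8: (0, 0), 3.3: (0, 1), 5.0: (1, 0)}


class FakeGpio:
    """Hypothetical GPIO port that just remembers the last value per pin."""

    def __init__(self):
        self.state = {}
        self.log = []

    def set(self, pin, value):
        self.state[pin] = value
        self.log.append((pin, value))


@contextmanager
def flash_attached_to_programmer(gpio, flash_voltage):
    """Route the serial flash to the programmer for the duration of a block."""
    sel0, sel1 = LEVEL_SELECT[flash_voltage]
    gpio.set("level_sel0", sel0)   # configure the level translator first
    gpio.set("level_sel1", sel1)
    gpio.set(MUX_PIN, 1)           # detach flash from the motherboard
    try:
        yield gpio
    finally:
        gpio.set(MUX_PIN, 0)       # always reconnect flash to the motherboard


if __name__ == "__main__":
    gpio = FakeGpio()
    with flash_attached_to_programmer(gpio, 3.3):
        print("flash is routed to the programmer:", gpio.state[MUX_PIN] == 1)
    print("flash is back on the motherboard:", gpio.state[MUX_PIN] == 0)
```

The `try`/`finally` is the point of the design: a failed flash should not leave the SUT with its ROM disconnected.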

Notice that I’ve got rid of the microcontroller this time. This is because a new microcontroller chip doesn’t necessarily have a bootloader, and it needs to be programmed using a dedicated programmer. This adds considerable cost and inconvenience for someone who needs to build only a few of these boards. So unless you’re using the optional Feedback module, nothing needs to be programmed. Just ordering the board and components and soldering everything up using a 15 W iron should be enough to make one of these.

I’m also going to ensure modularity by having small PCBs for each functionality connected to a main-board using headers so that they can be developed independently and used as required. Also, there’s flexibility of choosing temperature probes because it is possible that someone already has good quality probes (that come with professional DMMs).

And a few comments about the ADC I’ve chosen. The ideal choice of ADC for voltage and temperature measurements, where the sampling period is long, is an integrating ADC. An integrating ADC charges a capacitor from the input signal for a known period of time using an op-amp integrator, then discharges that capacitor using a known reference voltage of opposite polarity. The time it takes to discharge the capacitor is proportional to the average value (area under the curve) of the input signal over the sampling period. It’s theoretically simple, but it needs precision external components and a microcontroller program to work. This is the technique used in professional DMMs (True-RMS) and bench power supplies (for feedback). Delta-Sigma and successive-approximation ADCs are common and cheap these days, but they don’t average the values over time the way integrating ADCs do. However, they can provide acceptable accuracy for our application, and the MCP3208 (a 12-bit successive-approximation part) is a good candidate.
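To make the integrating conversion concrete, here is a small numerical sketch. It models an ideal integrator, so the component value `rc` and the sample-based input are assumptions for illustration only. The run-up stores a voltage proportional to the area under the input; the run-down time against the reference then gives V_in = V_ref * t_discharge / t_charge, and the RC constant cancels out.

```python
def integrate_and_convert(samples, sample_period, v_ref, rc=1.0):
    """Ideal dual-slope conversion of a sampled input waveform.

    samples       -- input voltage at each step (hypothetical test data)
    sample_period -- time between samples, in seconds
    v_ref         -- magnitude of the reference used for the run-down
    rc            -- integrator time constant (assumed ideal)
    """
    t_charge = len(samples) * sample_period
    # Run-up: the integrator output is the area under the input, over RC.
    v_cap = sum(v * sample_period for v in samples) / rc
    # Run-down: discharge at a constant slope of v_ref / rc until zero.
    t_discharge = v_cap * rc / v_ref
    # The reading depends only on the two times and the reference.
    return v_ref * t_discharge / t_charge


if __name__ == "__main__":
    # A 2 V level with a +/-1 V square ripple riding on it.
    noisy = [1.0, 3.0] * 50
    print(integrate_and_convert(noisy, 1e-3, v_ref=5.0))
    # A different RC value gives the same reading.
    print(integrate_and_convert(noisy, 1e-3, v_ref=5.0, rc=0.47))
```

The ripple averages out over the charge period, which is the property that makes this kind of ADC attractive for slow voltage and temperature readings.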

Please see the figure for more details and let me know if there are concerns or suggestions. I’ll post more stuff and schematics in a couple of days.

GSoC 2013 [flashrom] week #4

There is still no release, but we made quite some progress over the weekend. BSDs should work again as well as Windows (32b) – but the fix for 64b Windows will (most probably) not make it into 0.9.7 because we need to rework a bit of infrastructure code to get this right and we are not too sure about possible breakage from the quick hack I posted. We are still trying to verify the effectiveness of the IMC shutdown patch and looking for Dediprog testers with either an EM100 or a very old flash chip without fast reads. I expect the release to happen in the next two days. Now would be a good time to test flashrom’s trunk or my flashrom_0.9.7 branch on github on flash programmers near you. ;)

I would also welcome contributions to our brain storming about possible features of layout files. I decided to post them to the list instead of making a blog post out of it as envisioned in my last post.

DIY SOIC8 ZIF to DIP8 adapters

As mentioned previously, I got an ASRock A180-H sample and want to develop Kabini support for flashrom with it. The board features an 8-pin DIP socket, which is way easier to work with than a soldered flash chip, but the included DIP chip is the only one I possess. I do own quite a few SOIC chips and 3 SOIC8 ZIF sockets, though. And one of the two hacker spaces in town has these SOIC8 to DIP8 adapter PCBs in its inventory. So… of course I have built a SOIC8 ZIF to DIP8 adapter out of them.

Complete adapters are usually very expensive (>>50$) for no good reason (from a customer’s perspective only), so building such an adapter is the logical conclusion. Instead of the PCBs from futurlec mentioned above there is a very good deal available from adafruit. The cheapest SOIC ZIF sockets I could find are made by Wieson. They have 3 models available – for “200 mil” 8-pin (G6179-10), “209 mil” 8-pin (G6179-200000) and 16 pin (G6179-070000) chips. They are available from siliconkit (US, 2.5$ per 8-pin adapter, 10 pcs minimum), dediprog (Taiwan, 30$, 15 pcs package), and bios-repair.co.uk (UK, 5£) and probably elsewhere. I am using the G6179-10 below – IMHO the difference of the two 8-pin variants is the length of the chip to be inserted, not its width.

Before acquiring a pre-etched PCB I tried to solder a breakout board myself, similar to this tutorial:

diy_cropped_0

Attaching the wires works somewhat (but one needs steady hands and a lot of patience) but there is one major issue: there is just not enough space to fit the SOIC footprint inside the outline of the DIP pins (the DIP rows of the socket on the mainboard are 0.3″/~8mm apart, the SOIC chip including the pins is about the same size), and of course the pitch is so different that one can’t just solder it on top:
diy_tight_cropped_0

One could try solving this by rotating one of the footprints about 90° and routing the connections manually with enameled wires, but I very much prefer PCBs like this one:
board_cropped_0

Here is the end result with the futurlec PCB, which took me a fraction of the time I wrestled with the manual breakout approach:

standing_cropped_0

withchip_cropped_0

All pictures above are taken by myself and put into public domain.

GSoC 2013 [flashrom] week #3

After enjoying an extended weekend at the customs office, the Micron chips finally arrived at my place and I began testing them with flashrom while preparing parts of this post. One has quite a lot of idle time when testing bigger flash chips with a lousy self-made programmer… the execution of flashrom -w with random data took 52m58s on a 16MB chip (N25Q128) to complete (which is actually two complete reads and one complete erase + read + write cycle per block). Oh, if only there were a cheap but awesome open USB programmer… ;)

Carl-Daniel was still missing till I called him at work yesterday, so …

GSoC [early debugging] Art of refactor

Your branch is ahead of origin/master by 48 commits.

Yes, I knew this would happen. It has become increasingly difficult to push new work for review on gerrit, as I have dependencies on existing work waiting for merge. As the pile of unmerged patches grows, so does the time I spend with git rebase, so I am hoping for some progress on that side.

My eyes in the local working directory have turned towards SerialICE integration inside coreboot tree. The benefits of this approach are better tree structure, wider hardware support, cache-as-ram and usbdebug.

There are several use-cases to consider:

  1. Compile classic stand-alone SerialICE ROM image with ROMCC, using super-IO and chipset initialisation from coreboot tree.
  2. Compile SerialICE as an alternative romstage with ROMCC, using existing coreboot bootblock added with serial port initialisation.
  3. Compile SerialICE as romstage with GCC and cache-as-ram to use existing usbdebug code and possibly better execution performance.
  4. Add ability to jump out of SerialICE to regular romstage.

Also for the SerialICE session on debug host we have alternatives:

  1. Execute vendor BIOS image under QEMU.
  2. Execute coreboot image under QEMU.
  3. Execute coreboot image in user-mode under GDB without QEMU.
  4. Execute utils like nvramtool, msrtool, inteltool, superiotool, lspci and setpci remotely.

Now all of the above has been demonstrated before but not adopted. Adopting these widely for all mainboards may not happen during my GSoC, as there is no common function to call to enable a serial-port from romstage. At the minimum I will make some simple example one can follow to get SerialICE running on boards with existing coreboot support.

 

Cooking with thin spaghetti: The hard side of Vultureprog

vultureprog_3d

One of the reasons I fell in love with the Stellaris Launchpad boards is that they are modularly expandable. This notion is difficult to explain without comparison to STM Discovery boards, which have a row or two of pins on each side. The idea is simple: you hook one end of your wire to the right pin, and the other end to your breadboard, or you design a custom baseboard specific to the Discovery model. Stellaris takes this idea a little further. The layout of the pins is standardized, not just for the Stellaris, but across the family of TI development boards. Enter the Booster Packs: standardized add-on modules for TI boards. These modules are stackable, so it is possible to connect more than one to a single Stellaris board. This is why I wanted to use the Stellaris for this project. It’s much easier to build a booster pack than to tell people how to connect 32 wires; most people have problems connecting four of them to a buspirate. Let’s look at some of the design choices.

Constraints, constraints, constraints

It’s easy to imagine connecting an LPC chip: six wires and power. In reality, the situation is nowhere near as bright. Four ID pins need to be pulled low, reset pins (yes, there is more than one) need to be pulled high, and some pins simply cannot be left floating. Thus, even a simple bus like LPC becomes a nightmare. Without a logic analyzer to tell what works and what does not, the result is frustration and even self-inflicted injuries. Consequently, I wanted to do a few things right from the beginning(TM).

The most important point was to have all pins properly connected with zero wires. Users should not have to worry about what connects to where. Remember, these chips have 32 pins.

I also wanted to support all possible bus types. LPC and FWH are identical hardware-wise, and are not a problem to support concurrently. SPI is also just a few extra traces that lead to a header. On the other hand, having a programmer that also supports parallel mode is a much harder problem. It turns out there are really two “parallel” modes. The first one is ISA, where the chip is accessed via a linear address space. You put the address you want to access on the address pins, handle a couple of handshake lines to tell the chip if you want to read or write, and move the data over a separate 8-bit data bus.

On the other hand, the second “parallel” mode is a real pain. It uses a 2-dimensional address space, where you need to drive a row address, then a column address, and only then access the data. It’s called PP or “parallel programming” mode. Luckily we get a break: PP mode is an auxiliary programming mode specific to some LPC chips. If we support LPC, we don’t need PP. PP goes in the garbage bin (for now).
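The difference between the two “parallel” modes can be shown as the list of bus steps needed for one read. This is only a sketch: the 11-bit row width and which address bits count as “row” versus “column” are hypothetical here, since the real split is chip-specific and should be taken from the datasheet.

```python
# ASSUMPTION: ROW_BITS and the row/column bit assignment are illustrative
# placeholders, not values taken from any particular chip.
ROW_BITS = 11


def split_address(addr, row_bits=ROW_BITS):
    """Split a linear address into (row, column) for a multiplexed bus."""
    row = addr & ((1 << row_bits) - 1)
    column = addr >> row_bits
    return row, column


def linear_read_steps(addr):
    """ISA-style access: one address phase, then the data transfer."""
    return [("drive_address", addr), ("read_data", None)]


def pp_read_steps(addr, row_bits=ROW_BITS):
    """PP-style access: row phase, column phase, then the data transfer."""
    row, column = split_address(addr, row_bits)
    return [("drive_row", row), ("drive_column", column), ("read_data", None)]


if __name__ == "__main__":
    for step in linear_read_steps(0x12345):
        print("linear:", step)
    for step in pp_read_steps(0x12345):
        print("pp:    ", step)
```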

Now we need an efficient way to connect the GPIOs to the chip. By “efficient” I mean minimizing the number of GPIO accesses, and the number of bitshifts we need to do in firmware. A poorly chosen pinout will result in abysmal performance, as the 80MHz core struggles to shift the correct bit to the correct GPIO. My choice here was limited, as the best I could do was assign successive GPIOs to successive address pins. I spent the entire Sunday looking over chip datasheets and deciding on this “spaghetti recipe”.
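Why the pinout matters can be seen by comparing how a 4-bit value gets onto a port in the two cases. The sketch below is a model in Python rather than firmware, and the pin assignments are made up; it only counts the per-bit operations that a scattered pinout forces on every bus cycle.

```python
def write_contiguous(value, base_bit=0, width=4):
    """Signals on successive GPIOs: one shift and one mask per access.

    Returns (port_word, mask). With base_bit == 0 the shift disappears too.
    """
    mask = ((1 << width) - 1) << base_bit
    return (value << base_bit) & mask, mask


def write_scattered(value, pin_bits):
    """Signals on arbitrary GPIOs: every bit is extracted and moved alone.

    pin_bits[i] is the (hypothetical) GPIO bit that carries signal bit i.
    Returns (port_word, number_of_per_bit_operations).
    """
    word = 0
    ops = 0
    for i, gpio_bit in enumerate(pin_bits):
        word |= ((value >> i) & 1) << gpio_bit
        ops += 1
    return word, ops


if __name__ == "__main__":
    nibble = 0xA
    print(write_contiguous(nibble))               # single masked write
    print(write_scattered(nibble, [7, 2, 5, 0]))  # four separate bit moves
```

On real hardware the scattered case costs several instructions per bit, repeated for every nibble of every cycle, which is where the throughput goes.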

Flexibility – a big issue

I also wanted to have the option of either a normal PLCC32 socket or a ZIF socket (AKA clamshell). I was really an idiot for thinking I would have both on the same board. On paper, it looks very straightforward. In reality, adjacent pins are on different hemispheres of the globe, and routing them is, well, the tastiest spaghetti you have ever eaten. There was no way I could fit both a clamshell and a PLCC32 socket. There was no way to route the 32 or so tracks on just 2 layers. So I killed the clamshell, the SPI header, and the LPC header. After a couple of hours of messing with the routing, I always had one or two pins that got cornered.

An epic fail

Even routing a simple PLCC socket proved difficult.

What coffee can do to you

I decided to start over, with all the components in place. Once I reduced the track size to 8 mils and the spacing to 6 mils, I was able to route two tracks between a set of pins. This time, I placed the socket inside the clamshell, and managed to connect the two using just the top layer. I then worked from the booster pack connection to the DIP pins on the same side of the board, again using only the top layer. Then I started using the bottom layer for DIP pins on the opposite side. After a few hours, Chuck Norris warped space and time to make room for all the tracks:

A little less epic this time

From here, it was a matter of optimizing the routing, taking care of ground planes and other finishing touches. In the end, we get VultureProg hardware version 0.1:

vultureprog_board

Don’t let the PRELIMINARY DESIGN warning fool you. There is only an infinitesimal possibility that I will ever want to go back and revise the design. We have 35 GPIOs accessible on the Stellaris. Five of them are connected to the on-board LEDs and buttons. The remaining 30 are all used up.

Conclusion

If you are a Kicad user, you can head over to yet another one of my GitHub repositories. If you do not have a way to consume Kicad files, you can look in the doc and gerbers directories. Feel free to feed the gerbers to Mayhew Labs’ 3D Gerber Viewer (hint: you can rotate the board in 3D). With all that done, I ordered the first batch of PCBs from Seeed Studio’s Fusion PCB service. Routing is definitely too crammed and painful, but I really wanted something versatile and flexible. Whether it lives up to its design goals in REV 0.1 or REV 0.2 remains to be seen. My money is on REV 0.1, quite literally.

GSoC [coreboot debugging] Now it is broken, now it is not

I feel I did not make much progress last week. I realised I had wasted two days looking for an error in my code, only to finally find the error elsewhere. In preparation for pushing my developments to review, I had rebased my tree. That is, I had picked up the developments done by other people into my setup. My mistake. While the error still persists in the master tree, in the process of recovering my platform I learned that there are two types of SOIC-8 SPI flash chips: ones that fit in the miniature socket I have and ones that are physically too large. The spare chips I had were of the second type, and that slowed down my system recovery procedure radically. New flash chips are waiting for pickup in the store now.

This is actually just the situation I want coreboot to deal with better in the future: doing a firmware upgrade without the risk of bricking the device to the point where you need to use an external programmer device to recover. Problem is specifically with laptops, which may take a good hour or so to disassemble and put back together, and with every disassembly the risk of breaking some of those miniature connectors increases.

My plan of having two copies of firmware in the same flash chip image just got a bit more complicated. I learned that with recent platforms using a so-called binary blob for raminit, aka the system-agent binary, it is not possible to do the type of dual-boot-prefix setup I had planned, since one cannot put two system-agent binaries in the same CBFS image. I hope the system-agent build and release process is seriously improved to overcome this issue, as a badly gone (/done) binary blob upgrade procedure was the root cause of my troubles this past week.

I have not really had a chance to test pre-OS flashing with FILO (actually the code might not yet be available for me to download). Instead I have attacked the low-level PCI and IO sources to reduce a good two or three copies from coreboot tree, this will help my efforts in the long run with SerialICE integration work.

 

GSoC (coreboot) Progress till week 2

As you might know, my GSoC project is about making a test rig that can make coreboot test systems more accessible to a coreboot test server. This test rig enables the coreboot test server to interact with the systems under test (which may be remotely located) in the following ways:

  • Power supply control (discussed in this post)
  • power/reset switch control, voltage and temperature readouts, firmware flashing on serial flash (to be done next)
  • provision for POST feedback (later)

With this project I’m hoping to create an environment where developers will be able to conveniently connect their systems to the coreboot test server for testing at their own place. This is why I like to call it a distributed test environment as it facilitates mass testing without the need to maintain a dedicated testing facility.

So this week I will present a nice and easy solution for power control of the coreboot test systems. I would call this device a ‘programmable power strip’. Before going to the final solution let me first walk you through all the routes that I’ve taken in order to answer some potential questions that may arise. Read on…