[GSoC] Ghidra firmware utilities, week 10

As stated in last week’s blogpost, I have started working on the UEFI helper script (aptly named UEFIHelper). The aim of this script is to assist with reverse engineering UEFI binaries. Similar projects exist for IDA Pro, including ida-efiutils, ida-efitools, and EFISwissKnife.

Background information

UEFI executables are either PE32(+) or TE binaries. The signature of the entry point function depends on the module type; some examples are PEI modules, DXE drivers, and UEFI applications. DXE drivers and standard UEFI applications use the following entry point function:

EFI_STATUS
_ModuleEntryPoint (
  IN EFI_HANDLE        ImageHandle,
  IN EFI_SYSTEM_TABLE  *SystemTable
  );

Other types of modules (such as PEI modules and PEI/DXE core modules) may have different parameters and return types for the entry point function. Nevertheless, we’ll focus on the standard entry point for now. ImageHandle is a firmware-allocated handle for the current EFI application. SystemTable is a pointer to the EFI_SYSTEM_TABLE structure, which in turn has pointers to other EFI tables (such as EFI_BOOT_SERVICES and EFI_RUNTIME_SERVICES). These tables provide data structures and function pointers for standard UEFI functionality, such as getting/setting NVRAM variables, locating/installing UEFI protocols, loading additional UEFI images, rebooting the system, etc.

UEFI’s extensibility is largely implemented through the use of protocols. Protocols are data structures used to enable communication between different UEFI modules, and can be identified by a GUID. A simplified example could be a a UEFI driver for a graphics card. It could support pre-boot graphics output by installing an implementation of the EFI_GRAPHICS_OUTPUT_PROTOCOL, which could then be located and used by other UEFI applications and drivers for graphics output.

UEFIHelper progress

MdePkg in EDK2 includes headers for core UEFI types and protocols. Given a parser configuration file, the C parser in Ghidra can be used to generate data type archives, which are used for storing type definitions in Ghidra. I used the MdePkg headers to generate UEFI data type archives for x86, x86_64, ARMv7, and ARMv8 (AArch64). UEFIHelper will automatically load the correct data type library for the current program’s architecture.

UEFIHelper will search for known GUIDs in the .data segment of the current UEFI program and apply the EFI_GUID type definition. UEFIHelper will also fix the entry point function signature to match the standard entry point for UEFI DXE drivers and applications.

UEFIHelper is a part of the ghidra-firmware-utils extension, and is available on GitHub as usual. I will continue to work on UEFIHelper during the next week.

[GSoC] Ghidra firmware utilities, week 9

Last week, I finished up my work on the UEFI firmware volume FS loader. This was the last FS loader I planned on writing for this project, so now it’s time to work on writing additional binary loaders and helper scripts to assist with UEFI reverse engineering. During the past couple of days, I’ve been working on a loader for Terse Executable (TE) binaries.

For the most part, UEFI binaries are standard PE32(+) executables. Standard headers such as the DOS stub, COFF header, and image headers are present. In order to reduce the size of binaries required for UEFI Platform Initialization, the TE binary format was created. The TE header only includes the fields needed for execution, dropping unnecessary fields such as the DOS stub. TE binaries are otherwise similar to PE32 binaries. EDK2 has additional documentation regarding the TE header.

Like the existing PE32 binary loader, the TE binary loader defines the program sections and defines the entry point function. It can be used in conjunction with the UEFI firmware volume FS loader to import TE image sections for analysis.

The TE binary loader is included in the latest commit in ghidra-firmware-utils. As always, feel free to submit an issue report if you encounter any problems with it.

Plans for this week

I have started working on the UEFI helper script. This script aims to assist with UEFI reverse engineering by loading UEFI type definitions, defining GUIDs, and fixing the entry point.

GSoC 2010, TianoCore as a payload :(

It’s been an interesting summer.  It didn’t at all turn out how I expected, but it is what it is.  TianoCore as a software project turned out to be massively more complex than I anticipated when I submitted my proposal, and the level of knowledge required was quite a bit deeper than I expected… it’s one of those cases where I didn’t know what I didn’t know.  I’ll have to talk to a couple of my professors about that, to see if there’s some elective class that explains the things I’ve missed.

Sorry, that’s vague, let me give an example.  Coreboot does it’s thing, hardware initialization, then passes control to the payload.  This seems to be the equivalent of the dreaded “goto”, which is actually pretty cool.  Coreboot doesn’t care what happens next.  So hypothetically, I have some code, anything, I want to use it as a payload.  I compile it, then what?  Well, it depends (as you all know), how was it compiled?  Is it an elf?  PE32?  Something else?  Where exactly is the entry point to this binary blob?  (That’s a rhetorical question, please don’t answer it in the comments.)  You would have thought at some point in one of my classes executable formats would have come up, just as an example.  Or calling conventions.  Or hundreds of other little things that I’d never seen or heard of before I suddenly realized that I needed to understand them.  So that’s what I ended up spending much of the summer on.  Write code… stop and realize what I’m doing doesn’t make sense/won’t work/is the wrong approach, then start over.

One of the things that drew me to coreboot as a project was that as a computer engineering student, I took a lot of classes focusing on the physical side of computing, starting with physics and circuits classes, moving up through logic gates to chip design.  On the other side, programming started at a pretty high level with c++, then worked down, till I got to the computer architecture and operating system classes, and assembly language (not x86, unfortunately).   I would expect that as a “computer engineer” I should understand the whole stack, that the physical, EE stuff and the CS stuff would meet in the middle.  But they haven’t (and they won’t: I’m about to graduate, and there aren’t any crucial classes left to take).  I knew this going into GSoC, and coreboot seemed like the perfect project to fill the gap (and give something back to the open source world that I’ve gotten so much out of).  Well, like I said, the gap turned out to be a lot bigger than I expected.  (To abuse the metaphor a little more, anyone remember when Evel Knievel tried to jump the Snake River Canyon?  That’s kind of how I feel about my summer of code.) Continue reading GSoC 2010, TianoCore as a payload 🙁

TianoCore payload, mid-summer status update

Well, it’s hard to believe that the GSoC midterm evaluations are here already.  I guess it’s true what they say, time flies when you’re sitting in a basement in front of a computer all day. If I were to evaluate myself, I’d give myself a barely passing grade based on results –I’m nowhere near where I expected to be this summer.  I think I mentioned before, partly this is due to TianoCore being massively more complicated than I expected when I wrote my project proposal –seriously, I ran cscope on the edk2 branch of TianoCore, and it reported over 160,000 files… the resulting index itself took up half a gig– and partly it’s due to there being so much about sophisticated C usage (and makefiles, and preprocessor directives, and macros, and calling conventions…) that I didn’t understand going into this.  (But, having said that, part of the reason I applied to coreboot was I knew there were a lot of important details that had been glossed over in my classes, things that I needed to know and would be forced to learn if I worked on a close-to-the-hardware project.)

Moving on to the status of my project. In my proposal I assumed a couple weeks to improve the state of TianoCore as a payload, a month or so to write a CBFS driver for TianoCore, and a month or so to write the VGA driver.  It turns out that the state of TianoCore as a payload was not very good, and so that is what I am working on.

Let me try to briefly explain my current approach.  UEFI itself does not initialize the hardware.  Before the UEFI firmware can be run, the system (from a cold boot) has to go through the Platform Initialization stage.  The PI stage is itself made up of the Security stage (the initial booting, and some optional checksums to make sure the image hasn’t been tampered with), the Pre-EFI Initiialization (PEI , where the memory and chipsets are woken up and initialized) and the Driver Execution Environment (DXE, which loads additional drivers, then starts the UEFI).  Coreboot already does most of this work in its own way, so it seems the best strategy would be for a coreboot payload to impersonate one of these stages (each stage is it’s own binary in the firmware volume), provide all the functions and data stuctures that the following stage expects, then jumps to it.  Inserting the payload immediately after the Security stage seems redundant and dangerous (the PEI stage would end up trying to reinitialize hardware that’s already  in use), and after the DXE stage seems too late, because then the payload would have to know how to load DXE-stage binary drivers.  So I’m working on implementing a pseudo-PEI stage, that translates the coreboot provided data structures into the form that the DXE stage expects, and writing a couple dozen functions that the DXE stage expects to be available.  This way we can have a minimal-sized payload that can leverage a separate DXE and UEFI stage compiled directly from the TianoCore codebase (or borrowed from a manufacturer supplied image, like a traditional option ROM).  I think I can have this working by the end of summer. Thoughts?

TianoCore payload status

I’ve been holding off on writiing a post until I had some significant chunk of code I could point to to show that I’m making progress.  Well, I am making progress, but there’s no “significant amount” of code yet.  I’m not worried about the slow start, because what I’m doing makes sense and I am writing code, but there are still so many things I run into every day where I have to stop and look something up to try to figure out why it was done a particular way.  Almost all of the confusion comes from the TianoCore codebase, which is 1) huge, and 2) written in a consistent but unfamiliar style.  Like this line,

 IN OUT   EFI_PEI_PPI_DESCRIPTOR      **PpiDescriptor OPTIONAL,

What’s with the IN, OUT and OPTIONAL?  I haven’t been able to find them defined anywhere, I’m assuming they’re special comments that Visual Studio knows about, but I haven’t found any reference to them. In comparison, reading the coreboot code is fun and easy.  It’s just straightforward c.

Progress with TianoCore as a payload

I’m finally past the point where I’m scratching my head over what I’m reading and know what I have to do.  This changes my initial schedule, somewhat.  I hadn’t really understood how much work needed to be done to make TianoCore an effective payload, so my original estimate of “two weeks” was totally wrong, but now that I’ve started I’m feeling good about it.  Right now I’m working on getting TianoCore to build with the coreboot reference toolchain, and adapting the Makefile and kconfig stuff from FILO.  The current way of building TianoCore’s edk2 package involves building their toolchain, and I’d like to cut that part out, or automate it if I can’t cut it out entirely.  The goal is to reduce the number of hoops that potential users and developer’s have to jump through in order to try EFI with coreboot.

One week plus…

I would like to report that I’m making some progress, but that wouldn’t quite be accurate.  I am making progress: I’m putting in adequate hours, reading code, reading specifications, but… let’s just say I’m starting to appreciate some of the less than complimentary things I’ve read about [U]EFI.  It is extremely complicated.  My inner engineer is constantly frowning as I read page after page of documentation, thinking, really?  they couldn’t think of a simpler way to design this?  Not trying to complain.  There are a few things I really like about EFI, but this sort of software development –the kind that starts with 3000+ pages of specifcations– is unlike anything I’ve ever dealt with.

Getting TianoCore to work well as a coreboot payload

Hi!  My name’s Robert Austin, and my GSoC 2010 project is to get TianoCore working well as a coreboot payload.   TianoCore is the open source component of Intel’s implementation of UEFI.  TianoCore on it’s own is not a BIOS replacement, but it can do some interesting things, and since a quite a few large companies are already committed to EFI, it makes sense to have an EFI payload as an option in coreboot.

I am very excited about working on this project, and working with the coreboot community.  I will make regular announcements about my progress here on this blog, and I will keep my working code available in a git repository here.  There isn’t anything there yet, but there will be soon.  To anyone who happens to checkout my code during the summer, I welcome any comments or suggestions relating to the style or quality of my code.  I want to write high-quality code, so feedback is appreciated.