[GSoC] Address Sanitizer, Part 2

Hello again! Its been a month since my last blog post. So, there are many updates I’d like to share. I’ll first cover the Address Sanitizer (ASan) algorithm in detail and then summarize the progress made until the second evaluation period.


If you recall my last post, I had briefly talked about the principle behind ASan. Now, let us discuss the ASan algorithm in much more depth. But first, we need to understand the memory layout and how the compiler’s instrumentation works.

Memory mapping

ASan divides the memory space into 2 disjoint classes:

  • Main Memory (mem): This represents the original memory used by our coreboot program in a particular stage.
  • Shadow Memory (shadow): This memory region contains the shadow values. The state of each 8 aligned bytes of mem is encoded in a byte in shadow. As a consequence, the size of this region is equal to 1/8th of the size of mem. To reserve a space in memory for this class, we added a new linker section and named it asan_shadow.

This is how we added asan_shadow in romstage:

#if ENV_ROMSTAGE && CONFIG(ASAN_IN_ROMSTAGE)
	_shadow_size = (_ebss - _car_region_start) >>  3;
	REGION(asan_shadow, ., _shadow_size, ARCH_POINTER_ALIGN_SIZE)
#endif

and in ramstage:

#if ENV_RAMSTAGE && CONFIG(ASAN_IN_RAMSTAGE)
	_shadow_size = (_eheap - _data) >> 3;
	REGION(asan_shadow, ., _shadow_size, ARCH_POINTER_ALIGN_SIZE)
#endif

The linker symbol pairs (_car_region_start, _ebss) and ( _data, _eheap) are references to the boundary addresses of mem in romstage and ramstage respectively.

Now, there exists a correspondence between mem and shadow classes and we have a function named asan_mem_to_shadow that performs this translation:

void *asan_mem_to_shadow(const void *addr)
{
#if ENV_ROMSTAGE
	return (void *)((uintptr_t)&_asan_shadow + (((uintptr_t)addr -
		(uintptr_t)&_car_region_start) >> ASAN_SHADOW_SCALE_SHIFT));
#elif ENV_RAMSTAGE
	return (void *)((uintptr_t)&_asan_shadow + (((uintptr_t)addr -
		(uintptr_t)&_data) >> ASAN_SHADOW_SCALE_SHIFT));
#endif
}

In other words, asan_mem_to_shadow maps each 8 bytes of mem to 1 byte of shadow.

You may wonder what is stored in shadow? Well, there are only 9 possible shadow values for any aligned 8 bytes of mem:

  • The shadow value is 0 if all 8 bytes in qword are unpoisoned (i.e. addressable).
  • The shadow value is negative if all 8 bytes in qword are poisoned (i.e. not addressable).
  • The shadow value is k if the first k bytes are unpoisoned but the rest 8-k bytes are poisoned. Here k could be any integer between 1 and 7 (1 <= k <= 7).

When we say a byte in mem is poisoned, we mean one of these special values are written into the corresponding shadow.

Instrumentation

Compiler’s ASan instrumentation adds a runtime check to every memory instruction in our program i.e. before each memory access of size 1, 2, 4, 8, or 16, a function call to either __asan_load(addr) or __asan_store(addr) is added.

Next, it protects stack variables by inserting gaps around them called ‘redzones’. Let’s look at an example:

int foo ()
{
char a[24] = {0};
int b[2] = {0};
int i;

a[5] = 1;

for (i = 0; i < 10; i++)
    b[i] = i;

return a[5] + b[1];
}

For this function, the instrumented code will look as follows:

int foo ()
{
char redzone1[32]; // Slot 1, 32-byte aligned
char redzone2[8];  // Slot 2
char a[24] = {0};  // Slot 3
char redzone3[32]; // Slot 4, 32-byte aligned
int redzone4[6];   // Slot 5
int b[2] = {0};    // Slot 6
int redzone5[8];   // Slot 7, 32-byte aligned
int redzone6[7];
int i;
int redzone7[8];

__asan_store1(&a[5]);
a[5] = 1;

for (i = 0; i < 10; i++)
    __asan_store4(&b[i]);
    b[i] = i;

__asan_load1(&a[5]);
__asan_load4(&b[1]);
return a[5] + b[1];
}

As you can see, the compiler has inserted redzones to pad each stack variable. Also, it has inserted function calls to __asan_store and __asan_load before writes and reads respectively.

The shadow memory for this stack layout is going to look like this:

Slot 1:        0xF1F1F1F1
Slots 2, 3:    0xF1000000  
Slot 4:        0xF1F1F1F1
Slots 5, 6:    0xF1F1F100  
Slot 7:        0xF1F1F1F1

Here F1 being a negative value represents that all 8 bytes in qword are poisoned whereas the shadow value of 0 represents that all 8 bytes in qword are accessible. Notice that in the slots 2 and 3, the variable ‘a’ is concatenated with a partial redzone of 8 bytes to make it 32 bytes aligned. Similarly, a partial redzone of 24 bytes is added to pad the variable ‘b’.

The process of protecting global variables is a little different from this. We’ll talk about it later in this blog post.

Now, as we have looked at the memory mapping and the instrumented code, let’s dive into the algorithm.

ASan algorithm

The algorithm is pretty simple. For every read and write operation, we do the following:

  • First, we find the address of the corresponding shadow memory for the location we are writing to or reading from. This is done by asan_mem_to_shadow().
  • Then we determine if the access is valid based on the state stored in the shadow memory and the size requested. For this, we pass the address returned from asan_mem_to_shadow() to memory_is_poisoned(). This function dereferences the shadow pointer to check the memory state.
  • If the access is not valid, it reports an error in the console mentioning the instruction pointer address, the address where the bug was found, and whether it was read or write. In our ASan library, we have a dedicated function asan_report() for this.
  • Finally, we perform the operation (read or write).
  • Then, we continue and move over to the next instruction.

Here’s a pictorial representation of the algorithm:

Note that whether access is valid or not, ASan never aborts the current operation. However, the calls to ASan functions do add a performance penalty of about ~1.5x.

Now, as we have understood the algorithm, let’s look at the function foo() again.

int foo ()
{
char a[24] = {0};
int b[2] = {0};
int i;

__asan_store1(&a[5]);
a[5] = 1;

for (i = 0; i < 10; i++)
    __asan_store4(&b[i]);
    b[i] = i;

__asan_load1(&a[5]);
__asan_load4(&b[1]);
return a[5] + b[1];
}

Have a look at the for loop. The array ‘b’ is of length 2 but we are writing to it even beyond index 1.

As the loop is executed for index 2, ASan checks the state of the corresponding shadow memory for &b[2]. Now, let’s look at the shadow memory state for slot 7 again (shown in the previous section). It is 0xF1F1F1F1. So, the shadow value for the location b[2] is F1. It means the address we are trying to access is poisoned and thus ASan is triggered and reports the following error in the console log:

ASan: stack-out-of-bounds in 0x07f7ccb1
Write of 4 bytes at addr 0x07fc8e48

Notice that 0x07f7ccb1 is the address where the instruction pointer was pointing to and 0x07fc8e48 is the address of the location b[2].


ASan support for romstage

In the romstage, coreboot uses cache to act as a memory for our stack and heap. This poses a challenge when adding support for ASan to romstage because of two reasons.

First, even within the same architecture, the size of cache varies across the platforms. So, unlike ramstage, we can’t enable ASan in romstage for all platforms by doing tests on a handful of devices. We have to test ASan on each platform before adding this feature. So, we decided to introduce a new Kconfig option HAVE_ASAN_IN_ROMSTAGE to denote if a particular platform supports ASan in romstage. Now, for each platform for which ASan in romstage has been tested, we’ll just select this config option. Similarly, we also introduced HAVE_ASAN_IN_RAMSTAGE to denote if a given platform supports ASan in ramstage.

The second reason is that the size of a cache is very small compared to the RAM. This is critical because the available memory in the cache is quite low. In order to fit the asan_shadow section on as many platforms as possible, we have to make efficient use of the limited memory available. Thankfully, to a large extent, this problem was solved by our GCC patch which allowed us to append the shadow memory buffer to the region already occupied by the coreboot program. (I ran Jenkins builder and it built successfully for all boards except for the ones that hold either Braswell SoC or i440bx northbridge where the cache area got full and thus couldn’t fit the asan_shadow section.)

In the first stage, we have enabled ASan in romstage for QEMU, Haswell and Apollolake platforms as they have been tested.

Further, the results of Jenkins builder indicate that, with the current translation function, the support for ASan in romstage can be added on all x86 platforms except the two mentioned above. Therefore, I’ve asked everyone in the community to participate in the testing of ASan so that this debugging tool can be made available on as many platforms as possible before GSoC ends.


Global variables

When we initially added support for ASan in ramstage, it wasn’t able to detect out-of-bounds bugs in case of global variables. After debugging the code, I found that the redzones for the global variables were not poisoned. So, I went through GCC’s ASan and Linux’s KASAN implementation again and realized that the way in which the compiler protects global variables was very much different from its instrumentation for stack variables.

Instead of padding the global variables directly with redzones, it inserts constructors invoking the library function named __asan_register_globals to populate the relevant shadow memory regions. To this function, compiler also passes an instance of the following type:

struct asan_global {
       /* Address of the beginning of the global variable. */
       const void *beg;

       /* Initial size of the global variable. */
       size_t size;

       /* Size of the global variable + size of the red zone. This
          size is 32 bytes aligned. */
       size_t size_with_redzone;

       /* Name of the global variable. */
       const void *name;

       /* Name of the module where the global variable is declared. */
       const void *module_name;

       /* A pointer to struct that contains source location, could be NULL. */
       struct asan_source_location *location;
     }

So, to enable the poisoning of global variables’ redzones, I created a function named asan_ctors which calls these constructors at runtime and added it to ASan initialization code for the ramstage.

You may wonder why asan_ctors() is only added to ramstage? This is because the use of global variables is prohibited in coreboot for romstage and thus there is no need to detect global out-of-bounds bugs.


Next steps..

In the third coding period, I’ve started working on adding support for ASan to memory functions like memset, memmove and memcpy. I’ll push the patch on Gerrit pretty soon.

Once, this is done, I’ll start writing documentation on ASan answering questions like how to use ASan, what kind of bugs can be detected, what devices are currently supported, and how ASan support can be added to other architectures like ARM or RISC-V.

[ASan patches]

[GSoC] Address Sanitizer, Part 1

Hello everyone. My name is Harshit Sharma (hst on IRC). I am working on the project to add the “Address Sanitizer” feature to coreboot as a part of GSoC 2020. Werner Zeh is my mentor for this project and I’d like to thank him for his constant support and valuable suggestions.

It’s been a fun couple of weeks since I started working on this project. Though I found the initial few weeks quite challenging, I am glad that I was able to get past that and learned some amazing stuff I’d cherish for a long time.

Also, being a student, I find it incredible to have got a chance to work with and learn from such passionate, knowledgeable, and helpful people who are always available over IRC to assist.

A few of you may already know about the progress we’ve made through Werner’s message on the mailing list. This is a much comprehensive report on the work we had done prior to the first evaluation period.


What is Address Sanitizer ?

Address Sanitizer (ASan) is a runtime memory debugging tool, designed to find out of bounds and use after free memory bugs. It is a part of toolset Google has developed which also includes Undefined Behaviour Sanitizer (UBSan), a Thread Sanitizer (TSan), and a Leak Sanitizer (LSan).

This tool would help in improving code quality and thus would make our runtime code more robust.


Compile with ASan

The firstmost task was to ensure that coreboot compiles without any errors after Asan is enabled. For this, we created dummy functions and added the relevant GCC flags to build coreboot with ASan. We also introduced a Kconfig option (currently only available on x86 architecture) to enable this feature. (Patch [1])


Porting code from Linux

The design of ASan in coreboot is based on its implementation in Linux kernel, also known as Kernel Address Sanitizer (KASAN). However, coreboot differs a lot from Linux kernel due to multiple stages and that is what poses a challenge.


Design of ASan

ASan uses compile-time instrumentation that adds a run-time check before every memory instructions.

The basic idea is to encode the state of each 8 aligned bytes of memory in a byte in the shadow region. As a consequence, the shadow memory is allocated to 1/8th of the available memory. The instrumentation of the compiler adds __asan_load and __asan_store function calls before memory accesses which seeks the address of the corresponding shadow memory and then decides if access is legitimate depending on the stored state.

If the access is not valid, it throws an error telling the instruction pointer address, the address where a bug is found and whether it was read or write.


Need for GCC patch

Unlike Linux kernel which has a static shadow memory layout, coreboot has multiple stages with very different memory maps. Thus we require different shadow offset addresses for each stage.

Unfortunately, GCC presently, only supports using a static shadow offset address which is specified at compile time using -fasan-shadow-offset flag.

So, we came up with a GCC patch that introduces a feature to use a dynamic shadow offset address. Through this, we enable GCC to determine shadow offset address at runtime using a callback function __asan_shadow_offset().

Though we’ve presented our patch to GCC developers to convince them to include this change in their upcoming version, for now, ASan on coreboot only works if this patch is applied. (Patches [2] and [3])


Allocating and initializing shadow buffer

Instead of allocating a shadow buffer for the whole memory region, we decided to only define shadow memory for data and heap sections. Since we don’t want to run ASan for hardware mapped addresses, this way we save a lot of memory space.

Once the shadow region is allocated by the linker, the next task is to initialize it as early as possible at runtime. This is done by a call to asan_init() which unpoisons i.e. sets the shadow memory corresponding to the addresses in the data and heap sections to zero.


ASan support for ramstage

We decided to begin in a comparatively simpler stage i.e. ramstage. Since ramstage uses DRAM, the implementation is common across all platforms of a particular architecture. Moreover, ramstage provides enough room in the memory to place our shadow region.

For now, we have only enabled this feature for x86 architecture and it is able to detect stack out of bounds and use after scope bugs. Though the patches are still in the review stage, we did a test on QEMU and siemens mc_apl3 (Apollo Lake based) and so far it looks good to us. (Patch [4])


Next steps..

In the second coding period, we’ve started working on adding ASan to romstage. I will push a change on Gerrit once we make some considerable progress.


We further encourage everyone in the community to review the patches and test this feature on their hardware for better test coverage. The more feedback we get, the more are the chances to have a high-quality debugging tool.

Announcing coreboot 4.12

coreboot 4.12 was released on May 12th, 2020.

Since 4.11 there were 2692 new commits by over 190 developers and of these, 59 contributed for the first time, which is quite an amazing increase.

Thank you to all developers who again helped made coreboot better than ever, and a big welcome to our new contributors!

Maintainers

This release saw some activity on the MAINTAINERS file, showing more persons, teams and companies declare publicly that they intend to take care of mainboards and subsystems.

To all new maintainers, thanks a lot!

Documentation

Our documentation efforts in the code tree are picking up steam, with some 70 commits in that general area. Everything from typo fixes to documenting mainboard support or coreboot APIs.

There’s still room to improve, but the contributions are getting more and better.

Hardware support

The removals due to the announced deprecations as well as the deduplication of boards into variants skew the stats a bit, so at a top level view this is a rare coreboot release in that it removes more boards (51) than it adds (49).

After accounting for the variant moves the numbers in favor of more hardware supported than the previous version. Besides a whole lot of Chrome OS devices (again), this release features a whole bunch of retrofits for devices originally shipping with non-coreboot OEM firmware, but also support for devices that come with coreboot right out of the box.

For that, a shout out to System76, Protectli, Libretrend and the Open Compute Project!

Cleanup

We simplified the header that comes at the top of every file: Instead of a lengthy reference to the license any given file is under, or even the license text itself, we opted for simple SPDX identifiers.

Since people also handled copyright lines differently, we now opt for collecting authors in AUTHORS and let git history tell the whole story.

While at it, the content-free "This file is part of this-and-that project" header was also dropped.

Besides that, there has also been more work to sort out the headers we include across the tree to minimize the code impacting every compilation unit.

Now that our board-variant mechanism matured, many boards that were individual models so far were converted into variants, making it easier to maintain families of devices.

Deprecations

For the 4.12 release a few features on x86 became mandatory. These are relocatable ramstage, postcar stage and C_ENVIRONMENT_BOOTBLOCK.

Relocatable ramstage

Relocatable stages are a feature implemented only on x86, where stages can be relocated at runtime. This is used to place ramstage in a better location that does not collide with memory the OS or the payload tends to use. The rationale behind making this mandatory is that you always want cbmem to be cached so it’s a good location to run ramstage from. It avoids using lower memory altogether so the OS can make use of it and no backing up needs to happen on S3 resume.

Postcar stage

With Postcar stage tearing down Cache-as-Ram is done in a separate stage. This means that romstage has a clean program boundary and that all variables in romstage can be accessed via their linked addresses without runtime resolution. There is no need to link global and static variables via the CAR_GLOBAL macro and no need to access them with car_set/get_var/ptr functions.

C_ENVIRONMENT_BOOTBLOCK

Historically the bootblock on x86 platforms has been compiled with romcc. This means that the generated code only uses CPU registers and therefore no stack. This 20K+ LOC compiler is limited and hard to maintain and so is the code that one has to write in that environment. A different solution is to set up Cache-as-Ram in the bootblock and run GCC compiled code in the bootblock. The advantages are increased flexibility and consistency with other architectures as well as other stages: e.g. printing to console is possible and VBOOT can run before romstage, making romstage updatable via RW FMAP regions.

Platforms dropped from master

The following platforms did not implement those feature are dropped from master to allow the master branch to move on:

  • AMDFAM10
  • all FSP1.0 platforms: BROADWELL_DE, FSP_BAYTRAIL, RANGELEY
  • VIA VX900

In particular on FSP1.0 it is impossible to implement POSTCAR stage. The reason is that FSP1.0 relocates the CAR region to the HOB before returning to coreboot. This means that after FSP returns to coreboot accessing variables via their original address is not possible. One way of obtaining that behavior would be to set up Cache-as-Ram again (but with open source code) and copy the relocated data from the HOB there. This solution is deemed too hacky. Maybe a lesson can be learned from this: blobs should not interfere with the execution environment, as this makes proper integration much harder.

4.11_branch

Given that some platforms supported by FSP1.0 are being produced and popular, the 4.11 release was made into a branch in which further development can happen.

Significant changes

SMMSTORE is now production ready

See smmstore for the documentation on the API, but note that there will be an update to it featuring a much-improved but incompatible API.

Unit testing infrastructure

Unit testing of coreboot is now possible in a more structured way, with new build subsystem and adoption of Cmocka framework. Tree has new directory tests/, which comprises infrastructure and examples of unit tests. See Unit testing coreboot for the design document.

Final Notes

Your favorite new feature or supported board didn’t make it to the release notes? They’re maintained collaboratively in the coreboot tree, so when you land something noteworthy don’t be shy, contribute to the upcoming release’s document in Documentation/releases!

Implementing support for advanced DPTF policy in Linux

Intel's Dynamic Platform and Thermal Framework (DPTF) is a feature that's becoming increasingly common on highly portable Intel-based devices. The adaptive policy it implements is based around the idea that thermal management of a system is becoming increasingly complicated - the appropriate set of cooling constraints to place on a system may differ based on a whole bunch of criteria (eg, if a tablet is being held vertically rather than lying on a table, it's probably going to be able to dissipate heat more effectively, so you should impose different constraints). One way of providing these criteria to the OS is to embed them in the system firmware, allowing an OS-level agent to read that and then incorporate OS-level knowledge into a final policy decision. Unfortunately, while Intel have released some amount of support for DPTF on Linux, they haven't included support for the adaptive policy. And even more annoyingly, many modern laptops run in a heavily conservative thermal state if the OS doesn't support the adaptive policy, meaning that the CPU throttles down extremely quickly and the laptop runs excessively slowly. It's been a while since I really got stuck into a laptop reverse engineering project, and I don't have much else to do right now, so I've been working on this. It's been a combination of examining what source Intel have released, reverse engineering the Windows code and staring hard at hex dumps until they made some sort of sense. Here's where I am. There's two main components to the adaptive policy - the adaptive conditions table (APCT) and the adaptive actions table (APAT). The adaptive conditions table contains a set of condition sets, with up to 10 conditions in each condition set. A condition is something like "is the battery above a certain charge", "is this temperature sensor below a certain value", "is the lid open or closed", "is the machine upright or horizontal" and so on. Each condition set is evaluated in turn - if all the conditions evaluate to true, the condition set's target is implemented. If not, we move onto the next condition set. There will typically be a fallback condition set to catch the case where none of the other condition sets evaluate to true. The action table contains sets of actions associated with a specific target. Once we've picked a target by evaluating the conditions, we execute the actions that have a corresponding target. Actions are things like "Set the CPU power limit to this value" or "Load a passive policy table". Passive policy tables are simply tables associating sensors with devices and an associated temperature limit. If the limit is exceeded, the associated device should be asked to reduce its heat output until the situation is resolved. There's a couple of twists. The first is the OEM conditions. These are conditions that refer to values that are exposed by the firmware and are otherwise entirely opaque - the firmware knows what these mean, but we don't, so conditions that rely on these values are magical. They could be temperature, they could be power consumption, they could be SKU variations. We just don't know. The other is that older versions of the APCT table didn't include a reference to a device - ie, if you specified a condition based on a temperature, you had no way to express which temperature sensor to use. So, instead, you specified a condition that's greater than 0x10000, which tells the agent to look at the APPC table to extract the device and the appropriate actual condition. Intel already have a Linux app called Thermal Daemon that implements a subset of this - you're supposed to run the binary-only dptfxtract against your firmware to parse a few bits of the DPTF tables, and it writes out an XML file that Thermal Daemon makes use of. Unfortunately it doesn't handle most of the more interesting bits of the adaptive performance policy, so I've spent the past couple of days extending it to do so and to remove the proprietary dependency. My current work is here - it requires a couple of kernel patches (that are in the patches directory), and it only supports a very small subset of the possible conditions. It's also entirely possible that it'll do something inappropriate and cause your computer to melt - none of this is publicly documented, I don't have access to the spec and you're relying on my best guesses in a lot of places. But it seems to behave roughly as expected on the one test machine I have here, so time to get some wider testing? comment count unavailable comments

Announcing coreboot 4.11

The coreboot project is proud to announce to have released coreboot 4.11.

This release cycle was a bit shorter to get closer to our regular schedule of releasing in spring and autumn.

Since 4.10 there were 1630 new commits by over 130 developers. Of these, about 30 contributed to coreboot for the first time.

Thank you to all contributors who made 4.11 what it is and welcome to the project to all new contributors!

Clean Up

The past few months saw lots of cleanup across the source tree:

The included headers in source files were stripped down to avoid reading unused headers, and unused code fragments, duplicate preprocessor symbols and configuration options were eliminated. Even ACPI got its share of attention, making our tables and bytecode more standards compliant than ever.

The code across Intel’s chipsets was unified some more into drivers for common function blocks, an effort we’re more confident will succeed now that Intel itself is driving it.

Chipset work

Most activity in the last couple months was on Intel support, specifically the Kaby Lake and Cannon Lake drivers were extended for the generations following them.

On ARM, the Mediatek 8173 chipset support saw significant work while the AMD side worked on getting Picasso support in.

But everything else also saw some action, the relatively old (e.g. Intel GM45, Via VX900), the tiny (RISC-V) and the obscure (Quark).

Verified Boot

The vboot feature that Chromebooks brought into coreboot was extended to work on devices that weren’t specially adapted for it: In addition to its original device family it’s now supported on various Lenovo laptops, Open Compute Project systems and Siemens industrial machines.

Eltan’s support for measured boot continues to be integrated with vboot, sharing data structures and generally working together where possible.

New devices

With 4.11 there’s the beginning of support for Intel Tiger Lake and Qualcomm’s SC7180 SoCs, while we removed the unmaintained support for Allwinner’s A10 SoC.

There are also 25 new mainboards in our tree:

  • AMD PADMELON
  • ASUS P5QL-EM
  • EMULATION QEMU-AARCH64
  • GOOGLE AKEMI
  • GOOGLE ARCADA CML
  • GOOGLE DAMU
  • GOOGLE DOOD
  • GOOGLE DRALLION
  • GOOGLE DRATINI
  • GOOGLE JACUZZI
  • GOOGLE JUNIPER
  • GOOGLE KAKADU
  • GOOGLE KAPPA
  • GOOGLE PUFF
  • GOOGLE SARIEN CML
  • GOOGLE TREEYA
  • GOOGLE TROGDOR
  • LENOVO R60
  • LENOVO T410
  • LENOVO THINKPAD T440P
  • LENOVO X301
  • RAZER BLADE-STEALTH KBL
  • SIEMENS MC-APL6
  • SUPERMICRO X11SSH-TF
  • SUPERMICRO X11SSM-F

In addition to the Cubieboard (which uses the A10 SoC), we also removed Google Hatch WHL.

Deprecations

Because there was only a single developer board (AMD Torpedo) using AGESA family 12h, and because there were multiple, unique Coverity issues with it, the associated vendorcode will be removed shortly after this release.

Support for the MIPS architecture will also be removed shortly after this release as the only board in the tree was a discontinued development board and no other work has picked up MIPS support, so it’s very likely broken already.

After more than a year of planning and following the announcement in coreboot 4.10, platforms not using relocatable ramstage, a C bootblock and, on systems using Cache as RAM, a postcar stage, won’t be supported going forward.

Significant changes

__PRE_RAM__ is deprecated

Preprocessor use of defined(__PRE_RAM__) have been mostly replaced with if (ENV_ROMSTAGE_OR_BEFORE) or the inverse if (ENV_RAMSTAGE).

The remaining cases and -D__PRE_RAM__ are to be removed soon after release.

__BOOTBLOCK__ et.al. are converted

This applies to all ENV_xxx definitions found in <rules.h>.

Write code without preprocessor directives whenever possible, replacing #ifdef __BOOTBLOCK__ with if (ENV_BOOTBLOCK)

In cases where preprocessor is needed use #if ENV_BOOTBLOCK instead.

CAR_GLOBAL is removed where possible

For all platform code with NO_CAR_GLOBAL_MIGRATION=y, any CAR_GLOBAL attributes have been removed. Remaining cases from common code are to be removed soon after release.

TSEG and cbmem_top() mapping

Significant refactoring has bee done to achieve some consistency across platforms and to reduce code duplication.

Build system amenities

The build system now has an all class of source files to remove the need to list source files for each and every source class (romstage, ramstage, …)

The site-local/ mechanism became more robust.

Stricter coding standards to improve security

The build now fails on variable length arrays (that make it way too easy to smash a stack) and case statements falling through without a note that it is intentional.

Shorter file headers

This project is still under way, but we started moving author information from individual files into the global AUTHORS file (and there’s the git history for more details).

In the future, we also want to replace the license headers (lots of lines) in each file with spdx identifiers (one line) and so we added a LICENSES/ directory that contains the full text of all the licenses that are used throughout our tree.

Variant creation scripts

To ease the creation of variant boards, util/mainboard/ now contains scripts to generate a new variant to a given board. These are still specific to google/hatch at this time, but they’re written with the idea of becoming more generally useful.

Payloads

Payload integration has been updated, coreinfo learned to cope with UPPER CASE commands and libpayload knows how to deal with USB3 hubs.

Added VBOOT support to the following platforms:

  • intel/gm45
  • intel/nehalem

Moved the following platforms to C_ENVIRONMENT_BOOTBLOCK:

  • intel/i945
  • intel/x4x
  • intel/gm45
  • intel/nehalem
  • intel/sandybridge
  • intel/braswell

libgfxinit

Most notable, dynamic CDClk configuration was added to libgfxinit, to support higher resolution displays without changes in the static configuration. It also received some fixes for better DP and eDP compatibility, better error recovery for Intel’s fickle GMBus and updated platform support:

  • Correct HDMI clock limit for G45.
  • DP support for Ibex Peak (Ironlake graphics).
  • Fixed scaling on eDP for Broadwell.
  • Support for ULX variants of Haswell and later.
  • Support for Kaby, Amber, Coffee and Whiskey Lake.

Other

  • Did cleanups around TSC timer
  • Improved automatic VR configuration on SKL/KBL
  • Filled additional fields in SMBIOS type 4
  • Removed magic value replay from Intel Nehalem/ibexpeak code base
  • Added OpenSBI on RISCV platforms
  • Did more preparations for Intel TXT support
  • Did more preparations for x86_64 stage support
  • Added SSDT generator for arbitrary SuperIO chips based on devicetree.cb

Extending proprietary PC embedded controller firmware

I'm still playing with my X210, a device that just keeps coming up with new ways to teach me things. I'm now running Coreboot full time, so the majority of the runtime platform firmware is free software. Unfortunately, the firmware that's running on the embedded controller (a separate chip that's awake even when the rest of the system is asleep and which handles stuff like fan control, battery charging, transitioning into different power states and so on) is proprietary and the manufacturer of the chip won't release data sheets for it. This was disappointing, because the stock EC firmware is kind of annoying (there's no hysteresis on the fan control, so it hits a threshold, speeds up, drops below the threshold, turns off, and repeats every few seconds - also, a bunch of the Thinkpad hotkeys don't do anything) and it would be nice to be able to improve it. A few months ago someone posted a bunch of fixes, a Ghidra project and a kernel patch that lets you overwrite the EC's code at runtime for purposes of experimentation. This seemed promising. Some amount of playing later and I'd produced a patch that generated keyboard scancodes for all the missing hotkeys, and I could then use udev to map those scancodes to the keycodes that the thinkpad_acpi driver would generate. I finally had a hotkey to tell me how much battery I had left. But something else included in that post was a list of the GPIO mappings on the EC. A whole bunch of hardware on the board is connected to the EC in ways that allow it to control them, including things like disabling the backlight or switching the wifi card to airplane mode. Unfortunately the ACPI spec doesn't cover how to control GPIO lines attached to the embedded controller - the only real way we have to communicate is via a set of registers that the EC firmware interprets and does stuff with. One of those registers in the vendor firmware for the X210 looked promising, with individual bits that looked like radio control. Unfortunately writing to them does nothing - the EC firmware simply stashes that write in an address and returns it on read without parsing the bits in any way. Doing anything more with them was going to involve modifying the embedded controller code. Thankfully the EC has 64K of firmware and is only using about 40K of that, so there's plenty of room to add new code. The problem was generating the code in the first place and then getting it called. The EC is based on the CR16C architecture, which binutils supported until 10 days ago. To be fair it didn't appear to actually work, and binutils still has support for the more generic version of the CR16 family, so I built a cross assembler, wrote some assembly and came up with something that Ghidra was willing to parse except for one thing. As mentioned previously, the existing firmware code responded to writes to this register by saving it to its RAM. My plan was to stick my new code in unused space at the end of the firmware, including code that duplicated the firmware's existing functionality. I could then replace the existing code that stored the register value with code that branched to my code, did whatever I wanted and then branched back to the original code. I hacked together some assembly that did the right thing in the most brute force way possible, but while Ghidra was happy with most of the code it wasn't happy with the instruction that branched from the original code to the new code, or the instruction at the end that returned to the original code. The branch instruction differs from a jump instruction in that it gives a relative offset rather than an absolute address, which means that branching to nearby code can be encoded in fewer bytes than going further. I was specifying the longest jump encoding possible in my assembly (that's what the :l means), but the linker was rewriting that to a shorter one. Ghidra was interpreting the shorter branch as a negative offset, and it wasn't clear to me whether this was a binutils bug or a Ghidra bug. I ended up just hacking that code out of binutils so it generated code that Ghidra was happy with and got on with life. Writing values directly to that EC register showed that it worked, which meant I could add an ACPI device that exposed the functionality to the OS. My goal here is to produce a standard Coreboot radio control device that other Coreboot platforms can implement, and then just write a single driver that exposes it. I wrote one for Linux that seems to work. In summary: closed-source code is more annoying to improve, but that doesn't mean it's impossible. Also, strange Russians on forums make everything easier. comment count unavailable comments

[GSoC] Wrap-up for Adding QEMU/AArch64 Support to Coreboot

Hello, I’m Asami. It is the time for the final evaluation of GSoC 2019 now. My project is adding QEMU/AArch64 support to coreboot. It would help developers for compatibility testing and make sure that changes to architecture code don’t break current implementations for ARMv8.

Here is my work on Gerrit. CL 33387 is the main patch for this project and it is successfully merged. We can now run coreboot on QEMU/AArch64.

What I’ve Done for My Main Project

Firstly I made a new directory qemu-aarch64 in src/mainboard/emulation/ and I was basically working on in it. In the bootblock stage, I have written custom bootblock code in assembly because an ARM virt machine doesn’t have SRAM, which means I had to relocate code inside ROM to DRAM during the first stage. I also made a memory layout with reference to the QEMU implementation. In the romstage and the ramstage, I registered a custom handler for detecting DRAM size because AArch64 throws a Synchronous External Abort that happens when you try to access something that is not memory.

I wrote documentation for how to use coreboot with QEMU/AArch64. Here is the page: https://doc.coreboot.org/mainboard/emulation/qemu-aarch64.html

Future Work

ARMv8 architecture has an integrated LinuxBios as a payload. However, I can’t run coreboot with it yet. It now causes an error while building. I also need to make sure that coreboot works well with ARM Trusted Firmware. So, I’d like to solve problems and I hope to see “Hello World” with LinuxBios.

Conclusion

I believe I almost succeed in my project. The main code has already merged and I also solved small problems found while working on the main project. The members of coreboot are really great and they, especially my mentor and reviewers, helped me a lot. It was not an easy project for me because I had never experienced to work on coreboot and I had to know the basic code flow of it. I’ve read much code and it made me grown up. Thank you all of the members for such an excellent time.

My Previous Blog Posts

[GSoC] Ghidra firmware utilities, wrap-up

Hi everyone. The official programming period for GSoC 2019 is now over, and it’s time for final evaluations. I will use this post to summarize what I’ve worked on this summer, as well as how to use the Ghidra plugin.

The project is available on GitHub: https://github.com/al3xtjames/ghidra-firmware-utils

Project details

In my initial project proposal, I planned on writing various filesystem loaders (for hybrid PCI option ROMs, Intel flash descriptor images, coreboot File System images, and UEFI firmware volumes), a binary loader for legacy x86 PCI option ROMs, and a UEFI helper script. I ended up implementing all of these in the Ghidra plugin, and also worked on a UEFI Terse Executable binary loader. You can look at my previous blogposts to see my progress throughout the summer.

Here is a description of the components included in the project:

FS loaders allow files stored within binary images to be imported directly into Ghidra. The following FS loaders are implemented in this project:

Hybrid PCI option ROM

Some PCI option ROMs may contain multiple executable ROMs. This is usually used to support multiple firmware types (e.g. a video card with legacy BIOS VGA support and UEFI Graphics Output Protocol support). The FS loader allows each embedded executable ROM image to be imported.

Intel firmware descriptor (IFD)

Recent Intel platforms have multiple regions on the SPI flash (used to store system firmware). The descriptor region describes the layout of these flash regions. The FS loader allows each flash region to be imported. Ghidra supports nested FS loaders, so other FS loaders (FMAP/CBFS or UEFI FV) can be used to parse certain regions, such as the BIOS region.

Flash Map (FMAP)

This is another standard for describing flash regions, used by coreboot and various Google devices. Like the IFD FS loader, this allows each defined flash region to be imported, and it can be used with other FS loaders (e.g. the COREBOOT region can be parsed with the CBFS loader).

coreboot File System (CBFS)

coreboot uses a simple file system to store independent binaries and data files. The CBFS loader can be used to import each CBFS file for analysis; for example, PCI option ROMs stored as CBFS files can be imported. Optional CBFS file compression (LZ4/LZMA) is supported.

UEFI firmware volume (FV)/firmware file system (FFS)

UEFI firmware images use firmware volumes for storing firmware files, which may consist of multiple sections. The UEFI FV FS loader allows UEFI firmware volumes to be imported, including embedded firmware files/sections.

This project also implements a couple of binary loaders:

Legacy x86 option ROM

PCI option ROMs that target the x86 legacy BIOS contain a raw 16-bit executable image. They also have additional header fields, including a field with the entry point instruction. The binary loader resolves the entry point and specifies that 16-bit x86 disassembly should be used.

UEFI Terse Executable (TE)

UEFI binaries can use one of two executable formats: the Portable Executable (PE32) format (also used on Windows), and the Terse Executable (TE) format. Terse Executables are essentially simplified PE32 binaries – the numerous DOS/NT/optional headers are condensed into a single TE header, without any superfluous header fields. The binary loader resolves the entry point and defines memory blocks corresponding to the sections defined in the TE header.

Finally, a helper script for assisting with the analysis of UEFI binaries is
included. The UEFI helper script does the following:

  • Imports a UEFI data type library
  • Defines the entry point signature
  • Searches for known EFI GUIDs in the .data/.text segments
  • Attempts to locate global EFI table pointers (gST/gBS/gRT)
  • Attempts to perform propagation of some EFI types to called functions

Project usage

Instructions for how to build and use the Ghidra plugin are included in the project’s README, but I’ll restate them here.

Building the plugin

Like other Ghidra plugins (and Ghidra itself), this project uses Gradle as the build system. Set the GHIDRA_INSTALL_DIR environment variable (point it to your Ghidra installation directory) and run gradle to build the plugin. Install the generated ZIP (in the dist directory) by selecting
File > Install Extensions in Ghidra, and then clicking the green plus icon.

Using the FS loaders

Load the specified input file into Ghidra (drag and drop or use File > Import File). Assuming the input file is supported by a FS loader, Ghidra should indicate that a container file was detected, and will allow you to batch import all enclosed files or view the file system.

Note that Ghidra does support parsing nested filesystems with multiple FS loaders. For example, UEFI firmware volumes in the BIOS region of an Intel firmware image can be parsed by first importing the Intel firmware image and then importing the BIOS region (select Import or Open File System in the right-click menu).

Using the UEFI helper script

After loading a UEFI executable (PE32 or TE), you can run the UEFI Helper script from the Script Manager window (under Window). Select UEFIHelper.java and click the green “Run Script” button.

Currently, the UEFI helper script assumes the entry point matches the standard driver/application signature (with EFI_HANDLE and EFI_SYSTEM_TABLE * parameters). SEC/PEI/SMM modules have different entry point parameters, which will have to be manually specified.

Future work

While my work for GSoC 2019 is complete, I think the following additions would be useful for this project (and UEFI reverse-engineering in general):

Processor module for disassembling EFI Byte Code (EBC)

EFI Byte Code is a byte code format used for platform-independent UEFI applications/drivers. Ghidra currently doesn’t support the EBC virtual machine architecture. Fortunately, it is possible to add support for an architecture by creating a SLEIGH processor specification.

Upstreamed Terse Executable loader

As previously described, TE binaries are very similar to PE binaries. Ghidra already has parsers for the data directory and section header structures, which are present in both PE and TE binaries. My TE loader had to reimplement these parsers, as the existing parsers depended on the NT header, which isn’t present in TE binaries. Removing the NT header dependency from the data directory/section header parsers would allow Ghidra’s existing parsers to be reused by the TE loader. This would also make it easier to upstream the TE loader.

Support for SEC/PEI/SMM modules (UEFI helper script)

Instead of assuming the entry point parameters, the script could prompt the user to select the module type, or somehow retrieve the module type from the FFS header (if the FS loader was used).

Additional GUID heuristics (UEFI helper script)

The script could locate calls to EFI_BOOT_SERVICES/EFI_RUNTIME_SERVICES functions with GUID parameters and automatically apply the EFI_GUID data type.

Protocol database (UEFI helper script)

Similar to the existing GUID->name database (imported from UEFITool), a database for mapping protocol definitions to the structure name could be created. The script could use this database to automatically apply the correct protocol structure type in calls to LocateProtocol/etc.

Very basic dependency graph (inspired by this UEFITool issue) (UEFI helper script)

The script could locate all calls to protocol consumption/production functions in EFI_BOOT_SERVICES (such as LocateProtocol, InstallProtocol, etc) and use this to generate a basic overview of the protocols used by the current UEFI binary.

Acknowledgements

I would like to thank my mentors Martin Roth and Raul Rangel for their continued assistance during the past 12 weeks. This has been a great opportunity, and it certainly wouldn’t have been possible without their help. I look forward to contributing to coreboot and other related projects (including Ghidra) in the future.

[GSoC] Coreboot Coverity, Final Update

It is now the final week of GSoC, and it is time for me to write my final blog post. Over the past summer I have worked on fixing the Coverity scan issues in coreboot, with the goal of making the code base “Coverity clean”. This has involved writing a substantial number of patches, the vast majority of which are in coreboot, with a sprinkling in a few other projects:

  • 146 patches in coreboot
  • 6 patches in flashrom
  • 6 patches in vboot
  • 3 patches in Chromium EC
  • 4 patches in OpenSBI
  • 2 patches in em100
  • 1 in the Linux kernel

At the time of writing, a few of my patches are still under review on Gerrit, so it is possible (and hopeful!) that this list will increase over the next few weeks.

In total, these patches resolved 172 Coverity issue reports of actual bugs. However, Coverity also isn’t always right, and some issues weren’t actually problems that required patches. These issues, 91 in total, were either false positives or intentional and were ignored. At the moment, there are currently 223 remaining reports in the issue tracker for coreboot. Despite being a substantial number, this is almost entirely composed of issues from third-party projects (such as OpenSBI or vboot, which probably shouldn’t be counted in the coreboot tracker anyway), and the AMD vendorcode. The original plan at the beginning of the summer was to work on the AMD vendorcode; however, after discussion with my mentors we decided to skip it, since with the upcoming deprecations for coreboot 4.11 it might not be around much longer. Aside from this, there are roughly 20 remaining issues, which mostly required refactoring or technical knowledge that I don’t have.

With the summary out of the way, I’d like to give everyone a sample of the sort of bugs I’ve worked on during the project, and hopefully give advice for avoiding them in the future. Here is a list of the most common, nasty, or subtle types of bugs I’ve found over the summer.

Missing Break Statements

In switch statements in C, every case statement implicitly falls through to the next one. However, this is almost never the desired behavior, and so to avoid this every case needs to be manually terminated by a break to prevent the fall-through. This unfortunately is very tedious to do and is often accidentally left out. For a prototypical example, let’s look at CB:32180 from the AGESA vendorcode.

switch (AccessWidth) {
        case AccessS3SaveWidth8:
                RegValue = *(UINT8 *) Value;
                break;
        case AccessS3SaveWidth16:
                RegValue = *(UINT16 *) Value;
                break;
        case AccessS3SaveWidth32:
                RegValue = *(UINT32 *) Value;
        default:
                ASSERT (FALSE);
}

In this switch there is a missing break after the AccessS3SaveWidth32 case, which will then fall-through to the false assertion. Clearly not intentional! Other examples of this, though not as severe, can be found in CB:32088 and CB:34293. Fortunately, these errors today can be prevented by the compiler. GCC recently added the -Wimplicit-fallthrough option, which will warn on all implicit fall throughs and alert to a potentially missing break. However, some fall throughs are intentional, and these can be annotated by a /* fall through */ comment to silence the warning. Since CB:34297 and CB:34300 this warning has been enabled in coreboot, so this should be the last we see of missing break statements.

Off-by-One Errors

There are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.

Anonymous

Everyone has been bitten by off-by-one errors. Let’s take a look at CB:32125 from the Baytrail graphics code.

static void gfx_lock_pcbase(struct device *dev)
{
        const u16 gms_size_map[17] = { 0, 32, 64, 96, 128, 160, 192, 224, 256,
                                       288, 320, 352, 384, 416, 448, 480, 512 };
        ...
        u32 gms, gmsize, pcbase;
        gms = pci_read_config32(dev, GGC) & GGC_GSM_SIZE_MASK;
        gms >>= 3;
        if (gms > ARRAY_SIZE(gms_size_map))
                return;
        gmsize = gms_size_map[gms];
        ...
}

Here we have an array gms_size_map of 17 elements, and a bounds check on the gms variable before it is used to index into the array. However, there’s a problem. The bounds check misses the case when gms == ARRAY_SIZE(gms_size_map) == 17, which is one past 16 – the index of the last array element. The fix is to use >= in the check instead of >. This exact error when performing a bounds check is very common: see at least CB:32244, CB:34498, and CL:1752766 for other examples.

Another nasty place where off-by-one errors strike is with strings – in particular, when making sure they are null terminated. Here is CB:34374 from the ACPI setup of the Getac P470.

static long acpi_create_ecdt(acpi_ecdt_t * ecdt)
{
        ...
        static const char ec_id[] = "\_SB.PCI0.LPCB.EC0";
        ...
        strncpy((char *)ecdt->ec_id, ec_id, strlen(ec_id));
        ...
}

The problem is that strncpy() will only copy at most strlen(ec_id) characters, which excludes the null character. The author might have been thinking of the similar strlcpy(), which does explicitly null terminate the string buffer even if it never reaches a null character. In this case none of the string-copying functions are needed, since ec_id is a string buffer and so can be copied using a simple memcpy().

Boolean vs Bitwise Operators

In C, all integers are implicitly convertible to boolean values and can be used with all boolean operators. While somewhat convenient, this also makes it very easy to mistakenly use a boolean operator when a bitwise one was intended. Let’s take a look at CB:33454 from the CIMX southbridge code.

void sb_poweron_init(void)
{
        u8 data;
        ...
        data = inb(0xCD7);
        data &= !BIT0;
        if (!CONFIG(PCIB_ENABLE)) {
                data |= BIT0;
        }
        outb(data, 0xCD7);
        ...
}

Here BIT0 is the constant 0x1, so !BIT0 expands to 0, with the net effect of data being completely cleared, regardless of the previous value from inb(). The intended operator to use was the bitwise negation ~, which would only clear the lowest bit. For more examples of this sort of bug, see CB:34560 and OpenSBI 3f738f5.

Implicit Integer Conversions

C allows implicit conversions between all integer types, which opens the door for many accidental or unintentional bugs. For an extremely subtle example of this, let’s take a look at OpenSBI 5e4021a.

void *sbi_memset(void *s, int c, size_t count);

void sbi_fifo_init(struct sbi_fifo *fifo, void *queue_mem,
                   u16 entries, u16 entry_size)
{
        ...
        sbi_memset(fifo->queue, 0, entries * entry_size);
}

Do you see the problem? The issue is that entries and entry_size are both 16-bit integers, and by the rules of C are implicitly converted to int before the multiplication. An int cannot hold all possible values of a u16 * u16, and so if the multiplication overflows the intermediate result could be a negative number. On 64-bit platforms size_t will be a u64, and the negative result will then be sign-extended to a massive integer. As the last argument to sbi_memset(), this could lead to a very large out-of-bounds write. The solution is to cast one of the variables to a size_t before the multiplication, which is wide enough to prevent the implicit promotion to int. For other examples of this problem, see CB:33986 and CB:34529.

Another situation where implicit conversions strike is in error handling. Here is CB:33962 in the x86 ACPI code.

static ssize_t acpi_device_path_fill(const struct device *dev, char *buf,
                                     size_t buf_len, size_t cur);

const char *acpi_device_path_join(const struct device *dev, const char *name)
{
        static char buf[DEVICE_PATH_MAX] = {};
        size_t len;
        if (!dev)
                return NULL;

        /* Build the path of this device */
        len = acpi_device_path_fill(dev, buf, sizeof(buf), 0);
        if (len <= 0)
                return NULL;
        ...
}

With the function prototype right there, the problem is obvious: acpi_device_path_fill() returns negative values in a ssize_t to indicate errors, but len is a size_t, so all those negative error values are converted to extremely large integers, thus passing the subsequent error check. During code review this may not at all be obvious though.

Both these errors could be prevented using the -Wconversion compiler option, which will warn about all implicit integer conversions. However, there are an incredible number of such conversions in coreboot, and it would be a mammoth task to fix them all.

Null Pointers

Null pointers need no introduction – they are well known to cause all sorts of problems. For a simple example, let’s take a look at CB:33134 from the HiFive Unleashed mainboard.

static void fixup_fdt(void *unused)
{
        void *fdt_rom;
        struct device_tree *tree;
        
        /* load flat dt from cbfs */
        fdt_rom = cbfs_boot_map_with_leak("fallback/DTB", CBFS_TYPE_RAW, NULL);

        /* Expand DT into a tree */
        tree = fdt_unflatten(fdt_rom);
        ...
}

This code attempts to load a device tree from a location in the CBFS. However, cbfs_boot_map_with_leak() will return a null pointer if the object in the CBFS can’t be found, which will then be dereferenced in the call to fdt_unflatten(). On most systems dereferencing a null pointer will lead to a segfault, since the operating system has set up permissions that prevent accessing the memory at address 0. However, coreboot runs before the operating systems has even started, so there are no memory permissions at all! If fdt_rom is a null pointer, fdt_unflatten() will attempt to expand the device tree from whatever memory is at address 0, leading to who knows what problems. A simple null check will avoid this, but requires the programmer to always remember to put them in.

Another common issues with null pointers is that even if you do a check, it might not actually matter if the pointer has already been dereferenced. For example, here is a problem with the EDID parser in CB:32055.

int decode_edid(unsigned char *edid, int size, struct edid *out)
{
        ...
        dump_breakdown(edid);

        memset(out, 0, sizeof(*out));

        if (!edid || memcmp(edid, "\x00\xFF\xFF\xFF\xFF\xFF\xFF\x00", 8)) {
                printk(BIOS_SPEW, "No header found\n");
                return EDID_ABSENT;
        }
        ...
}

In this case the EDID is dumped doing the null pointer check, but at worst there should only be a wonky dump if edid is null, right? Not necessarily. Since dereferencing a null pointer is undefined behavior, the compiler is allowed to assume that no null pointer dereferences occur in the program. In this case, dereferencing the edid pointer in dump_breakdown() is an implicit assertion that edid is not null, so an over-zealous compiler could remove the following null check! This optimization can be disabled using -fno-delete-null-pointer-checks (which is done in coreboot), but does not prevent any problems that could have happened in the null dereference before the check took place. See this article in LWN for details on how a vulnerability from this problem was dealt with in the Linux kernel.

Conclusion

C has always had the mantra of “trust the programmer”, which makes mistakes and errors very easy to do. Some of these errors can be prevented at compile time using compiler warnings, but many cannot. Coverity and other static analyzers like it are very useful and powerful tools for catching bugs that slip past the compiler and through peer review. However, it is no silver bullet. All of these errors were present in production code that were only caught after the fact, and there are certainly bugs of this sort left that Coverity hasn’t found. What do we do about them, and how can we ever be sure that we’ve caught them all? Today, there are new languages designed from the beginning to enable safe and correct programming. For example, libgfxinit is written in SPARK, a subset of Ada that can be formally verified at compile time to avoid essentially all of the above errors. There is also the new oreboot project written in Rust, which has similar compile time guarantees due to its extensive type system. I hope to see these languages and others increasingly used in the future so that at some point, this job will have become obsolete. 🙂

[GSoC] How to Use ARM Trusted Firmware in Coreboot

Hello, I’m Asami. In this article, I’m going to talk about how to use ARM Trusted Firmware in coreboot for ARMv8 (AArch64). ARM Trusted Firmware provides a reference implementation of secure world software for Armv8-A and Armv8-M. You can see the code via https://github.com/ARM-software/arm-trusted-firmware.

Trusted Firmware has 5 steps which are called as BL1, BL2, BL3-1, BL3-2, and BL3-3. BL1 step is similar to the bootblock stage and BL2 is similar to the romstage of coreboot. In the coreboot project, we only use the BL3-1 part that is expected to work on EL3 exception level. The code of BL3-1 will execute just after the ramstage and before the payload when we enable the Trusted Firmware.

How to enable Trusted Firmware

It’s very easy to enable Trusted Firmware on coreboot. You just need to ‘select ARM64_USE_ARM_TRUSTED_FIRMWARE’ in your Kconfig. If you want to run coreboot on QEMU/AArch64, you need to add the ‘select ARM64_USE_ARM_TRUSTED_FIRMWARE’ at src/mainboard/emulation/qemu-aarch64/Kconfig. The next step switches depending on the configuration at src/arch/arm64/boot.c. Here is the code to switch the next step:

// src/arch/arm64/boot.c
static void run_payload(struct prog *prog)
{
	void (*doit)(void *);
	void *arg;

	doit = prog_entry(prog);
	arg = prog_entry_arg(prog);
	u64 payload_spsr = get_eret_el(EL2, SPSR_USE_L);

	if (CONFIG(ARM64_USE_ARM_TRUSTED_FIRMWARE))
		arm_tf_run_bl31((u64)doit, (u64)arg, payload_spsr);
	else
		transition_to_el2(doit, arg, payload_spsr);
}

Why using Trusted Firmware

Coreboot for ARMv8 has 2 options to pass an execution from it to a payload. The first is passing execution to a payload directly and the second one is passing to the BL3-1 code before a payload. You always don’t have to use Trusted Firmware. However, you need to enable Trusted Firmware if you want to run Linux because it expects to work with PSCI. PSCI is an abbreviation of Power State Coordination Interface which is a standard interface for power management that can be used by OS vendors for supervisory software working at different levels of privilege on an ARM device. Coreboot doesn’t have the setup for PSCI but Trusted Firmware does.

Current Status

Unfortunately, QEMU/AArch64 in coreboot doesn’t support Trusted Firmware yet. It means we can’t run Linux with QEMU for ARMv8. I’m now trying to support Trusted Firmware for QEMU/AArch64.