[GSoC] Address Sanitizer, Part 1

Hello everyone. My name is Harshit Sharma (hst on IRC). I am working on the project to add the “Address Sanitizer” feature to coreboot as a part of GSoC 2020. Werner Zeh is my mentor for this project and I’d like to thank him for his constant support and valuable suggestions.

It’s been a fun couple of weeks since I started working on this project. Though I found the initial few weeks quite challenging, I am glad that I was able to get past that and learned some amazing stuff I’d cherish for a long time.

Also, being a student, I find it incredible to have got a chance to work with and learn from such passionate, knowledgeable, and helpful people who are always available over IRC to assist.

A few of you may already know about the progress we’ve made through Werner’s message on the mailing list. This is a much comprehensive report on the work we had done prior to the first evaluation period.


What is Address Sanitizer ?

Address Sanitizer (ASan) is a runtime memory debugging tool, designed to find out of bounds and use after free memory bugs. It is a part of toolset Google has developed which also includes Undefined Behaviour Sanitizer (UBSan), a Thread Sanitizer (TSan), and a Leak Sanitizer (LSan).

This tool would help in improving code quality and thus would make our runtime code more robust.


Compile with ASan

The firstmost task was to ensure that coreboot compiles without any errors after Asan is enabled. For this, we created dummy functions and added the relevant GCC flags to build coreboot with ASan. We also introduced a Kconfig option (currently only available on x86 architecture) to enable this feature. (Patch [1])


Porting code from Linux

The design of ASan in coreboot is based on its implementation in Linux kernel, also known as Kernel Address Sanitizer (KASAN). However, coreboot differs a lot from Linux kernel due to multiple stages and that is what poses a challenge.


Design of ASan

ASan uses compile-time instrumentation that adds a run-time check before every memory instructions.

The basic idea is to encode the state of each 8 aligned bytes of memory in a byte in the shadow region. As a consequence, the shadow memory is allocated to 1/8th of the available memory. The instrumentation of the compiler adds __asan_load and __asan_store function calls before memory accesses which seeks the address of the corresponding shadow memory and then decides if access is legitimate depending on the stored state.

If the access is not valid, it throws an error telling the instruction pointer address, the address where a bug is found and whether it was read or write.


Need for GCC patch

Unlike Linux kernel which has a static shadow memory layout, coreboot has multiple stages with very different memory maps. Thus we require different shadow offset addresses for each stage.

Unfortunately, GCC presently, only supports using a static shadow offset address which is specified at compile time using -fasan-shadow-offset flag.

So, we came up with a GCC patch that introduces a feature to use a dynamic shadow offset address. Through this, we enable GCC to determine shadow offset address at runtime using a callback function __asan_shadow_offset().

Though we’ve presented our patch to GCC developers to convince them to include this change in their upcoming version, for now, ASan on coreboot only works if this patch is applied. (Patches [2] and [3])


Allocating and initializing shadow buffer

Instead of allocating a shadow buffer for the whole memory region, we decided to only define shadow memory for data and heap sections. Since we don’t want to run ASan for hardware mapped addresses, this way we save a lot of memory space.

Once the shadow region is allocated by the linker, the next task is to initialize it as early as possible at runtime. This is done by a call to asan_init() which unpoisons i.e. sets the shadow memory corresponding to the addresses in the data and heap sections to zero.


ASan support for ramstage

We decided to begin in a comparatively simpler stage i.e. ramstage. Since ramstage uses DRAM, the implementation is common across all platforms of a particular architecture. Moreover, ramstage provides enough room in the memory to place our shadow region.

For now, we have only enabled this feature for x86 architecture and it is able to detect stack out of bounds and use after scope bugs. Though the patches are still in the review stage, we did a test on QEMU and siemens mc_apl3 (Apollo Lake based) and so far it looks good to us. (Patch [4])


Next steps..

In the second coding period, we’ve started working on adding ASan to romstage. I will push a change on Gerrit once we make some considerable progress.


We further encourage everyone in the community to review the patches and test this feature on their hardware for better test coverage. The more feedback we get, the more are the chances to have a high-quality debugging tool.