Updating LZ4 during my GSoC

Introduction

In this blog post, I will give some insight into my Google Summer of Code project, in which I worked on coreboot. My goal was to update the compression and decompression algorithms LZ4 and LZMA to their newest versions.

Throughout the process, I learned a lot about coreboot as an open-source community and about GSoC itself, lessons that may help students considering applying for GSoC themselves.

This post won’t focus on technical details. There is a section at the end about my work and its results, but the rest of the post is a bit more meta.

What did I learn about open-source communities?

Before this project, I had never contributed to any open-source project. My studies don’t involve much programming, so my skills aren’t professional; they are homebrewed. Open-source communities, especially ones as big as coreboot, really intimidated me (they still do, somehow). There are many people involved, all of them probably better programmers than me, with more knowledge of the project as well. How could I help them with anything useful at all? And what if I pushed my first patch and it was just embarrassing? Having a mentor helped; without someone to talk to, I think I would never have stuck my nose into any big open-source project.

All the people I met were very kind and helpful when needed. I also learned that criticism, especially in text form, can sound harsher in my head than it is meant to.

Thoughts about Google Summer of Code

GSoC is a really good opportunity to get a feeling for working on open source. If you are a student looking for insight into open-source work, guided by a mentor, a GSoC project may be just right for you.

But something I underestimated was how time-intensive such a project is, even a medium-sized one. I was probably too naive; beforehand, I talked myself into thinking, “Yeah, this won’t be a problem, I can just work on this around two days a week and shift other stuff to other days.” Well, it turns out that’s not how workload and mental load work, at least not for me. I work besides my studies, and in the first weeks of GSoC the combination overwhelmed me. Besides leaving fewer hours for everything else, simply having more things to think about can be quite stressful.

GSoC allows you to stretch your project so there is less work per week; the total number of project hours stays the same. This option really helped me, because I had underestimated how heavy the weekly hours would be.

If I were to apply to GSoC again, I would double-check that such a commitment is actually feasible for me, and I would recommend the same to everyone thinking about applying. GSoC can be interesting and fun, but you need to be able to give the project the time it requires.

What did I do and achieve?

I started my project by updating the LZ4 code in coreboot. After that, I planned to move on to another compression algorithm, LZMA. My hope was to increase decompression speed and the compression ratio at the same time: the LZ4 release notes since the version currently used by coreboot (spanning eight years of releases) claim improvements in both speed and compression factor.

To get an overview, I first searched for all places where LZ4 is used and checked which version the code came from. I found that there are five files containing code from the LZ4 repository, each holding a different subset of functions from the original sources. Three of these files are included by the LZ4 wrapper files.

Then I fetched the code from the newest LZ4 release and merged it into the files where the old LZ4 code was used. After merging, there were many LZ4 functions that were never used in our case, so for each source file I went through each wrapper file to find out which functions were used and which I could remove. This way I could ensure there was no unused code in these files, keeping them easier to review.
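
As a side note, this kind of pruning can get help from the compiler: since the vendored code is pulled into a single translation unit and its functions are (as far as I can tell) static, an unused one triggers a warning. A toy illustration of the effect, my own example rather than coreboot code:

    /* Toy example (not coreboot code): with everything in one translation
     * unit and declared static, the compiler flags functions nothing calls:
     *
     *   cc -Wunused-function -c wrapper.c
     *   warning: 'helper_never_called' defined but not used
     */
    static int helper_never_called(int x) { return x * 2; }  /* gets flagged */

    static int helper_in_use(int x) { return x + 1; }

    int wrapper_entry(int x)
    {
        return helper_in_use(x);
    }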

After that, there were still many errors to resolve. That took the most time, as it was more complicated than I had assumed. In the end, all tests passed and building a ROM and running it with QEMU worked just fine, so my last to-do was to test how much faster the new version was than the old one, if it was faster at all.
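
For the curious, the shape of such a timing measurement on the host side can be as simple as the sketch below. This is my own toy micro-benchmark against the upstream library, not code from the patch; the numbers in the statistics section were collected during actual boots in the VM instead.

    /* Toy host-side micro-benchmark: compress a buffer with the upstream
     * library, then time one decompression of it. Illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include "lz4.h"

    int main(void)
    {
        enum { N = 1 << 20 };
        char *src = malloc(N), *out = malloc(N);
        char *comp = malloc(LZ4_compressBound(N));
        for (int i = 0; i < N; i++)
            src[i] = (char)(i % 64);        /* a compressible pattern */
        int csize = LZ4_compress_default(src, comp, N, LZ4_compressBound(N));
        if (csize <= 0)
            return 1;

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        int dsize = LZ4_decompress_safe(comp, out, csize, N);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double us = (t1.tv_sec - t0.tv_sec) * 1e6
                  + (t1.tv_nsec - t0.tv_nsec) / 1e3;
        printf("decompressed %d bytes in %.1f us\n", dsize, us);
        free(src); free(out); free(comp);
        return 0;
    }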

The release notes mentioned many speed improvements, so I hoped those would show up in my measurements.

The whole process took longer than I expected, so I will miss the goal of also updating LZMA. As I write this, my patch is open for review, and I am in the process of collecting statistics that may show a speed improvement. If there is none, there may be no good reason to switch to a newer LZ4 version at all. As one review comment states, coreboot requires safe in-place decompression, which may be a problem with the new version and thus has to be checked.

My work is public on the coreboot Gerrit under this link: https://review.coreboot.org/c/coreboot/+/77148. I hope my patch will be merged, as I want to contribute to this project. But even if it is rejected, that is part of open-source work too; I’ll try to improve on my next patch. This is my first open-source contribution, and while it isn’t perfect, it certainly won’t be my last.

What could be done in the future?

As stated in a comment on my patch, in-place decompression is important for coreboot. There is a merged pull request from 2016 resolving that issue, but its functionality may have been lost in further development. Therefore, the new LZ4 version has to be checked for in-place safety.
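
To make the requirement concrete: in-place decompression means the compressed blob and the decompressed output share one buffer. Upstream lz4.h documents a scheme for this (in the LZ4_STATIC_LINKING_ONLY section, added around v1.9.0) where the compressed data sits at the end of the buffer and a small margin keeps the writer from overtaking the reader. Below is a minimal sketch of that scheme, my own illustration rather than coreboot code:

    /* Minimal sketch of in-place LZ4 decompression following the scheme
     * documented in upstream lz4.h. The compressed blob is placed at the
     * END of the output buffer; the margin guarantees the decompressor
     * never overwrites compressed bytes it has not read yet. */
    #define LZ4_STATIC_LINKING_ONLY   /* exposes the in-place macros */
    #include <string.h>
    #include "lz4.h"

    int decompress_in_place(char *buf, size_t buf_cap,
                            const char *blob, int comp_size, int decomp_size)
    {
        /* Required capacity: decompressed size plus a small safety margin. */
        size_t need = LZ4_DECOMPRESS_INPLACE_BUFFER_SIZE(decomp_size);
        if (buf_cap < need)
            return -1;

        char *tail = buf + need - comp_size;  /* compressed data at the tail */
        memmove(tail, blob, comp_size);

        /* Source and destination overlap; safe only thanks to the margin. */
        return LZ4_decompress_safe(tail, buf, comp_size, decomp_size);
    }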

To tweak compression/decompression speed and the compression factor, one might want to make the compression parameters adjustable. This could be done through a Kconfig entry.
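
As a rough sketch of that idea (the CONFIG_LZ4HC_COMPRESSION_LEVEL symbol below is made up for illustration; LZ4_compress_HC() and LZ4HC_CLEVEL_DEFAULT are real upstream API from lz4hc.h), a build-time level would simply be forwarded to the HC compressor:

    /* Hypothetical knob: forward a Kconfig-style level to LZ4 HC.
     * CONFIG_LZ4HC_COMPRESSION_LEVEL is an assumed symbol, not an
     * existing coreboot option. */
    #include "lz4hc.h"

    #ifndef CONFIG_LZ4HC_COMPRESSION_LEVEL
    #define CONFIG_LZ4HC_COMPRESSION_LEVEL LZ4HC_CLEVEL_DEFAULT
    #endif

    int compress_stage(const char *src, int src_size, char *dst, int dst_cap)
    {
        /* Higher levels trade compression time for a better ratio. */
        return LZ4_compress_HC(src, dst, src_size, dst_cap,
                               CONFIG_LZ4HC_COMPRESSION_LEVEL);
    }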

It also turned out that after building coreboot (running make in the coreboot directory), when the memory regions and sizes of coreboot.rom are printed, no compression is listed for the payload, even though it certainly has been compressed.

Statistics

I tested this on my laptop with KVM and QEMU, on an AMD Ryzen 5 5500U. I changed my system’s CPU affinity so that two of my cores were mostly idle (they still got used, but not much, usually ~1%). Then I assigned these two CPUs to my VM to make different runs more comparable. Still, different runs produced different timings, so I made six runs each for the old and the new LZ4 version.

                    Avg ramstage          Ramstage decompression   Avg payload           Payload decompression
                    decompression time    time std dev             decompression time    time std dev
    Current version 420 µs                54 µs                    229 µs                64 µs
    Updated version 520 µs                113 µs                   219 µs                4 µs

                    Ramstage compression ratio    Payload compression ratio
    Current version 1.507                         1.452
    Updated version 1.51                          1.454

These values indicate a very small improvement in the compression ratio. Regarding decompression time, most measurements have a relatively high standard deviation, which makes the results less statistically meaningful. Ramstage decompression seems to have slowed down (520 µs vs. 420 µs, roughly 24%), while average payload decompression got about 4-5% faster.

All in all, these results are anything but exciting to me. Although a newer version may have other advantages, there seems to be no decompression-time improvement as I had hoped, and the compression-ratio improvements are so small that they might not even be noticeable.