
The Future of CI: How KDAB Creates Modern Continuous Integration Using Buildbot

For years, we at KDAB have been using Buildbot as our build and continuous integration system. Gerrit hosts all our projects and serves as our code review platform. Our deployment of Buildbot and build machines has grown organically over the years. It builds hundreds of configurations and runs up to a thousand builds daily, but issues with reliability and quality of service called for a major restructuring. Over the past year, we gradually developed and migrated to new infrastructure and, once that was in place, were finally able to add some long-awaited features.

Buildbot at KDAB

Buildbot is a continuous integration system written in Python. It offers a high degree of flexibility because the build configuration is written entirely in Python. Thanks to an extensive homegrown library of functions and features, we need only a few lines of code to build, test and package a new C++ and Qt/QML application on Linux, Windows and macOS, as well as Android, iOS and other platforms. Many of our projects build against multiple compilers and Qt versions per platform. Sometimes application bundles need to be signed and notarized (looking at you, Apple). Once a build is finished, the apps are offered for download from our servers and developer app stores like Visual Studio App Center. Developers are notified about build failures by email, through our chat system and, of course, in our code review tool Gerrit.
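To give an idea of how far such a helper library can compress a build description, here is a minimal, purely illustrative sketch; the function, platform and step names are hypothetical, not KDAB's actual API:

```python
# Hypothetical sketch: how a homegrown helper library can collapse a full
# build/test/package pipeline into a few declarative lines. All names here
# are illustrative, not KDAB's real library.

def qt_app_pipeline(name, platforms, qt_versions, sign=False):
    """Expand a short project description into concrete build steps."""
    steps = []
    for platform in platforms:
        for qt in qt_versions:
            tag = f"{name}-{platform}-qt{qt}"
            steps.append(("configure", tag))
            steps.append(("build", tag))
            steps.append(("test", tag))
            if sign and platform in ("macos", "ios"):
                # Apple targets additionally need signing and notarization.
                steps.append(("sign+notarize", tag))
            steps.append(("package", tag))
    return steps

# One short call fans out into a full build matrix:
steps = qt_app_pipeline("viewer", ["linux", "macos"], ["6.5", "6.8"], sign=True)
```

A single declarative call like this is what keeps per-project configuration down to a few lines, even when the underlying matrix of platforms and Qt versions is large.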

System Architecture

From a system architecture point of view, Buildbot follows a master/worker paradigm: Workers are virtual machines, bare-metal servers or Docker containers with build environments. Each worker runs a Buildbot process which receives and executes build commands. A central master keeps track of changes in Gerrit, build configurations, workers, builds and build results. In large deployments like ours, the master is made up of multiple Buildbot processes with different responsibilities: one process serves the web interface, another coordinates builds. They rely on a variety of additional services: MariaDB for data storage, Apache2 to manage user authentication and Crossbar.io to coordinate messages between the master’s Buildbot processes. The Buildbot master centrally controls how builds are run on the workers and issues every build command individually to the respective worker.
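As a rough illustration of this master/worker split, a minimal Buildbot `master.cfg` might look like the following config fragment; host names, credentials and project paths are placeholders, not our actual setup:

```python
# Sketch of a minimal master.cfg illustrating the master/worker paradigm.
# Host names, credentials and project paths are placeholders.
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}

# A worker is a VM, bare-metal server or Docker container running a
# Buildbot worker process that receives and executes build commands.
c['workers'] = [worker.Worker("linux-qt6", "worker-password")]
c['protocols'] = {'pb': {'port': 9989}}

# The master watches Gerrit for new changes...
c['change_source'] = [changes.GerritChangeSource(
    gerritserver="gerrit.example.com", username="buildbot")]

# ...and turns each change into a build, issuing every command
# individually to the worker.
factory = util.BuildFactory([
    steps.Gerrit(repourl="ssh://gerrit.example.com:29418/demo-project"),
    steps.ShellCommand(command=["cmake", "--build", "build"]),
])
c['builders'] = [util.BuilderConfig(
    name="demo-linux", workernames=["linux-qt6"], factory=factory)]
c['schedulers'] = [schedulers.SingleBranchScheduler(
    name="gerrit", builderNames=["demo-linux"])]
```

In a deployment of our size, this single-builder fragment is multiplied across hundreds of builders and dozens of workers, which is why the master itself is split into several cooperating processes.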

Diagram of Buildbot Infrastructure at KDAB

This setup enabled, among other things, the following:

  • building hundreds of projects and configurations, with up to a thousand builds daily,
  • monitoring hundreds of Git repositories for changes, hosted on our internal Gerrit, on GitHub and on other platforms,
  • reporting build results via email, into our chat system, and to Gerrit and GitHub, accessible to customers and KDABians alike,
  • a web interface providing insight into builds (for KDABians only) and access to build artifacts.

Over the years, the number of projects and daily builds has grown steadily, and the speed of builds and of the web interface degraded correspondingly. The master processes and all accompanying services were hosted on a single VM with a networked file system, and the file system was quickly identified as the main bottleneck. Unfortunately, the traditional approach to deployment hampered our efforts to move the system to bare-metal hardware with decent file system speed: all dependencies, the services and Buildbot itself were installed directly on the system using the system package manager and Python’s pip. The Buildbot configuration relied heavily on hard-coded values, including IP addresses and file system paths. For every new instance, we would have had to recreate the setup step by step, a very laborious process that would have left us with a setup just as inflexible as the old one.

Modern Buildbot at KDAB

Therefore, we decided to encapsulate Buildbot and all services in Docker containers and use Docker Compose to describe the whole stack. To that end, we undertook the following steps:

  • refactor the Buildbot configuration so that it supports multiple independent instances of the Buildbot master:
    • read instance-specific parameters such as the URL, website name and addresses of back-end services from environment variables
    • load different Python scripts describing builds and build workers, selected via environment variables
  • create Docker images
    • for Buildbot, including all dependencies and our custom patches
    • for Apache2
    • for an SSH server to receive artifact uploads from workers
    • for a monitoring solution (Telegraf) that collects performance data from the services
    • and more
  • create a Docker Compose configuration to describe the services and their interactions
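The instance-parameterisation idea from the first step can be sketched in plain Python: the same configuration code reads everything instance-specific from environment variables, so one codebase can serve many independent masters. The variable names below are illustrative, not our actual ones:

```python
# Sketch of instance-parameterised configuration: everything that differs
# between Buildbot instances comes from the environment. Variable names
# are illustrative assumptions.
import os

def load_instance_config(env=os.environ):
    return {
        "title":           env.get("BB_TITLE", "Buildbot"),
        "url":             env["BB_URL"],  # required per instance
        "db_url":          env.get("BB_DB_URL", "sqlite://"),
        # Which Python script describes this instance's builds and workers:
        "builders_script": env.get("BB_BUILDERS_SCRIPT", "builders.py"),
    }

# Each Docker Compose instance supplies its own environment file:
cfg = load_instance_config({"BB_URL": "https://ci.example.com/",
                            "BB_TITLE": "Customer CI"})
```

With this pattern, spinning up a new master is just a matter of writing a new environment file; the Python configuration itself never changes.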

The Docker Compose configuration is instance-independent: In order to create a new instance of the Buildbot master, with its own set of workers, build configuration, URL, name and so forth, we simply copy the Docker Compose configuration to the host system and create an environment file which describes the instance. Setting up a new Buildbot instance has thus become a quick, repeatable process.

Quality of Service Improved

Equipped with these new tools, we promptly created two new Buildbot instances: one to replace the old instance for KDAB-internal and customer projects, and another to build the projects of one of our larger customers. We already provided code hosting and continuous integration services for this customer, but their developers could not directly access Buildbot’s web interface with details on the builds, due to limitations in Buildbot’s access management. The new separate instance removes this obstacle, and our customer now has full access to builds.

The transition brought many other improvements for our customers, developers and system administrators:

  • Build speed improved drastically: The time from the creation of a commit on Gerrit to the start of the corresponding builds dropped from typically more than five minutes to less than a second. Overall build times often fell by more than 50%.
  • The system became more reliable, with fewer crashes and freezes.
  • The time to update the build configuration on the fly, e.g. to add a new project, decreased from minutes to a few seconds.
  • A special Apache2 configuration lets us easily brand the Buildbot web interface, not only by changing the instance’s name but also by changing its colors, a feature which Buildbot does not offer natively.
  • We can offer dedicated Buildbot instances to customers so that they have direct access to build results.

Build Results in Gerrit

For years, platforms like GitHub have presented build results in the web interface and let build failures block merges. Gerrit has only offered rudimentary support for this: Buildbot could block the submission of patches and create comments to inform about build errors, but the presentation in Gerrit was cluttered and often confusing, a workaround at best.

Luckily, Gerrit 3.4 introduced the Checks API. JavaScript plug-ins in Gerrit fetch information from a CI system and supply it to the Gerrit web interface. Gerrit then displays the build results right there with the commit message and review comments. The interface shows every build configuration separately, and even build output like compile errors or test failures are right there in Gerrit. The Gerrit project provides an example of what it can look like:

Example of Gerrit's Check UI

So far, there is no plug-in for Buildbot publicly available, so we developed our own. Now, when a developer opens a change on Gerrit, our JavaScript plug-in queries Buildbot’s REST API for builds. The script will automatically determine the correct instance of Buildbot, and whether the user has access to that particular instance at all.

We needed a few tricks to make this happen. First, Buildbot did not allow efficient queries for the builds of a given Gerrit change. We added that feature, contributed the patch to Buildbot upstream, and deployed it on our instances. This, by the way, is not our first contribution to Buildbot; over the years we have contributed 40 patches. Second, we introduced an additional endpoint on the web server to gracefully check whether a Gerrit user has access to a particular Buildbot instance in the first place, to avoid unnecessary log-in dialogs in the web interface. Third, we created a custom data store in Gerrit to map repositories to Buildbot instances.
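As a sketch of the kind of query the plug-in performs, the following builds a Buildbot REST API URL that filters builds by Gerrit change. The `gerrit_change` query parameter is an assumption for illustration, not the exact shape of the upstream patch:

```python
# Illustrative sketch: build (and optionally fetch) a Buildbot REST API URL
# for the builds belonging to one Gerrit change. The "gerrit_change" query
# parameter is hypothetical; the real upstream patch may use different names.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def builds_for_change_url(base_url, change_number, patchset):
    params = urlencode({"gerrit_change": f"{change_number}/{patchset}",
                        "limit": 50})
    return f"{base_url.rstrip('/')}/api/v2/builds?{params}"

def fetch_builds(url):
    # Network call, not executed here: returns the decoded JSON payload
    # that the Gerrit plug-in would translate into Checks API results.
    with urlopen(url) as response:
        return json.load(response)

url = builds_for_change_url("https://ci.example.com", 1234, 2)
```

The JavaScript plug-in performs essentially this query from the browser and maps the returned builds onto Gerrit's Checks API data model.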

The new Docker Compose configuration helped significantly with this: We could easily develop and test all of these changes on local development instances of Buildbot. Deployment to the production instances was also quick and efficient. Fundamentally, without the performance improvements the new instances brought, this feature would not have been possible. Feedback from our developers has been great so far.

Conclusion

This is not all, of course. We are currently looking into using Docker and VM images to create reproducible build environments, so that developers can debug build failures in exactly the same environment as the CI. We are also investigating ways to upstream the Gerrit plug-in.

We at KDAB consider best practices and efficient workflows to be a large part of creating great software. That is why we keep investing into better infrastructure, for ourselves and for our customers. If you’d like to learn details about our infrastructure or discuss similar projects, feel free to get in touch.

 

About KDAB


KDAB provides market leading software consulting and development services and training in Qt, C++ and 3D/OpenGL. Contact us.
