Continuous, Continuous Integration
Last week, I finished implementing our refactor of Hemlock’s file I/O from synchronous system calls to asynchronous io_uring submissions and completions. First, I want to say that io_uring rocks. Building an asynchronous I/O framework on top of it was blissful compared to past projects built on poll and epoll.
Also last week, I found out that Hemlock’s new file I/O implementation breaks our continuous integration build in GitHub Actions. The reason? The file I/O changes targeted bleeding-edge kernel features, and GitHub Actions’ ubuntu-latest was version 20.04 (nearly two years old now). That created a problem for us: we were using io_uring kernel features that weren’t available until kernel version 5.13. GitHub Actions recently made Ubuntu 20.04.3 available, which gave us kernel 5.11, still a little short of meeting our needs.
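As a sanity check (not part of our eventual setup), the mismatch is easy to confirm from any shell on the runner; on ubuntu-latest at the time, this reported a 5.11 kernel rather than the 5.13 we needed.

    # Print the runner's kernel release, e.g. in a one-line workflow step.
    uname -r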
I briefly thought of turning off our C/I and merging the file I/O changes with roughly the following commit message.
Refactor File I/O on top of io_uring.
- Disable C/I because it can't handle io_uring.
- It works on my machine. ¯\_(ツ)_/¯
However, it is not amateur-hour at BranchTaken, so I did not do that. Continuous integration remained continuous, and the file I/O changes sat patiently in a branch while we modified our C/I. GitHub Actions obviously could not do the job alone, so we had to build and test our pull requests elsewhere. Deciding where was a bit daunting.
We considered the cloud giants: AWS (some mix of EC2, ECS, ECR, CodeBuild, or Lambda), Azure, and GCP. We also considered TravisCI and self-hosted GitHub Actions runners on a cloud provider. Of all these, I found that TravisCI was the only one that would not support the latest release of Ubuntu. Given my modest experience with AWS and our eventual need for ARM processing, AWS seemed the obvious choice amongst the cloud providers. That same experience led me to estimate a minimum of one week (and to brace for as much as one month) of effort to get everything set up properly, and to suspect I’d probably do it in a way that would need frequent refactors and maintenance. I found the idea distasteful, but it initially seemed the best option.
Fortunately, the GitHub Universe conference was still in our short-term memory. One of our big takeaways was that the gh CLI is amazingly useful. Just before committing to integrating AWS into our C/I, we realized we might be able to use the newly-announced gh extension feature to achieve our goal. Extensions are implemented as bash scripts. That simplicity made the approach extremely attractive compared to the overwhelming complexity of spinning up new cloud infrastructure. We could create our own gh push command to supersede our usage of git push. It would build and run our C/I tests, git push our commit, and mark the commit status as success or failure on our GitHub repository.

Additionally, Docker is so reliable and repeatable across different environments that we were okay with having the extension build and run our tests in a local docker build command rather than in the cloud. We had the Docker infrastructure mostly built already since we use it for all of our development. Our old GitHub Action would then be reduced to verifying that commits have a success status rather than being wholly responsible for building and testing. Accidentally running git push instead of gh push would still be fine for us. Commits pushed that way simply wouldn’t have any associated status, and they would not be mergeable from a pull request until we used gh push to run the C/I tests and write the results to GitHub.
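For the curious, here is a rough sketch of what a gh push extension along these lines could look like. It is not our exact script: the hemlock-ci image tag, the local-ci status context, and the description are hypothetical stand-ins, and it assumes (as in our setup) that the Dockerfile runs the test suite as part of the build. The gh extension mechanics themselves are standard: a repository named gh-push containing an executable of the same name, installed with gh extension install and invoked as gh push.

    #!/usr/bin/env bash
    # gh-push: build and test in Docker, push, then record a commit status.
    # Sketch only; image tag, status context, and description are stand-ins.
    set -u

    sha=$(git rev-parse HEAD)

    # The Dockerfile runs the test suite during the build, so a successful
    # `docker build` means the tests passed.
    if docker build --tag hemlock-ci .; then
        state=success
    else
        state=failure
    fi

    # Push the commit either way; the recorded status is what gates merging.
    git push "$@"

    # Write the commit status back to GitHub.
    gh api "repos/{owner}/{repo}/statuses/$sha" \
        -f state="$state" \
        -f context=local-ci \
        -f description="Local Docker build and test"

    test "$state" = success

The GitHub Action that remains can then be little more than a check that the pushed commit already carries that status, something along these lines (assuming GH_TOKEN is available to gh in the job):

    # Fail the job unless `gh push` already recorded a success status.
    state=$(gh api "repos/$GITHUB_REPOSITORY/commits/$GITHUB_SHA/status" --jq .state)
    test "$state" = success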
We decided to go the route of the gh push extension, and are quite happy we did. Our C/I is actually simpler than when we started and we think our testing strategy will take us a long way into the future. Unabashedly, that testing strategy is, “It works on my machine. ¯\_(ツ)_/¯”.