Audience-focused approach to architecting a build pipeline

Ivan Melnikov
14 min read · Oct 12, 2021

All I want for Christmas is for Apple to not buy our CI/CD provider, drop Android support, and force us to look elsewhere. — Mariah Carey, December 2017, most likely

Abstract

This article walks through my first major project after finishing up undergrad — redoing a CI/CD pipeline because of extenuating circumstances for an ever-growing team, and ever-evolving mobile application.

Although I specifically mention which technologies we ended up using towards the end, this piece is meant to describe a platform-agnostic approach to creating your CI/CD pipeline. The main benefit is to allow for flexibility when changing build providers, should the need for that arise in the future.

The bulk of the work centered around figuring out the audiences/users of the build pipeline, as well as their requirements. Everything else stemmed from there.

The process described here targets the Android mobile platform, however, the paradigms may be applied to web, embedded, and backend projects as well.

Required disclaimer — I’m not sponsored, nor endorsed by any of the companies mentioned in the article. I do have my personal preferences, however, they are just that, and do not reflect my past, present, or future employers.

Now there’s an idea…

Introduction

If you are not familiar with the concept of a CI/CD pipeline, or are one of my non-CS friends reading this (hello!), then this section will shed some light on the topics at hand.

Let’s say you work in the field of medicine. Before a new vaccine or treatment is approved for general use, it needs to be properly developed, validated, and then tested. Similarly, a majority of schools will receive new syllabi only after a select few test them out, and various administrators are happy with the results. In software engineering, this translates to making sure a project passes a multitude of tests, builds properly (on some platform, referred to as a build provider here), and runs as expected.

The abbreviation CI/CD actually stands for three different things. For the sake of brevity and everyone’s sanity, no one says CI/CD/CD.

Continuous Integration — The first part takes care of making sure that each change submitted by developers is successfully added back to the project. For example, let’s say a developer is changing a button from red to blue, its text from “Log in” to “Sign in”, and its typeface from Arial to Papyrus. Normally, this would be one big revision, but I’m stretching this out for demonstration purposes. CI has to make sure that each change — first color, then text, then typeface — does not break the project for everyone else. Although each developer may check for build failures on their own machines, this isn’t a reliable way of making sure everyone gets the same end result.

Continuous Delivery — This is the next step in the pipeline. This part ensures that after every revision mentioned above, the project is delivered to relevant stakeholders. Three new builds will be sent out either to QA, alpha/beta testers, product management teams, or straight to customers. Everyone will be able to get three different projects: one with a blue button, then a blue button that says “Sign in”, and finally a button with a terrible typeface.

Continuous Deployment — This is the more automatic complement to Continuous Delivery: assuming everything went well in the previous two steps, the project is sent out to customers with very little human intervention. With a fully automated process, there is no concept of a release day — every single change that successfully passes through the pipeline is immediately shipped to customers.

💡 The distinction between Delivery and Deployment is not very clear-cut in the mobile space. Both the Apple App Store and Google Play review app updates manually, a process which can take days. In this sphere of software engineering, Delivery most likely means sending the project out to QA (and/or internal testers), and Deployment means sending the project out to the general public. A hopefully helpful diagram:

CI/CD steps for a mobile project.

Why the rework?

Before the overhaul, our build “pipeline” consisted of the following setup:

  1. Open a pull request (a set of changes to submit into the codebase) and kick off a build. This would take 30 minutes.
  2. Our build provider would invoke a bash script, which in turn invoked Gradle to lint, test, and build our app.
  3. The resulting binary was sent to QA, validated, and, upon a successful test of a feature, the code change would be submitted.

The most advanced logic in all of this was an if-else check in the bash script for the branch name — if it was “playstore-release”, then we’d build a release version. 🎉 I don’t think said branch was ever created. There were multiple issues with this process:

  1. Feedback time for developers was extremely long. If a developer missed something as small as a stray space, the linter would only flag it towards the end of the build process. Fixing it required a new commit, and a new build.
  2. QA tested one feature at a time, which meant that there was no guarantee that developer #2’s change would not break developer #1’s feature — tests were very granular, and not end-to-end.
  3. No automated process for Continuous Delivery existed, and there was no Continuous Deployment to be found.

This Neanderthal solution, albeit somewhat satisfactory in the beginning, would not scale with the ever-growing number of developers, the upcoming alpha program launch, and the expansion of our QA team. To add insult to injury, Apple acquired our build provider Buddybuild around Christmas 2017, further forcing our hand.

Determine your audiences

We started the overhaul by determining who would be using the build pipeline, and worked backwards towards build configurations in our project. Developers were the first audience, followed by our internal QA team. With the alpha program launching soon, external testers were the third group. Production (end customers) was not even considered at this stage — it was important for us to not overcomplicate things in the beginning.

Taking a look at each audience and their requirements yields the following delineations in configurations:

Developers — Builds are completely open, for internal use only. Logging is very verbose, all cutting-edge feature flags are either enabled, or at the very least exposed for preliminary testing. Custom debug overlays, tools (like LeakCanary) are enabled. On the flip side, 3rd party integrations (like QA reporting tools) are disabled. These may be enabled for testing, but are generally reserved for information originating from QA teams.

Staging (QA team) — Builds are still mostly open, for internal use only, but logging may be less verbose. Logging statements not inherently useful for QA are disabled. Only pertinent feature flags (ones that are at least code complete and won’t crash the app) are enabled, but the ability to toggle other flags is present. Custom developer overlays are disabled, but debugging tools are still enabled for bug reporting. Finally, QA integrations are enabled for formal bug tracking processes.

Alpha — Builds are now external (to the team), logging is capped at the error level. Access to toggling feature flags is disabled, with only the completed/validated ones being turned on. Debug overlays and monitoring tools are disabled. There is now a formalized bug reporting process followed via QA integrations.

Initial breakdown of audiences and their requirements
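The audience breakdown above can be pinned down in code. Here is a minimal Kotlin sketch of the idea — the names (`Audience`, `BuildPolicy`) are invented for illustration, not taken from our actual project:

```kotlin
// Hypothetical model of the audience/requirement table described above.
enum class Audience { DEV, STAGING, ALPHA }

data class BuildPolicy(
    val verboseLogging: Boolean,        // full logs vs. error-level only
    val featureTogglesExposed: Boolean, // can this audience flip flags?
    val debugOverlays: Boolean,         // custom developer overlays/tools
    val qaReporting: Boolean,           // formal QA bug-reporting integrations
)

fun policyFor(audience: Audience): BuildPolicy = when (audience) {
    Audience.DEV     -> BuildPolicy(true, true, true, false)
    Audience.STAGING -> BuildPolicy(false, true, false, true)
    Audience.ALPHA   -> BuildPolicy(false, false, false, true)
}
```

Each build configuration would pin exactly one `Audience` value, so the rest of the app branches on a single policy object instead of raw build names.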

Naturally, there is leeway in determining what is enabled/disabled for each audience. We drew a harder line between internal and external teams in terms of how much information we want to expose in logs and error messages. Although it is possible to keep everything enabled across audiences, this may result in reporting overload with lots of duplicate bugs.

Since we now had our main audiences and their basic requirements in place, we could create three different build configurations: dev, staging (for QA), and alpha. Color coding the builds like in the diagram above, we can see how each audience primarily works with only one type of build.

Example path of a build from developers to alpha testers

A build’s progression from developers to customers will have timelines that highly depend on the needs of the business. For example, QA may get a new build from developers once a week or once a month. After several weeks or months of testing, QA may give the green light to push to alpha (or customers). Another example could be that all of this happens in one week, and Thursdays are reserved for deployments to customers.

You may have noticed these diagrams are starting to look a lot like Git branches — more on that next.

Adding complexity

Branching strategy

It is important to keep track of your changes, and understand where in the pipeline your build is. A git branching strategy is very useful for this. Think of this like Google Docs for code, with everyone’s changes to the same paragraph being applied in an organized manner. Much like “we work using a form of agile”, we used a form of Gitflow to track our changes and builds.

A simplified version of Gitflow, and our branching strategy. For simplicity, imagine alpha is the customer here.

A brief overview of the process is the following:

  1. General work may go into the develop branch.
  2. When required, a release branch (each release gets its own) is created, and work relevant to that release only goes into that branch. Parallel to this, work for release(N+1) is added to the develop branch.
  3. Release-specific branches output builds for QA to test. If an emergency fix is required, a developer submits the change straight into this branch, and a new build is automatically sent to QA.
  4. After QA approval, a release build is created from the release branch, launched to customers, and the develop branch is brought up to speed with any missed emergency fixes.

This is what multiple git whiteboarding sessions looked like in the end. See the develop branch? Me neither.

Data buckets

I’ve mentioned bug reporting and monitoring tools earlier in the article, and this section expands on so-called data buckets. These are destinations that various tools send data to from different stages in the pipeline.

We were not only working with external hardware devices, but also with multiple cloud environments. As a result, our build audiences didn’t always line up across teams, resulting in more refined bucket splitting on our part.

Analytics — Starting with the easiest integration, developers did not use analytics platforms past basic testing. The above-normal error and crash rates seen during development may pollute dashboards with useless information. To prevent this, despite low usage, developers had their own analytics bucket; so did QA, alpha, and production.

Monitoring — Monitoring tools followed a similar pattern as analytics. However, the boundary between internal and external audiences (alpha and up) was more strongly enforced. Our QA team got their own bucket, with alpha and beta (added later on) sharing another. Production got their own bucket as well due to privacy-centered reporting requirements. Bug reporting tools had a similar setup, with external audiences following a much more formal filing process.

API keys — These were very interesting to set up to better align with our cloud team and allow communication with our hardware products. There were only two cloud environments that the app could communicate with for login — dev and prod. Everything up until certain QA builds (for example, we later added rcStaging [release candidate]) used the dev keys. Past rcStaging everything used production keys, making it easy to figure out if something was a cloud issue or not.

💡 Many more tools required API keys to function, and the bucket mappings were not always 1-to-1. The question we kept asking ourselves during the process of choosing destinations was always “Are these builds intended for internal or external audiences?” and going from there.
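The bucket routing described above can be sketched in a few lines of Kotlin. The bucket names here are invented, but the split mirrors the one we used — QA on its own, alpha/beta sharing, production isolated, and only two cloud environments:

```kotlin
// Hypothetical sketch of per-audience bucket routing; names are illustrative.
fun monitoringBucket(audience: String): String = when (audience) {
    "dev"           -> "monitoring-dev"      // developers' own bucket
    "staging"       -> "monitoring-qa"       // QA team's bucket
    "alpha", "beta" -> "monitoring-external" // external testers share one
    "prod"          -> "monitoring-prod"     // isolated for privacy requirements
    else            -> error("Unknown audience: $audience")
}

// Only two cloud environments existed for login; regular QA builds still used
// dev keys (later rcStaging builds flipped to prod).
fun cloudEnvironment(audience: String): String =
    if (audience == "dev" || audience == "staging") "dev" else "prod"
```

Keeping this mapping in one place made the “internal or external?” question from the tip above easy to answer consistently.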

Technical details

This section describes the technical side of our build pipeline. We strove to create as much of a platform-agnostic approach as possible. For example, the environment variables local builds pulled were the exact same as the ones found on our build provider. The service would simply overwrite the values. This made it easier to substitute keys locally, or to potentially deploy our project on a new service in the future.
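A rough sketch of that mechanism in a Gradle Kotlin DSL build script — the key name below is illustrative, not our real one. Locally the variable is absent and the field stays blank; on the build provider the same lookup picks up the injected value:

```kotlin
// Illustrative key name; on CI, SEGMENT_WRITE_KEY is set as an environment
// variable and silently overrides the blank local default.
val segmentKey: String = System.getenv("SEGMENT_WRITE_KEY") ?: ""

android {
    defaultConfig {
        // Escaped quotes: buildConfigField expects a literal Java expression.
        buildConfigField("String", "SEGMENT_WRITE_KEY", "\"$segmentKey\"")
    }
}
```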

Infrastructure

Buddybuild was our previous build provider. We started looking for alternatives when Apple bought the service and began winding down Android support. Some candidates were Bitrise, Jenkins (on- and off-premises), Visual Studio App Center, CircleCI, and Travis CI. In no particular order, most of these were dropped for one reason or another (this work was done in early 2018; we are sure things have since changed):

  • Poor support for branching and inefficient configuration requirements for API keys.
  • No support for encrypted blob storage for signing keys.
  • Lack of iOS build support (we had two native apps).
  • Simply slow builds, lack of customer support in case of need.
  • Cost of maintenance and setup, poor custom workflow options.
  • Outdated approach to CI/CD — by the way, a good read on this here.
  • To people wondering — GitHub Actions didn’t exist back then! 😅

We ended up going with Bitrise. I would be lying if I said their Docker CLI tool didn’t play a significant role in our decision. The fact of the matter is that testing a CI/CD pipeline is frustrating. If you’re testing step 8, and things fail because of a typo on step 3, you’ll end up needing to kick the build off again. With the CLI tool, we were able to cut down on iteration time and debug build issues locally using Docker containers.

Bitrise would be the only source of production keys in our entire workflow — using its blob storage, the keystore would be downloaded to the project folder, and Gradle would use it to sign the builds. All other builds used the debug keystore, stored in the GitHub repo. We needed a universal debug keystore since 3rd party integrations relied on consistent application signatures, and this was way easier than adding a new entry for every developer’s device.
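That arrangement can be sketched as a Gradle Kotlin DSL `signingConfigs` block. The file names and environment variable names below are placeholders, not our actual values:

```kotlin
android {
    signingConfigs {
        create("release") {
            // Pulled from Bitrise's encrypted file storage right before the
            // build; this file never lives in the repo or on dev machines.
            storeFile = file("release.keystore")
            storePassword = System.getenv("KEYSTORE_PASSWORD")
            keyAlias = System.getenv("KEY_ALIAS")
            keyPassword = System.getenv("KEY_PASSWORD")
        }
        getByName("debug") {
            // Shared debug keystore, checked into the repo so every developer
            // (and every third-party integration) sees the same signature.
            storeFile = file("keystores/shared-debug.keystore")
        }
    }
}
```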

Audiences

We used product flavors to configure build targets for our audiences. A shortened version of this can be seen below. Although one may now use custom build types for this, that was not the case back then — only debug and release values were allowed.

A sample of our product flavors

For people new to Android, if you combine the flavor names above with build types (debug, release), you get application variants — alphaDebug, stagingRelease, devRelease. These are the actual targets to build. Because application IDs and versionNames had suffixes, it was easy to see where errors were coming from: my.awesome.app-alpha, or my.awesome.app-dev.
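For illustration, a flavor block along the lines of the sample above might look like the following in today’s Gradle Kotlin DSL (the exact names and suffixes are invented; note that application IDs cannot contain hyphens, so dot suffixes are used here):

```kotlin
android {
    flavorDimensions += "audience"

    productFlavors {
        create("dev") {
            dimension = "audience"
            applicationIdSuffix = ".dev"   // e.g. my.awesome.app.dev
            versionNameSuffix = "-dev"     // e.g. 1.2.3-dev
        }
        create("staging") {
            dimension = "audience"
            applicationIdSuffix = ".staging"
            versionNameSuffix = "-staging"
        }
        create("alpha") {
            dimension = "audience"
            applicationIdSuffix = ".alpha"
            versionNameSuffix = "-alpha"
        }
    }
}
```

Combined with the debug and release build types, this yields the application variants mentioned above (devDebug, stagingRelease, and so on).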

Keys and integrations

Starting with the easiest integration, LeakCanary was used for memory monitoring. It was configured via a flag found in a Gradle properties file, and was only enabled for dev builds.
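One way to wire that up — sketched here with an invented property name and present-day LeakCanary coordinates, not our 2018 setup:

```kotlin
// Only dev builds pull in LeakCanary, and only when the flag is enabled
// in gradle.properties (enableLeakCanary=true).
dependencies {
    if (providers.gradleProperty("enableLeakCanary").orNull == "true") {
        "devImplementation"("com.squareup.leakcanary:leakcanary-android:2.12")
    }
}
```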

A lot of our other integrations, like TestFairy, Firebase, and Segment, relied on API keys found in operating system environment variables (yeah, we know, lessons learned). Developer machines only had dev-level keys, with all others left blank. We could have configured Gradle to only sync specific application variants, but we didn’t because of…

Developer machines would have empty keys; Bitrise would populate all of them.

If we combine what we talked about in this article with regards to audiences, their requirements, and the branching strategy, we get the following merged chart:

A simplified, but practical CI/CD pipeline

The whole pipeline looks much more organized than the original bash script. One giant win we saw with this segmentation was that our builds now took 12 minutes instead of 30. We can attribute this to multiple factors:

  • We built and tested only what we needed at each step — if Gradle already ran unit tests for one target, we would not ask it to run tests for another one. Similar logic applied to any and all code analysis tools.
  • Instead of using bash scripts, we used Bitrise’s Gradle wrappers to run our commands. This made it easier to have the step (and hence, the build) fail earlier on in the process — and not error out with a missing space after running 1000+ unit tests.
  • Bitrise most likely had more powerful offerings than Buddybuild, and if someone can please confirm/deny this — that would be great, I couldn’t find spec listings for the latter.

Obfuscation

I’d like to bring up a rather specific addendum to the build pipeline, but an interesting one nonetheless. Customers get an obfuscated version of Android applications — code is turned to gibberish, with comments and other unused lines stripped out to reduce the final package size. We encountered instances where this process of minification would break our project.

Since we rewrote our mobile app in Kotlin a week after Google I/O 2017, we ran into early adopter issues like this one. Our builds would either break, or crash randomly at runtime. We realized that it’s a good idea to build an almost customer-perfect target early in the pipeline. This way, after a full test of the unminified version, QA can perform a light smoke test of the minified target to catch these elusive bugs. In our case, our QA branches built prod-ready builds for these checks:

prodRelease was simply our production target that we sent out after alpha gave the green light.

Closing thoughts and lessons learned

If you’re in the same or a similar position to where we were in 2018, and are not sure where to start, hopefully this article helps. Start with your audiences (any and all users of your build pipeline), figure out their requirements, and work towards creating build configurations.

This was one of the most educational projects I’ve worked on, with many lessons learned:

  • Start simple. If you have multiple testing teams, give them all one configuration to start with. Developers should get their own target as well, and maybe something to launch to external teams. Once you’ve figured out all requirements and the intricacies of the build configurations, you can start adding complexity. We didn’t create our production target until weeks later, and the changes with alpha were minimal.
  • We put API keys into our system environments, and I’m not sure why we did this — hindsight is 20/20. Use property files, don’t check those in, and have the CI system overwrite all the necessary values for production during the build process.
  • Keep unique identifiers just that — unique for each audience. This will help with bug triage in the future. Some examples are custom version names like 1.1.1-alpha, print statements displaying build number in the logs, or having different app launcher icons. Just don’t forget to clean things up for production…
  • Automated daily builds are helpful not only for project stability and testing, but also for unprecedented failures in 3rd party services. For example, it was easy for us to know when JCenter was down, or when Jitpack had a glitch and we weren’t able to pull required dependencies.
  • I learned Gradle! The hard way! By breaking our project more times than I can remember! Also that Xcode doesn’t hold a candle to the Gradle build system…but that’s neither here nor there 😉


Ivan Melnikov

Android Software Engineer @ Google. CS and Nuclear Engineering by education, one is just…way less explosive.