Feature Flags and Test-Driven Design: Some Practical Tips
“Fork in the road” by Bs0u10e0
In 2018, our team spent a lot of time working with feature flags and test-driven design (TDD). Our goal was to effect an architectural change to our system: changing the source of truth of some data and moving it out of the database owned by a legacy monolith into a new database controlled by a new microservice. However, much of the code requiring the data would remain in the monolith.
Some examples of the types of things we feature-flagged are:
- whether to go down a refactored code path or not;
- whether to publish messages to a message queue when a certain event occurred;
- how to publish those messages (we tried multiple variations of batching and transaction boundaries to achieve acceptable performance);
- whether to just delete messages at the receiving end or actually handle them; and
- whether to use a local source of data or a remote one.
We were working on a pretty important piece of code; the kind of business function where, if we stuffed it up, someone would probably have to spend several days doing remedial fixups or making phone calls to chase up millions of dollars. Hence, most of the feature flags we put in were to ensure we could test new code paths in production and roll back safely and quickly. Some of them we even deliberately switched on for 10 minutes, let the new code run, shut them off again and then looked at the results offline. From memory, all of these flags were simple on/off switches, although we do have the capability to do per client/customer/category flags when we need to. These tips should work for either type.
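To make the two kinds of flag concrete, here is a minimal Python sketch of a flag store with global on/off switches and optional per-client overrides. It is an illustration only, not the code our system used: the `FeatureFlags` class, its method names, the flag name and the client ID are all hypothetical, and a real store would more likely be backed by configuration or a database than by in-memory dictionaries.

```python
from typing import Optional


class FeatureFlags:
    """Hypothetical in-memory flag store (illustration only)."""

    def __init__(self):
        # Global on/off switches, keyed by flag name.
        self._global = {}
        # Optional per-client overrides, keyed by (flag name, client id).
        self._per_client = {}

    def set(self, name: str, enabled: bool, client: Optional[str] = None) -> None:
        if client is None:
            self._global[name] = enabled
        else:
            self._per_client[(name, client)] = enabled

    def is_enabled(self, name: str, client: Optional[str] = None) -> bool:
        # A per-client override wins over the global switch.
        if client is not None and (name, client) in self._per_client:
            return self._per_client[(name, client)]
        # Unknown flags default to off, so new code paths stay dark.
        return self._global.get(name, False)


flags = FeatureFlags()
flags.set("publish_to_queue", True)                   # simple on/off switch
flags.set("publish_to_queue", False, client="acme")   # per-client override
```

Defaulting unknown flags to off is a design choice in this sketch: a mistyped flag name leaves the old code path running rather than silently enabling the new one.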
Because we used so many different feature flags in a relatively short period of time, we got to experiment with a number of different patterns and see their effects, both good and bad, and we developed the following set of guidelines.
For context, in this particular app, our oldest and biggest monolith, most classes have a 1:1 unit test. That’s no longer our preferred practice, but it’s a legacy we live with when working in this app. It also has a large collection of integration tests.
Goals for Feature Flags with TDD
Archer by Unknown
There are a couple of goals that we were aiming for as we did this work:
- We want to make sure all currently tested paths that the flag affects are tested with the flag on and off.
- We don’t want our test suites to become too large and/or hard to understand because of feature flags.
- We want a git history that’s easy to understand despite the temporary complexity of a feature flag.
- Ideally, deleting a feature flag will be a quick task that requires no in-depth reading of the code or tests.
Feature Flags and TDD: Our Five Guidelines
1. Make a copy of every affected test
The first step to ensuring tests don’t get unwieldy is not to try to test all the feature flag ON and OFF cases in one test. So: don’t open up the existing test file and start adding new “Feature X ON” test cases. Instead, make TWO test files: one for flag on, one for flag off. Yes, we’ll probably have a bunch of duplicated test cases that are exactly the same in both files, maybe even most of them, but this is only temporary.
There are three benefits to this approach. First, we’ve avoided creating a confusing monstrosity of a test with flag on, flag off, and flag-agnostic tests all mingled in together. Second, once we’ve decided to keep the flag on permanently, we can just delete the whole “flag off” test file. “Simples!” Third, we can do the setup for the flag once at the top of each test file, rather than having to set it at the start of each test case and hoping people notice that detail when reading the test.
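As a rough illustration of this layout, the Python `unittest` sketch below shows the two suites side by side. Everything in it is hypothetical: the `FLAGS` dictionary stands in for a real flag store, `lookup_balance` stands in for the code under test, and the two classes would each live in their own test file (they share a block here only so the example is self-contained).

```python
import unittest

# Hypothetical flag store and code under test, inlined for the sketch.
FLAGS = {"use_remote_source": False}
LOCAL_DATA = {"acct-1": 100}
REMOTE_DATA = {"acct-1": 100, "acct-2": 250}


def lookup_balance(account_id):
    source = REMOTE_DATA if FLAGS["use_remote_source"] else LOCAL_DATA
    return source.get(account_id)


class BalanceLookupTest(unittest.TestCase):
    """File 1: the enduring suite, exercising the flag ON behaviour."""

    def setUp(self):
        # The flag is set once for the whole suite, not in each test case.
        self._previous = FLAGS["use_remote_source"]
        FLAGS["use_remote_source"] = True

    def tearDown(self):
        FLAGS["use_remote_source"] = self._previous

    def test_known_account(self):
        self.assertEqual(lookup_balance("acct-1"), 100)

    def test_account_only_in_remote_source(self):
        self.assertEqual(lookup_balance("acct-2"), 250)


class BalanceLookupPreRemoteSourceTest(unittest.TestCase):
    """File 2: the flag OFF suite, deleted along with the flag."""

    def setUp(self):
        self._previous = FLAGS["use_remote_source"]
        FLAGS["use_remote_source"] = False

    def tearDown(self):
        FLAGS["use_remote_source"] = self._previous

    def test_known_account(self):
        # Duplicated from the other file on purpose; this is temporary.
        self.assertEqual(lookup_balance("acct-1"), 100)

    def test_account_missing_from_local_source(self):
        self.assertIsNone(lookup_balance("acct-2"))
```

The duplicated `test_known_account` case is the price of the approach: it looks wasteful, but it disappears when the flag-off file is deleted.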
2. Make the copied file the test for the “Old” functionality, and the original file the test for the “New”
When we copy the existing test to a new file, we want the new file to hold the tests for the old functionality, with the feature flag off (e.g. cp FooTest FooPreFeatureXTest), and we change the original test file to hold the test cases for when the flag is on. This helps to keep a contiguous SCM history for the enduring test suite, showing how the functionality progressed.
The alternative is to put the “feature flag on” test cases in the new file and then, when we get rid of the feature flag, delete the original test and rename the “new” one to the “normal” one (e.g. mv FooWithNewFeatureXTest FooTest). We found problems with this latter approach when we looked at our git history: it would often show the whole test being deleted then re-created, which makes it much harder to inspect what changes were made to the tests when the new change was introduced.
3. At the end of any test that enables a feature flag, always set the flag back to what it was before the test started
This is important because, if our flags are stateful in a way that isn’t automatically reset between tests, tests that don’t explicitly turn the flag on or off can end up running with the flag on, depending on the order in which our tests are executed. For most tests this won’t make a difference, but for some it will, and we could get build failures that don’t make sense, because nothing in the failing test’s own code shows that another test left the flag on before it ran.
Note also the careful wording of this rule: we don’t always disable the flag at the end of the test; we set it back to what it was before the test. Being disciplined in resetting the flag is crucial to making the next tip work.
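One way to make this rule hard to forget is a small helper that remembers the previous value and restores it even if the test fails. The sketch below uses Python’s `contextlib.contextmanager`; the `FLAGS` dictionary and the `flag_set_to` helper are hypothetical names, and the sketch assumes flags live in mutable global state.

```python
from contextlib import contextmanager

# Hypothetical global flag store (illustration only).
FLAGS = {"new_code_path": False}


@contextmanager
def flag_set_to(name, value):
    """Temporarily set a flag, then restore whatever value it had before."""
    previous = FLAGS[name]
    FLAGS[name] = value
    try:
        yield
    finally:
        # Restore the earlier value rather than forcing the flag off, so a
        # build-wide "flag ON" run is still ON after this test finishes.
        FLAGS[name] = previous
```

A test would then wrap its body in `with flag_set_to("new_code_path", False): ...`, and the flag returns to its earlier state whether the body passes, fails or raises.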
4. Run the whole build with the feature flag ON
Once we think we’ve created pre-feature and post-feature versions of all the test suites that we believe are affected by the feature flag, we need to run the whole build with the flag ON. This will flush out tests that are affected by the flag where we haven’t yet created a divergent suite. If we’ve been sufficiently comprehensive in our test modifications, nothing should fail, because all tests of paths that care about the flag should already be explicitly setting the flag in the test.
In a simple microservice, we’re unlikely to find anything with this step, although it should be quick to run, so it’s still worth doing. On the other hand, in a complex monolith with many components, it’s quite possible that there are parts of the system that we’ve forgotten rely on the behaviour that we’ve just changed, and their integration tests may well fail as a result of the changes. We really want to find these transitive breakages before switching on the feature flag in a production system. If we don’t, we actually end up running an untested version of that dependent component in prod.
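How the flag gets forced on for a whole build depends on the flag system in use. As one possible approach, the Python sketch below lets an environment variable override the default flag values before the test run starts. The `FLAGS_FORCE_ON` variable and the `load_flag_defaults` function are hypothetical names invented for this example, not part of any real library.

```python
import os


def load_flag_defaults(defaults, environ=os.environ):
    """Return the flag defaults with any flags named in FLAGS_FORCE_ON enabled.

    FLAGS_FORCE_ON is a hypothetical, comma-separated environment variable.
    """
    raw = environ.get("FLAGS_FORCE_ON", "")
    forced = {name.strip() for name in raw.split(",") if name.strip()}

    unknown = forced - set(defaults)
    if unknown:
        # Fail fast on typos so the build can't quietly run with the flag off.
        raise KeyError(f"unknown feature flags: {sorted(unknown)}")

    return {name: enabled or name in forced for name, enabled in defaults.items()}
```

With something like this in place, a full run with the flag on might look like `FLAGS_FORCE_ON=use_remote_source python -m unittest`. It only tells us anything if every flag-aware test restores the previous value afterwards, as described in guideline 3.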
5. Switch on and delete feature flags ASAP
Feature flags are a device for assisting in controlled, reversible migration in production systems, but they also complicate the code. We want our code to be simple, so when there are feature flags in place, we make it a high priority to get them switched on in production. If we don’t prioritise switching them on, we can’t prioritise deleting them. This becomes especially taxing if we find the need to put other feature flags in the same area and start getting feature flag interplay.
The instant that a migration is completed in prod and we’re happy that the functionality should remain, the feature flags become technical debt. We want to remove them as soon as possible afterward so that we revert our code to being as simple as it can be.
Yo, what about the Strategy pattern?
“Strategy pattern in UML” by Jason S. McDonald
When I shared some of the above tips on Twitter, a few people responded that they like to use the Strategy pattern when they’re doing feature flagging.
While I’m a big fan of the Strategy pattern, I would generally avoid it when feature flagging. My reasoning is that feature flagging is ideally a temporary measure, whereas the Strategy pattern is a design construct for supporting multiple implementations. While feature flagging often results in multiple implementations for a short time, I don’t think it’s worth introducing a design construct for this short period, only to delete it again shortly after, once the flag is removed. So we generally just used an if/else in all places that rely on the flag, knowing that the “else” branch will be removed shortly. We would probably make an exception to this if the feature flag pertained to a large or complex piece of code that was being almost entirely re-written, and it was clear that readability would be greatly improved by having the two implementations in separate classes.
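For comparison, the plain if/else style looks something like the Python sketch below. The flag store and the two fetch functions are hypothetical stand-ins; the point is only that removing the flag later means deleting one condition and one branch, with no interface or extra classes to unwind.

```python
# Hypothetical flag store and data-access functions (illustration only).
FLAGS = {"use_remote_source": False}


def fetch_from_local_db(account_id):
    return {"id": account_id, "source": "local"}


def fetch_from_service(account_id):
    return {"id": account_id, "source": "remote"}


def get_account(account_id):
    if FLAGS["use_remote_source"]:
        # New path: this is what remains once the flag is removed.
        return fetch_from_service(account_id)
    else:
        # Old path: delete this branch and the flag check after the migration.
        return fetch_from_local_db(account_id)
```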
How do YOU do Feature Flags and TDD?
These are the main guidelines that our team developed through almost a year of feature-flagging our way through a complex piece of migration work. If you’ve got other tips for how to work with feature flags, particularly in a TDD environment, I’d love to hear about them in the replies.
This article was originally published on Graham’s blog “Evolvable Me”. Visit the blog for more articles about software development and to sign up for notifications of new articles by Graham.
You can follow Graham on Twitter at @evolvable.
Feature Flags and Test-Driven Design: Practical Tips was originally published in Hacker Noon on Medium, where people are continuing the conversation by highlighting and responding to this story.