Big-bang migrations, in which all traffic is cut over to a new system once it is sufficiently close to feature parity with the legacy system, are inherently risky and almost always delayed. Risks arise from the complexity built up over time in the legacy system, as well as uncertainty about how it works and how it is used. The migration process itself presents a risk due to its sheer scale. Failure can easily be catastrophic, and a rollback becomes just as complex as the migration itself. A dual-running period with some kind of bidirectional data synchronisation can mitigate some of these migration-related risks, but it still locks up value in the new system until late in the development process.
This is an inherently Waterfall delivery approach. Not only does the new system need to reach feature parity (or thereabouts) before any value can be unlocked, but during this time fixes and new features are likely to be required on the legacy system too, duplicating work. What if the legacy system could instead be replaced piece by piece, with innovative new features reaching users within a few weeks?
The term “strangler pattern” was coined by Martin Fowler after he noticed the strangler fig growing around its host tree (a host it will eventually kill) during a hike through the rainforests of Queensland. Using this pattern, developers build new components around the legacy system, gradually migrating features as and when it makes the most sense for the business.
Where the pattern can be applied
This pattern is applicable whenever a complex system needs to be replaced. It’s a great approach when decomposing a monolith into a suite of microservices, and it also works well when migrating away from a complex SaaS solution. Regardless of whether the SaaS solution has a monolithic or microservices architecture behind the scenes, it is probably going to look like a monolith from the outside and is almost certainly going to be billed as one. Continuing to pay in full for a platform that is only partially used may not seem to make sense at first glance, but it is worth looking more closely at the business case.
The hosting and support costs for the new platform are likely to be very small when using a Serverless architecture, even more so while usage is low. Many SaaS solutions are billed on usage or on monthly active users (MAUs), and with the right migration strategy these metrics can be reduced gradually as the new system comes online. Unlocking new features can have massive business benefits that may completely outweigh the additional cost, and the hidden costs of a complex big-bang migration are likely to exceed even the cost of a generous dual-running period.
The pattern in action
The following real-world example, drawn from experience, illustrates the concept in technical terms. For the last two years, a specialist team has been gradually replacing parts of a large-scale Online Video Platform (OVP) for a global media company. The system stores information about TV content and allows that content catalogue to be monetised. There are many parallels with more generic eCommerce systems: there is a catalogue of products that users can browse, search and purchase; users have to sign up and share personal data and billing information that must be protected; and the experience is personalised throughout.
Placing a focus on personalisation
The first target feature area for innovation on this project was personalisation. Increasing personalisation was predicted to significantly boost user engagement and retention, thereby increasing revenue. The new features included resume-playback functionality and targeted content recommendations driven by machine learning. However, these kinds of features require large volumes of usage data (e.g. video plays, clicks, favourites), and at this point all of that data was stored in the existing platform.
The strategy in this case was to wrap the existing system early in development, so that usage data could be intercepted, and then to migrate the personalisation APIs as the new features became ready.
One of the first things the development teams did was stand up an API and persistence layer to handle this data and insert a middleware service in the path between the apps on end-user devices (Web, Mobile, TVs) and the existing backend. This is often referred to as the Backends For Frontends (BFF) pattern. The BFF provided the ideal place to implement a data fan-out, with usage data propagated to both the existing and the new systems in parallel.
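The following sketch shows what such a fan-out might look like inside a Node.js/TypeScript BFF built with Express. The endpoint path, payload shape and upstream URLs are illustrative assumptions, not the actual implementation.

```typescript
// Minimal sketch of a BFF data fan-out; endpoint, payload shape and URLs are assumptions.
import express from "express";

const app = express();
app.use(express.json());

const LEGACY_USAGE_URL = "https://legacy-ovp.example.com/usage";     // hypothetical
const NEW_USAGE_URL = "https://personalisation.example.com/events";  // hypothetical

// Usage events (video plays, clicks, favourites) arrive from the device apps.
app.post("/usage-events", async (req, res) => {
  const body = JSON.stringify(req.body);
  const headers = { "content-type": "application/json" };

  // Fan out: send the same event to the existing platform and the new
  // personalisation service in parallel. allSettled stops one failing
  // write from blocking the other.
  const [legacy, fresh] = await Promise.allSettled([
    fetch(LEGACY_USAGE_URL, { method: "POST", body, headers }),
    fetch(NEW_USAGE_URL, { method: "POST", body, headers }),
  ]);

  if (legacy.status === "rejected") console.error("legacy write failed", legacy.reason);
  if (fresh.status === "rejected") console.error("new-system write failed", fresh.reason);

  res.status(202).end();
});

app.listen(3000);
```

Because both systems receive every event from the moment the fan-out goes live, the new system quietly accumulates the usage history it needs before any user-facing feature depends on it.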
When the new personalisation APIs were ready for prime time, it was relatively easy to put them into service by releasing a new version of the middleware. By this point, enough data had been captured in the new system to provide the required features to end users without any bulk data migration, significantly reducing risk and complexity. Devices were migrated incrementally from the old to the new middleware version, device type by device type, as the new features were written into the client code. Because data continued to be sent to both the old and new backend services, the switch was seamless for end users.
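To picture the new middleware version, here is a hedged sketch of one read path served entirely from the new personalisation service; the route and URL are invented for illustration.

```typescript
// Sketch of one read path in the new middleware version; route and URL are illustrative.
import express from "express";

const app = express();

const NEW_PERSONALISATION_URL = "https://personalisation.example.com"; // hypothetical

// Resume-playback positions are served from the new system, which already
// holds enough captured usage data that no bulk migration is required.
app.get("/resume/:userId/:contentId", async (req, res) => {
  const { userId, contentId } = req.params;
  const upstream = await fetch(
    `${NEW_PERSONALISATION_URL}/resume/${encodeURIComponent(userId)}/${encodeURIComponent(contentId)}`
  );
  res.status(upstream.status).json(await upstream.json());
});

// Usage writes continue to fan out to both backends (as in the earlier sketch),
// which is what keeps the cutover seamless for devices still on the old version.

app.listen(3000);
```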
In this example, the strangler pattern enabled new features to reach end users in a fraction of the time a complete replacement of the existing platform would have taken. Furthermore, a risky big-bang data migration was avoided by implementing the data fan-out in the BFF layer.
Content management
The second feature area for innovation was content management. User focus groups had shown that enhanced navigation and search capabilities would significantly increase user satisfaction and engagement. It was vital that new features were added quickly to stay competitive, and replacing the existing system in its entirety was not feasible.
The strategy was to insert the new CMS into the flow of data between the existing system and the applications on end-user devices, so that the catalogue data could be augmented and new features launched quickly. The existing system would become fully wrapped by the new CMS in a later phase.
In the first phase, a new CMS and search engine were built that polled data from the API of the existing system at a regular interval. As soon as the new browse and search APIs were ready, they were integrated into the BFF layer and put into live service, even though the new CMS UI did not yet allow editors to make changes to the data. Over the next two years the team followed a strategy of only adding features in the new CMS: when a user-facing feature required additional data, the ability to create that data was built into the new CMS.
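A rough sketch of that first-phase polling ingest is shown below; the legacy endpoint, the document shape and the persistence helpers are assumptions made purely for illustration.

```typescript
// Sketch of the first-phase ingest: poll the legacy OVP catalogue API on an
// interval and upsert the results into the new CMS store and search index.
// Endpoint, document shape and helpers are illustrative assumptions.

const LEGACY_CATALOGUE_URL = "https://legacy-ovp.example.com/catalogue"; // hypothetical

interface CatalogueItem {
  id: string;
  title: string;
  updatedAt: string;
}

// Placeholders standing in for the new CMS persistence layer and search engine client.
async function upsertIntoCms(items: CatalogueItem[]): Promise<void> { /* ... */ }
async function upsertIntoSearchIndex(items: CatalogueItem[]): Promise<void> { /* ... */ }

async function pollOnce(): Promise<void> {
  const response = await fetch(LEGACY_CATALOGUE_URL);
  if (!response.ok) {
    console.error(`catalogue poll failed: ${response.status}`);
    return;
  }
  const items = (await response.json()) as CatalogueItem[];
  // The legacy system stays the source of truth in this phase; the new CMS
  // simply mirrors it so the new browse and search APIs can go live early.
  await upsertIntoCms(items);
  await upsertIntoSearchIndex(items);
}

// Poll on a regular interval (every 5 minutes here, purely as an example).
setInterval(() => void pollOnce().catch(console.error), 5 * 60 * 1000);
```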
Eventually the team started work on migrating the remaining editorial functionality from the existing system. To make this an iterative process and avoid a big-bang migration, they set up a two-way sync between the systems. This meant that an editorial user could make a given change in either system and it would be reflected in both, so no rollbacks would be required. It also meant that there was no pressure to immediately migrate the other consumers of data from the existing system.
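A highly simplified sketch of how such a two-way sync handler might be structured follows; the change-event shape, the echo-suppression tag and the update helpers are assumptions, not the team's actual design.

```typescript
// Sketch of a two-way sync handler. Echo suppression via a "syncedBy" tag is an
// assumed design choice to stop a change bouncing between the two systems forever.

type System = "legacy" | "newCms";

interface ChangeEvent {
  entityId: string;
  payload: Record<string, unknown>;
  origin: System;     // which system the edit was made in
  syncedBy?: "sync";  // set when the write was performed by the sync itself
}

// Placeholders for writing into each system's API.
async function applyToLegacy(change: ChangeEvent): Promise<void> { /* ... */ }
async function applyToNewCms(change: ChangeEvent): Promise<void> { /* ... */ }

export async function onChange(change: ChangeEvent): Promise<void> {
  // Ignore changes that the sync itself wrote, otherwise the two systems
  // would endlessly replay each other's updates.
  if (change.syncedBy === "sync") return;

  const mirrored: ChangeEvent = { ...change, syncedBy: "sync" };

  // Propagate the edit to whichever system it did not originate in, so an
  // editorial user can work in either UI and both stay consistent.
  if (change.origin === "legacy") {
    await applyToNewCms(mirrored);
  } else {
    await applyToLegacy(mirrored);
  }
}
```

The central concern in any design along these lines is preventing a mirrored write from being synced back again, which is why the handler above tags and then ignores its own writes.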
Here, as in the previous example, the strangler pattern enabled much faster time to market for new features. Using this pattern also avoided two big-bang migrations: the migration of apps on client devices from the old to the new browse and search APIs, and the migration of editorial users from the content management UI in the OVP to the new CMS UI.
Rapidly bringing new features to market
While a number of organisations will remain steadfast in their commitment to a big-bang migration, it is always worth digging deeper into the details to find ways of making the process more iterative. As the real-life example above shows, detractors should look again at the strangler pattern as the starting position in any conversation about replacing a complex system. With this pattern, organisations are able to remove ageing components and functionality at their desired pace. Most importantly, organisations can bring new features to market more quickly and gain a competitive edge.
Chris Birkinshaw, Technology Principal, Merapar an Alfa1 Company