You have probably heard about multi-cloud in the last year or so, and for good reason. According to IBM’s State of Multicloud report, 30% of organisations are using three or more clouds, and multi-cloud strategies are rapidly becoming an established part of the digital playbook: 21% of businesses overall describe multi-cloud as their ‘most important initiative’.
We hear plenty about its potential for getting apps into production quickly. But what about doing the same for data? When it comes to cloud data warehouses (enterprise systems used for storing data from various sources), two or more clouds are often better than one, given the variety and complexity of data handled today.
A multi-cloud architecture can present any number of combinations of cloud platforms and cloud data warehouses (CDWs), and there are a multitude of factors that might compel an organisation to set up its infrastructure in this way. However, there are several common use cases and reasons that we see every day:
- Meeting regional needs – The need to fulfil data compliance and sovereignty requirements, as well as the freedom to scale up and down in the different markets that make most sense for your business
- Data recovery – Multi-cloud architectures can safeguard against cloud outages, disasters and other unexpected downtime, because your data is stored across multiple platforms and warehouses
- Vendor diversification – Multi-cloud allows greater flexibility for your organisation should a vendor’s pricing, storage or compute offerings change, or should your own demand for data shift. It’s also well known to be a boon to businesses trying to avoid vendor lock-in, especially those tied to legacy, data-intensive applications and platforms
- Different tools for different needs – Data and IT teams are not homogeneous; they contain people with a whole range of experience and preferences, using different tools, technologies and datasets to support different lines of business, so a diversity of cloud environments is key to meeting their needs.
However, multi-cloud isn’t a silver bullet solution and, like any technology, comes with its own data risks and challenges to navigate. Without a solid underlying strategy for where to store and execute against your data in the cloud, organisations risk facing serious problems. Let me explain why.
Solving the multi-cloud puzzle
Inherently, a multi-cloud configuration creates data silos, with data stored in various data warehouses and lakes across multiple platforms, locations and environments. This outcome is by no means intentional, but it leads to inconsistent outputs as individuals work on different streams of data and apply their own rules to them. Ultimately, such silos are a major obstacle to the “single source of truth” that businesses so desperately seek, and to their efforts to become truly data-driven.
A lack of portability is also a major cause of this; organisations find it hard to break out of data silos because their data is stored in different formats across a variety of technologies and platforms. Vendor lock-in is so common partly because of this lack of interoperability between providers, and while portability solutions exist, at present they’re expensive to obtain and maintain.
Both data silos and the lack of portability are ongoing issues, because moving data between platforms or regions can pose a data security risk without the correct governance, processes and security controls in place. So how can businesses safeguard against these data challenges and ensure they capitalise fully on the freedom, flexibility and efficiencies associated with a multi-cloud architecture?
First, we need to remember that multi-cloud architectures are not one-size-fits-all, any more than business problems are. They can comprise a mix of public and private cloud infrastructures, or use a variety of different cloud data warehouse providers, such as Amazon Redshift and Snowflake. Or perhaps you might be hosting operational data stores in AWS but migrating data to Azure and running your analytics there. Often multi-cloud is all of these at once!
Whatever the scenario, a unifying data management layer that allows for the secure passage of data across warehouses, platforms and regions is paramount. It sets the foundation for ‘cross-cloud’ data sharing, in which organisations deploy a single type of cloud data warehouse that can operate on multiple cloud platforms. Snowflake customers can launch their CDW on AWS, Google Cloud and Azure, for example. Or you could flip this configuration and use a single underlying cloud platform with multiple CDWs.
Whatever solution you opt for, a strong data transformation and loading process is essential if you’re to extract maximum value from your data in a multi-cloud infrastructure. Although these architectures can be complex without the right integration strategy, multi-cloud ultimately helps data teams to be more productive, flexible and efficient. When fed into a robust data strategy across the organisation, this then allows companies to take more value from their data and make more informed decisions – the ultimate silver lining sought from any cloud.
David Langton is the Vice President of Product at Matillion, an enterprise data platform providing non-experts with a means to leverage data to drive performance. He is a seasoned software professional with over 20 years of experience creating award-winning technology and products. Prior to his role at Matillion, he worked as a data warehouse manager and contractor in the financial industry.