Increasingly, big data is being deployed in the cloud, and the latest research bears this out. Quite rightly so: increased scalability for growing workload volumes, reduced management overhead and the assurance of SLAs are often all it takes to justify moving workloads to the cloud.
A wide array of technologies is being rapidly deployed: Kafka, Spark, Impala, Hadoop and NoSQL databases, to name a few. However, truly reaping the benefits of modern data applications in the cloud (such as customer 360, fraud analytics and predictive maintenance) requires data-driven planning and execution. So where to begin? Below are key considerations before deciding how, when, and why to move to the cloud.
Understanding the current on-premises environment
To move data deployments to the cloud effectively, a business first needs to understand its current environment and dependencies. Specifically, details of cluster resources and usage, and of application requirements and behaviours, are vital for making confident decisions about which workloads make sense to take to the cloud.
Firstly, an organisation needs to identify what its big data cluster looks like and how data pipelines are architected, from data ingest through to consumption of insights. What services are deployed? How many nodes does the cluster have? What resources are allocated, and what does their consumption look like across the entire cluster? Secondly, it needs to understand what applications are running on the cluster, how much of those resources each application is using, and which applications are prone to bursting, for it is these that make a prime case for cloud deployment.
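As a rough illustration of this first step, the sketch below pulls headline capacity figures and per-application resource consumption from a Hadoop cluster's YARN ResourceManager REST API. The ResourceManager host name is a placeholder, and exactly which fields are returned varies by Hadoop version, so treat this as a starting point rather than a finished audit.

```python
# Minimal sketch: inventory a YARN cluster via the ResourceManager REST API.
# RM_URL is a hypothetical placeholder; field names follow the standard YARN
# REST API, but their availability varies across Hadoop versions.
import requests

RM_URL = "http://resourcemanager.example.com:8088"  # hypothetical host

def cluster_snapshot():
    """Headline capacity and usage figures for the whole cluster."""
    m = requests.get(f"{RM_URL}/ws/v1/cluster/metrics").json()["clusterMetrics"]
    return {
        "active_nodes": m["activeNodes"],
        "total_memory_mb": m["totalMB"],
        "allocated_memory_mb": m["allocatedMB"],
        "total_vcores": m["totalVirtualCores"],
        "allocated_vcores": m["allocatedVirtualCores"],
    }

def recent_apps(limit=1000):
    """Recently finished applications with their resource consumption."""
    params = {"states": "FINISHED,FAILED,KILLED", "limit": limit}
    body = requests.get(f"{RM_URL}/ws/v1/cluster/apps", params=params).json()
    apps = (body.get("apps") or {}).get("app", [])
    return [
        {
            "id": a["id"],
            "user": a["user"],
            "queue": a["queue"],
            "type": a.get("applicationType"),
            "final_status": a["finalStatus"],
            "memory_mb_seconds": a.get("memorySeconds"),
            "vcore_seconds": a.get("vcoreSeconds"),
        }
        for a in apps
    ]
```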
Making the move, one application at a time
Once a thorough understanding of the cluster has been established, the next step of the cloud migration journey is to work out which workloads would benefit most from the advantages offered by the prospective cloud environment. For example, applications that are bursty in nature, or that fail because of a lack of available resources, make good candidates, not least because the cloud provider is bound by its SLAs.
Once it has been decided which applications make sense to move to the cloud, the question turns to timing. Most organisations choose to phase the migration to make the transition as smooth as possible. They may migrate applications belonging to specific users and/or queues first, followed by others in later phases, as determined by the analysis of the on-premises environment described above.
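To make that phasing concrete, here is a small, hypothetical sketch of how candidate applications could be picked out and grouped into phases by queue. The column names and the burst threshold are illustrative assumptions; in practice they would come from whatever monitoring or data operations tooling is in place.

```python
# Minimal sketch of choosing migration candidates and phasing them by queue.
# Column names (app_name, queue, peak_memory_mb, median_memory_mb, failed_runs)
# and the burst threshold are illustrative assumptions.
import pandas as pd

def pick_candidates(runs: pd.DataFrame, burst_ratio: float = 3.0) -> pd.DataFrame:
    """Flag apps that burst well above their typical footprint or that fail."""
    bursty = runs["peak_memory_mb"] > burst_ratio * runs["median_memory_mb"]
    failing = runs["failed_runs"] > 0
    return runs[bursty | failing]

def phase_by_queue(candidates: pd.DataFrame, queues_per_phase: int = 2) -> dict:
    """Group candidate apps into migration phases, a few queues at a time."""
    queues = sorted(candidates["queue"].unique())
    batches = [queues[i:i + queues_per_phase]
               for i in range(0, len(queues), queues_per_phase)]
    return {f"phase_{n + 1}": candidates[candidates["queue"].isin(batch)]
            for n, batch in enumerate(batches)}

# Toy example: only fraud_scoring is both bursty and failing.
runs = pd.DataFrame({
    "app_name": ["etl_daily", "fraud_scoring", "reporting"],
    "queue": ["batch", "analytics", "batch"],
    "peak_memory_mb": [64_000, 512_000, 32_000],
    "median_memory_mb": [48_000, 96_000, 30_000],
    "failed_runs": [0, 4, 0],
})
print(pick_candidates(runs))
print(phase_by_queue(pick_candidates(runs)))
```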
It may sound easy enough, but this information is often scattered across the technology stack, making competent analysis, and with it a firm business case, challenging if not impossible.
Taking the leap with full faith
What is needed, then, is a single pane of glass: effective cluster reporting over time that reveals patterns in workload and application resource usage, plus predictive analysis so that cloud deployments are designed with peak utilisation times front of mind.
A Data Operations and Performance Management solution can offer all of this information in real time across the entire data stack, allowing an on-premises environment to be mapped to a cloud-based one with full confidence. Not only that, but the most advanced solutions will also provide different strategies for this mapping based on individual goals, whether they are driven predominantly by cost reduction, workload fit or something else.
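As a simplified illustration of a cost-driven mapping strategy, the sketch below picks the cheapest instance type that still covers a workload's peak needs. The instance catalogue and prices are entirely hypothetical; a real mapping would draw specs and pricing from the chosen cloud provider.

```python
# Minimal sketch of mapping a workload's peak demand to a cloud instance type.
# The catalogue below is hypothetical; real specs and prices come from the
# provider's pricing pages or APIs.
INSTANCE_CATALOGUE = [
    {"name": "small",  "vcpus": 4,  "memory_gb": 16, "usd_per_hour": 0.20},
    {"name": "medium", "vcpus": 8,  "memory_gb": 32, "usd_per_hour": 0.40},
    {"name": "large",  "vcpus": 16, "memory_gb": 64, "usd_per_hour": 0.80},
]

def cheapest_fit(peak_vcpus: int, peak_memory_gb: int):
    """Pick the lowest-cost instance type that still covers peak demand."""
    fits = [i for i in INSTANCE_CATALOGUE
            if i["vcpus"] >= peak_vcpus and i["memory_gb"] >= peak_memory_gb]
    return min(fits, key=lambda i: i["usd_per_hour"]) if fits else None

print(cheapest_fit(peak_vcpus=6, peak_memory_gb=24))  # -> the "medium" entry
```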
Taking this a step further still, reputable providers will have comprehensive knowledge of the major cloud platforms, including the specs and cost of each VM type, and can help to monitor the migration process as it happens. Once the workloads are there, the best APM tools can compare how a given application performs in its new home against how it performed before, as well as provide recommendations and automatic fixes to keep performance up to par.
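A before-and-after comparison can be as simple as lining up an application's run times from the two environments, as in this hypothetical sketch; the 10% regression tolerance is an assumption for illustration, not a standard.

```python
# Minimal sketch: compare an application's run times before and after migration.
# Durations are in seconds; the regression tolerance is illustrative.
from statistics import median

def compare_runtimes(on_prem_secs, cloud_secs, tolerance=1.10):
    """Report whether the cloud median run time stays within tolerance of on-prem."""
    before, after = median(on_prem_secs), median(cloud_secs)
    return {
        "on_prem_median_s": before,
        "cloud_median_s": after,
        "change_pct": round(100 * (after - before) / before, 1),
        "regressed": after > before * tolerance,
    }

print(compare_runtimes([620, 655, 610, 700], [580, 640, 600, 615]))
```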
Ultimately, as an organisation migrates its applications to the cloud, a robust Data Operations approach will help ensure it isn't flying blind. With data-driven intelligence and recommendations for optimising compute, memory and storage resources, proper planning and the right technology ensure the transition is a justified and, most importantly, smooth one.
Kunal Agarwal co-founded Unravel Data in 2013 and serves as CEO. Mr. Agarwal has led sales and implementation of Oracle products at several Fortune 100 companies. He co-founded Yuuze.com, a pioneer in personalised shopping and what-to-wear recommendations. Before Yuuze.com, he helped Sun Microsystems run Big Data infrastructure such as Sun's Grid Computing Engine. Mr. Agarwal holds a bachelor's degree in Computer Engineering from Valparaiso University and an MBA from The Fuqua School of Business, Duke University.