Wake up and smell the smart Cumulonimbi
The origins of data analytics – the science of analysing and drawing conclusions from data – and cloud computing can both be traced to the 1950s. Using cloud computing for analytics, however, is a much more recent development.
Data analytics was originally a backroom administrative task: a manual process of pen and paper, ledgers and tables. With the advent of databases and spreadsheets, analytics slowly moved on, but it was the explosion of new business intelligence tools – enabling the effective slicing, dicing and visualisation of data – that really accelerated change. In fact, a whole new profession has come into being to do the work: the data scientist. A key challenge is that data analysis is compute-intensive. Gaining accurate insights requires huge amounts of data to be processed, and until recently this has been beyond the means of most companies.
The Cloud grows up
Cloud computing as we know it now – with elastic compute capacity, server virtualisation and pay-as-you-go pricing – first became commercially available in 2006. However, the original clouds were underpowered and ill-suited to analytical processing. A single elastic compute unit provided less processing power than you would find in a modern smartphone. Furthermore, these first clouds suffered from ‘noisy neighbours’, a side effect of virtualisation that arises when more than one tenant’s workload runs on the same physical machine. Contention for disk I/O or network resources could grind your virtualised cloud computer to a halt. In analytics – where your analysis is only as quick as your slowest lookup – this could be disastrous.
Welcome to the Power Cloud
However, times have moved on: no longer are you limited to one cloud provider. Hundreds of clouds, both large and small, are now available, including clouds optimised for specialised analytical tasks. To avoid the problem of noisy neighbours you can now use ‘bare metal’ clouds, where the virtualisation layer is removed and you get full, single-tenant physical servers on which to run your analytics. Bigstep’s Full Metal Cloud is one such provider, and it promises an instant 500 per cent performance boost compared to virtualised clouds.
Database technology has also evolved. Analytical jobs have long been the square peg to a traditional transactional database’s round hole. The old way of dealing with this mismatch was to index the database and to create partitions and pre-computed aggregates, setting the database up so that at least some of the data could be analysed. Even so, such analytical jobs ran slowly, in batches that could take hours to complete.
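To make that old workaround concrete, here is a minimal sketch in Python using the standard-library sqlite3 module; the sales table and its columns are invented for illustration. An index speeds up filtered lookups on the transactional table, and a batch-refreshed summary table stands in for the raw data at report time:

```python
import sqlite3

# An in-memory database standing in for a transactional system.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A typical transactional table: one row per sale (schema invented for the example).
cur.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0), ("south", 200.0), ("south", 50.0)],
)

# Workaround 1: an index, so analytical filters avoid a full table scan.
cur.execute("CREATE INDEX idx_sales_region ON sales (region)")

# Workaround 2: a pre-computed aggregate, refreshed by a batch job,
# so reports read a handful of summary rows instead of the raw table.
cur.execute(
    """CREATE TABLE sales_by_region AS
       SELECT region, SUM(amount) AS total, COUNT(*) AS orders
       FROM sales GROUP BY region"""
)

# The 'analytical' query now hits the small summary table.
print(cur.execute("SELECT * FROM sales_by_region").fetchall())
# e.g. [('north', 200.0, 2), ('south', 250.0, 2)]
```

The catch, of course, is that the aggregates only answer the questions they were built for, and they are only as fresh as the last batch run.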
Hadoop, the framework for distributed processing of large data sets across computer clusters, has played an important part in this evolution. Although Hadoop scales well, it is very slow to process queries, and this is not ideal in a world where decisions are expected to take just seconds, not minutes. That’s where a fast analytic database comes into play: users can continue to collect and store data in Hadoop clusters and then run fast analytic jobs in the database. Moreover, the latest database technology is ready for the cloud; most solutions are now cloud-enabled, run in-memory and offer native connectivity to popular business intelligence tools such as Tableau, a particularly user-friendly platform for analytics and data visualisation.
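As a rough illustration of that division of labour – raw data at rest in Hadoop, interactive queries in the analytic database – here is a minimal sketch using the open-source pyexasol driver for EXASOL. The hostname, credentials and table name are placeholders, not details from this article:

```python
import pyexasol

# Connect to the analytic database (host, port and credentials are placeholders).
conn = pyexasol.connect(
    dsn="exasol.example.com:8563",
    user="analyst",
    password="secret",
)

# The raw event stream stays in the Hadoop cluster; only the query-ready
# table lives in the analytic database, so this aggregation comes back in
# seconds rather than running as an hours-long batch job.
stmt = conn.execute(
    """SELECT region, SUM(revenue) AS total_revenue
       FROM web_events
       GROUP BY region
       ORDER BY total_revenue DESC"""
)

for region, total_revenue in stmt:
    print(region, total_revenue)

conn.close()
```

A BI tool such as Tableau sits on top of the same database connection, issuing queries like the one above interactively.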
The advantages of cloud are plain to see. Firstly, companies no longer have to invest in data centres. Secondly, it has enabled organisations to move away from a traditional CAPEX model (buy the dedicated hardware upfront in one go and depreciate it over a period of time) to an OPEX one, where you pay only for what you need and consume. As technology requirements increase, businesses need only pay for what they need at the current time, rather than having to plan months or years in advance; the cloud thus enables incremental growth. In addition, all the nuts and bolts of running a data centre – backup and recovery, networking and power – are done for you. That’s a huge headache removed in one fell swoop. Taking data analytics to the clouds is undoubtedly the next step in the cloud revolution, and the natural way forward for businesses that want to get to grips with their data once and for all.
EXASOL’s high performance in-memory analytic database is now available in Microsoft Azure.
Mathias Golombek, CTO, EXASOL
Mathias Golombek is the Chief Technology Officer at EXASOL, provider of the world’s fastest in-memory analytic database. Leading global companies using EXASolution to run their businesses faster and smarter include Adidas Group, GfK, IMS Health, Olympus, myThings, Sony Music and Xing, among many others.