The problem with big data is that there’s so much of it. No, that’s not a Yogi-ism, but an accurate description of the paradox facing many organizations today. Generating and storing data is easier than ever. We’ve entered an era in which it is straightforward to collect a myriad of data, from the most mundane (think server logs or retail register receipts) to the most profound (like one’s genome or daily online meanderings and preferences), cheaper to store it indefinitely, and easier than ever to aggregate it from multiple sources. Assuming about a tenth of the data is accessed in any given month, storing a terabyte on S3 runs about $400 per year. Google Cloud costs about the same.
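For readers who want to sanity-check a figure like that, here is a rough back-of-the-envelope sketch. The function name and both per-gigabyte rates are hypothetical placeholders chosen only for illustration, not quotes from any provider's price list; plug in current rates before relying on the result.

```python
def annual_storage_cost(stored_gb, storage_rate_per_gb_month,
                        accessed_fraction, access_rate_per_gb):
    """Estimate the yearly cost of keeping data in cloud object storage.

    All rates are caller-supplied assumptions, not real published prices.
    """
    # Cost of simply holding the data for one month.
    monthly_storage = stored_gb * storage_rate_per_gb_month
    # Cost of reading back the fraction of data touched in one month.
    monthly_access = stored_gb * accessed_fraction * access_rate_per_gb
    return 12 * (monthly_storage + monthly_access)


# Hypothetical inputs: 1 TB (taken as 1,000 GB), a tenth accessed monthly,
# and placeholder rates of $0.03 per GB for both storage and access.
cost = annual_storage_cost(
    stored_gb=1000,
    storage_rate_per_gb_month=0.03,
    accessed_fraction=0.10,
    access_rate_per_gb=0.03,
)
print(f"Estimated annual cost: ${cost:,.0f}")
```

With these placeholder rates the estimate lands in the neighborhood of the $400 figure above, but the point is the shape of the calculation: storage charges dominate, and the access share scales with how much of the data is actually touched each month.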
As I write in this column, the bounty of data perversely makes it harder to refine into useful information: more costly to process, more complex to analyze, harder to comprehend and thus less effective at improving decisions. The situation will only get worse. The technology trends summed up by today’s hottest buzzword, the Internet of Things (IoT), mean that our data supply, both as individuals and as businesses, will shift into overdrive as data collection and connectivity are added to all manner of objects, whether home thermostats, industrial equipment, or even livestock. Never heard of the quantified cow? Your local dairy farmer has, and may already be outfitting the herd with the bovine equivalent of a Fitbit to increase milk production while reducing greenhouse gas emissions.
The column goes on to highlight several innovative start-ups, including BitYota, Interana, Medio and Quid, that address the problem of too much data by exploiting the cloud and cloud-like distributed software design to deliver so-called data analytics as a service. The business benefits are many. First, by exploiting cloud infrastructure (often AWS) and adopting a SaaS business model, these services eliminate big, up-front infrastructure build-outs and capital expenses, replacing them with usage-based, pay-as-you-grow pricing. Second, by encapsulating data analysis expertise in software that is easy to deploy and use, they unlock the value of big data without requiring a staff of PhD data scientists and DB admins.
The exploitation of big data for previously unknowable business insights and improved decision-making is still nascent, but a new generation of data analytics services democratizes access, allowing any organization to tap resources previously available only to the largest few. I will be closely watching this technology and the emerging software market around it to see the innovative ways business leaders use these services and the others that will surely follow.