But beyond the general impression that “big data” represents the ability to collect and analyze lots and lots of information in some efficient manner, most people have a difficult time explaining with any specificity what the term really means.
Moreover, for some people “big data” isn’t very far removed from “big brother” – and for that reason, there’s some real ambivalence about the concept. Consider these recent “man on the street” comments about big data found online:
- “Big data: Now they can crawl all the way up your *ss.”
- “The scary thing about big data is knowing [that] Big Brother can know every single thing you do – and realizing your life is too unimportant for Big Brother to even bother.”
- “Big data is what you get after you take a big laxative.”
But now we have a recently published book that attempts to demystify the concept. It’s titled Big Data: A Revolution That Will Transform How We Live, Work, and Think, and it’s authored by two leading business specialists – Viktor Mayer-Schönberger, a professor of internet governance and regulation at Oxford University, and Kenneth Cukier, a data editor at The Economist magazine.
The book explores the potential for creating, mining and analyzing massive information sets while also pointing out the potential pitfalls and dangers, which the authors characterize as the “dark side of big data.”
The book also exposes the limitations of “sampling” as we’ve come to understand it and work with it over the past decades.
Cukier and Mayer-Schönberger note that sampling works fine for basic questions, but is far less reliable or useful for more “granular” evaluation of behavioral intent, because the narrower the slice of people you want to examine, the fewer of them any sample contains. That’s where “big data” comes into play big-time.
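To put a rough number on that limitation, here is a minimal Python sketch. It is not taken from the book: the survey size and the 2% subgroup are invented for illustration, and the formula is the standard normal-approximation margin of error for a proportion, which is itself only a crude guide once a group gets this small.

```python
import math

def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for an estimated proportion
    (normal approximation; rough for very small groups)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

survey_size = 1000   # hypothetical survey of 1,000 people
niche_share = 0.02   # hypothetical subgroup: 2% of respondents
niche_size = round(survey_size * niche_share)

overall = margin_of_error(survey_size)
niche = margin_of_error(niche_size)

print(f"Whole sample (n={survey_size}): +/- {overall:.1%}")
print(f"Niche subgroup (n={niche_size}): +/- {niche:.1%}")
```

The same survey that pins down an overall figure to within a few percentage points leaves the small subgroup with an uncertainty several times larger, which is the gap that collecting far more data is meant to close.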
The authors are quick to note that advancements in data collection tend to come along, shake things up, and then quickly become routine.
Mayer-Schönberger calls this “datafication,” and describes how it works in practice:
“At first, we think it is impossible to render something in data form. Then somebody comes up with a nifty and cost-efficient idea to do so, and we are amazed by the applications that this will enable – and then we come to accept it as the ‘new normal.’ A few years ago, this happened with geo-location, and before it was with web browsing data gleaned through ‘cookies.’ It is a sign of the continuing progress of datafication.”
Another shift concerns causality, and how much weight we give it when working with the data we collect.
According to Cukier and Mayer-Schönberger, making the most of big data means “shedding some of the obsession for causality in exchange for simple correlations: not knowing why but only what.”
So then, we may have fewer instances in which we come up with a hypothesis and then test it; instead, we may simply use the data to determine what is important and act on whatever information is revealed in the process.
One example of this practice that’s cited in the book is how Wal-Mart determined that Kellogg’s® Pop-Tarts® should be positioned at the front of the store in selected regions of the country during hurricane season to stimulate product sales.
It wasn’t something anyone had thought about in advance and then decided to verify; it was something the retailer discovered by mining product purchase data and simply “connecting the dots.”
Author Mayer-Schönberger explains further:
“There is a value in having conveniently placed Pop-Tarts, and it isn’t just that Wal-Mart is making more money. It is also that shoppers find faster what they are likely looking for. Sometimes ‘big data’ gets badly mischaracterized as just a tool to create more targeted advertising … but UPS uses ‘big data’ to save millions of gallons of fuel – and thus improve both its bottom line and the environment.”
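For readers curious what “connecting the dots” in purchase data might look like, here is a toy Python sketch. It is an assumption-laden illustration, not Wal-Mart’s actual method: the baskets are invented, and the `purchase_rate` and `lift` helpers are hypothetical names for a simple comparison of how often an item sells in storm-warning weeks versus normal weeks.

```python
# Invented toy data, not real retailer records: (week type, items in one basket).
baskets = [
    ("storm_warning", {"pop_tarts", "water"}),
    ("storm_warning", {"pop_tarts", "bread"}),
    ("storm_warning", {"pop_tarts", "milk"}),
    ("storm_warning", {"water", "bread"}),
    ("normal", {"bread", "milk"}),
    ("normal", {"pop_tarts", "milk"}),
    ("normal", {"water", "eggs"}),
    ("normal", {"milk", "eggs"}),
    ("normal", {"bread", "water"}),
    ("normal", {"bread", "eggs"}),
    ("normal", {"milk", "water"}),
    ("normal", {"bread", "milk"}),
]

def purchase_rate(item: str, week_type: str) -> float:
    """Share of baskets in the given week type that contain the item."""
    relevant = [items for kind, items in baskets if kind == week_type]
    return sum(item in items for items in relevant) / len(relevant)

def lift(item: str) -> float:
    """How many times more often the item sells in storm-warning weeks."""
    baseline = purchase_rate(item, "normal")
    if baseline == 0:
        return float("inf")  # never bought in normal weeks
    return purchase_rate(item, "storm_warning") / baseline

all_items = set().union(*(items for _, items in baskets))
ranked = sorted(((item, lift(item)) for item in all_items),
                key=lambda pair: pair[1], reverse=True)

for item, value in ranked:
    print(f"{item:10s} lift = {value:.2f}")
```

Nothing in the script asks why an item sells better before a storm. It simply ranks every product by how strongly its sales track the storm warnings, which is the “what, not why” approach the authors describe.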
One area of concern covered by the authors is the potential for using “big data predictions” to single out people based on their propensity to engage in certain behaviors, rather than on what they have actually done. In other words, all sorts of conditions or possibilities could end up being treated in the same manner we treat sex offender lists today.
Author Kenneth Cukier believes that discussion of the implications of a practice like this – focusing on the use of data as much as the collection of the data – is “sadly missing from the debate.”
This book fills a yawning gap in the business literature. And for that, we should give Dr. Mayer-Schönberger and Mr. Cukier their due. If any readers have become acquainted with the book and would care to weigh in with observations, please share your thoughts here.