Tuesday, April 16, 2013

Why 'lean data' beats big data - Guardian

The big data hype may not help you make the right decisions for your business – and there are four reasons why a lean approach makes better sense

Here are four reasons to prefer lean – rather than big – data.

1. Starting with 'big' puts the cart before the horse

Without knowing what your data needs are, it's counterproductive to start with the assumption of industrial-scale data. If your data strategy consists of collecting a few thousand data points a day, you're not in the big data club. And maybe the most meaningful data is quite small – as in the example of Austin-based startup Food on the Table (described by Eric Ries in his book, The Lean Startup), which initially offered its service to only a handful of customers.

2. Everyday tools pack a lot of punch

Chances are, the optimal amount of data storage and processing capability for your business is going to be less than Google's. Lean data relies on picking the right tools for the job, and you may already have them. Fjord recently helped Harvard Medical School redesign interactive paediatric growth charts to be used on tablets, using relatively simple data judiciously to improve doctors' decision making and potentially reduce significant harm to patients.

3. Diminishing returns still apply

Statements like "data is the new oil" make it sound like data is currency, when it's actually an investment. In any statistical measurement, once enough data points have been collected to establish a result, each additional data point adds less and less accuracy. This should be a pressing concern when you're investing increasing amounts of money, time and resources into capturing and analysing data.
For example, American statistician Nate Silver frequently uses polls with sample sizes ranging from hundreds to thousands, and his model explicitly accounts for diminishing returns.
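
To put rough numbers on the diminishing returns described above, here is a minimal sketch using the textbook margin-of-error formula for a simple random sample. It is not Nate Silver's model; the function name `margin_of_error`, the 95% confidence level (z = 1.96) and the worst-case proportion of 0.5 are all assumptions chosen for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from n responses.

    Assumes a simple random sample; z=1.96 corresponds to roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Each tenfold increase in sample size shrinks the margin by only about 3.2x.
for n in (100, 1_000, 10_000, 100_000):
    print(f"n={n:>7}: +/- {margin_of_error(n) * 100:.2f} percentage points")
```

Because the margin shrinks with the square root of the sample size, quadrupling the amount of data only halves the uncertainty, while the cost of collecting it typically grows at least in proportion.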

4. The hard part is still done by humans

The dirty secret of big data is that no algorithm can tell you what's significant, or what it means. Data then becomes another problem for you to solve. A lean data approach suggests starting with questions relevant to your business and finding ways to answer them through data, rather than sifting through countless data sets.