Richard Wray, writing recently in The Guardian, pointed out that the volume of data held is now estimated at 487 billion GB. To put this in perspective, he explained that, printed out, it would form a pile stretching to Pluto 10 times over. The really staggering statistic, however, was that if this data were printed then the stack would grow faster than NASA’s fastest rocket. I haven’t checked the stats, but a quick back-of-the-envelope calculation suggests he’s in the right order of magnitude.
What does this mean? Apart from the staggering numbers, it tells us that the problem for organisations isn’t holding large amounts of information – they already do that. Nor is the problem necessarily how to index that information – increasingly they have defined information standards to do that. The real problem is its continual growth – very few taxonomies or models properly account for the rapid rate of growth.
A new generation of Information Management techniques is starting to appear, designed to deal less with the data you have now and more with the data that you are likely to gain in the future. My next few blog entries will introduce a couple of these techniques for both structured and unstructured data.
Interesting blog, Rob. Thanks.
When you say that the problem is continual growth of information - are you talking about information created by the organisation itself (and its customers and suppliers)? Or are you referring to the 'rocket' of information globally?
Either way, isn't the vast majority of this growth going to be in unstructured data? If this is so, then surely the Information Management techniques you hint at are all about either a) directly mining unstructured data in real time or b) developing an ETL technology capability to extract structured data/information from the unstructured data.
Are we dealing with complexity or chaos?
Posted by: OzAnalytics | 06/14/2009 at 11:55 AM
Thanks for the question. In my view, it doesn't matter where the information sits as long as it is relevant to the business. More importantly, I also think that it doesn't matter whether the data is structured or unstructured; either way, it is important. See my latest post for the foundations of a taxonomy!
Posted by: Robert Hillard | 06/14/2009 at 05:26 PM