In today’s world, making effective decisions depends on having good information at your fingertips. But while our ability to collect and analyze vast amounts of information has grown over the past decade, our capability to use that information effectively hasn’t matured at the same pace. It’s very likely that the big investments in collecting and storing this data aren’t paying off in better decision making.
Yet, while we struggle with that gap today, the pace of data growth continues to accelerate. There are now more than 2 billion users of the Internet[i] accessing and generating vast amounts of data. According to Cisco’s most recent Visual Networking Index[ii], Internet traffic increased eightfold over the last five years and will increase another fourfold over the next five. Cisco estimates that by 2015, annual Internet traffic will approach one zettabyte. That’s a staggering amount of data, and the gap will continue to grow.
To put that into perspective, let’s start smaller with a petabyte. A petabyte is 10^15 bytes, or 1 million gigabytes, capable of storing about 350 million MP3 songs[iii]. Using Gracenote[iv] as an estimate of the total number of songs available (around 97 million) and assuming the release of about 50 albums (or 500 songs) per week, it would take roughly ten thousand more years to accumulate a petabyte of professionally recorded music. And a zettabyte is one million petabytes!
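For readers who want to check the arithmetic, here is a minimal Python sketch of the petabyte estimate. It uses only the figures quoted in this post (2.8 megabytes per song, about 97 million existing songs, about 500 new songs per week). The constant and function names are made up for illustration, and the 52-week year is a simplifying assumption.

```python
# Back-of-the-envelope estimate: how long until recorded music fills a petabyte?
# All inputs are the rough figures quoted in the post, not measured values.

PETABYTE_BYTES = 10**15        # 1 petabyte in bytes (decimal definition)
SONG_BYTES = 2_800_000         # assumed size of one MP3 song (about 2.8 MB)
CATALOG_SONGS = 97_000_000     # approximate number of songs already available
NEW_SONGS_PER_WEEK = 500       # about 50 albums of 10 songs each
WEEKS_PER_YEAR = 52            # simplifying assumption


def songs_per_petabyte() -> float:
    """Number of songs of the assumed size that fit in one petabyte."""
    return PETABYTE_BYTES / SONG_BYTES


def years_to_fill_petabyte() -> float:
    """Years of new releases needed before the catalog reaches one petabyte."""
    songs_still_needed = songs_per_petabyte() - CATALOG_SONGS
    weeks_needed = songs_still_needed / NEW_SONGS_PER_WEEK
    return weeks_needed / WEEKS_PER_YEAR


if __name__ == "__main__":
    print(f"Songs per petabyte: {songs_per_petabyte():,.0f}")
    print(f"Years to fill a petabyte: {years_to_fill_petabyte():,.0f}")
```

Changing the release rate or the song size shifts the result, but under any of these rough inputs the answer stays in the thousands of years.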
Figure: How Big is Each Byte
Name | Number of Bytes | Number of Songs | All of Wikipedia[v] |
Megabyte | 1,000,000 | < 1 | < 1 |
Gigabyte | 1,000,000,000 | 350 | < 1 |
Terabyte | 1,000,000,000,000 | 350 thousand | One tenth |
Petabyte | 1,000,000,000,000,000 | 350 million | 100 copies |
Exabyte | 1,000,000,000,000,000,000 | 350 billion | 100,000 copies |
Zettabyte | 1,000,000,000,000,000,000,000 | 350 trillion | 100,000,000 copies |
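The figure can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as the footnotes: decimal (power-of-ten) units, about 2.8 megabytes per song, and Wikipedia's text rounded to 10 terabytes. The names `UNITS`, `songs_that_fit`, and `wikipedia_copies` are illustrative, not part of any library.

```python
# Rebuild the "How Big is Each Byte" figure from its stated assumptions.

UNITS = {
    "Megabyte": 10**6,
    "Gigabyte": 10**9,
    "Terabyte": 10**12,
    "Petabyte": 10**15,
    "Exabyte": 10**18,
    "Zettabyte": 10**21,
}

SONG_BYTES = 2_800_000          # assumed size of one MP3 song (about 2.8 MB)
WIKIPEDIA_BYTES = 10 * 10**12   # assumed size of Wikipedia's text (10 TB)


def songs_that_fit(unit_bytes: int) -> int:
    """Whole songs of the assumed size that fit in the given number of bytes."""
    return unit_bytes // SONG_BYTES


def wikipedia_copies(unit_bytes: int) -> float:
    """Copies of Wikipedia's text (possibly a fraction) that fit in the bytes."""
    return unit_bytes / WIKIPEDIA_BYTES


for name, size in UNITS.items():
    print(f"{name:<10} | {size:>29,} | {songs_that_fit(size):>19,} | "
          f"{wikipedia_copies(size):,.7f}")
```

The exact song counts come out slightly above the table's values because the figure rounds down to 350 for readability.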
This is so problematic for on-line marketers because it underscores the one-to-one marketing “data treadmill”: no matter how much data you collect about a single customer or potential customer, there’s always more to collect as their “digital exhaust”[vi] continues to expand in size and scope. While some of today’s largest marketing databases range into the terabytes, future databases will need to expand significantly. But the emphasis will continue to be on applying “big judgment” to “big data”; collecting more data and throwing it at existing processes is just throwing gasoline on the fire.
[i] http://www.internetworldstats.com/stats.htm
[ii] http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/VNI_Hyperconnectivity_WP.html
[iii] Assuming about 2.8 megabytes per recorded song.
[iv] http://www.gracenote.com/
[v] http://en.wikipedia.org/wiki/Wikipedia:Database_download. Using 10 terabytes to make the math a bit more straightforward. The size does not include images, just the text.
[vi] http://en.wikipedia.org/wiki/Digital_exhaust#cite_note-digital_exhaust_1-0