Drowning in information

If that’s how you feel, there’s a very good reason: the amount of information around us is truly staggering, according to `How Much Information’, a report by UC Berkeley professors Hal Varian and Peter Lyman.

Before we touch on the number of bytes of information produced each year, let’s define a few terms and put them in perspective:

Today, most of us are pretty clear on what a megabyte is: a million bytes of information, which could be a million text characters or a portion of an audio or video file. (By the way, I do know that a megabyte is technically 2^20, or 1,048,576 bytes of data, but we’re going to ignore such nits for clarity.)

Five megabytes can hold the text of the entire works of Shakespeare. A gigabyte is one thousand megabytes, or 1,000,000,000 bytes. Most new PCs now come with between 10 and 80 gigabytes of disk storage. Fifty gigabytes would contain the text of an entire library floor of books.

A terabyte is one thousand gigabytes of data, or 1,000,000,000,000 bytes. Ten terabytes would hold the text of the entire Library of Congress. A petabyte is one thousand terabytes of data, or 1,000,000,000,000,000 bytes. Two petabytes would contain the text of every academic research library in the US.

An exabyte is one thousand petabytes, or 1,000,000,000,000,000,000 bytes. Five exabytes would store all of the words ever spoken by human beings on this planet.

And although we’re not going to use these terms today, the day may well come when we need to know that a zettabyte is one thousand exabytes, and a yottabyte is one thousand zettabytes (which is a yaotta data! – sorry…)
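
For readers who like to see the ladder of units written out, here is a minimal Python sketch. It assumes the simplified power-of-1,000 definitions used above (not the 2^20 variety), and the names `UNITS` and `humanize` are made up for this illustration rather than taken from the report.

```python
# Decimal (power-of-1,000) unit names, following the article's simplified
# definitions. UNITS and humanize are hypothetical names for illustration.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
         "petabytes", "exabytes", "zettabytes", "yottabytes"]


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value at 1 or more."""
    value = float(num_bytes)
    index = 0
    # Step up one unit for every factor of 1,000, stopping at yottabytes.
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {UNITS[index]}"


print(humanize(5 * 10**6))    # the works of Shakespeare
print(humanize(10 * 10**12))  # the Library of Congress
print(humanize(5 * 10**18))   # every word ever spoken
```

Feeding it 2**20 shows the nit we agreed to ignore: the "technical" megabyte comes out slightly larger than one decimal megabyte.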

Now that we have a context, let’s see just how much information was created in 1999, assuming it were all digitized:
Optical storage (music and data CDs and DVDs) – 83 terabytes.

Paper – 240 terabytes.

Film (still, movies, and medical images) – 427 petabytes.

Magnetic storage (camcorder tapes, disk drives) – 1.693 exabytes.

That’s a grand total of 2.1 exabytes of new information produced that year. And if that sounds like a lot, Varian and Lyman found that the growth rate of such information is 50 percent each year!
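
The arithmetic behind that total is easy to check. The sketch below simply adds up the four figures listed above using decimal units. The `projected` helper goes a step further and rests on my own assumption, not the report’s: that the 50 percent growth rate applies to each year’s production and holds constant.

```python
# Decimal units, as defined earlier in the article.
TB = 10**12
PB = 10**15
EB = 10**18

# The four 1999 figures quoted above, converted to bytes.
produced_1999 = {
    "optical": 83 * TB,
    "paper": 240 * TB,
    "film": 427 * PB,
    "magnetic": 1693 * PB,  # 1.693 exabytes
}

total = sum(produced_1999.values())
print(f"Total: {total / EB:.2f} exabytes")

# Share of the total held by each medium.
for medium, size in produced_1999.items():
    print(f"{medium:>8}: {100 * size / total:.3f} percent")


def projected(total_bytes, years, growth=0.5):
    """Compound a yearly growth rate (assumed constant) over a number of years."""
    return total_bytes * (1 + growth) ** years


print(f"Two years on: {projected(total, 2) / EB:.2f} exabytes per year")
```

Notice how little paper and optical media contribute; magnetic storage and film account for nearly everything.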

Where does the Internet fit into this? The `surface Web’ (html pages) consists of 2.5 billion documents, growing at the rate of 7.3 million pages per day! Overall, there are 25 to 50 terabytes of html data on the Web.

If we also include the vast databases that sit behind many websites, which only generate a `Web page’ in response to a query from us, the report estimates that the Web contains 550 billion documents containing 7.5 petabytes of data! And the amount of e-mail is similarly staggering–as many as 1.1 trillion messages comprising 20 petabytes of information–this year!
No wonder we feel overloaded.
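
One way to make these Web and e-mail figures less abstract is to divide each total by its item count. The sketch below does only that, using the numbers quoted above. The averages it prints are my own back-of-the-envelope results, not figures from the report, and the last line assumes the Web keeps adding pages at a flat 7.3 million per day.

```python
# Decimal units, as defined earlier in the article.
TB = 10**12
PB = 10**15


def average_bytes(total_bytes, item_count):
    """Average size of one item, in bytes."""
    return total_bytes / item_count


# Surface Web: 25 to 50 terabytes across 2.5 billion documents.
surface_low = average_bytes(25 * TB, 2_500_000_000)
surface_high = average_bytes(50 * TB, 2_500_000_000)

# Including the databases behind websites: 7.5 petabytes across 550 billion documents.
deep = average_bytes(7500 * TB, 550 * 10**9)

# E-mail: 20 petabytes across as many as 1.1 trillion messages.
email = average_bytes(20 * PB, 11 * 10**11)

print(f"Surface Web page: {surface_low:,.0f} to {surface_high:,.0f} bytes")
print(f"Database-backed document: {deep:,.0f} bytes")
print(f"E-mail message: {email:,.0f} bytes")

# Days for the surface Web to add another 2.5 billion pages at a flat daily rate.
days_to_double = 2_500_000_000 / 7_300_000
print(f"Days to double: {days_to_double:.0f}")
```

In other words, each individual item is small. It is the sheer count of them that does the drowning.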

The opportunities

The amount of information swirling around us is already huge, and it’s growing at a tremendous rate. Happily, revolutionary new storage technologies are striving to keep up, letting us store, and later retrieve and manipulate, the vast gold mine of information we’re creating.

Of course, dealing with information on this scale far exceeds the capabilities of today’s processors and information management tools. That is why continuing hardware advances remain necessary, and why there are still incredible opportunities for new and effective ways to deal with increasing amounts of information (such as Compaq’s `Zero Latency Enterprise’ technologies).

We already know what it feels like to be drowning in information, and it’s only going to get worse. A good information lifeline will be very valuable indeed!

Jeffrey Harrow
Senior Consulting Engineer
(Technology and Corporate Development Group), Compaq

Note: This is an article from the `Rapidly Changing Face of Computing’, a free weekly multimedia technology journal written by Jeffrey Harrow. More discussions around the innovations and trends of contemporary computing and the technologies that drive them are available at www.compaq.com/rcfoc. The writer’s opinions do not necessarily reflect the opinion of Compaq. The RCFoC is copyright 2000, Compaq.

