Lots of information, but most is boring

Researchers have discovered why you are having more trouble keeping up with things these days:

There is twice as much new information in the world as there was just three years ago, and most of it isn't very interesting.

The researchers concluded that the amount of new information produced last year was about 23 exabytes. An exabyte is a million terabytes. A terabyte is a million megabytes, roughly equivalent to the content of a million books.
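To give a sense of scale, the unit conversions above can be worked through in a few lines of Python. This is only a rough sketch: the variable names are made up for illustration, the units are the decimal ones the article uses, and the books figure simply applies the article's own "a million books per terabyte" rule of thumb.

```python
# Decimal units, as described in the article.
MB_PER_TB = 10**6          # a terabyte is a million megabytes
TB_PER_EB = 10**6          # an exabyte is a million terabytes
BOOKS_PER_TB = 10**6       # the article's rough equivalence, not a measured value

new_info_eb = 23           # new information produced last year, per the study

new_info_tb = new_info_eb * TB_PER_EB
new_info_mb = new_info_tb * MB_PER_TB
books_equivalent = new_info_tb * BOOKS_PER_TB

print(f"{new_info_eb} exabytes = {new_info_tb:,} terabytes")
print(f"{new_info_eb} exabytes = {new_info_mb:,} megabytes")
print(f"Roughly {books_equivalent:,} books' worth of content")
```

By that rule of thumb, 23 exabytes works out to 23 million terabytes, or on the order of 23 trillion books.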

"The difficult thing was to try to measure only unique content and not all the copied information," said Hal Varian, an information scientist at the University of California-Berkeley and one of the researchers who conducted the information census.

"If you're in the information business, it's important to know how much is out there," said Jim Gray, manager of the Microsoft Research unit in San Francisco. Microsoft Research, Intel, HP and EMC paid for the study by Varian and colleagues.

Worldwide, the group determined, the amount of new information on the planet is increasing by about 30 percent each year.
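The 30 percent figure squares with the "twice as much in three years" claim once compounding is taken into account. The short Python sketch below assumes a constant 30 percent annual growth rate, which is a simplification; the variable names are illustrative and not from the study.

```python
import math

annual_growth = 0.30       # about 30 percent per year, per the study
years = 3

# Compound growth: each year's total is 1.3 times the previous year's.
growth_factor = (1 + annual_growth) ** years

# Time needed to double at a constant annual growth rate.
doubling_time_years = math.log(2) / math.log(1 + annual_growth)

print(f"Growth over {years} years: about {growth_factor:.1f} times")
print(f"Doubling time: about {doubling_time_years:.1f} years")
```

At that rate the total a little more than doubles in three years, with a doubling time somewhat under three years.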

"That's slowed down a bit lately, though, probably due to the economic slowdown," Varian said.

Varian acknowledged their definition of information was somewhat arbitrary and circular, in that they measured only the kind of information that can be quantified digitally.

Knowledge passed along orally would not have been counted.

The estimates were based on statistical extrapolations from analyses of 10,000 Web sites, snooping into "typical" hard drives and samplings of information as it was transmitted.

The researchers found that most of the new information produced, about 92 percent, is digital, stored on "magnetic media" such as the hard drive of a PC.

The researchers also reported that the flow of information has increased. They found that new information flowing across televisions, radios, telephones, Web sites and the Internet had increased by 3 1/2 times to 18 exabytes as of 2002. The amount of new but stored (non-transmitted) information in 2002 was determined to be about five exabytes.

"Unfortunately, much of this information being created out there is not very dense in terms of its interest," said Ed Lazowska, the chairman of computer science at the University of Washington. It's mostly boring and of little use to most people, he said.

As this mountain of spam, technical data and other largely useless information continues to pile up at an alarming rate, Lazowska said, the need to find new tools for digging our way out of it becomes even more urgent.

"If we don't do something, the Internet will become useless," agreed Gray. If you think getting a few hundred spam emails a day is bad, he said, imagine getting 10,000 a day.