Data Storage Media

Digital Big Bang — 161 Exabytes In 2006

An anonymous reader tips us to an AP story on a recent study of how much data we are producing. IDC estimates that in 2006 we created, captured, and replicated 161 exabytes of digital information. The last time anyone tried to estimate global information volume, in 2003, researchers at UC Berkeley came up with 5 exabytes. (The current study tries to account for duplicating data — on the same assumptions as the 2003 study it would have come out at 40 exabytes.) By 2010, according to IDC, we will be producing far more data than we will have room to store, closing in on a zettabyte.
This discussion has been archived. No new comments can be posted.

  • by cryfreedomlove ( 929828 ) on Monday March 05, 2007 @09:20PM (#18245260)
    I imagine that a lot of this is web traffic logs. What if the US government really does force ISPs to keep records detailing the sites visited by their customers? Will my ISP rates increase to pay for all of that disk space?
  • by Animats ( 122034 ) on Monday March 05, 2007 @09:26PM (#18245288) Homepage

    What's really striking is how little data was available in machine-readable form well into the computer era. In the 1970s, the Stanford AI lab got a feed from the Associated Press wire, simply to get a source of machine-readable text for test purposes. There wasn't much out there.

    In 1971, I visited Western Union's installation in Mahwah, NJ, which was mostly UNIVAC gear. (I worked at a UNIVAC site a few miles away, so I was over there to see how they did some things.) I was shown the primary Western Union international gateway, driven by a pair of real-time UNIVAC 494 computers. All Western Union message traffic between the US and Europe went through there. And the traffic volume was so small that the logging tape was just writing a block every few seconds. Of course, each message cost a few dollars to send; these were "international telegrams".

    Sitting at a CRT terminal was a woman whose job it was to deal with mail bounces. About once a minute, a message would appear on her screen, and she'd correct the address if possible, using some directories she had handy, or return the message to the sender. Think about it. One person was manually handling all the e-mail bounces for all commercial US-Europe traffic. One person.

  • by blubadger ( 988507 ) on Monday March 05, 2007 @09:43PM (#18245450)

    In River Out of Eden [wikipedia.org] Richard Dawkins traces the data explosion of the information age right back to the big bang.

    "The genetic code is not a binary code as in computers, nor an eight-level code as in some telephone systems, but a quaternary code with four symbols. The machine code of the genes is uncannily computerlike."
  • by product byproduct ( 628318 ) on Monday March 05, 2007 @09:56PM (#18245548)
    Amazingly it would take 1,600,000 years for /dev/urandom to produce 161 exabytes (assuming 3.2 MB/s, YMMV)
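
    A quick sanity check of that arithmetic (a rough Python sketch; the 3.2 MB/s rate and decimal exabytes are simply the assumptions stated above):

        # Assumes 161 decimal exabytes and a sustained /dev/urandom throughput of 3.2 MB/s.
        total_bytes = 161 * 10**18               # 161 EB
        rate_bytes_per_sec = 3.2 * 10**6         # 3.2 MB/s
        seconds = total_bytes / rate_bytes_per_sec
        years = seconds / (365.25 * 24 * 3600)
        print(f"{years:,.0f} years")             # prints roughly 1,600,000 years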
  • by basic0 ( 182925 ) <mmccollow@yahooEEE.ca minus threevowels> on Monday March 05, 2007 @10:08PM (#18245646)
    Ok, so we generate some staggering amount of computerized data every year. This is one of those stories where I can't remember hearing about it before, but it really doesn't feel like "news".

    My question is how much of this data is actually being used? I'm horrible for constantly downloading e-books, movies, software, OSes, and other stuff that I'm *intending* to do something with, but often don't get around to. I end up with gigabytes of "stuff" just sucking up disc space or wasting CDs. I burned a DivX copy of Matt Stone and Trey Parker's popular pre-South Park indie film "Orgazmo" in about 2001. I've since seen the film 2 or 3 times on TV. I STILL haven't watched the DivX version I have, and now I can't find the CD I put it on. I know I'm not the only one who does this either, as many of my friends are using up loads of storage space on files they've just been too busy to have a look at.

    Right now I'm on a project digitizing patient files for a neurologist. We're going up to 10 years deep with files for over 18,000 patients. Most of this is *just* for legal purposes and nobody is EVER going to open and read the majority of these files. The doctor does electronic clinics where he consults the patient and adds new pages to their file, which will probably sit there undisturbed until the Ethernet Disk fails someday.

    I think a more interesting story (although probably MUCH more difficult to research) would be "How much computerized data is never used beyond its original creation on a given storage medium?"
  • Of course we will (Score:3, Interesting)

    by PIPBoy3000 ( 619296 ) on Monday March 05, 2007 @10:13PM (#18245702)
    Think about scientific instruments that gather gigabytes of data per second. They hold on to that for as long as they have to, pulling out interesting data, summarizing it, and throwing out the rest. I track all the web hits for our corporate Intranet. The volume is so huge that the SQL administrators come and have a little heart-to-heart chat with me if I let it build up over a few months. I don't really care about the raw information past a month or so. Instead, I want to see running counts of which pages are being viewed, which people are big utilizers of our network, and so on.

    A good analogy is the human brain. We gather in huge amounts of information per second via touch, sight, and so on, but throw out the vast majority of the information. The key is to have good filtering systems so that things that are interesting and relevant are held onto.
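
    The filtering approach described above (keep running counts, let the raw hits go) is simple to sketch. This is only an illustration; the log file name and the "timestamp user page" record format are hypothetical, not from any particular system:

        from collections import Counter

        page_views = Counter()
        user_hits = Counter()

        # Roll the raw hit log up into summaries, after which the raw data can be discarded.
        with open("intranet_hits.log") as log:       # hypothetical log file
            for line in log:
                _, user, page = line.split()[:3]     # "timestamp user page" records
                page_views[page] += 1
                user_hits[user] += 1

        print(page_views.most_common(10))    # most-viewed pages
        print(user_hits.most_common(10))     # heaviest users of the network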
  • Google Says: (Score:3, Interesting)

    by nbritton ( 823086 ) on Monday March 05, 2007 @10:28PM (#18245808)
    (161 exabytes) / 6,525,170,264 people = 26.4931682 gigabytes per person.
  • Google Says: (Score:3, Interesting)

    by nbritton ( 823086 ) on Monday March 05, 2007 @10:36PM (#18245884)
    (161 exabytes) / 1,093,529,692 people[1] = 158.086639 gigabytes per person and 19.6380918 gigabytes per person if you don't count the duplicate data.

    [1] Total est. of people on the Internet:
    http://www.internetworldstats.com/stats.htm [internetworldstats.com]
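
    For what it's worth, both the 26.49 GB and 158.09 GB figures above reproduce if you assume Google Calculator's binary units (2^60 bytes per exabyte, 2^30 bytes per gigabyte); a quick sketch with the population counts quoted above:

        # Assumes binary units: 1 exabyte = 2**60 bytes, 1 gigabyte = 2**30 bytes.
        total_bytes = 161 * 2**60

        world_population = 6_525_170_264    # world population figure used above
        internet_users = 1_093_529_692      # internetworldstats.com estimate

        print(total_bytes / world_population / 2**30)   # ~26.49 GB per person
        print(total_bytes / internet_users / 2**30)     # ~158.09 GB per Internet user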
  • by stoicio ( 710327 ) on Tuesday March 06, 2007 @12:30AM (#18246564) Journal
    It's all well and fine to have a statistic like 161 exabytes of data, but what's the point? Is that data any more useful to people than the selective data that was used to run the world 50, 60, or 100 years ago?

    We as individuals are only capable of assimilating a limited amount of information, so most of those exabytes are just rolling around like so many gears in an old machine. If they are minimally used or never used, they simply become a storage liability.

    As an example, the internet has not made *better* doctors. Even with all the latest information at their fingertips, professionals are still only the sum of what they can mentally absorb. Too much data, or wrong data (e.g. Wikipedia), can lead to the same levels of inefficiency seen prior to the 'information age'. What would a single doctor do with 160 exabytes of reading material, schedule it into the work day?

    Also, if the amount of information is rated purely in bytes and not in *useful content*, the stats get skewed. Things like movies and music should be ranked by the length of the script and/or notation. That would make the numbers much less than 160 exabytes.

    Saying that the whole world produced 160 exabytes of information is like saying the whole world used 50 billion tonnes of water. Was that water just running down the pipe into the sewer, or did somebody actually drink it to sustain life?

    Mechanistic stats are stupid.
