News for nerds, stuff that matters

Stored Data to Exceed 1.8 Zettabytes by 2011

Posted by CmdrTaco on Wed Mar 12, 2008 07:44 AM
from the less-than-eighty-percent-porn dept.
jcatcw writes "By 2011, there will be 1.8 zettabytes of electronic data stored in 20 quadrillion files, packets or other containers because of, among other things, the massive growth rate of social networks, and digital equipment such as cameras, cell phones and televisions, according to a new study by IDC. Data is growing by a factor of 10 every five years. According to John Gantz, IDC's lead analyst, "at some point in the life of every file, or bit or packet, 85% of that information somewhere goes through a corporate computer, website, network or asset," meaning any given corporation becomes responsible for protecting large amounts of data that it and its customers may not have created. The study, which coincided with the launch of a "digital footprint" calculator, also found that as the world changes over to digital televisions, analog sets and obsolete set-top boxes and DVDs "will be heaped on the waste piles, which will double by 2011.""
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by sleeping123 (1109587) on Wednesday March 12 2008, @07:47AM (#22726990)
    Porn
    • , and other Usenet binaries, and the world's torrents... all downloading through your ISP, which is a corporation. Anything on the internet comes through corporations: ISPs. How is that 85% figure surprising?
      • Re: (Score:3, Interesting)

        Some of the data transfers really seem wasteful. I download a Linux DVD ISO file, burn it onto a DVD, install the system on a new hard disk drive, then download another couple of gigabytes of updates. Wouldn't it be simpler to just have an installation DVD that creates a minimal system, which then downloads the latest version of each module?

        And that DVD is really only used once and then forgotten about.
    • "Documentaries".
      • by beckerist (985855) on Wednesday March 12 2008, @08:45AM (#22727478) Homepage
        From: http://en.wikipedia.org/wiki/Google_platform [wikipedia.org]

        # Upwards of 450,000 servers ranging from a 533 MHz Intel Celeron to a dual 1.4 GHz Intel Pentium III (as of 2005)
        # One or more 80GB hard disks per server (2003)
        So, at least using these numbers, let's say they average 120 GB per server (one and a half 80 GB drives). That would mean they have 54,000 TB, or 54 PB. I'm sure they have even more now, but it's a point of reference. Yes, Google has a finite amount of space!
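A quick sanity check of the arithmetic in the comment above, as a minimal sketch. The server count and the 120 GB average are the commenter's figures, not measured values, and decimal units (1 TB = 1,000 GB) are assumed.

```python
# Back-of-envelope check of the storage estimate quoted above.
# Both inputs are the commenter's assumptions, not verified figures.
SERVERS = 450_000        # "upwards of 450,000 servers"
GB_PER_SERVER = 120      # assumed average: one and a half 80 GB drives


def total_storage_gb(servers: int, gb_per_server: int) -> int:
    """Total capacity in decimal gigabytes."""
    return servers * gb_per_server


total_gb = total_storage_gb(SERVERS, GB_PER_SERVER)
total_tb = total_gb / 1_000   # decimal terabytes
total_pb = total_tb / 1_000   # decimal petabytes

print(f"{total_gb:,} GB = {total_tb:,.0f} TB = {total_pb:,.0f} PB")
```

Under those assumptions the total comes to 54,000 TB, matching the figure in the comment.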
  • Riiight (Score:3, Insightful)

    by InvisblePinkUnicorn (1126837) on Wednesday March 12 2008, @07:47AM (#22726992)
    "as the world changes over to digital televisions, analog sets and obsolete set-top boxes and DVDs"

    That's what I plan on doing. I'm going to throw out all my DVDs and buy the Blu-Ray equivalent.

    Or maybe I'll just keep the DVDs (and the player) and buy whatever cable adapters I need to get them working on these newfangled devices.
    • Re: (Score:3, Informative)

      What, are you kidding? Blu-ray has horrifying DRM and doesn't really look that much better than DVDs with good postprocessing. I'd never even think of supporting DRMed blu-ray.
      • Re:Riiight (Score:5, Insightful)

        by Tony Hoyle (11698) <tmh@nodomain.org> on Wednesday March 12 2008, @08:02AM (#22727102) Homepage
        Get a decent TV. There's a massive difference between DVD and Bluray.

        DRM? Who cares. I'm not planning on copying 20gb+ disks.
        • Re:Riiight (Score:5, Insightful)

          by Aenoxi (946506) on Wednesday March 12 2008, @08:37AM (#22727408)
          Please mod parent up. If I had a nickel for every person who spouted that same upscaled DVD tripe, then, then, then I'd have enough to buy a Blu Ray disk ;)

          There is a world of difference between 1080p and DVD quality - but you'll never see it if your TV can't natively display 1080p (or at least 720) or you use a composite video interconnect rather than HDMI/DVI or component (yes, I know, but you'd be surprised how many people still do...)

          Whilst I can imagine that a true 1080p picture might look similar to upscaled DVD on a small screen (which necessarily has very small dot pitch), the difference becomes clear as you scale up the screen beyond 30 inches or so (and bleeding obvious once you get beyond 42"). Interpolation and post-processing can only get you so far. Notwithstanding CSI, even high-end upscaling cannot create genuine detail that didn't exist in the original image - and the more post-processing you do, the more artifacts you are going to see.

          I've been running a Pioneer BR player via HDMI to a 1080p 60" plasma for 6 months and whilst upscaled DVD is nice, it can't hold a candle to the 1080 BR picture. Double blind test anyone on a similar system and there's no way you'd get anything but a 100% success rate of identifying HD BR vs upscaled DVD.
          • You're not quite putting it in the right terms for the slashdot audience. How about:

            When you download a 5 GB Blu-ray rip it will look much better than a 1 GB DVD rip if you play it on the right equipment. The right equipment being a display to do it justice, and mplayer to do the upscaling nicely :)

            Seriously though, on reading your post I'm shocked by just how much hassle everything is using legal components. We got our TV cheaply as it wasn't "HD-Ready". Apart from the lack of sticker it does do 1280x1024 s
        • Some early Blu-Ray players are incapable of playing the latest discs because of DRM. Plenty of the first HDTVs will force your overpriced HD content to be downscaled to SD because they don't support HDCP, as soon as they start using ICT.

          I'd say DRM matters, no matter whether you plan to copy discs or not. Probably more so than to the pirates, as usual.
        • Re: (Score:3, Insightful)

          DRM? Who cares. I'm not planning on copying 20gb+ disks.

          I would have said that about DVDs not so long ago. Disk space and bandwidth become cheaper with time.

          And besides copying, a DRM crack allows me to play discs on the operating system of my choice, to extract small parts of the feature for purposes of review, criticism or parody, and to bypass any annoying previews, trailers, propaganda, threats, or other junk that the studio may have seen fit to prepend to the show.

        • Re: (Score:2, Insightful)

          I don't even care enough about high fidelity imagery to wear my glasses day to day. The resolution of a normal TV is plenty for me.

          High fidelity audio however is an entirely different story.
      • I'll wait until DRM is cracked (especially region coding).
      • It was a joke. I was pointing out how dumb it would be to throw out all your DVDs and buy a bunch of overpriced discs just because they're the new thing.

        *WHOOOSH*
  • Y2k300! (Score:5, Funny)

    by xZgf6xHx2uhoAj9D (1160707) on Wednesday March 12 2008, @07:50AM (#22727006)
    If, like the summary (but not the article for some reason) states, total data is growing by a factor of 10 every 5 years, then somewhere around the year 2300 we'll have 10^80 bits stored. The number of elementary particles in the known universe is estimated to be between 10^79 and 10^81. Seems we're kind of screwed at that point.
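The projection above can be checked with a few lines, as a rough sketch. It assumes the summary's 1.8 zettabytes in 2011 is a decimal figure (1.8 × 10^21 bytes), that growth stays at exactly 10x every five years, and that 10^80 bits is the target; the function name is made up for illustration.

```python
import math

# Assumptions taken from the summary and the comment above.
BITS_2011 = 1.8e21 * 8    # 1.8 ZB (decimal) expressed in bits
TARGET_BITS = 1e80        # roughly the estimated particle count
GROWTH_FACTOR = 10        # tenfold growth...
PERIOD_YEARS = 5          # ...every five years


def year_reaching(target_bits: float,
                  start_bits: float = BITS_2011,
                  start_year: int = 2011) -> float:
    """Year in which stored data reaches target_bits at constant growth."""
    periods = math.log10(target_bits / start_bits) / math.log10(GROWTH_FACTOR)
    return start_year + periods * PERIOD_YEARS


year = year_reaching(TARGET_BITS)
print(f"10^80 bits reached around the year {year:.0f}")
```

That is about 58 more factors of ten, or roughly 289 years, which lands close to the year 2300 as the comment says.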
  • Well yes... (Score:3, Insightful)

    by theM_xl (760570) on Wednesday March 12 2008, @07:53AM (#22727026)

    85% of that information somewhere goes through a corporate computer, website, network or asset
    That's all? I mean, a good deal will be created by corporations in the first place, all the major bits of internet infrastructure belong to one corporation (for-profit or not) or another, the post office is a corporation... 85% seems low, actually.
    • Re: (Score:3, Insightful)

      I don't know about that. Imagine all of the digital pictures taken that never travel outside the home user's computer, memory card or CDs. Even more important, consider the amount of digital video data generated by home users with their camcorders. A single 60 minute Mini-DV tape is in the neighborhood of 15 GB. That's one single tape, and my family alone has dozens of them just from a single year. Even if those videos are uploaded to the internet, they must first be converted to some other format that
  • by peragrin (659227) on Wednesday March 12 2008, @07:54AM (#22727034)
    Is that half of it will be copies of Windows Vista, XP, and a few hundred Linux distros.
  • by EricR86 (1144023) on Wednesday March 12 2008, @07:59AM (#22727078)

    Since we're talking about very large orders of magnitude, it would help to know which definition of zettabyte they're using.

    2^50 bytes or 10^15 bytes?

    The former is astronomically larger.

    • Re: (Score:2, Informative)

      2^50 bytes or 10^15 bytes?
      What I really meant was: 2^70 bytes or 10^21 bytes? Pfft. Only a few orders of magnitude... :|
    • If by "astronomically larger" you mean 12.6%, then I'm astronomically larger than the average Indonesian male.
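To put numbers on the gap being argued about here, a small sketch that compares each binary prefix with its decimal counterpart. The helper name is hypothetical; the only inputs are the standard definitions (2^(10n) versus 10^(3n) bytes).

```python
def binary_excess_percent(power_of_1000: int) -> float:
    """How much larger 2**(10*n) is than 10**(3*n), in percent."""
    binary = 2 ** (10 * power_of_1000)
    decimal = 10 ** (3 * power_of_1000)
    return (binary / decimal - 1) * 100


# n = 5 is the peta scale (2^50 vs 10^15), n = 7 the zetta scale (2^70 vs 10^21).
for name, n in [("kilo", 1), ("giga", 3), ("peta", 5), ("zetta", 7)]:
    print(f"{name:>5}: binary unit is {binary_excess_percent(n):.1f}% larger")
```

The 12.6% figure corresponds to the 2^50 versus 10^15 pair from the original question; for the corrected 2^70 versus 10^21 pair the gap is about 18%.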
    • At the risk of being modded down, isn't that distinction the whole point of the IEC's "zebibyte" proposal?

      Anyway, most measurements of mass storage (bandwidth quotas, hard disk capacity, etc.) seem to be measured in actual megabytes (MB), gigabytes (GB) etc., as opposed to binary megabytes (MiB), binary gigabytes (GiB) and so on. Binary byte prefixes only seem to be used for RAM and flash these days, presumably because of the convenient manufacturing realities involved - and I really wish that manufacturers of th
      • Re: (Score:3, Insightful)

        In theory, yes. In practice, the whole Zebibyte thing is complete nonsense. Everyone other than hard drive manufacturers has been using the SI prefixes to refer to power of two quantities when referring to binary data for 40 years. Attempting to redefine them retroactively just causes confusion. If I see something that says KB, and don't know when it was written, I have no idea if it pre or post-dates the KiB nonsense and so I have no idea if it refers to 1024 or 1000 bytes.
        • Re: (Score:3, Insightful)

          So you're better off if someone does use the proper prefix then. Without it, KB could mean either. With it, at least you know what kiB means, so you're definitely right some of the time.
        • by Waffle Iron (339739) on Wednesday March 12 2008, @09:27AM (#22727832)

          Everyone other than hard drive manufacturers has been using the SI prefixes to refer to power of two quantities when referring to binary data for 40 years. Attempting to redefine them retroactively just causes confusion.

          No, the confusion is caused by using a pseudo-binary based number system in a world where almost everything else is decimal.

          Quick question: You have a 2000 MiB video file and a 2470 MiB video file. Will they both fit on a 4.37 GiB DVD? Now you need your calculator.

          It's much easier to figure out if a 2097 MB and a 2590 MB file fit on a 4.7 GB disk. You can do that in your head.

          I've been burned numerous times by programs ambiguously reporting sizes in KiB and MiB causing me to run out of space on something that I'm trying to fill. All storage sizes should always be reported in decimal numbers. If RAM manufacturers want to keep using powers of two due to the implementation detail of how their chips are constructed, they should *always* use KiB, MiB and GiB.
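The "quick question" above can be worked through in both unit systems, as a minimal sketch. It assumes a nominal single-layer DVD capacity of exactly 4.7 × 10^9 bytes; real discs and filesystems differ slightly.

```python
MIB = 2 ** 20                         # bytes per mebibyte
GIB = 2 ** 30                         # bytes per gibibyte
MB = 10 ** 6                          # bytes per decimal megabyte
DVD_CAPACITY_BYTES = 4_700_000_000    # nominal 4.7 GB disc (assumption)

files_mib = [2000, 2470]              # file sizes from the comment, in MiB
total_bytes = sum(files_mib) * MIB
fits = total_bytes <= DVD_CAPACITY_BYTES

print(f"files: {total_bytes / GIB:.3f} GiB = {total_bytes / MB:.0f} MB")
print(f"disc:  {DVD_CAPACITY_BYTES / GIB:.3f} GiB = {DVD_CAPACITY_BYTES / MB:.0f} MB")
print("fits" if fits else "does not fit")
```

Both files do fit, with about 13 MB to spare, which is easy to see in decimal (2097 + 2590 = 4687 MB against 4700 MB) and harder to see when the sizes are given as 4470 MiB against 4.37 GiB.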

            • Re: (Score:3, Insightful)

              For everything else, that is, using a computer, it's back to binary.

              It is not. RAM is the only quantity in computers commonly measured in binary. Hard drives have always been in decimal. Floppies have always been in an even more stupid system where "MB" == 1000*1024. Clock speeds have always been decimal.

              Going farther, measuring IO or network performance, to cite two trivial examples, or understanding any of those subjects in general, you're binary to binary.

              You appear to have been bamboozled yourself by the confusion caused by this issue. I/O speed of buses is always decimal because it derives from MHz and GHz, which are decimal. Network bandwidth is more often measured in decimal megabits, not binary.

              You seem to thi

  • Wrong metric? (Score:4, Interesting)

    by guruevi (827432) <evi.smokingcube@be> on Wednesday March 12 2008, @08:06AM (#22727162) Homepage
    I was wondering whether they got their calculations a bit wrong. A zettabyte is 1 million petabytes. Where I work has about 2 petabytes in a few SANs, and around the world there are thousands of larger institutions and millions of smaller ones that store in the terabyte range. The place I worked before had about half a petabyte just in tape backups for credit card and other transactions, catalog and pricing information, images, etc., and that was just an average clothing company, hardly rivaling JCPenney or Macy's. I'm also thinking about Wal-Mart, with millions of products and thousands of stores. And we're just talking about SANs here, mainly in the US, not including desktops, laptops, cameras, personal information, or Google.

    On another note, how much does a zettabyte actually yield these days? Drive manufacturers might just give you 700 petabytes for it. Oblig. XKCD: http://xkcd.org/394/ [xkcd.org]
  • 1,800 exabytes of raw data. Anyone like to guess how much of this will be useful data? Judging by some system specifications I have read, 5% is being generous. A twenty-page specification can be condensed to a single page of useful information, and over half is "boilerplate" disclaimers, etc., which are the same in all the company's specifications.
    • No kidding. Just think of all the space used to store formatting data. I just typed the expression "Blah." (minus quotes) in a .doc file, it's 19,456 bytes in size to store 5 bytes of information.

      I'm not saying that formatting data is entirely without worth, but there are definitely some improvements to be had WRT efficiency.
    • Just store it in XML and, at least, triple the storage requirements.
  • by Bombula (670389) on Wednesday March 12 2008, @08:14AM (#22727214)
    The interesting thing here is the part about data being relayed through third parties and the issues involved. As for the data figures themselves, those are pretty misleading because data does not equal useful information. There is far less useful information in an MS Word file than 100 KB or whatever, for example, so these zettabyte figures bandied about aren't terribly meaningful other than to draw attention to the infrastructure needed to support digital data relaying. To see my point, turn things upside down: there is vastly more data stored on an LP record or celluloid film than on a CD or digital photograph. But is that data useful information? Only a few audiophiles and filmophiles would argue that there is.

    Yes, there is a lot of data in the world. But is there really that much more information out there? A zillion copies of the same song just means more data, not more information.

  • don't worry, we can mine landfills and recycle the plastic out of them at some point. After all, the plastic isn't going anywhere, and we're only going to get more technologically advanced, so at *some* point, surely this will make sense!
  • Photoshop was 22,000 files last time I checked. ...and I know people who think that's cool.

  • Sounds like a fraternity thing.
  • We're gonna need more SI prefixes.
    • or we're one big EMP pulse away from losing almost 2 zettabytes of data.

      and technically, there is only one SI definition of zettabyte [wikipedia.org], which is 10^21 bytes. Binary definitions such as zettabyte = 2^70 bytes are being renamed by the IEC to avoid ambiguity (proposed to be zebibyte, for zetta binary byte).
  • ...640k ought to be enough for everyone.
    • secondly, who really cares? Most of it is cached google pages and pron anyway...
      That's why /.ers care.
      • by epine (68316) on Wednesday March 12 2008, @09:21AM (#22727784)

        secondly, who really cares? Most of it is cached google pages and pron anyway...
        That's why /.ers care.
        But actually, no. We're very close already to being able to generate pron on demand without involving any principal photography. You won't even need to say what you want; that will be ascertained on the fly by neuro-cranial-bio-feedback.

        After enough of the male population has been brain mapped, it will probably turn out like spam: there's only so many unique permutations, as long as the scene is dressed up a little differently from time to time to maintain the novelty factor.

        Pron seems to be a lot like Big Bertha, where each mortar round was larger than the last, to accommodate progressive barrel enlargement. Eventually the images become extremely shocking to get any response at all.

        http://www.wired.com/science/discoveries/news/2008/03/mri_vision [wired.com]

        The future of compression is not to send the picture itself, but the reduced specification for an image that produces the same effect on the human visual system. We're already doing this with psycho-acoustic encoding.

        Once we have a sufficiently sophisticated model of human sensory perception, mental and emotional responses (which will run to TBs I'm sure), we can run a competition for the best feature movie encoded in under 4KB. Mostly it would describe desired emotional responses and cognitive states, the actual images would be back-generated to achieve this effect as determined by the human perceptual model.
    • I have a ten story library here on campus for you to store and index in your head.

      Wet processing is excellent for some restricted purposes, wet storage just plain sucks.