Data Storage Networking Hardware

Storing Data For the Next 1,000 Years

An anonymous reader writes "This may be an interesting take on creating long-term storage technologies. A team of researchers at UCSC claims to have come up with a power-efficient, scalable way to reliably store data for a theoretical 1,400 years with regular hard drives. TG Daily has an article describing this technology and it sounds intriguing as it uses self-contained but networked storage units. It looks like a complicated solution, but the approach is manageable and may be an effective solution to preserve your data for decades and possibly centuries." Nice to see research on this using the kinds of real-world figures for disk lifetimes that recent studies have been turning up.

  • by Raindance ( 680694 ) * <`johnsonmx' `at' `gmail.com'> on Tuesday April 22, 2008 @11:43PM (#23167886) Homepage Journal
    Part of the solution to very long-term storage, of course, has to involve a method to read the data you've archived.

    I tend to think systems such as the one described in the article aren't good long-term solutions. If their math works on the failure rates, that's fantastic- but just try to hook up a 2028 computer to one of these things to pull the data off.*

    (Ever tried to get data off an obsolete tape backup?)

    I think the most reliable archival system is going to be an active one, where data is saved on modern storage hardware and always copied to more modern tech as it arrives.

    The other side of this is, for anything more advanced than text-- given that you can get at the data, what do you open it with? File types die over time and it's basically impossible to find programs to open certain files nowadays, much less such programs that will run on a modern OS. I think the answer to this has to be virtualization. Store the data *and* programs that can open the filetypes you need opened inside a portable virtual machine (e.g., a Windows vmware image). Over time, you may have to layer virtual machines inside virtual machines as OSes grow obsolete. But that's okay- virtualization is only going to become more elegant, and the end result is that you'd have your data in its original environment, completely accessible by native programs.

    *Some elements of this problem could be solved by having backup servers use wireless and filesharing protocols that might stand the test of time- e.g., 802.11n and SAMBA. No need to just pick one 'most likely to be future-proof' combination, either: run bluetooth and serial access, webdav and a http fileserver, etc. Still, *not* storing data on modern hardware is always going to be a risky kludge.

    There's probably room for a lucrative business based around this-- figuring out the most elegant way to archive and retain meaningful access to data under various computing/disaster scenarios. Hey, I do consulting. :)
    • by LoudMusic ( 199347 ) * on Tuesday April 22, 2008 @11:53PM (#23167958)

      (Ever tried to get data off an obsolete tape backup?)

      I think the most reliable archival system is going to be an active one, where data is saved on modern storage hardware and always copied to more modern tech as it arrives.
      Oh man, the headaches involved here. It only takes five years and archived data is obsolete. And yes, virtualization can help, but in the past I've resorted to keeping an entire system available, off-line, to guarantee that the client be able to open their data. Sometimes you get lucky and there's either a plug-in for the old app to export to the new app, or one for the new app to import from the old app. But even on the rare chance that one is available, I've never seen a 100% conversion - even on simple stuff.

      Maybe old data was meant to die.
      • Re: (Score:3, Informative)

        by dbIII ( 701233 )

        Oh man, the headaches involved here. It only takes five years and archived data is obsolete

        Only in the MS Windows world. For the rest of us if it predates ASCII we can use "dd" to convert from EBCDIC if we have to. The tapes from 1982 I recently read in however were transcribed to new media for me first in case the media had become damaged over time and because I'm not familiar with 9 track drives. It was a direct copy so the data format was retained even if it was on new media (IBM3490 format but done

      • Re: (Score:3, Interesting)

        by npsimons ( 32752 ) *

        Maybe old data was meant to die.

        Maybe proprietary formats were meant to die.

        I still have documents in plain ASCII that I can open from over ten years ago. I've got a few .wri's that I can still open, thanks to reverse engineering efforts by the open source community. Older proprietary formats are now de facto open standards. The thing that can kill this off? Patents, for one. Trade secrets in the form of overly complicated proprietary formats for another.

        And yes, I realize I'm not talking about GB's of

    • by oGMo ( 379 ) on Wednesday April 23, 2008 @12:11AM (#23168044)

      There's probably room for a lucrative business based around this-- figuring out the most elegant way to archive and retain meaningful access to data under various computing/disaster scenarios. Hey, I do consulting. :)

      Find a chisel. [wikipedia.org]

    • The other side of this is, for anything more advanced than text-- given that you can get at the data, what do you open it with? File types die over time and it's basically impossible to find programs to open certain files nowadays, much less such programs that will run on a modern OS.

      Simple: You use only formats that are openly specified and free software. HTML and everything XML-based actually is text, while format descriptions and decoders for Theora, Vorbis etc. will be around for a long time, probably

      • Re: (Score:3, Insightful)

        by jimicus ( 737525 )

        Simple: You use only formats that are openly specified and free software. HTML and everything XML-based actually is text.
        Even if it's text you're not 100% out of the woods. EBCDIC, ASCII, Unicode plus however many others have existed over the years.

        While it's not generally too awkward to convert from one character encoding to another, "just text" is a slight oversimplification.
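As a rough illustration of the encoding point above, the sketch below decodes EBCDIC bytes into Unicode text in Python, much as `dd conv=ascii` does for old tapes. It assumes the data uses IBM code page 037 (Python's `cp037` codec); a real archive may use a different EBCDIC variant, which is why the encoding needs to be recorded alongside the data. The function name and sample string are illustrative only.

```python
def ebcdic_to_text(raw: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python (Unicode) string.

    Assumption: the tape or file was written in IBM code page 037.
    """
    return raw.decode(codepage)


# Build a small EBCDIC sample by encoding known text.
sample = "HELLO, 1982".encode("cp037")

# The raw bytes are not readable as ASCII...
assert sample != b"HELLO, 1982"

# ...but decode cleanly once the right code page is known.
text = ebcdic_to_text(sample)
print(text)

# Re-encode as UTF-8 for storage on newer media.
utf8_bytes = text.encode("utf-8")
```

Decoding with the wrong code page usually does not raise an error; it silently produces the wrong characters, so the conversion is only as good as the record of which encoding was used.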
    • Try harder (Score:5, Insightful)

      by daBass ( 56811 ) on Wednesday April 23, 2008 @12:40AM (#23168174)

      (Ever tried to get data off an obsolete tape backup?)
      There are loads of people that can make this work. The most important thing is having the specs of what is on it, how it was recorded. (even just a few hints and some knowledge of how computer systems in that era might have recorded data is enough) That the machine used is no longer functioning and had an interface that doesn't work with your USB-only modern PC anyway is of no relevance.

      Given the media, specifications and some time and money, a trio of engineering, electronics and CS students will make a machine that will read any old tape, punchcard, early HDD, etc. A CD is laughably simple technology, an engineer 100 years from now will build a player (in a way that may not look anything like our current players) in no time at all.

      Today's technology is even more well documented and certainly not beyond the capabilities of future generations to make readers for.

      If you find an old tape and want to do it in an afternoon, you are out of luck. If you are an historian that really, really wants to get to the data, it is not all that hard.
      • What's the data worth to you? That is the question you have to ask with archival.

        an engineer 100 years from now will build a player (in a way that may not look anything like our current players) in no time at all.

        What on earth makes you think there will still be electricity in 100 years? Civilizations don't expand exponentially for ever. They hit a limit and in the following economic collapse there is all sorts of chaos. Ultimately the only assumption you can make for storing information for very very long times is that a human being be able to see and touch it.

        At the moment, the very, very best method of long term archival we have i

        • Ummm ... I'd like to see how you propose we store movies on goat skin.

          More seriously, perhaps our goal should be to store our information in such a way that future civilizations, once they're at least as advanced as we are, will be able to read it. We don't really need to be able to see our YouTube videos during the post-apocalyptic nuclear winter or whatever disaster you're envisioning for us, but, after we recover, it would be kind of nice to still have them around :)
      • "A CD is laughably simple technology, an engineer 100 years from now will build a player (in a way that may not look anything like our current players) in no time at all"

        Rubbish. For a start, CD players are not simple devices; otherwise Edison would have invented them. (Just because something is now a commodity item doesn't mean anyone could build one from scratch.) If you're unconvinced, go study the maths on auto-focusing and pit-tracking lasers, not to mention D/A conversion, Reed-Solomon error correction, etc.
        • Re: (Score:3, Insightful)

          by daBass ( 56811 )

          If you're unconvinced, go study the maths on auto-focusing and pit-tracking lasers, not to mention D/A conversion, Reed-Solomon error correction, etc.

          Why would you use such arcane methods 100 years from now? If they looked closely at the disc, they would see the patterns. Knowing (as they will) that we used to use "binary", they'll quickly assume they represent 1 and 0. Take a quick scan of the entire disc and do the rest in memory. Somehow I doubt they'll have much of a problem with D/A conversion either. (which is so simple, they'll figure that one out too. Understanding the data is supposed to be audio, they'll quickly put two and two together. Mos

        • My brother and I can and have built a trebuchet and a ballista. We did cheat and involve power tools, but that's just for speed. He also fences, shoots a longbow, and makes armor. Just because not everyone has the skills or the knowledge to do something doesn't mean that you couldn't find someone.
          • Re: (Score:2, Interesting)

            by CastrTroy ( 595695 )
            Yes, you could build a trebuchet in your backyard. However, is it up to the standards of where things would have been when construction of trebuchets was at its prime? Saying you could build one with power tools just to save time doesn't really hold much water with me. We could build the pyramids with all the modern tools we have at our disposal. The trick is that the Egyptians were able to do it without all those fancy tools. There's still a lot of controversy about how the pyramids were actually acco
    • Re: (Score:2, Funny)

      by Zencyde ( 850968 )
      Only problem I can see with it is generation loss. Copy something over and you're missing a couple of bits. Okay, not too much harm done. Copy it again and you're missing even more. Okay.. a bit of a hit we can keep going. By the time you've copied it twenty times, it sounds like this: http://www.youtube.com/watch?v=Yu_moia-oVI [youtube.com]
    • I agree that virtual machines are a solution to file formats becoming obsolete, but I think that emulation may be more appropriate than virtualization for this purpose. VMware can only be used on x86 computers, and even on x86 computers future processors may have subtle differences that could affect old virtual machines. An emulation of an entire computer, including the processor, can be ported to any computer, and have exactly identical behavior.

      Also, it may not be necessary to layer virtual machine

    • Some elements of this problem could be solved by having backup servers use wireless and filesharing protocols that might stand the test of time- e.g., 802.11n and SAMBA. No need to just pick one 'most likely to be future-proof' combination, either...

      I'm going to, anyway: The Web. Straight up HTTP, with HTML documentation. Fall back to plain-text if you're extra-paranoid, but if you don't do any styling, straight HTML is very future-proof and backwards-compatible. If you do anything on top of that (Dav, etc

    • by RKBA ( 622932 )

      Ever tried to get data off an obsolete tape backup?
      No problem. I just slap it into my tape drive. [wikimedia.org]
    • I think the most reliable archival system is going to be an active one, where data is saved on modern storage hardware and always copied to more modern tech as it arrives.

      I think the reverse, and one should choose the most low tech technology that is viable. For example, if you'd have some simple binary storage variant of an abacus and etch a translation table on a plate (which should not be necessary but make it more easy), even if the earth got nuked and your precious, most advanced technology is LOST together with ALL THE DATA in your setup, Mad Max and his buddies will still be capable of recovering my pr0n. Obviously, this is a technology that is not viable, something

    • That is nothing like half the problem, not even a few percent of it. For starters, they had better not put it in a place that is likely to be invaded or attacked. The Baghdad Museum [guardian.co.uk] housed things of this age and older.

      The real problem is not scientific, it is political. Vast amounts of data have been lost over the last millennium, and the things that get preserved are preserved by institutions such as Churches that are wealthy and can survive the vagaries of political whim (even they lose a lot when some dictato [wikipedia.org]

    • Re: (Score:3, Funny)

      by dpilot ( 134227 )
      Can't believe that this long into the thread, and nobody has mentioned OOXML. Obviously your data needs to be in an open and documented format, so that it has the best chance of being read and the metadata properly interpreted later. Since it's an ISO standard, OOXML must be the obvious choice to meet requirements.
  • by erroneus ( 253617 ) on Tuesday April 22, 2008 @11:45PM (#23167906) Homepage
    No, not punch cards... but close!

    Stone and chisel. That's the way to store data for 1,000 years. The reason why I say this is simple. The more "religious" the world's populations become, the closer to the dark ages we become. (The reverse is true as well as history illustrates.) I expect there will be a second "dark ages" at which point all other technologies will simply not be available.
    • I take it weather will not be available in this future of yours?
    • by martin-boundary ( 547041 ) on Wednesday April 23, 2008 @12:17AM (#23168078)
      Why not microscopic etching [zyvex.com]? One advantage over the stone and chisel approach is that you can carry the mountain in your pocket until the next civilization figures out how to read it...
      • Re: (Score:2, Insightful)

        by Anonymous Coward
        Umm, stone carvings aren't immune to little things like weathering effects. Microscopic etching isn't going to be any better at retaining data for long periods of time than a stamped CD (which is essentially the same thing).

        The reason why ancient carvings are durable is because they're macroscopic, and hence inherently have lots of built-in redundancy. (The shape of a letter, for example, uses vast quantities of atoms shaped in a precise way to convey very little information; 5 to 7 bits worth, for Latin-
        • by dajak ( 662256 )
          The reason why ancient carvings are durable is because they're macroscopic, and hence inherently have lots of built-in redundancy.

          Excellent point. Even CD lasts a long time if used properly. Write them with a very low density. Use a drill and make big holes in them. Another solution is to lay out the CDs in the shape of letters and then cover them in 30 feet of sand or something to prevent disruptions.
    • Stone and chisel. That's the way to store data for 1,000 years. The reason why I say this is simple. The more "religious" the world's populations become, the closer to the dark ages we become.

      You've found Leibowitz's grocery list.

    • by evanbd ( 210358 ) on Wednesday April 23, 2008 @01:32AM (#23168438)
      You could, of course, update the technology a bit: Rosetta Project [rosettaproject.org]. High density, readable with a high quality microscope, and partially readable with the naked eye -- the spiral of shrinking text should make the usage instructions obvious: "get a magnifying glass, there's more here."
    • by dajak ( 662256 )
      What's wrong with clay tablets? A writer that etches stone tablets would create a lot of fine dust. Wet clay is easier to handle and cheaper to make.
  • From TFA:

    Santa Cruz (CA) - Have you ever thought how vulnerable your data may be through the simple fact that you may be storing your entire digital life on a single hard drive? One single drive can hold tens of thousands of pictures, thousands of music files, videos, letters and countless other documents. One malfunctioning drive can wipe out your virtual life in a blink of an eye. A scary thought. On a greater scale, at least portions of the digital information describing our generation may be put at risk
    • Bummer, guess I don't have to cry endless tears over the loss of my "digital life".

      I know what you mean; I've lost backups (and stuff that I didn't back up), and it was really inconsequential. I really think if I hadn't lost it I'd never have looked at it again anyway. Keeping important stuff for work is one thing, but like you said, so far it will all fit on one USB key or some DVD's.

      I suppose I'm an un-cool 21-st century luddite, since I don't feel the need to construct and preserve some massive digital emo-temple to myself.

    • by Cheapy ( 809643 )
      I was talking to a friend of mine in the computer labs in my uni's CS building today. We were talking about Windows 98, then Windows 3. This reminded my friend of something that happened. He's a part of the Tech Support mailing list that our uni has. Just recently, someone sent out an e-mail asking if anyone had the hardware and software to get some data off of some 8 inch floppies. Some medical group needed data on Vietnam vets, and had the floppies with the data on it.

      Congratulations, you have 2 DVDs with
    • The only thing I really back up are digital photos and home videos. These all can be done for free with online services, like http://www.picturepush.com/ [picturepush.com] and youtube.com. I'm not sure why people are worried about backing up their music and movies, that stuff can be easily replaced.
  • by Tmack ( 593755 ) on Tuesday April 22, 2008 @11:50PM (#23167944) Homepage Journal
    Since those "recent studies" links have already degraded into 404's. Maybe something like what was covered a few days ago? [slashdot.org]

    tm

  • But what about... (Score:3, Insightful)

    by bigredradio ( 631970 ) on Tuesday April 22, 2008 @11:52PM (#23167952) Homepage Journal

    Since there will be many holes shot into this theory, let me be one of the first to fire a shot. Electricity (as we know it) may not be around then. I am not predicting the dark ages, but who's to say that far in advance there is still a live socket.

    Any storage device that relies on outside power cannot be guaranteed for 100 years, let alone 1400. I would have more faith in a stone tablet.

    This is a fine example of "academic" research dollars at work.

    • Re: (Score:2, Insightful)

      by fucket ( 1256188 )
      Good point. Also, in 1400 years there may no longer be any humans on earth to read the tablets you store so you might want to lock a human or two in the vault with your data.
    • by cgenman ( 325138 )
      This is a fine example of "academic" research dollars at work.

      As opposed to the pragmatic issues of industry, this long-term thinking is actually the sort of problem that academia is supposed to tackle, because it sometimes gives the major breakthroughs which revolutionize life. Like, for example, some sort of giant computer system which would survive a nuclear attack... in case you really need those trajectory tables calculated remotely during nuclear winter.

      And it does have pragmatic uses. It is a la
    • Re: (Score:2, Insightful)

      "Electricity (as we know it) may not be around then."

      I'm not sure how you expect electricity to 'change' in the future.

      If a civilization can't generate electricity, then they wouldn't have the technical knowledge to even know what to do with digital data, so the whole point would be moot.
      • Re: (Score:2, Insightful)

        by Ihmhi ( 1206036 )

        It's electricity, not Greek Fire. It's not some big mystery on how to generate it. Even if we're using microscopic black holes to generate power, it would not be hard to set up a windmill and some copper wire.

        The bigger issue would be being able to actually read the data.

      • I'm not sure how you expect electricity to 'change' in the future.

        Like every civilization in history, we think we know almost everything about the physical workings of the universe; there's just a few tiny holes that need to be plugged, then the tapestry is complete.

        Even if we don't discover some better way to transfer power in the next 1000 years (if you can't grasp how much technology can change in that amount of time, just look 1000 years the other way), don't you think we'll at least optimize our use of electricity? Eventually the connectors of today will be obso

  • This is SO scary! I had just been looking at Wikipedia looking up some obscure phenomenon, and went over to Slashdot. While the page is loading my thoughts drift and I think how important isn't Wikipedia, for now and the future. Someone should print it out and... What? A Slashdot article claims that someone will print the German edition. I manage to collect my thoughts and login, and, notice THIS article... I'm drifting in a black void by now... We wikipedians have come to bring you back home... Sorry fo
  • by Anonymous Coward on Wednesday April 23, 2008 @12:03AM (#23168004)
    Did anyone else notice that the lead researcher's name is Mark Storer? How perfect is that?
  • Since TFA talks about 2 & 3 MB/sec throughput rates...
    How long will this array take to fill up the first time around?

    A 10 PB storage system could be built for about $4700 with an annual operational cost (power for running and cooling the system) of about $50.

    Unless 10 PB (petabytes) means something other than what I think (10,000 terabytes), where did they get the $4700 number?
    I even read their definition of static cost [usenix.org] (You have to go up a few paragraphs) and I still don't know.

    • by Blkdeath ( 530393 ) on Wednesday April 23, 2008 @12:09AM (#23168040) Homepage

      Unless 10 PB (petabytes) means something other than what I think (10,000 terabytes), where did they get the $4700 number? I even read their definition of static cost [usenix.org] (You have to go up a few paragraphs) and I still don't know.

      Table 3: Comparison of system and operational costs for 10 PB of storage. All costs are in thousands of dollars and reflect common configurations. Operational costs were calculated assuming energy costs of $0.20/kWh (including cooling costs).

      Does $4.7 million sound a bit more realistic?

    • by Dahamma ( 304068 )
      Yeah, exactly. They are clearly off by an order of magnitude, unless they found a secret source of 1TB drives for $4 each....
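The arithmetic behind the exchange above can be checked in a few lines. This is a back-of-the-envelope sketch: it assumes decimal units (1 PB = 1,000 TB) and compares the $4,700 figure as reported in the article with the $4.7 million reading of Table 3 suggested in the thread. The function name is made up for illustration.

```python
PB_IN_TB = 1000  # assumption: decimal units, as drive vendors count


def cost_per_tb(total_dollars: float, petabytes: float) -> float:
    """Return the system cost in dollars per terabyte."""
    return total_dollars / (petabytes * PB_IN_TB)


as_reported = cost_per_tb(4_700, 10)         # article's figure taken literally
in_thousands = cost_per_tb(4_700_000, 10)    # Table 3 read as thousands of dollars

print(f"$4,700 total:   ${as_reported:.2f} per TB")
print(f"$4.7M total:    ${in_thousands:.2f} per TB")
```

Under these assumptions the literal reading comes to well under a dollar per terabyte, while the thousands-of-dollars reading comes to a few hundred dollars per terabyte, a gap of a factor of 1,000.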
  • by Chairboy ( 88841 ) on Wednesday April 23, 2008 @12:05AM (#23168016) Homepage
    One thing remains constant in thousands of years of recovered cave paintings, manuscripts, papyrus drawings, and more. And that constant... is pornography. It lasts, it's popular, and it's always in demand.

    Clearly, the answer for long term data storage is to use steganographic techniques to encode your data into various types of creative skinpics. Pick famous folks, pretty folks, strange fetishes... the whole gamut. Pick things that people will keep. A hundred years later, all someone needs is the key phrases to search for.
    "We need that Higgs Boson experiment data from 2012, how will we get it? The infocalypse has destroyed all of our cataloged data!"
    "No problem, my great grandfather left a note in his journal telling his descendants to search for 'Britney spears enema' and use 'wet riffs' to decode the LHC data in whatever we use for files."
    "President Spears? That's crazy!"

    Voila!
  • "This hard drive will self-destruct in 1,400 years."
  • Any long term data I keep gets moved to new mediums as they become available. There is no single medium that will last for the times described. The good news is that digital data has a very low corruption rate and a copy can be reverified for a guaranteed duplicate every time it's needed. I've moved from floppy drives, 44MB WORM, to ZIP, to CD, to DVD and am now using a 12 drive 1TB RAID-5 with AIT backups.
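One way to do the "reverify" step described above is to compare cryptographic digests before and after each migration. The sketch below is a minimal example using Python's standard library; the function names and the choice of SHA-256 are assumptions made for illustration, not something the poster specified.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source: Path, destination: Path) -> str:
    """Copy a file to new media and confirm the copy is bit-identical.

    Returns the digest so it can be stored and rechecked on later migrations.
    """
    expected = sha256_of(source)
    shutil.copyfile(source, destination)
    actual = sha256_of(destination)
    if actual != expected:
        raise IOError(f"copy of {source} does not match the original")
    return expected
```

Keeping the returned digest next to the archive means the next migration, years later, can detect silent corruption on the old medium as well as errors introduced by the copy.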
  • Rotate your media (Score:3, Insightful)

    by profplump ( 309017 ) <zach-slashjunk@kotlarek.com> on Wednesday April 23, 2008 @12:19AM (#23168084)
    Wouldn't it be a lot easier to simply keep the archive on a live system, and rotate it to new media from time to time as the old media dies and new storage systems become available? After all, if no one is looking after this system, what's to keep it from being forgotten in the basement of a long-abandoned building?

    In addition to taking advantage of the falling cost of storage for a fixed-size data set -- making future replacement media purchases much cheaper than redundant media purchases today -- you also have the opportunity to re-process the data into new formats, so that you'll still be able to read it when you want it.
  • by jr76 ( 1272780 )
    They completely ignored the fact that the chips and memory managing the system will likely have some degree of failure in the 1400 years the data will survive on their media architecture.

    Look, I am into genealogy quite a bit and see this as a tremendous problem.

    The only thing approaching a viable solution is the Rosetta Disk ( http://www.rosettaproject.org/ [rosettaproject.org] ) using etched nickel media (rock) in a human readable format, for which you could theoretically create a binary cipher to serve as a global archival format.
    • Re: (Score:3, Funny)

      by mochan_s ( 536939 )

      Then, if a super-termite or some sort of paper eating worm ravaged the world and ate all the paper in the world, then we'd be in the same situation.

      • Re: (Score:3, Interesting)

        by utnapistim ( 931738 )

        You don't need a supertermite :)

        Just an idiot with a political agenda and authority on his hands:

        The Nazis used to burn books if I'm not mistaken.

        Also, if I remember correctly, there was some pasha or other in the Ottoman Empire who said that either the Koran is the only truth and then other books have no purpose, or the Koran is not the only truth, and then the fact that there are other truths must be hidden; thus, he burned the library.

        It only takes a bunch of idiots.

  • I find it subtly ironic that the last two links in the summary of an article about data loss are broken.
  • Lasers. (Score:3, Interesting)

    by menace3society ( 768451 ) on Wednesday April 23, 2008 @12:38AM (#23168168)
    Laser engraving, seriously. There's some project out there....
    ah yes, here, [rosettaproject.org] that seeks to preserve all the languages of the world by laser-engraving them onto stainless steel plates. They've changed things up a bit, but the basic idea is the same: put it somewhere it won't get lost or corrupted, and if it's important, people will figure it out later. If it's not important, then it doesn't matter.

    Very few things in the world are really worth keeping for even a lifetime. If your grandkids inherit all of your stuff, what will they save and keep, and what will they throw away? If you know what they will throw away, why not save them the trouble and toss it yourself?

    We've gotten ourselves into this mindset where every piece of data you've ever owned ought to be backed up and saved, for no other reason than because it's easy and cheap. I think everyone should have a periodic storage meltdown to force them to reconsider what it is they really need to have.
  • by Raul654 ( 453029 ) on Wednesday April 23, 2008 @12:48AM (#23168210) Homepage
    There are two sure-fire proven techniques for storing data long term - using a reliable non-volatile storage medium (engraving in a non-oxygen reactive metal will do nicely) and making many redundant copies of them.

    Electronic storage is by its very nature unreliable -- electromagnetic properties (like charge accumulation, ferromagnetic hysteresis, etc) are inherently volatile.

    And even if you manage to solve the problem of transporting your data into the future, you're still faced with the problem of making sense of it. Electronic formats change (just ask the guy out in California who makes a *FORTUNE* charging law people to retrieve files from obsolete formats and/or media). In the physical realm, this is true as well - languages change and become very difficult to read. (If you don't believe me, try reading Beowulf in its original Old English form, circa 700 AD).
  • ... hash tree-like structures ... staggered rebuild ... large redundancy stripes ... "scrub" ...
    Is this based on ZFS?
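For readers unfamiliar with the terms quoted above, here is a toy sketch of a hash tree with a "scrub" pass. It is not the design from the UCSC paper, nor ZFS's on-disk format; the function names, the SHA-256 choice, and the one-byte leaf/node prefixes are all assumptions made for illustration.

```python
import hashlib


def leaf_hash(block: bytes) -> bytes:
    """Hash one data block (the 0x00 prefix marks it as a leaf)."""
    return hashlib.sha256(b"\x00" + block).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Fold per-block hashes pairwise into a single root hash."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [leaf_hash(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


def scrub(blocks: list[bytes], stored_hashes: list[bytes]) -> list[int]:
    """Re-read every block and return the indexes whose hash has changed."""
    return [
        index
        for index, (block, stored) in enumerate(zip(blocks, stored_hashes))
        if leaf_hash(block) != stored
    ]
```

The root gives a cheap whole-archive check, while the scrub pass pinpoints which blocks need to be rebuilt from redundant copies.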
  • by pclminion ( 145572 ) on Wednesday April 23, 2008 @12:56AM (#23168252)

    And I mean it literally -- why have any physical storage at all? Why not just bounce chunks of data around forever on the Internet? Presumably the 'net is going to be here for a long, long time. Imagine a mass P2P network where the data being traded is just encrypted chunks of the data of other users. It needn't ever get written to a mass storage device at all -- just received from one peer and immediately sent to others.

    A protocol could be developed to allow one peer to request, or steer, the network to locate and deliver requested blocks on demand. This might be a high-cost operation, akin to bringing data in from backup tape. Or, a client could just wait for the right chunk of data to recirculate to its position in the network. But storing data is easy -- just encrypt it, format it a certain way, and inject it into the network.

    A natural model for the topology of such a network, and the protocol itself, is the circulatory system. Here, cells move in a fluid, generally in one direction, but through a complex network of vessels, and in a circulatory manner. The immune system might provide inspiration for directed movement of data chunks. (See? The Internet really is just a series of tubes.)

    Over time, the infrastructure of the Internet, the P2P clients, and the exchange protocol itself could evolve, as long as enough redundant chunks are allowed to constantly recirculate. Specialized clients could cache data to "long term" storage for periods of a few days or weeks, in case of large, random outages, but permanent data storage would never rely on any specific technology at all -- even TCP/IP itself. It's all just this mass of recirculating encrypted chunks of data, like cells in the blood stream.

    • Re: (Score:3, Interesting)

      by ChrisA90278 ( 905188 )
      And I mean it literally -- why have any physical storage at all? Why not just bounce chunks of data around forever on the Internet?

      Good idea, but someone first tried this in the 1950s. The idea was to send the data encoded on a microwave beam and aim the beam at the moon. The signal would bounce off the moon and come back to Earth a few seconds later. A receiver would detect the signal and feed it back to the transmitter. Many thousands of bits would be stored in the radio signal.

      This was an extension of
  • From TFA

    A 10 PB storage system could be built for about $4700 with an annual operational cost (power for running and cooling the system) of about $50.

    Wtf? That is 0.044 cents/GB. That's impossible! No one can do it that cheap. Sloppy reporting again I guess... Perhaps they meant 10 TB.
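
    A quick sanity check of that per-gigabyte figure, as a rough sketch. The binary unit convention (1 PB = 1024^2 GB) is an assumption, since TFA does not say which one it uses, and the function name is made up for illustration.

```python
# Sanity check of the cost-per-gigabyte figure quoted from the article.
# ASSUMPTION: binary units (1 TB = 1024 GB, 1 PB = 1024**2 GB).
GB_PER_TB = 1024
GB_PER_PB = 1024 ** 2


def cents_per_gb(price_dollars, capacity_gb):
    """Return the purchase price in US cents per gigabyte."""
    return price_dollars / capacity_gb * 100


# $4700 for 10 PB, as the article states.
as_quoted = cents_per_gb(4700, 10 * GB_PER_PB)
# $4700 for 10 TB, the suspected intended figure.
if_terabytes = cents_per_gb(4700, 10 * GB_PER_TB)

print(f"10 PB for $4700: {as_quoted:.4f} cents/GB")
print(f"10 TB for $4700: {if_terabytes:.1f} cents/GB")
```

    The two results differ by a factor of 1024, which is why a PB/TB mix-up seems like a plausible explanation.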

  • 3D crystal holography like in Star Trek would be cool.

    With no moving disc of course.

    "perfect holographic storage could store 4 gigabits per cubic millimetre"

    http://en.wikipedia.org/wiki/Holographic_memory [wikipedia.org]
  • Isn't this a plan to find a place for obsoleted technology?
    For all I know, regular HDDs are going out of season soon, replaced by chips and memory crystals.
    Why would anyone want to use HDDs in the future?
  • Idiotic (Score:5, Insightful)

    by Eivind ( 15695 ) <eivindorama@gmail.com> on Wednesday April 23, 2008 @02:10AM (#23168602) Homepage
    This is completely idiotic.

    First, it ignores physics. MTBF can't be used in reverse. Yes, it is possible that the MTBF on a newish disc is 300K hours or more. Put differently, if you've got 1000 such discs running, then every 300 hours (about every 2 weeks) one will die.

    This does, however:

    • NOT imply that an average disc will last for 300K hours of operation, i.e. about 34 years.
    • NOT imply that a disc that is idle 90% of the time will last for about 340 years.
    • NOT imply that a disc that is idle 95% of the time will last for roughly 680 years.
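
    For reference, the naive extrapolation being rejected here can be written out as a small sketch. The function names are made up for illustration, and the model (MTBF treated as a per-disc lifetime, with wear only while powered on) is exactly the assumption this comment argues against.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def hours_between_failures(mtbf_hours, disc_count):
    """Expected hours between failures across a fleet of identical discs."""
    return mtbf_hours / disc_count


def naive_lifetime_years(mtbf_hours, idle_fraction=0.0):
    """Naive (invalid) extrapolation: read MTBF as a single disc's lifetime
    and stretch it by the share of time the disc is powered on."""
    powered_on_fraction = 1.0 - idle_fraction
    return mtbf_hours / HOURS_PER_YEAR / powered_on_fraction


print(f"1000 discs: one failure every {hours_between_failures(300_000, 1000):.0f} hours")
for idle in (0.0, 0.90, 0.95):
    years = naive_lifetime_years(300_000, idle)
    print(f"idle {idle:.0%}: naive lifetime {years:.0f} years")
```

    The fleet-level number is what MTBF actually describes. The per-disc numbers only hold if idle aging is zero and the failure rate stays constant for centuries.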


    It would, of course, if degradation in idle state was -ZERO-, if aging made -ZERO- difference, and if the MTBF rates quoted are realistic AND constant over centuries (i.e. older discs DON'T start to fail more often, not even if they're centuries old).

    In short: bullshit. It's overwhelmingly likely that not a single disc out of 1000 will remain functional after a millennium, even if it is powered down 97% of the time. At which point no amount of redundancy, distributed or not, will help.

    Also, the exercise is pointless. As long as storage capacities keep growing exponentially, nearly the entire cost of storing a set of data is in the first few years. If you've paid what it costs to safeguard data for a decade, you've already paid 95% or thereabouts of what it costs to store it forever.

    So, storing something safely for a very long time is actually an easy task. All you need to do is:

    • Create multiple copies at geographically distinct sites.
    • Regularly transfer the copies to newer, larger media.


    Yeah, this -does- mean that data that nobody cares about will die. Tough luck.

    For example, if you -currently- have a petabyte you want stored, you could buy three 1-petabyte enterprise storage-servers, at a cost of perhaps $3 million. You host these at three separate companies, say one in Europe, one in Japan, one in the USA. For this you may pay $300,000/year. Total cost for first 5 years: $4.5 million.

    After 5 years you buy 3 new entry-level storage-servers. Storage/dollar has doubled every 18 months, or a factor of about 10 over 5 years. The servers now cost, let's say, $300K, and they're 4U units rather than complete racks now, so hosting cost is down to $50,000/year.
    Total cost for years 5-10: $550,000

    After 10 years you buy 3 new 1U "small office" servers. They cost $21K in total. Hosting is $10K/year. Total cost for years 10-15: $71K.

    After 15 years you sign up for the needed amount of space on 3 separate servers and pay $3K/year, or $15K for the period.

    After 20 years you put the data on 3 thumbdrives and store them however one can cheaply store a thumbdrive, total cost perhaps $1000
    Or you sign up with 3 separate el-cheapo hosting-providers and pay $300/year.

    After 25, you send the data as an attachment to your choice of 3 free email-providers. They all come with at least 500PB free storage anyway, so it's not as if you'll notice the extra 1PB attachment.

    More likely though, you've got much MORE data to take care of in the future, so you're still paying $1 million/year. Only now that buys you a storage-solution where the old 1PB-archive is a completely trivial file, taking up such a minute fraction of the array that it's not even noticeable, and the incremental cost is essentially zero.
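
    The schedule above can be tallied with a short sketch. The dollar figures are the ones quoted in this comment; the tuple layout and function names are only illustrative, and the thumbdrive option is used for years 20-25.

```python
# Figures as quoted in the comment; the layout is an illustrative choice:
# (label, hardware dollars, hosting dollars per year, years in the period).
PERIODS = [
    ("years 0-5", 3_000_000, 300_000, 5),
    ("years 5-10", 300_000, 50_000, 5),
    ("years 10-15", 21_000, 10_000, 5),
    ("years 15-20", 0, 3_000, 5),
    ("years 20-25", 1_000, 0, 5),  # three thumbdrives, no hosting
]


def growth_factor(months, doubling_months=18):
    """Growth in storage per dollar if it doubles every doubling_months."""
    return 2 ** (months / doubling_months)


def period_costs(periods):
    """Total cost of each period: hardware plus hosting over its years."""
    return [(label, hw + hosting * years) for label, hw, hosting, years in periods]


costs = period_costs(PERIODS)
total = sum(cost for _, cost in costs)

running = 0
for label, cost in costs:
    running += cost
    print(f"{label}: ${cost:>9,} (cumulative {running / total:.1%} of 25-year total)")
print(f"Storage per dollar after 5 years: x{growth_factor(60):.1f}")
```

    With these numbers the first decade accounts for well over 95% of the 25-year total, which is the point of the argument.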
    • They're not talking about MTBF (mean time between failures) at all. They're talking about MTTDL (mean time to data loss).

      I suspect they expect you to replace the hard drives as they fail.

      The largest benefit they seem to offer is the really low maintenance cost: since the drives are powered off 95% of the time, the electricity cost is minimal, so almost all the cost is just replacing failed drives.
      • by Eivind ( 15695 )
        Then it makes even -less- sense. The mean-time-to-data-loss is infinity, or close enough to infinity to make no difference if you've got a suitable count of independent copies.

        If, for example, a perfectly ordinary disk fails once every 1500 days, and it takes a day to replace it and get the data onto the new disc, then 2 such discs stored geographically spread will both fail inside of the same day (=data-loss) once every 1500^2 days, or once every 6000 years. Okay, so you can get unlucky, use 3 and your exp
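
        A minimal sketch of that estimate, assuming independent failures and a one-day repair window as in the example above. The function name is hypothetical, and the loop simply applies the same simplified model to other copy counts.

```python
DAYS_PER_YEAR = 365


def mean_days_to_data_loss(days_between_failures, copies):
    """Simplified model: each copy fails on a given day with probability
    1 / days_between_failures, independently of the others, and data is
    lost only when every copy fails on the same day."""
    return days_between_failures ** copies


for copies in (1, 2, 3):
    days = mean_days_to_data_loss(1500, copies)
    print(f"{copies} copies: about {days / DAYS_PER_YEAR:,.0f} years to data loss")
```

        Real failures are not fully independent (same batch, same firmware, same operator), so this is an upper bound on optimism rather than a prediction.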
  • by zdzichu ( 100333 ) on Wednesday April 23, 2008 @02:15AM (#23168616) Homepage Journal
    So, they are proposing Sun StorageTek 5800 [sun.com] (codenamed Honeycomb) as their research?

    Compare the article with this whitepaper [sun.com], especially Figure 13 on page 28. Networked nodes with 4 disks each, grouped in cells of 16 + 1 management node. Each object is stored redundantly on disks of different storage nodes. Everything self-contained, accessible by a nice API. Oh, and the software is Open Source.
  • missing the point (Score:4, Insightful)

    by nguy ( 1207026 ) on Wednesday April 23, 2008 @02:20AM (#23168644)
    It's easy to build distributed, reliable storage that theoretically lasts thousands of years if you assume that you can just keep going down to the corner computer store and buy replacement parts that more or less work like today's parts, that operating systems keep doing what they have always been doing, and that networks keep working the way they always have. But those are bad assumptions.
  • by HetMes ( 1074585 ) on Wednesday April 23, 2008 @03:47AM (#23168944)
    What kind of data that will be lost otherwise do we have to back-up for posterity? I mean, come on, no one is going through your perl-scripts, c++ classes, 10000 digital holiday pictures, diaries of what you had for breakfast, or IRC logfiles. You are not that important! Although it would be fun to speculate what kind of information would have been in the caveman-wiki.
    • by argent ( 18001 ) <peter@NOsPAm.slashdot.2006.taronga.com> on Wednesday April 23, 2008 @06:59AM (#23169838) Homepage Journal
      no one is going through your perl-scripts, c++ classes, 10000 digital holiday pictures, diaries of what you had for breakfast, or IRC logfiles

      I'm sure that the people in the 11th century would have said the same thing about their accounts and letters, and yet historians and archeologists depend on them to tell us what life was like 1000 years ago.
    • I'd imagine the data would contain recipes on how to make steel, concrete, semiconductors, CPUs, power plants, basic definition of how we think the universe works (quantum mechanics, relativity, etc.).

      And if you think humanity will never "unlearn" how to make such things, you're forgetting the Roman Empire (concrete was reinvented some 1500 years after the Romans used it---we're talking about basic construction material here that people just managed to `lose' to history!).

      Wouldn't it be nice to find an

  • Not a single Stargate reference (I, for one, welcome our future ignorant overlords).
  • I'm using Indiana preview 2 (final version coming out really soon). It boots from ZFS and I added two USB drives that get mirrored by ZFS. That's all I needed. Cheap RAID 1 that can be read by any computer that runs ZFS (Mac OS X, BSD, OpenSolaris and FUSE Linux distros).
  • We have been in this post-apocalyptic dark age for 1000 years now. Everything we need to bootstrap civilization mk II is stored on this "Aard Reeve".

    However, the only "Comm Pewter" capable of reading this information was stolen by the evil Mordacs and hidden deep in their underground lair.

    Return the Comm Pewter and we can once again wield the mighty "Loy Yer" to enslave the lowly "City Zens" and bend them to our will. We will restore "Celeb Riti" to her "Shaw Ping Mall" throne.

    All the greatest powers and
  • Major flaw in plan (Score:3, Interesting)

    by Adeptus_Luminati ( 634274 ) on Wednesday April 23, 2008 @08:28AM (#23170606)
    The basis of this plan is that if you spin the hard drives less of the time, the components will last longer. Theoretically this sounds great, but in practice it is not true. Obviously these guys have never worked in a real data centre for a few years in a row. Where I work, we actually place bets with a bunch of co-workers as to how many hard drives we'll lose every time we have to shut down and bring back up the data centre. We only end up doing this once, maybe twice a year. And note that these are planned, graceful shutdowns. Out of about 1000 hard drives we have, we lose about 3 on average. The last time the data centre was shut down and brought back up, we lost 7 drives! Hard drives are designed to run for long periods of time. They were not designed to stop, start, stop, start. Try doing that with your car and see how long it lasts! I would bet money that the hard drives wouldn't last past 3 years... 5 if you're lucky with this plan. 1400 years is completely ridiculous. And that, my friends, is the difference between theory and practice. So as they say....

    "In theory, practice is perfect; but in practice, it is often only theory".

    Adeptus
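
    A rough sketch of how that per-restart loss would compound. The 3-in-1000 rate is the figure reported above; treating it as a constant, independent loss per power cycle is an assumption (those drives had been running for months between restarts), and the cycle counts are purely hypothetical.

```python
def surviving_fraction(loss_per_cycle, cycles):
    """Fraction of drives still working after a number of power cycles,
    ASSUMING a constant, independent chance of loss on every cycle."""
    return (1.0 - loss_per_cycle) ** cycles


LOSS_PER_CYCLE = 3 / 1000  # about 3 drives lost per 1000 on each restart

# Hypothetical cycle counts; the article's actual spin-up schedule is not
# given here.
for cycles in (1, 100, 1000):
    share = surviving_fraction(LOSS_PER_CYCLE, cycles)
    print(f"{cycles:>4} power cycles: {share:.1%} of drives still working")
```

    Under this model the outcome depends almost entirely on how often the drives are spun up, which is the practical question raised above.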
