Data Storage The Internet

Cringely's P2P Backup Idea 205

gewg_ writes "If Napster and Bit Torrent had a baby, would it Baxter? As a follow-on to Cringely's last column where he talked about having a backup strategy in the wake of Hurricane Frances, this week he proposes a distributed RAID notion as a solution."
This discussion has been archived. No new comments can be posted.

  • Cue Linus quote in 3....2....1....

    Ah, there it is :-)

    -Chris
  • by Artifex ( 18308 ) on Sunday September 12, 2004 @01:01PM (#10228469) Journal
    Baxter [abisoft.com] is, of course, the famous IRC client for BeOS. (Hi, Seth!)
    • From the article:

      It sounds from every description like the solution is Linux-specific, but I'm sure it can be made to work with other UNIX variants, especially since Gmail, itself, runs on Apple xServe 1u boxes. Windows compatibility is unknown, but I'm sure someone will solve that soon.

      I know, it's a little childish, but I get a good feeling when I see something small...even this little thing here...that thinks of other OS's first and Windows compatibility will be "real soon now" or something like tha
    • would it Baxter?

      I don't know, do babies Baxter these days? I mean they puke and shit and cry but when you talk about Baxtering I'm not too sure.

      Oh you mean would the baby be Baxter?

      Sorry, my fault.
  • by Faust7 ( 314817 ) on Sunday September 12, 2004 @01:02PM (#10228477) Homepage
    Depending on exactly what you have stored, millions of people may want to help you backup as soon as possible.
    • by Mod Me God Too ( 687245 ) on Sunday September 12, 2004 @01:09PM (#10228514)
      I once encoded some data in a few MP3s... this was back in 2000. The MP3s were long speech files... about 30mb/file @ 160kbps and were popular, but took so long to transfer, so to propagate the 'new' files as quickly as possible I reduced the bit rate from 160kbps to 32kbps and added in the 'extra' 'noise' as I did this - as it's speech it didn't really matter.

      If I do a search now they're easy to find, much easier than the original 160kbps were.

      This was just a test, no special data used - but an amazing way to archive and distribute data.

  • p2p backup (Score:5, Funny)

    by khrtt ( 701691 ) on Sunday September 12, 2004 @01:03PM (#10228480)
    I think this is old news. Some people have been backing up the source code for viruses that they wrote on Kazaa for months now.
    • Funny, because only the binaries for those viruses ever seem to spread out very well.
    • Re:p2p backup (Score:4, Interesting)

      by aqua ( 3874 ) on Sunday September 12, 2004 @04:31PM (#10229585)
      I made a related waggish proposal a couple of years ago:

      1. Make tarball of backup
      2. Encrypt if desired
      3. Encode tarball, 4-8 bytes at a time, in email addresses
      4. Put email addresses on web
      5. Wait for spam

      Presto -- spammers now pay for your backup; anytime you have a disk failure, just wait a while and watch your spamcan or smtp log, and reconstruct your backup at will.

      (Some assembly required, offer void where prohibited)
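The five steps above can be sketched in Python. This is only an illustration of the encoding step: the address layout (a sequence number plus a hex payload in the local part) and the domain `backup.example.com` are assumptions, not anything the poster specified.

```python
CHUNK = 6                      # bytes per address; the post suggests 4-8
DOMAIN = "backup.example.com"  # hypothetical domain, for illustration only


def encode(data: bytes) -> list[str]:
    """Split data into CHUNK-byte pieces and hide each in an email address."""
    addrs = []
    for seq, off in enumerate(range(0, len(data), CHUNK)):
        chunk = data[off:off + CHUNK]
        # The sequence number lets us reorder addresses that arrive shuffled.
        addrs.append(f"b{seq:08x}-{chunk.hex()}@{DOMAIN}")
    return addrs


def decode(addrs) -> bytes:
    """Rebuild the original bytes from addresses seen in any order."""
    parts = {}
    for addr in addrs:
        local = addr.split("@", 1)[0]
        seq_hex, payload = local[1:].split("-", 1)
        parts[int(seq_hex, 16)] = bytes.fromhex(payload)
    return b"".join(parts[i] for i in sorted(parts))
```

Restoring is then a matter of feeding whatever recipient addresses show up in the SMTP log to `decode`; any chunk that never gets spammed is simply missing, so this sketch has no redundancy at all.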
  • No thanks (Score:5, Insightful)

    by Lord_Dweomer ( 648696 ) on Sunday September 12, 2004 @01:03PM (#10228483) Homepage
    Maybe this would be good for some data, but I would never backup sensitive data on something like this. Nor would a lot of businesses.

    • Re:No thanks (Score:4, Insightful)

      by OrangeHairMan ( 560161 ) on Sunday September 12, 2004 @01:08PM (#10228508)
      I would never backup sensitive data on something like this

      Encryption? Simply using GnuPG or any of the free AES encryptors out there will make it incredibly secure. If your data is sensitive enough, you should be doing this already...

      -orange
      • Re:No thanks (Score:4, Insightful)

        by proj_2501 ( 78149 ) <mkb@ele.uri.edu> on Sunday September 12, 2004 @01:26PM (#10228611) Journal
        the issue is not necessarily secrecy, but knowing you can get that data back exactly when you want to.
        • Oh sure. You'd absolutely have to be able to verify the backups. Perhaps by presenting a challenge to your peers which they could only respond to correctly by scanning the entire dataset to compute the result.


          You'd also need some sort of currency system so by sharing out 1GB you'd get a GB in return.
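One way the challenge idea above might look, as a rough Python sketch. The assumption here is that the owner precomputes a few nonce/answer pairs while the data is still at hand, so a peer can only answer by hashing the whole stored blob; the function names are made up for illustration.

```python
import hashlib
import hmac
import secrets


def respond(nonce: bytes, stored: bytes) -> bytes:
    """What a peer computes: a keyed hash over the entire stored dataset."""
    return hmac.new(nonce, stored, hashlib.sha256).digest()


def make_challenges(data: bytes, count: int = 4):
    """Owner side: precompute (nonce, expected answer) pairs before upload."""
    challenges = []
    for _ in range(count):
        nonce = secrets.token_bytes(16)
        challenges.append((nonce, respond(nonce, data)))
    return challenges


def verify(challenge, answer: bytes) -> bool:
    """Owner side: check a peer's answer without needing the data itself."""
    _nonce, expected = challenge
    return hmac.compare_digest(expected, answer)
```

Each pair should be used only once, since a peer that has seen a nonce could cache the answer and throw the data away.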

      • Re:No thanks (Score:3, Interesting)

        by DarkHelmet ( 120004 ) *
        Encryption? Simply using GnuPG or any of the free AES encryptors out there will make it incredibly secure. If your data is sensitive enough, you should be doing this already...

        Or for that matter, why not build encryption into the system itself, so that you don't have to manually do it.

      • Re:No thanks (Score:2, Insightful)

        by slasher guy ( 624616 )
        And then lose your key with the rest of the data!
      • RTFA. As Cringely said, data should be encrypted before it goes into the system, which supposes that the system encrypts the data and which also raises an interesting problem: where are the keys stored?

        If the keys are stored by the client, there *will* be problems with lost keys, so the keys must also be stored on the backup server.
    • Re:No thanks (Score:3, Interesting)

      by pHDNgell ( 410691 )
      Maybe this would be good for some data, but I would never backup sensitive data on something like this. Nor would a lot of businesses.

      I've been backing up sensitive data almost exactly like this for quite a while now. I've got an application that breaks a stream of data into chunks and encrypts them. It compares the md5 of the source block against the md5 of the same block from the previous backup. If they match, it hard links the block into the backup directory, if they don't match, it encrypts the b
  • i had this idea a few months ago, wish i could have done something about it then, oh well.
    • Re:damn.. (Score:5, Interesting)

      by dotwaffle ( 610149 ) <slashdot@wPARISalster.org minus city> on Sunday September 12, 2004 @01:12PM (#10228533) Homepage
      I had this idea in about '97 or '98. I looked around to see if anyone else had done anything like this (remember, this is kinda pre-mass-P2P) and found that someone had done so, but on a business scale solution. I think it was called Mango, and is still in production today. It essentially made a portion of your drive available for a drive letter, then whatever was copied onto it could be seen by all. The data was stored in at least 2 places, so if one went down, there was still one copy, and the remaining copy would duplicate, so that there were always at least 2 copies. In the end, I think nobody went for it because it was too expensive... But this is EXACTLY what a lot of Small-Medium businesses need atm. Bring on the Mangos!
      • When I was working at a factory last year, I was part of an IT team supporting 1000+ PCs. An idea I thought of, but haven't had much time or chance to flesh out, was a "peer-redundant file system," whereas all those computers could have background hosts serving up a specified amount of space for use by anyone on the same network. The space would be treated like a block of sectors on a network-based drive, allocated by a master server, and made redundant through a desired number of hosts (anytime data gets p
      • You cannot have-a the Mango! [progressiveboink.com]
  • by duplicatedAccount ( 523194 ) on Sunday September 12, 2004 @01:05PM (#10228492)

    Well, we [askemos.org] leave the data where it belongs: in the proxy network where the processes live too. Still a bit incomplete, but maturing WebDAV and mountable slices forthcoming...

  • Freenet (Score:5, Interesting)

    by John_Allen_Mohammed ( 811050 ) on Sunday September 12, 2004 @01:06PM (#10228495)
    Just insert a bunch of data into the network.. record the keys and retrieve once a week then delete. That should keep the data retrievable from the network for a good while. Using two nodes would help. Plus everything is encrypted with some heavy shit.

    Or, just make a local-freenet on the company lan.. everything is encrypted and unretrievable without the proper keys, so it's very secure and it's distributed.. + FEC encoding.

    That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Mathew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project :(
    • Re:Freenet (Score:2, Interesting)

      by Joe Tie. ( 567096 )
      Ah, so that explains it. I finally got enough ram to keep freenet going 24/7, and was surprised to find it so unreliable. I wasn't expecting a speed demon, but I was expecting that links to files on freesites would work if the site itself was. That, so far, has seldom been the case. Are there any other similar projects going on?
    • Re:Freenet (Score:2, Interesting)

      by MrJay ( 172412 )

      That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Mathew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project :(

      Check out other development lists on popular projects (if they're public). You'll find that heated debates, arguments, an

  • Interesting idea (Score:5, Insightful)

    by scoser ( 780371 ) on Sunday September 12, 2004 @01:06PM (#10228501) Journal
    Now the world's porn will be safe forever!

    But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well. I can see people signing up just for the chance to hunt through people's data.


    • [...] the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken.

      More likely, let's say the hard drive gets broken. Would you really trust your data to such a thing?
    • by legirons ( 809082 )
      "But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well."

      Worse, if this Aunt Nedda lives in the UK, she could go to jail for 2 years for not being able to decrypt the files on her hard-drive at the request of the police.
      • Re:Interesting idea (Score:3, Interesting)

        by bloo9298 ( 258454 )

        On the contrary, I'd say Auntie has a really strong case that she never had the key to someone else's encrypted data stored on her drive, so the RIP act would not apply to her.

    • but, unlikely to work in practise:

      how much redundancy should there be? Two full copies sound like far too few. If 'R' is the number of redundant copies, then understand that every participant has to be sharing R*D bytes, where D is the average backup size. Plus, of course, their own personal data, so everyone's hard drive has to be at least three times the size of the average data set. For realistic backup strategies, ensuring that a full copy was online at any point in time, R would probably have to be m
  • by william_lorenz ( 703263 ) on Sunday September 12, 2004 @01:08PM (#10228509) Homepage
    Cringely's not the first with this kind of idea. In fact, the Freenet Project [sourceforge.net] already implements something to this effect. Although not specifically designed for reliable backups, the distributed caching algorithms essentially replicate data towards where it's most often needed, helping to improve network performance and creating copies of important data along the way so that it won't be destroyed if a central server fails. Obviously not a commercial solution, but very interesting.
    • And oddly, Simson Garfinkel, another well-known technopundit, submitted a very similar idea (P2P backup service) as a business plan to the MIT 50k competition back in 2002. See here [216.239.39.104] for the entry summary (search in the page for Garfinkel). Anyway, I somehow dredged that up from the back of my brain when I saw this Cringely piece because I recalled that Garfinkel was interested in actually doing something like this several years back.
  • I lived the storm (Score:2, Interesting)

    by jsm008us ( 774007 )
    Well, I lived through this storm, checking my PC upstairs to make sure nothing was going to damage it. If the storm was risking the roof flying off and my room becoming flooded, I would have taken out my hdd. This sounds like a brilliant idea.

    Hey, it beats trying to store data to gmail accounts! ;)
  • Save Betamax (Score:5, Interesting)

    by chatooya ( 718043 ) on Sunday September 12, 2004 @01:11PM (#10228525)
    Ideas like Cringely's will be impossible if the INDUCE Act passes.

    Save Betamax [savebetamax.org] is a national Congress call-in day this tuesday to oppose the INDUCE Act. It might be our last chance to stop this bill.
    • If a p2p backup system did catch on, it would provide another bit of evidence that p2p is a legitimate tool that should be available. The more "good" uses, the more it legitimizes p2p.
  • Much faster (Score:4, Informative)

    by interiot ( 50685 ) on Sunday September 12, 2004 @01:14PM (#10228541) Homepage
    Alternatively, you can spend $100-200 on an iPod-sized laptop drive enclosure [newegg.com] and drive, and have a MUCH faster incremental backup [mikerubel.org] system that's easy to store away from the original data (eg. store your home backup drive at work).

    As a bonus, you can use it to transport data (eg. your mp3 collection) between places, or even use it to boot linux anywhere [mandrakesoft.com] with much more space and document storage capability than Knoppix.

    • For what it's worth, these are VERY practical, I'm surprised I don't see more people using them. Compared to the mini-iPod, 2.5" drive enclosures are about 40% longer and wider but just as thin, they can be USB-bus powered since laptop drives consume less power, when combined with USB-2 they're no slower than laptop drives normally are, and since the enclosure is cheap, it's easy to upgrade the setup to a larger hard drive every couple years. It really seems like a no-brainer to me.
      • Re:Much faster (Score:3, Insightful)

        by interiot ( 50685 )
        And compared to a mini-iPod, you get something that 1) you don't have to worry about power since it has no batteries and doesn't require external power, 2) you get the same amount of disk capacity for something like 1/3rd the cost, 3) THEY SUPPORT USB MASS-STORAGE drivers so any modern OS can talk to it without extra drivers or funky software. Yes, it's not a portable music player, but this solution may be more appropriate for geeks who spend all their time next to a computer in one form or another, or are
    • And to save people research time... here's a link [ibm.com] that explains booting linux from USB-2 or Firewire drive with the help of a boot floppy/CD, in the cases that the BIOS doesn't support it natively. (just like the Mandrake GlobeTrotter provides)
    • Another variation on the rsync script is rsnapshot [rsnapshot.org], which works quite nicely.
    • Yes, it's easy to store the drive away from the original data, but at that point, you can't back up anymore. A system that involves moving a drive every morning / afternoon is still a barrier to the average user, whereas the P2P solution requires no extra effort by the user. Plus, it's distributed, so having all your computer equipment stolen or burning your apartment down will still leave you with good data.
      • Re:Much faster (Score:3, Insightful)

        by interiot ( 50685 )
        The P2P solution either requires users to have their cable modem pegged and nearly unusable for 60 days [google.com] (80gb is the current best laptop drive size, most cable modems max out at 128kbps up), or that they backup only a fraction of their hard drive. I can't quite figure out how carrying a laptop drive around, full of your MP3's, which you can play on any computer you sit at, is any less convenient than either of those options.
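The 60-day figure above can be checked with a few lines of Python. This is back-of-the-envelope arithmetic only, assuming decimal units (1 GB = 10^9 bytes, 1 kbps = 1000 bits/s) and an uplink that stays fully saturated with no protocol overhead.

```python
def upload_days(size_gb: float, uplink_kbps: float) -> float:
    """Days needed to push size_gb through an uplink of uplink_kbps."""
    bits = size_gb * 1e9 * 8                 # decimal gigabytes to bits
    seconds = bits / (uplink_kbps * 1e3)     # kilobits/s to bits/s
    return seconds / 86_400                  # seconds per day


print(round(upload_days(80, 128), 1))  # 57.9
```

So an 80 GB drive over a 128 kbps uplink comes to a little under 58 days under these assumptions, which matches the rough "60 days" in the post.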
  • How about a RAID-0 system using the entire Internet?

    Oohh yeah, baby!
  • Nice idea, but (Score:5, Interesting)

    by moonbender ( 547943 ) <moonbender@@@gmail...com> on Sunday September 12, 2004 @01:17PM (#10228556)
    It's a neat idea. In a nutshell, he suggests a Peer to Peer encrypted storage network. You get exactly as much storage room as you are willing to offer yourself for others to use. When you store anything, it's encrypted and automatically spread to other systems.

    It doesn't make for a very safe backup, though: What happens if somebody decides to stop the service and just deletes his local storage? You've got no more backup at least for a while, and you might not even know it. And of course, other people have head crashes, too, which would also obliterate your backup at least for the time it takes to recreate it from your own data. Of course, by that time, you might have deleted it yourself, either by accident or knowingly, since you have a backup after all. A viable solution would be to store every file multiple times on different remote servers, although that'd lower the storage capacity you get. It's still the right step, though.

    The crucial problem is that the service provider can't really give any guarantees that you will be able to regain your lost data. With three or more independent copies in different locations, it's very unlikely that the backup won't work for some reason, but a backup that's not 100% is not a very useful one, especially in those situations where backups are really crucial.

    It's still a neat idea, and to my knowledge has not been done to that degree of sophistication. Of course, as others suggest, nobody is stopping you from inserting encrypted data into Freenet, but that's nowhere near as fast and secure as this could be. And while it's not a true backup, it's better than no backup at all, and most likely enough security for many persons.
    • Part of your concern over others having hard disk crashes, and other issues is taken up by compression and proper distribution of the data.

      First you want to backup your critical data. Usually this is not already compressed information like jpeg images, or mpeg video, or even mp3 audio. It's your working documents, source code, and a subset of your e-mail.

      If you compress this data, you are likely to experience a 50% compression ratio or higher. This means that once you chunk that data for distribution, you
    • Here's a few ideas in response: If the data you are storing for someone goes down, you could send a notification as to whether it's recoverable or not. To complement this, users will query periodically for random CRCs of the data they have backed up. So as long as you have > 1 backup partners, you have a reasonable degree of backup security.
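A small Python sketch of the periodic random-CRC query described above. It assumes the owner still has a local copy to compare against, and `ask_partner` is a hypothetical callback standing in for whatever network request a real client would make.

```python
import random
import zlib


def crc_of_range(blob: bytes, offset: int, length: int) -> int:
    """CRC32 of one slice of the backed-up data."""
    return zlib.crc32(blob[offset:offset + length])


def spot_check(local: bytes, ask_partner, rounds: int = 8,
               span: int = 4096, rng=random) -> bool:
    """Ask the partner for CRCs of random ranges and compare with our copy."""
    for _ in range(rounds):
        length = min(span, len(local))
        offset = rng.randrange(0, len(local) - length + 1)
        if ask_partner(offset, length) != crc_of_range(local, offset, length):
            return False  # partner's copy is missing or damaged
    return True
```

CRC32 catches accidental corruption but is trivial to forge, so a partner who wanted to cheat would need a stronger check than this.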
  • by ei4anb ( 625481 )
    http://www.csua.berkeley.edu/~emin/source_code/dibs/ which is open source and also http://www.hivecache.com/ which will be commercial 'real soon now'
    • To flesh out the "real soon now", the first public betas of the consumer version of the HiveCache backup system will be available for windows at the end of October, Linux and OS X clients will probably be ready for beta testing before the end of the year.
  • I still say Gmail... (Score:5, Interesting)

    by plasticmillion ( 649623 ) <matthew@allpeers.com> on Sunday September 12, 2004 @01:18PM (#10228563) Homepage
    Not to beat a dead horse [slashdot.org], but Cringely seems like he was in a bit of a hurry to reject the Gmail solution. Wouldn't simple encryption solve the privacy problem? The Gmail text analysis is based on the assumption that the data is some kind of natural language text, so it would be baffled by anything else. Huffman encoding (or some other compression) would do the trick and save space besides.

    • that's the same thing I was thinking...

      gpg-encrypting a few zipfiles/tarballs and emailing them to your gmail account should be pretty simple.

      Safe backup of the gpg keys might be a bit trickier...

      Something like the AES-based tools also mentioned is more vulnerable to a dictionary attack than gpg with a very large key...
  • by CrazyJim1 ( 809850 ) on Sunday September 12, 2004 @01:20PM (#10228577) Journal
    If your character data was stored on everyone else's computer, it would act like a virtual server, where if a few data sets get hacked, they'd be corrected by the whole.

    P2P can work in wild ways we haven't even tapped.

    too bad orrin hatch is trying to outlaw p2p:
    www.geocities.com/James_Sager_PA
  • I've been working on something like this for awhile. It's not entirely the same as what Cringely proposed, but it's a step in that direction and I'd like to continue working on it and making it more evolved if there is interest in it. Currently, it's meant for intranet use, so the machines in your office can easily backup to each other. That's obviously not hurricane-proof, but it does help protect against a single point of failure. Also, I am about to add the option to let you make backups to an off-si
  • by hng_rval ( 631871 ) on Sunday September 12, 2004 @01:26PM (#10228613)
    Foldershare [foldershare.com]
    We use foldershare for peer-to-peer backup, but the catch is that you invite people that you trust to your libraries.

    For backup purposes, I only invite myself and just connect another computer to the account.
  • Sounds unfeasable. (Score:3, Insightful)

    by dj245 ( 732906 ) on Sunday September 12, 2004 @01:28PM (#10228626) Homepage
    How many times would you have to duplicate the data to ensure that no corruption (both intentional and unintentional) occurred? You would have to compare copies of the data to each other to make sure it matched. I wouldn't want my backup corrupt because some joker wrote Goatse.cx pictures to it a few thousand times. You would also have to store additional data in the event that people ran the program and then quit, taking your backup along with them. So maybe you would have 1gb backed up over the network, and 10gb of other people's crap on your computer. And that's assuming it ran on some sort of credit system where you only got to backup a percentage of what you allowed people to store. Otherwise hoarders would run rampant and take over the system.
  • What BS. (Score:4, Informative)

    by Critical_ ( 25211 ) on Sunday September 12, 2004 @01:31PM (#10228640) Homepage
    I just went through Hurricane Ivan in Grenada. If you have been watching the coverage you should know that our island was completely destroyed. There is no water, no electricity, and no security. The university I attend (St. George's) lied to the students' parents about our situation. There were looters with guns and machetes threatening students. The first two nights we fended for ourselves with a large bonfire and homemade weapons, knives, pipes, etc. The third night we had 10 minutes to pack up and leave since we could see the looters lighting fires to apartment buildings on the road we were on. I quickly took the hard drives out of my two laptops (and the external drive I have), picked up a GSM roaming phone, any cash I had, a passport and two pairs of clothes. We ran to campus. Campus had about 200 male students lighting bonfires and running security teams to monitor the area. We chartered our own jet out of Grenada yesterday to Barbados which is where I am writing this from. My point is this: no one cares about data in this situation. No one wants to know about RAID or tape backups. If it came down to it, I would have run with only a passport, a phone, and cash. We were worried for our lives and whether we had water or not, data was not our concern. People need a reality check. How many of you can claim that you went through a Category III or IV hurricane on an isolated island fending for their lives? Not many, so quite frankly Cringely can go to hell.
    • Re:What BS. (Score:4, Insightful)

      by BlackHawk-666 ( 560896 ) on Sunday September 12, 2004 @01:59PM (#10228784)
      Maybe you don't care about that data today, since the terror experience is still fresh, but you might care about it later. For example, assume that data was full of photographs of friends, deceased relatives, and other impossible to replace stuff. This backup scheme would've suited you even better than grabbing the hard drive, because you wouldn't even have had to do that.
    • I'm sorry chuck... (Score:3, Informative)

      by Anonymous Coward
      I would have moderated you into oblivion given the chance.

      I genuinely feel for you and your struggle for safety given the recent events, and you have my deepest sincere sympathy...

      But that is not what this article is about. And how about this, given the chance to either leave my data behind or fend for myself given those circumstances...I'd stay with my data.

      Perhaps your data isn't a life or death matter to you, but my stacks of CD's, DVD's and harddrives with the past 15 years of my writing, graphics,
    • Re:What BS. (Score:4, Insightful)

      by Loualbano2 ( 98133 ) on Sunday September 12, 2004 @02:17PM (#10228883)
      I am not trying to minimize your experience with Ivan, so please don't take this comment as such. The story you posted sounds crazy as hell and I wouldn't wish such an episode on anyone except my worst enemies.

      I do believe you reacted a little emotionally, which is understandable given your current situation. I think that if you look at the article again, you will find the only reason he mentions hurricanes is because Frances news reports before the fact got him thinking about it.

      That being said, I don't think Cringely was trying to insinuate that someone in a situation such as yours should or could worry about data. The point I took away from the article is that a person wouldn't need to worry about data at all under any disaster circumstance if you implement a system such as the one he proposes.

      I think that if you look at it like that, you will agree that he is not trying to discount the gravity of your experience.

      -ft
    • There were looters with guns and machetes threatening students

      Is it just me, or did this poster sound like some 1930's colonialist complaining about how 'the natives' got out of control?

      You want to go to play-school and take advantage of incredibly low living costs due to enormous disparity between what you hold in your wallet and what the average local makes- you'd better not complain when law breaks down and you suddenly find yourself more wanted than a sugar cookie next to an ant mound.

      Funny thing-

    • Re:What BS. (Score:3, Insightful)

      by kwerle ( 39371 )
      OK, your life sucks right now, and I'm sorry.

      You're a student. You've accumulated less than a decade? worth of useful data. Depending on what that data is (scientific data hard to reproduce, personal writing (books/plays), scientific data easy to reproduce, or highscores on minesweeper), that data may have a $ value from 0 to maybe 10s of thousands of dollars (which means time). Small companies that have been in business for just a few years can have data that is worth millions of dollars. Ask your ne
  • Poorly Thought Out (Score:4, Informative)

    by Naeleros ( 550233 ) on Sunday September 12, 2004 @01:37PM (#10228666)
    This idea is poorly thought out. It has a couple of *major* flaws, imo.

    #1) It doesn't recognize the reality of the complexity of backup software. Kinda easy to gloss over 'automated' backups without ever describing it. Pretty hard to imagine some piece of software that can universally back stuff up on everyone's hard drive and at the same time be very easy to use. Imagining mom/dad trying to use software with similar capabilities to Veritas BackupExec isn't easy. And.. imagine the wide variety of live files and databases that it would have to handle.

    #2) Data integrity. He suggests a 1:1 ratio for backup space. Not hardly. How is he going to have any kind of redundancy with that? Crashes and people unsubscribing will happen all the time. The data would have to have a *lot* of tolerance to that.

    A parity solution wouldn't be nearly enough. That assumes that only 1 failure at a time happens (using RAID 5 as my basis here). It would be easy to imagine that one person unsubscribed with part of your data and another had a crash or corruption problem.

    So.. complete mirroring would be necessary. Again, it's easy to imagine 2 people's system going offline at the same time.. so, you'd probably need more than 2x Mirror. At this point... how much is enough to ensure reliability? 3x 4x 5x ? ? ? How much do you trust your average netizen?

    So.. pick your number and then divide your backup space by it. Like 5x? Add 10GB and you have 2GB usable storage. Not very good.

    I'll just skip over the 'auto backup' of people's 40GB storage over a 128K up line for now.. already typed too much...
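The "pick your number and divide" trade-off above, sketched in Python. The first function is just the division from the post; the second is an added assumption: a toy model in which each peer holding a mirror is online independently with probability `p_online`, a made-up parameter that no one in the thread has measured.

```python
def usable_gb(offered_gb: float, copies: int) -> float:
    """Storage you get back if every byte is mirrored `copies` times."""
    return offered_gb / copies


def availability(p_online: float, copies: int) -> float:
    """Chance that at least one mirror is reachable, assuming independence."""
    return 1 - (1 - p_online) ** copies


print(usable_gb(10, 5))        # 2.0, the "add 10GB, get 2GB" case
print(availability(0.5, 3))    # 0.875
```

Real peers do not fail independently (a regional outage takes out many at once), so the second number should be read as optimistic.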
    • How about 1.2x? (Score:3, Informative)

      by roystgnr ( 4015 )
      Error correction gets a lot more sophisticated than checksums, you know. You can make a Reed-Solomon codec for 8-bit code words with 255 byte encoded blocks having any even number of parity bytes, and the way optimal RS codes work is that you can recover the original data as long as the number of missing code words plus twice the number of corrupted code words is less than the number of parity code words you chose.

      So, you divide your data into chunks 225 bytes long. Each byte in a chunk goes to a differe
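To put numbers on the Reed-Solomon budget described above, here is a small Python sketch. It does no actual encoding; it only applies the standard bound for an RS(n, k) code, which also allows equality: recovery works when missing code words plus twice the corrupted code words is at most n - k. The function names are illustrative.

```python
def rs_overhead(n: int, k: int) -> float:
    """Storage blow-up of an RS(n, k) code: n stored symbols per k of data."""
    return n / k


def rs_recoverable(n: int, k: int, erasures: int, errors: int) -> bool:
    """True if the loss pattern is within the code's correction budget."""
    return erasures + 2 * errors <= n - k


# 225 data bytes per 255-byte block leaves 30 parity bytes.
print(round(rs_overhead(255, 225), 3))    # 1.133
print(rs_recoverable(255, 225, 30, 0))    # True: 30 peers vanished
print(rs_recoverable(255, 225, 10, 11))   # False: 10 + 22 > 30
```

If each byte of a block sits on a different peer, an erasure corresponds to a peer that is offline and an error to a peer that returns bad data.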
  • FreeNet [freenet.org] was sold on a bunch of users for just that but quite simply no one is willing to dump hard drive space to random users out there.
    However, I would use this sort of thing on an internal network because I directly control how much space is available and I'd be able to, with adoption, access video from one of my three computers from a set-top-box in the living room and manage it as a single library. That's the sort of thing we need to be looking at, but unfortunately very few companies are officially d
  • by YetAnotherName ( 168064 ) on Sunday September 12, 2004 @01:40PM (#10228684) Homepage
    A company called 312, Inc. [312inc.com] already has a commercial product for P2P backups called Lean On Me [312inc.com].

    I don't work for them, etc.
  • DIBS (Score:4, Informative)

    by wan-fu ( 746576 ) on Sunday September 12, 2004 @01:41PM (#10228687)
    Cringely is adding nothing new here. We've all already seen this on Slashdot [slashdot.org]. Hell, the website [berkeley.edu] even mentions how it's like P2P but not.
  • Fud? (Score:3, Insightful)

    by broothal ( 186066 ) <christian@fabel.dk> on Sunday September 12, 2004 @01:44PM (#10228700) Homepage Journal
    I lost interest in what this guy has to say when I read this:

    "But while it might be easy to use Gmail for offsite backup, I couldn't bring myself to do that just because of the intrusive nature of Gmail. Remember this is a system that is by invitation only, which means that Google can quickly map a social network establishing who knows who. And since Gmail actually analyzes the content of your e-mail and can automatically group it by subject (how creepy is that?), Google not only knows who your friends are, but what do you talk about with those friends."

    I nominate this for the prestigious "Fud of the week" award.
  • I did some research into this on my B.Sc. thesis, in essence it's a solution looking for a problem.

    The thing is, you want backups because you want to be able to get your data back. With this (and my idea) you have little control over the backup; in short, it's not a backup.

    FreeNet may at a first ignorant glance be a solution to this dilemma, however, you still have the same terror of doubt. Because you're not in control!

    To summarize, there is a difference between not wanting to lose something, and wanting
  • I've also been suggesting this for years. I'm too lazy to search for the older posts, but here is one from July:

    http://slashdot.org/comments.pl?sid=115027&cid=9743518 [slashdot.org]

    Of course what matters, though, is not talking about ideas, but *doing* them.
  • Pricing (Score:3, Interesting)

    by duvel ( 173522 ) on Sunday September 12, 2004 @01:55PM (#10228769) Homepage
    Cringely writes: Apple, for example, will let you mount up to a 100 megabyte iDrive as part of its .mac Internet service, but that costs $99 per year. Eight dollars per month for 100 megabytes of storage is too darned much.

    The company I work for (banking) sells storage for 120 euro per gigabyte per year to our internal clients. That's storage on RAID-disks (think StorageTek and the like), including backup (on tape) and all necessary services (people doing maintenance, restoring backups, etc). 120 euro / gigabyte / year comes to 1,22 dollar / month / 100 megabytes (compare to 8 $ per month with Apple). Considering our 1,22 $ plus some network costs, plus maintaining a billing system for a couple of million clients, and a bit of profit margin, maybe 8 $ per month is not a rip-off.
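
    For anyone who wants to check the arithmetic, here is a small sketch of the conversion. The exchange rate and the decimal gigabyte are assumptions chosen to match the figures quoted above, not numbers from either price list.

```python
MB_PER_GB = 1000      # assumption: decimal gigabytes
USD_PER_EUR = 1.22    # assumption: rough 2004 exchange rate


def monthly_usd_per_100mb(eur_per_gb_year: float) -> float:
    """Convert a euro-per-gigabyte-per-year price to dollars per 100 MB per month."""
    eur_per_gb_month = eur_per_gb_year / 12
    eur_per_100mb_month = eur_per_gb_month * 100 / MB_PER_GB
    return eur_per_100mb_month * USD_PER_EUR


internal_price = monthly_usd_per_100mb(120)  # the bank's internal rate
apple_price = 99 / 12                        # $99 per year for 100 MB

if __name__ == "__main__":
    print(f"internal: ${internal_price:.2f} per 100 MB per month")
    print(f"apple:    ${apple_price:.2f} per 100 MB per month")
    print(f"ratio:    {apple_price / internal_price:.1f}x")
```

    Under those assumptions the retail price is a bit under seven times the internal cost, which is the gap that the network, billing, and margin would have to fill.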

    • Re:Pricing (Score:2, Insightful)

      by legirons ( 809082 )
      "Cringely writes: Apple, for example, will let you mount up to a 100 megabyte iDrive as part of its .mac Internet service, but that costs $99 per year. Eight dollars per month for 100 megabytes of storage is too darned much."

      For that $100 per year, you could buy 3 128"MB" USB-keys that give you more storage space, have faster copy-times from your computer, and have 3 times the redundancy as the network-storage option. They're small enough to post to a friend in a different location if you want (cheap, and
    • how about $9/month for 3GB of storage?
  • by Ars-Fartsica ( 166957 ) on Sunday September 12, 2004 @01:56PM (#10228771)
    How much data do you *really* want backed up? I have lots of MP3s ripped, but I have "backups" on CD. The OS and programs I can always reload. That leaves me with about five megs of my own data I do not want to ever lose. There are dozens of free repositories that will handle this.

    For larger, business-driven uses, you probably want something like DataSafe. They will keep media for you in a very safe place. Or better yet, keep your whole business disaster protected: have more than one live site for IT operations.

    • How much data do you *really* want backed up?

      I'd say I'm currently pretty good at about 5GB. My databases are very important to me (photos, etc...), plus my mail, and source code and stuff.

      Since I started using arch (and darcs to a lesser extent), every commit I make is backed up in two or three fully usable mirrors within the hour, and on three or four machines within a couple of days. That's probably where my most valuable work is. I've lost photos in the past, though. That hurts.
  • Encrypted, distributed, obscured, as there is no good way to find data unless you know its key..

    Too bad it's still too slow..
  • Cringely is just looking for an excuse to be clever, a fluff-piece space-filler.

    1. He starts by saying he can't use gmail because of privacy. Duh, can you say "encryption"?

    2. He also gives a privacy complaint because gmail knows who you associate with, through the chain of invitations.
    Bullshit. There are lots of people on the web offering anonymous invitation URLs.

    3. Savor this contradiction:
    "First, it is for BACKUP, so recovery has to be slow enough so people won't think of it as another hard drive
  • Baxter? I barely even know her!
  • by lo_fye ( 303245 ) <derek.geekunity@com> on Sunday September 12, 2004 @02:31PM (#10228949) Homepage Journal
    I was thinking about something like this for video activists who frequently have their tapes/discs confiscated by the cops. It'd be great if they had PocketPCs with webcams that were operating in a baxterian sort of way such that the video they were taking was simultaneously being recorded to the storage of other activists/media within wifi range. You could have wifi NAS (network storage) in vehicles and apartments surrounding the demonstration area, as well as on ipod-level storage in future wifi enabled pocketpcs. 3G cameraphones with hard drives [slashdot.org] might provide another simpler option, if they could be networked together in a p2p fashion. The cops might be able to confiscate my webcam and pocketpc, but my recordings (and proof) would be elsewhere in the aether.
  • Already done (Score:3, Insightful)

    by Afty ( 182462 ) on Sunday September 12, 2004 @02:41PM (#10228988)
    There are several research groups doing work on distributed P2P backup systems. I know there's a group at MS doing this, as well as a group at MIT (http://catfish.csail.mit.edu/~kbarr/pstore/), and several others that don't come to mind offhand. I did a project on this in grad school, so I'm familiar with the research.

    There are a lot of issues here, mostly centering around the fact that you can't trust people in an open P2P network.
    1) They might look at your data.
    2) They might not be online when you want your data.
    3) They might delete your data, or do other malicious things to it (insert viruses, etc.).
    4) They might freeload by using space on other hosts and then deleting all the data they receive.
    5) If a host leaves the system permanently, you need to detect that and replicate its data somewhere else. Also, how do you know whether it's leaving permanently or just logging off for a while?

    #1 is easy, just encrypt the data. #2, #3, #4, and #5 are hard because data integrity is really important in a backup solution. You end up having to replicate the data all over the place to "ensure" that it'll be available when you need it, but then you've got the problem of having to donate more space than you receive to use the system. Plus, it's still not certain that your data will be available when you need it.

    Basically what I'm trying to say is that it's a hard problem. :)
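
    To put rough numbers on the replication cost, here is a back-of-the-envelope sketch. It assumes every host is online independently with the same probability, which real P2P networks will not satisfy, and the function names are made up for the illustration.

```python
def availability(p_online: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent hosts is online."""
    if not 0.0 <= p_online <= 1.0:
        raise ValueError("p_online must be a probability")
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return 1.0 - (1.0 - p_online) ** replicas


def replicas_needed(p_online: float, target: float) -> int:
    """Smallest number of full copies that reaches the target availability."""
    if not (0.0 < p_online <= 1.0 and 0.0 <= target < 1.0):
        raise ValueError("need 0 < p_online <= 1 and 0 <= target < 1")
    count = 1
    while availability(p_online, count) < target:
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical home machines that are reachable 30% of the time.
    print(replicas_needed(0.3, 0.99))
    print(round(availability(0.3, 3), 3))
```

    With hosts that are up 30% of the time, getting to 99% availability takes 13 full copies under this model, which is the "donate more space than you receive" problem in a nutshell.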
    • Re:Already done (Score:3, Insightful)

      by burns210 ( 572621 )
      So your data on the DRAID (distributed RAID) is encrypted with your public key, so that only you can decrypt it.

      The system should have redundant locations, similar to GFS (Google's filesystem), which keeps 3 copies of every piece of data (on different computers) for just that reason.

      The system should require that you store 1-3 times as much of other people's data on your system as you have on other people's computers.

      The system should not have a user's data stored on a single computer, rather each file or
  • ...the more I am struck by how stupid he is.

    Backing up data is easy and cheap, as cheap as anti-virus measures for windows boxen... the fact is people cannot get either to work because they are LAZY and STUPID and most of all DETERMINED TO STAY THAT WAY.

    There is an old saying about fools and their money being soon parted, I think there is also a modern corollary, "Fools and their data are soon parted."

    With my work hat on, when asked to help a user with PC problems, I have long evolved a simple tactic, I
  • by mrm677 ( 456727 ) on Sunday September 12, 2004 @03:09PM (#10229154)
    Nothing new here. Check out Berkeley's OceanStore [berkeley.edu] project for an idea of a global storage solution impervious to local disasters.
  • Pastiche (Score:3, Informative)

    by bloo9298 ( 258454 ) on Sunday September 12, 2004 @03:53PM (#10229350)

    Looks like he might like Pastiche [psu.edu].

  • Prior art (Score:3, Interesting)

    by isomeme ( 177414 ) <cdberry@gmail.com> on Sunday September 12, 2004 @04:41PM (#10229696) Journal
    An equivalent idea was proposed in about 1982, at the dawn of the internet. Simply tar your filesystem, then email the tar to yourself along a lengthy old-style routing chain. If you need your data back, just wait for the email to arrive and untar it. You could tune the recovery latency by adjusting the routing chain. Of course, over dialup uucp, even a one-node out-and-back path could result in a two-day latency.

    Man, those were the days.
  • Farsite [microsoft.com]. HiveCache [hivecache.com]. I even worked on a commercial offering: Mangomind [mangosoft.com] (called Medley at the time). Some of these weren't positioned as backup solutions but, structurally, they're just like what Cringely describes. There have been many others, but I'll let people Google for themselves.

  • During the early 2000s an idea like this had already surfaced during the much-hyped Storage Service Provider (SSP) rush. While most companies, like the now-defunct StorageNetworks (NASDAQ:STOR), were just building massive terabyte clusters into colos around the country, one provider, Digital Knox [digitalknox.com], was creating a system very similar to the OceanStore [berkeley.edu] concepts from Berkeley. The idea was not to use P2P, however, since that required users to volunteer space. Simply put take the idea of a RAID array with parity
  • and timothy is probably one of the worst /. editors out there for mindlessly replicating his swill onto the site. Here's my backup solution: FireWire cards in everything (the Macs already have them, and they cost about $30 for one [siig.com] that will work in PCs running Windows or Linux), two LaCie 160GB FireWire drives [lacie.com] (these are about the size of a thick paperback and can be had for $160), and a small Pelican case, #1400 [pelican.com].

    Put one hard drive power supply in the Pelican case, use the other one with the hard drives to back

  • by joey ( 315 )
    duplicity [nongnu.org] already allows trading disk space for backups with friends, or even people you don't know. It's safe (all data encrypted by gpg), it's low bandwidth (deltas sent using the rsync algorithm), and it's not a business.

    The hardest thing about duplicity right now is probably finding a similarly interested party to trade disk space with.

    I trade duplicity space with someone I've never met who has a machine in the same colo, for a backup close to my coloed machine. I also use duplicity to send backups of the
  • by obi ( 118631 ) on Sunday September 12, 2004 @07:35PM (#10231161)
    ... the Distributed Internet Backup System http://www.csua.berkeley.edu/~emin/source_code/dibs/ [berkeley.edu]
