Cringely's P2P Backup Idea 205
gewg_ writes "If Napster and Bit Torrent had a baby, would it Baxter?
As a follow-on to Cringely's
last column where he talked about having a backup strategy in the
wake of Hurricane Frances, this week he proposes a distributed RAID notion as a solution."
Queue Linus Quote in..... (Score:2, Funny)
Ah, there it is
-Chris
Re:Queue Linus Quote in..... (Score:5, Informative)
"Backups are for wimps. Real men upload their data to an FTP site and have everyone else mirror it."
Re:Queue Linus Quote in..... (Score:2, Funny)
Baxter is already taken! (Score:5, Informative)
makes my heart a little warmer... (Score:2, Interesting)
It sounds from every description like the solution is Linux-specific, but I'm sure it can be made to work with other UNIX variants, especially since Gmail, itself, runs on Apple xServe 1u boxes. Windows compatibility is unknown, but I'm sure someone will solve that soon.
I know, it's a little childish, but I get a good feeling when I see something small...even this little thing here...that thinks of other OS's first and Windows compatibility will be "real soon now" or something like tha
If Napster and Bit Torrent had a baby (Score:2)
I don't know, do babies Baxter these days? I mean they puke and shit and cry but when you talk about Baxtering I'm not too sure.
Oh you mean would the baby be Baxter?
Sorry, my fault.
What an awesome idea (Score:5, Funny)
Re:What an awesome idea (Score:5, Interesting)
If I do a search now they're easy to find, much easier than the original 160kbps were.
This was just a test, no special data used - but an amazing way to archive and distribute data.
Re:What an awesome idea (Score:2)
Hey, don't give any ideas to the terrorists.
Oh, wait a minute... You mean that Pr0n I downloaded today had the blueprints of the Pentagon embedded?
p2p backup (Score:5, Funny)
Re:p2p backup (Score:2)
Re:p2p backup (Score:4, Interesting)
1. Make tarball of backup
2. Encrypt if desired
3. Encode tarball, 4-8 bytes at a time, in email addresses
4. Put email addresses on web
5. Wait for spam
Presto -- spammers now pay for your backup; anytime you have a disk failure, just wait a while and watch your spamcan or smtp log, and reconstruct your backup at will.
(Some assembly required, offer void where prohibited)
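Steps 3 to 5 above can be sketched roughly as follows. This is only an illustration of the joke scheme, not anything the poster wrote: the `example.com` domain, the 6-byte chunk size, the `b<sequence>.<hex payload>` address layout, and both function names are assumptions made up for the example.

```python
from typing import Iterable, List

CHUNK = 6               # bytes per address, inside the 4-8 range from step 3
DOMAIN = "example.com"  # placeholder domain (assumption)


def encode_addresses(data: bytes, domain: str = DOMAIN) -> List[str]:
    """Turn a byte string into one e-mail address per CHUNK bytes."""
    addresses = []
    for seq, offset in enumerate(range(0, len(data), CHUNK)):
        payload = data[offset:offset + CHUNK].hex()
        # The sequence number comes first so the order can be rebuilt later.
        addresses.append(f"b{seq:08x}.{payload}@{domain}")
    return addresses


def decode_addresses(addresses: Iterable[str]) -> bytes:
    """Rebuild the data from addresses seen in any order; duplicates are harmless."""
    chunks = {}
    for addr in addresses:
        local = addr.split("@", 1)[0]
        seq_hex, payload = local[1:].split(".", 1)
        chunks[int(seq_hex, 16)] = bytes.fromhex(payload)
    # Note: a chunk that no spammer ever mailed back is silently missing here.
    return b"".join(chunks[i] for i in sorted(chunks))
```

Feeding `decode_addresses` the recipient addresses scraped from a spam folder would give the tarball back, provided every chunk showed up at least once.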
No thanks (Score:5, Insightful)
Re:No thanks (Score:4, Insightful)
Encryption? Simply using GnuPG or any of the free AES encryptors out there will make it incredibly secure. If your data is sensitive enough, you should be doing this already...
-orange
Re:No thanks (Score:4, Insightful)
Re:No thanks (Score:2)
You'd also need some sort of currency system so by sharing out 1GB you'd get a GB in return.
Re:No thanks (Score:3, Interesting)
Or for that matter, why not build encryption into the system itself, so that you don't have to manually do it.
Re:No thanks (Score:3, Funny)
Re:No thanks (Score:2, Insightful)
Re:No thanks (Score:2)
If the keys are stored by the client, there *will* be problems with lost keys, so the key must also be stored on the backup server.
Re:No thanks (Score:3, Interesting)
I've been backing up sensitive data almost exactly like this for quite a while now. I've got an application that breaks a stream of data into chunks and encrypts them. It compares the md5 of the source block against the md5 of the same block from the previous backup. If they match, it hard links the block into the backup directory, if they don't match, it encrypts the b
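A minimal sketch of the block-compare-and-hard-link approach described above, under stated assumptions: the poster's application is not shown, so the on-disk layout (numbered block files plus a `.md5` sidecar holding the plaintext digest), the block size, and the `backup_blocks` name are all hypothetical. Encryption is passed in as a callable because the post does not say which cipher is used.

```python
import hashlib
import os
from pathlib import Path
from typing import Callable, Dict

BLOCK_SIZE = 1 << 20  # 1 MiB per block; an arbitrary choice for this sketch


def backup_blocks(source: Path, prev_dir: Path, new_dir: Path,
                  encrypt: Callable[[bytes], bytes],
                  block_size: int = BLOCK_SIZE) -> Dict[str, int]:
    """Back up `source` block by block, reusing unchanged blocks from prev_dir."""
    new_dir.mkdir(parents=True, exist_ok=True)
    stats = {"linked": 0, "encrypted": 0}
    with source.open("rb") as f:
        index = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            digest = hashlib.md5(block).hexdigest()
            name = f"{index:08d}"
            old_block = prev_dir / name
            old_digest = prev_dir / (name + ".md5")
            if (old_block.exists() and old_digest.exists()
                    and old_digest.read_text() == digest):
                # Unchanged since the previous backup: hard link, no new data.
                os.link(old_block, new_dir / name)
                os.link(old_digest, new_dir / (name + ".md5"))
                stats["linked"] += 1
            else:
                # New or changed block: encrypt it and record its digest.
                (new_dir / name).write_bytes(encrypt(block))
                (new_dir / (name + ".md5")).write_text(digest)
                stats["encrypted"] += 1
            index += 1
    return stats
```

Hard links only work when both backup directories live on the same filesystem, and the sidecar digests are of the plaintext, so they reveal whether two blocks are identical.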
damn.. (Score:2)
Re:damn.. (Score:5, Interesting)
Peer-Redundant File System (Score:2, Informative)
Re:Peer-Redundant File System (Score:2)
Re:damn.. (Score:2)
over and over again ... (Score:3, Interesting)
Well, we [askemos.org] leave the data where it belongs: in the proxy network where the processes live too. Still a bit incomplete, but maturing WebDAV and mountable slices forthcoming...
Freenet (Score:5, Interesting)
Or, just make a local-freenet on the company lan.. everything is encrypted and unretrievable without the proper keys, so it's very secure and it's distributed.. + FEC encoding.
That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Mathew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project
Re:Freenet (Score:2, Interesting)
Re:Freenet (Score:2, Interesting)
Entropy is dead and has been for a few months at this point. They decided that Freenet is broken and claimed they could do better. Time did tell.
The other point many people critical of Freenet make is that other P2P systems are much faster. One chap came into the Freenet channel and claimed Entropy is much better than Freenet because it's written in C++ and real fast. He didn't realize that Entropy development was dead and that the network consisted of about 20 peers. Nobody knows the precise number of Fr
Re:Freenet (Score:2, Interesting)
That assumes freenet works, AFAIK it's still fucking broken. Ian Clarke is playing too much politics with the project and the only coder that really understands freenet (Mathew Toseland) is swamped with ideas, day after day.. it just gets worse and worse... The donations seemed like a good idea, but after watching the DEV list for the last 18 months, I realize it's a failed project :(
Check out other development lists on popular projects (if they're public). You'll find that heated debates, arguments, an
Interesting idea (Score:5, Insightful)
But on the serious side, the claim of using encryption to store data on someone's hard drive worries me. Let's say the encryption gets broken. Now you might get Aunt Nedda's cookie recipes, but then again, you might get BobCo's strategic investment plan for the next 6 months as well. I can see people signing up just for the chance to hunt through people's data.
Re:Interesting idea (Score:2)
More likely, let's say the hard drive gets broken. Would you really trust your data to such a thing?
Re:Interesting idea (Score:2, Insightful)
Worse, if this Aunt Nedda lives in the UK, she could go to jail for 2 years for not being able to decrypt the files on her hard-drive at the request of the police.
Re:Interesting idea (Score:3, Interesting)
On the contrary, I'd say Auntie has a really strong case that she never had the key to someone else's encrypted data stored on her drive, so the RIP act would not apply to her.
Interesting idea at first blush (Score:2)
How much redundancy should there be? Two full copies sound like far too few. If 'R' is the number of redundant copies, then understand that every participant has to be sharing R*D bytes, where D is the average backup size. Plus, of course, their own personal data, so everyone's hard drive has to be at least three times the size of the average data set. For realistic backup strategies, ensuring that a full copy was online at any point in time, R would probably have to be m
Not The First w/ The Idea (Score:5, Informative)
Re:Not The First w/ The Idea (Score:3, Interesting)
I lived the storm (Score:2, Interesting)
Hey, it beats trying to store data to gmail accounts!
Save Betamax (Score:5, Interesting)
Save Betamax [savebetamax.org] is a national Congress call-in day this Tuesday to oppose the INDUCE Act. It might be our last chance to stop this bill.
Re:Save Betamax (Score:2)
Much faster (Score:4, Informative)
As a bonus, you can use it to transport data (eg. your mp3 collection) between places, or even use it to boot linux anywhere [mandrakesoft.com] with much more space and document storage capability than Knoppix.
Re:Much faster (Score:2)
Re:Much faster (Score:3, Insightful)
Re:Much faster (Score:2)
Re:Much faster (Score:2)
Re:Much faster (Score:2)
Re:Much faster (Score:2)
Re:Much faster (Score:3, Insightful)
Not for safety, stupid! (Score:2)
Oohh yeah, baby!
Nice idea, but (Score:5, Interesting)
It doesn't make for a very safe backup, though: What happens if somebody decides to stop the service and just deletes his local storage? You've got no more backup at least for a while, and you might not even know it. And of course, other people have head crashes, too, which would also obliterate your backup at least for the time it takes to recreate it from your own data. Of course, by that time, you might have deleted it yourself, either by accident or knowingly, since you have a backup after all. A viable solution would be to store every file multiple times on different remote servers, although that'd lower the storage capacity you get. It's still the right step, though.
The crucial problem is that the service provider can't really give any guarantees that you will be able to regain your lost data. With three or more independent copies in different locations, it's very unlikely that the backup won't work for some reason, but a backup that's not 100% is not a very useful one, especially in those situations where backups are really crucial.
It's still a neat idea, and to my knowledge has not been done to that degree of sophistication. Of course, as others suggest, nobody is stopping you from inserting encrypted data into Freenet, but that's nowhere near as fast and secure as this could be. And while it's not a true backup, it's better than no backup at all, and most likely enough security for many persons.
Re:Nice idea, but (Score:2)
First you want to back up your critical data. Usually this is not already compressed information like jpeg images, or mpeg video, or even mp3 audio. It's your working documents, source code, and a subset of your e-mail.
If you compress this data, you are likely to experience a 50% compression ratio or higher. This means that once you chunk that data for distribution, you
Re:Nice idea, but (Score:2)
similar already exists (Score:2, Informative)
Re:similar already exists (Score:2)
I still say Gmail... (Score:5, Interesting)
Re:I still say Gmail... (Score:2)
gpg-encrypting a few zipfiles/tarballs and emailing them to your gmail account should be pretty simple.
Safe backup of the gpg keys might be a bit trickier...
Something like the AES-based tools also mentioned is more vulnerable to a dictionary attack than gpg with a very large key...
If Diablo 1 was in P2P (Score:4, Interesting)
P2P can work in wild ways we haven't even tapped.
too bad orrin hatch is trying to outlaw p2p:
www.geocities.com/James_Sager_PA
I'm working on something like this (Score:2)
FolderShare - Pretty similar (Score:3, Informative)
We use foldershare for peer-to-peer backup, but the catch is that you invite people that you trust to your libraries.
For backup purposes, I only invite myself and just connect another computer to the account.
Sounds unfeasible. (Score:3, Insightful)
What BS. (Score:4, Informative)
Re:What BS. (Score:4, Insightful)
I'm sorry chuck... (Score:3, Informative)
I genuinely feel for you and your struggle for safety given the recent events, and you have my deepest sincere sympathy...
But that is not what this article is about. And how about this, given the chance to either leave my data behind or fend for myself given those circumstances...I'd stay with my data.
Perhaps your data isn't a life or death matter to you, but my stacks of CD's, DVD's and harddrives with the past 15 years of my writing, graphics,
Re:What BS. (Score:4, Insightful)
I do believe you reacted a little emotionally, which is understandable given your current situation. I think that if you look at the article again, you will find the only reason he mentions hurricanes is because Frances news reports before the fact got him thinking about it.
That being said, I don't think Cringely was trying to insinuate that someone in a situation such as yours should or could worry about data. The point I took away from the article is that a person wouldn't need to worry about data at all under any disaster circumstance if they implemented a system such as the one he proposes.
I think that if you look at it like that, you will agree that he is not trying to discount the gravity of your experience.
-ft
are we supposed to feel sorry for you? (Score:2, Insightful)
Is it just me, or did this poster sound like some 1930's colonialist complaining about how 'the natives' got out of control?
You want to go to play-school and take advantage of incredibly low living costs due to the enormous disparity between what you hold in your wallet and what the average local makes- you'd better not complain when law breaks down and you suddenly find yourself more wanted than a sugar cookie next to an ant mound.
Funny thing-
Re:What BS. (Score:3, Insightful)
You're a student. You've accumulated less than a decade? worth of useful data. Depending on what that data is (scientific data hard to reproduce, personal writing (books/plays), scientific data easy to reproduce, or highscores on minesweeper), that data may have a $ value from 0 to maybe 10s of thousands of dollars (which means time). Small companies that have been in business for just a few years can have data that is worth millions of dollars. Ask your ne
Poorly Thought Out (Score:4, Informative)
#1) It doesn't recognize the reality of the complexity of backup software. Kinda easy to gloss over 'automated' backups without ever describing them. Pretty hard to imagine some piece of software that can universally back stuff up on everyone's hard drive and at the same time be very easy to use. Imagining mom/dad trying to use software with capabilities similar to Veritas BackupExec isn't easy. And.. imagine the wide variety of live files and databases that it would have to handle.
#2) Data integrity. He suggests a 1:1 ratio for backup space. Not hardly. How is he going to have any kind of redundancy with that? Crashes and people unsubscribing will happen all the time. The data would have to have a *lot* of tolerance to that.
A parity solution wouldn't be nearly enough. That assumes that only 1 failure at a time happens (using RAID 5 as my basis here). It would be easy to imagine that one person unsubscribed with part of your data and another had a crash or corruption problem.
So.. complete mirroring would be necessary. Again, it's easy to imagine 2 people's systems going offline at the same time.. so, you'd probably need more than 2x Mirror. At this point... how much is enough to ensure reliability? 3x 4x 5x ? ? ? How much do you trust your average netizen?
So.. pick your number and then divide your backup space by it. Like 5x? Add 10GB and you have 2GB usable storage. Not very good.
I'll just skip over the 'auto backup' of people's 40GB storage over a 128K up line for now.. already typed too much...
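The two back-of-the-envelope figures in this comment can be checked with a short sketch. The assumptions are mine, not the poster's: "GB" is taken as 10^9 bytes, "128K" as 128,000 bits per second of sustained upload with no protocol overhead, and both function names are made up for the example.

```python
def usable_storage_gb(donated_gb: float, copies: int) -> float:
    """Space you get back if every block is mirrored `copies` times."""
    return donated_gb / copies


def upload_days(size_gb: float, uplink_kbit_s: float) -> float:
    """Days needed to push size_gb through an uplink of uplink_kbit_s."""
    bits = size_gb * 1e9 * 8
    seconds = bits / (uplink_kbit_s * 1e3)
    return seconds / 86400


if __name__ == "__main__":
    for copies in (2, 3, 4, 5):
        print(f"{copies}x mirror: 10 GB donated -> "
              f"{usable_storage_gb(10, copies):.1f} GB usable")
    print(f"40 GB over 128 kbit/s: {upload_days(40, 128):.1f} days")
```

Under those assumptions, 5x mirroring turns 10 GB donated into 2 GB usable, matching the figure above, and the initial 40 GB upload takes roughly 29 days of continuous transfer.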
How about 1.2x? (Score:3, Informative)
So, you divide your data into chunks 225 bytes long. Each byte in a chunk goes to a differe
Simply Goofy (Score:2)
However, I would use this sort of thing on an internal network because I directly control how much space is available and I'd be able to, with adoption, access video from one of my three computers from a set-top-box in the living room and manage it as a single library. That's the sort of thing we need to be looking at, but unfortunately very few companies are officially d
Already a commercial product (Score:4, Interesting)
I don't work for them, etc.
DIBS (Score:4, Informative)
Fud? (Score:3, Insightful)
"But while it might be easy to use Gmail for offsite backup, I couldn't bring myself to do that just because of the intrusive nature of Gmail. Remember this is a system that is by invitation only, which means that Google can quickly map a social network establishing who knows who. And since Gmail actually analyzes the content of your e-mail and can automatically group it by subject (how creepy is that?), Google not only knows who your friends are, but what do you talk about with those friends."
I nominate this to the prestigious "Fud of the week" award.
Solution looking for a problem (Score:2)
The thing is, you want backups because you want to be able to get it back, with this (and my idea) you have little control over the backup; in short words, it's not a backup.
FreeNet may at a first ignorant glance be a solution to this dilemma, however, you still have the same terror of doubt. Because you're not in control!
To summarize, there is a difference between not wanting to lose something, and wanting
Cringely's been reading my posts! (Score:2, Interesting)
http://slashdot.org/comments.pl?sid=115027&cid=97
Of course what matters, though, is not talking about ideas, but *doing* them.
Pricing (Score:3, Interesting)
The company I work for (banking) sells storage for 120 euro per gigabyte per year to our internal clients. That's storage on RAID-disks (think StorageTek and the like), including backup (on tape) and all necessary services (people doing maintenance, restoring backups, etc). 120 euro / gigabyte / year comes to 1,22 dollar / month / 100 megabytes (compare to 8 $ per month with Apple). Considering our 1,22 $ plus some network costs, plus maintaining a billing system for a couple of million clients, and a bit of profit margin, maybe 8 $ per month is not a rip-off.
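The conversion in this comment works out as follows. The post does not state an exchange rate; 1.22 dollars per euro is inferred from its own result, and 1 GB is treated as ten 100 MB units. The function name is hypothetical.

```python
def usd_per_100mb_month(eur_per_gb_year: float, usd_per_eur: float) -> float:
    """Convert a yearly per-gigabyte price in euro to dollars per 100 MB per month."""
    eur_per_gb_month = eur_per_gb_year / 12
    eur_per_100mb_month = eur_per_gb_month / 10  # 1 GB = 10 x 100 MB (assumed)
    return eur_per_100mb_month * usd_per_eur


if __name__ == "__main__":
    # 120 euro/GB/year at an assumed 1.22 USD/EUR
    print(round(usd_per_100mb_month(120, 1.22), 2))
```

So the internal price is about one euro per 100 MB per month, against the 8 dollars quoted for Apple.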
Re:Pricing (Score:2, Insightful)
For that $100 per year, you could buy 3 128"MB" USB-keys that give you more storage space, have faster copy-times from your computer, and have 3 times the redundancy as the network-storage option. They're small enough to post to a friend in a different location if you want (cheap, and
Re:Pricing (Score:2)
Is it really necessary? (Score:4, Insightful)
For larger, business-driven uses, you probably want something like DataSafe. They will keep media for you in a very safe place. Or better yet, keep your whole business disaster protected -have more than one live site for IT operations.
Re:Is it really necessary? (Score:2)
I'd say I'm currently pretty good at about 5GB. My databases are very important to me (photos, etc...), plus my mail, and source code and stuff.
Since I started using arch (and darcs to a lesser extent), every commit I make is backed up in two or three fully usable mirrors within the hour, and on three or four machines within a couple of days. That's probably where my most valuable work is. I've lost photos in the past, though. That hurts.
Sounds like freenet (Score:2)
Too bad its still too slow..
dumb, AND contrived (Score:2)
1. He starts by saying he can't use gmail because of privacy. Duh, can you say "encryption"?
2. He also gives a privacy complaint because gmail knows who you associate with, through the chain of invitations.
Bullshit. There are lots of people on the web offering anonymous invitation URLs.
3. Savor this contradiction:
"First, it is for BACKUP, so recovery has to be slow enough so people won't think of it as another hard drive
What a name (Score:2)
Baxter for Video Activists (Score:3, Interesting)
Re:Baxter for Video Activists (Score:2)
Already done (Score:3, Insightful)
There are a lot of issues here, mostly centering around the fact that you can't trust people in an open P2P network.
1) They might look at your data.
2) They might not be online when you want your data.
3) They might delete your data, or do other malicious things to it (insert viruses, etc.).
4) They might freeload by using space on other hosts and then deleting all the data they receive.
5) If a host leaves the system permanently, you need to detect that and replicate its data somewhere else. Also, how do you know whether it's leaving permanently or just logging off for a while?
#1 is easy, just encrypt the data. #2, #3, #4, and #5 are hard because data integrity is really important in a backup solution. You end up having to replicate the data all over the place to "ensure" that it'll be available when you need it, but then you've got the problem of having to donate more space than you receive to use the system. Plus, it's still not certain that your data will be available when you need it.
Basically what I'm trying to say is that it's a hard problem.
Re:Already done (Score:3, Insightful)
The system should have redundant locations. Similar to the GFS (Google's Filesystem) that has 3 copies of every piece of data (on different computers), for just that reason.
The system should require that you have 1-3 times as much on your system (that is, others' data) as you have on other people's computers.
The system should not have a user's data stored on a single computer, rather each file or
The more I read cringely... (Score:2, Troll)
Backing up data is easy and cheap, as cheap as anti-virus measures for windows boxen... the fact is people cannot get either to work because they are LAZY and STUPID and most of all DETERMINED TO STAY THAT WAY.
There is an old saying about fools and their money being soon parted, I think there is also a modern corollary, "Fools and their data are soon parted."
With my work hat on, when asked to help a user with PC problems, I have long evolved a simple tactic, I
Re:The more I read cringely... (Score:2)
Re:The more I read cringely... (Score:2)
Lots of other projects (Score:4, Informative)
Pastiche (Score:3, Informative)
Looks like he might like Pastiche [psu.edu].
Prior art (Score:3, Interesting)
Man, those were the days.
Been there, done that (Score:2)
Farsite [microsoft.com]. HiveCache [hivecache.com]. I even worked on a commercial offering: Mangomind [mangosoft.com] (called Medley at the time). Some of these weren't positioned as backup solutions but, structurally, they're just like what Cringely describes. There have been many others, but I'll let people Google for themselves.
Redundant Array of Network Devices [RAND] (Score:2)
Cringely's a fucking moron (Score:2)
Put one hard drive power supply in the Pelican case, use the other one with the hard drives to back
duplicity (Score:2)
The hardest thing about duplicity right now is probably finding a similarly interested party to trade disk space with.
I trade duplicity space with someone I've never met who has a machine in the same colo, for a backup close to my coloed machine. I also use duplicity to send backups of the
Maybe he should check out DIBS (Score:3, Informative)
Re:idea (Score:4, Informative)
Large businesses have a scheduling process and hire people to swap tapes, move tapes in and out of the various facilities, rotate tapes, and replace tapes that are no longer reliable. This process is done on a 24x7x365 (plus leap days) basis. Most of the data is actually being backed up via tape silos and 'robots' to handle the actual tapes while the various backups are happening, but it is still a significant investment in people.
A small business may be able to get away with burning a CD-R or CD-RW every night with that day's transactions, and a small stack of CD-R (or RW) every weekend which they take home and store in a CD spindle in their freezer, or something. Though I think you would be hard pressed to find a small business that actually does that. (I am sure there are some that do.) Monthly or quarterly they should be taking a spindle of archived data to a remote relative's place to provide further archival of data.
Mid sized businesses are in a bit of a quandary. The number of tapes needed for a good backup is more than anyone really wants to haul around, handle and store at home, but they are not sure it is worth the expense of using a commercial off-site backup service either.
A project like this may be just what they are looking for. No tapes or disks to try to keep track of. Everything compressed and encrypted, so it is reasonably secure. Retrieval can start as soon as the replacement system is ready to start retrieving it.
I personally think it should be trialed only as a supplement to some other backup strategy, but even then, someone would decide it was either too much of a hassle, or not reliable enough.
There are even people here who think it is 'reasonable' to haul around 160 or 250 Gig hard drives to backup their critical data.
-Rusty
Re:Gmail backups (Score:2)
Oh BTW, if you have an extra invite could you give it to me?