Why Mirroring Is Not a Backup Solution
Craig writes "Journalspace.com has fallen and can't get up. The post on their site describes how their entire database was overwritten through either some inconceivable OS or application bug, or more likely a malicious act. Regardless of how the data was lost, their undoing appears to have been that they treated drive mirroring as a backup and have now paid the ultimate price for not having point-in-time backups of the data that was their business." The site had been in business since 2002 and had an Alexa page rank of 106,881. Quantcast said they had 14,000 monthly visitors recently. No word on how many thousands of bloggers' entire output has evaporated.
DUH! (Score:5, Insightful)
DUH!
Dear Every Corporate Tool in the Universe: (Score:5, Insightful)
El Oh El (Score:4, Insightful)
That's all I can say about this. I'm really surprised that, with all the users they had, they are so quick to say "everything is gone and we're giving up" instead of just starting over and maybe implementing a protocol that would make sure this doesn't happen again.
Re:When is backing up *not* an option? (Score:5, Insightful)
This is the bare minimum, people. Come on!
How hard is it to remember: (Score:5, Insightful)
Mirroring: High availability
Backups: High reliability
Only 2 drives? (Score:4, Insightful)
BUT, according to the site "the server which held the journalspace data had two large drives in a RAID configuration". Only TWO drives.
All they had to do was pull one of the drives, replace it, and lock up the original off site. In a couple of hours the drives would have been mirrored again.
Re:Dear Every Corporate Tool in the Universe: (Score:5, Insightful)
And that's why your IT department actually needs funding. Sleep tight.
They've had the site live for 6 years.
This wasn't a lack of funding, it was just sheer stupidity.
6 years and nobody ever thought it'd be a good idea to back everything up to DVD or an external hard drive. HTML compresses really well, in case they didn't know.
Re:Dear Every Corporate Tool in the Universe: (Score:5, Insightful)
Re:El Oh El (Score:5, Insightful)
Considering how complete and unrecoverable the loss is, they have no idea who their users are. The accounts would have to be recreated from scratch, but who would try? Their users have no reason to ever trust them again. Journalspace would have a difficult time wooing back their original users, and no new user would seriously consider using them.
Bowing out is the only recourse, but I'm glad they're considering releasing their source code.
Re:Ouch (Score:5, Insightful)
Or even one, stale, backup.
Re:To the HR department (Score:4, Insightful)
The only problem with that idea is that it may not have been the IT guy's decision to save money by not having a true backup system. I have seen companies skimp on backup systems because they thought their RAID system was enough.
A lesson for admins, and users too (Score:5, Insightful)
No doubt this incident is the admin's fault. He confused mirroring with backup and carried on with the mistake until it was too late, as pointed out in other comments.
Now what about a user's angle? The moral is you can never assume your data is safer when it's "in the cloud". If you value your blog and your readers, you *should* save a copy of your work as well as the readers' info, *locally*, somewhere you have control over.
There's no place like $HOME.
Re:Dear Every Corporate Tool in the Universe: (Score:2, Insightful)
For want of a nail, the shoe was lost.
For want of a shoe, the horse was lost.
For want of a horse, the rider was lost.
For want of a rider, the battle was lost.
For want of a battle, the kingdom was lost,
And all for the want of a horseshoe nail.
Re:Dear Every Corporate Tool in the Universe: (Score:4, Insightful)
Never underestimate the beancounter's desire to save every cent possible. If your site's working perfectly fine, well, what's the point of having backups? Seriously, I see this happen all the time with small businesses. "Oh, it's never failed before, why do we need backups?" Then the server implodes.
Course, they then get pissed at us for not preventing it, but what do they expect us to do, shell out for a tape drive with our own cash? I think not.
There is a denial going on (Score:5, Insightful)
In today's world where primary storage and protection storage are well-defined, and where an entire industry has grown around them (examples: NetApp, Data Domain), one is hard-pressed to understand the reason for such a debacle. The note referred to in the article [journalspace.com] leads me to believe, unfortunately, that Journalspace's IT department did not understand the difference.
It is sometimes considered bad form to say something bad about fellow techies. We prefer to look for 'outside' causes. Still, to learn and avoid the same problems in the future, one has to admit one's mistakes first. This paragraph from Journalspace's page:
The value of such a setup is that if one drive fails, the server keeps running, using the remaining drive. Since the remaining drive has a copy of the data on the other drive, the data is intact. The administrator simply replaces the drive that's gone bad, and the server is back to operating with two redundant drives.
makes me believe there is a denial going on.
Re:To the HR department (Score:2, Insightful)
A better backup solution needn't cost much, or even anything. Simply FTPing to your own home machine on occasion would have been a millionfold improvement (given the popularity metrics, I don't think this was a staffed operation or anything. Just a guy or two).
Re:That's what backups are for (Score:5, Insightful)
My guess (and this is a guess, I'd never heard of the site before yesterday) is that this is some guy who started his own little site and it got bigger and bigger. Basically he never designed the backup; the system was just slowly pieced together, bigger and bigger, until it got to its current state.
The comments in the messages from the site's operator about the cost of the drive recovery, and about thinking both drives just died at once, indicate to me that this site was basically a hobby for him and he isn't experienced as an admin.
Re:Dear Every Corporate Tool in the Universe: (Score:5, Insightful)
Hell, they could have spent $50 on a USB hard drive (i.e., half-assed it) and been better off!
Re:A lesson for admins, and users too (Score:5, Insightful)
And a corollary to the parent's good advice: if you can't easily get a complete copy of your work, find another host. Manual one-by-one downloads don't cut it.
Re:Ouch (Score:4, Insightful)
This story put the fear of god into me. The first thing I did after reading it was to back up the website I admin (for my dad) locally. I'd always assumed our host would have good backup, but that seems naïve now.
Mirroring (Score:5, Insightful)
Re:Dear Every Corporate Tool in the Universe: (Score:5, Insightful)
Being too stupid to recognize your own shortcomings is also a form of stupidity. Or hubris, whichever is more appropriate.
You need more than backups ... (Score:5, Insightful)
Re:When is backing up *not* an option? (Score:4, Insightful)
Personal backups of online data (Score:5, Insightful)
Do the big kahunas of the "Web 2.0" world give users that option? Gmail, Myspace, Facebook, Twitter etcetera ad nauseam?
Re:When is backing up *not* an option? (Score:2, Insightful)
NAS devices are cheaper and faster now. Lower end removable drives are not much more expensive than tapes, and they are a lot faster and easier to manage.
Having 21 days of off-site backups stored on NAS is kinda difficult.
Re:Excellent! (Score:2, Insightful)
Yes.
But you wouldn't think everyone would catch on...
Re:Double Duh! (Score:4, Insightful)
Or attach a 4 TB Drobo to it and then use Time Machine.
Then make a backup and test the restore.
Their admin is criminally incompetent.
Re:When is backing up *not* an option? (Score:5, Insightful)
Even accepting your price that's a cost of about 12.7 cents per gigabyte and you can get 800GB native LTO-4 tapes for about $50, which comes out to about 6.3 cents per gigabyte.
But quoting costs for desktop grade SATA drives severely understates the true cost. For any non-trivial site installation you're talking near-line rated drives, drive caddies, storage shelves and additional SAN fabric. Then price out the additional power, cooling and rack space. Then price offsite shipping and storage for the bulkier, heavier and more delicate disk option.
Mirroring has its place. Snapshotting has its place. And backups to stable media still have their place too.
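The per-gigabyte figure for tape in the comment above is easy to check. This is a minimal sketch using only the numbers the comment quotes (a $50 LTO-4 cartridge at 800 GB native); the function name `cents_per_gb` is made up for illustration, and drive, library, power and shipping costs are deliberately left out.

```python
def cents_per_gb(price_dollars: float, capacity_gb: float) -> float:
    """Media cost in cents per gigabyte of native capacity."""
    return price_dollars * 100.0 / capacity_gb


# Figures quoted in the comment: a $50 LTO-4 cartridge holding 800 GB native.
lto4_cents = cents_per_gb(50, 800)
print(f"LTO-4 media: {lto4_cents:.2f} cents/GB")
```

That works out to 6.25 cents per gigabyte, which the comment rounds to "about 6.3".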
Re:When is backing up *not* an option? (Score:2, Insightful)
Fine. Get the cartridges, but what about the capital cost minus depreciation of the drive? What about random access?
Now weigh those against an inexpensive jbod frame with a 2gb FC backplane. What's the write speed of LT vs a tasty little GB SAS drive? Rackspace? You can put a dozen into about 4U. Cooling? Although I'll grant you green cost, the random accessibility out-classes the seek time and tape insertion by a human cost dramatically. Stable media? Tape? Sometimes. Shelf space?
SAN fabric is dirt these days. You can get a nice Silkworm and a cheap-but-reliable SAS backplane for dirt as well. Perhaps a couple of GBICs.... or some handy-dandy fiber cables (also dirt these days) and you're in business. Or, put up a 10dot network off your public-face grid, and just use iSCSI. No need to use tape anymore. Get out of the reality distortion field, but do the right thing by testing what you have and doing drills to ensure that whatever you have, works and is a procedure understood by all.
Re:Dear Every Corporate Tool in the Universe: (Score:4, Insightful)
A USB drive is an excellent non-archival backup. Two or more in rotation is even better. That plus a decent RAID for the primary storage will cover most data losses. Even better if the drive goes home with the admin at night.
Re:The rules of backups (Score:3, Insightful)
1. Backup all your data
2. Test your backups
3. Backup frequently
4. Test your backups
5. Take some backups off-site
6. Test your backups
7. Keep some old backups
8. Test your backups
9. Secure your backups
10. Test your backups
11. Perform integrity checking
12. Test your backups
Every company I've worked at has had a backup plan. Exactly zero have had a recovery plan.
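For what "test your backups" might look like at its very simplest, here is a rough sketch, not anyone's actual procedure: archive a directory, restore it into a scratch location, and compare checksums against the source. The helper names (`file_digests`, `make_backup`, `verify_restore`) are hypothetical, and a real recovery drill would also cover databases, permissions, and restoring onto different hardware.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            key = path.relative_to(root).as_posix()
            digests[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def make_backup(source: Path, archive: Path) -> None:
    """Write a compressed tar archive of everything under source."""
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source, arcname=".")


def verify_restore(source: Path, archive: Path) -> bool:
    """Restore the archive into a scratch directory and compare it with source.

    Returns True only if the restored tree has the same files with the
    same contents as the source tree has right now.
    """
    with tempfile.TemporaryDirectory() as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        return file_digests(Path(scratch)) == file_digests(source)
```

Calling `make_backup(src, archive)` and then `verify_restore(src, archive)` right away should return True; if the source has changed since the backup, a False result only means the two differ, not that the archive is bad.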
Re:Double Duh! (Score:3, Insightful)
Re:No Archive.org either (Score:1, Insightful)
So let me get this straight... Journalspace.com was smart enough to have someone there set up a robots.txt file, but nobody there asked if anything was being backed up to a tape/external drive/DVD/CD/Floppy Disk/Cocktail Napkin?
I'm just glad I've never used Journalspace.
Re:When is backing up *not* an option? (Score:1, Insightful)
"There is no rational justification for tape anymore, what with the cost per TB stored on hard disks..."
Pardon? Once you buy the tape library (which admittedly can be pricey), the cost per TB is a hell of a lot less with tape. Check out LTO4 for an example of a high capacity, high performance tape system. (Also, keep in mind that your high-end disk systems also drag along a big chunk of change in infrastructure before you can plug in disks, so the initial cost of a tape library is not such a straight win for disk systems either.)
The monthly cost of spinning disk is also considerable in terms of power, real estate and cooling. And with the costs of disk x 2 for HA, you can get better value for your money with other stuff, *depending* on your requirements.
And before someone else brings it up, data deduplication can work just as well for tape as it does for disk. The next version of IBM's Tivoli Storage Manager (TSM) will have data de-duping built in, and it's due out in Q1 2009. (Let the "brand X de-duping is better than brand Y de-duping" wars begin!)
The moral of the story about disk vs tape vs software is that one solution does NOT fit all situations. Simple to implement doesn't mean cost-effective or even rational. Unfortunately, a good DR plan still requires people to think about disaster scenarios end-to-end and be focused on the business requirements rather than on one narrow definition of "good."
"There is no silver bullet."
Re:DUH! (Score:3, Insightful)
We can only hope they remain silent.
Re:When is backing up *not* an option? (Score:1, Insightful)
Do you and the admin at Journalspace.com share tips, by any chance?
Re:When is backing up *not* an option? (Score:5, Insightful)
That's not my company's policy, that's *my* policy. I can take a 3-month hit to my personal data. AND YET MY LAX PERSONAL POLICY WOULD HAVE SAVED JOURNALSPACE.
My *company's* policy is daily offsiting. Expensive, but very many of our locations could become a smoking hole in the ground and we'd still be able to restore and operate.
Re:When is backing up *not* an option? (Score:5, Insightful)
Fine. Get the cartridges, but what about the capital cost minus depreciation of the drive? What about random access?
Random access is why snapshots also have their place. :) Archival backups and nearline backups solve different sets of problems.
Now weigh those against an inexpensive jbod frame with a 2gb FC backplane.
What kind of capacity are we talking? For a small site you can pick up a little 2U unit that'll store 6.4TB uncompressed for under $5k. Or if you're running a larger site you can snag a 4U unit with two drives for about $15k that'll handle 30.4TB with optional expansion to 60.8TB native.
What's the write speed of LT vs a tasty little GB SAS drive?
120MB/sec per drive without compression. And now that you're talking about SAS drives your per TB cost is hopelessly optimistic. Even OEM packaged terabyte SAS drives are going to run you about a quarter a gigabyte, which is now four times the media cost of an LTO-4 solution.
Rackspace? You can put a dozen into about 4U.
So about 12TB in 4U compared to the 30TB unit I mention above.
Cooling? Although I'll grant you green cost, the random accessibility out-classes the seek time and tape insertion by a human cost dramatically.
Have you never heard of a tape library?
Stable media? Tape? Sometimes.
Properly handled tape is incredibly stable.
Shelf space?
If you're doing off-site storage, that's going to be an issue regardless of what media you're using. And as I pointed out, tape is far more compact and far lighter than disks.
No need to use tape anymore. Get out of the reality distortion field, but do the right thing by testing what you have and doing drills to ensure that whatever you have, works and is a procedure understood by all.
I'm not the one dismissing an entire class of technology while demonstrating ignorance of its costs and benefits.
Re:When is backing up *not* an option? (Score:3, Insightful)
I'm not sure what planet you're on, but I wish the rest of us were there with you.
Backup media should be and must be transported offsite every freakin day. You'd do that with a hard disk? Or more correctly, you'd do that with a STACK of hard disks? Or is your building fire, flood (including broken sprinkler pipes), gas leak, and drunken-truck-driver proof?
Re:When is backing up *not* an option? (Score:3, Insightful)
Can you restore a RAID with different hardware? With LTO3 tape I have several drive choices. Notably, I can buy a NEW drive and know the tape will work even 3-4 years out. What happens when the maker of your RAID solution moves on and wants to send you next year's model? Will the encryption and striping still line up on different hardware made by a different company?
Re:Dear Every Corporate Tool in the Universe: (Score:3, Insightful)
If so, he can do so anyway.
Re:Dear Every Corporate Tool in the Universe: (Score:3, Insightful)
Don't send tapes home with Admins... send them to the bank to be put into a safety deposit box with the day's checks if you have to. Admins don't want tapes in their home, it's a corporate security risk and the admin WILL forget to bring some back because one or two is no big deal... until they're not at your company anymore. I know I wouldn't do that because I wouldn't want to be the guy whose laptop bag gets ripped off with customer data on tapes inside it. It's just bad mojo waiting to happen.
Treat data media just like the company's cash money.
Re:When is backing up *not* an option? (Score:1, Insightful)
With a SCSI or SAS hard drive you'll be lucky to even have the correct adaptors to be able to plug the sodding thing into your controller and power it up after two or three years...
Re:There is a denial going on (Score:3, Insightful)
Yeah, right. If there's anything professionals love to do, it's talk trash about their peers. What's the first thing a computer guy says when you bring him in to fix a broken system? "My god, what idiot spec'd/built/installed/configured this piece of garbage? It's a miracle it ever worked at all!" Ditto every other kind of professional, from plumber to surgeon to architect to accountant.
(As such a professional, I often discover that the idiot I'm complaining about was me.)
Re:When is backing up *not* an option? (Score:3, Insightful)
Anybody who uses disk based backup for a while finds out that it needs to be augmented by tape sooner or later. A disk based subsystem gets full pretty soon once you get used to the convenience. If you continue to buy disk drives it gets really expensive so people find that the best of both worlds is D2D2T. This reduces the size and speed of the tape subsystem you need, but doesn't make it obsolete. You need offsite storage anyway.
Re:Double Duh! (Score:3, Insightful)
BSD is no longer BSD either. You need to pick your flavour, whichever one suits your poison.
Re:There is a denial going on (Score:2, Insightful)
Ditto every other kind of professional, from plumber to surgeon to architect to accountant.
My experience is the exact opposite, particularly when it comes to the medical profession. It's like the mafia, and no one dares speak ill of another 'made man', at least not on the record.
I worked for over a decade as an independent networking consultant, and some of the most daring statements I heard people make when criticizing someone else's design were of the kind "perhaps it is not the most ideal for your environment. Needs change quickly, and not everything can be foreseen". Being a loudmouth rarely buys you a lot of business. Even your clients don't want to see that. It's not in good taste, and then, there is always a possibility somewhere in their minds that if you speak that way of others, you may speak that way of them.
Re:When is backing up *not* an option? (Score:1, Insightful)
And itsy-bitsy data sets apparently. But in a world where one's data is measured in tens of terabytes rather than hundreds of gigabytes, tape is still king.
Re:When is backing up *not* an option? (Score:3, Insightful)
Again with the power requirements! To keep even a month or a year of backups online would require a massive amount of extra hardware and power, not to mention people to tend it. I'd agree it's super good at recovery, but how much more value are you getting versus tapes in a safety deposit box?
Re:When is backing up *not* an option? (Score:2, Insightful)
Not too bad of an idea, least there is duplicate data at a different location which is better than what these guys are proposing.
The ideal thing is to have the duplicated data be MILES from the main DC site, but it's not always practical when you have a large volume of data to replicate and back up. Yes, I know all about rsync and smart data replication, but still there's nothing like good old fashioned complete full backups at the expense of time. That always seems to work well for most people.
Re:DUH! (Score:4, Insightful)
Fixed that for you. ;)
Re:DUH! (Score:3, Insightful)
If your tape is readable the day you make it, it will be readable for many years. Tape (especially half-inch tape) doesn't really go bad over time - 20 year shelf life is common.
The reason you hear the "oh noooes, my tapes aren't readable" horror stories is because people are too lazy to verify the tapes at the time of creation, and the tape drives go silently bad over the years. Schedule a verify after every backup (and you don't need to verify everything you wrote, just a sample) and you won't have a problem with "bad tapes". Well, except maybe with QIC tape; what garbage.
Re:When is backing up *not* an option? (Score:3, Insightful)
Your setup sounds *completely* vulnerable to a single malicious employee (with the right passwords). Typical engineer: protect against multiple failure modes, and disregard malice.
You don't have a backup until you separate the data from the ability to destroy that data. Of course, there is WORM disk storage that meets that need, but that's far more expensive than tape (though handy for meeting auditing requirements).
Something is fishy here..... (Score:3, Insightful)
For everything to be just gone and I mean LONG gone, then something besides a truncation or un-linking of the file had to occur.
Now I don't know all that much about the apple file system, but I would imagine it is like most file systems in that it links clusters and sectors of data together using some sort of allocation table, hash, b-tree or something.
Now unless they had file scrubbing turned on and the OS purposefully went out and overwrote every segment of the file with 01010101 and 10101010 then the vast majority of the data should still be there, at least I would think it would be. I mean even the nastiest revenge oriented guy, would have to be able to invoke some kind of program to do that.
I am assuming that it was an SQL database of some flavor. I don't know much about MySQL internals but I am pretty sure a
delete from table
simply goes through the index and marks pages deleted and does not physically go out and scrub every page that has data on it. I know that is how Oracle works.
So this leaves me wondering about the data recovery house.... If they were doing a sector by sector read on the entire drive (either of them) they should still see all sorts of data on the disk. Now I don't know if the database compresses data on the fly (some do, some don't) and I don't know if drive compression is an option on OS X. If so, I can see where they would see mostly large amounts of compressed data, making things VERY difficult if not impossible to recover. But barring that, most OSes have the hooks built in to simply do a sector by sector read of the storage device, and although your binary data (images and the like) might be unrecoverable, you could probably get most if not all of the text.
Just a thought, but hey I might be crazy, it is just the hacker in me that brings these things to mind...
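The "sector by sector read" idea above can be sketched in a few lines, purely as an illustration of the commenter's speculation and not as a claim about what the recovery house did or could have found. It assumes a raw image file has already been taken from the drive; `carve_text` and its parameters are made-up names, and it finds nothing if the blocks were overwritten, compressed, or encrypted.

```python
import re
from pathlib import Path


def carve_text(image_path, min_len=16):
    """Return (offset, text) for every run of printable ASCII in a raw image.

    Reads the whole image into memory for simplicity; a real disk image
    would have to be scanned in chunks.
    """
    # A run of at least min_len bytes in the range space (0x20) to tilde (0x7e).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    data = Path(image_path).read_bytes()
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]
```

This is roughly what the Unix `strings` utility does; turning the recovered fragments back into usable blog posts would be a much bigger job.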
Re:DUH! (Score:2, Insightful)
Amen to this -- at Fermilab we had a setup with lots of 8mm tapes, and we thought the failure rate was awfully high (several were failing each week), much worse than the 30,000 hour MTBF specified...until we realized it was 30,000 hours with a 5% duty cycle, or 600 hours of use. 600 hours divided by even a dozen tapes is 50 hours, about 6 8-hour days, and these were in use up to 24x6... The system also let us empirically confirm the single-bit error rate of DRAM, something on the order of 1 in 10^13 bits at the time.
Hot spare, hot swap, hot plug...that's how you gotta do it when you have so much hardware on hand that failures need to be planned for rather than prevented.
Re:Serves em Right (Score:3, Insightful)
I see your point, but something about this does not pass the smell test.
To have nothing on the HD(s), someone had to very, very carefully wipe the entire disk by overwriting every block and sector that the data occupied. That would have made whatever DB system shit its pants as it started seeing data disappear, so it would have been really obvious, really fast that something was amiss and, as you relate, would more than likely have caused a kernel panic and/or at least a core dump of the DB system.
Re:Double Duh! (Score:3, Insightful)
<sigh...>
Re:When is backing up *not* an option? (Score:3, Insightful)
It's just being crowded out of the low-end market by ever larger and ever cheaper hard drive sizes. Tape costs would have to drop by about a factor of 4 (or more) to compete in the lower end of the market where 100 tapes is a lot.
(If I could back up 800GB for $10, that would be much more of a no-brainer decision. The cost advantage would be high enough to pay for the expensive tape drive. And $50 LTO-4 tapes are a lot better than back when a lot of large-capacity tapes cost $100 each.)
Re:Double Duh! (Score:3, Insightful)
"Either can do the job so one is a backup..."
Which one is the backup?
The whole point of a backup is that it is *stable*. Neither copy is stable, so there is no "backup on the hardware level". There are two active systems.
If you cannot restore an accidentally-deleted file from it, it's not a backup.
It is a serious mistake to use the term "backup" in relation to a RAID 1 array. There is only one correct way you can do that: "either disk can serve as a backup for the other, should its media fail".
Either disk can serve as a backup for the other *drive*. However, there is no backup copy of the data. It is *not* a backup solution. There is no backup.
There is, however, fault-tolerance. A media fault can be tolerated. But if the active copy of the data is corrupted, there is no backup.
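The distinction drawn above can be shown with a toy model. This is only a sketch: `MirroredVolume` is an invented in-memory stand-in for a two-disk mirror, not real RAID code, and the "backup" is just a point-in-time copy kept outside the array.

```python
import copy


class MirroredVolume:
    """Toy two-disk mirror: every write and delete is applied to both disks."""

    def __init__(self):
        self.disks = [{}, {}]

    def write(self, name, data):
        for disk in self.disks:
            if disk is not None:
                disk[name] = data

    def delete(self, name):
        for disk in self.disks:
            if disk is not None:
                disk.pop(name, None)

    def fail_disk(self, index):
        self.disks[index] = None  # simulate a media failure

    def read(self, name):
        for disk in self.disks:
            if disk is not None and name in disk:
                return disk[name]
        return None

    def snapshot(self):
        """Point-in-time copy, kept separately from the live array."""
        for disk in self.disks:
            if disk is not None:
                return copy.deepcopy(disk)
        raise RuntimeError("no surviving disk")


volume = MirroredVolume()
volume.write("blog.db", "six years of posts")
backup = volume.snapshot()

volume.fail_disk(0)
survived_disk_failure = volume.read("blog.db")  # the mirror tolerates this

volume.delete("blog.db")              # accidental or malicious delete
after_delete = volume.read("blog.db")  # gone from every live copy
restored = backup.get("blog.db")       # still present in the backup
```

The mirror survives the failed disk, but the delete is faithfully applied to every live copy; only the separate point-in-time copy still has the file.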
Re:When is backing up *not* an option? (Score:1, Insightful)
SAN fabric is dirt these days. You can get a nice Silkworm and a cheap-but-reliable SAS backplane for dirt as well. Perhaps a couple of GBICs.... or some handy-dandy fiber cables (also dirt these days) and you're in business.
Out of curiosity do you have any idea what you're talking about? Fiber or GBICs? You're gonna need both...
Also SAN/LTO are hardly mutually exclusive. Plenty of tape libraries are san attached.