Data Storage Upgrades

Why RAID 5 Stops Working In 2009 (803 comments)

Lally Singh recommends a ZDNet piece predicting the imminent demise of RAID 5, noting that increasing storage and non-decreasing probability of disk failure will collide in a year or so. This reader adds, "Apparently, RAID 6 isn't far behind. I'll keep the ZFS plug short. Go ZFS. There, that was it." "Disk drive capacities double every 18-24 months. We have 1 TB drives now, and in 2009 we'll have 2 TB drives. With a 7-drive RAID 5 disk failure, you'll have 6 remaining 2 TB drives. As the RAID controller is busily reading through those 6 disks to reconstruct the data from the failed drive, it is almost certain it will see an [unrecoverable read error]. So the read fails ... The message 'we can't read this RAID volume' travels up the chain of command until an error message is presented on the screen. 12 TB of your carefully protected — you thought! — data is gone. Oh, you didn't back it up to tape? Bummer!"
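For a rough sense of the odds behind that claim, here is a minimal sketch. It assumes the commonly quoted consumer SATA spec of one unrecoverable read error per 10^14 bits read (the summary itself does not state a rate) and treats errors as independent per bit. `p_at_least_one_ure` is a name made up for this example.

```python
import math

def p_at_least_one_ure(bytes_read: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of hitting at least one unrecoverable read error (URE).

    Assumes each bit fails independently with probability ure_per_bit.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, computed via log1p/expm1 to avoid rounding loss
    return -math.expm1(bits * math.log1p(-ure_per_bit))

TB = 1e12  # decimal terabyte, as drive vendors count it
p_rebuild = p_at_least_one_ure(6 * 2 * TB)  # six surviving 2 TB drives
print(f"P(URE during rebuild) = {p_rebuild:.2f}")
```

Under those assumptions the result comes out at roughly 0.62: more likely than not, though somewhat short of "almost certain". A drive rated at 1 in 10^15 would change the picture considerably.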
This discussion has been archived. No new comments can be posted.

  • by Whiney Mac Fanboy ( 963289 ) * <whineymacfanboy@gmail.com> on Tuesday October 21, 2008 @07:03PM (#25461453) Homepage Journal

    12 TB of your carefully protected — you thought! — data is gone. Oh, you didn't back it up to tape? Bummer!

    If it wasn't backed up to an offsite location, then it wasn't carefully protected.

  • RAID != Backup (Score:4, Insightful)

    by vlad_petric ( 94134 ) on Tuesday October 21, 2008 @07:09PM (#25461521) Homepage

    I mean, WTF? Many people regard RAID as something magical that will keep their data no matter what happens. Well ... it's not.

    Furthermore, for many enterprise applications disk size is not the main concern, but rather I/O throughput and reliability. Few need 7 disks of 2 TB in RAID5.

  • What. (Score:4, Insightful)

    by DanWS6 ( 1248650 ) on Tuesday October 21, 2008 @07:13PM (#25461545)
    The problem with Raid 5 is that the more drives you have, the higher the probability that more than one drive dies. That's why you have multiple raid 5 arrays of 4 disks maximum instead of one array of 7 disks.
  • Oh look, noobs. (Score:1, Insightful)

    by Anonymous Coward on Tuesday October 21, 2008 @07:13PM (#25461553)

    If you use RAID to 'protect' your data, you clearly don't value your data at all.

    While the interesting bit of this article is the coming demise of RAID 5, what you should be bringing away with it is, if RAID is all that stands between you and data loss, you're a noob.

  • by SatanicPuppy ( 611928 ) * <SatanicpuppyNO@SPAMgmail.com> on Tuesday October 21, 2008 @07:16PM (#25461583) Journal

    Yea, because we all backup 12TB of home data to an offsite location. Mine is my private evil island, and I've bioengineered flying death monkeys to carry the tapes for me. They make 11 trips a day. I'm hoping for 12 trips with the next generation of monkeys, but they're starting to want coffee breaks.

    I'm sorry, but I'm getting seriously tired of people looking down from the pedestal of how it "ought" to be done, how you do it at work, how you would do it if you had 20k to blow on a backup solution, and trying to apply that to the home user. Even the tape comment in the summary is horseshit, because even exceptionally savvy home users are not going to pay for a tape drive and enough tapes to archive serious data, much less handle shipping the backups offsite professionally.

    This is serious news. As it stands, the home user that actually sets up a RAID 5 raid is in the top percentile for actually giving a crap about home data. Once that becomes a non-issue, then the point has come when a reasonable backup is out of reach of 99% of private individuals. This, at the same time as more and more people are actually needing a decent solution.

  • Re:RAID != Backup (Score:4, Insightful)

    by Anonymous Coward on Tuesday October 21, 2008 @07:20PM (#25461643)

    Furthermore, for many enterprise applications disk size is not the main concern, but rather I/O throughput and reliability. Few need 7 disks of 2 TB in RAID5.

    Some of us do need a large amount of reasonably priced storage with fast read speed & slower write speed. This pattern of data access is extremely common for all sorts of applications.

    And this raid 5 "problem" is simply the fact that modern sata disks have a certain error rate. But as the amount of data becomes huge, it becomes very likely that errors will occur when rebuilding a failed disk. But errors can also occur during normal operation!

    The problem is that sata disks have gotten a lot bigger without the error rate dropping.

    So you have a few choices:

    - use more reliable disks (like scsi/sas) which reduce the error rate even further
    - use a raid geometry that is more tolerant of errors (like raid 6)
    - use a file system that is more tolerant of errors
    - replicate & backup your data
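    To put numbers on the first option, the sketch below asks how much data can be read before an unrecoverable read error becomes more likely than not. The rates are assumptions based on typical datasheet figures (1 in 10^14 bits for desktop SATA, 1 in 10^15 for enterprise drives), not numbers from this thread, and `tb_until_even_odds` is a hypothetical helper.

```python
import math

def tb_until_even_odds(ure_per_bit: float) -> float:
    """Terabytes (10^12 bytes) read at which P(at least one URE) reaches 50%.

    Assumes independent per-bit errors: solve (1 - p)^bits = 0.5 for bits.
    """
    bits = math.log(0.5) / math.log1p(-ure_per_bit)
    return bits / 8 / 1e12

for label, rate in [("desktop SATA, 1e-14", 1e-14), ("enterprise, 1e-15", 1e-15)]:
    print(f"{label}: about {tb_until_even_odds(rate):.1f} TB")
```

    With these assumed rates the break-even point moves from under 9 TB to under 90 TB, which is why a tenfold better error rate matters more as arrays grow.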

  • by tepples ( 727027 ) <tepples.gmail@com> on Tuesday October 21, 2008 @07:21PM (#25461651) Homepage Journal

    When HDD's move to bigger sectors - there should be better error recovery reducing the probability of unrecoverable read errors. Right?

    Not if what fails is the drive motor.

  • by EdIII ( 1114411 ) * on Tuesday October 21, 2008 @07:22PM (#25461661)

    I can see a lot of people getting into a tizzy over this. The RAID 5 this guy is talking about is controlled by one STUPID controller.

    There are a lot of methods, and patented technology that prevent just the situation he is talking about. Here is just one example:

    PerfectRAID(TM) is Promise's patented RAID data protection technology; a suite of data protection and redundancy features built into every Promise RAID product.

            * Predictive Data Migration (PDM): Replace un-healthy disk member in array and keep array on normal status during the data transition between healthy HD and replaced HD.
            * Bad Sector Mapping and Media Patrol: These features scan the system's drive media to ensure that even bad physical drives do not impact data availability
            * Array Error Recovery: Data recovery from bad sector or failed HD for redundant RAID
            * RAID 5/6 inconsistent data Prevent (Write Hole Table)
            * Data content Error Prevent (Read/Write Check Table)
            * Physical Drive Error Recovery
            * SMART support
            * Hard/Soft Reset to recover HD from bad status.
            * HD Powercontrol to recover HD from hung status.
            * NVRAM event logging

    RAID is not perfect, not by any stretch, but if you use it properly it will serve its purpose quite nicely. If your data is that critical, having it on a single raid is ill advised anyways. If you are talking about databases, then RAID 10 is preferable, and replicating the databases across multiple sites even more so.

  • Smells Like FUD. (Score:5, Insightful)

    by sexconker ( 1179573 ) on Tuesday October 21, 2008 @07:25PM (#25461697)

    What is this article about?

    They say that since there is more data, you're more likely to encounter problems during a rebuild.

    The issue isn't with RAID, it's with the file system. Use larger blocks/sectors.

    Losing all of your data requires you to have a shitty RAID controller. A decent one will reconstruct what it can.

    The odds of you encountering a physical issue increase as capacity increases, and decrease as reliability increases. In theory, the 1 TB and up drives are pretty reliable. Anything worth protecting should be on server-grade hard drives anyway.

    The likelihood of a physical problem popping up during your rebuild is no higher with new drives than it was with old drives. I haven't noticed my larger drives failing at higher rates than my older, smaller drives. I haven't heard of them failing at higher rates.

    Remember, folks, RAID is a redundant array of inexpensive disks. The purpose of RAID is to be fault-tolerant, in the sense that a few failures don't put you out of production. You also get the nice bonus of being able to lump a bunch of drives together to get a larger total capacity.

    RAID is not a backup solution.

    RAID 5 and RAID 6, specifically, are still viable solutions for most setups. If you want more reliability, go with RAID 1+0, RAID 5+0, whatever.

    Choosing the right RAID level has always depended on your needs, setup, budget, and priorities.

    Smells like FUD.

  • by mbone ( 558574 ) on Tuesday October 21, 2008 @07:25PM (#25461707)

    How many times does this have to be said.

    RAID is not a backup. RAID is designed to protect against hardware failures. It can also increase your I/O speed, which is more important in some cases. Backups are different.

    Depending on what you are doing, you may or may not need a RAID, but you definitely need backups.

  • by DrVxD ( 184537 ) on Tuesday October 21, 2008 @07:28PM (#25461729) Homepage Journal

    RAID 5, as well as RAID 6 is nothing more than an attempt to add some amount of redundancy without sacrificing too much space. Go RAID 1 instead with the same number of disks.

    As far as I'm concerned, RAID 5 really has no redeeming features (it's slow, not particularly safe, but lulls people into a false sense of security).

    From a data integrity perspective, though, RAID6 is a better solution than RAID1.

    Given arrays of equal sizes, with RAID6 your data can survive the loss of *any* two disks; with RAID1, if you lose two disks which happen to be a mirrored pair, then you're hosed.
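    The comparison can be checked by brute force. The sketch below assumes the mirrored array is laid out as striped mirror pairs (RAID 10), with disks 2k and 2k+1 forming pair k; that layout and the helper name `fatal_pair_fraction_raid10` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def fatal_pair_fraction_raid10(n_disks: int) -> Fraction:
    """Fraction of two-disk failures that destroy a RAID 10 of mirrored pairs.

    Assumes disks 2k and 2k+1 form mirror k; data is lost only when both
    halves of the same mirror fail.
    """
    if n_disks < 2 or n_disks % 2:
        raise ValueError("need an even number of disks, at least 2")
    pairs = list(combinations(range(n_disks), 2))
    fatal = sum(1 for a, b in pairs if a // 2 == b // 2)
    return Fraction(fatal, len(pairs))

# RAID 6 survives every two-disk combination, so its fatal fraction is 0.
print(fatal_pair_fraction_raid10(8))
```

    For n disks the fraction works out to 1/(n-1), so in an 8-disk mirrored array one in seven double failures is fatal, where RAID 6 would survive all of them.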

    But, as you point out, RAIDn doesn't really qualify as "carefully protected"

  • Re:RAID != Backup (Score:4, Insightful)

    by MBCook ( 132727 ) <foobarsoft@foobarsoft.com> on Tuesday October 21, 2008 @07:30PM (#25461739) Homepage

    I've always understood it as RAID exists to keep you running either during the 'outage' (i.e. until a new disk is built) or at least long enough to shut things down safely and coherently (as opposed to computer just locking up or some such).

    It's designed to give you redundancy until you fix the problem. It's designed to let you limp along. It's not designed to be a backup solution.

    As others have mentioned: if you want a backup set of hard drives, you run RAID 10 or 15 or something where you have two(+) full copies of your data. And even that won't work in many situations (i.e. computer suddenly finds itself in a flood).

    All that said, the guy has a possible point. How long would it take to build a new 1TB drive into an array? That could be problematic.

    There is a reason SANs and other such things have 2+ hot spares in them.

  • by Whiney Mac Fanboy ( 963289 ) * <whineymacfanboy@gmail.com> on Tuesday October 21, 2008 @07:33PM (#25461773) Homepage Journal

    Oh come on. Do you have 12TB of home data? Seriously? And if you do, it's not that hard to have another 12TB of external USB drives at some relative's place.

    I've got about 500GB of data that I care about at home & the whole lot's backed up onto a terabyte external HDD at my Dad's. It's not that hard.

    If you think raid is protecting your data, you're crazy.

  • by SatanicPuppy ( 611928 ) * <SatanicpuppyNO@SPAMgmail.com> on Tuesday October 21, 2008 @07:36PM (#25461817) Journal

    Yea, but DVD is transient crap. How long will those last? A few years? You cannot rely on home-burned optical media for long term storage, and while burning 12 terabytes of information on to one set of 1446 dvds (double layer) may not seem like a big deal, having to do it every three years for the rest of your life is bound to get old.

    For any serious storage you need magnetic media, and though we all hate tape, 5 year old tape is about a million times more reliable than a hard drive that hasn't been plugged in in 5 years.

    So either you need tape in the sort of quantity that the private user cannot justify, or you're going to have to spring for a hefty RAID and arrange for another one like it as a backup. Offsite if you're lucky, but it's probably just going to be out in your garage/basement/tool shed.

    Now, what do you do if you can't rely on RAID? No other storage is as reliable and cheap as the hard drive. ZFS and RAID-Z may solve the problem, but they may not...You can still have failures, and as hard disk sizes increase, the amount of data jeopardized by a single failure increases as well.

  • RAID6 = Win (Score:3, Insightful)

    by MukiMuki ( 692124 ) on Tuesday October 21, 2008 @07:39PM (#25461851)

    Scrub once a week, or once every two weeks.

    RAID6 isn't about losing any two disks, it's about having two parity stripes. It's about being able to survive sector errors without any worry.

    It's about losing ONE drive and still having enough parity to replace it without any errors.

    RAID6 on 5 drives is retarded, tho, because it leaves you absurdly close to RAID1 in kept space. RAID6 is for when you have 8-10 drives. At that point you barely notice the (N - 2) effect and you have a fast (provided your processor can handle it all) chunk of throughput along with an incredibly reliable system. Well, N-3 with a hotswap.
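    The (N - 2) effect is easy to tabulate. This is a minimal sketch assuming equal-size drives; `usable_fraction` is a made-up helper, and the hot-spare parameter covers the "N-3" case mentioned above.

```python
def usable_fraction(n_drives: int, parity_drives: int = 2, hot_spares: int = 0) -> float:
    """Share of raw capacity left for data in a parity array of equal-size drives."""
    data_drives = n_drives - parity_drives - hot_spares
    if data_drives < 1:
        raise ValueError("not enough drives for this layout")
    return data_drives / n_drives

# RAID 6 (two parity drives) at a few array sizes; mirroring is always 50%.
for n in (5, 8, 10):
    print(n, f"{usable_fraction(n):.0%}")
```

    Five drives leave 60% usable, barely better than the 50% of a mirror, while eight to ten drives leave 75% to 80%.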

    Personally, I think I'd go RAID-Z2 via ZFS if only because it's a little bit sturdier a filesystem to begin with.

  • by Angus McNitt ( 542101 ) on Tuesday October 21, 2008 @07:44PM (#25461893)

    ... very few people correctly cycle in new drives periodically to reduce the chance of a mass failure.

    That is also because very few people buy a Raid setup piecemeal. Most end up buying a solution, fully populated. The idea of swapping out some drives as you go, or growing your RAID over time doesn't always look good, either to the PHBs who usually run the budget, or to the vendor. We had a vendor trying to sell us a iSCSI SAN device tell us that varying the drive lots and dates increased the chances of failure. Needless to say we went elsewhere.

    When we bought the RAID array for our Exchange box, this is going back a few years, everybody looked at me like an idiot because I asked for drives with different lot numbers. It was the best I could do as buying over time was not an option. HP was actually pretty cool about this request and out of 8 disks, no 3 have the same lot number or manufacture date.

    Of course we are also running RAID on that machine for non-backup and do a nightly replication, so your mileage may vary.

  • Wow, how incite-ful. Doesn't matter what the discussion is, some geek is bound to weigh in with all the shortcomings of any idea.

    Newsflash: there is no perfect backup! No method is foolproof, especially when it's bound to be boring as hell, and you've got an inevitable human factor. You get lazy moving the tapes offsite, you put off fixing a dead drive because there are 4 others, you wipe your main partition upgrading your distro and forget that your CRON rsync script uses the handy --delete flag, and BOOM wipes out your backup.

    Shit happens. Pointing out what we all already know doesn't do anything helpful.

  • RAID-10 (Score:2, Insightful)

    by tonytnnt ( 1335443 ) on Tuesday October 21, 2008 @07:51PM (#25461979)
    RAID-10 ftw? Expensive I know, but at least you have a full layer of redundancy rather than just a parity drive.
  • by lucas teh geek ( 714343 ) on Tuesday October 21, 2008 @07:51PM (#25461981)

    RAID doesn't protect against your worst enemy
    rm -r *

    nor is it supposed to. not being a moron seems to have protected me from "my worst enemy" just fine. RAID has protected me from random disk failures. seems to be working as designed

  • by MBCook ( 132727 ) <foobarsoft@foobarsoft.com> on Tuesday October 21, 2008 @07:59PM (#25462047) Homepage
    Good points. While magnetic media is problematic, SSDs are going to become a very viable option for the home backup (compared to stacks of DVDs or the possible reliability of old magnetic HDs).
  • My solution (Score:3, Insightful)

    by SuperQ ( 431 ) * on Tuesday October 21, 2008 @07:59PM (#25462053) Homepage

    I'm in the process of building a new 8x 1T array. I'm not using any fancy raid card. Just a LSI 1068E chipset with a SAS expander to handle LOTS (16 slots in the case, using 8 right now).

    I'm not putting the entire thing into one big array. I'm breaking up each 1T drive into ~100GB slices that I will use to build several overlapping arrays. Each MD device will be no more than 4-5 slices. This way if an error occurs on one disk in one part of a disk I will have a higher probability of recovery.

    I may also use RAID 6 to give me more chance of rebuilding.

    Disk errors tend to not be whole disk errors, just small broken physical parts of a single disk.

    SMART will give me more chance to detect and replace dying drives.

  • by cbreaker ( 561297 ) on Tuesday October 21, 2008 @08:00PM (#25462057) Journal

    Seriously - what's the problem with RAID 5? It's not a FALSE sense of security: It actually DOES prevent data loss or down time on a single disk failure. If you're a moron, you're creating 14 disk arrays. If you're smart, you keep it to 7 disks at the very most.

    RAID 5 is great. It's fast, unless you have a shit controller without enough cache. It's going to prevent down time on a single disk failure (which is overwhelmingly the most common type of failure) and it doesn't cost you too much capacity.

    Usually I'm more concerned with a fire or flood than a double-disk failure.

    RAID 6 is good, but you get the same (actually worse) performance hit over RAID 5. More parity calculations. You can lose any two disks, which is nice, and if you can spare the space, go for it!

    I don't see RAID 6 as being all that much more of a big deal over RAID 5 and actually it shouldn't really have its own number since it's exactly the same technology and parity system as 5. It should be RAID 5.1 or something. Or maybe RAID5+1. The only reason it's become more available now is because controllers have gotten fast enough to deal with the additional parity.

  • by grahamd0 ( 1129971 ) on Tuesday October 21, 2008 @08:05PM (#25462113)

    Yea, but DVD is transient crap. How long will those last?

    But DVD is *cheap* transient crap, and perfectly adequate for home backups.

    I've got something in the area of 200GB of data on the machine which I'm currently using to type this, but very little of that data has any intrinsic or sentimental value to me. Most of it is applications and games that could easily be reinstalled from the original media or re-downloaded. A DVD or two could easily hold all of the data I *need* and even cheap optical media will outlive this machine's usefulness.

  • Dumbass. (Score:3, Insightful)

    by cbreaker ( 561297 ) on Tuesday October 21, 2008 @08:06PM (#25462131) Journal

    I guess you should be considered a new age Luddite?

    Are you the same guy that always waits for SP1 before using any software? I thought so.

    RAID is a proven technology and it's used in nearly all business IT systems from big to tiny.

    RAID isn't meant as a replacement to backups. It's one PART of the entire system of preventing unnecessary data loss, and more importantly, down time. You can keep on running your server while the failed disk is replaced and rebuilt.

    So, while I eat Cheetos and surf Slashdot while that RAID array rebuilds itself, you can go ahead and recover your old data from last night all day long while people bitch at you for not using the technology that's been around since the inception of the hard drive.

    If you actually did have the experience you claim, you'd slap yourself for such a stupid fucking post.

  • by WhatAmIDoingHere ( 742870 ) <sexwithanimals@gmail.com> on Tuesday October 21, 2008 @08:16PM (#25462215) Homepage
    RAID is NOT a back-up solution. RAID is a "oh shit my hard drive failed" solution.
  • by Wesley Felter ( 138342 ) <wesley@felter.org> on Tuesday October 21, 2008 @08:19PM (#25462257) Homepage

    SSDs are going to become a very viable option for the home backup

    Yeah, I love paying much more for my backup than for my primary storage.

  • by backtick ( 2376 ) on Tuesday October 21, 2008 @08:24PM (#25462319) Homepage Journal

    First off, Isn't this story a year+ old? Sheesh.

    Second off, if you're worried about URE on X number of disks, what about a single capacitor cooking off on the raid controller? No serious data is stored on a single raid controller system, without good backups or another raid'd system on completely unique hardware. Yes, if you put a lot of disk on one controller and have a failure you have a higher risk of *another* failure. That's why important data doesn't depend on *only* RAID, and why lots of places use mirroring, replication, data shuttling, etc. This isn't new. Most folks that can't afford to rebuild from backups or from a mirror'd remote device also couldn't have used 12TB for anything *but* bulk offline file storage because it's slower than christmas VS a 'real' storage array. Using it for the uber HD DVR? Great. Oh no, you lose X-files's last episodes. This isn't banking data we're talking here.

  • by John Hasler ( 414242 ) on Tuesday October 21, 2008 @08:28PM (#25462365) Homepage

    Prioritize your data. I cannot believe that a home user has 12TB of important stuff. Back up your critical records both on site and off [1]. Back up the important stuff on site with whatever is convenient. Let the rest go hang.

    [1] Use DVDs in the unlikely event you have that much critical data. Few home users will have a critical need for that stuff beyond the life of the media. Any that do can copy it over every five years, and take the opportunity to delete the obsolete stuff.

  • by Phizzle ( 1109923 ) on Tuesday October 21, 2008 @08:33PM (#25462411) Homepage
    Isn't it more cost effective to do RAID 1, with a nightly backup to an external? At least in my home, I do not require mission critical hot-swapping capabilities. Then again I only have 3x 1TB hard drives. Also, after RTFAing: the author of the article assumes that an unrecoverable read error corrupts your RAID array. It does not; typically your bad sector gets added to the list and mapped out of being used. Speaking of used, the article assumes that the entire drive is being used, but if the error is on a part of the drive not covered with data, this is also a non-issue.
  • by SatanicPuppy ( 611928 ) * <SatanicpuppyNO@SPAMgmail.com> on Tuesday October 21, 2008 @08:38PM (#25462451) Journal

    Sure, right now. The first hard drive I ever bought was 8 megabytes and cost 600 dollars. 4 years ago I bought a 1gb usb flash drive for 300 dollars, now they're running 10-20 bucks.

    In a few years solid state will be something I'm looking at VERY seriously. It has serious potential for long term storage. Yea, it's too expensive...right now...But in the long run it's the most promising thing out there.

  • by jandrese ( 485 ) <kensama@vt.edu> on Tuesday October 21, 2008 @08:42PM (#25462473) Homepage Journal
    Then I've got great news for you about tape drives.
  • by Wesley Felter ( 138342 ) <wesley@felter.org> on Tuesday October 21, 2008 @08:42PM (#25462479) Homepage

    I agree that SSDs are inevitable... for primary storage. Once I've switched my laptop over to SSD I'll still use a hard disk for backup, though.

  • The vast majority of Egypt's writings were stored on perishable papyrus, not carved or painted on stone. Of all that they ever wrote or stored, we have but the tiniest fraction remaining.

    If we lost technology today, there would be nothing left but paper in 20 years. In a thousand, there wouldn't even be much paper.

  • by postbigbang ( 761081 ) on Tuesday October 21, 2008 @08:56PM (#25462623)

    If you source the original term 'RAID', it goes to an ACM article describing Redundant Arrays of Inexpensive Disks. In RAID 0, which is actually a marketing term, there's striping, but no redundancy that can infer the contents of a missing member of the array. From the perspective of availability, it has none. As you cite, RAID 1 is a mirrored pair, usually the same type of drive, and it also is likely the fastest RAID-- and most expensive in terms of available net data after redundancy for availability. There is also no RAID 6...10, as these are marketing terms, too.

  • by ushering05401 ( 1086795 ) on Tuesday October 21, 2008 @08:57PM (#25462627) Journal

    "Shit happens. Pointing out what we all already know doesn't do anything helpful."

    Actually, it gives posters like you a chance to remind everyone else that shit happens.

    I believe there would be many fewer frustrated/bitter IT workers if more people meditated on the fact that shit just happens. In today's marketplace it is usually IT left holding the bag when things go south anyhow... gotta get acclimated to that and roll on.

    Anyhow, I doubt there are many IT veterans not familiar with really expensive, really borked backup systems. Smarter people than me have observed that as technology progresses, existing strategies either age or mature. The ones that age become brittle, and the ones that mature become more robust...

    Corporate suits usually ensure that both aged and mature technologies will be flogged on long past their rational retirement dates.

  • Don't panic! (Score:4, Insightful)

    by Joce640k ( 829181 ) on Tuesday October 21, 2008 @08:59PM (#25462647) Homepage

    RAID 5 will still be orders of magnitude more reliable than just having a single disk.

  • by MightyYar ( 622222 ) on Tuesday October 21, 2008 @09:21PM (#25462869)

    I do the same thing, but I want to warn you...

    I've had TWO occasions where it has failed me. Once, a lightning strike that zotched both drives. The second time a rubber isolator failed in the case and the master drive fell onto the backup.

    In both cases the bad spots in the two drives were different so I got back most of my data, but now I use Mozy as well as mirroring. I REALLLLLLLY don't want to lose all of my digital photos. :)

  • Re:Don't panic! (Score:5, Insightful)

    by Anonymous Coward on Tuesday October 21, 2008 @09:47PM (#25463119)

    No, it won't. That's the point of this not-news article. It's getting to the point where (due to the size of the disks) a rebuild takes longer than the statistically "safe" window between individual disk failures. Two disks kick it in the same timeframe (the chance of which increases as you add disks) and you're screwed.

    A poorly designed multi-disk storage system can easily be worse than a single disk.

  • by rbanffy ( 584143 ) on Tuesday October 21, 2008 @09:50PM (#25463157) Homepage Journal

    "If you think raid is protecting your data, you're crazy."

    BTW, RAID will do nothing if you accidentally "sudo rm -rf /" it.

  • by jaxtherat ( 1165473 ) on Tuesday October 21, 2008 @10:41PM (#25463595) Homepage

    I love how you use the language "get what they deserve".

    What about my situation, where I have to store ~ 1TB of unique data per office in 3 offices that are roughly 1000 km apart and I have to keep everything backed up with a budget of less than ~AU$ 4000 IN TOTAL?

    I have to run 4 x 1TB RAID arrays on the file servers and use rsync to synchronise all the data between the offices nightly "effectively" doing offsites, and have a 3 TB linux NAS (also using RAID 5) for incrementals at the main site.

    That is all I can afford, and I feel that I'm doing my best for my employer given my budget and still maintaining my professional integrity as a sysad.

    Why do I "get what they deserve" when I can't afford the necessary LTO4 drives, servers and tapes (I worked it out I'd need ~ AU$ 30,000) to do it any other way?

  • Re:Don't panic! (Score:5, Insightful)

    by bstone ( 145356 ) * on Tuesday October 21, 2008 @11:35PM (#25464087)

    Using the same failure rate figures as the article, you WILL get an unrecoverable read error each and every time you back up your 12 TB of data. You will be able to recover from the single block failure because of the RAID 5 setup.

    With that kind of error rate, drive manufacturers will be forced to design to higher standards; they won't be able to sell drives that fail at that rate.

  • by totally bogus dude ( 1040246 ) on Tuesday October 21, 2008 @11:37PM (#25464117)

    If you're replicating data between all three offices (and a fourth backup system?) then you are making backups. The vitriol is aimed at people who set up a RAID-5 array and then say "hooray my data is protected forevermore!".

    Tape systems, especially high capacity tapes, are very expensive, and even those are prone to failures. Online backups to other hard drives are the only affordable means of backing up today's high capacity, low cost hard drives. To do it properly though, you need to make sure you do have separate physical locations for protection from natural disasters, fires, etc. Which you have.

    The only concern your system may have is: how do you handle corrupted data, or user error? If you've got a TB of data at each site it's unlikely that mistakes will be noticed quickly, so after the nightly synchronisation all your backups will now have the corrupt data and when someone realises in a month's time that someone deleted a file they shouldn't have or saved crap data over a file, how do you restore it? Hopefully your incremental backups can be used to recover the most recent good copy of the data, but how long do you keep those for?

  • Re:Don't panic! (Score:5, Insightful)

    by Sillygates ( 967271 ) on Wednesday October 22, 2008 @12:07AM (#25464333) Homepage Journal
    The mathematical theory behind raid5 is not complicated at all. http://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_5 [wikipedia.org]

    And there is parity, that's how raid5 works.

    You are probably referring to "silent" errors, which, for performance reasons, aren't read/detected by most raid5 implementations. And in reality there is little reason to actively read parity unless the array is running/recovering in degraded mode: sure, you'll be informed that there is data corruption, but there is no way to tell whether the parity or the original data is at fault (though it's true that some implementations will scrub/update the parity to match the original data on an occasional basis).
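    Both points can be seen in a toy version of the parity scheme. This is a minimal sketch with three tiny in-memory "data blocks" and one XOR parity block, not how any particular controller lays out stripes; all names here are invented for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Known failure: block 1 is gone, so XOR the survivors with parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])

# Silent corruption: flip one bit in a data block, or in the parity block.
bad_data = bytes([data[0][0] ^ 0x01]) + data[0][1:]
bad_parity = bytes([parity[0] ^ 0x01]) + parity[1:]
syndrome_data = xor_blocks([bad_data, data[1], data[2], parity])
syndrome_parity = xor_blocks(data + [bad_parity])

print(rebuilt == data[1])                # the lost block is recovered
print(syndrome_data == syndrome_parity)  # the mismatch looks identical either way
```

    Rebuilding works because the failed block is known. With a silent error the check only says that the stripe is inconsistent; single parity cannot say which block to distrust.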

    I don't see a single set of raid5 disks as a backup solution by any measure though (disk reliability is only one aspect of this, hardware/driver/filesystem bugs can also cause hard or impossible to detect corruption), but it is a great 'best effort' to prevent a bit of downtime on high availability disks.
  • by tengu1sd ( 797240 ) on Wednesday October 22, 2008 @12:19AM (#25464395)
    >>>The company BOTH cares about their data AND can't afford a proper backup system.

    It can be that the company cares, but doesn't care enough to budget for potential data recovery. All you can do is make sure the risks are explained, with budget options and a well documented paper trail to cover your nether regions. Been there, done that. The typical response is that backups are not important, until a failure and a few days of uncertainty are forced upon the company.

    Having the same, potentially corrupted, data at multiple sites mitigates against the loss of a disk, or even the loss of a single site. User error or database corruption can wind up copied over your good data. Needing to go back for more than a day or two may not be practical in a disk to disk backup environment.

    It's a part of system manager's role to spell out potential problems in easy to understand PowerPoint sound bites and show what options are available. The better you can do this, the more toys you'll have to play with.

  • Re:Don't panic! (Score:5, Insightful)

    by Allador ( 537449 ) on Wednesday October 22, 2008 @01:59AM (#25464903)

    You seem to misunderstand the article. They are saying that if you need 12T of storage RAID 5 is not reliable. You would be better off with a single 12T disk if such a thing existed.

    That's not what the article says at all.

    The article says that if you build your RAID arrays from the biggest disks available (which no one with half a brain does) like 1-3TB drives, and you have them filled, then the numbers come out as presented.

    But there's a reason why no one on the planet builds important raid arrays out of 1TB drives. Rebuild time is too long.

    This is also one of the big reasons why you see so many 73GB and 140GB SAS/SATA drives in raid arrays, and why server storage drives don't grow anything like as fast as consumer garbage drives.

  • Re:Don't panic! (Score:5, Insightful)

    by Eivind ( 15695 ) <eivindorama@gmail.com> on Wednesday October 22, 2008 @02:05AM (#25464929) Homepage

    Yes. It's amazing that the article presents the basic point so horribly poorly. The problem is not the capacity of the disks.

    The problem is that the capacity has been growing faster than the transfer bandwidth. Thus it takes longer and longer to read (or write) a complete disk. This gives a larger window for double failure.

    Simple as that.
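
    A back-of-the-envelope sketch of that window, assuming a rebuild is limited only by sequential throughput. The drive sizes and MB/s figures below are illustrative assumptions, not measurements from the article, and the function name `full_read_hours` is made up for this example.

```python
def full_read_hours(capacity_gb, throughput_mb_s):
    """Hours needed to read a whole disk once at a sustained rate."""
    seconds = (capacity_gb * 1000) / throughput_mb_s
    return seconds / 3600

# Hypothetical drives: capacity grows far faster than throughput.
small_fast = full_read_hours(73, 80)      # small server disk
big_slow = full_read_hours(1000, 100)     # 1 TB consumer disk
bigger = full_read_hours(2000, 110)       # 2 TB consumer disk

for label, hours in [("73 GB", small_fast), ("1 TB", big_slow), ("2 TB", bigger)]:
    print(f"{label}: {hours:.2f} h minimum to read end to end")
```

    Under those assumed numbers the 73 GB disk is read in about a quarter of an hour, while the 1 TB disk needs close to three hours even before any production load competes with the rebuild.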

  • Re:Don't panic! (Score:3, Insightful)

    by NormalVisual ( 565491 ) on Wednesday October 22, 2008 @03:05AM (#25465161)
    This is also one of the big reasons why you see so many 73GB and 140GB SAS/SATA drives in raid arrays

    Didn't you mean SAS/SCSI? Most of the servers I've seen with smaller disks have been one of those, at rather brisk spindle speeds.
  • by AySz88 ( 1151141 ) on Wednesday October 22, 2008 @03:50AM (#25465373)
    Goodness, even the summary says "didn't back up? bummer!". Yes, we all know RAID only hedges against hardware failure. The point of this whole exercise is that RAID 5 doesn't even adequately help with hardware failures once data per drive grows large enough.
  • by Doug Neal ( 195160 ) on Wednesday October 22, 2008 @04:33AM (#25465523)

    I'm glad the 'not being a moron' thing worked out for you. But, what would you suggest to those in the audience that cannot claim the same. :-)

    OS X?

  • Even so (Score:2, Insightful)

    by anubis7733 ( 1377725 ) on Wednesday October 22, 2008 @07:49AM (#25466353)
    Even if it was feasible to buy all these hard drives or a tape drive, the amount of time it would take to properly do all these back-ups on a useful time scale seems to be beyond the reach of the typical user. Even power users do other things in their lives than worry about their computers. I can't see somebody with enough free time to make CD or DVD or tape backups every so often. And if you are copying your whole 1+ TB drive then it would take forever. It may just be that because I'm a college student I have less time than most people with normal jobs, but I see my dad come home late from work almost every day, and then he's just too tired to want to do anything else. So maybe this whole discussion just becomes irrelevant because not too many people realistically have the time to be able to do all this backing up, and would rather just take the risk of running a RAID setup.
  • Re:Don't panic! (Score:1, Insightful)

    by Anonymous Coward on Wednesday October 22, 2008 @08:05AM (#25466435)
    eat's it's self

    Aaaah it burns! It burns! Make it stop, please, make it stop!

  • Yes, that's what Time Machine is for. Sadly, my Mac is the best backed-up machine here. I have an external Seagate drive hooked up with Time Machine and average around a month of backup points. Twice a year I also burn the things I can't live without, like my iTunes collection, to DVD. I really wish Blu-ray would pick up on Macs for backup purposes. I could back up my iTunes with three 50GB BD discs. 135GB of data to back up on 8GB DVDs?

    Tapes are cost-prohibitive and optical hasn't kept up with hard drive capacity. I remember when I could back up my whole computer on 2 CDs. Now, even with BD, I'd need 5 discs.

    Optical discs have their own problems, but I like to have backups on at least two different types of media. Since tapes are expensive and I've had terrible luck with them professionally, I'd like to stick to optical when possible.

  • Re:Don't panic! (Score:3, Insightful)

    by sarkeizen ( 106737 ) on Wednesday October 22, 2008 @10:19AM (#25467955) Journal

    It's an article about raid predicting doom written by a guy that knows nothing about raid.

    He's correct in most things. I'm just not sure I agree with him on his dates, and although I expect your example is supposed to be funny, it's probably better to pick one that applies. If you read the article you'll see that, depending on how many drives you have per RAID5 unit, your error rate may be acceptable. However, Robin makes the pretty observant point that you are essentially paying more for less protection as raid drives grow in size.

    So things he's correct on:

    Drives fail (enterprise or otherwise) at about 3% per year.
    UREs do occur, but the figure of 1 per 12TB of data read is for SATA drives.

    Questionable things:

    RAID controllers probably don't read the entire surface during a rebuild but rather just the parity portions of the disk. This means that in a RAID5 of 1TB disks you are reading 1TB of data, which would likely mean that you have a 1 in 12 chance of getting a URE. This may be an acceptable risk for some.

    The assertion that it's the "end of raid 5" is a little severe. A RAID50 mitigates the risk and the functions for calculating your parity data can be extended arbitrarily HOWEVER this is always at the expense of performance.

    The rate of disk growth may not follow the predicted pattern.

    Red Herrings(?):

    Does the controller take the array offline if it encounters a URE during rebuild or does it continue? This may change the result from a system halt to data corruption, but neither is acceptable in the enterprise IMHO.

    The good argument underlying "doomsday dates" is that it seems reasonable that drive size is increasing at a much faster rate than these two figures are decreasing. Which means as storage needs grow the size of drives deployed will also likely grow but there is now an extra expense to consider.
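
    The figures in this comment can be sanity-checked with a short sketch. It assumes UREs arrive independently at the quoted rate of one per 12TB read (a Poisson-style model) and that drives fail independently at the quoted 3% per year. Both independence assumptions are simplifications, and the function names are made up for illustration.

```python
import math

TB_PER_URE = 12.0          # quoted figure: 1 URE per 12TB read (SATA)
ANNUAL_FAILURE_RATE = 0.03 # quoted figure: about 3% of drives per year

def ure_probability(tb_read):
    """Chance of at least one URE while reading tb_read terabytes."""
    return 1.0 - math.exp(-tb_read / TB_PER_URE)

def any_drive_fails(drive_count):
    """Chance that at least one of drive_count drives fails in a year."""
    return 1.0 - (1.0 - ANNUAL_FAILURE_RATE) ** drive_count

# Rebuild that reads 1TB, as suggested in the comment above.
p_one_tb = ure_probability(1)

# Rebuild that reads six surviving 2TB drives, as in the article summary.
p_twelve_tb = ure_probability(12)

# Seven-drive array: chance of needing a rebuild at all within a year.
p_rebuild_needed = any_drive_fails(7)
```

    Under this model a 1TB read hits a URE roughly 8% of the time, close to the 1 in 12 estimate, while a 12TB read does so roughly 63% of the time, and a seven-drive array sees at least one drive failure in about 19% of years. How much data the controller actually reads during a rebuild is the open question that decides which figure applies.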

  • by penguinbrat ( 711309 ) on Wednesday October 22, 2008 @11:20AM (#25468915)

    RAID is a backup - just backing up the hardware and NOT the data...

