Disk Drive Failures 15 Times What Vendors Say
jcatcw writes "A Carnegie Mellon University study indicates that customers are replacing disk drives more frequently than vendor estimates of mean time to failure (MTTF) would suggest. The study examined large production systems, including high-performance computing sites and Internet services sites running SCSI, FC and SATA drives. The data sheets for the drives indicated MTTF between 1 and 1.5 million hours. That should mean annual failure rates of at most 0.88%, yet annual replacement rates were between 2% and 4%. The study also shows no evidence that Fibre Channel drives are any more reliable than SATA drives."
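The 0.88% figure follows from the simple exponential model vendors use, where the annual failure rate is roughly the hours in a year divided by the MTTF. A quick sketch of that arithmetic for the two data-sheet figures quoted above:

```python
# Back-of-the-envelope check of the summary's figures, assuming the
# usual vendor model: AFR ~ (hours per year) / MTTF.
HOURS_PER_YEAR = 8760

for mttf_hours in (1_000_000, 1_500_000):
    afr_percent = HOURS_PER_YEAR / mttf_hours * 100
    print(f"MTTF {mttf_hours:>9,} h -> annual failure rate {afr_percent:.2f}%")
```

A 1-million-hour MTTF gives 0.88% per year; the study's observed 2-4% replacement rates are roughly 2 to 4.5 times that, and up to 15 times the rate for the higher data-sheet figure once disk age is factored in.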
Re:Repeat? (Score:2, Interesting)
The best part about the entire thing is the very last quote:
"If they told me it was 100,000 hours, I'd still protect it the same way. If they told me it was 5 million hours I'd still protect it the same way. I have to assume every drive could fail."
Just common sense.
This study is useless. (Score:3, Interesting)
This study is not news. All it says is that people *think* their hard drives fail more often than the mean time to failure.
Interface matters why? (Score:3, Interesting)
I have thought the MTTF is bullshit for a while (Score:5, Interesting)
I don't consider myself a fluke because I know quite a few other people who have had similar problems. What's the deal?
Also, does anyone else find this quote interesting?:
"and may have failed for any reason, such as a harsh environment at the customer site and intensive, random read/write operations that cause premature wear to the mechanical components in the drive."
It's a f$#*ing hard drive! Jesus H Tapdancing Christ, how can they call that premature wear? Do they calculate the MTTF by just letting the drive sit idle, never reading or writing to it? That actually wouldn't surprise me.
Even better ... (Score:4, Interesting)
Start with 100 drives. Continuous usage.
How many fail in the first 6 months? 12 months? 18 months?
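Under the vendors' constant-failure-rate assumption (which the CMU study suggests understates reality), the expected failure counts for that experiment can be sketched directly. The 1-million-hour MTTF and the ~730 hours per month are assumptions for illustration:

```python
import math

# Expected failures among 100 continuously-running drives, assuming an
# exponential lifetime model with the vendor's claimed MTTF.
MTTF_HOURS = 1_000_000
failure_rate = 1 / MTTF_HOURS      # failures per drive-hour
drives = 100

for months in (6, 12, 18):
    hours = months * 730           # roughly 730 power-on hours per month
    surviving = drives * math.exp(-failure_rate * hours)
    print(f"{months:>2} months: ~{drives - surviving:.1f} expected failures")
```

The model predicts barely one failure out of 100 drives in 18 months; the study's 2-4% annual replacement rates imply you would actually see several times that.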
Check SMART Info (Score:4, Interesting)
To view the SMART info for a drive:
smartctl -a /dev/sda
To do a full disk read check (can take hours) do:
smartctl -t long /dev/sda
Sadly, I just found read errors on a 375-hour-old drive (the manufacturer's software claimed that repair succeeded). Fortunately, they were on the Windows partition.
Re:This study is useless. (Score:4, Interesting)
I don't really care to know exactly what is wrong with the drive. If I replace it and the problem goes away, I would consider that a bad drive, even if you could still read and write to it. I just did one this morning that showed no symptoms other than Windows taking what I considered a long time to boot. All the user complained about was sluggish performance, and there were no errors or drive noises to speak of. Problem fixed, user happy, drive bad.
As I already posted, a good rule of thumb is that most drives go bad around 3 years from the date of manufacture.
Re:Odd numbers for memory failure? (Score:3, Interesting)
Actually, one useful feature of Vista... (Score:5, Interesting)
...is that it detects SMART disk errors in normal use (i.e. you don't have to be watching the BIOS screens when your PC boots).
When I was trying the Vista RC, it told me that my drive was close to failing. I, of course, didn't believe it at first, but I ran the Seagate test floppy and it agreed. So I sent it back to Seagate for a free replacement.
About the only feature that impressed me in Vista, sadly. (And I'm not sure it should have impressed me, tbh. I'm assuming XP never did this as I've never seen/heard of such a feature.)
Re:Personally I am SHOCKED (Score:2, Interesting)
It's exactly this kind of bullshit that irritates me. Suppose you look at a file. It's 95,015,327 bytes long. You're claiming that referring to the file as being 95MB is "inappropriate"?
I'm a software engineer, fully versed in binary math, and the fact that computers refer to that file as being 90MB still really pisses me off. It's pointless and annoying.
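The discrepancy the poster is complaining about is just the choice of divisor: decimal megabytes use 10^6 bytes, while the OS's "MB" label actually means binary mebibytes of 2^20 bytes. For the exact file size from the comment:

```python
size_bytes = 95_015_327  # the file from the comment above

decimal_mb = size_bytes / 10**6   # what the commenter expects ("95 MB")
binary_mib = size_bytes / 2**20   # what the OS reports, labeled "MB"

print(f"{decimal_mb:.1f} MB (decimal), {binary_mib:.1f} MiB (binary)")
```

The same byte count comes out as 95.0 in decimal megabytes but 90.6 in binary mebibytes, which is why the file shows up as "90 MB".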