Hardware Technology

SSD Annual Failure Rates Around 1.5%, HDDs About 5%

Posted by samzenpus
from the breaking-news dept.
Lucas123 writes "On the news that Linus Torvalds's SSD went belly up while he was coding the 3.12 kernel, Computerworld took a closer look at SSDs and their failure rates. While Torvalds didn't specify the SSD manufacturer in his blog, he did write in a 2008 blog that he'd purchased an 80GB Intel SSD — likely the X25, which has become something of an industry standard for SSD reliability. While they may have no mechanical parts, making them preferable for mobile use, there are many factors that go into an SSD being reliable. For example, a NAND die, the SSD controller, capacitors, or other passive components can — and do — slowly wear out or fail entirely. As an investigation into SSD reliability performed by Tom's Hardware noted: 'We know that SSDs still fail.... All it takes is 10 minutes of flipping through customer reviews on Newegg's listings.' Yet, according to IHS, client SSD annual failure rates under warranty tend to be around 1.5%, while HDDs are near 5%. So SSDs not only outperform, but on average outlast spinning disks."

  • Poor statistics (Score:5, Insightful)

    by Anonymous Coward on Thursday September 12, 2013 @10:11PM (#44836791)

    "client SSD annual failure rates under warranty tend to be around 1.5%, while HDDs are near 5%"
    So they are less likely to fail early in their life.

    NOT:
    "So an SSDs not only outperforms, but on average outlast spinning disk."

    This is completely unsubstantiated by the evidence provided.

    • by MickLinux (579158)

      That said, my memory was that someone reported on Slashdot that you can force the failure of an SSD by powering it down in the middle of a write, then powering it up, causing it to go into chkdsk, and finally powering it down in the middle of chkdsk. Which is not too unlikely an occurrence. If you wanted to decrease the user failure rate, you might hook it up to a supercapacitor.

    • Re: (Score:2, Informative)

      by Anonymous Coward
      There's another factor they don't take into consideration - when the drive fails, in which condition will it be afterwards. I had multiple HDDs fail on me in my life, and the most common effect was inability to read a bunch of sectors. It damaged the file system and several files, but in most cases I could still mount it read-only and recover most of the stuff from it unscathed. Just a few weeks ago I had an SSD failure (OCZ Vertex3). I was working and the drive just suddenly died. Without a warning, and of
      • Re:Poor statistics (Score:4, Interesting)

        by Sir_Sri (199544) on Friday September 13, 2013 @12:54AM (#44837807)

        The way I configure SSDs is as an OS/boot drive, and then I write all user data off to a RAID with traditional HDDs.

        The simplest way is a SSD for windows/linux and then put your user directories on a RAID1 of 1TB drives, and then backups from there.

      • What about putting 2 SSDs into a software RAID 1 configuration? Does that solve the problem?

        What you said is my experience, also. I haven't had catastrophic failure of a HDD in perhaps 20 years in a population of perhaps 15 computers. In my experience what most often fails is the HDD electronics, so it is possible to extract the data by temporarily replacing the HDD electronics with a circuit board from another, identical HDD. Also, of course, in the last 20 years we have replaced HDDs because of frequen
        • by jkflying (2190798)

          That won't work if they both die from some bug which is triggered by eg a certain write sequence followed by TRIM, then power cut in the middle of TRIM. They will both be killed.

      • by ssam (2723487)

        Was SMART showing anything before the failure?

    • Re:Poor statistics (Score:4, Interesting)

      by smash (1351) on Friday September 13, 2013 @12:09AM (#44837551) Homepage Journal
      Not only that, if you add enough SSDs to store the same amount of bytes as a 4TB hard drive you either need to RAID them or you will have cumulative failure rate far higher than the HD failure rate, which you could mirror (or hell, RAID6 it) for way less money.
      • Re:Poor statistics (Score:4, Interesting)

        by Mashiki (184564) <mashiki@gmailCURIE.com minus physicist> on Friday September 13, 2013 @01:58AM (#44838105) Homepage

        Really? Odd that I can buy SSD's in a 1.5-3TB flavor these days, they're expensive as all hell, but I can buy them. They come in PCI-e and SATA flavors. And really at that point, you're running with a mirror or shadow backup, or something anyway. Besides, if you're using a single drive like that, you're at a single point of failure at both the consumer level and at the enterprise level. But let's be honest, you can't beat good backup practices into anyone. As much as you try, and all that.

    • Re:Poor statistics (Score:5, Interesting)

      by hairyfeet (841228) <bassbeast1968@gm[ ].com ['ail' in gap]> on Friday September 13, 2013 @12:30AM (#44837649) Journal

      I would just add the whole thing ignores that big old rotting elephant in the room which is HDDs? I have found that in damned near every case, not all but most, will give you PLENTY of warning before it goes completely tits up whereas the SSD? One day its working and the next....nothing. No warning, no noise, no indication at all that there was a problem just...poof, buh bye data. This is also ignoring the fact that if the circuit board fails in a HDD you can swap one out for the same model and get it back in most cases, at least long enough to get the data, the SSD? Hope you are good with a soldering iron and a chip reader and I have heard even then its unlikely.

      I may be just a little country shop guy but when my gamer customers have all experienced multiple failures when it comes to SSDs, and these guys don't go cheap, sorry but ATM I still don't trust it. I tell folks if they want an SSD don't have anything on it they would feel bad if they lost, now does that mean there aren't still uses for SSDs? Of course not, for one thing if you have a laptop where most if not all of your data is in the cloud? Knock yourself out, just make a weekly disk image so you can re-image when it goes tits up and you are golden. I also have several customers that have bought either hybrid drives or that Sandisk caching drive for Win 7 and in both of those cases they have seen pretty big speed boosts while not having to worry because if it dies all you do is go back to HDD speeds as it is just a cache.

      Oh and one final thing....its gonna get worse. its common knowledge that with each shrink the number of writes goes down and the number of failures go up and with all of the major chip companies seeming to only care about how many bits they can stuff per nano-meter? The failure rate WILL get worse, you can count on it. Its too bad that SLC is so insanely high as those seem to have lower failure rates than MLC but as long as all the companies care about is getting that GB number up at all costs its really not gonna be getting better, its gonna be getting worse.

      Ironic that they talk about how supposedly high HDD failure rates are when I cleaned out a whole drawer of them before moving into the new place, we are talking drives going back to Quantum Fireballs in the 200Mb size, yes Mb not Gb, and they all fired up. granted some of them were noisy as hell but I could still get files off of them while not a single one of my gamer customers have their first SSD, they are all dead. yes i know its an anecdote but I'm not the only one that has seen this, coding horror calls SSDs the hot crazy scale [codinghorror.com] as you trade red hot performance for crazy failure rates. Call me old fashioned but I think I'll just pick up a caching SSD and keep the 5Tb in spinning rust, thanks ever so Intel.

      • Re:Poor statistics (Score:5, Insightful)

        by girlintraining (1395911) on Friday September 13, 2013 @01:20AM (#44837949)

        I have found that in damned near every case, not all but most, will give you PLENTY of warning before it goes completely tits up whereas the SSD?

        Yeah, sure, okay. If you're sitting next to your computer, then yeah, maybe you notice. How about the hundreds of millions of drives that are sitting in a rack somewhere, and will only see a human being twice: Once when it gets installed in the rack, and then only when it stops working for whatever reason and a tech is sent out to replace it.

        The "it made a funny noise first" line item is a joke either way. This is like saying "Well, I prefer diesel engines because they make more noise when they die." Hookay. Yeah.

        I may be just a little country shop guy but when my gamer customers have all experienced multiple failures when it comes to SSDs, and these guys don't go cheap, sorry but ATM I still don't trust it.

        I may just be a Ferrari repair shop owner, but when my car owners have all experienced multiple failures when it comes to ceramic brakes and high end engine components, and these guys don't go cheap, sorry but ATM I still don't trust it.

        Now do you see how utterly ridiculous that sounds? High performance almost always means less robust. That graphics card you just plunked over $200 on? Its operating temperature is so high from the current being pumped through it that it's literally cooking itself at the molecular level from the moment you plug it in -- it's called electromigration, and in three to five years, depending on how often you use it, it's going to shit itself. But that's okay... because in two years, you'll be spending even more on a new one.

        Ironic that they talk about how supposedly high HDD failure rates are when I cleaned out a whole drawer of them before moving into the new place, we are talking drives going back to Quantum Fireballs in the 200Mb size, yes Mb not Gb, and they all fired up. granted some of them were noisy as hell but I could still get files off of them while not a single one of my gamer customers have their first SSD, they are all dead.

        Yeah, and? How many gamers are still using their 200Mb Quantum Fireballs in an actual computer? I know it's a common geek pastime to see what kind of antiquated hardware you can pull out with your friends... that old parallel port Zip drive, or floppies the size of your head... and yeah, it's fun to talk about to show you had IT chops before the person you're talking to was even a glint in daddy's eye... but that's the only value they have.

        Nobody's coming up to me and asking for an AT command initialization string for their modem -- AT&F&C1&D2S95=55 in case you were wondering -- because it's not a technology very many use anymore. Yeah, I can dig out an old 2400 baud modem and get it working... but that doesn't mean 2400 baud modems are superior to cable modems that "have a higher failure rate".. and so, you know... I don't know if I trust such 'new' technology.

        Now, get off my lawn.

        • Re:Poor statistics (Score:5, Informative)

          by icebike (68054) on Friday September 13, 2013 @01:53AM (#44838079)

          Yeah, sure, okay. If you're sitting next to your computer, then yeah, maybe you notice. How about the hundreds of millions of drives that are sitting in a rack somewhere, and will only see a human being twice: Once when it gets installed in the rack, and then only when it stops working for whatever reason and a tech is sent out to replace it.

          Hmm, my drives send me emails when they start having problems. (And having gotten one of these emails a few years after setting up the drive initially, I was shocked to find the email arrived in plenty of time. I was pleasantly surprised to find the drive and all data still intact, and had time to swap a replacement into the RAID.)

          Why don't you find out how this is handled by people who actually have hundreds of drives to deal with.
          If you let them fail before servicing them you are doing it wrong.

          Look into: man 8 smartd

      • OK LOL!

        HDDs give you plenty of warning now. In fact most have SMART tech and a host of other things that run continuous tests looking for potential failure, as well as OSes that specifically look for indications. Now.

        Years ago, this was not the case. You MIGHT get some warning depending on how it decided to fail (bad sectors etc.), however most back in the day gave you about one second of actual notice before dying in a grinding crunching sound, or in a small black puff of smoke. I suppose in that lig

  • by Marrow (195242) on Thursday September 12, 2013 @10:13PM (#44836805)

    So you need to multiply the failure rate of the SSD by as many SSDs as it would take to equal the storage of the disc. Do you want the failure rate per arbitrary device size, or rate of failure per data stored?

    • by Rockoon (1252108) on Thursday September 12, 2013 @11:34PM (#44837381)
      Thats a silly thing to do. Lets examine this, shall we?

      A 5% chance to lose 2TB vs a 1.5% chance to lose 250GB.

      You argue that since it requires 8 of these 250GB SSD's to equal the capacity of the 2TB HDD that we should multiply 1.5% by 8, so a 12% chance... a 12% chance of what, tho? In actuality, there isnt a 12% chance of anything...

      The chance of losing at least 1 of those 8 SSD's (that is specifically 1 or more) over the period is (1 - (1 - 0.015)) = 0.114, but the chance of losing all of those 8 drives over the period is 0.015^8 = 0.0000000000000025628906. In other words, losing all 2TB in the SSD scenario is effectively never going to happen while it remains 5% for the HDD scenario.

      The actual breakdown of all possibilities of drive failings (0 drives, 1 drive, 2 drives, etc..) rounded to thousandths of a percent is:

      0 drives: 88.611%
      1 drives: 10.795%
      2 drives: 0.575%
      3 drives: 0.018%
      4 drives: 0.000%
      5 drives: 0.000%
      6 drives: 0.000%
      7 drives: 0.000%
      8 drives: 0.000%

      So we see that you would be twice as likely to lose some data as in the HDD scenario, but almost invariably it will only be 250GB of data instead of 2TB of data (only about 1 in 169 of these 8 drive experiments will witness more than 1 drive fail, and the majority of those will be exactly 2 drives failed)

      So no, you do not need to multiply the failure rate of the SSD's by the number of SSD's that you would need to equal the HDD. What you need to do is define the problem better because as it stands SSD's look a hell of a lot better when you suppose that you need a pile of them.
      • by Rockoon (1252108)
        Woops. Missed an exponent when typing that out. The first equation is supposed to be (1 - (1 - 0.015)^8) = 0.114.
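The corrected arithmetic in this sub-thread can be reproduced with a few lines of Python. This is only a sketch: it assumes the 8 drives fail independently with the same 1.5% annual rate (a plain binomial model), which real drives from the same batch may not satisfy, and the function name `failure_distribution` is made up for illustration.

```python
from math import comb


def failure_distribution(n_drives: int, annual_failure_rate: float) -> list[float]:
    """Probability of exactly k failures in one year, for k = 0..n_drives.

    Assumes every drive fails independently with the same probability
    (a binomial model); this is an illustrative simplification.
    """
    p = annual_failure_rate
    return [
        comb(n_drives, k) * p**k * (1 - p) ** (n_drives - k)
        for k in range(n_drives + 1)
    ]


# Figures taken from the comment above: 8 SSDs at 1.5% per year.
dist = failure_distribution(8, 0.015)

p_at_least_one = 1 - dist[0]             # lose one or more drives
p_more_than_one = 1 - dist[0] - dist[1]  # lose two or more drives
p_all = dist[8]                          # lose every drive

for k, prob in enumerate(dist):
    print(f"{k} drives: {prob * 100:.3f}%")
print(f"at least one failure: {p_at_least_one:.3f}")
print(f"more than one failure: about 1 in {1 / p_more_than_one:.0f}")
```

Changing the two arguments lets you rerun the comparison for other drive counts or failure rates.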
  • by danbob999 (2490674) on Thursday September 12, 2013 @10:18PM (#44836853)

    5 years should be mandatory by law. If you can't support your drive for 5 years, you shouldn't be allowed to manufacture hard drives at all.
    I don't understand this new trend in making new hard drives with only 1-2 years warranty. The same goes for SSD.

    • margins are paper thin. no time to do QA. what ends up happening is that we, the buyers, are the 'remote QA dept'.

      sad but true. we have to test the hell out of things we buy for the first 30 days.

      profit profit profit! isn't extreme capitalism wonderful? sigh ;(

      • by citizenr (871508)

        margins are paper thin.

        WD and Seagate have a healthy 40% margin on every drive they sell.

        • Except they've priced their drives so low that you're looking at 40% of very little... on a per-unit basis, they're still making very little money and better have a very low return rate to account for it.

      • by tftp (111690)

        margins are paper thin. no time to do QA.

        Not just margins. Development time is short. A model of the drive has to be produced and sold in less than a year, and replaced with a new model after that. Who can afford an endurance test, even if accelerated?

    • by mysidia (191772)

      I don't understand this new trend in making new hard drives with only 1-2 years warranty. The same goes for SSD.

      If it shaves cost off the unit; there are people who will buy it, and take the chance.

      I would say that the manufacturers have a right to offer them this option.

      In fact; I would say manufacturers have a right to provide options with less than a 1 year warranty.

    • by Burz (138833) on Thursday September 12, 2013 @10:51PM (#44837071) Journal

      In the 2000s consumers became almost the exact opposite re: warranties as they were in the late 80s/ early 90s when a good warranty seemed to matter as much as any other criteria. I've been trying to buck that trend, but until the last couple years it was almost impossible. When I shop for electronics that have no moving parts and are *not* portable, the warranty has to be at least 3 years and this even includes some moving-parts items like hard drives. My two most recent HDD purchases (and some that I helped friends and clients with) had 5 year warranties.

      The thing about insisting on a 'long' warranty is that the price then becomes an aid in finding equipment that is actually more reliable. Among stable brands, the cheaper models in the longer warranty class will tend to be more reliable; A higher confidence level from the manufacturer is often reflected in the lower price. Likewise, the junkier models will get higher price tags in order to be able to cover the higher failure rate. Nowhere is this more obvious than with computers that have options to purchase mfg extended warranties.

      Of course, even if the prices are the same, getting equipment with a higher failure rate is still a raw deal because of the cost of downtime, possible data loss, shipping, etc.

    • by xlsior (524145)
      I don't understand this new trend in making new hard drives with only 1-2 years warranty. The same goes for SSD.

      It's very simple, really: Because they can.

      The main reason is that there are only three hard drive manufacturers left in the world: Seagate, Western Digital, and Toshiba. (Samsung & Hitachi's HDD divisions have both been acquired in recent years, although you can still find drives with their brandname on them, for now)

      Out of those three, only WD and Seagate manufacture large capacity 3.5"
    • I don't understand this new trend in making new hard drives with only 1-2 years warranty. The same goes for SSD.

      Most of my hard drives have died either very quickly or after about 3 years. I would count on one replacement during the 5-year warranty. So, when they cut warranties to 1 year, it at least doubled the cost of hard drives.

      Not sure where that plugs into the inflation calculator...

  • That fewer SSDs fail under warranty doesn't mean they outlast HDDs... If the warranty is 1 year, and all SSDs fail in 1.5 years, yet hard drives usually fail only in 3 years, hard drives are still better off.

    In other news, Laxori666 was too lazy to RTFA and is hoping someone will chime in. He is tired and drowsy and so he will blame it on that when in fact, he would have done the same regardless - except perhaps without this addendum as such honesty usually requires some sort of altered state of consciousness
  • by JoeyRox (2711699) on Thursday September 12, 2013 @10:34PM (#44836969)
    While catastrophic drive failures make headlines what's more likely to happen during the useful service life of both HDDs and SSDs are unrecoverable media/bit errors and these may ruin your day as much as a catastrophic error. If you look at the bit error rate of any contemporary HDD and compare it to its capacity you'll come to a startling conclusion - an unrecoverable read error is rated to occur once every 2 to 5 times the full capacity of the drive is read. SSDs have about the same unrecoverable read error rate.
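The "2 to 5 full reads" figure can be checked against a datasheet error rate. The sketch below assumes a rating of one unrecoverable read error per 10^14 bits read, a value commonly quoted for consumer drives but not stated in the comment itself; the function name and the example capacities are likewise only illustrative.

```python
def full_reads_per_ure(capacity_tb: float, bits_per_error: float = 1e14) -> float:
    """How many complete reads of the drive fit into one rated
    unrecoverable read error (URE) interval.

    bits_per_error is an assumed datasheet rating (1 error per 1e14 bits),
    not a number taken from the discussion.
    """
    capacity_bits = capacity_tb * 1e12 * 8  # decimal terabytes to bits
    return bits_per_error / capacity_bits


for tb in (2.5, 4.0, 6.25):
    print(f"{tb} TB drive: about {full_reads_per_ure(tb):.1f} full reads per rated URE")
```

Under that assumed rating, drives in the 2.5 TB to 6 TB range land in the 2 to 5 reads range the comment describes; a drive rated at 1 per 10^15 bits would come out ten times better.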
    • by Dorianny (1847922)
      In HDDs, a mounting number of bit errors or frequent controller resets are red flags pointing to an imminent drive failure. In my experience SSDs typically fail catastrophically without any warning signs at all, something which is pretty rare in HDDs.
  • by QuietLagoon (813062) on Thursday September 12, 2013 @10:34PM (#44836973)
    The author of the summary and/or TFA seems to draw a conclusion based upon a paucity of information.

    Yet, according to IHS, client SSD annual failure rates under warranty tend to be around 1.5%, while HDDs are near 5%. So an SSDs not only outperforms, but on average outlast spinning disk."

    The unknown in the equation is the length of the warranty periods for the drives used in the comparison.

    • And also what happens afterwards. Most drives will not fail under warranty, most should fail many many years afterwards.
      This does not tell us if the average SSD fails 1 week after its 2-year warranty runs out, or if the HDD lasts 8 years longer on average.

      • by dfghjk (711126)

        But why would you assume they are different?

        "Most drives" will not fail under warranty but so what? It's not informative.

        Drives have early life failures. Once that period is past they will typically last a long time. That's common for a lot of things.

        What evidence do you have that SSD and HDD are fundamentally different in this regard? Lacking that, your comment is worthless.

  • Yawn. (Score:4, Insightful)

    by Anonymous Coward on Thursday September 12, 2013 @10:38PM (#44836993)

    Anyone who isnt using a SSD by now for at least their boot drive is stuck in the past.
    It's the single best upgrade you can make anymore.

    Either way stop the fucking articles about it.
    Leave them with their warm feelings for spinning rust full of multi gigs of stuff they never touch.

    They'll wise up eventually. Or not.
    Either way it won't hurt you any. Enjoy your speedy pc and laugh at the rusties if you must.

    • by dbIII (701233)

      Anyone who isnt using a SSD by now for at least their boot drive is stuck in the past

      You've just summed up those stupid applications on MS Windows with hard coded paths to "C:" drive. They still exist.

    • by 0123456 (636235)

      Anyone who isnt using a SSD by now for at least their boot drive is stuck in the past.

      I boot my work PC about every two months.

      It's the single best upgrade you can make anymore.

      If you spend all day just booting your PC. Otherwise, a faster CPU or GPU or more RAM is likely to be far more useful.

    • by toddestan (632714)

      Well, all I can say is that I jumped on board early with SSDs. After nothing but problems I went back to 'spinning rust' on my desktop PC. Why? Because it works. The marginal speed increases after the PC has booted aren't worth the wasted time and headaches of using a technology that, for whatever reason, doesn't seem to have matured yet.

  • by JDG1980 (2438906) on Thursday September 12, 2013 @10:42PM (#44837023)

    OCZ's failure rates are higher than the rest of the industry's by an order of magnitude. Also, earlier SandForce drives have reliability problems because the firmware was written by paranoid loons who were deathly afraid of reverse-engineering and the drive goes into irrecoverable 'panic mode' when any abnormality of any kind is sensed. I think that newer SandForces (post-LSI acquisition), especially Intel's, are less likely to do this, but the original failures still taint the brand with the stigma of flakiness.

    If you stick with Samsung, Intel, and SanDisk, you should be fine. Stay away from OCZ at all costs, and be skeptical of any SandForce drive not made by Intel.

  • I've heard several people state that they've bought SSD drives that just would not work when they got them home and they had to do an exchange. Do these statistics include those returns or only ones that failed in service?
    • SSDs and HDDs. There is a huge percentage that arrive dead, or die within a week. But I do not think SSDs fare any worse than HDDs.

    • All electronics fail on a bathtub curve ... they either fail predominantly when new, or when old, and rarely in the middle. That's why burn-ins are common among the knowledgeable -- it gets you past that initial failure stage before you've used the drive for something important.

  • another story on /.
    a TRULY dead ssd is impacting the linux kernel release.
    one in Linus's server.
    bad timing, to try to pump bad statistics.
    there are lies
    damn lies
    then there are statistics


    better to go with the lies ...

    and hire better tech aware ad men
  • by ArchieBunker (132337) on Thursday September 12, 2013 @11:00PM (#44837139) Homepage

    Now for the useful information. How many of the failed SSD's were they able to recover data? I suspect not many.

  • Wrong stat.

    Yes, things do break, but it's important HOW they break. HDDs have very 'nice' failure modes. You can recover bits from the platters as long as you do not put one in an MRI machine or a fire. SSDs just DISAPPEAR from the system with data and encryption keys to that data and NO ONE including the manufacturer can do recovery (they can put flash chips in a reader and read encrypted bytes, but encryption keys were in the controller that just died).

    How about another one: Warning before failure rate? Again 90% H

  • So, if 5% of hard drives are failing in the first year of warranty, then the other ones have to last 180 years on average in order to meet the MTBF specifications of 1.5 million hours that hard drive makers claim. Because surely they wouldn't lie to us.
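The 180-year figure can be reproduced with a rough calculation. The sketch below reads MTBF as a mean lifetime (as the comment does) and assumes the early failures happen, on average, half a year in; it also shows the annual failure rate that a 1.5 million hour MTBF would imply under a constant failure rate (exponential) model. Both modelling choices and all function names are illustrative assumptions, not anything the drive makers publish.

```python
import math

HOURS_PER_YEAR = 8760


def mtbf_years(mtbf_hours: float) -> float:
    """MTBF converted from hours to years."""
    return mtbf_hours / HOURS_PER_YEAR


def afr_from_mtbf(mtbf_hours: float) -> float:
    """Annual failure rate implied by an MTBF, assuming a constant
    failure rate (exponential lifetimes)."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def required_survivor_lifetime(
    mtbf_hours: float, early_fraction: float, early_mean_years: float = 0.5
) -> float:
    """Mean lifetime (years) the surviving drives would need so that the
    whole population still averages out to the stated MTBF.

    early_mean_years is an assumption: first-year failures are taken to
    occur half a year in, on average.
    """
    total = mtbf_years(mtbf_hours)
    return (total - early_fraction * early_mean_years) / (1 - early_fraction)


print(f"MTBF in years: {mtbf_years(1.5e6):.1f}")
print(f"Implied AFR: {afr_from_mtbf(1.5e6) * 100:.2f}%")
print(f"Survivors must average: {required_survivor_lifetime(1.5e6, 0.05):.0f} years")
```

Put differently, a 1.5 million hour MTBF corresponds to an annual failure rate well under 1%, which is hard to square with the roughly 5% figure quoted in the summary.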
  • Statistics are wonderful things; if you choose the right one you can make any case you want. I want to know more about the warranties. I want to hear about the nature of the issues. Recoverable errors vs complete death. Infant mortality vs just wear.

  • Ancient data. (Score:5, Informative)

    by Reeses (5069) on Friday September 13, 2013 @02:26AM (#44838203)

    All this discussion on this and no one has commented that TFA is from 2011??

    This article isn't reliable information. It's from when SSDs were relatively new and definitely doesn't apply to the in-the-field results people are seeing in 2013.
