Data Storage

BackBlaze's Hard Drive Stats for Q2 2017 (backblaze.com)

BackBlaze is back with its new hard drive reliability report: Since our last report for Q1 2017, we have added 635 additional hard drives to bring us to the 83,151 drives we'll focus on. We'll begin our review by looking at the statistics for the period of April 1, 2017 through June 30, 2017 (Q2 2017). [...] When looking at the quarterly numbers, remember to look for those drives with at least 50,000 drive days for the quarter. That works out to about 550 drives running the entire quarter. That's a good sample size. If the sample size is below that, the failure rates can be skewed based on a small change in the number of drive failures.

Editor's note: In short: hard drives from HGST, a subsidiary of Western Digital, and Toshiba were far more reliable than those from Seagate across the models BackBlaze uses in its datacenters.
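
A rough check of the sample-size guidance in that excerpt, as a minimal Python sketch: the 91-day quarter length comes from the report's date range, and it assumes the 50,000 threshold is measured in drive days; the small-pool and large-pool failure counts below are made-up numbers used only to show how the annualized rate swings.

```python
# Back-of-the-envelope check of the sample-size guidance above.
# Assumes the 50,000 threshold is in drive days over the ~91-day quarter.

QUARTER_DAYS = 91                      # April 1 through June 30, 2017
THRESHOLD_DRIVE_DAYS = 50_000

print(THRESHOLD_DRIVE_DAYS / QUARTER_DAYS)   # ~549 drives running the whole quarter

# Why small samples skew: the annualized rate is failures per drive-year,
# so one extra failure in a small pool moves the number a lot.
# (The counts below are hypothetical, not Backblaze's.)
small_pool_years = 6_000 / 365               # ~16.4 drive-years
print(1 / small_pool_years * 100)            # ~6.1% annualized failure rate
print(2 / small_pool_years * 100)            # ~12.2% -- one extra failure doubles it

large_pool_years = 300_000 / 365             # ~822 drive-years
print(51 / large_pool_years * 100)           # ~6.2%
print(52 / large_pool_years * 100)           # ~6.3% -- barely moves
```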

Comments:
  • by CrashNBrn ( 1143981 ) on Tuesday August 29, 2017 @12:52PM (#55104057)

    Editor's note: In short: hard drives from HGST, a subsidiary of Western Digital, and Toshiba were far more reliable than those from Seagate

    Which has been the case since BackBlaze started releasing its reliability numbers, aside from a few instances where a specific model of Seagate performed unusually well.

    • HGST has come a long way since the Deathstar [wikipedia.org] days.
      • HGST has come a long way since the Deathstar [wikipedia.org] days.

        From your own link HGST never manufactured a "Deathstar". That was entirely IBM's doing.

        • From your own link HGST never manufactured a "Deathstar". That was entirely IBM's doing.

          "The line was continued by Hitachi when in 2003 it bought IBM's hard disk drive division and renamed it Hitachi Global Storage Technologies. In 2012 Hitachi sold the division to Western Digital who rebranded it as HGST."

          Hitachi didn't buy just the IP; they bought the division. That typically means all of the IP, manufacturing facilities, engineering, and business operations related to a product line. It's likely that a few people who worked on the "Deathstar" still work for HGST today.

          • by Anonymous Coward

            Also, the deathstars were all the "IDE" consumer drives. The SCSI professional drives were quite good. It depended where the platters were sputtered and the electronics were made. I worked at the IBM facility in San Jose a few years before Hitachi bought the storage division...

    • and yet, they continue buying far more Seagate than anything else.
      The reason is that they are cheaper, and it's not worth it to pay a premium for other brands.

      On amazon.ca, 4TB Seagate is $140, Western Digital is $157.

      For external 8TB drives (I don't understand why, but they're cheaper), Seagate is $235 and Western Digital is $280.

      • They've said in the past that there are a lot of factors that go into it. For example, the Seagates may fail more often, but they're also cheaper. And sometimes it comes down to simple availability. If they can't get the HGST drives in the capacity and quantity they want, well, they're a data storage company, it's not like they're *not* going to buy hard drives, even if they are more likely to fail.

      • by jedidiah ( 1196 )

        When I had to go into the hospital for an extended period I decided that the price advantage was no longer worthwhile. I was just replacing the buggers (Seagate) too often. I was no longer in the position to "baby" my arrays.

        So I dumped all my Seagate drives for WD.

        Those Amazon prices of yours really don't reflect a discount big enough to justify the extra bother.

      • by kinocho ( 978177 )
        Well, in my case it is because the first Western Digital I bought died on me three days after the warranty expired. Admittedly it was just one disk, but I got so angry about that that I have never bought another disk from them. Until now I have been lucky. And as of late I have been purchasing 8TB disks from Seagate.
    • Editor's note: In short: hard drives from HGST, a subsidiary of Western Digital, and Toshiba were far more reliable than those from Seagate

      I'm not sure that you can reach that conclusion from their data for the latest generation. The STD4000s were definitely hot garbage and the HGST 4TB were fantastic. But none of those drives are still on the market.

      The Seagate STD8000s they are running have a combined 1.2M drive days with 38 failures. That's an average of 1 failure every 32k drive days. They also have around 1.75 failures per thousand.

      The WDC60 drives only have 40k days on them so they're still in the ballpark of similar performance. With 443 dr

      • by dgatwood ( 11270 )

        I'm not sure that you can reach that conclusion from their data for the latest generation. The STD4000s were definitely hot garbage and the HGST 4TB were fantastic. But none of those drives are still on the market.

        With an average of less than two months' use per drive, the current Seagate drives showed a 1.55% failure rate over three months, or a 6.2% expected annual failure rate, assuming all else is equal. With an average of five days' use per drive, the current HGST drives showed a 0% failure rate over

  • by Anonymous Coward on Tuesday August 29, 2017 @01:14PM (#55104199)

    ...that old R.E.M. song "That's me on the hard drive, losing my partition"

    • Re: (Score:2, Funny)

      by Anonymous Coward

      I thought that I heard you crashing
      I thought that I heard a ding
      I think I thought I heard you dying

    • by arth1 ( 260657 )

      ...that old R.E.M. song "That's me on the hard drive, losing my partition"

      Back in the early 90s we joked "That's me on the Conner". I guess these days, few people know or remember Conner hard drives and their dubious reliability, so the update makes sense! :)

  • For a single drive, go with the most reliable model. For a RAID, however, be sure to mix different manufacturers, models, and batches to avoid correlated failures...

    Because, if the failures are random, your mirror or even a large-count RAID5 will do fine for millennia [algebra.com], assuming you replace the failing ones in a reasonable time.

    But if the drives are all the same, they may all — after spending the same time in the same enclosure under the same load — fail for the same reason at the same time. Having hot-spares or multiple redundancy will not help you...
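
    The "millennia" figure above follows from the textbook MTTDL approximation for independent failures; here is a minimal sketch of that math. The MTTF and rebuild-time values are illustrative assumptions, and the model deliberately ignores correlated failures and read errors during rebuild, which is exactly the parent's point about mixing drives.

    ```python
    # Classical MTTDL (mean time to data loss) approximations, assuming
    # independent drive failures and prompt replacement.  The MTTF and MTTR
    # figures below are illustrative assumptions, not measured values.

    HOURS_PER_YEAR = 24 * 365

    def mttdl_mirror(mttf_h, mttr_h):
        """Two-way mirror: data is lost only if the second drive dies during rebuild."""
        return mttf_h ** 2 / (2 * mttr_h)

    def mttdl_raid5(n, mttf_h, mttr_h):
        """RAID5 of n drives: one drive fails, then any of the other n-1 during rebuild."""
        return mttf_h ** 2 / (n * (n - 1) * mttr_h)

    MTTF = 1_000_000   # hours per drive (~114 years), a typical spec-sheet number
    MTTR = 24          # hours to replace and rebuild, assumed

    print(mttdl_mirror(MTTF, MTTR) / HOURS_PER_YEAR)    # ~2.4 million years
    print(mttdl_raid5(8, MTTF, MTTR) / HOURS_PER_YEAR)  # ~85,000 years
    ```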

    • Hot spare drives DO help, as long as you don't leave them spinning when they are in standby. It allows you to restore redundancy on a detectable failure and return your system to a "dual failure required for data loss" condition.

      Your point about not using drives from the same manufacturing run in a RAID array is somewhat valid, in that drives from the same run can fail at similar times, but if you *monitor* your drives, many failures are evident before they become catastrophic

      • by mi ( 197448 )

        Hot spare drives DO help, as long as you don't leave them spinning when they are in standby. It allows you to restore redundancy on a detectable failure and return your system to a "dual failure required for data loss" condition.

        Having a hot spare will cut your drive-replacement time by no more than a few hours. That's usually no more than a fraction of the actual rebuild time and, as my graph shows, that does not perceptibly affect the RAID's total MTTF. Moreover, if you monitor the drives — as you r

        • I think we are going to disagree about the hot standby, but in truth it boils down to what your response time to a drive failure is. I've had systems that I was responsible for that I couldn't lay hands on for at least 24 hours or more (they were in other countries). Granted, it wasn't an ideal situation and we did have limited onsite support, but I found having a hot spare (as well as one or more on the shelf) to be useful to the MTBF of the system. My biggest problem was getting the replacement drive p

          • by mi ( 197448 )
            If:
            • I couldn't lay hands on for at least 24 hours or more and
            • you have a spare slot in your RAID-enclosure anyway

            then yeah, you may as well put the spare drive into the otherwise empty slot instead of on the shelf.

    • Agree completely.

      Had a job where one time we installed a dozen or so low-end RAID disk arrays. They had 8 slots but only 4 populated. So RAID-5 was our only real choice (given the application space requirement). The array vendor did supply a mixture of disk drive manufacturers, but it still did not help.

      We ended up with a lemon model, the IBM 3.5" 9GB SCSI. They started dying. Once the root cause was determined (bad batch of drives), we asked our array vendor to replace them all. Perhaps 40 more dis
    • by guruevi ( 827432 )

      Large count RAID5 WILL NOT DO FINE (trust me). Your RAID5 takes exponentially more time for each drive you add and during that time your data is in a RAID0 situation. RAID with at least 2 parity drives is the minimum requirement, regular mirrors if you have failover systems or triple mirrors if you're in a SAN-situation.

      • by mi ( 197448 )

        Large count RAID5 WILL NOT DO FINE (trust me).

        The larger the count, the bigger the risk, yes. And yet, if the drives are all different, it will still do fine for thousands of years. No, I will not just "trust you". You may have some personal anecdote to "prove" it, but the math I referred to speaks for itself.

        RAID with at least 2 parity drives is the minimum requirement

        Wasteful bullshit. All too common among Infrastructure people (who never even studied Statistics, much less got a decent grade), but still b

        • by guruevi ( 827432 )

          Even a single anecdote would disprove your theory of 'thousands of years'. There is no such thing as 'thousands of years' of runtime on a drive, you're talking MEAN time BETWEEN failures (or MTTDL, mean time to data loss) and even then you have to account for all the drive configurations in existence, in an ideal world.

          You can do the calculations, go ahead, there are calculators on the Internet for you. There used to be an Excel spreadsheet from a Sun engineer a long time ago, but

          I'm sure you won't understa

          • by mi ( 197448 )

            Even a single anecdote would disprove your theory of 'thousands of years'.

            It is not a "theory", I offer a mathematical proof [algebra.com]. You, on the other hand, would not offer even an anecdote.

            There is no such thing as 'thousands of years' of runtime on a drive,

            Not on a drive, but for an array — a RAID5 with disks failing randomly will survive for millennia before two unrelated failures happen within the period of replace/rebuild-time of each other.

            I'm sure you won't understand the content of this article

            Ah, ho

            • by guruevi ( 827432 )

              Again, you don't understand the mathematics involved, linking to your own (bad) calculation is not proof. This calculator explains it much better than I can: http://wintelguy.com/raidmttdl... [wintelguy.com]

              RAID5: 10 drive groups of 8 drives, 6TB drives:

              MTTDL: 1.2 years
              Probability of data loss is more than 50% in a year
              MTTDL due to multiple disk failures: 286 years
              Do you understand how incredibly low an MTTDL of 286 years is?

              Just swapping it over to RAID6 with the same settings and the probability of data loss is 1% over 10
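
              For reference, the "multiple disk failures" term that such calculators report can be approximated with the standard MTTDL formulas. A rough sketch follows; the per-drive MTTF, rebuild time, and group layout are assumptions, and unrecoverable read errors during rebuild (which usually dominate the risk for wide RAID5 groups of large drives) are not modeled, so the absolute numbers will not match the calculator.

              ```python
              # MTTDL due to multiple whole-drive failures: 10 groups of 8 drives,
              # RAID5 vs RAID6.  MTTF/MTTR below are assumptions; unrecoverable read
              # errors during rebuild are ignored.

              HOURS_PER_YEAR = 24 * 365
              MTTF = 500_000     # hours per drive, assumed
              MTTR = 72          # hours to detect, replace, and rebuild a 6TB drive, assumed
              N = 8              # drives per group
              GROUPS = 10        # independent groups; the whole pool fails GROUPS times as often

              raid5_group = MTTF**2 / (N * (N - 1) * MTTR)
              raid6_group = MTTF**3 / (N * (N - 1) * (N - 2) * MTTR**2)

              print(raid5_group / GROUPS / HOURS_PER_YEAR)  # on the order of hundreds of years
              print(raid6_group / GROUPS / HOURS_PER_YEAR)  # roughly three orders of magnitude longer
              ```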

              • by mi ( 197448 )

                linking to your own (bad) calculation is not proof

                It is proof, unless an error is identified. Math has this nice property about it, that it is not subject to opinion. What is bad about them?

                This calculator explains it much better than I can

                Because you do not understand the Math...

                there is no situation where RAID5 does any better using real world numbers than RAID6 or RAID10.

                I never claimed, that RAID5 does better. My claim is, it is perfectly sufficient and that the higher redundancy does not improve reliab

                • by guruevi ( 827432 )

                  Okay, quick scanning of your site:

                  The entire section on Poisson: having more drives does not make it less likely that one will fail.
                  "Failures occur every mttf hours" - that's not what MTTF means. MTTF simply means that if you have a collection of "n" drives, you can statistically expect 1 failure every (MTTF / n). So if a manufacturer says your MTTF is 100 years (which is about what they promise), then you will have a failure amongst a set of ~1M drives about every hour.

                  Poisson's distribution sim
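
                  A quick numeric check of that fleet arithmetic, assuming independent (exponential/Poisson) failures; the 100-year MTTF and the ~1M-drive fleet are the figures from the comment itself.

                  ```python
                  # Expected failure rate across a fleet, assuming independent failures.
                  MTTF_HOURS = 100 * 365 * 24    # "100 years" per drive = 876,000 hours
                  FLEET = 1_000_000              # drives

                  print(FLEET / MTTF_HOURS)      # ~1.14 failures per hour across the fleet
                  print(MTTF_HOURS / FLEET)      # ~0.88 hours between failures, i.e. about one an hour
                  ```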

  • Seagate is not reliable.

  • by DontBeAMoran ( 4843879 ) on Tuesday August 29, 2017 @02:15PM (#55104575)

    It's sad that Windows and Linux users have to go to such troubles.

    Me? I only buy Macs. Because Apple takes the best 1% of the drives made in each manufacturing lot and puts them in their Macs. That's why Macs are so expensive.

    I mean, this has to be the reason, right? Surely they're not just buying the same parts as Dell and others and just selling overpriced computers and pocketing the profits.

    • by jedidiah ( 1196 )

      > It's sad that Windows and Linux users have to go to such troubles.

      Why do you have this deranged idea that PC users are forced to buy any particular brand of hard drive? We can buy any brand we like.

      • My joke was that you had to do any research at all and that Apple, being costly, had the best 1% of hardware and PC peasants had to buy the 99% rejects.

  • Hard disks are so old-fashioned, why don't they use the Cloud?

  • BackBlaze is about to put in a behemothic order as it gets ready to take on all the CrashPlan customers.

  • Seagate stats don't make any sense: ST4000DM001, 400 drives, 5 failures makes it a 1.25% failure rate, yet I see 30.43% in the table. Likewise with ST4000DX000.

    Could anyone explain how the f they calculated Seagate data?

    • They extrapolate annual failure rates. You divide the number of failures by the total amount of time the drives have been in service (summed across all drives - i.e. 100 drives for 50 days = 5,000 drive-days), then multiply by 1 year. In this case, those 400 drives have been in service for a total of 6k drive-days (an average of 15 days/drive). If 1.25% of drives crap out in 15 days, you can probably see how that becomes a 30% annual failure rate. Granted, there's some bathtub curve going on here, so that particular statisti
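
      A minimal sketch of that extrapolation, using the figures quoted in the thread (400 drives averaging roughly 15 days in service, 5 failures); the exact drive-day total is an approximation.

      ```python
      # Backblaze-style annualized failure rate: failures per drive-year, as a percentage.
      def annualized_failure_rate(failures, drive_days):
          return failures / (drive_days / 365.0) * 100

      drive_days = 400 * 15                              # ~6,000 drive days, approximate
      print(annualized_failure_rate(5, drive_days))      # ~30.4%, in line with the table's 30.43%
      ```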
