Hard Drive Reliability Study Flawed?
storagedude writes "A recent study of hard drive reliability by Backblaze was deeply flawed, according to Henry Newman, a longtime HPC storage consultant. Writing in Enterprise Storage Forum, Newman notes that the tested Seagate drives that had a high failure rate were either very old or had known issues. The study also failed to address manufacturer's specifications, drive burn-in and data reliability, among other issues. 'The oldest drive in the list is the Seagate Barracuda 1.5 TB drive from 2006. A drive that is almost 8 years old! Since it is well known in study after study that disk drives last about 5 years and no other drive is that old, I find it pretty disingenuous to leave out that information. Add to this that the Seagate 1.5 TB has a well-known problem that Seagate publicly admitted to, it is no surprise that these old drives are failing.'"
In all fairness (Score:2, Offtopic)
Re:In all fairness (Score:4, Interesting)
Back in '99 we used to get factory sealed boxes of Seagate drives DOA, or already having cluster collapse. There's nothing quite like 250-500+ brand new units which are all dead or dying, and then shipping them back. I've only started using Seagate again in the last few years.
Re: (Score:3)
Or a bad batch?
Re:In all fairness (Score:5, Funny)
Or a bad batch?
No, of course not. This is /. It must be that a major hard drive manufacturer that was around 20+ years prior, and is still around 14 years later made nothing but bricks and packaged them as hard drives. That's how they survived when so many of their competitors went bankrupt. Bricks are so much cheaper to produce, so the profit margin is considerably higher. ;-)
Re:In all fairness (Score:5, Interesting)
Find the full text here: http://www.justice.gov/osg/briefs/1996/w961430w.txt [justice.gov]
Now, though, Seagate is not Miniscribe.
Re:In all fairness (Score:5, Informative)
Correct me if I'm wrong, but I believe Seagate absorbed Miniscribe by way of Maxtor. I wouldn't be so sure that 'shipping bricks' isn't in their patent portfolio.
Since most /. folks weren't even alive back then, let me recap a few of Miniscribe's business tactics:
- Set up off-the-books companies to which they "sold" drives that were simply stored in warehouses.
- Claimed the sale of drives which had not yet been delivered to customers. Their outside auditors called them out on the fact that they couldn't claim the income from drives that were still on the boat from China, and made them restate earnings. When it all fell apart, the criminal investigation discovered that the drives had never even existed to begin with.
- Took returned dead disk drives, tossed them onto a pile in the office which was nicknamed the "dog pile", and when the pile got big enough, packed them up and shipped them out as new orders.
So no, Seagate is nothing at all like Miniscribe ;-)
Re: (Score:2)
Seeing "CompuAdd" brought back memories. I may still even have a mouse pad from them. Got my first SoundBlaster there, too, with about a year's saved allowance... (Some of us /.ers were alive back then)
Comment removed (Score:5, Informative)
Re:In all fairness (Score:5, Informative)
If the entire box is dead, wouldn't that imply mishandling during shipping?
Bad batches during production. Seagate used to be famous for this, and if you look back at their 90's financials you can see that for quite a while they were hanging on by the skin of their teeth to avoid folding.
Re: (Score:2)
If the entire box is dead, wouldn't that imply mishandling during shipping?
Not necessarily. It could also be an awful batch of drives off a production line with really shit quality control.
Re:In all fairness (Score:5, Interesting)
Eh, it depends. I've had plenty of bad luck with Seagate's consumer drives dying pretty quickly. On the other hand, I've yet to have to replace a single enterprise ES (or ES2) series drive. We use Seagate's ES series drives in the arrays we depend on and Western Digital black drives in the arrays we don't care too much about (video editing rigs). Though I said, "Don't care too much about", I at least expected them to last more than a few months. Unfortunately, a few months is a tall order for Western Digital. The black drives die so often that their entire warranty department probably knows me by name...
Re: (Score:3)
Meantime, I had absolutely horrible experiences with the 3TB ES drives. It varies.
Re: (Score:3)
Eh, it depends. I've had plenty of bad luck with Seagate's consumer drives dying pretty quickly. On the other hand, I've yet to have to replace a single enterprise ES (or ES2) series drive. We use Seagate's ES series drives in the arrays we depend on and Western Digital black drives in the arrays we don't care too much about (video editing rigs). Though I said, "Don't care too much about", I at least expected them to last more than a few months. Unfortunately, a few months is a tall order for Western Digital. The black drives die so often that their entire warranty department probably knows me by name...
I've had enterprise Seagate drives die on me before. The failure rate is definitely much lower though. One of the reasons, but not the only one, that enterprise drives last is because they are in a temperature and humidity controlled environment. Just remember, though, if you ever have a cooling failure in your server room, expect to replace a bunch of drives over the next year.
Re: (Score:2)
The only drive I ever had DOA was a Seagate ES. It arrived with visible mechanical damage that was clearly not done in transport, but somehow wasn't caught in QA. The only external drive that has failed on me so far is a recent Seagate (1 year old or so), where the connector came off. I checked; a second one I have also has the badly mounted connector.
Sorry, rumors of Seagate drive reliability are far overstated.
Re: (Score:3)
I haven't met anyone personally that has had problems with Seagate drives, but I have heard people claim it online. Maybe I'm just lucky or something.
Re:In all fairness (Score:5, Interesting)
I've got ~170 failed Seagate Enterprise 500G drives sitting here in my cube. That's pretty close to a 50% failure rate after 4 years of that fleet. Sadly Dell who branded them won't warranty them after 1 year. I'm pretty close to playing hard drive dominoes with them and posting that on youtube. Also noteworthy, we have almost as many Western Digital drives of that same generation with just one failure. Due to this, my company refuses to buy any more Seagates until we see things get better.
Re:In all fairness (Score:5, Interesting)
This reflects my anecdotal experience of late as well. My Dell server just turned 3 years old (and I had a 3 year service agreement on it). It came with three 1-terabyte drives. All failed before my service period ended and were replaced; the last of the three was replaced this past summer. 100% failure of the original drives in less than 3 years.
Re: (Score:3)
Re: In all fairness (Score:5)
Re:In all fairness (Score:5, Insightful)
One of the patterns I've noticed with Seagate is that drive failures seem to spike when manufacturing moves. The reliable Barracuda IV models made in Singapore were replaced by shoddy ones made by newer facilities in Thailand. Then around 2009-2010 they shifted a lot more manufacturing into China, and from that period the Thailand drives were now the more reliable ones from the mature facility. A lot of the troubled 1.5TB 7200.11 models came out of that, and perhaps some of your 500GB enterprise drives too.
If you think about this in terms of individual plants being less reliable when new, that would explain why manufacturers go through cycles of good and bad. I think buying based on what's been good the previous few years is troublesome for a lot of reasons. From the perspective of the manufacturer, if a plant is above target in terms of reliability, it would be tempting to cut costs there. Similarly, it's the shoddy plants that are most likely to be improved, because returns from them are costing too much. There are a few forces here that could revert reliability toward the mean, and if that happens, buying from the company that's been the best recently will give you the worst results.
At this point I try to judge each drive model individually, rather than to assume any sort of brand reliability.
Re: (Score:2)
Had the same issues with Seagate 500GB laptop drives. We had anywhere between 10-25% failure rates on those in a 400-laptop batch. Hitachis were lower (about 8 to 15% on the worst batch) but not by much at the 7200 spindle speed. 5400 RPM Hitachis, on the other hand, were rock solid (I can remember replacing only 3 in a 400-laptop batch), and I would take those Seagates over Fujitsus or Toshibas any day. Hell, one batch of Fujis back in 2003 had an almost 50-70% failure rate over three years. At least you had a d
Re: (Score:3)
You do support and you just bring them to your desk for the last four years and just pile them up? How many monitors, keyboards, and memory dimms do you have stacked on your desk?
Monitors, Keyboards and DIMMs don't store confidential information which requires proper disposal.
Several years ago I worked for a company which handled financial data for several big banks. We were contractually bound to dispose of all storage media in the most destructive and showy manner possible so that there would be no chance of information being leaked. Since nobody wanted to go to the expense of shredding, crushing and atomizing the things the moment they were pulled we just tossed them into a bi
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
Pick the one that's got the performance or capacity you want and the price you like.
If your data is that critical, you shouldn't be relying on one storage system for it anyway (RAID, then HDD backed up to a different system such as tape, or at least a different HDD based system). Nor sho
Release Date != Age of Drive (Score:5, Insightful)
Is he saying that 1.5TB drives are all 5 years old? If you look at the table in TFA, it talks about "release date" -- which may well be some time ago, but I'm sure 1.5TB drives may have been bought new, even if the design hasn't changed in a while.
Re:Release Date != Age of Drive (Score:4, Insightful)
Is he saying that 1.5TB drives are all 5 years old? If you look at the table in TFA, it talks about "release date" -- which may well be some time ago, but I'm sure 1.5TB drives may have been bought new, even if the design hasn't changed in a while.
I think the takeaway here is this man is neither terribly detail-oriented nor well-suited for his line of work. Things like date of manufacture, make and model, I/O amount, number of power cycles, environment, etc., are all obvious things to record to an experienced IT person. He appears to have done very little of that. He is a bean counter pretending to be an engineer.
5 years? That's not a given. (Score:2)
Re:5 years? That's not a given. (Score:5, Insightful)
Install that drive in a server in an online backup company and see how long it lasts.
Re: (Score:3)
Install that drive in a server in an online backup company and see how long it lasts.
Probably longer, since once the drive is filled with data, it basically just sits there spinning. Sure, there might be a patrol read of the disk every month or so, but no real work.
I expect that almost all my drives in server environments would be running fine at 8-10 years, but most get replaced after about 6 simply because bigger drives are so much cheaper at that point.
Re: (Score:3)
Can't give exact numbers as I only worked in an IT role where I was dealing with largish storage for a couple years (2006-2008): ~200TB of spinning disk on ~400 disks in a dozen or so RAID arrays. Anyway: failures seemed fairly clustered. We'd lose a drive in an array, get the replacement, then a month later the same chassis would lose another drive. It might have been that the power supply stressed the drives, it might have been that for whatever reason those disks were getting hit harder over time than other arrays, might
Re: 5 years? That's not a given. (Score:5, Funny)
I have an 80gb ide deskstar that runs as primary storage for my DNS, key, and SSH jump box for my home network.
The older a drive gets, the higher the positions of authority and privilege you need to put it into, due to its years of experience. It inverts the failure rate (which is mainly from burnout and boredom of routine).
Re: (Score:2)
Install that drive in a server in an online backup company and see how long it lasts.
My experience with SANs and SAN operators is that they are far too sensitive when it comes to detecting drive failures; if the SAN even thinks the drive might possibly entertain the idea of doing anything slightly like failing in the next 30 years, it'll say the drive has failed.
This is not necessarily a bad thing in an enterprise environment dependent on your SAN, especially if you've got a support contract where EMC/NetApp et al. send you replacement disks for free.
Ditto. (Score:2)
I still have and use my Seagate Barracuda 7200.7 (ST380011A; 7200 RPM; 80 GB) HDD for storage/backup/secondary in my current Debian stable box. I got it on 12/18/2005 for my old Linux box to replace the dying and super slow Maxtor 30 GB HDD, according to my http://zimage.com/~ant/antfarm... [zimage.com] list. ;)
# /usr/sbin/smartctl -a /dev/sda
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourcefor... [sourceforge.net]
=== START OF INFORMATION SECTION ===
Last about five years? (Score:5, Interesting)
I've either personally owned or purchased for companies I've worked for dozens of hard drives of all (except ESDI) technologies including MFM/RLL, IDE (parallel and serial), SCSI (original, wide, ultra wide, etc.) of form factors from full height 5.25 inch to 2.5 inch, dating back to 1991, and in my experience most hard drives last until you throw them away after 10 or 15 years because they're too small.
A few hard drives die in the first 6 months, and maybe 5-10% die in 3-5 years. Saying that disk drives last about 5 years just doesn't agree with my experience at all. Hard drives essentially last an infinite amount of time, defined by me as until they're so small that their storage can be replaced for under a dollar.
I do agree with the author's other points. Certain lines of hard drives have more like a 100% failure rate after 5 years. One 250 GB hard drive I purchased was RMA replaced with a 300 GB model because the 250 GB line was essentially faulty.
I think these studies might be looking at 7200 or 10000 RPM SCSI units under extremely high use. That's not how consumers use hard drives.
Re:Last about five years? (Score:4, Funny)
Re: (Score:2)
Agreed. I miss (real) Samsung drives. I started buying them about 6 months to a year before Seagate bought them. Picked up a couple more after I heard that (because they'd still be off Samsung's line). They run great, quiet, and very cool compared to the comparable WD and Seagate models from the same time and size.
Might have to look into Hitachi next round of purchases... stats and reputation on recent drives looks good.
Re: (Score:2)
Maybe you don't use them as much as Backblaze.
Just sitting there spinning, with their fluid bearings and pretty much no friction, they'll last forever.
Constantly moving the head and writing data puts extra stress on the drive.
Re: (Score:2)
Maybe you don't use them as much as Backblaze. Just sitting there spinning, with their fluid bearings and pretty much no friction, they'll last forever. Constantly moving the head and writing data puts extra stress on the drive.
Backblaze drives are basically just sitting there spinning, at least after the drive is filled up.
I write a lot more than 2TB to each of my 2TB drives over their lifetime, while a Backblaze usage model would have me only writing 2TB.
Re:Last about five years? (Score:5, Interesting)
My experience tends to mirror Backblaze's, both with my own personal business and as an employee at 2 different companies.
Seagates would always fail prematurely, but usually in a way that is noticeable through SMART monitoring. Interestingly, it matches up with when they acquired Maxtor, which itself started going bad when it bought Quantum. With my colocated servers for my side business I used to have to go in at least twice a month to replace a failed drive. I eventually gave up on Seagate and replaced all but 1 drive (a spare RAID member) with a mix of WD and Hitachi. I'm also really pissed at Seagate for having slipped so much in reliability, especially with their 7200.11 and early 7200.12 drives.
WD drives would fail sometimes, usually when they were nearly new. Where I worked previously, we had a policy of mixing new and old WD drives in a new server. It was a lengthy process but helped avoid losing a simple RAID1 setup.
Hitachi drives were very good and usually inexpensive.
Where I'm working now we're trying out Toshiba, and so far one drive has failed out of 12, but that sample size is way too small and it hasn't been long enough to truly tell how it will go.
Doesn't match my anecdotes (Score:2)
My experience has been pretty different. The only drive to ever fail pretty much beyond recovery for me, was a Western Digital drive (that was about ten years ago).
On the other hand, I've bought a lot of Seagate drives over the years and they have held up really well - only having data issues when a computer crashed at some particularly bad point. They've also generally performed really well.
Hitachi has been OK for me also, but they don't seem to be very performant.
Re: (Score:2)
Bwahaha... no offence, mr low uid, but "dozens" does not make you an expert. I have had thousands of bad drives in my 15 or so years in the business, out of tens to hundreds of thousands of machines. Right now I see about 50-100 bad drives a year supporting a medium sized company, as well as the occasional home computer.
Hard drives, power supplies and capacitors are the number 1, 2 and 3 things that fail on all compu
Re: (Score:2)
I seem to recall it being mentioned in my statistics for managers course that hard drive failures, like light bulb failures, follow an exponential (memoryless) distribution. A drive is exactly as likely to fail in the 0-1 month interval as in the 5 yr to 5 yr 1 month interval. What that ends up meaning in practice is that the number that fail in the first few years is misleadingly large, since it's a fraction of a much larger set of drives; what matters is the relative reduction in the population, which is approximately constant. As you say
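A quick sketch of that memoryless idea (assuming a constant-hazard exponential model with a made-up 5% annual failure rate; real drives tend to follow a bathtub curve, so treat this as the idealized case being described):

# Hypothetical constant annual failure rate -- an assumption for illustration only.
afr = 0.05
monthly_hazard = 1 - (1 - afr) ** (1 / 12)

def p_fail_next_month(months_survived: int) -> float:
    # Under a memoryless (exponential) model, the chance of failing in the
    # next month is the same no matter how long the drive has survived.
    return monthly_hazard

print(p_fail_next_month(0))    # month 0-1
print(p_fail_next_month(60))   # month 60-61 (5 years in) -- identical

# Fleet-wide failures per year still shrink, because the surviving population shrinks:
fleet = 10_000
for year in (1, 5):
    alive_at_start = fleet * (1 - afr) ** (year - 1)
    print(f"expected failures in year {year}: {alive_at_start * afr:.0f}")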
Re: (Score:2)
Newer drives are designed to fail. Old drives had spare capacity "just in case", new drives expect to use it regularly during normal operation. This puts a finite lifetime on the disk.
Consumer vs. server workloads make little difference. It's not the actions of reading or writing that cause the drive to fail, it is mostly down to heat and dust. The drives have dust filters on their vents but are not hermetically sealed. If anything consumer systems are likely to be worse than a well cooled server in a nice
Who cares? (Score:2)
Who cares about known issues. If I buy a hard drive and two years from now it has a 'known issue', then I would much rather not buy it in the first place.
Re: (Score:2)
This. A failed drive is a failed drive, regardless of who it comes from and whether the issue is known or not and publicly admitted.
I will never again buy seagate (Score:5, Informative)
Re: (Score:2)
Re: (Score:2)
If you are inconvenienced by a drive failure and have to restore a backup, get angry at the manufacturer.
However, if you lose data from a single drive failure, get angry at yourself for not doing backups.
Re: (Score:3)
The seagate 3TB that just failed on me was my backup drive.
Now, I didn't lose any current data, but all my Time Machine backups are gone, poof...
You need to backup repository also. (Score:2)
If you wanted to keep your version-controlled data you should have backed it up. Time Machine is a backup, but it is also a version control repository which itself must be backed up... it can fail for other reasons too.
Not to mention you should ALWAYS have at least two duplicates of data, so backing up your backup is just a good idea anyway.
Re: (Score:2)
Re: (Score:3, Interesting)
BS. I have had at least 2/3 of my newer Seagates fail, from 500 gig to 2TB drives. At LEAST 10 in the last 3 years. In the same time I have had 1 of 6 Hitachi and 2 of 18 Western Digital fail. I will NEVER buy another Seagate drive. Just had my external 1.5TB USB3 drive go last week with 0 warning and a TON of my data. I hate Seagate with a passion that I feel for no other.
Hard drive manufacturers are extremely cyclical in quality. I've said it on /. many times, but back in the day they all went from the bottom of the reliability and performance list to the top on a yearly basis. Now that we have fewer drive manufacturers to choose from it's probably closer to every 3-5 years. I have a 500 MB Seagate external that just died that's at 7+ years old. Actually the HDD is fine, the electronics for the USB controller died. I also have a 12GB Maxtor that was in a BSD box that I jus
Re: (Score:2)
The manufacturer who fared best in the Backblaze report was Hitachi. Hitachi bought their hard drive division from IBM. IBM produced the Deskstar series of hard drives. The 75GXP had such an abysmally high failure rate it was nicknamed the Deathstar. Lots of folks swore they wo
Hitachi may be reliable, but performance? (Score:2)
I am kind of mystified at all of these bad Seagate results, as I've had a number of Seagate drives without problems.
However there is a factor that I wonder makes a difference - generally I've been buying drives that were not the cheapest, but were more on the upper end of the model line - for example my latest drive purchases have been mostly 4TB drives. I'm wondering if buying early runs of newer model HD's brings you a greater success rate.
Also it seems like some models are better than others and perhaps
Re: (Score:2)
I encountered this and did the same thing you did once the data was recovered.
But does anyone have a clue as to why WD is using these "encrypting" controllers? The encryption isn't safeguarding the user's data (because so far as they can tell, through the USB interface, the data is in cleartext, duh). And it doesn't look like it's doing anything for WD's reputation either to make it so difficult to recover data when the controller fails. So what's the point?
I was not aware this was a scientific study. (Score:5, Insightful)
someone got paid (Score:5, Interesting)
Sorry, you're full of shit Henry Newman. How many people follow specifications about burn-in on a drive when they buy it wholesale OEM and it comes in nothing more than a plastic bag? How many people only buy drives released recently? If you're like most people and you want a 1.5TB drive you go out and buy the cheapest one that meets your needs. If Seagate still has 8 yr old drives on the market, then it's damned right that their failure rate should be considered. And so what if a drive "has a well-known problem that Seagate publicly admitted to"? As long as Seagate publicly admits all the issues with every drive they release we should then adjust stats to eliminate those flaws? That's ridiculous. This study was about "If you go out and buy a drive off the market, this is the rate you can expect it to fail at." I don't think any consumer that got a Seagate drive, had it fail and lose all their data, would then say "Oh! Well they publicly admitted to a problem! Shit! My bad!"
Sounds like Mr Newman is going to get a nice paycheck soon.
Re: (Score:2)
Sadly I've already posted on this thread so can't +1 you. I totally agree: admitting the problem isn't enough. Admitting early Xbox 360s had a problem wasn't enough. There is a problem; what are you going to do about it? The numbers are the numbers. If you realize you have a problem, you need to yank that crappy 1.5TB drive off the market, not throw it into your cheap enclosure, sell it at Walmart for $50 and hope no one notices before your crappy 1yr warranty is up.
Also, I could be wrong here but my guess is
Re: (Score:3)
I know Henry. He's an enterprise storage guy. My guess is that he was coming from the perspective of enterprise storage builders. Which is to say, the Backblaze data may be a fine review of the experience consumers are likely to have with hard drives, it's a terrible review of what enterprise storage platform makers would do and what their buyers would expect. Whether or not that's an appropriate response to Backblaze, who intentionally and haphazardly uses consumer drives in their systems, is its own quest
What's the point? (Score:5, Informative)
This article states everything anyone competent already knew. Consumer drives come rated for a lighter workload than enterprise.
Duh? That's the point - it's a cost:reliability tradeoff. With "enterprise" drives being 1.5x+ the expense, for uses like Backblaze where you can survive multiple disk failures with ease it's a no-brainer.
I also got "burned" by these Seagate 1.5TB disks. By *far* the worst drives we have in production (~300 or so these days), and they have had an annual failure rate around 20% since the day they were put into service. Other consumer drives don't even come close to that metric, but are rated similarly.
I actually like Seagate - every disk manufacturer has problematic models from time to time. No big deal, we knew the risks when we bought them. However, the data Backblaze published is completely validated by our own internal data. It's a drive model to avoid when at all possible. Most of our disks have a less than 5% annual failure rate, but this specific model is close to, or over, 20%. That's a major difference.
This article just states the obvious. Consumer drives generally fail earlier under heavy loads. This is not interesting; it's a known tradeoff anyone with a high school diploma can figure out for themselves by looking at cycle ratings and MTBF. The only thing I care about for this workload is whether the savings from using the lesser drives still exceed the cost of the extra failures. The answer has thus far (even with 20% of drives failing each year) been a resounding yes.
There is a difference between consumer drive models, and data like this is *great* to have published, as it can add to your own data and you can compare notes. Will I make a buying decision based off it? Probably not. But it will certainly be one data point of many when it comes time to buy more disk. Known issue? I don't care. All I care about is whether the drive works or not, and this particular Seagate model does not. The author of this article completely glosses over the fact that Seagate admitted to the issue but did absolutely *nothing* to make it right for their customers, essentially blaming them. That fact is what bothers me the most, not the fact that they had a problematic drive model, and it will likely be the largest factor when it comes time for me to evaluate Seagate products in the future.
Re:What's the point? (Score:4, Insightful)
To continue a bit on his ridiculous rant of "what you should be doing if you release any data on your real-world experiences".
1. The age of the drives as it affects the failure rate of the drive.
Fair enough. Backblaze did this, in the average age metric. Is average the most complete one available? Of course not, but it certainly gives you a starting point.
2. Whether the drives are burned in or not burned in, as it impacts the infant mortality.
Backblaze has stated they perform drive burn-in testing before putting into production. A tiny amount of reading the other blog posts will show you this. Any company using drives in such a manner will do so.
3. How much data will be written to and read from each drive over time, and if over time the drives in question will hit the limits on the hard error rates.
Duh? All drives in backblaze's pool are generally subjected to "similar" write patterns I'd imagine. Does this author *really* think Backblaze has it out for Seagate and is writing only to those drives to make them fail earlier? What I care about is how long the drives last for my workload. If I know about when to expect a failure, all the better. Specifications are rarely more than a super conservative CYA from the vendor though, and most drives outlive their rating by many multiples.
4. The load and unload cycles and if any of the failures exceed manufacturer specification.
What? How are load/unload cycles remotely relevant to an online backup service? Has this guy ever run anything at all at even close to this scale? You never let drives spin down, both because of this cycle rating and because software RAID in many ways does *not* play nicely with spun-down disks. No one operating with tens of thousands of drives is going to forget this small detail, as they would see inordinate drive failures across the board if they were cycling them constantly.
5. Average age does not tell you anything, and statistics such as standard deviation should be provided.
It tells me quite a bit. Is it as detailed as a scientific study should be? Of course not. This is not a scientific study; it's simply publishing the real-world data the company in question has experienced. If we were talking differences of a few percentage points, this metric would matter a lot. We're not. We're talking 3% versus 25%. I don't need things broken down into standard deviation to know there is a big problem (a quick back-of-the-envelope check, sketched after this list, makes the point). If their intention was to mislead readers, then you might have a point. But I doubt they have something out for Seagate.
6. Information on SMART data monitoring and if any of the drives had exceeded any of the SMART statistics before they are more likely to fail.
Who cares? A failure is a failure. If I replace a drive due to an early SMART warning, I'm still replacing the damned drive. It failed. How it failed or the manner in which it failed is absolutely irrelevant to me.
7. Information on vibration, heat or other environmental factors as it impacts groups of drives. Will a set of new drives from a vendor get put into an area of the data center that is hotter or have racks with more vibration?
Has this guy ever worked in a datacenter? Or seen what Backblaze even does? There is enough scale here to make these factors inconsequential. We're talking dozens of racks, with many servers. Drives get put into identical chassis, and into identical racks. Will some racks have slightly higher inlet temps? Sure. But unless Backblaze is co-located in some ghetto colo somewhere this is an absolute non-issue. Drives are not nearly as temperature sensitive as the idiots on review sites would lead you to believe. Google published a report on this a long while back if you need scientific evidence of that fact.
This would matter a lot if they were putting drives into different types of systems.
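As promised above, you don't even need the full distribution to see that 3% versus 25% isn't sampling noise. A rough back-of-the-envelope check with made-up fleet sizes (the real per-model counts are in Backblaze's post; these figures are placeholders for illustration), using a plain normal-approximation confidence interval:

import math

def failure_rate_ci(failures: int, drives: int, z: float = 1.96):
    # 95% normal-approximation confidence interval for an annual failure rate.
    p = failures / drives
    half_width = z * math.sqrt(p * (1 - p) / drives)
    return p - half_width, p + half_width

# Hypothetical counts, not Backblaze's actual data.
print(failure_rate_ci(failures=120, drives=4000))   # ~3% fleet  -> roughly (0.025, 0.035)
print(failure_rate_ci(failures=500, drives=2000))   # ~25% fleet -> roughly (0.231, 0.269)

The intervals don't come anywhere near each other, so the gap swamps any reasonable estimate of the sampling error.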
Revisionist History (Score:4, Interesting)
People seem to forget that Seagate denied the issue for almost a year.
I remember.
I was a seagate buyer, before they lied. It was my preferred vendor. We had a number of drives in disk arrays, but when it was time to swap them out, I avoided Seagate as replacements. Never had any data loss due to Seagate drives, but the company was a client of the software my team wrote for enterprise customers, so I did get a view on the edges of the company. Something changed.
Last year, those drives were 6 yrs old and had never given us any issues, but old drives can't be trusted. The new drives were Hitachi - because I can read reliability reports. I'm still using the old Seagates for unimportant things from time to time, mainly transporting large amounts of data. No issues, and if there are any at this point, the old drives have exceeded expectations.
However, I don't plan to buy another Seagate drive again. They lied! They didn't step up and tell the truth. That is a management issue, not a technical one, and I remember it. It was a management failure. I will always remember it, and where I work (BTW, I'm a CIO) we will never buy Seagate drives again, if there is a choice.
Life and work is too important to deal with liars.
8 Years Old (Score:3)
'The oldest drive in the list is the Seagate Barracuda 1.5 TB drive from 2006. A drive that is almost 8 years old!
I recently had a 1.5TB drive die and it was still new enough to be under warranty. Seagate shipped a 2TB as a replacement.
"intellectual rigor" (Score:2)
Re: (Score:2)
the known problem with this series "does not result in data loss nor does it impact the reliability of the drive".
Of course not.... the bug just results in the controller becoming permanently busy, and if your drive is still under warranty, and you work with support, you can probably get the drive unbricked and updated.
No data loss is caused by the drive -- any loss is caused by your RAID array deciding multiple drives have "failed", or by your operating system.... it's all your RAID or OS vendor's fau
coming from a storage provider... (Score:5, Informative)
Re: (Score:2)
Something can't be "pretty flawless".
Study directly reflects my personal experience (Score:3)
The only thing I found flawed in the study was how many seagate drives actually made it through the warranty period.
My personal experience shows a failure rate for Seagate drives of around 300-400% (pool of 20-30 drives). What I am saying is that not only did the original drives fail, but the "refurbished" replacements failed as well, numerous times. Not a single drive got through the warranty period without the nice green border. The amount I spent on advanced replacements could have bought me quite a few new drives from another vendor.
I no longer buy Seagate drives. I do not have any abnormal failure rates with the other brands I use.
Betteridge's Law (Score:5, Interesting)
No.
Henry Newman's response, however, is deeply flawed.
1) Newman complains that average drive age is a "useless statistic." But he seems to prefer "time since product release" which is far worse than useless -- it is an obviously incorrect way to estimate the age of a drive population and is directly contradicted by the average age data reported in the blog post.
2) Newman has questions about Backblaze's burn in. He can find answers by googling "Backblaze burn in" to learn more about the company's remarkably transparent operations. Beach does not go into these details because an effective blog post will focus on its key conclusions rather than discussing every detail of methodology. It is not a research paper.
3) Newman digresses into hard error rate, which is unrelated to drive failures. I look forward to a future Backblaze blog post about error rates. In any case, since all these drives are consumer drives and all but one have the same specified error rate, it is a non sequitur.
4) Newman points out that Backblaze probably vastly exceeds manufacturer specs for drive throughput. I think this is exactly the point. Is there really enough difference in reliability between commodity and enterprise drives to justify their price difference? Or is it just a form of price discrimination? Does the spec sheet reflect reality or is it a marketing-driven fiction?
Overall this article strikes me as being written by an industry flack: someone who is more interested in parroting jargon and received wisdom rather than indulging in genuine curiosity.
re: drive throughput (Score:3, Informative)
IIRC Backblaze's workload is write once read maybe once (I mean, they are a backup company). So it's quite likely that they are massively under the specs for throughput.
The truly interesting thing about this study is that they name names; previous work in the area (like Bianca Schroeder's FAST 07 paper, http://www.cs.cmu.edu/~bianca/... [cmu.edu], or Google's FAST 07 paper, http://research.google.com/arc... [google.com], or NetApp's FAST 08 paper http://www.usenix.org/event/fa... [usenix.org]) doesn't give away vendor names. The Backblaze res
Re: (Score:2)
Well, it's better than nothing, so if you have no information, it's a starting point. But if you actually know the age of the drives, it's a significantly worse metric.
With that said, it might be interesting to see whether there are any interesting correlations with the
Speaking of obsolete tech (Score:2)
Does anyone have a magneto-optical disk drive? I've got some old files I'd like to retrieve.
Not scientific, but... (Score:2)
Almost every time I buy another brand, the damn thing takes a crap and I get to do the job again for free. (Thank FSM Maxtor went away: they were the WORST).
Re: (Score:2)
(Thank FSM Maxtor went away: they were the WORST).
They didn't go away. Seagate bought them and folded them into their own business. Keep that in mind --- just because an HDD carries a certain manufacturer's name doesn't necessarily mean it is equal to other hard drives made by that manufacturer (even of the same model number); reliability varies with manufacturer standards over time, and it may depend on exactly which of the manufacturer's factories produced such and such unit.
There used to be a great
My Seagate Experience (Score:2)
Re: (Score:2)
Out of the four hard drive failures I have had in the last ten years (I often replace smaller drives with bigger ones before they fail), 3 of them were Seagate drives and one was a Hitachi.
What was the fraction of drives in your environment at the time that were Seagate......... and were most of the Seagates the same age as the other brand drives in the environment, or brand new, or older on average?
Obviously if there are 3 times as many Seagates in the environment, or the Seagates have twice the I
Seagates Are So Bad... (Score:2)
A while ago I bought three of their HDDs...and somehow within a month seven of them failed. Not only that, a friend of mine tried to top off a rack of Seagate hybrid SSDs with unleaded and the whole server just burst into flames on the spot.
my car mp3 drive is still working (Score:2)
granted, it's a notebook drive (old IDE style, though) but it's been in my car for about 10 yrs now and still has not shown any errors; music plays and does not seem glitchy, and yet it's in the trunk of my car being bounced around during the daily commute every day for nearly 10 years.
drives last only 5 years? really? who said that? that's not at all my experience with home drives or notebook drives. if the drive is not bad by design, I've gotten 10 yrs continuous use from most of mine. 5 yrs seems very c
Why think the data irrelevant? (Score:2)
Re: (Score:2)
The data is irrelevant because EVERY vendor has bad batches from time to time. Once you have been in this game long enough, you will realize that. The author is wise enough to not judge a company by a single bad batch.
There are two vendors out there that I will buy drives from. They are the same vendors that EMC buys drives from. Those guys at EMC are in the business, and they go through A LOT of drives. I absolutely trust them to have done their research, and to go with vendors who have the lowest fai
English best practices (Score:2)
A drive that is almost 8 years old
A sentence that is almost complete!
Cheap Drive Study? (Score:2)
If this article refers to the previous article where the low-priced SAN vendor used the lowest-priced drives possible, I am not surprised. They touted their low cost and tried to mask it by stating that "These are the same drives that regular old Joes buy."
Well surprise! There are bad batches out there. I remember WAY back in the day, the Maxtor 540MB drives (yes kids, that is megabytes, as in about one half of a gigabyte). They had a ridiculously high failure rate. HP was putting them in their Vectr
Not just flawed, but includes non-enterprise drive (Score:2, Offtopic)
As I posted before, this study had included non-enterprise drives which any thoughtful enterprise data preservation expert would not have ever used for enterprise data storage.
Study is fine, the conclusions are wrong (Score:3)
Article is accurate (Score:2, Informative)
Just last December 2013, we purchased around 200 Toshiba drives (which are the same as the earlier Hitachi/HGST drives) and 250 Seagate drives (500GB). There were no DOAs among the Hitachi drives, and there were a couple for the Seagate. After around a month, so far only Seagate drives have been sent in for warranty.
Previously we purchased around 350 Hitachi/HGST drives (500GB) and the failure rate is definitely less than 5% per annum over a span of around 3 years. I haven't processed warranty for around 50 pcs. Probably it will b
Re:Meh. fud spam. (Score:5, Insightful)
Someone's working overtime to make Seagate look good.
But the pile of dead seagates at work says otherwise.
Yeah, this guy is essentially saying the pre-known facts validate this research finding so therefore the research was deeply flawed.
It really doesn't matter what the accumulated knowledge over the intervening years says; the fact remains that for this user, Backblaze, the results were the results, and they happened to match what the industry already knew.
"Their results: Hitachi has the lowest overall failure rate (3.1% over three years). Western Digital has a slightly higher rate (5.2%), but the drives that fail tend to do so very early. Seagate drives fail much more often — 26.5% are dead by the three-year mark."
If anything, this guy just validated Backblaze's study.
Re:Meh. fud spam. (Score:5, Insightful)
"I’ve noted that we just found that the Seagate 1.5 TB drives are about 8 years old since release, for the failure rate, but the average age of the Seagate drives in use are 1.4 years old. Averages are pretty useless statistic, and if Seagate drives are so bad then why buy so many new drives?"
If the company began rolling out Seagates for 3 years at 5k a year and stopped after three years because of the high failure rate, moving on to Hitachi and such, then the average age even over 8 years could very well be only 1.4 years. Because, let's face it, when it's your ass on the line and you see a particular type of drive putting your servers into a precarious state, you might start migrating away as fast as you can.
Those Seagate drives still running are probably either running in very low IO servers or very low-risk servers (clustered or such), but in such few quantities that their continued lifespans are not increasing the overall average much. The remainder could be shelved to avoid the risk of failing in a critical system and while they are listed in the total number of drives purchased, their age might not be included in the average presented.
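To see how an 8-year-old model and a ~1.4-year average age can coexist, here's a toy calculation (the purchase schedule below is invented for illustration, not taken from Backblaze's post):

# Hypothetical purchase schedule for one drive model: {years ago: drives bought}.
# The model was released ~8 years ago, but most units were bought recently.
purchases = {7: 500, 6: 1000, 1: 4000, 0: 5000}

total_drives = sum(purchases.values())
avg_age = sum(age * count for age, count in purchases.items()) / total_drives
print(f"released ~8 years ago, but average drive age = {avg_age:.1f} years")  # ~1.3

A fleet like that is dominated by recent purchases, so the average age says almost nothing about how long the oldest drives have been failing.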
Re: (Score:2)
Naaa, he says that Seagate confirmed that some of their drives were bad, which means they are not random failures anymore. Mathematically speaking, he is entirely right, of course, but it is a direct lie nonetheless as that is not what is being talked about.
What users care about is whether drives fail or not, not whether failures are random or expected. The Backblaze data is rock solid on that and Seagate looks like the hacks they are. My personal guess is that Seagate lets drives through QA that the others
Re:Meh. fud spam. (Score:5, Informative)
SSD is getting cheaper and faster every day.
You know what's getting cheaper? TLC flash, the kind that degrades WHEN YOU READ IT, the kind that has an internal read counter and needs to be rewritten after a certain number of reads to level cell voltages, the kind that has a ~300-write lifespan. It's designed to DIE no matter what you do with it.
Re:Meh. fud spam. (Score:5, Informative)
Recent reliability testing [techreport.com] has been downright horrifying for the TLC based drives. I predict a whole lot of people buying Samsung 840 drives because they're cheap are going to regret that.
Re:Meh. fud spam. (Score:4, Informative)
Although the 840 Series is clearly in worse shape than the competition, these results need to be put into context. 500TB works out to 140GB of writes per day for 10 years. That's an insane amount even for power users, and it far exceeds the endurance specifications of our candidates.
Seems like it's not as bad as you make it out. I don't think I'd be using a 'puny' 250GB drive in 10 yrs, much like I don't use 250GB HDDs now that drives over 1TB are around. 1TB SSDs are already around the $500 mark, and after ~5 yrs I think they'll be quite affordable.
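The endurance arithmetic in that quote is easy to check yourself (a small sketch; the 500TB figure is the total written in the endurance test quoted above, and the daily write rates are just example workloads):

endurance_tb = 500  # total data written in the test
for gb_per_day in (10, 50, 140):
    years = endurance_tb * 1000 / gb_per_day / 365
    print(f"{gb_per_day:>3} GB/day -> ~{years:.0f} years to reach 500 TB written")

At 140 GB/day that's roughly 10 years, and typical desktop workloads are nowhere near 140 GB/day.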
Re: (Score:3)
Reallocated sectors indicate things are starting to go wrong with the drive, and those skyrocket starting at only 100TB of writes. Admittedly they're busy servers, but I do have systems on SSDs I've measured hitting 75TB of writes in only a year of heavy use. And while extremely heavy on writes, this test rig is pretty simple compared to the mess real-world drives go through. As my parent post pointed out, there's more to TLC wear than just writes, and it's not hard at all to imagine workloads that will
Re: (Score:3)
What kind of BS test result analysis is that? Of course, reallocated sectors will rise in such a test pretty soon. The important thing to know is a) is the drive honest and b) where is the limit where data-loss or other failures will happen? From the data given, it is not possible to deduce that the Samsung TLC drives have a problem, just that Samsung is careful. It may also well be that for TLC they simply use a different strategy, namely less ECC per sector and more spare sectors. They may also replace ea
Re:Meh. fud spam. (Score:4, Insightful)
Yes. They are getting cheaper and faster. They are already much faster than magnetic rotating discs in read/write/iops.
Don't be facetious; you can't get a 1TB SSD for $100 yet and you know this. The OP clearly wrote "getting cheaper"; he didn't say they have reached price parity.
The reliability rate for current generation SSDs is now higher than traditional HDDs. So in regards to " run 24hours/24hours for 5 years without any problems ?", take your pick, they can all do it better than a traditional HDD.
I think traditional HDDs have precious few years left.
Re: (Score:2)
reliability rate
How is that calculated? In particular, do drives that corrupt data silently, or worse, lock the user out of their data and require a hard reset to make the drive work again, count? Or do only drives that actually fail permanently and get RMA'd count?
Re: (Score:2)
The same way it is calculated for rotating platters, I would guess.
The lab sets up a test with 1000 drives for 1000 hours. If one drive failed during the 1000 hours, the MTBF will be advertised as 1,000,000 hours ([1000x1000]/1). [short time period] * [number of pieces tested] / [number of pieces tested which failed within that time period] = MTBF.
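A sketch of that arithmetic, plus the conversion from MTBF to the annualized failure rate that fleet reports like Backblaze's actually talk about (the test numbers just mirror the example above):

def mtbf_hours(drives: int, test_hours: int, failures: int) -> float:
    # MTBF as advertised: total device-hours divided by observed failures.
    return drives * test_hours / failures

def annualized_failure_rate(mtbf: float) -> float:
    # Approximate fraction of drives expected to fail per year of 24/7 operation.
    hours_per_year = 24 * 365
    return hours_per_year / mtbf

mtbf = mtbf_hours(drives=1000, test_hours=1000, failures=1)
print(f"MTBF = {mtbf:,.0f} hours")                        # 1,000,000 hours
print(f"AFR  = {annualized_failure_rate(mtbf):.2%}")      # ~0.88% per year

Which is part of why a 1,000,000-hour MTBF on the spec sheet and a multi-percent real-world annual failure rate aren't necessarily contradictory: the spec-sheet number comes from a short test on young drives, not from watching a fleet age.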
Re: (Score:3)
I don't know the particular methodologies used.
Reported mtbf for SSDs is higher than regular HDDs.
I can't remember where but there was a recent release of stats from a data center (similar to the stats in question) that had SSDs vs HDDs and the SSDs came out on top for reliability.
I think a major factor that makes SSDs more reliable is that they are not sensitive to shock or vibration - so in a laptop/tablet/phone dominated world, you will have much higher reliability with SSDs than traditional HDDs.
Re: (Score:2)
Yes. They are getting cheaper and faster. They are already much faster than magnetic rotating discs in read/write/iops.
Don't be facetious; you can't get a 1TB SSD for $100 yet and you know this. The OP clearly wrote "getting cheaper"; he didn't say they have reached price parity.
The reliability rate for current generation SSDs is now higher than traditional HDDs. So in regards to " run 24hours/24hours for 5 years without any problems ?", take your pick, they can all do it better than a traditional HDD.
I think traditional HDDs have precious few years left.
Not until I can get a 3TB SSD for under A$120.
You've only just gotten 60-ish GB SSDs under $100, and even then a 500 GB SSD is still around $300.
Prices have been dropping like a brick the weight of your average American, but they will level out as production meets demand. I think it will settle at around twice the per-GB price of mechanical HDDs, but capacities will still be limited. Even with the price drops, SSDs are still not mainstream because most people want capacity and mechanical HDDs are fast enough.
Re: (Score:2)
Perhaps this is how you accommodate Seagate.... "We throw all our hard drives out after 5 years," so Seagate's drives seem pretty much the same to us..... (we don't bother checking which vendors' drives have greater longevity, because it's against the industry talking points --- even if those industry talking points about hard drive longevity are not backed by any rigorous statistical study, or even informal statistical analysis based on historical real-world data about hard drive longevity)