Backblaze Hard Drive Stats Q2 2019 (backblaze.com) 127
Backblaze: As of June 30, 2019, Backblaze had 110,640 spinning hard drives in our ever-expanding cloud storage ecosystem. Of that number, there were 1,980 boot drives and 108,660 data drives. This review looks at the Q2 2019 and lifetime hard drive failure rates of the data drive models currently in operation in our data centers.
Re: (Score:3, Insightful)
What you have described is mostly honest attempts to get rid of the plague that are lunatics and criminals creating fake identities to commit all sorts of crimes. I know perfectly well that it hurts free speech and the need to express yourself without necessarily revealing who you are (especially when reporting corporate crimes), but lately the problems of anonymity have far outweighed the benefits. Take for example the ocean of fake hate accoun
Re: (Score:1)
It's not fake, it's a sub-category of crime.
There's not necessarily any hate in driving your car over the speed limit but it is a crime.
There is probably a lot of hate in the act of shooting up a mosque and it is also a crime. It's a crime with a far more sinister backdrop than just driving your car like a dick and as such gets its own sub-category.
More manufacturers (Score:4, Insightful)
I wish we had data for more manufacturers. They seem to use mostly Seagate and HGST (Hitachi) drives with a few Toshiba thrown in. I would like to see Western Digital as well. In my personal anecdotal experience, I've seen many more failures from Seagate compared to Western Digital. I'd love to see some real data to back up my experiences, or even prove them wrong.
HGST was acquired by Western Digital, but it doesn't seem that they are using any Western Digital branded drives.
Re: (Score:1)
Looks like you did not read the blog post.
Anyway, I for one wish there were more manufacturers, period.
Re:More manufacturers (Score:5, Informative)
> I for one wish there were more manufacturers, period.
Internally, everybody at Backblaze very much agrees with you. It is scary to us to be down to only two or three suppliers.
The good news is that within a few years, SSDs will be a viable alternative to hard drives for big data storage, and there seem to be more SSD manufacturers than hard drive manufacturers.
Re:While I agree with you as a commercial entity. (Score:5, Interesting)
For pure cost reasons, we (Backblaze) tend to migrate off of the smaller drives after about 5 years anyway. Right now as we speak we are planning on moving data off of 4 TByte drives to 12 TByte drives.
The rough rule of thumb is that it is worth us moving to more dense drives when we can get that factor of three density increase. Basically any drive takes the same amount of electricity (and physical rental space) as any other drive, so when we can "shrink" the footprint to 1/3 the former footprint, we save 2/3 on space rental and 2/3 on the electricity so it is worth doing.
Now, if hard drives stop increasing in density, or take 8 years to increase by a factor of three, then we will need to re-evaluate based on failure rates and the like. But for now, we retire old equipment for cost reasons before it fails, so all the drives have enough longevity for us.
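A rough sketch of the factor-of-three arithmetic above, in Python. The fleet size is a made-up placeholder, not a Backblaze number. The only inputs taken from the comment are the 4 TB and 12 TB drive sizes and the claim that every drive costs about the same to power and house.

```python
import math

def drives_needed(total_tb, drive_tb):
    """Drives required to hold total_tb (parity overhead ignored)."""
    return math.ceil(total_tb / drive_tb)

def footprint_savings(old_drive_tb, new_drive_tb):
    """Fraction of per-drive costs (power, rack space) saved by migrating,
    assuming every drive costs the same to run regardless of capacity."""
    return 1.0 - old_drive_tb / new_drive_tb

# Hypothetical fleet: 12,000 TB currently stored on 4 TB drives.
old_count = drives_needed(12_000, 4)
new_count = drives_needed(12_000, 12)
savings = footprint_savings(4, 12)

print(old_count, new_count, round(savings, 3))
```

With a 3x density increase the drive count drops to one third, which is where the "save 2/3 on space and electricity" figure comes from.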
> I do think for the average consumer or the more neglectful business, that we will see far more issues related to data corruption and loss during the transition from hard disks to SSDs.
I've spent 12 years in the "backup business", and I agree.
Re: (Score:2)
Do you sell off old drives in bulk?
I have some *old* 12 bay SANs that I use for my on-site cold storage. I'm running a RAIDZ3 on 500GB drives for now, but I would happily buy discounted "only 4TB" drives or smaller to replace them with. Especially ones that you've flushed out for infant failures.
Re: (Score:1)
"some short lived brand made in India between 95 and 99"
I had one of those too, it was a slow piece of shit, but cheap.
Re: (Score:1)
The brand was JT Storage.
Re:More manufacturers (Score:5, Interesting)
If I recall correctly they cannot purchase Western Digital consumer drives in bulk, and so have not really purchased them since the Thailand and post-Thailand years where they resorted to buying and shucking retail drives.
I know that their own blog reported in 2016 that they cannot purchase large bulk orders from HGST, so they're left with smaller bulk orders from them.
Seagate is apparently not a problem - which I'd expect given press references (principally Anandtech) to Seagate's datacenter beta programs and datacenter marketing. Backblaze principally uses consumer drives, but apparently Seagate is willing to do it for those too.
Re: (Score:2)
Source for WDC information here [backblaze.com] ("Buying drives from Toshiba and Western Digital").
Re:More manufacturers (Score:5, Insightful)
They consistently see higher failure rates for Seagate drives, year after year. They also consistently buy more Seagate drives year after year.
It seems that for them the higher failure rate is tolerable because of Seagate's lower cost. Of course for someone without massive amounts of redundancy and high performance storage servers that might not be the case, especially as the cost difference is fairly marginal.
Re: (Score:2)
That's only part of it. They get a bulk discount going to Seagate, but they have stated repeatedly that they also have problems sourcing WD drives in quantities of 5000 at a time.
Re: (Score:2)
> I wish we had data for more manufacturers. They seem to use mostly Seagate and HGST (Hitachi) drives with a few Toshiba thrown in. I would like to see Western Digital as well.
They had Western Digital drives for many years. They were awful, mainly because of the flooding in Thailand, but they weren't good even before that and didn't improve very quickly afterward. Some of their Western Digital models were retired simply because so many of them failed they weren't able to keep homogeneous pods in operation.
But there just aren't any more manufacturers. The HGST brand was eliminated last year after Chinese regulators were sufficiently bribed by Western Digital to let them start f
Re: (Score:2)
As a counterpoint, I've seen probably five Western Digital drives give up the ghost in the last 8-10 years and not a single Seagate drive. Personally, I think the balance shifts back and forth. Seems like every manufacturer has a lemon model every now and again.
Re: (Score:2)
My anecdotal evidence is that hard drives are pretty reliable. Until a few years ago, I would have considered them more reliable than SSDs, but that's more of a statement of how shitty some of the early SSDs were. I back up, but in the interest of not having downtime, I replace any drive that's doing anything important within 5 years. In the last 10 years, I've personally had only 2 drives fail before their time, one Hitachi and one Toshiba. Both gave warning of their impending doom, so data loss was minimal.
SSD stats? (Score:2)
Would have been interesting to see SSD failure rate stats. Assuming they use any of course.
Re:SSD stats? (Score:4, Interesting)
Backblaze probably does not. As far as I understand their business model, they provide storage for backups and archiving, nothing intended to be used real-time.
Re:SSD stats? (Score:5, Informative)
> Backblaze probably does not [use SSDs]
For the big raw storage, you are correct. The hard drives still save us a ton of money over SSDs. However, we have quite a few SSDs in our billing servers, monitoring servers, front end caching servers, and in our large Cassandra clusters (used for resolving the "friendly" file names to raw hex storage ids/locations in our B2 service).
I don't know that we have a large enough sample size of any one model to be statistically significant. I wish some open source project would start up where you ran a TINY little app on any computer you own that reported home once per day to a global database of drives so the world could get all this info collected and reported.
Re: (Score:2)
Thanks for the info. The interesting SSDs would probably be the caching ones and the ones used in the Cassandra clusters. I do agree that you need quite a lot to get statistical significance.
Re: (Score:2)
You mean some kind of SSD... telemetry? :)
Re: (Score:2)
Just curious - do you use "enterprise" SSDs for these applications, or are they also consumer drives?
Seagate isn't very good for this use (Score:1)
I take Backblaze reports with a grain of salt, because they are using consumer-grade drives which were not really designed for this 24/7 use. You can still gain some insight into how one maker has a stronger showing than others in this severe use scenario. Seagate has never had very good results, so given that all the drives are consumer grade, you can draw some conclusions if you intend to run your own PC in this sort of cycle of use.
Re: (Score:3)
> Hitachi engineers at Hitachi facilities -- that's why the drives are superior quality and make WD, Seagate, IBM look poor.
Hitachi bought IBM's hard drive business [wikipedia.org] in 2003. They probably improved upon the IBM drives of yesteryear, but they didn't make IBM look poor. IBM walked away with $2B.
Re:I Used To Wonder (Score:5, Interesting)
> I used to wonder why Backblaze relied so heavily on Seagate
Exactly. We Reed-Solomon encode each file across multiple drives in multiple machines, and we always use "enough parity" that the failure rate won't result in data loss.
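To make the "enough parity" idea concrete, here is a minimal sketch of the underlying probability. It assumes drive failures are independent and that a file is lost only when more shards fail than there are parity shards. The shard counts and the per-drive failure probability are hypothetical, not Backblaze's actual configuration.

```python
from math import comb

def loss_probability(data_shards, parity_shards, p_fail):
    """Probability that more than parity_shards of the shards are lost,
    assuming each drive fails independently with probability p_fail
    during the window before a failed drive is rebuilt."""
    n = data_shards + parity_shards
    return sum(
        comb(n, i) * p_fail**i * (1 - p_fail) ** (n - i)
        for i in range(parity_shards + 1, n + 1)
    )

# Hypothetical layouts: same data shards, increasing parity.
for parity in (1, 2, 3):
    print(parity, loss_probability(10, parity, 0.001))
```

Each extra parity shard cuts the loss probability by orders of magnitude, which is why a somewhat higher per-drive failure rate can be tolerated.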
So for us, we have a (pretty simple) little spreadsheet that takes into account drive failure rate as a cost, along with drive density (more dense drives means renting less data center space) and we let the spreadsheet kick out which drives to buy based on the cheapest total cost of ownership. Honestly, we're not brand loyal AT ALL, and we're not afraid of a higher failure rate (other than that raises the cost because we have to buy replacements for failed drives).
With that said, if an individual purchases ONE DRIVE they might value a lower drive failure rate differently than Backblaze does. But honestly, if there is a 1% drive failure rate per year or a 10% drive failure rate in a year, you should still back up the data so you can sleep at night.
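The spreadsheet described above is not public, but a minimal sketch of that kind of total-cost-of-ownership comparison might look like the following. Every number, the drive names, and the cost model itself (purchase price, expected replacement purchases, and a flat per-slot cost for space and power) are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class Drive:
    name: str
    price: float        # purchase price per drive (hypothetical)
    capacity_tb: float  # denser drives spread slot costs over more TB
    afr: float          # annualized failure rate as a fraction

def cost_per_tb(drive, years=5.0, slot_cost_per_year=20.0):
    """Rough cost per TB over the service life: purchase price, expected
    replacement purchases, and per-slot space/power cost."""
    replacements = drive.price * drive.afr * years
    slot = slot_cost_per_year * years
    return (drive.price + replacements + slot) / drive.capacity_tb

candidates = [
    Drive("cheap-higher-afr", price=200.0, capacity_tb=12.0, afr=0.03),
    Drive("pricey-lower-afr", price=260.0, capacity_tb=12.0, afr=0.01),
]
best = min(candidates, key=cost_per_tb)
print(best.name, round(cost_per_tb(best), 2))
```

In this made-up example the cheaper drive wins despite failing three times as often, because the expected replacement cost is small next to the price difference.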
Your theory has been debunked. (Score:1)
It was found that the more expensive enterprise drives failed at the same rate as consumer drives. The only thing that enterprise drives benefited was the manufacturers' bank accounts.
Re: (Score:3)
Especially if it is used in an ultra-RAID, hot-swappable system. And in this case, isn't it used for backup/archive purposes? So, it gets written, and quite possibly never accessed again? If that is so, I'd expect that their usage pattern resembles the typical consumer more than an active server.
Re:Seagate isn't very good for this use (Score:5, Insightful)
> if a consumer-grade drive can survive in non-ideal situations then it will be an excellent choice for normal (consumer) conditions.
I think it's funny people think we (Backblaze) abuse the consumer drives. When I open my home gaming box, it is filled with dust bunnies. Our datacenters have filtered air and no carpets, and we walk across sticky fly paper mats as we enter. When we open a storage pod after 3 years, it is absolutely dust free inside!
We also don't run drive-intensive applications. With our backup service, you basically write the data ONCE, then leave it alone for years, and possibly read it back once to do a restore. (Well, we actually walk the drives from time to time recalculating all the checksums to make sure there is no bit-rot to repair from parity.)
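The checksum walk mentioned above can be sketched roughly as follows. This is only an illustration of the idea, not Backblaze's code: the shard names, the in-memory dictionaries, and the choice of SHA-256 are all assumptions.

```python
import hashlib

def checksum(data):
    """Hex digest used to detect silent corruption (SHA-256 assumed here)."""
    return hashlib.sha256(data).hexdigest()

def scrub(shards, expected):
    """Return the names of shards whose current checksum no longer
    matches the checksum recorded when the shard was written."""
    return sorted(
        name for name, data in shards.items() if checksum(data) != expected[name]
    )

# Hypothetical shards, with checksums recorded at write time.
shards = {"shard-0": b"hello", "shard-1": b"world"}
expected = {name: checksum(data) for name, data in shards.items()}

shards["shard-1"] = b"worle"  # simulate bit-rot: one flipped bit in the last byte
damaged = scrub(shards, expected)
print(damaged)
```

Any shard reported by `scrub` would then be rebuilt from the remaining shards and parity.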
The one thing we do differently than consumers is we leave the drive "powered up" for 5 years continuously. Internally we debate a lot whether this PRESERVES the drives or HURTS THEM; we honestly don't know. There is some evidence that heating drives up, then cooling them off by powering them down, then heating them up again is a lot worse for the hard drive than just leaving it in a very nice, constant temperature for 5 years.
But if a consumer at home leaves their hard drive powered up, I claim Backblaze is WAY NICER to the drives than a consumer. No dog hair or carpet fibers clogging up the cooling systems, we don't bump the table holding the drive with a chair every time we sit down. The cat isn't sleeping on the computer vents to keep warm.
Re: (Score:2)
> I claim Backblaze is WAY NICER to the drives than a consumer. The cat isn't sleeping on the [data center] computer vents to keep warm. :-)
That you KNOW of....
Look all pretty good (Score:3)
Considering that these are mechanical devices, even an AFR of 2.8% is pretty impressive. Looks like this has finally gotten to be a mature technology.
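For anyone wondering how an AFR figure like this is derived, here is a minimal sketch. It assumes the usual "failures per drive-year" definition (failures divided by drive days over 365, times 100). The drive-day and failure counts below are invented for illustration and are not taken from the report.

```python
def annualized_failure_rate(drive_days, failures):
    """AFR in percent: failures per drive-year of operation."""
    drive_years = drive_days / 365
    return failures / drive_years * 100

# Hypothetical model: 1,000 drives running for a full year with 28 failures.
afr = annualized_failure_rate(drive_days=1000 * 365, failures=28)
print(round(afr, 2))
```

Using drive days rather than drive counts means drives added or retired mid-quarter are weighted by how long they actually ran.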
Now, what I would really be interested in is comparable stats for SSDs.
Re:Look all pretty good (Score:4, Insightful)
I'd like to see stats on how accurate SMART data is at predicting failures. I could tolerate a 2.8% failure rate if SMART gave me warning in advance 99% of the time.
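As a back-of-the-envelope check on that trade-off, the rate of failures that arrive with no warning is just the AFR times the fraction SMART misses. The 2.8% and 99% figures come from the comment above; the 50% case is a hypothetical lower warning rate for comparison.

```python
def unwarned_failure_rate(afr, smart_warning_fraction):
    """Fraction of drives per year that fail without a prior SMART warning."""
    return afr * (1.0 - smart_warning_fraction)

optimistic = unwarned_failure_rate(0.028, 0.99)   # SMART warns 99% of the time
pessimistic = unwarned_failure_rate(0.028, 0.50)  # hypothetical 50% warning rate

print(f"{optimistic:.5f} {pessimistic:.5f}")
```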
Re: (Score:2)
The old estimates by IBM were just 50%. I have no idea whether that has gotten better. But I do not think backups have become obsolete, also because they protect against things like encryption-Trojans and user error as well.
Re: (Score:1)
My last two HDD (Seagate in QNAP NAS) failures were predicted by SMART and I was able to replace both before either failed fully.
Re: (Score:2)
That is pretty bad for "enterprise" disks. Maybe consumer disks are actually more reliable? The only enterprise disk I ever bought (Seagate) was DOA, never spent the extra money since then and never had any HDD failures since the old IBM "deathstar" drives.
Re: (Score:2)
Enterprise disks get the shit pounded out of them 24x7, with data from an immense number of servers striped across them when storage is virtualized. Your home disk mostly snoozes if you're a typical user.
Re: (Score:2)
True. And they should most definitely not fail that often, regardless of load. They should also most definitely not arrive DOA.
I have a nagging suspicion that at least some "enterprise" disks are basically a scam. But I do not have numbers to back that up.
Wish they had stats for noise (Score:4, Insightful)
I've been trying to put together a large RAID array for my home. I want something reliable, but I would also really like to be sure I'm getting relatively quiet hard drives (this after getting a Seagate for a small set of drives that was quite a bit louder in operation than the other drives I had).
Looking at other reviews it seems like a few years ago some reviewers used to include decibel levels, but I can't find anything current...
From reading around it seems like Western Digital RED drives may be some of the quieter drives, but if anyone has any hard data I would love to see it.
This is a RAID for redundancy, not speed, so I'm OK with a drive failure or two over time, just want something that is not distracting when it fires up.
P.S. As one ingredient in this mix, I decided to get a longer Thunderbolt 3 cable which helps also, if anyone is thinking along similar lines.
Re: (Score:2, Insightful)
Once you get four or more in an array (you say large, so I'm guessing 12+), there is no quiet HDD. You will hear the combined whirr and click-clack of all the drives, and the fans which cool them. SSD is your only option for quiet, and it helps with the heat too.
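The way noise adds up across drives can be estimated with the standard formula for combining incoherent sound sources: convert each level from decibels to power, sum, and convert back. The 28 dB per-drive figure below is a hypothetical placeholder, and the sketch ignores fans, enclosure resonance, and distance.

```python
import math

def combined_level(levels_db):
    """Combine incoherent noise sources given in dB:
    10 * log10(sum(10 ** (L / 10)))."""
    return 10 * math.log10(sum(10 ** (level / 10) for level in levels_db))

per_drive_db = 28.0  # hypothetical per-drive noise level
for count in (1, 4, 12):
    print(count, round(combined_level([per_drive_db] * count), 1))
```

For n identical drives this works out to the single-drive level plus 10 * log10(n), so doubling the drive count adds about 3 dB.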
Re:Wish they had stats for noise (Score:5, Interesting)
> if you are using large numbers of HDD forget about quiet. Stick it somewhere you don't have to hear it.
Here is an amusing Backblaze startup story.... Backblaze's original office was in the living room of my dive one-bedroom apartment in Palo Alto, California. I mostly work on the client, but the guy building the original storage pods would power up the early pod prototypes in my living room on Friday and then GO HOME FOR THE WEEKEND. The 45 drives (often in an open enclosure) and fans would SCREAM in my living room all weekend long, it was super annoying. So I bought a plastic "tuff shed", put it on my back patio, and drilled a hole and strung power and ethernet out to my patio, and moved the prototype pods out there. Here are two pictures of that setup:
https://i.imgur.com/awTcR2t.jp... [imgur.com]
https://i.imgur.com/gg49XhE.jp... [imgur.com]
Thanks! (Score:1)
Probably not going to go with an outdoor shed route but that was fun to see. :-)
HOLY CRAP XKCD WAS RIGHT. (Score:2)
> The 45 drives (often in an open enclosure) and fans would SCREAM in my living room all weekend long, it was super annoying.
There has never been a more appropriate time to post this XKCD https://xkcd.com/908/ [xkcd.com] and it has never been more scarily on point!
Re: (Score:2)
Also I like that your post of those pictures is not too far under your post about how much care you take of your servers to ensure cats don't sleep on them :-D
It does seem like there can be a large difference (Score:1)
> Once you get four or more in an array (you say large, so I'm guessing 12+), there is no quiet HDD. You will hear the combined whirr and click-clack of all the drives
I agree with what you are saying, I do have pretty realistic expectations that it's not noise free like an SSD would be.
However, like I said, I did have this experience with a really loud Seagate drive - I have a four bay external case currently (non RAID) and when I bought this Seagate drive, 7200 RPM like all the rest, it was really lots louder
Re: (Score:2)
Western Digital Reds are cooler and quieter because they run at 5400 RPM. This isn't a problem for serving up files at home, so I've been using them for my personal RAIDs without any issues.
Looking at Red Pro (Score:1)
Thanks for the feedback, I was looking at the 7200 RPM Red Pro [amazon.com] drives, if anyone has any feedback about those in particular...
I had read somewhere they were fairly quiet, though I cannot find where now. Even though I don't care about performance a lot I still care enough that I'd like to use 7200RPM drives over 5400 RPM.
I think what I may do is buy one of a few different kinds of drives, plug them in and see what seems best, then mostly use those drives and use the noisier ones for offsite backup units.
Re: (Score:1)
After double-checking, it looks like WD's largest Reds and Red Pros have very similar specifications according to the spec sheets on WD's website.
WD Red [westerndigital.com]
WD Red Pro [westerndigital.com]
Re: (Score:2)
I did this with HGST helium SAS drives, 3 years ago. Made a nice 6-drive RAID-10 array, since I did care about speed, with a hardware RAID controller. It's been rock solid, though I did need to put in a fan blowing directly on the RAID card. For HW RAID, make sure to get a super-capacitor backup on the card in case of power loss.
Of course, if you don't care about performance at all, software RAID 5 is about half the price, and if you're OK with consumer drives, half again. But I'd stay away from Weste
Re: (Score:2)
Oh, and until I added the fan in the side of the case, the server was completely inaudible most of the time - you could just barely hear drive activity in a dead quiet room, but not with anything else going on.
Re: (Score:2)
> If this isn't for high-speed access, just throw it somewhere where it can be loud and network-connected. Basement?
Thanks - I had thought about that also, currently have it close to a desktop I work on. But you are probably right that I should just forget about keeping it close and instead find a place in the house where I can put it so it's not audible, and just use a fast network to reach it... I'll look into just doing that, I had been using some powerline ethernet for a while but I can probably do a real run of
Re: (Score:2)
> From reading around it seems like Western Digital RED drives may be some of the quieter drives, but if anyone has any hard data I would love to see it.
I can't provide data, I can only provide an anecdote from someone who is very noise sensitive. I had 2 arrays, one with 4x 1.5TB Seagate drives and the other with 4x 4TB WD drives. Two of the Seagate drives started throwing SMART errors so I looked at replacing the entire array.
I ended up going WD drives *because* that array was significantly quieter. Now the caveat there is I was comparing apples to oranges (1.5TB drives vs 4TB drives) purchased over a year apart, but in general I was happier with the WD p
Stats (Score:2)
The post provides data on the total number of failures over a given period. There is some data on average age, but I think it would be much more interesting to look at failure over time in something like a survival curve. How the failures are distributed over time.
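One common way to build the kind of survival curve suggested above is the Kaplan-Meier estimator, which handles drives that are still running (or were retired healthy) as censored observations. The sketch below is a plain-Python illustration on invented data. It is not something the blog post itself provides.

```python
def kaplan_meier(durations, failed):
    """Return [(t, S(t))] at each time t where at least one failure occurred.

    durations: observed age of each drive (e.g. in days)
    failed:    True if the drive failed at that age, False if it was
               still working when observation ended (censored)
    """
    events = sorted(zip(durations, failed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Group all drives observed at the same age t.
        while i < len(events) and events[i][0] == t:
            deaths += 1 if events[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical drives: ages in years, and whether each one failed.
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, False])
print(curve)
```

Plotting the resulting points per drive model would show whether failures cluster early (infant mortality) or late (wear-out), which a single AFR number hides.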
On HDD Specs and Stats (Score:3)
I recently-ish built a new NAS with six 4TB HGST "CoolSpin" drives in a single RAID-Z2 vdev -- HGST because reliable; CoolSpin because quiet, lower power, and fast enough for GigE.
However, when I was selecting drives, the most annoying aspect was concretely determining whether a given drive supported TLER [wikipedia.org]. The datasheets almost never disclose this, and you have to dive into various support and discussion fora to find out.
Now, the number of variables I need to keep track of has increased, and manufacturers seem even less interested in documenting them. Specifically:
I wonder if any Backblazers would be willing to say a few words (informally, of course) on how they rate the importance of these characteristics, and how they discover what a given drive supports without having to buy a pallet of them first.