Endurance Experiment Writes One Petabyte To Six Consumer SSDs
crookedvulture (1866146) writes "Last year, we kicked off an SSD endurance experiment to see how much data could be written to six consumer drives. One petabyte later, half of them are still going. Their performance hasn't really suffered, either. The casualties slowed down a little toward the very end, and they died in different ways. The Intel 335 Series and Kingston HyperX 3K provided plenty of warning of their imminent demise, though both still ended up completely unresponsive at the very end. The Samsung 840 Series, which uses more fragile TLC NAND, perished unexpectedly. It also suffered a rash of cell failures and multiple bouts of uncorrectable errors during its life. While the sample size is far too small to draw any definitive conclusions, all six SSDs exceeded their rated lifespans by hundreds of terabytes. The fact that all of them wrote over 700TB is a testament to the endurance of modern SSDs."
context (Score:2)
has anyone tried this with platter drives? would it simply take too long?
it's hard for me to judge whether this is more or less data than a platter drive will typically write in its lifespan. I feel like it's probably a lot more than the average drive processes in its lifetime. and anyway, platter drive failure might be more a function of total time spent spinning or seeking, or simply time spent existing, for all I know.
Re: (Score:2)
Re:context (Score:4, Interesting)
Not that much higher for streaming reads and writes, the new Seagate 6TB can do 220MB/s @128KB [storagereview.com] streaming reads or writes. That works out to ~19TB/day so it would only take around 2 months to hit 1PB.
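A quick back-of-the-envelope check of those figures (a sketch only; the 220MB/s number is the quoted streaming rate from the linked storagereview.com benchmark):

```python
# Rough check of the streaming-write numbers quoted above.
seagate_streaming_mb_s = 220          # MB/s, the quoted 128KB streaming rate
seconds_per_day = 24 * 60 * 60

mb_per_day = seagate_streaming_mb_s * seconds_per_day   # 19,008,000 MB/day
tb_per_day = mb_per_day / 1_000_000                      # decimal TB
days_to_petabyte = 1_000_000_000 / mb_per_day            # 1 PB = 1e9 MB

print(f"~{tb_per_day:.0f} TB/day, ~{days_to_petabyte:.0f} days to 1 PB")
# -> ~19 TB/day, ~53 days (about two months) to 1 PB
```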
Re:context (Score:5, Informative)
If seeking does wear a drive, then using an SSD for files that generate lots of seeks will not only greatly speed up the computer, but also extend the life of HDDs relegated to storing big files.
Re: (Score:2)
But contiguous writes are the absolute (and unrealistic) best case in terms of MB transferred before failure for an HDD, because they minimize the number of revolutions and seeks per megabyte written. For whatever it's worth, it used to be said that "enterprise grade" drives were designed to withstand the constant seeking associated with accesses from multiple processes, instead of the fewer seeks associated with sporadic, single-user access.
If seeking does wear a drive, then using an SSD for files that generate lots of seeks will not only greatly speed up the computer, but also extend the life of HDDs relegated to storing big files.
In regard to mixing SSDs and HDDs, there are some great caching programs that allow a single SSD to act as a front end to several HDDs. The driver looks at the traffic coming through and, if it is mainly sequential writes, bypasses the SSD and writes directly to the disk. For random I/O across the HDDs, the SSD acts as a cache. The percentage of sequential to random traffic is selectable.
Very recently I purchased my first SSD at about $0.48 per gigabyte (128GB for $59.00). I expect that next yea
Re: (Score:3)
Why? The failure modes are completely different (and yes there are quite a few reports around on this subject..)
SSDs have a write capacity limitation due to write/erase cycle limitations (they also have serious long term data retention issues).
Mechanical drives tend to be more limited by seek actuations, head reloads, etc. The surfaces don't really have a problem with erase/write cycles.
Neither is particularly good for long term storage at today's densities. Tape is MUCH better.
Re: (Score:3)
the problem with tape is by the time you can retrieve the data you're interested in, it no longer matters.
Re:context (Score:4, Informative)
Tape actually has pretty high transfer rates. Its seek times are what suck, but if you're doing a straight dump of the tape you aren't doing any seeking at all.
Re: (Score:2)
Re: (Score:2)
The main difference is LTO tapes (and similar) are actually designed so they can be used for archival storage (in the region of 30 years). Hard drives just aren't. If you can get a drive that's been sitting in storage for 20 years - no matter how good - to spin up, then you're very lucky.
Re: (Score:1)
Supported (Score:2)
RAID implementations don't always support cobbling together a random mixture of disk sizes which change over time.
Linux's software RAID supports this without any problem. Once you've finished a cycle of yearly swaps over the whole pool, you can grow the RAID to the new maximum (= the shared minimum across the drives). The resize is done online and is gracefully restartable (in fact, you can even migrate to bigger RAIDs with more drives gracefully).
(e.g.: after 6 years, once you've upgraded a RAID6 from 6x 1TB to 6x 4TB, you can easily grow the array from 4TB to 16TB).
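The capacity arithmetic behind that example, as a small sketch (assuming the usual RAID6 rule of thumb: every member counts as the smallest drive, and two drives' worth of space goes to parity):

```python
def raid6_capacity_tb(drive_sizes_tb):
    """Usable RAID6 capacity: every member is truncated to the smallest drive,
    and two drives' worth of space is used for parity."""
    return (len(drive_sizes_tb) - 2) * min(drive_sizes_tb)

print(raid6_capacity_tb([1] * 6))             # 6x 1TB -> 4 TB usable
print(raid6_capacity_tb([4] * 6))             # 6x 4TB -> 16 TB usable
print(raid6_capacity_tb([4, 4, 4, 1, 1, 1]))  # mid-upgrade: still capped at 4 TB
```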
In addition to that, modern filesystems like BTRFS and ZFS
Re: (Score:1)
BTRFS and ZFS (Score:2)
I think that's what I was saying: a random mixture of disk sizes is not supported by this particular RAID implementation - it will only use the same size across each disk, meaning you are constrained to the size of the smallest disk in the pool.
Okay, I was thinking that you were comparing with other RAID implementations (most fake RAID cards can't even *grow* the RAID once you've cycled the drives and the "smallest disk in the pool" is now bigger).
Btrfs and ZFS sound like they handle it much better.
Yup, they would handle whatever you throw at them, as long as they can manage to fit the constraints you've asked for.
Re: (Score:2)
What would be involved in continually verifying the viabil
Re:context (Score:4, Informative)
has anyone tried this with platter drives?
A few years ago, Google published a study [googleusercontent.com] of hard disk failures. Failures were not correlated with how much data was written or read. Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use. Failures were negatively correlated with temperature: drives kept cooler were MORE likely to fail.
Re: (Score:1)
Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use.
That makes no logical sense unless the statement is missing a "not" somewhere, or unless you WANT failures.
Re:context (Score:4, Informative)
While ShanghaiBill apparently struggles with the English language, the phrase "you should idle a drive not in active use" means the drive will spin up fewer times. You should disable spin-down and leave the drive idling, not on standby.
You'll reduce the number of head load/unloads.
You'll reduce peak current consumption of the spindle motor.
The drive will stay at a more stable temperature.
Re: (Score:1)
Is there any way to tell a WD Caviar Black drive to behave this way? Mine automatically spins down after 30 minutes of inactivity I believe.
Re: (Score:2)
It should be under OS control, not the drive.
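For reference, on Linux this is typically done from the OS side with hdparm; a minimal Python sketch (assuming hdparm is installed and the script runs as root; -S sets the drive's standby timeout and 0 disables it, though some drives also have firmware idle timers that only the vendor's own utility can change):

```python
import subprocess

def disable_spindown(device):
    """Ask the drive to never enter standby: hdparm -S 0 sets the
    standby (spin-down) timeout to 'disabled'. Requires root."""
    subprocess.run(["hdparm", "-S", "0", device], check=True)

if __name__ == "__main__":
    disable_spindown("/dev/sdb")  # hypothetical device node; point it at your drive
```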
Re: (Score:1)
Thanks for the reply. I'll go do a little research on that.
Re: (Score:3)
Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use.
That makes no logical sense unless the statement is missing a "not" somewhere, or unless you WANT failures.
You're reading the sentence wrong. You're reading it as "Times the disk was spun up".
What they mean is the total amount of time the disk has spent spinning over its lifetime.
Re: (Score:2)
What they mean is the total amount of time the disk has spent spinning over its lifetime.
Yes, this is correct. It is the total amount of time spent spinning that you want to minimize, not the number of "spin-up/spin-down" cycles. The longer the disk spins, the more wear on the bearings.
Re:context (Score:5, Interesting)
That's curious. Almost all of the drive failures I've seen can be attributed to head damage from repeated parking prior to spin-down, whereas all the drives that I've kept spinning continuously have kept working essentially forever. And drives left spun down too long had a tendency to refuse to spin up.
I've had exactly one drive that had problems from spinning too much, and that was just an acoustic failure (I had the drive replaced because it was too darn noisy). With that said, that was an older, pre-fluid-bearing drive. I've never experienced even a partial bearing failure with newer drives.
It seems odd that their conclusions recommended precisely the opposite of what I've seen work in practice. I realize that the plural of anecdote is not data, and that my sample size is much smaller than Google's sample size, so it is possible that the failures I've seen are a fluke, but the differences are so striking that it leads me to suspect other differences. For example, Google might be using enterprise-class drives that lack a park ramp....
Re: (Score:3)
The problem is that there are two ways for the drive to park the heads. (FYI - ALL spinning rust drives these days park the heads on power down). One of them is more violent than the other.
There is the normal
Re: (Score:2)
I've never had a drive that did emergency parking until my HD-based MacBook Pro. All my dead drives were too dumb to have the needed sensors, as were the machines that they were in.
With that said, I'm terrified at the aggressiveness with which that MacBook Pro parks its heads. I literally can't pick the thing up and place it gently on my bed without the heads doing an emergency park. I don't have a lot of faith in that drive lasting very long. Non-emergency parking is hard enough on the heads. Emergen
Times spun up was a factor too (Score:3)
Stopping and starting a drive is also a moment when you can break or wear down a drive. This can be explained by the fact that heads rest on the platters (unless in the parked position) when the platters are not spinning at the right speed. Also, a drive that is being spun down will cool down and then warm up again when being spun up, and these temperature fluctuations will influence the drive's reliability. The most plausible explanation I can come up with is that temperature shifts will make parts inside the
Re: (Score:2)
A few years ago, Google published a study [googleusercontent.com] of hard disk failures. Failures were not correlated with how much data was written or read. Failures were correlated with the amount of time the disk was spun up, so you should idle a drive not in active use. Failures were negatively correlated with temperature: drives kept cooler were MORE likely to fail.
Actually the paper says that the Google guys approximated power-on hours with a notion of age, which I assume was approximated by knowledge of either the manufacture date or the delivery date. From the paper, annualized failure rate (AFR) is somewhat correlated with age, but not necessarily strongly enough to predict probability of failure. Even with their large drive population, the paper points out that the drive model mix is not consistent over time and therefore, not much can be made of the apparent
Re: (Score:2)
With SSDs, we know that the NAND dies a bit every time it is erased and rewritten; sometimes after surprisingly few cycles with contemporary high density MLC NAND; but the supporting solid state stuff should last long
Re: (Score:2)
Re: (Score:1)
Re: (Score:2)
All still going (Score:2)
I have around 30 ranging from 40G to 512G, all of them are still intact including the original Intel 40G SSDs I bought way at the beginning of the SSD era. Nominal linux/bsd use cases, workstation-level paging, some modest-but-well-managed SSD-as-a-HDD-cache use cases. So far wearout rate is far lower than originally anticipated.
I'm not surprised that some people complain about wear-out problems, it depends heavily on the environment and use cases and people who are heavy users who are not cognizant of ho
Re: (Score:2)
a bit unclear why, but any HDD I've ever put on the shelf (for 6-12 months) that I try to put back into a machine will typically spin up, but then fail within a few months after that.
The lubrication in the bearings of the platters and head arms gets thicker over time after being heated a few times. It needs to stay warm to keep a lower, workable viscosity. The drag becomes too great fairly rapidly after even a few months of initial use followed by storage on the shelf.
Good news for me (Score:2)
I am sticking to rated lifespan (Score:2)
The ability to write hundreds of terabytes more is nice. But it's reading them back that I am really worried about. Great news for someone deploying a short-term cache.
extremesystems test (Score:4, Informative)
There was also a very interesting endurance test [xtremesystems.org] done on extremesystems.org. Very impressive stuff. I don't yet own an SSD, but I'll continue to consider buying one! Maybe next Black Friday. Just waiting for the right deal.
Re: (Score:2)
Re: (Score:2)
Am assuming USB keyboard. Have you tried toggling the BIOS settings for legacy support for USB keyboards and mice? Also, PS/2 works well if you've got the ports and a keyboard.
Nope. This is the laptop's native keyboard. There is no problem if I use a USB keyboard. However, carrying an extra keyboard kind of defeats the purpose of having a laptop.
The really weird thing is that this delay happens *AFTER* grub. The keyboard works fine in grub. However, once I choose an OS, I get stuck for 30ish seconds at the login prompt, whether I choose Windows or Linux.
Re: (Score:2)
Now, instead of waiting on their HD to seek around and find information (a boot process measured in minutes, program loading t
Re: (Score:1)
How does this compare to hard drives, though? That's the key metric. I don't mind my pc booting up in 30 rather than 10 seconds if I don't have to do disaster recovery and pay far more per gig.
IO pattern (Score:4, Insightful)
That's a heck of a lot of data, and certainly more than most folks will write in the lifetimes of their drives.
Continued write cycling [...]
That's just ridiculous. Since when is reliability measured in how many petabytes can be written?
Spinning disks can be forced into inefficient patterns, speeding up the wear on mechanics.
SSDs can be easily forced to do a whole erase/write cycle just by writing single bytes into the wrong sector.
There is no need to waste bus bandwidth with a petabyte of data.
The problem was never the amount of the information.
The problem was always the IO pattern, which might accelerate the wear of the media.
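To make that concrete, here's a toy sketch of worst-case write amplification (illustrative numbers: a 4KB write landing in a 2MB erase block; real controllers remap pages, coalesce writes, and keep spare area, so this is the pathological ceiling rather than typical behavior):

```python
# Toy worst-case write-amplification estimate.
ERASE_BLOCK = 2 * 1024 * 1024   # bytes, typical order of magnitude for an erase block
WRITE_SIZE = 4 * 1024           # bytes, one small random write

def worst_case_amplification(write_size, erase_block):
    """Physical bytes written per logical byte when every small write
    forces a full erase-block rewrite (read-modify-erase-program)."""
    return erase_block / write_size

print(worst_case_amplification(WRITE_SIZE, ERASE_BLOCK))  # 512.0
# In this pathological pattern, 1GB of logical writes could cost ~512GB of
# flash wear, while large sequential writes stay close to 1x amplification.
```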
Re: (Score:3)
Yes, but it's a well-known problem. Pretty much the only thing that will write inefficiently to an SSD (i.e. cause a huge amount of write amplification) is going to be a database whose records are updated (effectively) randomly. And that's pretty much it. Nearly all other access patterns through a modern filesystem will be relatively SSD-efficient. (keyword: modern filesystem).
In the past various issues could cause excessive write amplification. For example, filesystems in partitions that weren't 4K-alig
Re: (Score:1)
I agree, measuring reliability like this is strange.
Even more disturbing is the number of drives being tested. What is the statistical significance of their results?
Graceful Failover ? What Graceful Failover? (Score:2)
Even Intel, behemoth of reliable server hardware, wasn't able to fix the Sandforce problems.
According to an Intel representative, "graceful failover" of an SSD drive means you _kill_ the drive in software during a reboot and don't switch it to read-only mode (like you promise in the documentation). :DDD
Kiss your perfectly readable data goodbye.
Re: (Score:2)
That was also my question when I RTFA. It says that the Intel drive entered some sort of "read-only" mode, and that at that point the drive was still OK. Then a new write cycle was forced (how?), and the drive committed seppuku and became unreadable.
Which is it? Can I be confident that my SSD will fail into a graceful read-only mode? All my ~ is in RAID1 and backed up so I'm not worried, but it'd be nice to be able to just copy the / from a read-only SSD to a new one when the time comes.
Endurance Experiment Writes One Petabyte To Three (Score:2)
"how much data could be written to six consumer drives. One petabyte later, half of them are still going."
Re: (Score:3)
Yes, they are sooo reliable, every single SSD I've bought has been dead within 3 months.
Odd - I've got 5 and all are well. 1 Intel, 2 Samsung and 1 Critical. I guess I'm lucky and you are not.
Re:Sigh. (Score:4, Funny)
Rejoice then, you still have 75 SSDs!
Re:Sigh. (Score:5, Funny)
We seem to have the beginning of a trend here - AC's don't have very good luck with SSD's.
Try logging in and see if that changes your outlook.
Re:Sigh. (Score:5, Funny)
No, I logged in and I've still got Outlook 2007.
Re:Sigh. (Score:5, Funny)
That apparently doesn't prevent you from dropping bits, though. 1+2+1=4.
Re: (Score:3)
I don't recall the brand of the fourth, got distracted and forgot to edit. But I knew someone would have fun pointing it out, so it would be rude for me to deny you the pleasure. So - yeah - I dropped a bit. :D
Re:Sigh. (Score:4, Funny)
There you go again.
Re: (Score:2)
Re: (Score:2)
How do you know he wasn't listing them chronologically? "1 Intel, 2 Samsung[, 1 don't recall] and 1 Critical. "
Thanks, but no - I just screwed up. Yesterday was just my turn to be in the barrel.
Re: (Score:2)
not to mention a write error: "Critical" instead of "Crucial"
Hee hee. That's a "loose nut behind the keyboard" error - not an SSD error.
Re: (Score:2)
I have two different Crucial mSATA drives - one runs VMware in one workstation (well, "server"), and the other runs virtualbox in another. Each is a different generation SSD - and no problems. I've also shipped many to customers in servers (real servers on RAID controllers, not workstations posing as servers). Not one failure.
Re: (Score:3)
hey thanks for sharing your anecdotal experience as if it carries any weight whatsoever compared to actual controlled experiments and statistics.
for comparison, I've owned 8 and no failures yet. I have a raid0 array of SSDs upstairs that has been working flawlessly since 2008. an aberration maybe. anecdotal evidence works like that.
Re:Sigh. (Score:5, Insightful)
that reminds me ... I should do a backup ....
Re: (Score:2)
Re: (Score:2)
Let me guess, every single SSD you bought was a low-capacity SandForce-controlled one.
Re:Sigh. (Score:5, Funny)
Yes, they are sooo reliable, every single SSD I've bought has been dead within 3 months.
A happy OCZ customer, I take it?
Re:Sigh. (Score:5, Funny)
Amen to this, I STUPIDLY bought a REFURBISHED OCZ drive which, coincidentally, failed shortly after OCZ announced bankruptcy. The other drive I bought was a Corsair that, like its OCZ brethren, died three weeks after being put into service. The speed is wonderful but the life is pathetic. Despite this, I have a Kingston and a Samsung which are both going strong, so I can confidently state that HALF OF ALL SSDs FAIL AFTER THREE WEEKS, THE OTHER HALF RUN FOREVER!
Perhaps I need to work on my sample set and my over-use of capital letters.
Re: (Score:2)
Might be luck, might be an exception, but my Agility 2 is still kicking after 3 years, half of that was under XP (no TRIM).
I've had 4 spinning drives (Seagate) die or get bad sectors in the same time frame.
Re: (Score:2)
Re: (Score:2)
I purchased two OCZ 64GB SSDs and they both failed right around the end of the warranty. One was replaced under warranty, the other not. They replaced the 64GB drive with a 60GB drive, which was a little upsetting, but better than nothing I suppose.
Both died suddenly, and with no warning.
Re: (Score:2)
I'll see your incredibly small sample size, and raise you with "the company I work for has bought hundreds of Kingston SSDs, and we haven't had even one fail in the last two years."
Re: And the winners are... (Score:1)
100TB a day? Roughly 1.2GB per second? No. No you won't.
Re: (Score:2)
SATA 3.2 isn't out yet for consumer drives, so no you won't.
1.2GB/s is twice the bandwidth of SATA 3.0.
Re: (Score:2)
It's all irrelevant, because there's no SSD out there that could handle that write rate, and there's no way he's generating that much data 24/7.
He's full of crap and doesn't want to admit it.
Re: (Score:2)
That the l
Re: (Score:2)
No, with 1920x1080, 24 bits per pixel and 30fps I'm getting 178MB/s, so six cameras almost saturate the SSD. Of course it would be a great deal more reasonable to acquire the data compressed in H264 frame by frame, or H265 if it comes out and is better for that.
Though, we might get mad and use uncompressed 1080p at 60fps. Then you can have a realistic "zoom in and enhance" sequence as in the dumb movies and TV shows, with an algorithm able to combine data from multiple pictures and see a face more clearly esp.
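The 178MB/s figure checks out if you read it as binary megabytes; a quick sketch of the arithmetic:

```python
def raw_video_rate_mib_s(width, height, bytes_per_pixel, fps):
    """Uncompressed video data rate in MiB/s."""
    return width * height * bytes_per_pixel * fps / (1024 * 1024)

one_camera = raw_video_rate_mib_s(1920, 1080, 3, 30)
print(f"{one_camera:.0f} MiB/s per camera")           # ~178 MiB/s
print(f"{6 * one_camera:.0f} MiB/s for six cameras")  # ~1068 MiB/s, close to the claimed 1.2GB/s
print(f"{raw_video_rate_mib_s(1920, 1080, 3, 60):.0f} MiB/s at 60fps")  # ~356 MiB/s
```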
Re: (Score:2)
You're editing 4K video 24/7? That's quite impressive, but not terribly believable.
Re:And the winners are... (Score:5, Insightful)
Re: (Score:2)
You couldn't sustain that bit rate on a SATA interface. No normal workflow would sustain that volume of writes or encoding, especially prosumer or lower.
There may be broadcast or industrial uses, but they would be writing to industrial-strength storage via 16Gb FC to SAS SLC arrays.
Re: (Score:2)
You failed at math.
You won't be writing 1.2GB/s to any SSD currently available. They all max out at SATA 3.0 - 600MB/s.
Since you'd need at least 3 striped drives to even try to sustain 1.2GB/s, your endurance has now tripled from 700TB to 2.1PB.
Re: (Score:2)
That's not really how it works. The wear is leveled across cells, so increasing the number of drives in a RAID0 really does increase the amount of data until a predictable failure (i.e., the "write limit").
Re: (Score:2)
But the SSD controllers aren't fully aware of what is going on, and the free or reserved space for wear leveling is split as many times as you have drives. An n-times-bigger SSD is still better than n SSDs in a RAID 0.
Then we're limited by the interface for speed, but we have good incremental progress on the horizon. PCIe storage is already standardized in the form of M.2 and SATA Express; they work over PCIe 2.0 x2 or x4 (SATA Express is x2), which gives a theoretical 1GB/s or 2GB/s. Upgrade to PCIe 3.0 dou
Re: (Score:3)
See for yourself [forret.com].
Why didn't you just refer to the LHC web page and imply that you are writing at that same data rate to a single SSD...it would have exactly the same value as an argument.
Re: (Score:2)
By the way, 700TB isn't all that much these days. Betcha I could do it in a week's worth of video editing.
I'll take that bet. Most SSDs have physical bandwidths of less than 1GB/sec. So even if you were writing continuously, without sleep or bathroom breaks, and reading nothing back, you would still need more than a week to write that much data.
Re: (Score:3)
Which will also spread around the writes. If you're writing a 4TB video across 10 disks, that's only 410GB to each, so you only get that much endurance used up.
Re: (Score:2)
Protip: less than 1GB/sec is much less than 700TB/week.
Protip 2: SATA 3.0 is only 600MB/sec, the peak interface bandwidth is only 346GB/week.
Re: (Score:2)
I think your math is off a bit by a factor of 1000:
600 MB/s × 604,800 sec/wk = 362,880,000 MB/wk = ~362,880 GB/wk = ~362 TB/wk
Re: (Score:2)
Off by a letter.
s/G/T
Re:And the winners are... (Score:5, Informative)
You might want to do a bit of math before making such a statement. 700TB is a very large amount of data, and writing it in a week would require quite a bit of data transfer bandwidth. To wit:
700,000,000,000,000 bytes / 7 days = 100,000,000,000,000 bytes/day; / 24 hours = 4,166,666,666,666 bytes/hour; / 3600 seconds = 1,157,407,407 bytes per second.
Do you really write 1.157GB/second every second for a week? And if so, what data interface are you using? I'd really like to know since SATA 3.0 can only handle 600MB/second. Perhaps you're using SATA 3.2 which does have the required speed?
Now in an environment using multiple drives, you can get to the 700TB mark much more rapidly with much lower per drive bandwidth. But then again, that's not the test criteria. They are testing how much endurance individual SSDs have.
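The same arithmetic as a one-glance sketch, for anyone who wants to rerun it with their own numbers:

```python
TB = 10**12                        # decimal terabyte, in bytes
seconds_per_week = 7 * 24 * 3600   # 604,800

required_rate = 700 * TB / seconds_per_week   # bytes/second to write 700TB in a week
sata3_usable = 600 * 10**6                    # ~600 MB/s usable SATA 3.0 bandwidth

print(f"{required_rate / 10**9:.3f} GB/s sustained")    # ~1.157 GB/s
print(f"{required_rate / sata3_usable:.2f}x SATA 3.0")  # ~1.93x the interface limit
```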
Re: (Score:2)
As far as I know SAS 12Gb gives you 1.2GB/s theoretical.
Re: (Score:2)
Good luck with that.
The Intel 335 has a sequential write speed of about 350MB/s (the rest are around the same speed). Writing 700TB at that speed would take 24 days and change, with no breaks to do things like read any of that data.
Re: (Score:2)
The 8MB problem is an Intel firmware bug (older, non-Sandforce controllers). If you don't care about your data, an ATA "security erase" can make it usable again. I think I used the DOS-based hdderase, and after a few problems it went through. Intel's DOS-based flash tool idiotically ignores the SSD because it identifies itself as "BAD_CTX"...
Re: (Score:2)
but the actual behavior when it ran out seems to be exactly what's supposed to happen
FTFY
When a flash cell fails, it can no longer hold the charge that stores the bit.
It will always be read as if it had no charge, therefore read checksums will fail and the drive is unreadable.
Re: (Score:2)
And... that's it? What did SMART say? Did you actually wear the SSDs out as per the wear indicator? Or did you hit a bug in the Samsung controller before the wear indicator maxed out?
To be fair, the precise situation you describe, particularly if you did not retune the RAID-6 setup or the mysql server, and if the server was fsync()ing on every transaction (instead of e.g. syncing on a fixed time-frame as postgres can be programmed to do)... that could result in el-cheapo Samsungs not being able to do any
Re: (Score:2)
Also, the Samsung 840 (non-Pro) uses TLC chips, with three bits stored per flash cell. They'd be the drives that suffer the most from that write amplification, along with the 840 EVO, which is similar and works very aggressively with very little overprovisioning.
The 840 Pro would have taken a lot longer to die, while still being a consumer drive. Still, that was an interesting experiment.
Re: (Score:3)
How is 700TB "endurance"? I copy near a TB of data from Backups at work almost daily. So 1-2 years (if that) is "endurance"? Screw that! Sounds more like modern SSD's suck hard and aren't designed to last past 1-2 years of work. I'll stick with traditional HD's until they figure out DRAM drives that don't need batteries or constant power.
How large is your backup filesystem(s)? This was 700TB written to a 250GB drive. If you're copying "near a TB of data from Backups ... almost daily", then I'm betting you have many many TB of storage in the backup pool... so divide that by 250GB and multiply that by 700TB and that's the endurance the SSDs would have. However, even then it doesn't really apply... your backups are not likely to be rewriting a lot of sectors (e.g. deduplication, if used, means few files are actually written). You also said you
Re: (Score:2)
1TB of writes per day to an SSD probably isn't a normal usage scenario for your average consumer. Samsung [samsung.com] for example claims that the average consumer writes no more than 10GB/day to an SSD:
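Taking that 10GB/day figure at face value, a quick sketch (using the ~700TB that even the weakest drive in the test absorbed) of how long that endurance would last an average consumer:

```python
GB = 10**9
endurance = 700 * 10**12   # bytes, the lowest total written in the experiment
daily_writes = 10 * GB     # the claimed average consumer workload quoted above

days = endurance / daily_writes
print(f"{days:,.0f} days, roughly {days / 365:.0f} years of average use")
# -> 70,000 days, roughly 192 years (flash data retention would give out far sooner)
```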
Re: (Score:2)
Re: (Score:3)
If you read the article (I know this is Slashdot) they explain that MWI is an Intel-only SMART attribute. They use different SMART attributes for the Kingston and Samsung drives.
Intel:
Re: (Score:2)
I'm using a 500GB 840 EVO as my main drive in my system. I've moved stuff like /var to a separate hard drive (because of log files and constant tiny reads/writes that aren't speed-sensitive), I do all compiles on a RAM disk, and I've upgraded to 16GB of RAM to avoid swapping.
Based on reviews etc., I fully expect the 840 EVO to outlast every other component in my PC.
Re: (Score:2)
Now if I could only justify the price... (The only computer where an SSD would be relevant for me is a laptop with a 500 GB HDD that typically sees heavy load. An SSD that fits my storage requi