I call BS (Score:3, Insightful)
If the contents are lost in a week, we're probably talking about capacitor-backed SSDs that use some technology other than flash memory. Yes, it would be insane to use flash memory for archival purposes as well, but it should still easily retain its contents for at least a decade. When powered on, this problem does not exist, as the controller normally walks slowly through the flash, refreshing it.
Re:I call BS (Score:5, Informative)
The statements are actually completely accurate, but a bit misleading. First, this is about what JEDEC requires, not what actual SSDs deliver. Second, this is for SSDs stored idle at 55C. And third, the JEDEC requirements for minimum powered-off data retention are only 3 months at 40C for enterprise-grade SSDs and only 12 months at 30C for consumer SSDs. These are kind of on the low side, although I have lost some OCZ drives that were powered off for just about a year. (Never buying their trash again...)
That said, anybody conversant with SSD technology knows that SSDs are unsuitable for offline data storage, as the data potentially has a far shorter lifetime than on magnetic disks, which in turn have a far shorter data lifetime than archival-grade tape. There is absolutely no surprise here for anybody who bothered to find out what the facts are. Of course, there are always those who expect every storage tech to keep data forever, and those dumb enough to have no backups, or unverified ones, often on media not suitable for long-term storage. Ignoring reality comes at a price.
My personal solution is a mixed HDD/SSD RAID1 and, of course, regular backups.
Re: (Score:2)
My personal solution is a mixed HDD/SSD RAID1
Uhh... doesn't that mean that the RAID controller has to wait for the HDD on every read/write to verify it's the same as on the SSD, so effectively you get HDD performance?
Re:I call BS (Score:5, Informative)
Every write, not every read. Reads are satisfied as soon as either drive returns the data. And if the raid controller has a battery or supercap so it can cache writes, you'll almost never notice the difference.
Re: (Score:3)
"Reads are satisfied as soon as either drive returns the data. And if the raid controller has a battery or supercap so it can cache writes, you'll almost never notice the difference."
RAID controllers do not launch reads on all involved drives. That would be stupid.
Implementing battery backed write back cache on an array that uses SSD would be similarly stupid.
RAID 1 with mixed SSD/HDD is the worst of both worlds further complicated by people who don't understand it.
Re: (Score:3)
RAID controllers do not launch reads on all involved drives. That would be stupid.
?
For a RAID1, most RAID controllers (and software RAID implementations) will absolutely read from all devices so as to service the read ASAP.
For distributed parity forms of RAID, you inherently have to read from all devices.
For dedicated parity disk forms of RAID, you have to read from all devices except the parity device.
I've never tried a mixed RAID1 of SSD and magnetic disk, but with a large enough write cache the theory se
Re: (Score:2)
For dedicated parity disk forms of RAID, you have to read from all devices except the parity device.
I think the idea is to make a dedicated parity disk RAID with one data SSD and one parity HDD.
Re: (Score:3)
For a RAID1, most RAID controllers (and software RAID implementations) will absolutely read from all devices so as to service the read ASAP.
For distributed parity forms of RAID, you inherently have to read from all devices.
The problem is guaranteed with distributed parity raid; the controller will have to wait for the slowest disk to complete the read. Both reads and writes will be limited to mechanical disk performance levels.
With a RAID1 mirror set, you can get a performance improvement on reads since the SSD would presumably service all of them. Writes will still be delayed by the mechanical drive(s).
In addition, most RAID controllers do not support mixing drive types. Most of them don't even recommend mixing drive speeds.
Re: (Score:2)
For a RAID1, most RAID controllers (and software RAID implementations) will absolutely read from all devices so as to service the read ASAP.
No, almost every RAID1 controller I've ever encountered does not do that at all. It balances the reads across the drives so that it maximizes throughput and IOPS. Only when one drive attempts to read a sector and detects an error through its internal CRC checks, and is unable to rectify the error (a short retry period for RAID-class drives, a long one for desktop-class drives), will it request the data from the alternate drive and have the original drive correct itself.
Re: (Score:2)
Sorry, the same applies to distributed-parity and dedicated-parity arrays as well during reads.
During writes, all the data in a particular stripe needs to be read so that the correct parity can be calculated.
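(For example, with three data disks per stripe the parity is simply P = D1 xor D2 xor D3, so a partial-stripe write means either reading the rest of the stripe to recompute P, or reading the old data and old parity and patching: P' = P xor D_old xor D_new.)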
Re: (Score:3)
Implementing battery backed write back cache on an array that uses SSD would be similarly stupid.
How do you figure? Write to RAM is a whole lot faster than write to flash, especially if the flash block has to erase first.
Re: (Score:3)
RAID controllers do not launch reads on all involved drives. That would be stupid.
I think you mean that they do not launch a read request for the same chunk of data on all drives in a RAID mirror. That would be accurate. However, they usually will read from both drives (read chunk 1 from drive A, read chunk 2 from drive B... doing so in parallel can significantly increase read performance using a mirror).
RAID 1 with mixed SSD/HDD is the worst of both worlds further complicated by people who don't understand it.
Do you mean people like you?
Look up "md raid write-mostly", or try this page (one of many found): http://tansi.info/hybrid/ [tansi.info]
That setup is for a linux software RAID 1 mirror with one side
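For reference, a minimal sketch of such a write-mostly mirror (device names are hypothetical; assume the SSD is /dev/sda1 and the HDD is /dev/sdb1):
  # SSD listed first; devices after --write-mostly (the HDD) are read only as a fallback
  mdadm --create /dev/md0 --level=1 --raid-devices=2 \
    --bitmap=internal --write-behind=256 \
    /dev/sda1 --write-mostly /dev/sdb1
The internal bitmap plus --write-behind lets writes to the HDD lag in the background, so the array mostly performs like the SSD.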
Re: (Score:2)
Every write, not every read. Reads are satisfied as soon as either drive returns the data. And if the raid controller has a battery or supercap so it can cache writes, you'll almost never notice the difference.
Ah, I thought RAID1 would warn you somehow of bit flips which I assume would be the way heat-deteriorated storage would show up. Guess it won't, you'll need ZFS or something like that.
Re: (Score:2)
That's very dependent on whose implementation of RAID 1 you're using. I've seen everything from reading from one drive, to striped reads, to reading from both and comparing. Linux will actually let you choose from among some of those options.
ZFS and btrfs add a CRC for a group of blocks and can detect which drive has the bad data, correct it, and track that it happened.
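As a rough sketch of what that looks like with ZFS (the pool name 'tank' is just a placeholder):
  zpool scrub tank        # read everything and verify checksums; bad copies are rewritten from the good drive
  zpool status -v tank    # the CKSUM column shows which drive returned bad data
btrfs has the equivalent in 'btrfs scrub start' / 'btrfs scrub status'.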
Re: (Score:2)
Ah, I thought RAID1 would warn you somehow of bit flips which I assume would be the way heat-deteriorated storage would show up.
It does. The description of how RAID1 works was incorrect. No RAID controller that I am aware of implements RAID1 that way. That would include Dell's PERC RAID controllers, Intel's ICH RAID controllers, Adaptec RAID controllers, LSI's RAID controllers, RocketRAID controllers, and Windows' implementation.
Re: I call BS (Score:2)
use ZFS l2arc/zil or flashcache/dm-cache to get a happy medium.
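For the ZFS route, that's roughly one command per role (pool and device names hypothetical):
  zpool add tank cache /dev/sdc1                  # l2arc: read cache; losing the device is harmless
  zpool add tank log mirror /dev/sdc2 /dev/sdd2   # zil/slog: mirrored, since losing in-flight writes can hurt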
Re: (Score:3)
I've been thinking about getting a bunch of cheap usb sticks, building a zfs pool out of them with some redundancy, and then using that as a usenet spool just to see what can go wrong. If anything can abuse a disk, it's usenet.
Re: (Score:2)
The modern day floppy-raid. [wired.com]
Re: (Score:3)
I've done this, experimentally, using not-super-cheap 128GiB Patriot and HyperX usb3 sticks.
For a USENET load, performance will depend on whether your incoming feed is predominantly batched, streaming, or effectively random -- small writes bother these devices individually, and aggregating them into a pool works best if you can maximize write size. One way to do that is to use a wide stripe (e.g. zpool create foo raidz stick0 stick1 stick2 stick3 stick4 ...), which works well if your load is mainly batch
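Spelled out, that zpool create might look something like this (device paths hypothetical; ashift=12 assumes 4K-sector sticks):
  zpool create -o ashift=12 foo raidz \
    /dev/disk/by-id/usb-stick0 /dev/disk/by-id/usb-stick1 \
    /dev/disk/by-id/usb-stick2 /dev/disk/by-id/usb-stick3 \
    /dev/disk/by-id/usb-stick4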
Re: (Score:2)
Ah, cannot ETA, so: decent USB 3 sticks make *excellent* l2arc devices in front of spinny disk pools. They can often deliver a thousand or two IOPS each with a warm l2arc, and that would otherwise mean disk seeks. I use them regularly.
(The downside is that in almost all cases l2arc is non-persistent across reboots, crashes and pool import/export, and can take a long time to heat up so that a good chunk of seeks are being absorbed by them, so you're limited by the IOPS of the storage vdevs and the size
Re: (Score:2)
--Is there a decent USB3 thumbdrive brand you can recommend that is available from say, Amazon Prime? TIA
Re: (Score:1)
the RAID controller has to wait for the HDD on every read/write to verify it's the same as on the SSD
Drives use checksums, no need to read from a second drive if it succeeds from the first one.
Also, "write mostly" configurations for the HDD allow for most reads to be made from the SSD, with the HDD acting as a backup. Writes still need to be duplicated, but this can be done in the background.
Re:I call BS (Score:5, Informative)
I might be wrong but isn't it also when the SSD is stored at 55C AFTER having been stress tested at 55C to their endurance rating in terabytes written (page 39) under a given workload?
And even then, the cherry-picked value was in example data submitted by Intel for unknown hardware, very likely extrapolated, and quite possibly meaningless because it wasn't part of the chart targeted for the standard.
The article seems to have totally misrepresented the presentation's purpose, which is to lay out endurance-testing methodology/standards.
The only important values were on page 26, where they set the minimum requirements: 40C 8hr/day load with 30C 1-year retention for consumer (with a higher error ratio), and 55C 24hr/day load with 40C 3-month retention for enterprise (with a lower error ratio).
And it looks like they haven't actually worked out the consumer workload for testing yet.
Re: (Score:3)
Cherry-picked for "a week" yes, but still disturbing. It's not an issue for datacenters, but for offices.
Imagine an office PC set next to the radiator - oh, the employees are free to set up their desks as they like, and they really don't care about stuff like that. A given employee goes away for a holiday break for a month, taking the family on a skiing trip. The PC experiences 50C on a regular basis. That's quite enough to cause data loss.
Yes, in a responsible company there will be backups - or the data will
Re: (Score:2)
I don't know about that. At every office that I've ever done work at, even the women refuse to sit near or next to the radiators. That includes super cold areas where it hits -40C to -50C in the late fall/winter. And in the cases where the 'rad' is pumping out 55C temps, there is already 30-100 cm of space around it simply to prevent possible burns, which gives you plenty of space to normalize the air temperature. Most places are now on forced air from the ceiling.
Re: (Score:2)
Indeed. Laser print (or pigmented ink) on good paper will last > 100 years. The question is what you need. "Archival" media usually start at > 10 years assured, and for that, archival tape is the best option these days. It is also not that expensive: about 1500 EUR/USD should get you a suitable drive, and about 500 EUR/USD more should get you a starting supply of tapes. That is not expensive on any professional scale.