Data Storage Hardware

How Does Flash Media Fail? 357

bhodge writes "Aside from the obvious 'it stops working' answer, how does flash media, such as USB, SD, and CF, fail? Unlike with traditional hard drives, where anyone who's worked with computers for a while knows what a drive failure looks like, I don't know anyone who has experienced such a failure with flash. I haven't been able to find more than scant evidence of what such failures look like at the OS level. The one account I have found detailed using a small USB drive for /var/log storage; it failed very quickly, and then utterly (0 byte unformatted device), after five years of service in the role. This runs contrary to other anecdotal claims that you should still be able to read the media after you can no longer write to it. So my question is: what have you seen of the nature of flash media failure, if anything?"
This discussion has been archived. No new comments can be posted.

  • by Intron ( 870560 ) on Friday April 10, 2009 @10:14AM (#27531049)

    If a cell fails, you can't read or write that cell.

    If a gate fails in a page, you lose access to the page.

    If a gate fails in the overall control logic, you lose access to the whole device.

    Is there something I'm missing? Did you think there were oil changes or brake shoes? It's one silicon chip with metal on it.

  • Fail on write (Score:5, Insightful)

    by fishybell ( 516991 ) <.moc.liamtoh. .ta. .llebyhsif.> on Friday April 10, 2009 @10:15AM (#27531077) Homepage Journal
    The biggest difference I've encountered is when traditional hard drives fail, they fail on reading data back.

    Flash media fails when you write the data. In theory this means that you can always recover data as you can never write data to bad sectors. In practice the entire media device (CF, SD, etc.) fails at once.
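If failures tend to surface at write time, one way to notice them is to read data back after writing it. The following is a minimal sketch under stated assumptions: `write_and_verify` and `differing_bits` are hypothetical helper names, not part of any tool mentioned in this thread, and a read-back right after a write may be served from the operating system's cache rather than from the flash itself, so a match is weaker evidence than a mismatch.

```python
import os
import tempfile


def write_and_verify(path: str, data: bytes) -> bool:
    """Write data, flush it, read it back, and report whether it matches."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out to the device
    with open(path, "rb") as f:
        # Caveat: this read may come from the page cache, not the medium.
        return f.read() == data


def differing_bits(expected: bytes, actual: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, xor_mask) for every byte that differs.

    A mask that keeps reappearing at the same offset hints at a stuck bit.
    """
    return [(i, e ^ a) for i, (e, a) in enumerate(zip(expected, actual)) if e != a]


if __name__ == "__main__":
    # Demo on a temporary directory; point the path at the flash mount to test a stick.
    with tempfile.TemporaryDirectory() as tmp:
        probe = os.path.join(tmp, "probe.bin")
        print("write verified:", write_and_verify(probe, b"\xaa\x55" * 4096))
```

Alternating patterns such as `0xAA`/`0x55` are a common choice because every bit gets exercised in both states across two passes.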

  • If a cell fails, you can't read or write that cell.

    If a gate fails in a page, you lose access to the page.

    If a gate fails in the overall control logic, you lose access to the whole device.

    Is there something I'm missing? Did you think there were oil changes or brake shoes? It's one silicon chip with metal on it.

    What about redundancy and self-healing? How do those work?

  • by flyingrobots ( 704155 ) on Friday April 10, 2009 @10:24AM (#27531213)
    If the flash drive fails, yes you can continue to read from it, but you also have to consider what is meant by reading.

    You can always read the raw data from the device; that will never change. Nothing prevents the electrical signals from forming a proper read transaction on the I/O pins of the flash IC.

    However, when you consider the software that is on top of the raw data (a file system for example), this is where you will have the trouble.

    Older CF cards did not implement wear leveling; I don't know about newer ones. This being the case, the directory structure for a file would more than likely reside in the same physical location on the flash. Opening, writing, and closing a file with the same name would no doubt wear that space out as the directory entry gets hammered. Once that has "worn out", data is lost because the file system can no longer track it (even though the actual data may be viable).

    Also consider the device that does support wear leveling. At some point it will run out of places to wear. Some large files will remain static and won't move (they are only read), some files will be moved all over the device by the device's ASIC as the data in the file is updated or changed. At some point, the flash will run out of cells. This could happen as some critical directory entry is being updated, and the whole file system could be corrupted because there are no more viable flash cells to use.

    Your data might still be there in all its binary glory, but w/o a viable file system data structure to access it, well, you're toast. Unlike a hard drive that burped and lost a few bytes, a worn-out flash drive has no recordable medium available to do any file system data structure repairs.

    Kevin
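The contrast between a hammered directory entry and a wear-leveled device can be sketched with a toy model. This is an illustration only: `writes_until_failure` is a hypothetical function, the endurance figure is an assumed per-block limit rather than a vendor number, and the leveling here is idealized (every block is free to be remapped, which ignores the static files mentioned above).

```python
def writes_until_failure(num_blocks: int, endurance: int, wear_leveling: bool) -> int:
    """Count how many updates of one logical block succeed before no block can take it.

    endurance is an assumed per-block write limit, not a measured value.
    """
    wear = [0] * num_blocks
    writes = 0
    while True:
        if wear_leveling:
            # Idealized leveling: remap each update to the least-worn block.
            target = min(range(num_blocks), key=wear.__getitem__)
        else:
            # No leveling: the directory entry always lands on physical block 0.
            target = 0
        if wear[target] >= endurance:
            return writes  # no usable block left for this update
        wear[target] += 1
        writes += 1


if __name__ == "__main__":
    print("no leveling:  ", writes_until_failure(8, 100, wear_leveling=False))
    print("with leveling:", writes_until_failure(8, 100, wear_leveling=True))
```

In this model the leveled device survives `num_blocks` times as many updates, and then fails all at once, which matches the point that a leveled device eventually runs out of places to wear.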
  • Re:In my case (Score:5, Insightful)

    by AKAImBatman ( 238306 ) * <akaimbatman@gmaYEATSil.com minus poet> on Friday April 10, 2009 @10:27AM (#27531251) Homepage Journal

    Washing machines are pretty harsh places. You get tidal forces that will apply various physical stresses to the components. Rapid heating and cooling can cause expansion problems. Water can wear down contacts. Soaps can contaminate contacts or have negative chemical effects. So on and so forth.

    If it makes it to the dryer, your card could easily end up at temperatures outside the optimal storage range for the device. (Ever read those warnings, "Store between 70F and 100F"? Yeah, me neither.) These extreme temperatures, combined with the rapidity at which they're introduced, offer a cornucopia of ways your device could be damaged.

    In short, water isn't the real problem. It's all the stuff above and beyond that.

  • Re:In my case (Score:4, Insightful)

    by ptomblin ( 1378 ) <ptomblin@xcski.com> on Friday April 10, 2009 @10:28AM (#27531277) Homepage Journal

    Usually the case falls apart. I can still get the data off the drive, but I stop using it and just spend another $20 to get something with 8 times the capacity of the last time.

  • Re:In my case (Score:4, Insightful)

    by AKAImBatman ( 238306 ) * <akaimbatman@gmaYEATSil.com minus poet> on Friday April 10, 2009 @10:29AM (#27531283) Homepage Journal

    However since i'm lazy i don't want to hold it in forever so it's been retired.

    There's a fix for that. [wikipedia.org] :-P

  • Re:FAT (Score:3, Insightful)

    by AKAImBatman ( 238306 ) * <akaimbatman@gmaYEATSil.com minus poet> on Friday April 10, 2009 @10:35AM (#27531397) Homepage Journal

    A single power failure during a write can ruin a perfectly good SD card. It took me a single try.

    You're right, I think that's the most common situation people see these days. Most of the other posters are describing sudden, total failures, which are consistent with frying the drive rather than with bad-block failures. That's not all that different from losing a head on a hard drive.

  • by Vellmont ( 569020 ) on Friday April 10, 2009 @10:48AM (#27531611) Homepage


    Is there something I'm missing?

    Maybe the part where you assume everyone knows the above?

    Or how about the part where the submitter is asking about typical failure modes, not all possible failure modes?

  • FAT Failure (Score:3, Insightful)

    by ArcherB ( 796902 ) on Friday April 10, 2009 @10:52AM (#27531691) Journal

    When I was in the digital imaging kiosk business, we had to repair about three flash drives a week. A customer would put it in one of our systems and pull it out while it was being read, or it was a cheap drive or whatever. Either way, the customer would blame our systems for killing their drives (rightly or wrongly). Of course, it would contain pictures of their dead grandfather or ex-girlfriend naked or whatever was completely priceless and irreplaceable.

    The vast majority of the time, we would be able to run an application that would be able to recover whatever was on the drive. While I'm not certain of the original problem, the system acted as if the drive had no FAT (File Allocation Table... do I really need to say it?) on it or the FAT had become corrupted. This particular application would be able to go in and recover whatever was on the drive and most of the time repair the drive to its previous working state.

    I say it ACTED like the FAT was corrupt, but I don't know or care if a flash drive has a FAT on it. Could have been a hardware thingie in there that hiccuped. The repair utility acted much like a scan-disk that would repair an MBR or FAT and/or act like an undelete utility would, restoring the files on the drive.
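The poster does not say how the recovery application worked. One common technique that behaves like the "undelete" described above is signature carving: ignore the file system entirely and scan the raw bytes for file markers. The sketch below is an assumption about how such a tool could work, not a description of the kiosk software; `carve_jpegs` is a hypothetical name, and the approach is deliberately naive (embedded thumbnails and fragmented files will confuse it).

```python
SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker plus the lead byte of the next marker
EOI = b"\xff\xd9"      # JPEG end-of-image marker


def carve_jpegs(raw: bytes) -> list[bytes]:
    """Return byte ranges that look like JPEG files inside a raw device image."""
    found = []
    pos = 0
    while True:
        start = raw.find(SOI, pos)
        if start == -1:
            break
        end = raw.find(EOI, start + len(SOI))
        if end == -1:
            break  # start marker without an end marker: give up on the tail
        found.append(raw[start:end + len(EOI)])
        pos = end + len(EOI)
    return found
```

Because this never consults the FAT, it can pull pictures off a card whose allocation table is unreadable, as long as the underlying data blocks still read back.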

  • Re:In my case (Score:5, Insightful)

    by Bakkster ( 1529253 ) <Bakkster@man.gmail@com> on Friday April 10, 2009 @10:57AM (#27531743)
    Don't forget about the extreme static charges built up in a dryer. Even though most USB devices have mechanisms to prevent static damage, a dryer could overwhelm these protections. Regardless, an SSD failure should usually be due to the failure of the support electronics, not the storage itself.
  • Re:Flashmemory (Score:2, Insightful)

    by scatterbrained ( 144748 ) on Friday April 10, 2009 @10:57AM (#27531745) Journal

    google 'tin whiskers' and 'RoHS solder failures'

  • Re:In my case (Score:3, Insightful)

    by klaun ( 236494 ) on Friday April 10, 2009 @11:10AM (#27531909)

    Washing machines are pretty harsh places. You get tidal forces that will apply various physical stresses to the components. Rapid heating and cooling can cause expansion

    I'm sorry, tidal forces in a washing machine? Tidal forces are caused by gravity. It's an effect of the inverse square distance portion of the gravity force equation. They certainly exist in a washing machine as they do anywhere else subject to the effects of gravity, but no more so than anywhere else.

    Within the rotating frame of a washing machine drum, there are dynamic forces, centrifugal and Coriolis. I imagine that only the former is really significant, but I would think contact with an agitator or sides of the drum would subject the flash memory to far higher forces.
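A rough calculation backs up the distinction between tidal and centrifugal effects. The numbers are assumptions chosen for illustration (a 1200 rpm spin cycle, a 0.25 m drum radius, and a 2 cm long card); none of them come from the thread.

```python
import math

G_EARTH = 9.81     # m/s^2, surface gravity
R_EARTH = 6.371e6  # m, mean Earth radius


def centrifugal_accel(rpm: float, radius_m: float) -> float:
    """Centrifugal acceleration omega^2 * r in the drum's rotating frame."""
    omega = 2 * math.pi * rpm / 60.0  # angular speed in rad/s
    return omega ** 2 * radius_m


def tidal_accel(length_m: float) -> float:
    """Approximate difference in Earth's gravity across an object: 2 * g * d / R."""
    return 2 * G_EARTH * length_m / R_EARTH


if __name__ == "__main__":
    spin = centrifugal_accel(rpm=1200, radius_m=0.25)  # assumed spin cycle
    tide = tidal_accel(length_m=0.02)                  # assumed 2 cm card
    print(f"centrifugal: {spin:.0f} m/s^2 ({spin / G_EARTH:.0f} g)")
    print(f"tidal:       {tide:.2e} m/s^2")
```

Under these assumptions the centrifugal term comes out at a few hundred g, while the tidal difference across the card is many orders of magnitude below 1 g, so the tidal contribution really is negligible.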

  • by dannycim ( 442761 ) on Friday April 10, 2009 @11:25AM (#27532151)

    I've been running my home desktop/server (Linux 2.6) on a Sandisk Cruzer 8GB usb stick (root, swap, tmp, everything except large media files) for a year and four months without any glitches. I've napkin-calculated that at current usage and wear levelling, I should be able to use it for over 50 years without a failure. Funnily enough, the portable USB drive that I use to back it up failed last December. I keep multiple backups, I didn't flinch.

    Then again some flash devices fail miserably and silently. I've had a few 64MB and 128MB stick batches with stuck bits, and those were practically new. The operating systems they were used on didn't detect the errors, I did, by trying to open garbled files.

    My wish list: A SATA gizmo that has 4-5 USB connectors, each with its own bus, that presents itself to the SATA bus as a single drive and does RAID-5 automatically. That'd be sweet.
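A napkin calculation of the kind mentioned above might look like this. The poster does not give their inputs, so every number here is an assumption: 10,000 write cycles per block, 4 GB written per day, and perfect wear leveling with no write amplification. `years_until_worn` is a hypothetical helper.

```python
def years_until_worn(capacity_gb: float, cycles_per_block: int,
                     gb_written_per_day: float,
                     write_amplification: float = 1.0) -> float:
    """Estimate years until every block has used up its assumed write cycles.

    Assumes ideal wear leveling, so total writable volume = capacity * cycles.
    """
    total_writable_gb = capacity_gb * cycles_per_block
    effective_daily_gb = gb_written_per_day * write_amplification
    return total_writable_gb / effective_daily_gb / 365.0


if __name__ == "__main__":
    # Assumed inputs for an 8 GB stick; not the poster's actual figures.
    print(f"{years_until_worn(8, 10_000, 4):.1f} years")
```

The estimate is only as good as the leveling assumption: raising `write_amplification` or lowering the cycle count shortens the result proportionally.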

  • by aaron.axvig ( 1238422 ) <aaron@axvigs.com> on Friday April 10, 2009 @11:35AM (#27532305)

    There is a conceivable edge case:

    You perform an atomic write to sectors A and B. Write A succeeds, but write B fails as that sector is worn out. Then you try to roll back sector A, only to discover that sector is also now worn out. Boom, inconsistent file state.

    This would probably be a rare occurrence.
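The edge case can be reproduced in a toy model. This is a sketch under assumptions: `Sector`, `WornOut`, and `atomic_update` are hypothetical names, each sector is given an explicit count of remaining writes, and the "atomic" update is a simple write-then-undo scheme rather than any real file system's journal.

```python
class WornOut(Exception):
    """Raised when a sector has no write cycles left."""


class Sector:
    def __init__(self, data: str, writes_left: int):
        self.data = data
        self.writes_left = writes_left

    def write(self, data: str) -> None:
        if self.writes_left <= 0:
            raise WornOut()
        self.writes_left -= 1
        self.data = data


def atomic_update(a: Sector, b: Sector, new_a: str, new_b: str) -> str:
    """Try to update both sectors; return 'committed', 'rolled back', or 'inconsistent'."""
    old_a = a.data
    try:
        a.write(new_a)
    except WornOut:
        return "rolled back"  # nothing was changed yet
    try:
        b.write(new_b)
        return "committed"
    except WornOut:
        pass  # B is worn out, so try to undo A
    try:
        a.write(old_a)
        return "rolled back"
    except WornOut:
        return "inconsistent"  # A holds new data, B still holds old data


if __name__ == "__main__":
    a, b = Sector("a-old", writes_left=1), Sector("b-old", writes_left=0)
    print(atomic_update(a, b, "a-new", "b-new"), a.data, b.data)
```

The inconsistent outcome needs sector A to have exactly one write left while B has none, which is why the case is rare but not impossible.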

  • Re:flash failure (Score:4, Insightful)

    by clone53421 ( 1310749 ) on Friday April 10, 2009 @11:44AM (#27532429) Journal

    Actually, the rule of thumb is:

    backup your data, ESPECIALLY when it's on a flash based drive

  • by Anonymous Coward on Friday April 10, 2009 @12:03PM (#27532709)

    I have an old flash device which becomes smaller every time I plug it in. It's old, I mean REALLY old. One of the first generation of flash devices, so the write times are awful, and the life expectancy isn't very good, but I still check it every once in a while.

    It was originally 512 MB, but now it's dwindled to somewhere around 200 MB. Any given write isn't particularly likely to fail, but some files might disappear as you put something new on it. Deleting files often deletes whole folders, etc.

    Hope that gives you an idea.

  • I am shocked! (Score:1, Insightful)

    by Anonymous Coward on Friday April 10, 2009 @12:38PM (#27533137)
    I can't believe it. This is the first technically interesting Ask Slashdot I've seen in a very, very long time. It may be the first one I've ever seen that doesn't make me wonder why they didn't just Ask Google or RTFM.
  • Re:flash failure (Score:3, Insightful)

    by Midnight Thunder ( 17205 ) on Friday April 10, 2009 @12:53PM (#27533363) Homepage Journal

    I think the moral of this story is backup your data, even when it's on a flash based drive, and don't code directly on a cheap thumb drive :)

    Yup, this is important, but then again it's important because, for me, the single biggest cause of data loss related to thumb drives is: loss of drive.

    I would like to say that I am very careful with my drives, but the truth is the loop holding the drive to the key chain is usually very weak. The person in question also has something to do with it, but that is a little harder to change.

  • by gmccloskey ( 111803 ) on Friday April 10, 2009 @02:27PM (#27534457)

    hey, what have strippers ever done to deserve being classed with politicians?
