Data Storage Hardware

Intel Confirms Data Corruption Bug, Halts New SSDs

Posted by ScuttleMonkey
from the solid-state-death dept.
CWmike writes "Intel has confirmed that its new consumer-class X25-M and X18-M solid-state disk drives (SSDs) suffer from data corruption issues and said it has pulled back shipments to resellers. The X25-M (2.5-inch) and X18-M (1.8-inch) SSDs are based on a joint venture with Micron and used that company's 34-nanometer lithography technology. That process allows for a denser, higher capacity product that brings with it a lower price tag than Intel's previous offerings, which were based on 50-nanometer lithography technology. Intel says the data corruption problem occurs only if a user sets up a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer. When that happens, the SSD becomes inoperable and the data on it is irretrievable. This is not the first time Intel's X25-M and X18-M SSDs have suffered from firmware bugs. The company's first generation of drives suffered from fragmentation issues resulting in performance degradation over time. Intel issued a firmware upgrade as a fix."

  • Test before you ship (Score:5, Interesting)

    by alain94040 (785132) * on Monday August 03, 2009 @06:00PM (#28933773) Homepage

    Maybe they should have used HW/SW co-verification (like Seagate in that study [eve-team.com] - an example of how a storage company tests their firmware).

    For you software developers out there who enjoy free debuggers, you should know that we, hardware designers, also have our own debuggers. Except they are a little bit more expensive (think $500,000+) and can be quite bulky. But they are the only way to really test firmware before taping-out a chip.

    • by Anonymous Coward on Monday August 03, 2009 @06:33PM (#28934069)

      As a professional FW tester, I can say 1) firmware can be tested more easily than the hardware verification the parent is talking about, and 2) the parent is confusing HW verification with firmware verification. Don't confuse HW verification with firmware, and don't confuse software testing with hardware verification. They are vastly different from each other, and have their own sets of tools and methods (try sitting through a STAR East or STAR West seminar as a FW tester - it is a total waste of time).

      I can (and do) test firmware on buggy hardware all day long - it's not an issue.

    • Re: (Score:1, Insightful)

      by Anonymous Coward

      For you software developers out there who enjoy free debuggers, you should know that we, hardware designers, also have our own debuggers. Except they are a little bit more expensive (think $500,000+) and can be quite bulky. But they are the only way to really test firmware before taping-out a chip.

      Or, if you designed your FW properly (as a piece of modularized software running with stubs and drivers for testability), you could have tested it before dumping it to a live EPROM. Or are you proposing that this was a real hardware fault, and not a problem with the firmware?

      Sorry, your software is not a unique snowflake. I know you think it's special because it runs in an embedded environment, but if you chose to ignore what software developers have spent the last 60 or 70 years in developing best practi

    • You obviously have never ever worked on complex hardware/software. Unless there are only a few commands, you cannot possibly test all the various possible combinations! If you had read the article, you would have found that a reboot after the offending command was required before the problem showed ... not something that most testing regimes would have specified in the first place!

      A little knowledge is a dangerous thing!
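
To make the combinatorial point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `ToyDrive` model, its five command names, and the planted bug are invented for illustration and say nothing about Intel's actual firmware. It brute-forces every short command sequence and reports which ones leave the toy drive bricked.

```python
from itertools import product

# Hypothetical command set for a toy drive model (not a real ATA command list).
COMMANDS = ("set_pw", "change_pw", "disable_pw", "reboot", "read")


class ToyDrive:
    """Toy drive with a planted bug: altering a password, then rebooting, bricks it."""

    def __init__(self):
        self.has_password = False
        self.altered = False   # password changed/disabled since it was set
        self.bricked = False

    def run(self, cmd):
        if self.bricked:
            return
        if cmd == "set_pw":
            self.has_password = True
        elif cmd in ("change_pw", "disable_pw"):
            if self.has_password:
                self.altered = True
                if cmd == "disable_pw":
                    self.has_password = False
        elif cmd == "reboot":
            if self.altered:
                self.bricked = True   # the planted bug only shows after a reboot
        # "read" has no effect in this model


def total_sequences(length):
    """Number of distinct command sequences of the given length."""
    return len(COMMANDS) ** length


def failing_sequences(length):
    """Return every command sequence of the given length that bricks the toy drive."""
    bad = []
    for seq in product(COMMANDS, repeat=length):
        drive = ToyDrive()
        for cmd in seq:
            drive.run(cmd)
        if drive.bricked:
            bad.append(seq)
    return bad
```

In this toy model, only 2 of the 125 three-command sequences fail, and no two-command sequence does. The search space grows as 5^n, so exhaustive testing stops being practical quickly once the command set or the sequence length is realistic.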
  • Ugh... summary.... (Score:3, Informative)

    by blahplusplus (757119) on Monday August 03, 2009 @06:07PM (#28933833)

    "The company's first generation of drives suffered from fragmentation issues resulting in performance degradation over time."

    The performance degradation in the Intel X-25 is not because of a "firmware bug". All SSDs will suffer performance degradation whether or not their writing/wear leveling algorithms have been updated via firmware.

    • by ShadowRangerRIT (1301549) on Monday August 03, 2009 @06:26PM (#28933997)
      The X25-M's initial firmware was unusually bad; the degradation was more rapid and more severe than necessary. Thus, they issued a firmware update [slashdot.org]. The results were quite impressive [pcper.com]. It not only reduced the perf degradation, but it seems to have made writes faster across the board.
      • Re: (Score:3, Informative)

        by blahplusplus (757119)

        "Although Intel acknowledged that all of its SSDs will suffer from reduced performance because of significant fragmentation, the type of write levels needed to reproduce PC Perspective's results aren't likely for everyday users, whether they're running Windows and Apple's Mac OS X. Even so, it still released the firmware upgrade to slow fragmentation."

      • Re: (Score:3, Informative)

        by cecom (698048)

        The X25-M's initial firmware was unusually bad; the degradation was more rapid and more severe than necessary.

        Unusually bad? More severe than necessary? Not really. Even with this supposed degradation, it was ages ahead of any and all competition. What was unusually bad was the complete lack of understanding from all reviewers who did not understand basic principles and the fundamental limitations of flash and yet rushed ahead with their articles. Those poor fools expected that the driver should behave lik

        • by maxume (22995)

          Between spare sectors and the fact that sectors are not physical things (they are mapped), no, you won't hit the 10000 rewrite limit relatively quickly.

          To put it more clearly, recent wear leveling algorithms move full sectors, spreading writes over the entirety of the actual physical storage.
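
The mapping idea can be sketched in a few lines of Python. This is a toy model and not how any real controller works: the class name `ToyFTL`, the one-sector-per-block simplification, and the "pick the least-worn free block" policy are all assumptions chosen for illustration.

```python
class ToyFTL:
    """Toy flash translation layer: each logical sector maps to one physical block."""

    def __init__(self, logical, physical):
        if physical <= logical:
            raise ValueError("need at least one spare physical block")
        self.logical = logical
        self.mapping = {}                      # logical sector -> physical block
        self.free = set(range(physical))       # blocks available for new writes
        self.erase_counts = [0] * physical     # wear per physical block

    def write(self, sector):
        """Rewrite a logical sector, placing it on the least-worn free block."""
        if not 0 <= sector < self.logical:
            raise IndexError("logical sector out of range")
        # Least-worn free block; ties are broken by the lowest block index.
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        old = self.mapping.get(sector)
        if old is not None:
            # The stale copy is erased and returned to the free pool.
            self.erase_counts[old] += 1
            self.free.add(old)
        self.mapping[sector] = target
        return target


def hammer_one_sector(writes, logical=4, physical=5):
    """Rewrite logical sector 0 repeatedly and return the per-block erase counts."""
    ftl = ToyFTL(logical, physical)
    for _ in range(writes):
        ftl.write(0)
    return ftl.erase_counts
```

Rewriting the same logical sector 1000 times in this model spreads the 999 erases almost evenly over all five physical blocks instead of wearing out one. Note the limit of the sketch: it only relocates the sector being rewritten, so a block holding data that never changes would never rejoin the pool, which is the kind of worst case the replies below argue about.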

          • Re: (Score:3, Informative)

            by cecom (698048)

            Don't answer with generalities unless you have really thought about it. Wear-leveling is based on heuristics; since it cannot predict the future it is always possible to construct scenarios which will hit the worst case. And if it is theoretically possible, it will happen.

            Imagine a simple case and go from there. Imagine a flash with 5 blocks total, 4 sectors per block. The logical capacity is 16 sectors; the extra block is over-provisioned for wear leveling, etc. Now, imagine that you have the 4 blocks neat

      • What makes Intel a hard disk vendor anyway? Yes, it is still a disk. Storage expertise, which Intel doesn't have, is a huge factor, along with software support.

        Other alternatives? There are "OCZ" and Samsung. What kind of software support do they give? Zero. Samsung can't even produce pages without English spelling mistakes.

        Call me old fashioned, but I am waiting, and will continue to wait, until Seagate and Western Digital do real stuff, not "we can do it too" stuff, if you understand what I mean.

        • by magarity (164372) on Monday August 03, 2009 @10:31PM (#28935823)

          What makes Intel a hard disk vendor anyway? Yes, it is still a disk
           
          It's solid state mass storage, where "solid state" = "chips". A disk is a spinning thingy which is completely different. Since Intel designs and makes chips (see: "solid state" = "chips"), it is a perfect choice for them to make solid state mass storage devices out of chips.
           
          Have I mentioned the relationship between "solid state" and "chips" and how "solid state" != "spinning thingy"?

    • by Krizdo4 (938901) on Monday August 03, 2009 @06:30PM (#28934035) Homepage

      The performance degradation in the Intel X-25 is not because of a "firmware bug".

      Bugs can cause slowdowns, too

      Though it's highly regarded, Intel's X25-M SSD had a firmware bug that adjusted the priorities of random and sequential writes, leading to a major fragmentation problem that dropped throughput dramatically. The issue was originally uncovered by PC Perspective after two months of testing. Those tests showed that write speeds dropped from 80MB/sec. to 30MB/sec. over time, and read speeds dropped from 250MB/sec. to 60MB/sec. for some large block writes.

      https://www.techworld.com.au/article/302571/ssd_performance_--_slowdown_inevitable?pp=3 [techworld.com.au]

      Before firmware update

      the result suggested a write speed of 30 MB/sec.

      http://pcper.com/article.php?aid=691&type=expert&pid=3 [pcper.com]

      After firmware update

      After composing myself, I did the same file copy I had tried earlier. 76 MB/sec.

      http://pcper.com/article.php?aid=691&type=expert&pid=4 [pcper.com]

      Not a firmware bug?

      • "Although Intel acknowledged that all of its SSDs will suffer from reduced performance because of significant fragmentation, the type of write levels needed to reproduce PC Perspective's results aren't likely for everyday users, whether they're running Windows and Apple's Mac OS X. Even so, it still released the firmware upgrade to slow fragmentation."

      • by MobyDisk (75490)

        Adding an optimization does not mean that the previous revision was a bug.

    • by Eil (82413)

      The performance degradation in the Intel X-25 is not because of a "firmware bug". All SSDs will suffer performance degradation whether or not their writing/wear leveling algorithms have been updated via firmware.

      You're missing several months of history here.

      Back in February, several reviewers found that the X-25's performance fell to unacceptably low levels after a certain threshold was reached. Intel tried to deny it, saying that you'd never see the problem in real-world usage and only benchmarking the dis

      • by amorsen (7485)

        I remember it completely differently. The way I remember it went like this:

        Anandtech discovered that write performance on JMICRON controllers (not used by Intel) went to practically zero with time. The writer (and other publications I believe) went looking for the same issue in non-JMICRON controllers, and discovered that while Intel controllers were by far the least affected, they still suffered some degradation. Intel quickly updated their firmware, while everyone else (who had much more severe issues) ei

        • by Eil (82413)

          Anandtech discovered that write performance on JMICRON controllers (not used by Intel) went to practically zero with time. The writer (and other publications I believe) went looking for the same issue in non-JMICRON controllers, and discovered that while Intel controllers were by far the least affected, they still suffered some degradation. Intel quickly updated their firmware, while everyone else (who had much more severe issues) either fixed it later or not at all.

          It was my understanding that the performa

    • The performance degradation in the Intel X-25 is not because of a "firmware bug". All SSDs will suffer performance degradation whether or not their writing/wear leveling algorithms have been updated via firmware.

      1) As ShadowRangerRIT [slashdot.org] pointed out, it is a bug.
      2) These [fusionio.com] don't suffer performance degradation, so your "all" comment is 100% incorrect.

      However, if you want to apply that statement to all crappy consumer SSDs powered by Intel or JMicron controllers, then I will happily admit defeat.

  • I find it difficult to really blame them for this. What an obscure bug. How do you QA yourself out of something like that without spending more than you did on your R&D?

    • by hf256 (627209)
      I would have agreed with you on the obscure part if it only occurred when the password is disabled. But to occur on password change and reboot seems more like an obvious case to me?
      • Re: (Score:3, Insightful)

        Not really. Making an educated guess from the article, it appears that this is implemented as a simple controller lockout, not actual encryption. So swapping the flash memory into another controller (common computer forensics technique) would bypass it. Most people paranoid enough to want a disk password want real encryption, so using Intel's half-measure of a password is likely a very uncommon scenario. The tests are probably very simple; glossing over this case would be an understandable, if not desir
    • by Pyrion (525584) *

      Take a down payment from your users as a massive discount in exchange for them signing on as "beta testers." If they actually find something wrong with the product and send in problem reports, then they get to keep the product for just that initial down payment so long as they keep sending in problem reports. If no problem reports come in within a given amount of time, bill them the remainder of the MSRP on the product, since it obviously works well enough for their uses.

      I guarantee you something like this

    • Re:Well.. (Score:5, Interesting)

      by rickb928 (945187) on Monday August 03, 2009 @06:37PM (#28934113) Homepage Journal

      Is this a cost issue, or a thoroughness issue?

      No, we don't catch every possible scenario here, either, but we do try very, very hard. I know one of the coders in Intel's RAID driver group, and he goes crazy with this stuff. And he just writes Linux drivers. I do not envy him - this past year, every bug he's had to fix has been caused by someone else's code. Someone not writing Intel drivers. And he gets slammed every time for bad testing, as if he can test all the rest of the kernel team's stuff, not to mention every fly-by-night Chinese hardware outfit. They're killing him.

      I can't even say 'ext4', he just goes insane. Though he chuckles when I whisper 'ReiserFS', and opens another beer.

      I'm glad I'm not in that line of work.

      • by syousef (465911)

        I can't even say 'ext4', he just goes insane. Though he chuckles when I whisper 'ReiserFS', and opens another beer.

        Perhaps a competitor has discovered this and hired someone to whisper "ReiserFS ReiserFS ReiserFS" in his ear repeatedly. That would explain the bugs. He's coding drunk.

      • Just curious, though I know why he runs screaming from the room when you say ext4: is his chuckle for ReiserFS a good thing or a bad thing? I'm enough of an Asperger's baby to miss out on the subtleties of his reaction.
  • Intel says the data corruption problem occurs only if a user sets up a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer.

    What does this mean? The flash drive has a password lockout? If so:

    (1) a password lockout on a drive is daft, you want to encrypt the drive or not worry about it.

    (2) flash drives trashing themselves irretrievably when you reboot after enabling passwords? I've seen that before, on "secure" thumb drives. I won't have anything to do wit tha

    • a password lockout on a drive is daft, you want to encrypt the drive or not worry about it.

      That's hardly daft. I have motion-detecting laser bullets in my foyer, but I still lock my front door.
  • Feature Not A Bug (Score:5, Insightful)

    by mrbene (1380531) on Monday August 03, 2009 @06:32PM (#28934061)

    Seriously, I'd say this is in the By Design bucket. For the security conscious - set a BIOS password. If the (feds/aliens/wife/others) remove the password, all access to the data is gone.

    Brilliant! Secure!

    Mind you, not being able to change my password once every other day might hinder my current security model.

    • by Tycho (11893)

      It is important to set the password on the hard drive itself and delete the password in the BIOS when "they" come. Setting a BIOS password for the computer itself is the only option on many desktop computers and would be a waste of time. When "they" come they will boot the computer, see the password, giggle madly, mock you, turn the computer off, disassemble the computer, remove the drives and happily read the contents of the hard drives on another computer. For really stupidly broken motherboards, and r

  • "the data corruption problem occurs only if a user sets up a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer". A password protected SSD? Can someone please explain? I must be new to computers...
    • Yes, you must be new to computers since hard disks have had passwords for years. It was a popular feature in the "enterprise" market before full-disk encryption became practical.

    • by ihavnoid (749312)

      Password protection has been supported for a long time, and is part of the standard ATA specification. Although it typically has nothing to do with full-disk encryption, it was more or less enough to keep honest people honest, and to add a little bit of cost and effort to bypassing it.

      Many RAID controllers use this feature to prevent the user from connecting a RAID-formatted hard drive to a normal ATA controller, thereby accidentally destroying all data. Unlocking the drive is a non-issue, since they use the same passwor

  • by owlstead (636356) on Monday August 03, 2009 @06:37PM (#28934107)

    Although this bug should have been caught faster it seems that it is possible to update the firmware without any data loss (fortunately I have put it in a laptop, power outages are no problem). I've looked at the Intel site and the flash utility seems to be simply bootable from CD - if this is the last bug I'll be a very happy punter indeed.

    My 80 GB G2 SSD replaced a not too fast laptop drive. I'm now trying Linux, but I'll try Vista as well just for fun - I'll just write my 80 GB to an external drive using GParted. These drives come highly recommended even if they would slow down to 50% of performance (which, it seems, they don't). I unzipped Eclipse to it and JavaDoc and I could see that the archiver that unzipped the .zip has some performance issues reading the index. It took longer than the unzipping and gunzipping and untarring (the Eclipse gunzipping/untarring took less than 2 seconds - yikes). The only thing faster is the tmpfs in RAM which I used to compile the OpenJDK in on my "workstation". Starting Eclipse now takes less time on my laptop than on my workstation even though it has half as many cycles.

    • by D Ninja (825055)

      My 80 GB G2 SSD replaced a not too fast laptop drive. I'm now trying Linux, but I'll try Vista as well just for fun - I'll just write my 80 GB to an external drive using GParted. These drives come highly recommended even if they would slow down to 50% of performance (which, it seems, they don't). I unzipped Eclipse to it and JavaDoc and I could see that the archiver that unzipped the .zip has some performance issues reading the index. It took longer than the unzipping and gunzipping and untarring (the Eclipse gunzipping/untarring took less than 2 seconds - yikes). The only thing faster is the tmpfs in RAM which I used to compile the OpenJDK in on my "workstation". Starting Eclipse now takes less time on my laptop than on my workstation even though it has half as many cycles.

      This just goes to show how much of a bottleneck traditional hard drives really are. A friend of mine recently replaced his hard drive with an SSD and I was extremely impressed by the speed improvement - so much so that I'm considering installing an SSD on my computer as the primary drive and using the second as backup space.

      • by Pyrion (525584) *

        If your OS is small enough, skip the Flash SSD altogether, get 4GB of cheap DDR memory and a Gigabyte i-RAM SSD and put your OS on that.

  • by neokushan (932374) on Monday August 03, 2009 @06:44PM (#28934171)

    "How to recover lost/corrupted files from an SSD?"

    • Those who flame us whenever we say "it is early, don't beta test storage hardware" should come up and answer them, especially when it is, predictably, personal memories which have no backup.

      In the enterprise environment which the X-25 was originally designed for, data loss is not a huge problem. They have all kinds of backups, verification, mirroring and cool filesystems like ZFS. When it comes to the personal data of an ordinary OS X or Windows user, the problem begins. Whenever they suggest an untested technology to ordi

      • by Tycho (11893)

        And the CSR on the other end when called should not be able to mute, end, or transfer the call without supervisor assistance.

    • by rdnetto (955205)

      Put it in a freezer for a bit...

      ---
      For those who don't get it, the above post is humour, not ignorance.

  • by Anonymous Coward
    Conservatively, 40% of Seagate's high-capacity (1TB+) drives have suffered from a firmware bug which bricked the drive. Seagate has promised free data recovery + firmware fix on affected units - not many people know this! So if your SATA or external Seagate has failed recently on boot, you may be able to recover the drive and your data free. Customer support is very sketchy but if you keep trying for the free data recovery you will succeed. http://www.engadget.com/2009/01/19/seagate-offers-fix-free-data-rec [engadget.com]
  • by imscarr (246204)

    It sounds like Signetics WOM (Write Only Memory) to me! http://www.national.com/rap/Story/WOMorigin.html [national.com]

  • by JakFrost (139885) on Monday August 03, 2009 @08:33PM (#28935055)

    This seems like a very unlikely trigger for most users. In my personal and professional experience, I have yet to see anyone who even knows about BIOS passwords, much less about setting a password on the drive using the ATA secure drive password feature. I am surprised that this was caught at all, unless it was a complete fluke or there actually are people or companies using this type of feature for security. (I don't doubt it, but I haven't seen it.)

    I personally own the first generation Intel X25-M 80GB MLC SSD [intel.com] and I have written about it extensively here on this forum. I heard rumors that the new TRIM feature support will only be made available to this second generation release of these drives, but I'm unsure if that is really true. I'm on the fence right now about whether I should sell my G1 drive and upgrade to the G2 for this feature and a little more performance, because I am so happy with the performance of this drive and with the current 8820 firmware that solved the fragmentation and slowdown issues.

    If you are one of those folks who is still sitting around not knowing what to do when all of this Solid State Disk news is coming out all over then you are missing the biggest paradigm shift to computing performance since the transfer from floppy disks to hard drives.

    With the upcoming re-release of this newly affordable drive around 2009-08-28 from Intel X25-M G2 80GB MLC SSD at ~$230 USD from Newegg [newegg.com] or ZipZoomFly [zipzoomfly.com] you should definitely dig down deep and save a little money to buy one of these drives and experience the biggest performance and responsiveness improvement to your computer that you could imagine.

    If you need a primer on the SSD revolution check out my previous post regarding the articles to read.

    Required Reading for Solid State Drives (Score 1) [slashdot.org]

    • by Ilgaz (86384)

      I am extremely old fashioned in regards to hard drives. I'm not buying until something at a normal price comes out from my 2 vendors, Seagate and Western Digital. They have done storage for years.

      Basically Intel is a CPU vendor/monopoly. Not a GPU vendor or a hard disk manufacturer.

      • by karnal (22275)

        Intel makes chips.

        Graphics cards have chips. Granted, they don't necessarily pander to the high end.

        Flash drives have chips. Intel can make chips.

        Intel. Chips. Enjoy.

        • by Luthair (847766)
          Intel and other SSD manufacturers are getting a free ride on reliability and performance. When these types of problems occur in the storage world it can be game over for the manufacturer.
      • by gordyf (23004)

        "I dunno about this chip-based storage from the biggest chip manufacturer in the world. I'm gonna wait until a company that has never made Flash makes Flash-based storage instead."

        Yeah, that makes total sense.

    • IIRC some laptops will automatically set a hard disk password if you set a bios password.

  • by AllynM (600515) * on Monday August 03, 2009 @09:29PM (#28935419) Journal

    Welcome to 2 weeks ago:

    http://www.pcper.com/comments.php?nid=7544 [pcper.com]

    Allyn Malventano
    Storage Editor, PC Perspective

  • by Allnighterking (74212) on Tuesday August 04, 2009 @01:42AM (#28936949) Homepage
    I've seen this before, though I can't remember where. In that case, changing or removing the password would corrupt the password file and lock you out. Setting the password the first time (when none exists yet) does the following:
    • read the password
    • hash the password
    • write the hash to the data file

    The problem in that case came when you wanted to change or delete the password. That used a second subroutine, which would:

    • read the old password
    • get the old password hash and use it to check if the user knows the correct password
    • get new password (twice and compare)
    • hash the result of the diff of the first entry and the second entry for the new password

    That last step was the killer. It seems that someone had declared a global variable and a local variable with the same name. The end result was that one overwrote the other's data, so you never knew exactly what the box had hashed, nor could you figure out what to key in to unlock the door (so to speak).
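
For anyone who hasn't been bitten by this class of bug, here is a rough Python analogue. It is only a sketch of the general failure mode described above, not the code from that product (which was presumably not Python); every name in it (`result`, `passwords_differ`, `hash_scratch`, the `store` dict) is made up for illustration.

```python
import hashlib

# Hypothetical module-level scratch variable shared by the helper routines.
result = ""


def digest(text):
    """Hex SHA-256 of a string; stands in for whatever hash the device used."""
    return hashlib.sha256(text.encode()).hexdigest()


def passwords_differ(first, second):
    """Compare the two entries and leave the outcome in the global scratch variable."""
    global result
    result = "differ" if first != second else "same"
    return first != second


def hash_scratch():
    """Hash whatever currently sits in the GLOBAL `result`."""
    return digest(result)


def change_password_buggy(store, old, first, second):
    if digest(old) != store["hash"]:
        return False
    if passwords_differ(first, second):
        return False
    # BUG: this assignment creates a LOCAL `result`; the global still holds the
    # comparison outcome ("same"), and that is what hash_scratch() hashes.
    result = first
    store["hash"] = hash_scratch()
    return True


def change_password_fixed(store, old, first, second):
    if digest(old) != store["hash"]:
        return False
    if first != second:
        return False
    store["hash"] = digest(first)   # hash the new password itself, passed explicitly
    return True
```

After `change_password_buggy` reports success, the stored hash matches neither the old password nor the new one, so the user is locked out. The fixed variant avoids the shared scratch variable entirely and passes the value it means to hash.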

    • VNC? I'm sure I've encountered the same problem. Setting an initial password is fine, but trying to update it invalidated the password.
  • This news is days and days old, very old.
    Anyone who cares has long since known about this! What we want to know now is when the patch is coming out for existing owners, and when the god damned disks will be going back on shelves.
    There is going to be even more demand for the things; as soon as they are re-listed, prices are going to skyrocket at the retailers.

    Also, on this note, it's August 4th where I am right now, Windows 7 is available within about 72 hours internationally for certain MSDN subsc

  • by Waccoon (1186667) on Tuesday August 04, 2009 @07:11AM (#28938537)

    Ask anyone who bought a JMicron-based SSD about insufficient testing. How any company thought that controller was worthy for their SSDs is beyond me.

    Before I replaced mine with a Samsung SSD, my [censored] was regularly giving me stutters and pauses that lasted for 20-40 seconds at a time. It just flat-out halted everything on the computer for half a minute for no apparent reason, even while reading, not just writing. Apparently, this was predominant behavior for the controller that dominated the SSD arena until the X-25 started blowing people away.

    I think I understand now why Seagate, WD, and the other HD manufacturers are taking so long to get SSDs on the market. Since their market depends almost exclusively on storage, they can't afford to screw up their first SSDs. At least, I hope that's the reason. Even they have to understand that the hard drive market isn't going to last forever.

  • confirm the data errors in my Phison SSD, but the thing's been booting since somewhere around mid 2008.
