Some Core i7 5960X + X99 Motherboards Mysteriously Burning Up

An anonymous reader writes "Intel's Haswell-E Eight-Core CPU and X99 motherboards just debuted but it looks like there may be some early adoption troubles leading to the new, ultra-expensive X99 motherboards and processors burning up. Phoronix first ran a story about their X99 motherboard having a small flame and smoke when powering up for the first time and then Legit Reviews also ran an article about their motherboard going up in smoke for reasons unknown. The RAM, X99 motherboards, and power supplies were different in these two cases. Manufacturers are now investigating and in at least the case of LR their Core i7-5960X also fried in the process."
This discussion has been archived. No new comments can be posted.

  • HCF (Score:5, Funny)

    by Anonymous Coward on Saturday September 06, 2014 @05:15PM (#47843009)

    Seriously don't execute the halt and catch fire instruction.

    • Re:HCF (Score:5, Funny)

      by infolation ( 840436 ) on Saturday September 06, 2014 @05:52PM (#47843175)
      As the late Murray Walker said...

      "...there's nothing wrong with the motherboard, except that it's on fire"

      • by Anonymous Coward

        ...Murray Walker isn't dead

    • Comment removed based on user account deletion
    • Re: HCF (Score:5, Funny)

      by Selivanow ( 82869 ) <selivanow@gmail.com> on Saturday September 06, 2014 @07:47PM (#47843569)

      Maybe they need to include NOSMOKE.SYS in their CONFIG.SYS file.

    • Seriously don't execute the halt and catch fire instruction.

      I think it can be simply counteracted with a quick Ctrl+Alt+ohfuck...
      But it does make one long for the days of the three fingered salute.

    • by gweihir ( 88907 )

      Is that another of the NSA instructions added with RDRAND? Seriously, would not really surprise me, the NSA is after sabotaging anything these days.

      I gather however that this is plain incompetence (Dunning-Kruger-Type) with regards to the voltage regulators. Switching voltage regulation is really hard to do right unless you over-engineer seriously. You can get all sorts of bizarre effects, including a puff of smoke.

      • Re:HCF (Score:5, Insightful)

        by CaptQuark ( 2706165 ) on Sunday September 07, 2014 @01:14AM (#47844627)

        I gather however that this is plain incompetence (Dunning-Kruger-Type) with regards to the voltage regulators. Switching voltage regulation is really hard to do right unless you over-engineer seriously. You can get all sorts of bizarre effects, including a puff of smoke.

        I appreciate the irony of you mentioning the Dunning-Kruger syndrome with your statement. Switching voltage regulation has been around for over 30 years and isn't much of a mystery. Since the early motherboards started reducing voltages from 5v down to 3.3v (and below), every motherboard has had on-board voltage regulation. It's hard to believe that something as fundamental as a switching regulator would suddenly exceed the engineering skill of the motherboard designers.

        ~~

        • by Agripa ( 139780 )

          I gather however that this is plain incompetence (Dunning-Kruger-Type) with regards to the voltage regulators. Switching voltage regulation is really hard to do right unless you over-engineer seriously. You can get all sorts of bizarre effects, including a puff of smoke.

          I appreciate the irony of you mentioning the Dunning-Kruger syndrome with your statement. Switching voltage regulation has been around for over 30 years and isn't much of a mystery. Since the early motherboards started reducing voltages from

        • by gweihir ( 88907 )

          Really, you know nothing about this issue, the irony is entirely imagined on your side.

          Switching power regulator design is among the hardest things in power electronics. It has gotten easier, but even well-known designs based on manufacturer application notes (and that is the extent to which most designers have "mastered" anything here) can literally blow up in your face if you get something wrong or the situation is not quite as expected. There is nothing "fundamental" here. You need a specialized design fo

        • An engineer for Intel mentioned once that the CPU power regulators now, although low voltage, pass enough amps to do arc welding. Just think about that for a second.
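
For a rough sense of scale, current is just power divided by voltage. A minimal sketch, assuming an illustrative 140 W package power and a 1.0 V core rail (both figures are assumptions picked for the arithmetic, not measurements from either failed board):

```python
def rail_current_amps(power_watts: float, rail_volts: float) -> float:
    """Return the DC current a supply rail must carry: I = P / V."""
    if rail_volts <= 0:
        raise ValueError("rail_volts must be positive")
    return power_watts / rail_volts


# Hypothetical figures, chosen only to illustrate the scale involved.
ASSUMED_POWER_W = 140.0
ASSUMED_CORE_V = 1.0

if __name__ == "__main__":
    amps = rail_current_amps(ASSUMED_POWER_W, ASSUMED_CORE_V)
    print(f"{amps:.0f} A at {ASSUMED_CORE_V} V")
```

Under those assumptions the rail carries on the order of 140 A, while the same power drawn at 12 V would need under 12 A, which is why the low-voltage side of the regulator is the part that has to handle the heavy current.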

    • "AMD Really Is On Fire This Generation!" Put that on the cover of the mag.

  • by Anonymous Coward

    Adopt a fire extinguisher

  • by oic0 ( 1864384 ) on Saturday September 06, 2014 @05:25PM (#47843049)
    I bought one of the first AM boards and they said it was rated for high watts. I made the power regulator shoot flames 10 minutes after I had it together. They lowered the rated power handling and refunded me but lame Newegg made me pay return shipping...
  • Buck feta (Score:1, Offtopic)

    by KiloByte ( 825081 )

    Why in the blazes does the inter-Slashdot link lead to the Beta travesty? If I, due to some mental malady, preferred Beta, I'd set it as my default.

    Why does Dice even keep Beta afloat after it failed? I seriously hope the plans to make it the main -- and only -- interface are gone. Oh well, there's still Soylent...

  • Looks like... (Score:4, Informative)

    by Kyrubas ( 991784 ) on Saturday September 06, 2014 @05:36PM (#47843107) Journal

    ...a failure to contain the magic smoke.

  • Easy to repair (Score:5, Informative)

    by ArcadeMan ( 2766669 ) on Saturday September 06, 2014 @05:40PM (#47843131)

    All you need is this little kit [sparkfun.com].

  • by qrwe ( 625937 )
    The remotely destructible chip is finally here. I feel sorry for the guinea pigs, though...
    • Finally, it's been around for a while [intrawebnet.com]

      Well, maybe not a single chip but the concept anyways.

    • by gweihir ( 88907 )

      Funny as that sounds at first glance, it actually makes a lot of sense. In an ordinary mainboard, the CPU cannot directly influence the voltage regulators, but in these CPUs, it can, and hence self-destruct has become possible. After the NSA transparently sabotaged the RDRAND design (the design is insecure, individual CPUs may or may not be sabotaged, but the design basically makes it impossible to tell), it would not surprise me one bit if they actually had Intel add a self-destruct as well. We really need

      • by gweihir ( 88907 )

        Ah, sorry. These are not the CPUs with integrated voltage regulators. Still possible but harder to add this type of self-destruct.

  • Not just one mobo (Score:5, Informative)

    by gman003 ( 1693318 ) on Saturday September 06, 2014 @05:47PM (#47843159)
    Since nobody reads TFA, Phoronix killed an MSI X99S, and LR lost an Asus X99 Deluxe. It was also different RAM (Corsair vs G.Skill). However, both reported the burn was near the VRMs (Phoronix also reported a second event near the northbridge). The two mobos might be using identical parts for that, but I was unable to find out for sure.
    • Re:Not just one mobo (Score:5, Interesting)

      by Forever Wondering ( 2506940 ) on Saturday September 06, 2014 @06:32PM (#47843279)

      Does Intel have a reference design board for this? Also, how close are the VRMs to the chips they're regulating?

      I once worked at a company that had a reference board with 3 FPGAs with 3 VRMs near the FPGAs. When designing their own board, the company reduced this to one VRM for all 3 FPGAs and put the VRM on the opposite side of the board. It took nine months to realize that this caused the FPGAs to reset during heavy logic switching because the single VRM + the greater length of the traces meant that the VRM couldn't keep up with the demand.

      • Re:Not just one mobo (Score:5, Informative)

        by ShanghaiBill ( 739463 ) on Saturday September 06, 2014 @07:05PM (#47843399)

        It took nine months to realize that this caused the FPGAs to reset during heavy logic switching because the single VRM + the greater length of the traces meant that the VRM couldn't keep up with the demand.

        FPGAs use synchronous logic, so they pull power in spikes as the logic switches. If it took 9 months to realize there was a problem, you can probably make some small modifications to get it working reliably. Make sure the leads from the VRM are as fat as possible, preferably have it feed into full ground and power layers, and make sure no other traces are splitting those planes. Clock all three FPGAs from the same xtal, and use a delay gate or tune the length of the traces so the signal is skewed enough that the power spikes from each FPGA are not hitting simultaneously. Add plenty of decoupling caps on every power and ground pin. Make sure the caps have leads that are fat and short. It is better to have a physically small cap (0201 or 01005) in close than a bigger one further out. Good luck.

        • by PopeRatzo ( 965947 ) on Saturday September 06, 2014 @08:40PM (#47843787) Journal

          "FPGAs use synchronous logic, so they pull power in spikes as the logic switches. If it took 9 months to realize there was a problem, you can probably make some small modifications to get it working reliably. Make sure the leads from the VRM are as fat as possible, preferably have it feed into full ground and power layers, and make sure no other traces are splitting those planes. Clock all three FPGAs from the same xtal, and use a delay gate or tune the length of the traces so the signal is skewed enough that the power spikes from each FPGA are not hitting simultaneously. Add plenty of decoupling caps on every power and ground pin. Make sure the caps have leads that are fat and short. It is better to have a physically small cap (0201 or 01005) in close than a bigger one further out. Good luck."

          It's like Penthouse Forum for nerds.

          • It's like Penthouse Forum for nerds.

            Not quite - It didn't start with "I never believed that this would happen to me..."

        • You're quite right about all that you mentioned. My example was from a 2004 timeframe.

          I can't recall if the faulty design cut back on bypassing/deglitching as well as the VRMs, but my guess would be yes.

          Afterwards the engineer [that caused the problem in the first place] added back all bypassing/deglitching, shorter leads, VRMs, etc. that got elided from the [then] new design. The engineer was definitely in the dog house for this: for creating the problem and taking so long to diagnose it.

          Personally, bas

        • by tlhIngan ( 30335 )

          The problem is not the fix - once you know the problem is power, it's trivial to fix.

          The problem is identifying the root cause. Power problems are highly subtle - and usually very intermittent. The FPGAs may crash under heavy load, but it's one of the "phase of moon" bugs because you can feed in the same test patterns that crash it and it'll work the next time around.

          And bugs that are impossible to replicate are the hardest ones to fix - especially if it's a new board that requires a new change to the RTL s

          • The problem is identifying the root cause. Power problems are highly subtle - and usually very intermittent.

            Power problems are also very common. If you have intermittent failures, it should be the first place you look. They are also easy to diagnose: If you have a failure once a week, then remove some decoupling caps. Now it fails every hour or so. Remove a few more caps, and now it fails in minutes or seconds. Once you are sure it is a power problem, it is straightforward to remedy. Add more capacitance. Check your ground and power layers. etc.

            • Thanks for what you just said. Dead on and I couldn't agree more.

              In our case, we were a small [startup] company, so we didn't have the resources to be second guessing each other. I was doing the device drivers, but I'm also 50% EE. When we found out what had happened, we were struck speechless that the first thing to check [IMO, yours, and the opinion of some of the other engineers] wasn't checked. Sigh.

          • The problem is not the fix - once you know the problem is power, it's trivial to fix.

            ShanghaiBill is correct. Power is the first thing to check/suspect. In our case, the other engineering team members assumed the lead engineer had checked this--because it is so fundamental. He hadn't. He was almost fired for this.

            The problem is identifying the root cause. Power problems are highly subtle - and usually very intermittent. The FPGAs may crash under heavy load, but it's one of the "phase of moon" bugs because you can feed in the same test patterns that crash it and it'll work the next time around.

            We had no problem generating test vectors that caused the problem to occur once per hour.

            And bugs that are impossible to replicate are the hardest ones to fix - especially if it's a new board that requires a new change to the RTL so you're not exactly sure if it's a hardware or software problem. Or even a compiler problem (since half the issues can easily be caused by bugs in the compiler).

            We were quite confident that it was a hardware problem because both boards were 100% compatible software [device driver]-wise. My device drivers also would log all access to the board in r

      • The real problem would have been inadequate bypassing at the FPGA. From the point of view of high-speed logic, power comes from capacitors, not voltage regulators.
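
The point about capacitors can be put into numbers with the ideal-capacitor relation dV = I * dt / C: during a fast load step, before the regulator loop has time to react, the local capacitance alone supplies the current. A minimal sketch that ignores ESR and trace inductance, with every component value hypothetical:

```python
def droop_volts(step_amps: float, duration_s: float, capacitance_f: float) -> float:
    """Voltage sag of an ideal capacitor supplying a constant current step."""
    if capacitance_f <= 0:
        raise ValueError("capacitance_f must be positive")
    return step_amps * duration_s / capacitance_f


def min_capacitance_farads(step_amps: float, duration_s: float, max_droop_v: float) -> float:
    """Smallest ideal capacitance that keeps the sag within max_droop_v."""
    if max_droop_v <= 0:
        raise ValueError("max_droop_v must be positive")
    return step_amps * duration_s / max_droop_v


if __name__ == "__main__":
    # Hypothetical case: a 10 A step that lasts 1 microsecond before the
    # regulator catches up, fed from 100 uF of local bypass capacitance.
    sag = droop_volts(10.0, 1e-6, 100e-6)
    needed = min_capacitance_farads(10.0, 1e-6, 0.05)
    print(f"sag: {sag * 1000:.0f} mV, needed for a 50 mV budget: {needed * 1e6:.0f} uF")
```

Real boards do worse than this ideal figure because lead and trace inductance add to the sag, which is consistent with the advice elsewhere in the thread to keep caps small and close to the pins.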

    • Since nobody reads TFA, Phoronix killed an MSI X99S, and LR lost an Asus X99 Deluxe. It was also different RAM (Corsair vs G.Skill).

      However, both reported the burn was near the VRMs (Phoronix also reported a second event near the northbridge). The two mobos might be using identical parts for that, but I was unable to find out for sure.

      I've had 7 Asus motherboards burn up in the past 4 yrs. 2 actually caught fire. So that's no surprise to me; Asus is on my banned list.

      MSI, however, has been nothing but good to me. They don't generally have the fastest or most feature-rich boards available, but reliability's been their strong suit over the years.

      • I have yet to ever have an Asus mobo fail on me after many years and many boards
        • I kind of do my own non-profit business of building computers for everyone I know or am related to. So I've got a small business account with newegg and do about $25k in computers a year. Asus was my board of choice for years, but about 3yrs ago they just went to shit. I've no idea why but suddenly I had massive failures, massive compatibility issues, etc... When a computer I build actually catches fire, that worries me. Asus was decent about the RMAs... which actually worried me more. A MB manufacturer wil

          • by PopeRatzo ( 965947 ) on Saturday September 06, 2014 @08:51PM (#47843829) Journal

            Also on my banned list:

            OK, no Asus or Gigabyte. I'm gonna build a new game rig. Which companies should I use? I've had good luck with Asus motherboards, but I only make a new computer every 3-4 years or so.

            Personally, I'm surprised every system I assemble doesn't burst into flames, but that's only because I'm not really expert at these things. I hold my breath whenever I have to plug a CPU into a motherboard or slop that silver goop on top of one when I'm attaching a cooler. Once many years ago, I attached a motherboard without putting in those little round standoffs onto the case and it just sort of went "zzzt!" and then smelled like a vacuum cleaner when the belt burns. I took it back to MicroCenter and wept and moaned and they actually gave me a new one. Since then, I make sure to keep a fire extinguisher and a pint of vodka on hand when I build a system. The vodka is to keep my hands from shaking.

            I know I should just go with one of the outfits on the internet that assembles gaming PCs, but I'll probably end up doing the next one myself.

          • You can't just dig around in the RMA'd parts bin and ship some other broken piece of crap back to me.

            Well, they obviously can so. Companies like that need to go bankrupt.

          • by K10W ( 1705114 )

            I kind of do my own non-profit business of building computers for everyone I know or am related to. So I've got a small business account with newegg and do about $25k in computers a year. Asus was my board of choice for years, but about 3yrs ago they just went to shit. I've no idea why but suddenly I had massive failures, massive compatibility issues, etc... When a computer I build actually catches fire, that worries me. Asus was decent about the RMAs... which actually worried me more. A MB manufacturer will rarely take a return with scorch marks on it unless they know there's an issue. When the RMA boards I got back from them started blowing caps as well, I knew something was terribly wrong.

            Also on my banned list: Gigabyte - I had several Gigabyte MB and Gigabyte Video cards. They would not work with each other and Gigabyte claimed it was a capability issue and not their problem, despite having put their names on both the card and the board! This was purely a customer service issue, they should have shipped me a different card to make things right.

            Zotac - For 2yrs I shipped the same video card back to them over and over again. They just kept replacing it with defective cards. Some came to me dirty, or with blown components. You can't just dig around in the RMA'd parts bin and ship some other broken piece of crap back to me. I'm currently awaiting about the 4th RMA on that card and my warranty will run out. At least they're paying for the shipping.

            Anyways, I'm done building computers for people. Components are just too unreliable now. I don't need to be spending half my life in the UPS shipping office.

            I've found MSI makes the list too, and a lot of others find the same. It seems common across a wide range of rig builders' experiences, and there are no guaranteed reliable manufacturers now IMO. I just use a handful of UK stores with great support who replace, no questions asked, for free and honour the warranties for products; the 3 main places I use cover everything without specifics for 12 months but do 3 to 5 years on certain products. I always always use PSUs that are reliable and tested by good sources. Takes a week of readin

          • > They just kept replacing it with defective cards.
            I've seen a few companies do this over the years. They just keep sending defective parts until you give up in frustration or they go out of business.

        • That's not my experience. I've always wondered why Asus has been held in such high regard when I've found their stuff to be pretty much crap, dating back to at least the Socket A days. Not just motherboards either, as their video cards are just as flaky and die just as quickly, and don't buy their laptops either unless you need a paperweight. Heck, I'd buy ECS before buying Asus. The quality may not be any better but I'd at least save myself some money.

  • Just a failure of the magic smoke sealant.
  • Houston ... (Score:5, Funny)

    by Forever Wondering ( 2506940 ) on Saturday September 06, 2014 @05:50PM (#47843169)

    ... We've had a main B bus undervolt

    • by distilate ( 1037896 ) on Saturday September 06, 2014 @06:27PM (#47843265)
      It looks like we are venting something
      • by Anonymous Coward
        Roger. We copy your venting.

        Dramatic Music Intensifies
    • Re:Houston ... (Score:4, Insightful)

      by Ed_1024 ( 744566 ) on Sunday September 07, 2014 @07:37AM (#47845343)
      I was interested by the actions of the user of the motherboard on one of the linked articles. Initially this happened:

      "The system came up, hung for a very short time and then powered off with a audible click of the Corsair AX860i power supply. If you have ever heard the loud click of the Over Current Protection (OCP) shutting down the PSU you know exactly what click I heard. Now when I press power button on the motherboard the system clicks after being on for a split second. I unplugged all the cables on the power supply and did the built-in self-check and it passed with flying colors. I still swapped out the PSU with a backup Corsair AX860i and the same click was to be heard. and it is doing the same thing (Corsair AX860i). After clearing the CMOS, removing the memory, SSD and video card the system still would not post. At that point in time I switched to a non-digital power supply (Corsair AX1200) and it did the same thing although this time the OCP took a little longer to kick in. There was some audible crackling noises, followed by some smoke near the CPU VRM heatsink. So, the heart shattering smell of burnt electronics filled the room..."

      10/10 for investigative journalism but putting more and more juice into something that is continually tripping out the power supply is not going to have a happy ending. Maybe some of the $1,400-worth of motherboard and processor may have been salvageable if he had stopped at the first warning?

      If the circuit breaker pops twice on a ring main at home, do you a) replace the circuit breaker with a bigger one, b) hold it in until smoke appears from behind the wall or c) do some serious investigation and/or call an electrician before putting the power back on?
  • Kind of a small sample size to categorize these failures as "some boards", with the implication that "some" has on perception. Admittedly two mobos failing in spectacular fashion out of the relatively few that have shipped into customers' hands is a bit troubling.
    • by Anonymous Coward

      It's *REVIEW* Boards. Even assuming the reviewers bought them off the shelves, having two fail spectacularly that were different brands and memory, but the same CPU/Chipset raises some eyebrows.

      Assuming the failures were similar and the non-discrete components along the failure paths were not from the same manufacturer, it would sound like either a design flaw in the reference implementation, or a manufacturing defect in either the CPU or chipset.

      • by Anonymous Coward

        The board in the Phoronix article wasn't an engineering sample / review board; the author mentioned buying the board from Newegg...

      • by gweihir ( 88907 )

        Actually, the chip-set is pretty irrelevant. The damage observed would indicate that the CPU draws a lot more current, maybe in short spikes, than what the voltage regulators can handle. It may also be that the CPU causes instabilities. Switching power regulators can be tricky, and they certainly are at the voltages (very low) and currents (very high) we are talking about here.

        • Switching power regulators can be tricky, and they certainly are at the voltages (very low) and currents (very high) we are talking about here.

          Citation?

          ~~

          • by gweihir ( 88907 )

            Experience cannot be gotten from citations. Experience is something you have to acquire yourself. For this case here, even an introductory text will warn you, so stop being lazy.

  • It's not burning up, it's just burning off the machine oil.
  • by dbc ( 135354 ) on Saturday September 06, 2014 @07:10PM (#47843419)

    From the photos and the write-ups, it looks like a voltage regulator is failing. So, maybe a spec in the data sheet is wrong (for reasons from typo to ooops, we didn't compute that rating correctly...) or maybe a parts vendor for that regulator had a bad-batch day. It happens. Years ago I was involved in one of the latter... "Which date codes do you want us to pull from the parts crib again? I think we have about $2 million of the bad ones...." -- at least that time I was on the customer side, which has much less impact on your sleep schedule.

  • In the old days, before computers went solid state, smoking on startup was often put down to worn valve-guides.

    Damn... I think Americans call them "tubes" -- in which case the joke doesn't work :-(

  • .. looks like consumers are on the bleeding edge.

  • Aren't these chips covered in Intel's new thermal insulation paste? Maybe it's just insulating a bit too well?
  • by Lussarn ( 105276 ) on Sunday September 07, 2014 @04:09AM (#47844923)

    It was a bad motivator?

  • by Anonymous Coward

    Anyone who has a 6 or 8 core AMD FX chip will know the troubles with motherboard makers and VRM quality. If you plan to really use those chips then you better have a board with quality VRMs and proper cooling. If you use water cooling, then no airflow is going over the VRM heatsink. If you use a side to side air cooler, the situation is the same.

    Overclock.net has had people complain about this very issue for years.

    http://www.overclock.net/t/943109/about-vrms-mosfets-motherboard-safety-with-125w-tdp-proce
