Intel Hardware

New HyperThreading Flaw Affects Intel 6th And 7th Generation Skylake and Kaby Lake-Based Processors (hothardware.com) 135

MojoKid writes: A new flaw has been discovered that impacts Intel 6th and 7th Generation Skylake and Kaby Lake-based processors that support HyperThreading. The issue affects all operating systems and is detailed in Intel's errata documentation, which points out that under complex micro-architectural conditions, short loops of fewer than 64 instructions that use the AH, BH, CH or DH registers, as well as their corresponding wider registers (e.g. RAX, EAX or AX for AH), may cause unpredictable system behavior, including crashes and potential data loss. The OCaml toolchain community first began investigating processors with these malfunctions back in January and found reports dating back to at least the first half of 2016.

The OCaml team was able to pinpoint the issue to Skylake's HyperThreading implementation and notified Intel. While Intel reportedly did not respond directly, it has issued some microcode fixes since then. That's not the end of the story, however, as the microcode fixes need to be incorporated into BIOS/UEFI updates as well, and it is not clear at this time whether all major vendors have included these changes in their latest revisions.

  • Apocryphal .... (Score:4, Insightful)

    by brantondaveperson ( 1023687 ) on Sunday June 25, 2017 @04:51PM (#54688219) Homepage

    .. doesn't mean what the article writer appears to think it means.

    Anyhow, that a new, highly complex processor contains a subtle bug that's fixable without hardware modification isn't exactly earth-shaking news, surely? How about they just fix it, and we move on.

    • Comment removed based on user account deletion
      • Re:Apocryphal .... (Score:4, Interesting)

        by swilver ( 617741 ) on Monday June 26, 2017 @04:28AM (#54690139)

        I looked into HT a bit, and its performance gains.

        Basically, it comes down to this: as soon as you have real cores available, HT barely does anything and sometimes even becomes detrimental to performance. So if you have 1 core, HT shows some real benefits. With 2 cores it was pretty marginal, and with 4 cores or more you might as well disable it.

        • by Bengie ( 1121981 )
          There are work loads where HT makes the system ~10% slower, but there are also work loads that make the system ~100% faster, and a huge range of possible work loads in between. A one dimensional analysis is useless.
      • by arglebargle_xiv ( 2212710 ) on Monday June 26, 2017 @06:16AM (#54690385)

        If you don't know the model name of your processor(s), the command below will tell you their model names. Run it in a command line shell (e.g. xterm):
        grep name /proc/cpuinfo | sort -u

        C:\>grep name /proc/cpuinfo | sort -u
        'grep' is not recognized as an internal or external command, operable program or batch file.
        C:\>
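
        The same lookup as the grep one-liner can be sketched in Python, for what it's worth. This is only an illustration: the function name model_names is made up for this example, and it still depends on /proc/cpuinfo, so on Windows it just reports that the file is missing instead of failing on grep.

```python
def model_names(cpuinfo_text):
    """Return the sorted, de-duplicated 'model name' values
    found in /proc/cpuinfo-style text (hypothetical helper)."""
    names = set()
    for line in cpuinfo_text.splitlines():
        # Lines look like "model name<TAB>: <value>"; split at the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            names.add(value.strip())
    return sorted(names)


if __name__ == "__main__":
    # /proc/cpuinfo exists only on Linux; elsewhere we say so and stop.
    try:
        with open("/proc/cpuinfo") as f:
            print("\n".join(model_names(f.read())))
    except OSError:
        print("no /proc/cpuinfo on this system")
```

        Parsing the text in a separate function keeps the Linux-only file access in one place, so the parsing can be tried on pasted output from another machine.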

    • by Anonymous Coward

      How about they just fix it, and we move on.

      What about people on machines that haven't received bios updates in 5 years?

      "fixes need to be implemented into BIOS/UEFI updates as well and it is not clear at this time if all major vendors have included these changes in their latest revisions."

      • by Anonymous Coward

        Hmm, this is only about very recent Intel processors, so it is pretty much guaranteed that your system vendor can issue the update if you ask them.

        If they don't, I expect sooner or later Intel will start publishing the Kaby Lake microcode fixes just like they did with Skylake, and you can fix it that way. Yes, even on Windows, although it takes some hacking.

      • "fixes need to be implemented into BIOS/UEFI updates as well and it is not clear at this time if all major vendors have included these changes in their latest revisions."

        Methinks it's fixed in microcode like everything else, and the article even mentions that it is. Probably written by somebody who thinks everything non-windows means BIOS.

        • A microcode patch is included with the BIOS image and is loaded by BIOS at startup. This isn't the only way to do it -- it is also possible to load a patch from the O/S. But for most users the best solution is for BIOS to do this, and that means updating the BIOS image. So the article is correct.

    • by Picodon ( 4937267 ) on Sunday June 25, 2017 @05:35PM (#54688381)

      Are you complaining about the topic as being too insignificant to deserve an article (as in: no need to tell people that they may want to update their servers) or are you preemptively commenting that other readers shouldn’t bother to comment on such an insignificant topic?

      • Ah you feisty person, you. I bet the details of the bug would be super-interesting to lots of people, but the article glosses over that, and so I suppose I'm complaining that the article was insignificant (and not very good), the bug is significant but fixed, and I think everyone should comment about how microcode works, so that we can all learn something. Myself included, since I've no idea really.
    • The word apocryphal is not in the article, nor is there an update notice. Did they silently change it?
  • by Misagon ( 1135 ) on Sunday June 25, 2017 @05:03PM (#54688261)

    AMD Ryzen also seems to have a similar bug related to hyperthreading, one that happens only in very special circumstances.

    Quite a few Ryzen users have experienced instability problems during heavy compilation loads under Linux, especially those using compile-based distros such as Gentoo, but also under the Ubuntu subsystem on Windows.
    There has been some debate whether the problems would have been caused by an actual bug, or if the people who experienced them simply had an unstable overclock - the latter being something that has also cropped up in forums recently.

    Matthew Dillon, of Dragonfly BSD fame (and Amiga fame before that...) does believe that he has found a reproducible bug [phoronix.com]. He sent a test case about it to AMD in April.
    This is not the first time Dillon has found a hardware bug in an AMD CPU. He found one in an earlier AMD CPU back in 2012, which was fixed in a microcode update.

    I expect this to be fixed in a BIOS/microcode update soon, if not already in AGESA 1.0.0.6 - but I have yet to see any confirmation that it would have been fixed.

    • by gweihir ( 88907 ) on Sunday June 25, 2017 @05:07PM (#54688275)

      The difference is that Ryzen is a new architecture, where this is sort-of expected. Intel has this in an old architecture and that is just not acceptable.

      • WTF? No, a bug like this is NOT expected in a new architecture. Such bugs can result in billions in losses, something AMD can't afford. Having said that, people overclocking has always been a problem when it comes to stability.
        • by gweihir ( 88907 )

          You seem to be utterly clueless about how these things work. Please go away and play with LEGO or something.

        • by AmiMoJo ( 196126 )

          If you are losing billions due to a CPU architecture bug then you are an idiot.

          Even if you are somehow sure that the CPU has no bugs, you can't be sure it won't suffer from power glitches, cosmic rays and innumerable other things that can make it crash or get the wrong result. Not to mention the rest of the computer - ECC RAM can only correct single bit errors, assuming it has no faults.

      • > The difference is that Ryzen is a new architecture, where this is sort-of expected.

        No. In the sense that this is a hardware issue, not a software issue. Hyperthread issues are not expected because the hardware is "new". Software "engineering" is a joke compared to the rigor required in hardware development. I know you may not understand this, but (successfully employed) engineers of hardware are not allowed to fuck up. Manufacturing companies spend 1000x more in design, expertise, and quality con

        • by gweihir ( 88907 )

          You seem to never even have heard of processor errata sheets, let alone ever read one. And you seem to be completely unaware what hardware flaws have been historically found in CPUs. As I said to the other clueless response, please go away and play with LEGO or something.

          • Not at the same magnitude that they are today.

            Sure, lots of little things like the cpu will crash in some cases if a second level shadow of the carry flag register is set immediately after some other thing... so the fix is for the microcode to reorder the operations a little bit so that operations that target the carry flag are on the shallow side of the shadow registers... or at least never in that 2nd slot.. shit like that aint nothing

            It is the parallel execution itself that isnt working here. Wrong r
            • by gweihir ( 88907 )

              Well, yes, to a degree. But that _is_ as expected. As CPUs are using every last trick to get faster, the whole design necessarily gets more fragile and needs longer to stabilize. This is no surprise at all. The good thing is that it looks like the current AMD design will be around for a long, long time because I think we have reached the end of large performance improvements and this was the last step. Intel may have one more fundamental re-design, but they may also not, in particular if they cannot really

          • And you seem to be completely unaware what hardware flaws have been historically found in CPUs. As I said to the other clueless response

            Clueless? You're the one who equates this hyperthreading bug to some other bug that may crop up once in a billion operations, and doesn't even affect the computed result.

            As I said to the other clueless response, please go away and play with LEGO or something.

            I only got one response from you today. Now I know who likes using anonymous coward accounts to make personal attacks. Which I should forward to the mods, even though it's unlikely they'll care.

    • by Anonymous Coward

      Isn't one of the features of ryzen a more advanced turbo boost, why overclock?

    • There has been some debate whether the problems would have been caused by an actual bug, or if the people who experienced them simply had an unstable overclock

      If it only happens when you overclock, it's not a bug, it's ID10T problem.

    • by Megol ( 3135005 )

      A little reminder: overclocking is the practice of running a processor at a higher clock frequency than the manufacturer guarantees to be okay. When doing that, the user should always keep in mind that this can lead to errors without there being any problem with the processor, and so overclocking should only be done when the results of programs aren't important.

      TL;DR if the problem only occurs when overclocking then it isn't a bug.

      • by Misagon ( 1135 )

        Your post is a typical example of the behaviour that has hindered proper discussion on this problem.
        People read half of one paragraph from one forum-post and half of a paragraph in another thread and then post a knee-jerk response to something they don't understand.

  • by Anonymous Coward

    The linked FA does not contain a link to the original Intel DOC:
    https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/6th-gen-x-series-spec-update.pdf

    Unfortunately it does not contain much info...

  • Wow! I kept reading over and over trying to find how it was escalating ring level or information leak through cache etc. but couldn't find it! I reaaaaallly wasn't expecting any type of "flaw" on slashdot to not be about some dumb security mistake. Way to surprise me again, SLUSHBOT

  • It will be a cold day in hell before I buy another Intel CPU, let alone let them install microcode on my current CPU.
  • by Cochonou ( 576531 ) on Sunday June 25, 2017 @06:55PM (#54688683) Homepage
    It's a bit paradoxical that it was the OCaml team who found this bug, whereas OCaml is notoriously bad at parallelism.
  • by Anonymous Coward on Sunday June 25, 2017 @08:09PM (#54688935)

    There are a lot of inaccurate comments here. First of all, reloading a new BIOS/system firmware may be the best solution for most users, but it is not the only solution. If you know how, you can hot-load the new microcode under Linux and, I suspect, under other OSes.

    For example, I downloaded the latest firmware from Intel (dated 10 May) and placed it in /lib/firmware. Then running:

    echo 1 > /sys/devices/system/cpu/microcode/reload

    was enough. In the log is an entry:

    [2246029.695843] microcode: updated to revision 0xba, date = 2017-04-09
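
    To check that the reload took on every core, the log line can be compared against the "microcode" field that x86 Linux kernels list per CPU in /proc/cpuinfo. A rough sketch in Python follows; the function names are invented for this example, and the regular expression assumes exactly the log format quoted above, which other kernel versions may word differently.

```python
import re

# Assumed format, taken from the log entry quoted above:
#   microcode: updated to revision 0xba, date = 2017-04-09
UPDATE_RE = re.compile(
    r"microcode: updated to revision (0x[0-9a-fA-F]+), date = (\d{4}-\d{2}-\d{2})"
)


def parse_update_line(line):
    """Return (revision_as_int, date_string) from a kernel log line,
    or None if the line is not a microcode update message."""
    m = UPDATE_RE.search(line)
    if m is None:
        return None
    return int(m.group(1), 16), m.group(2)


def cpuinfo_revisions(cpuinfo_text):
    """Return the set of microcode revisions reported in
    /proc/cpuinfo-style text (one 'microcode' line per logical CPU)."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(int(value.strip(), 16))
    return revisions


if __name__ == "__main__":
    # Linux-only: more than one revision in the set would mean the
    # logical CPUs are not all running the same microcode.
    with open("/proc/cpuinfo") as f:
        print(sorted(hex(r) for r in cpuinfo_revisions(f.read())))
```

    This only reports what is loaded. Which revision actually contains the fix for a given model is not stated in this thread, so the sketch deliberately compares against nothing.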

    In addition, the article points to a message on the debian-devel (not users) mailing list. This indicates that i3/5/7 processors with hyperthreading are affected. AFAIK, no i5 processors have hyperthreading, even though the family/model/stepping on my system is indicated in the message as vulnerable.

    CPU(s): 4
    On-line CPU(s) list: 0-3
    Thread(s) per core: 1
    Core(s) per socket: 4
    Socket(s): 1

    Well what is it? Hyperthreading or all skylake/kaby lake? Curious minds want to know.
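
    Whether SMT is actually active on a given box can be read straight off that output: more than one thread per core means it is. A small Python sketch, assuming the "Key: value" layout shown above (the helper names are made up for this example):

```python
# Sample copied from the output shown above.
SAMPLE = """\
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
"""


def parse_fields(text):
    """Turn 'Key: value' lines into a dict of stripped strings."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def smt_active(fields):
    """True when each core runs more than one hardware thread."""
    return int(fields["Thread(s) per core"]) > 1


def logical_cpus(fields):
    """Logical CPU count implied by the topology fields."""
    return (
        int(fields["Thread(s) per core"])
        * int(fields["Core(s) per socket"])
        * int(fields["Socket(s)"])
    )


if __name__ == "__main__":
    f = parse_fields(SAMPLE)
    print("SMT active:", smt_active(f))
    print("logical CPUs:", logical_cpus(f))
```

    For the machine above this gives one thread per core, i.e. no hyperthreading in use, which is the point of the question. It says nothing about whether the erratum applies without HT.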

    One last thing. The current firmware package is dated May 10, seven weeks ago. The firmware itself was produced April 9 -- 11 weeks ago. Unless Intel has not yet released an update for this, many posters here are running around with their hair on fire about something already fixed.

    But I guess that is normal for slashdot.

    • by Anonymous Coward

      Mobile i3 and i5s can have hyperthreading enabled.

  • Always interested in the history of computing, I bought Andy Grove's "Only the Paranoid Survive", and was immediately intrigued by chapter one dealing with the Pentium Bug.

    I never made it to chapter two. Every focus was on controlling message and image. No acknowledgement this directly affected customers, no outreach, no mitigations. Much anger at people communicating a flaw in the product, and defying Intel's response plan and schedule.

    Seeing these reports of the response doesn't fix my impression.

  • For future Intel chips, the microcode is expanding in size at a rapid rate as Intel adds more advanced ISA features. That's now the primary focus, since there is not much to be gained from physical improvements.
  • Does this one get a nifty name?
