AMD Bug Security Hardware

AMD 'Zenbleed' Bug Leaks Data From Zen 2 Ryzen, EPYC CPUs (tomshardware.com)

Monday a researcher with Google Information Security posted about a new vulnerability he independently found in AMD's Zen 2 processors. Tom's Hardware reports: The 'Zenbleed' vulnerability spans the entire Zen 2 product stack, including AMD's EPYC data center processors and the Ryzen 3000/4000/5000 CPUs, allowing the theft of protected information from the CPU, such as encryption keys and user logins. The attack does not require physical access to the computer or server and can even be executed via JavaScript on a webpage...

AMD added the AMD-SB-7008 Bulletin several hours later. AMD has patches ready for its EPYC 7002 'Rome' processors now, but it will not patch its consumer Zen 2 Ryzen 3000, 4000, and some 5000-series chips until November and December of this year... AMD hasn't given specific details of any performance impacts but did issue the following statement to Tom's Hardware: "Any performance impact will vary depending on workload and system configuration. AMD is not aware of any known exploit of the described vulnerability outside the research environment..."

AMD describes the exploit much more simply, saying, "Under specific microarchitectural circumstances, a register in "Zen 2" CPUs may not be written to 0 correctly. This may cause data from another process and/or thread to be stored in the YMM register, which may allow an attacker to potentially access sensitive information."

The article includes a list of the impacted processors with a schedule for the release of the updated firmware to OEMs.

The Google Information Security researcher who discovered the bug is sharing research on different CPU behaviors, and says the bug can be patched through software on multiple operating systems (e.g., "you can set the chicken bit DE_CFG[9]") — but this might result in a performance penalty.
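
The "chicken bit" workaround mentioned above comes down to a read-modify-write of a single bit in the DE_CFG register. Here is a minimal sketch of just the bit arithmetic; it does not touch hardware. The MSR address 0xC0011029 is an assumption (it is not stated in this summary and should be checked against AMD's documentation), and the function names are hypothetical.

```python
# Assumed MSR address of DE_CFG; not given in the summary, verify before use.
DE_CFG_MSR = 0xC0011029
# Bit index of the "chicken bit" DE_CFG[9] named in the summary.
CHICKEN_BIT = 9


def with_chicken_bit(de_cfg_value: int) -> int:
    """Return the DE_CFG value with bit 9 set and every other bit unchanged."""
    return de_cfg_value | (1 << CHICKEN_BIT)


def chicken_bit_is_set(de_cfg_value: int) -> bool:
    """Report whether bit 9 is already set in a DE_CFG value."""
    return bool((de_cfg_value >> CHICKEN_BIT) & 1)


if __name__ == "__main__":
    current = 0x1034  # hypothetical value read from the MSR
    updated = with_chicken_bit(current)
    print(hex(DE_CFG_MSR), hex(current), hex(updated), chicken_bit_is_set(updated))
```

Writing the value back needs privileged MSR access on every core, and as the summary notes, it might come with a performance penalty.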

Thanks to long-time Slashdot reader waspleg for sharing the news.
  • I always felt my processors could use a good bleeding.

  • by lsllll ( 830002 ) on Sunday July 30, 2023 @12:08AM (#63724868)

    but it will not patch its consumer Zen 2 Ryzen 3000, 4000, and some 5000-series chips until November and December of this year

    Makes me wonder if the performance hit is just too much with the current patch and they're afraid of losing market share because of it. They've had a good run with Ryzen.

    • microcode (Score:3, Informative)

      by Anonymous Coward
      They have to test the fuck out of changes to microcode to make sure nothing else breaks, and they've got a metric butt-ton of products to test - three series of Zen chips, and they already had work planned for the people who will do the development and testing. It's not like they sit around reading Playboy magazine like a bunch of firefighters waiting for the bell to go ding-ding-ding. Most likely they have products launching and they won't have free resources until after those launch. This is a pretty low prior
      • Playboy? No one reads playboy. This isn't the 80s. Obviously they're not reading playboy.

        However, they are watching porn all day.

      • by lsllll ( 830002 )
        What drug are you on bud? Is that you, AMD? It affects a whole bunch of Ryzen processors which aren't meant for the data centers. And the fact that it's exploitable via Javascript in a browser should give some pause as well, since that's what users do; you know, browse the web. Additionally, the CVE [nist.gov] hasn't yet been evaluated, so no score. I don't know why you think it's low priority. Finally, December is 5 months away. Coming up with a microcode update should not take 5 months. If it would have
        • Glad I bought an Intel. Go ahead and hate me.
        • You would have a better chance of getting hit by lightning than manage an exploit via JavaScript. Is it technically possible? Yes. do any javascript virtual machines actually utilize the instructions impacted? Nope. What you really need is to be able to execute a specifically crafted set of instructions in order to really exploit it.

          It affects a whole bunch of Ryzen processors which aren't meant for the data centers.

          The lion's share of profits come from server CPUs, which is why they are first. Again, javascript is a non-issue.

          Coming up with a microcode update should not take 5 months.

          Coming up with a microcode update only takes a day or two at mo

          • by Anonymous Coward

            You would have a better chance of getting hit by lightning than manage an exploit via JavaScript. Is it technically possible? Yes. do any javascript virtual machines actually utilize the instructions impacted? Nope. What you really need is to be able to execute a specifically crafted set of instructions in order to really exploit it.

            The bug involves leaving prior data in the YMM registers during a switch to a new context.
            Normally, a new context has them reset to zeros, and context switching restores the previous values for that context.
            New processes (threads) will not get zeros, but some other processes YMM data instead, due to the bug.

            XMM and YMM registers are used by many opcodes and extended instructions.
            SSE, AVX, VEX, likely others as well.
            Quite a few of those are available to access inside a JS VM.

            It is questionable how much initi

            • Your description isn't quite accurate.

              When you do predicted code execution, your data _must_ stay physically where it is, so that it can be restored if the prediction failed. So while you are executing predicted code, other processes must be prevented from writing to the physical location. For two obvious reasons: One, if another process can write to that location, and then my process decides the branch was mispredicted and tries to restore the data, then my process gets incorrect data and probably crash
          • You would have a better chance of getting hit by lightning than manage an exploit via JavaScript. Is it technically possible? Yes. do any javascript virtual machines actually utilize the instructions impacted? Nope. What you really need is to be able to execute a specifically crafted set of instructions in order to really exploit it.

            It affects a whole bunch of Ryzen processors which aren't meant for the data centers.

            The lion's share of profits come from server CPUs, which is why they are first. Again, javascript is a non-issue.

            Coming up with a microcode update should not take 5 months.

            Coming up with a microcode update only takes a day or two at most. What takes the most time is finding the optimal solution and testing it.

            If you're really concerned about it then there is the "chicken bit" which allows you to disable the feature entirely.

            I suspect that the microcode fix is to block other firmware from exploiting the bug while it is doing something it was designed to do that won't be fixed. I haven't researched the matter at all, so again, it's just a suspicion. So... watch your Rust code, because I also suspect this bug is exploitable with it. IDK.

          • by bn-7bc ( 909819 )
            There is a lot of server-side JS around though, the whole Node thing IIRC
        • > the fact that it's exploitable via Javascript in a browser

          That doesn't seem to be a fact - just a claim somebody made.

          Based on Cloudflare's blog it's probably only technically true on 4+-yr old browsers without timing-attack mitigation features.

          Practically it appears to not be true for any modern use case.

      • by dougmc ( 70836 )

        This is a pretty low priority because multi-user systems typically run Epyc not Ryzen.

        They say the exploit can even be run via javascript (and presumably other sandboxed languages that we usually think of as safe), so it could be a viable attack even against a typical desktop machine only used by one person.

        Also, even the /. summary makes it clear that Epyc is affected as well.

        The 'Zenbleed' vulnerability spans the entire Zen 2 product stack, including AMD's EPYC data center processors and the Ryzen 3000/4000/5000 CPU

        • They say the exploit can even be run via javascript

          Tom's hardware makes that claim. However, no evidence has been presented that javascript VMs can trigger this bug. It can be reliably triggered using a nonsensical string of instructions but no evidence has been presented showing that javascript can pull this off.

          • Not quite a "nonsensical string".

            The instruction with the broken rollback is going to be hit frequently due to its use in libc.
            After that, there are 2 things that the CPU must be triggered to do in order to force it into a broken state. These things have to be done in a semi-strict timing window, but they do not have to be issued sequentially. Both of those things will also be done in libc, though triggering them via a JS VM will require intimate knowledge of the JIT. Absolutely doable though.

            Trivial t
            • triggering them via a JS VM will require intimate knowledge of the JIT. Absolutely doable though.

              If you were certain that the instruction was used, then maybe. However, you don't even know if it is.

              • If you were certain that the instruction was used, then maybe.

                Yup. Not maybe, then yes. The conditional is very plain. If this, then that.

                However, you don't even know if it is.

                There's zero chance whatsoever that this can't happen.
                As I said earlier, they're all prevalent within libc for good reason- they're part of modern vectorization optimizations.
                There's no fucking chance in hell that a modern JS JIT VM doesn't use the faulty instruction, or the instructions that trigger the faulty rollback.

                There has been speculation that Spectre mitigations for JS will be helpful here, but I don't really see ho

                • There's zero chance whatsoever that this can't happen.

                  So I decided to stop guessing and cloned and searched the source code for Chromium's V8 engine [googlesource.com] and Firefox [github.com], which bundles its SpiderMonkey engine.

                  For V8, I ran grep -ir "vzeroupper" v8/ and it returned nothing at all.

                  For Firefox, SpiderMonkey is the "js/src" directory, so I did grep -ir "vzeroupper" gecko-dev/js/src/ and the returned results were all from "gecko-dev/js/src/zydis/", which is a copy of the Zydis x86_64 disassembler (see https://github.com/zyantific/z... [github.com]).

                  vzeroupper is not used in JavaScrip

                  • Uh, lol.

                    Why on earth do you think you could pull C and C++ source, and grep for an assembly mnemonic?

                    You didn't pick up that you may have been barking up the wrong tree when the only place you found the mnemonic was in a disassembler that turns machine code into mnemonics?
                  • Just for shits and giggles, and for educational value for my fellow slashdotter who likes to speak with authority in regimes they have no actual knowledge, I decided to do an actual evaluation of JS engine impact. For this, we'll need to build, and then disassemble.
                    I used IDA Pro for the disassembly, but Ghidra is a great free option if you'd like to take a poke yourself.

                    vzeroupper is used in 1 single function in v8, fast_search_avx (and all of its polymorphisms)
                    I didn't track down exactly which intrins
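
The point made in the replies above is that an assembly mnemonic will not show up in C or C++ source, so it has to be looked for in the compiled output. A naive byte scan illustrates the idea. This is a minimal sketch, not what either poster ran: it assumes the two-byte-VEX encoding of vzeroupper (C5 F8 77), and the helper name find_vzeroupper is hypothetical. A raw byte match can also land on data or in the middle of another instruction, so a real check still needs a disassembler such as Ghidra.

```python
# Byte encoding of `vzeroupper` with a two-byte VEX prefix (assumed for this sketch).
VZEROUPPER = bytes([0xC5, 0xF8, 0x77])


def find_vzeroupper(code: bytes) -> list:
    """Return every offset in `code` where the vzeroupper byte pattern starts."""
    offsets = []
    start = code.find(VZEROUPPER)
    while start != -1:
        offsets.append(start)
        # Continue searching one byte past the previous hit.
        start = code.find(VZEROUPPER, start + 1)
    return offsets


if __name__ == "__main__":
    # Hypothetical machine-code buffer: nop, vzeroupper, ret.
    sample = bytes([0x90]) + VZEROUPPER + bytes([0xC3])
    print(find_vzeroupper(sample))
    # Source text containing only the mnemonic has no such bytes.
    print(find_vzeroupper(b"grep -ir vzeroupper v8/"))
```

Searching source text for the mnemonic and scanning a built binary for its encoding answer different questions, which is why the grep result alone settles nothing.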
        • but not Zen3 7000 CPU?
          • by dougmc ( 70836 )

            I may not be the ideal person to ask -- I just read the summary and skimmed over the links given in it.

            That said, the summary says "AMD has patches ready for its EPYC 7002 'Rome' processors now", and AMD's response talks about that and "AMD Ryzen 7020 Series Processors", so ... I guess yes?

            I've got several Ryzen machines, but most are 2xxx so I guess they're not vulnerable, but one is a 5500 so I guess that is. It doesn't strike me as a large concern in my specific case, but it's easy to see common cases w

        • by AmiMoJo ( 196126 )

          The Javascript exploit is probably a very remote possibility, given that it seems to rely on a low probability event occurring. In other words you need to run the Javascript a lot, and all modern browsers throttle Javascript execution quite aggressively to save power and reduce system load. There are exceptions, mainly Javascript that the user is directly interacting with, but even if you could engineer the user doing that for a while... ...the next problem is that you get the content of one register at a r

    • IIUC Linux kernel 6.4.6 onward has a Zenbleed fix until the firmware updates arrive.

    • Makes me wonder if the performance hit is just too much with the current patch and they're afraid of losing market share because of it. They've had a good run with Ryzen.

      Likely. People are interested in performance. All of these attacks are incredibly hard to pull off outside of a lab or a situation where you already have access to the hardware. Sure it sounds nasty that a website can run javascript to exfiltrate data, but actually knowing what to exfiltrate and when is not only non-trivial, it's prohibitive compared with more successful hacks (e.g. social engineering).

      There's a reason every one of these vulnerabilities so far have been opt-in. Performance matters more t

  • by DrMrLordX ( 559371 ) on Sunday July 30, 2023 @04:13AM (#63725080)

    https://blog.cloudflare.com/ze... [cloudflare.com]

    "Given that the successful exploitation of this vulnerability requires very precise timing that is difficult to achieve without executing native code the vulnerability, filed under CVE-2023-20593, has initially received the CVSS score of 6.5 and therefore was classified as Medium Risk. The initial mitigation suggested by the researcher is achieved by turning off certain functionality via modification of the Model Specific Register - namely DE_CFG. This change will prevent certain instructions with complex side effects like vzeroupper from being speculatively executed. "

    • All of these CPU vulnerabilities over the past several years are incredibly difficult to exploit. I haven't heard of a single one of them being successfully exploited in the wild. Not to say we shouldn't try to fix them, but from a hacker's point of view there are much easier and cheaper ways to successfully attack an organization.

    • When they say "native code", do they mean compiled code or code that is not running within a VM, or both? Would WebASM be able to exploit this vulnerability?
      • I don't have the expert knowledge to answer it on my own, but I've asked and I've gotten the answer that it would be nearly impossible to both craft the exact instructions necessary to initiate the exploit and read the correct registers at exactly the right times to get the CPU to leak useful data.

        My impression is that it would take a carefully-crafted compiled binary to pull it off.

  • Presumably this instruction exists in Zen1. Since the functionality would be useful for AVX there too.

    So, VZEROUPPER internally changed from Zen1 to Zen2 and then changed back again in Zen3 - All without anyone noticing there was a problem in Zen2.

  • This gives a good summary, impact assessment and fix ETAs for affected AMD Zen 2 CPUs:
    Encryption-breaking, password-leaking bug in many AMD CPUs could take months to fix [arstechnica.com]

    I'm glad this appears to only affect an older family of AMD Zen 2 CPUs i.e. does NOT affect AMD Zen 3 and later CPUs.
    Odd that fixes are not yet available for desktop Zen 2 CPUs while the microcode for the older server Zen 2 CPUs is patched now. This seems like a weird delayed response to something that can leak passwords and encryption keys.

    Warning: running Internet sou

  • Microcode can be pretty gnarly, even compared to assembler. Perhaps the solution is to adapt a proof checker from math to microcode. I wouldn't expect it to be a trivial exercise, but it appears as if it could be quite valuable to do.

  • Zen 2 is 5 years old. Zen 1, 3, 4 and 5 do not have these issues, AFAIK.
  • Primarily, this is just an ordinary bug.

    If you write totally non-malicious code that does a lot of 256-bit YMM operations, and you are unlucky and use an unfortunate sequence of instructions, then it can happen that another process also using totally non-malicious code doing the same overwrites the upper half of one of your 256-bit YMM registers. Requires a lot of bad luck, but it could affect your program badly obviously.

    Now if you replace your code with some malicious code, then your code can perfor
    • then the attacker might be able to read 16 of these 10,000 bytes.

      More, if you happened to be scheduled as a second running thread on the core.
      I agree that in the case of OS scheduled sharing of a core, the memcpy is likely to complete in a single time slice.
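
The "16 bytes" figure in this subthread is the upper half of one 256-bit register. The toy model below shows only that arithmetic and what vzeroupper is architecturally supposed to do. It is a sketch under stated assumptions, not a simulation of the bug: the register is modeled as 32 bytes with the low half first, and all names are hypothetical.

```python
YMM_BYTES = 32  # a YMM register is 256 bits wide
XMM_BYTES = 16  # its low 128 bits are the corresponding XMM register


def upper_half(ymm: bytes) -> bytes:
    """Return the upper 128 bits (16 bytes) of a modeled YMM register."""
    if len(ymm) != YMM_BYTES:
        raise ValueError("a modeled YMM register must be exactly 32 bytes")
    return ymm[XMM_BYTES:]


def vzeroupper(ymm: bytes) -> bytes:
    """Expected architectural result: keep the low half, zero the upper half."""
    if len(ymm) != YMM_BYTES:
        raise ValueError("a modeled YMM register must be exactly 32 bytes")
    return ymm[:XMM_BYTES] + bytes(YMM_BYTES - XMM_BYTES)


if __name__ == "__main__":
    # Hypothetical register contents, e.g. 32 bytes in flight during a memcpy.
    register = bytes(range(YMM_BYTES))
    cleared = vzeroupper(register)
    print(len(upper_half(register)), upper_half(cleared) == bytes(XMM_BYTES))
```

In this model a correct vzeroupper always leaves the upper 16 bytes zeroed; the discussion above is about the case where the hardware leaves stale bytes there instead.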
