Intel DOWNFALL: New Vulnerability In AVX2/AVX-512 With Big Performance Hits (phoronix.com)
An anonymous reader quotes a report from Phoronix: This Patch Tuesday brings a new and potentially painful processor speculative execution vulnerability... Downfall, or as Intel prefers to call it, GDS: Gather Data Sampling. GDS/Downfall affects the gather instruction on AVX2- and AVX-512-enabled processors. At least the latest-generation Intel CPUs are not affected, but Tiger Lake / Ice Lake back to Skylake are confirmed to be impacted. There is a microcode mitigation available, but it will be costly for AVX2/AVX-512 workloads with GATHER instructions in hot code paths, and thus there is widespread software exposure, particularly for HPC and other compute-intensive workloads that have relied on AVX2/AVX-512 for better performance.
Downfall is characterized as a vulnerability due to a memory optimization feature that unintentionally reveals internal hardware registers to software. With Downfall, untrusted software can access data stored by other programs that typically should be off-limits: the AVX GATHER instruction can leak the contents of the internal vector register file during speculative execution. Downfall was discovered by security researcher Daniel Moghimi of Google. Moghimi has written demo code for Downfall to show 128-bit and 256-bit AES keys being stolen from other users on the local system as well as the ability to steal arbitrary data from the Linux kernel. Skylake processors are confirmed to be affected through Tiger Lake on the client side or Xeon Scalable Ice Lake on the server side. At least the latest Intel Alder Lake / Raptor Lake and Intel Xeon Scalable Sapphire Rapids are not vulnerable to Downfall. But for all the affected generations, CPU microcode is being released today to address this issue.
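For readers who have not met the instruction: a gather loads several elements from non-contiguous addresses into one vector register, with each address computed from a base plus an entry of an index vector. The following is a minimal Python model of those semantics only, not of the leak itself; the function name `gather` and the list standing in for memory are illustrative assumptions, not any real API.

```python
from typing import Optional, Sequence


def gather(memory: Sequence[int], base: int, indices: Sequence[int],
           mask: Optional[Sequence[bool]] = None, fill: int = 0) -> list:
    """Model of a masked vector gather: lane i receives memory[base + indices[i]].

    Hypothetical helper for illustration; real AVX2/AVX-512 gathers operate
    on vector registers and byte-scaled addresses.
    """
    if mask is None:
        mask = [True] * len(indices)
    if len(mask) != len(indices):
        raise ValueError("mask and indices must have the same number of lanes")
    # Masked-off lanes get `fill` here; this stands in for the lane being
    # left untouched and no memory access being made for it.
    return [memory[base + idx] if keep else fill
            for idx, keep in zip(indices, mask)]


if __name__ == "__main__":
    table = [10, 11, 12, 13, 14, 15, 16, 17]
    print(gather(table, 2, [0, 3, 1, 5]))
```

Scattered loads of this kind are what compilers tend to emit for table lookups and sparse data, which is presumably where a slower gather would be felt.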
Intel acknowledges that its microcode mitigation for Downfall has the potential to impact performance where gather instructions are in an application's hot path. Given the AVX2/AVX-512 impact on vectorization-heavy workloads, HPC workloads are likely to be most impacted, but we've also seen a lot of AVX use by video encoding/transcoding, AI, and other areas. Intel has not relayed any estimated performance impact claims from this mitigation. Well, to the press. To other partners Intel has reportedly communicated a performance impact of up to 50%. That is for workloads with heavy gather instruction use as part of AVX2/AVX-512. Intel is being quite proactive in letting customers know they can disable the microcode change if they feel they will not be impacted by Downfall. Intel also believes pulling off a Downfall attack in the real world would be a very difficult undertaking. However, those matters are subject to debate. Intel's official security disclosure is available here. The Downfall website is downfall.page.
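On Linux, kernels that carry the GDS patches report the mitigation state through sysfs. A minimal sketch, assuming the file is `/sys/devices/system/cpu/vulnerabilities/gather_data_sampling` and that its text begins with "Not affected", "Mitigation" or "Vulnerable"; the function names and the coarse labels they return are illustrative choices, not part of any kernel interface.

```python
from pathlib import Path
from typing import Optional

# Assumed location; kernels without the GDS patches will not have this file.
GDS_SYSFS = Path("/sys/devices/system/cpu/vulnerabilities/gather_data_sampling")


def read_gds_status(path: Path = GDS_SYSFS) -> Optional[str]:
    """Return the raw status line, or None if the file cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def classify_gds_status(status: Optional[str]) -> str:
    """Map the raw status line onto a coarse label (labels are hypothetical)."""
    if status is None:
        return "unknown"
    text = status.lower()
    if text.startswith("not affected"):
        return "not affected"
    if text.startswith("mitigation"):
        return "mitigated"
    if text.startswith("vulnerable"):
        return "vulnerable"
    return "unknown"


if __name__ == "__main__":
    raw = read_gds_status()
    print(raw, "->", classify_gds_status(raw))
```

Matching on the prefix rather than the full string is deliberate: the exact wording after the colon may differ between kernel versions.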
Multiple Cores (Score:1)
Re:Multiple Cores (Score:5, Interesting)
1) The vulnerability is not in the OS, it's in anything that has an AVX2/AVX-512 CPU instruction. And where the security information would be "raided" would be the contents of the CPU cache anyway. Not having expert knowledge of Downfall's exploit process, I see no reason to think that a user application with an AVX2 instruction wouldn't be able to peek into the CPU cache contents anyway.
2) If what you asked can be done, you've just converted your OS to a single-core OS. You don't think that is going to put a "hit" on performance?? You may as well apply the "mitigation" patch and deal with the drop in "AVX2/AVX-512" performance; I suspect those CPU instructions don't even exist inside the OS code (but may exist in video-related APIs).
3) If you know Linux, you may want to look into an intriguing OS called Qubes [qubes-os.org], but you'll also need a better grounding in security, hardware, and virtualization in order to use it "effectively". It basically "sandboxes" every application away from the OS and other applications.
Will Apple force this on? As a push to their own (Score:1)
Will Apple force this on? As a push to their own CPUs with the high RAM markup? At least the Mac Pro has PCIe for your own storage vs. Apple storage at a 3-4X markup.
Re: Will Apple force this on? As a push to their (Score:2)
Yes. Apple owns your immortal soul.
A competition (Score:5, Funny)
Re: (Score:2)
Don't think so. There was a branch return patch with minor performance penalty. Nothing to do with AVX.
Zenbleed is very similar (Score:4, Informative)
The recently discovered "Zenbleed" [xda-developers.com] vulnerability is very similar to this. It results in AVX register values leaking from one process to another.
Re: (Score:3)
Main difference here is that patching Zenbleed has a near-zero performance penalty except for certain very specific AVX operations, while (according to the Phoronix article) the general performance penalty for Downfall can be rather severe.
Also Intel only currently has one generation of server hardware (Sapphire Rapids) available that is immune to this exploit. Zenbleed doesn't affect Milan or Genoa, meaning you'd have to be running some fairly old AMD enterprise gear to be affected by it.
Re:Zenbleed is very similar (Score:4, Informative)
Main difference here is that patching Zenbleed has a near-zero performance penalty except for certain very specific AVX operations
Incorrect.
I demonstrated previously [slashdot.org] that this line of wishful thinking is complete bullshit.
Also Intel only currently has one generation of server hardware (Sapphire Rapids) available that is immune to this exploit. Zenbleed doesn't affect Milan or Genoa, meaning you'd have to be running some fairly old AMD enterprise gear to be affected by it.
ZenBleed affects consumer CPUs too, and many many consumer CPUs are still Zen2.
Re: (Score:2)
You seem confused. Your linked post shows that the relevant instructions will be called and even that such a call can be induced from JS, but you said nothing at all about the performance impact of the mitigation.
Re: (Score:2)
The demonstration is that the instruction is all over the place, not just in "very specific AVX operations"
Re: (Score:2)
Until you account for the actual performance cost per call, you're just measuring frequency.
Re: (Score:2)
However, the nature of the mitigation was well known (to not speculate the faulty instruction), and in any case denying speculation on an instruction equates to a pipeline stall when the instruction is encountered.
In some cases, the instruction already couldn't be speculated, so there's no cost, but if a superscalar processor is doing its job and superscalar-ing, that should be the exception, not the rule. With mitigations, it's the inviolable rule.
It
Re: (Score:2)
Modern CPU performance is entirely about keeping the instructions/cycle ratio high. When it gets low, then you're operating like a 486 (or other pre-superscalar CPU) and are strictly bound by clocks, rather than architectural optimizations.
Re: (Score:2)
It is therefore very safe to bet the farm that the frequency of encountering the instruction will directly correlate to performance impact.
Yes, there will be a correlation. But without absolute numbers, you cannot meaningfully draw conclusions about the impact of one mitigation vs. the impact of a different mitigation of a different problem on a different processor.
Re: (Score:2)
Say a CPU averages 4 instructions per cycle.
The average for a mitigated VZEROUPPER on a Zen2 will be 1.
That is unavoidable, period.
Re: (Score:2)
So now do the analysis for DOWNFALL, compile statistics of how frequently the performance hits occur in practice, multiply it all out, and then you'll have a meaningful comparison to tell us about.
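The disagreement above comes down to a product: how often the mitigated instruction is hit, times what each hit costs. A back-of-the-envelope sketch of that multiplication follows; every number in it (instruction count, baseline IPC, hit counts, stall cycles) is a made-up placeholder, not a measurement of Zenbleed or Downfall, and the additive model is itself an assumption.

```python
def estimated_slowdown(instructions: int, base_ipc: float,
                       hits: int, stall_cycles: float) -> float:
    """Ratio of mitigated to baseline cycle counts under a crude additive model.

    Assumes each hit on the mitigated instruction adds a fixed number of
    stall cycles and that nothing else changes; real pipelines are messier.
    """
    if instructions <= 0 or base_ipc <= 0:
        raise ValueError("instructions and base_ipc must be positive")
    baseline_cycles = instructions / base_ipc
    mitigated_cycles = baseline_cycles + hits * stall_cycles
    return mitigated_cycles / baseline_cycles


if __name__ == "__main__":
    # Hypothetical workload: 1M instructions at 4 IPC, 25 stall cycles per hit.
    for hits in (10, 1_000, 100_000):
        ratio = estimated_slowdown(1_000_000, 4.0, hits, stall_cycles=25)
        print(f"{hits:>7} hits -> {ratio:.3f}x baseline cycles")
```

With the same per-hit cost, the result ranges from negligible to dominant depending only on the hit count, which is why both the frequency and the cost per call are needed before comparing two mitigations.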
Re: (Score:2)
So now do the analysis for DOWNFALL, compile statistics of how frequently the performance hits occur in practice, multiply it all out, and then you'll have a meaningful comparison to tell us about.
No.
I wasn't trying to draw a comparison.
I haven't begun to look at how bad DOWNFALL is.
You'll note the reply that started this bullshit AMD fanboi defense was this:
Main difference here is that patching Zenbleed has a near-zero performance penalty except for certain very specific AVX operations
I called that claim false. I did not in any way compare the two.
DOWNFALL sounds plenty bad to me. I'm correcting misinformation, not trying to measure dicks.
Re: (Score:2)
That at least requires you figure out the numbers for Zenbleed to show the magnitude of the effect.
Personally, I wouldn't call it nothing, but I haven't done the analysis to decide how bad it might be.
Re: (Score:2)
That at least requires you figure out the numbers for Zenbleed to show the magnitude of the effect.
No. Because I know the cost of a pipeline stall.
It can only be bad.
A pipeline stall is a series of instructions that converge upon 1 instruction per cycle, which is very, very bad.
Re: (Score:2)
Unless, in practice it happens once or twice a day (I know, it's more than that).
Re: (Score:2)
AVX isn't just used for vector math. It's used anywhere that SIMD is a more-efficient way to move memory around (increase instructions/cycle throughput)- and in particular, that means memory copies, since you can move a ton of data with a single instruction.
Re: (Score:2)
That's my mistake.
Here [openbenchmarking.org] are some benchmarks of mitigation impact in case you need pretty graphs to understand.
Re: (Score:2)
Sjames beat me to it, but it seems you don't really understand Zenbleed.
Re: (Score:2)
ZenBleed mitigation requires prevention of speculative execution of VZEROUPPER.
Any code that uses VZEROUPPER is therefore performance impacted. This isn't a "specific AVX operation", this is everything that is autovectorized or tuned in any way for SIMD performance improvements. Including a shit-ton of libc.
You are talking out of your ass.
Re: (Score:3, Funny)
"Poorly written bad"
"Learning English is apparently not a priority"
It must be Irony Day again.
Re: (Score:2)
"about being an illiterate geek"
Yup, its Irony Day. I'll get the popcorn...
Dayam (Score:2)
Dodged a bullet there. I just moved to an Alder Lake CPU. :D
You may want to help Intel out with their surplus and grab one of those Microcenter CPU/Motherboard/RAM combos while you still can! Or if you're willing to spend, wait for the new Intel CPU releases when Commerce Day rolls around in 5 months. But desktop computer demand softened hard in 2022-2023, so perhaps Intel decides to make less 14th gen CPUs. (Dunno if there will still be any "cheap" 12th gen surplus left by then.) Hit the trade mags fo
Re: (Score:3, Funny)
Dodged a bullet there
Are you running the system in a way to give untrusted users enough access to categorise and craft their attack in a meaningful way, i.e. a datacentre operator? Otherwise you dodged a bullet the same way I dodged a bullet by being on the opposite side of the planet from where it was fired. None of these attacks have been relevant for end users so far, and as far as anyone including security researchers knows none have been used outside of a lab environment. They are too hard to meaningfully pull off.
Re: (Score:3)
None of these attacks have been relevant for end users so far, and as far as anyone including security researchers knows none have been used outside of a lab environment. They are too hard to meaningfully pull off.
You're saying the quiet part out loud, stop it!
Re:Dayam (Score:4, Insightful)
Well, the argument here is, it's always theoretical until one day a Metasploit module appears..
But I agree, most of these attacks require an implausible level of 'where and when' knowledge about what else is happening on the system. Oh, you just happen to know EXACTLY the instant that other thread is running those crypto operations, when the registers will be interesting, or on a modern ASLR-enabled platform you know EXACTLY what memory addresses in that other process you want to recover values from.. and as an attacker you know these things on a system that isn't yours, where you don't already have the thing severely compromised through some other means to get you the code execution and information you need in the first place...
I'll also agree it's *more* plausible in the data center, where perhaps you have a VM instance that is owned, not pwnd, by you and you can use it to attack other VMs. However, even there I have yet to see any digital forensics published on any attack where these speculative execution bugs were on the critical path.
The same is true of the old DRAM issues, ROWHAMMER and the like. Everyone got really, really excited because, okay, yes, you could write some JS that would run from the web, but in practice, again, that had to be a highly targeted attack with pretty special conditions to actually work.. It's not like you could just walk around reading memory until you found where the kernel was keeping LSASS or something, or even go probing for magic bytes that might indicate a DER cert, etc.
Re: (Score:1)
For me, then, speculative execution is more like a feature, an established feature of subprocessing. Just because some heap service that fast-tracks hardware reads does not revise its scope is not a reason to put it all on the CPU instructions. After all, SOMETHING is supposed to be able to read branch instructions, whatever they are.
Re: (Score:2)
From what it says here [downfall.page], "This vulnerability, identified as CVE-2022-40982, enables a user to access and steal data from other users who share the same computer.". A lot of people living in the same residence (or dorm) share usage or have access to the same computer.
As for "Remote Trojan Exploit 101", all that a malicious hacker has to do is bundle an executable (with the AVX2/AVX512 CPU instruction exploit) to run locally on a computer they don't have access to, and send the collected CPU cache dumps to an
Re: (Score:2)
And we had vulnerabilities of this kind exploitable by Javascript. This one doesn't appear to be (easily) javascriptable, but there's so many other kinds of untrusted code from malicious sources (I consider ads and tracking to be malicious) that you run all day on a client machine that you can't ignore hardware bugs. Oh, and this one can't be containerized against.
On the other hand, the pipeline for new CPUs is at least 5 years -- and the first gen of speculative execution vulns happened in 2018. Thus, t
Re: (Score:2)
> So the only real option we have is to upgrade CPUs more often.
Meh. I don't particularly think it makes sense, as a "security" measure. I was using my last rig for 10 years, but I build my PCs, so it was more of a case where I valued my time more than "needing" performance out of my machine. Hardware vulnerabilities probably won't push me to upgrade more than once every four years.
Re:Dayam (Score:5, Insightful)
A lot of people living in the same residence (or dorm) share usage or have access to the same computer.
If you have physical access to the computer, then all bets are off anyway.
What would be a simpler way to get your roommate's encrypted data:
A) Figure out how to turn this PoC vulnerability into a real-world exploit against the particular software your roommate happens to be using, or
2) Put a key logger on the keyboard
(Of course, you might not even need to actually complete the second one. When you pick up the keyboard to install the logger, you'll probably find the sticky note your roommate wrote all their passwords on.)
Re: (Score:2)
A lot of people living in the same residence (or dorm) share usage or have access to the same computer.
And a vulnerability like this still isn't going to be half so useful as the attack path you described.
Let's assume neither of you has Administrator/Root access to the machine (because otherwise you can already just sudo and go rifle thru their stuff).
Let's assume the drive is encrypted, and the bios is password protected, for giggles (because otherwise pop the disk out and just dump the stuff)
Now do these speculative execution attacks even help you? Not really. Oh, you can read registers or cache - cool excep
Re: (Score:2)
It's more a threat to iron serving up vms. Though pity anyone using IceLake-SP. Hopefully this will give them incentive to upgrade to Genoa servers.
Re: (Score:2)
Though pity anyone using IceLake-SP.
For those people, it's just a business expense. I won't even shed a tear for the business owner.
Interesting, but.... (Score:1)
From the summary (of course I didn't read the full bulletin, this is slashdot), it sounds like an attacker can't access data unless they already have their own software running on the server.
So, anyone running on their own hardware is safe. Anyone running solo on someone else's cloud should be safe. It is unclear what an attacker on a shared AWS/Azure instance can do. That's apparently the only situation where this _might_ be a problem. Everyone else can forget about this entirely; no need to patch anyt
Re: (Score:2)
There are many production servers handling multiple users running VMs.
Re: (Score:1)
Sure, but I'm not worried about my own users attacking my other users on a corporate VM. But yes, on a shared multi-customer VM such as Amazon/Azure this can be a real problem, depending on the hack details. That's why I called those out.
Re: (Score:2)
You might worry if one of your users introduces malware to their vm which can then silently steal credentials from all your other users without necessarily spreading detectable malware anywhere else.
Re: (Score:1)
Possibly, but unlikely in my case due to the setup and wipe-and-refresh configuration. No one could easily install random malware.exe off the net, and if they did somehow manage to, it wouldn't last long.
Security is more than just patching.
It would still be easier and more efficient to apply for a job and just steal what they want directly on their way to the airport. All these little keyhole hacks are hard to pull off and yield tiny bits of data. Has there ever been a legit report of any data leaks ever from an
Re: (Score:2)
That's really been the truth behind all the Spectre/Meltdown type issues - it only really works if you're running on a shared machine.
Some of the attacks are theoretical - you can do it via
Re: (Score:1)
When spectre and friends came up years ago that's what I said to my CEO. Sure we'll patch because big money customers insist but if someone really wanted our data it would be a hell of a lot easier for them to send someone with solid technical skills to get hired and we'd then just give them the access. To his credit he got it.
Are we sacrificing too much security for speed? (Score:4, Insightful)
It seems these sorts of CPU hardware issues, while not exactly common, appear too often. With billions of transistors and probably trillions of operation paths on a single die, no testing can find all the errors, so perhaps a greater number of simpler CPUs/cores would be a better idea? At least for security-critical systems.
Re: (Score:2)
An open ISA like RISC-V might be the way forward - with universities studying its every intricacy lending itself to a doctoral student who might formally prove the soundness of a particular piece of difficult logic to be immune to such attacks.
(N.B. open ISA doesn't equal open implementation but the likes of T-Head have already published their Xuantie Verilog on their github page)
Re: (Score:2)
I suppose at the very least it makes the open implementation more secure and trustworthy, particularly for anyone who doesn't have the ability or the want to spend millions on their own design team. But t
Re: (Score:2)
Just spitballing here, but it would be nice if we could programmatically disable speculative computing for security operations.
Then we'd have the speed for compiling, rendering, gaming, etc -- where i don't care if you peek at my registers.
But, we'd run in a slower non-branching path for secure operations, where we don't care about speed (or perhaps even slower is better (like password encryption, sometimes)).
The key problem here is, you are once again relying on something to say 'this needs to be secure', w
Some CPUs have that for timing (Score:2)
ARM CPUs have a flag that can be toggled that makes all instructions take the same amount of time no matter what their input is, to try to defeat timing attacks.
So yeah, it could be done.
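The software counterpart of such a hardware flag is code whose running time does not depend on secret data. As a small illustration in Python, assuming a hypothetical `tokens_match` helper, the standard library's `hmac.compare_digest` compares two byte strings without the content-based early exit that an ordinary `==` may take:

```python
import hmac


def tokens_match(expected: bytes, supplied: bytes) -> bool:
    """Compare two secrets without content-based short-circuiting.

    hmac.compare_digest is designed to resist timing analysis; the
    length of the inputs may still be observable to an attacker.
    """
    return hmac.compare_digest(expected, supplied)


if __name__ == "__main__":
    print(tokens_match(b"s3cret-token", b"s3cret-token"))
    print(tokens_match(b"s3cret-token", b"s3cret-tokem"))
```

This only addresses timing side channels. It does nothing against speculative-execution leaks of the Downfall kind, which is the harder problem raised in this thread.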
Re: (Score:2)
Just spitballing here, but it would be nice if we could programmatically disable speculative computing for security operations.
The process that does the security operation is not the one doing speculative execution. It's the other process.
Re: (Score:2)
With branch prediction, you can make changes to registers which turn out to be wrong when the branch was predicted incorrectly, and then the processor has to restore the original register value. It should be obvious that for this to work, the processor must not be allowed to write to rename registers that belong to predicted branches.
And that is done wrong, on t
Re: (Score:2)
" The bug is always present, but it only manifests itself if another process is writing to vector registers exactly at the same time when your branch prediction is running."
And yet you seem to think that couldn't have been found if that particular mode of operation had been tested. Curious.
Re: (Score:2)
Simple answer: No. Not at all, and not even in the slightest. Security is not a binary concept. These attacks sound scary, but the necessary effort and knowledge of the target system has so far required unfettered pre-existing access to the systems. Even in this case, it sounds nasty that we can read an encryption key out of cache, but the amount of detail you'd need to know about the target system to do this outside of a lab environment makes anyone wonder why you'd bother and not simply install a keylogger.
But
Downfall? (Score:2)
Wait till Hitler finds out about this vulnerability!
https://en.wikipedia.org/wiki/... [wikipedia.org]
HPC seems an unlikely target (Score:2)
How much HPC work is being performed in a context where you're running random software from random vendors? Manage the environment and you don't need to worry about the info leak. I imagine this won't be as impactful as some of the similar types of attacks we've seen.
Re: (Score:2)
If you read the description of the vulnerability, you will notice that even general (read:non-HPC) workloads can suffer major performance hits with the mitigations installed.
You don't have to be running HPC workloads to be affected by this (unlike Zenbleed).
Re: (Score:2)
The point is if you are running local HPC you probably want mitigations=off anyway as you usually have a local system shared by trusted users. And probably running non-secret workflows. Our local users can get root if they really need it on our local systems.
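Whether a box was booted with mitigations switched off can be read from the kernel command line. A minimal sketch, assuming a Linux system, that looks for the `mitigations=` and `gather_data_sampling=` boot parameters in `/proc/cmdline`; the helper names are illustrative, and letting a later duplicate parameter override an earlier one is a simplifying assumption.

```python
from pathlib import Path

# Boot parameters this sketch looks for.
WATCHED = ("mitigations", "gather_data_sampling")


def parse_mitigation_params(cmdline: str) -> dict:
    """Extract watched key=value boot parameters; later occurrences win here."""
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in WATCHED:
            found[key] = value
    return found


def gds_disabled_at_boot(cmdline: str) -> bool:
    """True if the command line asks for the GDS mitigation (or all) to be off."""
    params = parse_mitigation_params(cmdline)
    return (params.get("mitigations") == "off"
            or params.get("gather_data_sampling") == "off")


if __name__ == "__main__":
    try:
        text = Path("/proc/cmdline").read_text()
    except OSError:
        text = ""  # not Linux, or /proc unavailable
    print(parse_mitigation_params(text), gds_disabled_at_boot(text))
```

The command line only records what was requested at boot; the kernel's own reporting under `/sys/devices/system/cpu/vulnerabilities/` is what shows the state actually in effect.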
Makes you miss Itanium (Score:2)
Re: (Score:2)
No, seriously; whatever other problems it had, Itanium did not have these problems.
You don't know that. Nobody knew Intel had these problems. Until AMD had the problem, and then someone decided "let's see if Intel has the same or a similar problem". Itanium might have the exact same problem, except nobody would bother attacking it.
Looks exactly like Zenbleed (Score:2)
It is primarily a CPU bug. If vector instructions that change vector registers are speculatively executed due to branch prediction, and the branch prediction was wrong and the original register value needs to be restored, then in very rare cases the restored value is one that has been stored by another process, which will make your application go wrong.
And a malicious app can exploit this: Create the fault conditions intent