Theo de Raadt Details Intel Core 2 Bugs
Eukariote writes "Recently, Intel patched bugs in its Core 2 processors. Details were scarce; soothing words were spoken to the effect that a BIOS update is all that is required. OpenBSD founder Theo de Raadt has now provided more details and analysis on outstanding, fixed, and non-fixable Core 2 bugs. Some choice quotes: 'Some of these bugs... will *ASSUREDLY* be exploitable from userland code... Some of these are things that cannot be fixed in running code, and some are things that every operating system will do until about mid-2008.'"
Re:Yay AMD (Score:5, Informative)
by day towards open source operating systems too, perhaps because
their serious errata lists are growing rapidly too).
Re:Summary sucks, someone please provide better on (Score:5, Informative)
Some of the bugs are so dangerous that it doesn't matter WHAT operating system you're running, code could be written that could attack the entire system. It would still be OS-specific code, but since the exploit is in the hardware, it's a LOT harder to prevent the attack, if it's even possible.
Some of the bugs are unfixable, as well. (I assume they mean without physically replacing the chip with a 'fixed' one that doesn't exist yet.)
Re:How hard is it to get right? (Score:5, Informative)
Re:Good stuff. (Score:5, Informative)
Basically the MMU simply does not operate as specified/implemented in previous generations of x86 hardware. It is not just buggy, but Intel has gone further and defined "new ways to handle page tables" (see page 58).
Some of these bugs are along the lines of "buffer overflow"; where a write-protect or non-execute bit for a page table entry is ignored. Others are floating point instruction non-coherencies, or memory corruptions -- outside of the range of permitted writing for the process -- running common instruction sequences.
It will be interesting to see what Intel has to say about this.
Re:intel issues (Score:2, Informative)
Re:Yay AMD (Score:4, Informative)
Re:Time for RISC? (Score:2, Informative)
Did you know that one of the main reasons that x86 outperforms any similarly specified RISC chip is because those horribly inelegant, variable length x86 instructions allow for considerably higher code density than RISC?
Elegant does not necessarily equal faster or better, no matter how much you might want it to.
Re:Patches (Score:4, Informative)
Hardware has had built-in firmware/software for as long as I remember. BIOS is software. Microcode for even consumer CPUs has been done for as long as I remember, Pentium II had it. Apparently, the 8086 had microcode-based instructions.
Re:Summary sucks, someone please provide better on (Score:3, Informative)
True, but that depends on how easily they could be exploited in the real world, rather than in the theoretical world. From what I remember, one was about incorrect behaviour when your code runs off the end of a 4GB boundary; certainly that might be exploitable, but not on any system which can't run >4GB of code.
I skimmed through the bugs which the author said really scared him and didn't see anything that looked easy to exploit from a user program. Yes, if you want total security on your system then they'd be scary, but if it's almost impossible to exploit then it really doesn't matter to anyone much outside the most secret parts of the government (and, even then, bribing people would probably be an easier way of stealing secrets).
"I wouldn't be surprised if businesses like that started switching to AMD hardware."
You're assuming that AMD chips are any better.
Re:Theo says... (Score:1, Informative)
Re:Patches (Score:3, Informative)
Don't confuse microcode with firmware. Two different things. Microcode isn't intrinsically updateable, and may be placed in a read-only memory block.
Re:Summary sucks, someone please provide better on (Score:2, Informative)
Re:How hard is it to get right? (Score:5, Informative)
% uname -a
FreeBSD myhost.grateful.net 6.2-STABLE FreeBSD 6.2-STABLE #0: Mon May 28 09:52:28 PDT 2007 me@myhost.grateful.net:/usr/obj/usr/src/sys/AMD64 i386
granted, I'm using 32bit mode - but I've been running 6.2 for as long as it's been out on my 'always on' freebsd box. what issues are you seeing? this is my production box - but I don't see any problems with bsd. in fact, I also have 6.2 running with an old amd64 3000+ that was a mobile chip and had to have cpufreq enabled just to move it off its default 800mhz and up to the 2.mumble ghz that it's supposed to clock at. works fine.
I have seen some hardware devices not behave well but often it's not a well designed piece of hardware or it's just not meant for server style loads (cheap consumer onboard sata sometimes times out and usb2.0 always times out if you give it enough load).
I can't speak to amd64 USING 64bit mode, but 32bit mode works as well as (or better than) linux on headless style computing.
Linus doesn't think it's a big deal. (Score:5, Informative)
Re:Summary sucks, someone please provide better on (Score:5, Informative)
Here's a little more detail, based on my (very incomplete) understanding of the issues:
It appears that Intel has made changes to the way the memory management unit in the processor works, plus there are also some bugs that affect memory management. So what does that mean?
There are other issues as well... but these are a good sample, and should give an idea of what kind of bad stuff these CPU bugs/changes can make possible.
Re:Yay AMD (Score:5, Informative)
For general-purpose usage, the most interesting design I've seen recently is the PWRficient from P.A. Semi. It's a nice dual-core 64-bit PowerPC, with low power usage, similar performance to IBM's PowerPC 970 series. It has a lot of nice stuff on-die (crypto, a really shiny DMA architecture, etc).
For a complete round-up of current alternatives, take a look at this article [informit.com] and the next two in the series.
[1] They are generally marketed as 'cell phones' or similar, rather than 'computers'.
Re:Summary sucks, someone please provide better on (Score:5, Informative)
Re:Patches (Score:-1, Informative)
People have fixed/coded around bugs before. This is all a tempest in a teapot.
Scariest post of the thread (Score:4, Informative)
http://marc.info/?l=openbsd-misc&m=11830201643010
and control computers remotely.
* Monitor and control (filter) the network traffic - before/under the
running operating system
* sending out patches to computers - even if they are turned off.
* Control, upgrade, change, add and remove software
Re:Patches (Score:2, Informative)
http://www.enlight.ru/docs/cpu/INFO/mcupdate.htm [enlight.ru]
I think with this and with clever hacks in the OS [x86.org], you can fix most bugs. So probably there's a lot of person to person communication between processor manufacturers, BIOS writers and OS vendors and the net result is that it all seems like it works. Of course if you're an obnoxious vendor of a not too commercially important OS, you're probably excluded from this, which is why Theo is upset.
Re:Summary sucks, someone please provide better on (Score:3, Informative)
Re:Quantum effects? (Score:2, Informative)
In the case of the chip, the high working temperature of such chips helps a great deal in suppressing quantum effects. The length scale of the little wires inside the chip is very important, so, simply stated, one must counter-balance the size of the wires with the working temperature, and, in the end, you get the chip to behave completely classically, designed using standard laws of digital/analog electronics. Going to much finer wires, like 10 fold thinner, would reintroduce quantum effects in a big way, to the point of breaking the classical laws of conduction in these wires; transistors would have to be redesigned/refabricated using completely different technology (that, AFAIK, does not exist yet), etc...
Re:Summary sucks, someone please provide better on (Score:4, Informative)
From what the errata says, unless the host software has specifically disallowed access to parts of the MSR, a VMX guest/non-root system could reload the CPU microcode.
This leads to a whole universe of complicated data theft/code execution/etc. exploits that will probably never be created due to their complexity. However, it also leads to a very, very, very simple DoS/crash exploit (load some bad microcode, crash the CPU).
Re:Time for RISC? (Score:4, Informative)
Yes, they will. But those chips are designed with a target price of thousands of dollars and without anywhere near as much concern about heat.
Power has a 128 KB L1 cache (64 KB on Core 2), 4 MB L2 cache per core (4 MB L2 shared on Core 2), and a 32 MB L3 cache (none on Core 2). If you're willing to pay for that, x86 would be a lot faster.
Oh, don't forget that Power chips run really really hot. Hotter than Pentium 4's. The market has made it clear that lower power usage / heat generation is a priority now.
Re:Linus doesn't think it's a big deal. (Score:4, Informative)
He may be entirely right, and his experience in CPUs, BIOS vendors and Intel, AMD, etc may mean what he is saying is accurate - but the tone doesn't really sound very professional.
The erratum mentioned (Score:5, Informative)
AI65 - Thermal interrupt does not occur if DTS reaches an invalid temperature. What the hell is an invalid temperature? A disconnected sensor or something? It doesn't sound like something a userland thermal-generating loop can exploit but the errata is not detailed enough to know for sure.
AI79 - REP/STO in specific situation may cause the processor to hang. BIOS patchable. The errata mentions an uncacheable memory store. If this is a prerequisite then only user programs with access to
AI43 - Concurrent MP writes to non-dirty page may result in unpredictable behavior. This one is extremely serious. It affects any threaded program and possibly even programs which are not threaded. This would cause me to not purchase the cpu. It says that a BIOS workaround is possible (aka microcode update).
AI39 - Cache access request from one core hitting a modified line in the L1 cache of another core may cause unpredictable system behavior. What the hell? Are they out of their minds? This is a big-time show stopper. It says it can be fixed with the BIOS (aka microcode update). I sure hope so.
AI90 - Page access bit may be set prior to signaling a code segment limit fault. This one is pretty serious. This cannot occur on most operating systems because the code segment is set to be unlimited and access is governed solely by the page tables. In 64 bit mode emulating 32 bit operation the problem might occur if a bit of code wraps the segment. There are possibly issues in other emulation modes, such as VM86 mode. Setting the page accessed bit will not make a previously inaccessible page accessible, but it will result in unexpected modifications to the page table page. Numerous operating systems may free such pages to the page-zeroed page list under the assumption that they cleaned the page out, when in fact there may be a page table entry with the access bit set (meaning the page wasn't completely zeroed when freed). That could cause problems.
AI99 - Updating code page directory attributes without TLB invalidation may result in improper handling of a page fault exception. This one doesn't look too serious; it just means the wrong exception will be taken first, meaning that the OS will probably seg-fault the program. If the OS corrects the issue and retries, the correct exception will be taken on retry. All BSDs that I know of handle page fault exceptions generically and will not be affected. Of greater concern is what sort of modifications to a page directory entry now require TLB invalidations. On FreeBSD and DragonFly, and I assume most BSDs and probably Linux too, page directory entries usually transition between only two states and a TLB invalidation is made when a page directory entry is invalidated, so they wouldn't be affected by this bug.
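To make the AI90 worry concrete, here is a minimal Python sketch of a page-table page that the OS believes it has cleared, with one entry whose accessed bit was set afterwards. It is not taken from any real kernel: the names `PAGE_SIZE`, `ENTRY_SIZE` and `safe_to_put_on_zeroed_list` are made up for illustration, and it assumes 4 KB pages with 8-byte entries. The accessed flag being bit 5 (0x20) matches the x86 page table entry layout.

```python
import struct

PAGE_SIZE = 4096          # assumed 4 KB page
ENTRY_SIZE = 8            # assumed 64-bit page table entries
PTE_ACCESSED = 1 << 5     # x86 "accessed" flag in a page table entry


def safe_to_put_on_zeroed_list(page: bytes) -> bool:
    """Hypothetical check: only treat the page as pre-zeroed if every byte is 0."""
    return len(page) == PAGE_SIZE and not any(page)


# The OS cleared every entry, so it assumes the page is all zeroes.
page = bytearray(PAGE_SIZE)

# Simulate the erratum: the accessed bit in entry 3 gets set behind the OS's back.
struct.pack_into("<Q", page, 3 * ENTRY_SIZE, PTE_ACCESSED)

# Indices of entries that are not actually zero.
stray = [
    i for i in range(PAGE_SIZE // ENTRY_SIZE)
    if struct.unpack_from("<Q", page, i * ENTRY_SIZE)[0] != 0
]
```

An OS that skips a check like this and trusts its own bookkeeping would hand out the page as zeroed even though entry 3 still holds 0x20, which is the scenario described above.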
AMT - 0wned at the hardware level. (Score:3, Informative)
That's actually a bad article about a real issue. A better article is here. [monstersandcritics.com]
Intel's AMT technology puts special purpose hardware in the network controller which recognizes UDP and TCP packets on ports 16992, 16993, 16994, and 16995. This is completely independent of the operating system. Various system administration functions can be performed. Anybody can inventory the machine and read its ID. Other functions, like power off/on, reboot, user disable (disables keyboard/mouse/on-off switch) and remote disk I/O require a password or crypto key.
This has been around for a while; the previous version was called IPMI, Intelligent Platform Management Interface. It talked UDP only. AMT also talks TCP and HTTP; there's a whole protocol stack in the network controller now just for this. This was originally a server farm management system, but now it's on desktops, too. If HTTP mode is enabled, you can control the machine from a web browser via port 16992.
It even works while the computer is "turned off"; it's part of "wake on LAN" functionality.
Supposedly, there is no valid default password or key, and the feature is supposedly off by default. But if any software ever enables this, you're 0wned.
The computer manufacturer can preload management keys. "An OEM may supply platforms with a PID-PPS pair already written to the Intel AMT Flash memory.", according to Intel. If a vendor does that, they 0wn your computer. Something to watch for. AMT can also be enabled from the Intel Management BIOS extension screen. (Password: "admin", it says in the manual.)
The normal way AMT keys get loaded in a corporate environment is that you plug in a USB key with a special file ("setup.bin") and power cycle the machine. The machine then tries to connect to the mothership on port 9971, doing a DNS lookup for "ProvisionServer" if no IP address was specified.
If you don't want AMT enabled, here's how to disable it: [intel.com], "Intel AMT is returned to Factory Mode by selecting the Unprovision option on the BIOS Extension menu or by disabling Intel AMT from the BIOS extension Manageability Feature Selection."
The whole AMT system is reasonably designed; it even has Kerberos authentication. But it's so powerful and so hidden that if it's ever enabled, it's worse than a root kit. Even reinstalling the OS won't help.
Here's Intel's technical info about AMT. [intel.com]
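If you want a rough idea of whether anything answers on the AMT ports listed above, a plain TCP connect from another machine on the LAN is one way to look. This is only a sketch: the helper names (`is_amt_port`, `tcp_port_open`, `scan_amt_ports`) are made up, it only tries TCP (not UDP), and a refused connection tells you nothing more than that no listener answered from where you probed.

```python
import socket

# Ports the comment above attributes to AMT.
AMT_PORTS = (16992, 16993, 16994, 16995)


def is_amt_port(port: int) -> bool:
    """True if the port is one of the AMT ports listed above."""
    return port in AMT_PORTS


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Try a plain TCP connect; True only if something accepts the connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def scan_amt_ports(host: str) -> dict:
    """Map each AMT port to whether a TCP connect to it succeeded."""
    return {port: tcp_port_open(host, port) for port in AMT_PORTS}
```

Calling `scan_amt_ports("192.0.2.10")` (a documentation-only address, substitute a machine you administer) returns a dict keyed by port number. Only point this at hosts you are responsible for.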
Re:The erratum mentioned (Score:5, Informative)
AE1 - CPU to memory copy with FST with numeric and null segment exceptions may cause GP faults to be missed and FP linear address mismatch. In other words, a segmentation violation will be missed and a write will be allowed to proceed. This will not affect OSs using page tables for protection, which is all OSs. Sounds bad but doesn't sound like it will affect existing OSs.
AE2 - Code segment violation may occur if a code stream wraps around a segment. No program does this on purpose, and OSs will just seg-fault the program if it does. The intel errata says it could be exploited by a virus but I don't see how by its current description. Maybe there is something they aren't telling us.
AE3 - POPF/POPFD that sets the trap flag (aka when single-stepping a program) may cause unpredictable behavior. Holy shit. This one is serious.
AE4 - REP MOVS in fast string mode continues in that mode when crossing into a page with a different memory type. This means that when crossing over from a cacheable page to an uncacheable page, the I/Os remain cacheable. And vice versa. This will never happen on purpose so the question is whether it can be exploited in some way, and the answer to that is not that I can see.
AE5 - Memory aliasing with inconsistent dirty and Access bits may cause a processor deadlock. This means a PTE with 'D'irty set but with 'A'ccess not set. FreeBSD and DragonFly always set the A bit when setting the D bit and will not be affected but I don't know about other OSs. This is a very serious bug though.
AE6 - VM bit will be cleared on a double fault exception. Double faults are usually fatal for the whole machine, so this only matters if they can occur in an emulation mode (where the double fault is being emulated). Check your OS. FreeBSD and DragonFly do not try to resume after a double fault and do not take faults in VM mode and are not affected.
AE7 - Incompatible write attributes in page table versus MTRR may consolidate to UC. Not a big deal, doesn't happen unless something has been misprogrammed.
AE8 - FXSAVE after FNINIT without an intervening FP instruction may save uninitialized values for FDP and FDS. This isn't an issue unless the data being written represents a security leak of some sort, such as a portion of the state of another program's FP unit. This could be a security issue with regards to one program snooping another program's cryptography. Statistical snooping possible through this sort of mechanic has been shown to be effective in recent years.
AE9 - LTR can result in a system hang. Well, BSDs don't really use LTR all that much and the conditions required just will not happen on BSD or probably linux either. A break point must be set on the cache line containing the descriptor data? Not from userland!
AE10 - Invalid entries in the page directory pointer table register may cause a GP fault. Not an issue.
AE11 - REP MOVS operation in fast string mode continues in that mode when crossing into a page with a different memory type. Not an issue.
AE12 - FP inexact result exception flag may not be set if the #inexactresult occurs in any FPU instruction with certain instructions occurring afterwards. This is a very serious bug that only compilers can work around (and probably won't).
AE13 - IFU/BSU deadlock may cause system hang. I've no idea what IFU and BSU are.
AE14 - MOV with debug register causes a debug exception. Sounds like the worst that happens here is a program seg faults if this condition is hit while the program is being debugged.
AE15 - INIT does not clear global entries in the TLB. Oh, joy. Intel says that BIOS writers would know of this erratum and code for it, but insofar as I know this could be an issue when starting up APs.
AE16 - Use of memory aliasing with inconsistent memory type may cause system hang. It shouldn't be possible for this to happen with a modern OS. It means mapping the same physical page of memory with different memory contr
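For the AE5 case above (a PTE with Dirty set but Accessed clear), the condition is easy to express as bit tests. A minimal Python sketch follows, using the x86 page table entry flag positions (present = bit 0, writable = bit 1, accessed = bit 5, dirty = bit 6). The function names `dirty_without_accessed` and `mark_dirty` are hypothetical, and `mark_dirty` only illustrates the "always set A when setting D" policy the comment attributes to FreeBSD and DragonFly; it is not their actual code.

```python
# x86 page table entry flag bits.
PTE_PRESENT = 1 << 0
PTE_WRITABLE = 1 << 1
PTE_ACCESSED = 1 << 5
PTE_DIRTY = 1 << 6


def dirty_without_accessed(pte: int) -> bool:
    """True for the inconsistent state AE5 talks about: D set, A clear."""
    return bool(pte & PTE_DIRTY) and not (pte & PTE_ACCESSED)


def mark_dirty(pte: int) -> int:
    """Hypothetical helper: set D and A together so the entry is never inconsistent."""
    return pte | PTE_DIRTY | PTE_ACCESSED
```

An entry built as `PTE_PRESENT | PTE_DIRTY` trips the check, while anything that went through `mark_dirty` does not.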
Re:Yay AMD (Score:1, Informative)
Re:Yay AMD (Score:2, Informative)
Not knocking him in general but here he hasn't produced anything we didn't know already.
Re:Modern processors (Score:1, Informative)
Those CPUs don't work quite the way you think they do.
The internal 'instruction set' (really microcode) doesn't resemble any ordinary RISC instruction set; it's designed specifically to implement x86, not to operate like an end-user-visible instruction set would. Problems of the type you're talking about arise when translating between dissimilar ISAs with different condition code handling and the like. That's just not an issue here.
Few x86 instructions translate to more than two or three microcode ops, and a large number translate to just one. The point of the translation is mostly to separate ALU instructions with memory operands into discrete loads, stores, and ALU ops. Instructions which don't touch memory are very unlikely to need more than one microcode op.
The underlying core has no instruction format suitable for storage in external memory; uops are fully decoded control words, which are much larger than the original instructions. It would be terribly inefficient to expose this to user programs.
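The load / ALU / store split described above can be shown with a toy model. This is a sketch only: the micro-op names, the `tmp0`/`tmp1` temporaries and the `crack` function are invented for illustration and say nothing about how any real decoder encodes its control words. Operands written in square brackets stand for memory references.

```python
def is_mem(operand: str) -> bool:
    """Treat operands written like "[ebx]" as memory references."""
    return operand.startswith("[") and operand.endswith("]")


def crack(op: str, dst: str, src: str) -> list:
    """Split a two-operand ALU instruction into load / ALU / store micro-ops."""
    uops = []
    if is_mem(src):
        # Memory source: load it into a temporary first.
        uops.append(("load", "tmp0", src))
        src = "tmp0"
    if is_mem(dst):
        # Memory destination: read, modify, write back.
        uops.append(("load", "tmp1", dst))
        uops.append((op, "tmp1", src))
        uops.append(("store", dst, "tmp1"))
    else:
        # Register destination: a single ALU op is enough.
        uops.append((op, dst, src))
    return uops
```

In this model `crack("add", "eax", "ebx")` yields one op, `crack("add", "eax", "[ebx]")` yields two, and `crack("add", "[ebx]", "eax")` yields three, which lines up with the "one, two or three" counts mentioned above.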
1. x86 isn't nearly as dirty as you think it is. The real cruft was pre-386; if you lock a 386 or later x86 into 32-bit mode and ignore the other modes it's not that bad. The biggest remaining wart was the original x87 FPU instruction set, but that's now possible to ignore in favor of SSE.
2. What the hell does Microsoft wanting to keep Windows closed source have to do with the popularity of x86? You're completely insane for suggesting a connection. There's nothing which would make Microsoft open-source Windows if it had to run on other CPUs. The proof is that Microsoft had (and in some cases shipped to end users) working ports of Windows NT for PowerPC, MIPS, Alpha, and probably others I'm forgetting. All closed source. All cancelled for reasons which had nothing to do with open/closed source.
3. AMD64 isn't any cleaner than 32-bit x86. It doesn't change much, just adds support for 64-bit registers, 64-bit ALU ops, and 8 more registers. To do this they needed to use more opcodes, which meant chaining more prefix bytes. Thus you get fun things like register-to-register ALU instructions being longer if one of the operands is one of the new registers.
Oh god.
Never studied how much a mess that made of early versions of Alpha, have you?
This is stupid.
Sorry there's no less harsh a way to put it. It's a bad, dumb idea. RISC ISAs which left out hardwa
Re:Time for RISC? (Score:3, Informative)
RISC binaries tend to be larger than CISC binaries. The reason is that complex instruction sets, like the one in the x86 architecture, were made complex to save memory. Most of the common instructions are represented in only one byte, while the rarer instructions can be much much longer. RISC instruction sets, on the other hand, typically have a fixed instruction size for all instructions, and the average instruction length ends up being longer.
While most people have plenty of hard drive space now, and RAM isn't as scarce as it used to be, CPU caches are still pretty small. Using a more compact instruction set can make the code caches more effective, which can dramatically improve performance.
I doubt this completely justifies choosing CISC over RISC, but it is at least one piece of information in favor of sticking with CISC.
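A back-of-the-envelope version of the density argument, as a Python sketch. The byte strings are the standard 32-bit x86 encodings of six common instructions. The comparison assumes a classic fixed 4-byte RISC encoding and, more questionably, that the RISC version needs the same number of instructions, which is not true in general, so treat the ratio as illustrative only.

```python
# 32-bit x86 encodings of a handful of common instructions.
X86_SAMPLE = {
    "push ebp": bytes.fromhex("55"),
    "mov ebp, esp": bytes.fromhex("89e5"),
    "mov eax, 1": bytes.fromhex("b801000000"),
    "add eax, ebx": bytes.fromhex("01d8"),
    "pop ebp": bytes.fromhex("5d"),
    "ret": bytes.fromhex("c3"),
}

FIXED_WIDTH = 4  # bytes per instruction, assumed for a classic 32-bit RISC encoding

# Total bytes for the x86 sample versus the same count of fixed-width instructions.
x86_bytes = sum(len(enc) for enc in X86_SAMPLE.values())
risc_bytes = FIXED_WIDTH * len(X86_SAMPLE)
average_x86_length = x86_bytes / len(X86_SAMPLE)
```

For this sample the x86 code is 12 bytes against 24 for the fixed-width encoding, an average of 2 bytes per instruction. Real programs will land somewhere else, but that is the effect on instruction cache footprint being described.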
Re:wrong (Score:3, Informative)
SGI/MIPS canceled two high end CPUs, Beast and Capitan, specifically because of the threat of Itanium. I was there, I saw it happen.
http://news.com.com/Silicon+Graphics+scraps+MIPS+
Compaq killed Alpha before the HP merger, before Carly, with the intention of moving their high end business to IA-64:
http://en.wikipedia.org/wiki/DEC_Alpha [wikipedia.org]
Obviously, Itanium redirected HP's focus away from PA-RISC since it was a HP/Intel project.
Itanium failed to completely derail SPARC, but it caused a great deal of controversy inside Sun about the future of the SPARC architecture and disrupted SPARC development for a year or two.
http://news.com.com/2100-1001_3-237583.html [com.com]
IBM's Power architecture was perhaps the least affected by Itanium. IBM was pretty skeptical about Itanium and kept the Power program very much alive. As a result, they are the only RISC family which still has a significant presence on the Top500 supercomputing list.
http://www.top500.org/stats/list/29/procfam [top500.org]
http://www.top500.org/stats/list/13/procfam [top500.org]
I have no idea whether all these other CPU families would have been successful in the marketplace without Itanium. However, the fact is they were killed due to one of the most influential vaporware announcements in the history of the computer industry.