Intel Bug Hardware

Theo de Raadt Details Intel Core 2 Bugs 442

Eukariote writes "Recently, Intel patched bugs in its Core 2 processors. Details were scarce; soothing words were spoken to the effect that a BIOS update is all that is required. OpenBSD founder Theo de Raadt has now provided more details and analysis on outstanding, fixed, and non-fixable Core 2 bugs. Some choice quotes: 'Some of these bugs... will *ASSUREDLY* be exploitable from userland code... Some of these are things that cannot be fixed in running code, and some are things that every operating system will do until about mid-2008.'"
This discussion has been archived. No new comments can be posted.

  • Yay AMD (Score:4, Funny)

    by delt0r ( 999393 ) on Thursday June 28, 2007 @07:58AM (#19674651)
    Thank God I got an AMD this time around.
    • Re:Yay AMD (Score:5, Informative)

      by BosstonesOwn ( 794949 ) on Thursday June 28, 2007 @08:02AM (#19674703)
      I don't think that is a good thing either. It looks like AMD may be doing this as well.

      (While here, I would like to say that AMD is becoming less helpful day
      by day towards open source operating systems too, perhaps because
      their serious errata lists are growing rapidly too).
      • Re: (Score:3, Insightful)

        by delt0r ( 999393 )
        He said they are getting less and less helpful. Which is not the same as "just as bad".

        Damn, I really think it would be better if we didn't have a "two party system" in the x86 field. A third (or fourth) vendor would be nice. But given the high barrier to entry, it's not going to happen anytime soon.
        • Re:Yay AMD (Score:4, Informative)

          by vadim_t ( 324782 ) on Thursday June 28, 2007 @08:26AM (#19674963) Homepage
          Well, there's VIA as well, although their stuff left a lot to be desired the last time I checked it out. Their mini-ITX stuff had potential -- small, low power usage, REALLY good crypto and video acceleration to compensate for the slow CPU. Unfortunately when I tried a Nehemiah board, it was very unstable.
          • by delt0r ( 999393 )
            Well, VIA's chips are not really performance CPUs. They are more for applications that are size/power sensitive, IIRC. The idea of using the phrase "two party system" is that voting for the independents does not get them into power. That is, we have a market dominated by just 2 players. Really just one major player with a significant other.

            We just got a 150,000 euro cluster from IBM. The options were Intel or AMD.
        • Re: (Score:3, Funny)

          by Intron ( 870560 )
          Why x86? You write a lot of code in machine language? Ever hear of Java?
          • by delt0r ( 999393 )
            I program 98% in Java. What cost/performance option is there that is not x86? The Cells are coming, but how well do the JITs perform on them? There was the PowerPC etc. with Apple. But now that's gone too.
            • Re:Yay AMD (Score:5, Informative)

              by TheRaven64 ( 641858 ) on Thursday June 28, 2007 @08:51AM (#19675311) Journal
              SPARC is doing very well for certain categories of workload, although mainly web-app types at the moment. Most computers sold these days have some form of ARM chip[1], which is a nice, low-power architecture, but lacks floating point. This isn't a huge problem, since a lot of ARM designs (particularly those from TI) have a DSP on die which can seriously out-perform a general purpose CPU for a lot of FPU-heavy workloads.

              For general-purpose usage, the most interesting design I've seen recently is the PWRficient from P.A. Semi. It's a nice dual-core 64-bit PowerPC, with low power usage, similar performance to IBM's PowerPC 970 series. It has a lot of nice stuff on-die (crypto, a really shiny DMA architecture, etc).

              For a complete round-up of current alternatives, take a look at this article [informit.com] and the next two in the series.


              [1] They are generally marketed as 'cell phones' or similar, rather than 'computers'.

        • Re: (Score:3, Interesting)

          by MoxFulder ( 159829 )

          Damn, I really think it would be better if we didn't have a "two party system" in the x86 field. A third (or fourth) vendor would be nice. But given the high barrier to entry, it's not going to happen anytime soon.

          Heck, it doesn't even have to be x86. I don't use anything but open-source software, so I don't really care one bit about the underlying architecture, as long as it performs well. If somebody builds a well-performing, price-competitive mobo/processor combo that I can drop into my current box, and supports USB/SATA/PCI/2D graphics, I'll use it.

          ARM, MIPS64, PowerPC, SPARC, whatever works... I imagine there's a large community of open-source users who would similarly jump ship from x86 if there were an alte

    • by Etyenne ( 4915 )

      From the TFA:

      (While here, I would like to say that AMD is becoming less helpful day by day towards open source operating systems too, perhaps because their serious errata lists are growing rapidly too).

      Yay indeed.

  • by supersnail ( 106701 ) on Thursday June 28, 2007 @07:58AM (#19674653)
    ... not Intel compatible.

    Ask for your money back folks!
  • by Junior J. Junior III ( 192702 ) on Thursday June 28, 2007 @07:58AM (#19674657) Homepage
    We're talking about a few hundred million transistors. I imagine that detecting and fixing bugs when there's that many components involved is really, really difficult. Are other comparably complex processors better? How do AMD, VIA, Motorola, IBM, etc. fare?
    • by delt0r ( 999393 )
      This is all true, and AMD have had problems in the past. Though it does seem to be more of an Intel problem. Their bugs tend to be worse, and Intel tends to be worse at letting everyone know. IMO anyway.

      Don't get me wrong, I have had my fair share of Intel machines. I *almost* got a Duo this time around. But a store special on an AMD X2 was just too good to pass up.
    • by ardor ( 673957 ) on Thursday June 28, 2007 @08:09AM (#19674781)
      Actually we are talking about VHDL. The "million transistors" argument is just as appropriate as saying "software is so large, it has so many ones and zeros". Development does not happen at this low a level.
      • by imgod2u ( 812837 ) on Thursday June 28, 2007 @08:34AM (#19675069) Homepage
        Yes and no. There are limitations to HDLs (and Intel, last I heard, was all Verilog). For one, it is *very* difficult, if not impossible in certain situations, to describe asynchronous signals. With something as complicated as a microprocessor that is so aggressively designed for both power and speed, I would guess that they didn't go with a completely synchronous design (hell, no one does anymore). Locally synchronous, globally asynchronous design has been in use for a while. It helps when you want to be able to shut off, or slow down, only parts of the chip that aren't being used very much.

        It is not possible to describe such things (let alone voltage islands or voltage scaling) in an HDL, and they must either be a special feature built into synthesis (with an extra set of constraints) or done by hand at the transistor/gate level.

        Then there's the point of verification. Every software release since about the mid-1990s has been almost immediately followed by patches. Just because it's "1's and 0's" does not mean that it doesn't get harder to detect corner cases as complexity grows. And it's much more difficult if you have to simulate on a cycle-accurate model (boot-up for an operating system, in simulation, would take a day on a nice cluster on something as big as the Core 2).

        Then there are post-synthesis/layout issues. Timing analysis does best on sequential logic (completely synchronous). When you throw in clock gating, multiple voltage islands, dynamic voltage scaling (meaning dynamic gate delays), not to mention the plethora of other techniques that those folks might be using, what you see in simulation at the RTL level will not match what you see in reality. The first rule they teach in any ASIC design class is never trust simulation.

        The point is that abstraction, like how it's "vhdl", does not mean that it's not difficult to get right and even sometimes impossible to be certain.
    • Re: (Score:3, Interesting)

      by supersnail ( 106701 )
      Except it seems that Intel has just arbitrarily changed the way MMU instructions work.

      Intel seems to regard these as unpublished improvements rather than bugs.
    • Three words....Black box testing [wikipedia.org].
    • by Zontar_Thing_From_Ve ( 949321 ) on Thursday June 28, 2007 @08:27AM (#19674983)
      How do AMD, VIA, Motorola, IBM, etc. fare?

      AMD64 doesn't like FreeBSD 6.2 at all. We use FreeBSD and Linux in our business. FreeBSD is very important to us. In fact, I would go so far as to say that the senior management here in our IT department borders on being fanboys of FreeBSD. We were running various versions of FreeBSD on our AMD64 servers from 6.1 down to 5.something and we (foolishly in hindsight) decided that we had to upgrade to version 6.2 because it had some bug fixes we thought we needed. Oh, they did fix those bugs, but they opened up a huge one whose cause apparently nobody knows and that nobody has any idea how to fix. What happens is that AMD64 systems will panic with some sort of a "sleeping on a non-sleepable lock" panic. Some people think that this is being caused by a large number of writes. Given how our servers work, this is certainly possible for us. The bottom line for us is that FreeBSD on AMD64 is so unstable that we are probably going to have to go to Linux instead for our web servers. Nobody wants to do that, but we simply can't have our web servers going down every day with the same panic, and we lose one server a day on average to this problem. We've even had boxes crash within minutes of being brought up with the exact same panic.

      Once we move to Linux, I don't think we'll go back to FreeBSD. My best guess is that because the problem has apparently been going on for months with no resolution that we'll start moving servers from FreeBSD to Linux when we can. We don't have this problem under Linux. The fact is that whether we like it or not, more people use Linux and if stuff is seriously broken under Linux, someone will fix it soon enough. With FreeBSD nobody seems to have any idea what to do for this problem and I'm not sure that it will even be fixed this year, let alone soon enough to keep us from moving to Linux.
      • Comment removed (Score:4, Insightful)

        by account_deleted ( 4530225 ) on Thursday June 28, 2007 @08:37AM (#19675119)
        Comment removed based on user account deletion
      • by TheGratefulNet ( 143330 ) on Thursday June 28, 2007 @08:44AM (#19675233)
        AMD64 doesn't like FreeBSD 6.2 at all

        % uname -a
        FreeBSD myhost.grateful.net 6.2-STABLE FreeBSD 6.2-STABLE #0: Mon May 28 09:52:28 PDT 2007 me@myhost.grateful.net:/usr/obj/usr/src/sys/AMD64 i386

        Granted, I'm using 32-bit mode, but I've been running 6.2 for as long as it's been out on my 'always on' FreeBSD box. What issues are you seeing? This is my production box, but I don't see any problems with BSD. In fact, I also have 6.2 running on an old AMD64 3000+ that was a mobile chip and had to have cpufreq enabled just to move it off its default 800 MHz and up to the 2.mumble GHz that it's supposed to clock at. Works fine.

        I have seen some hardware devices not behave well, but often it's not a well-designed piece of hardware or it's just not meant for server-style loads (cheap consumer onboard SATA sometimes times out, and USB 2.0 always times out if you give it enough load).

        I can't speak to AMD64 using 64-bit mode, but 32-bit mode works as well as (or better than) Linux for headless-style computing.
        • by suv4x4 ( 956391 ) on Thursday June 28, 2007 @09:04AM (#19675461)
          % uname -a
          FreeBSD myhost.grateful.net 6.2-STABLE FreeBSD 6.2-STABLE #0: Mon May 28 09:52:28 PDT 2007 me@myhost.grateful.net:/usr/obj/usr/src/sys/AMD64 i386


          Wait... this works in Slashdot's text area?

          % uname -a

          % uname -a

          Damn it :(
    • by suv4x4 ( 956391 )
      We're talking about a few hundred million transistors. I imagine that detecting and fixing bugs when there's that many components involved is really, really difficult. Are other comparably complex processors better? How do AMD, VIA, Motorola, IBM, etc. fare?

      It's not about size, it's about messiness. The current x86 architecture is just hacks upon hacks. Various CPU modes slapped on top in each generation. Commands abstracted into other commands or translated to RISC commands. Certain commands running on par
    • Re: (Score:3, Insightful)

      by cyfer2000 ( 548592 )
      Most of those transistors are used to make cache. Assume there is a 4 MiB L2 cache, roughly 33.5 million bits; Intel uses a 6-transistor-per-bit design, so that's about 200 million transistors. There are also several million transistors used for the L1 cache. I remember there being no more than 300 million transistors on a Conroe chip, so we are talking about tens of millions of transistors for code execution now. And there are still a lot of transistors used to locate the cache, decode...
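The cache arithmetic above can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: the 4 MiB L2 size, the 6-transistor SRAM cell, and the 300 million ceiling are the figures from the comment, the function name is made up for illustration, and tag bits, ECC, and the L1 caches are ignored.

```python
def sram_data_transistors(cache_bytes, transistors_per_bit=6):
    """Transistors in the data array alone (no tags, ECC, or decoders)."""
    return cache_bytes * 8 * transistors_per_bit


L2_BYTES = 4 * 1024 * 1024        # 4 MiB L2, as assumed in the comment
TOTAL_TRANSISTORS = 300_000_000   # "no more than 300 million", per the comment

l2_bits = L2_BYTES * 8
l2_transistors = sram_data_transistors(L2_BYTES)
remaining = TOTAL_TRANSISTORS - l2_transistors

print(f"L2 bits:            {l2_bits:,}")
print(f"L2 transistors:     {l2_transistors:,}")
print(f"Left for the rest:  {remaining:,}")
```

Under these assumptions the L2 data array alone takes roughly two thirds of the budget, which leaves on the order of a hundred million transistors for everything else.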
  • by Blahbooboo3 ( 874492 ) on Thursday June 28, 2007 @08:00AM (#19674683)
    Uh, the slashdot summary is pretty lousy. After RTFA I am still a bit confused, can someone at slashdot please provide an "english" translation of the problems and how dangerous they are to normal users?

    Thanks!
    • by Aladrin ( 926209 ) on Thursday June 28, 2007 @08:07AM (#19674757)
      Sure:

      Some of the bugs are so dangerous that it doesn't matter WHAT operating system you're running, code could be written that could attack the entire system. It would still be OS-specific code, but since the exploit is in the hardware, it's a LOT harder to prevent the attack, if it's even possible.

      Some of the bugs are unfixable, as well. (I assume they mean without physically replacing the chip with a 'fixed' one that doesn't exist yet.)
      • by 0123456 ( 636235 )
        "Some of the bugs are so dangerous that it doesn't matter WHAT operating system you're running, code could be written that could attack the entire system."

        Indeed. For example, 'rm -rf ~'

        I can see that a number of these bugs could potentially be exploited by evil code running on your machine. But if you have evil code running on your machine, you're already in deep crap.
        • by vadim_t ( 324782 ) on Thursday June 28, 2007 @08:29AM (#19675003) Homepage
          This is going to be a big deal for shared hosting environments for example.

          If you can, as a normal user, execute something that lets you get root on the box, and there's nothing the OS can do to prevent it, then it's a seriously nasty situation for quite a few businesses.

          I wouldn't be surprised if businesses like that started switching to AMD hardware.
          • Re: (Score:3, Informative)

            by 0123456 ( 636235 )
            "This is going to be a big deal for shared hosting environments for example."

            True, but that depends on how easily they could be exploited in the real world, rather than in the theoretical world. From what I remember, one was about incorrect behaviour when your code runs off the end of a 4GB boundary; certainly that might be exploitable, but not on any system which can't run >4GB of code.

            I skimmed through the bugs which the author said really scared him and didn't see anything that looked easy to exploit
        • by DrSkwid ( 118965 )
          There's the added bonus of the possibility that the source code would look benign but compile it to buggy machine code and it turns belligerent.
      • by martyros ( 588782 ) on Thursday June 28, 2007 @08:44AM (#19675225)
        So perhaps the NY law requiring software for voting machines to be held in escrow should include the chip layout as well...
      • Some of the bugs are so dangerous that it doesn't matter WHAT operating system you're running, code could be written that could attack the entire system. It would still be OS-specific code, but since the exploit is in the hardware, it's a LOT harder to prevent the attack, if it's even possible.

        Here's a little more detail, based on my (very incomplete) understanding of the issues:

        It appears that Intel has made changes to the way the memory management unit in the processor works, plus there are also some bugs that affect memory management. So what does that mean?

        • Theo mentions changes in how TLB flushes must be handled. Translation Lookaside Buffers (TLBs) are tables that cache information used to quickly determine what physical memory page corresponds to a given virtual memory page. Each running process has its own address space (meaning the data at address, say, 1000, is different for each process) and operating systems have to be able to quickly translate these virtual addresses to addresses within the physically-available RAM. The authoritative data on the mapping is in a set of data structures called the "page table", but the processors provide a mechanism for creating and managing TLBs which act as a high-performance cache of part of the page table data. Failing to properly flush the TLBs during a context switch (putting one process to sleep and activating another) might result in the new process's virtual memory mapping being done incorrectly. From a security perspective, this could give one process access to memory owned by another.
        • Another issue mentioned is the possibility that No-Execute bits may be ignored. The OS can set the No-Execute (NX) bit on a page of memory that it knows to be pure data that should not be executed. The processor will refuse to execute code from any memory page with NX set. This makes most buffer overflow attacks impossible, because the normal buffer overflow attack involves getting a bit of malicious code shoved into a stack-based buffer as well as overflowing the buffer to overwrite a return address so that the CPU will jump to and execute the malicious code. Obviously, if the processor sometimes ignores NX bits, the buffer overflow attacks become possible again.
        • Theo also mentions possibly-ignored Write-Protect (WP) bits. The OS can mark memory pages as read-only. This is used for all sorts of things related to security. One of the biggest is preventing processes from writing to the memory in which shared libraries are loaded. If my process could overwrite, say, the C library code implementing "printf", other processes that call this function would execute my code. Some of them will be running as root, so I can execute code with root permissions. Modern operating systems do lots of data-sharing between processes, some of it completely non-writable, other parts of it "copy on write". Copy-on-write pages are implemented by setting the WP bit and then catching the page fault generated by the CPU when a process tries to write the page. The fault handler quickly copies the page in question, allows the write to hit the copy, and reswizzles the page table so the virtual page of the writing process points to the new copy. WP bits being ignored would also break this, so lots of cases where data is "opportunistically" shared would become really and truly shared, allowing one process to corrupt data used by another.
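To make the TLB point concrete, here is a toy model in Python. It is a sketch under loud assumptions, not how any real MMU is built: the `ToyMMU` class, its method names, and the page and frame numbers are all invented for illustration, and the TLB is modelled as a single untagged dictionary shared by every process.

```python
class ToyMMU:
    """Toy model: one page table per process plus one shared, untagged TLB."""

    def __init__(self):
        self.page_tables = {}   # pid -> {virtual page: physical frame}
        self.tlb = {}           # virtual page -> physical frame (cached)
        self.current = None

    def add_process(self, pid, mapping):
        self.page_tables[pid] = dict(mapping)

    def context_switch(self, pid, flush_tlb=True):
        self.current = pid
        if flush_tlb:
            self.tlb.clear()    # drop translations cached for the old process

    def translate(self, vpage):
        if vpage in self.tlb:   # TLB hit: the page table is not consulted
            return self.tlb[vpage]
        frame = self.page_tables[self.current][vpage]   # miss: walk the table
        self.tlb[vpage] = frame
        return frame


mmu = ToyMMU()
mmu.add_process("A", {1000: 7})    # same virtual page, different frames
mmu.add_process("B", {1000: 42})

mmu.context_switch("A")
frame_a = mmu.translate(1000)

mmu.context_switch("B", flush_tlb=True)
frame_b_correct = mmu.translate(1000)

mmu.context_switch("A")
mmu.translate(1000)
mmu.context_switch("B", flush_tlb=False)   # the flush is skipped or mishandled
frame_b_stale = mmu.translate(1000)

print(frame_a, frame_b_correct, frame_b_stale)
```

In the last step process B is handed the frame that belongs to process A, which is the "one process gets access to memory owned by another" outcome described in the first bullet.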

        There are other issues as well... but these are a good sample, and should give an idea of what kind of bad stuff these CPU bugs/changes can make possible.
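The copy-on-write mechanism described in the WP bullet can also be sketched as a small simulation. Everything here is hypothetical scaffolding (the `Frame` and `Process` classes, `hw_write`, `os_write`, and the `honor_wp` switch that stands in for a CPU ignoring the bit); it shows the idea, not any real kernel's fault handler.

```python
class PageFault(Exception):
    pass


class Frame:
    def __init__(self, data):
        self.data = bytearray(data)   # always a private copy of the bytes


class Process:
    def __init__(self, name):
        self.name = name
        self.pages = {}   # virtual page -> [frame, write_protected]


def share_cow(frame, vpage, *procs):
    """Map one frame into several processes with the WP bit set."""
    for p in procs:
        p.pages[vpage] = [frame, True]


def hw_write(proc, vpage, offset, value, honor_wp=True):
    """Simulated CPU store: faults on a write-protected page if WP is honoured."""
    frame, wp = proc.pages[vpage]
    if wp and honor_wp:
        raise PageFault(vpage)
    frame.data[offset] = value


def os_write(proc, vpage, offset, value, honor_wp=True):
    """Simulated OS path: on a WP fault, copy the page, remap it, retry."""
    try:
        hw_write(proc, vpage, offset, value, honor_wp)
    except PageFault:
        old_frame, _ = proc.pages[vpage]
        proc.pages[vpage] = [Frame(old_frame.data), False]   # private copy
        hw_write(proc, vpage, offset, value, honor_wp)


# WP honoured: the writer gets its own copy, the other process is untouched.
a, b = Process("a"), Process("b")
share_cow(Frame(b"printf"), 0, a, b)
os_write(a, 0, 0, ord("X"))
good = (bytes(a.pages[0][0].data), bytes(b.pages[0][0].data))

# WP ignored: no fault, no copy, and the write lands in the shared frame.
c, d = Process("c"), Process("d")
share_cow(Frame(b"printf"), 0, c, d)
os_write(c, 0, 0, ord("X"), honor_wp=False)
bad = (bytes(c.pages[0][0].data), bytes(d.pages[0][0].data))

print(good, bad)
```

When the simulated CPU ignores the bit, the second process sees the first one's write, which is the "opportunistically shared becomes really and truly shared" failure the comment describes.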

        • by Bri3D ( 584578 ) on Thursday June 28, 2007 @10:16AM (#19676299) Journal
          Another scary bug (perhaps the scariest, since it appears to be the one that most reliably/repeatably occurs) is AI88: Microcode Updates Performed During VMX Non-root Operation Could Result in Unexpected Behavior.
          From what the errata says, unless the host software has specifically disallowed access to parts of the MSR, a VMX guest/non-root system could reload the CPU microcode.
          This leads to a whole universe of complicated data theft/code execution/etc. exploits that will probably never be created due to their complexity. However, it also leads to a very, very, very simple DoS/crash exploit (load some bad microcode, crash the CPU).

      • Re: (Score:3, Informative)

        by mwvdlee ( 775178 )
        What worries me most is this: with software, you can supply a fixed version to customers for the cost of a CD and a postage stamp (or less); with hardware it's slightly more expensive and thus slightly less likely to ever happen.
    • by Hurga ( 265993 ) on Thursday June 28, 2007 @08:08AM (#19674773)
      can someone at slashdot please provide an "english" translation of the problems and how dangerous they are to normal users?

      "We don't have the complete picture yet, but things look bad"

      Hanno
    • the computer thingamajibs don't do things right and the computer nerds are all upset about it. best not to click on ANYTHING until 2009
    • Uh, the slashdot summary is pretty lousy. After RTFA I am still a bit confused, can someone at slashdot please provide an "english" translation of the problems and how dangerous they are to normal users?

      The second link [geek.com] in the article, containing brief descriptions of bugs, might be useful, although perhaps still quite technical. One bug that is perhaps easy to communicate to the "normal user" is AE30, where the bug might cause some software running on Core Duo during the dehibernation to reload data from the wrong memory location. It's labeled as "potentially catastrophic", and I imagine that after the wrong reload, more or less anything can happen: some program crashing, OS crashing, to, who knows, may

      • by TheRaven64 ( 641858 ) on Thursday June 28, 2007 @09:08AM (#19675507) Journal
        I don't know why Theo posted that link, because it is about the Core, not the Core 2. They are two completely different micro-architectures. The Core was a slightly tweaked Pentium M (which is basically a P6 with extra vector instructions and the NetBurst branch predictor), while the Core 2 is a completely new micro-architecture. If you compare the errata in the two links, you will see that they are quite different.
  • Good stuff. (Score:5, Insightful)

    by AltGrendel ( 175092 ) <ag-slashdot@eGAUSSxit0.us minus math_god> on Thursday June 28, 2007 @08:00AM (#19674685) Homepage
    I always find Mr. De Raadt's comments an interesting read. He's like a geek version of Harlan Ellison.
    • Re:Good stuff. (Score:5, Informative)

      by Lisandro ( 799651 ) on Thursday June 28, 2007 @08:22AM (#19674915)
      Same here. The guy might seem like a bit of an asshole sometimes, but he surely knows what he's talking about. Some of the things he points out are plain unbelievable:

      Basically the MMU simply does not operate as specified/implimented in previous generations of x86 hardware. It is not just buggy, but Intel has gone further and defined "new ways to handle page tables" (see page 58).

      Some of these bugs are along the lines of "buffer overflow"; where a write-protect or non-execute bit for a page table entry is ignored. Others are floating point instruction non-coherencies, or memory corruptions -- outside of the range of permitted writing for the process -- running common instruction sequences.


      It will be interesting to see what Intel has to say about this.
      • by suv4x4 ( 956391 ) on Thursday June 28, 2007 @08:44AM (#19675241)
        Some of these bugs are along the lines of "buffer overflow"; where a write-protect or non-execute bit for a page table entry is ignored. Others are floating point instruction non-coherencies, or memory corruptions -- outside of the range of permitted writing for the process -- running common instruction sequences.

        It will be interesting to see what Intel has to say about this.


        Yea! Damn, where's the Intel Opinion Center exactly when you need it!
      • by Vryl ( 31994 ) on Thursday June 28, 2007 @10:54AM (#19676897) Journal
        "Come on people, move along, nothing to see here".
  • Patches (Score:4, Insightful)

    by suv4x4 ( 956391 ) on Thursday June 28, 2007 @08:04AM (#19674723)
    outstanding, fixed, and non-fixable Core 2 bugs

    Well, in these days of fast-paced business, business at the blink of an eye, at the speed of light, at the speed of spooky action at a distance kinda speed, it's normal that companies would release products prematurely and then patch later.

    Thankfully, software is very easy to patch post-release.

    Now, the only thing left to do, is someone tell Intel that they're selling hardware.
    • Re:Patches (Score:4, Informative)

      by Jeff DeMaagd ( 2015 ) on Thursday June 28, 2007 @08:36AM (#19675091) Homepage Journal
      Now, the only thing left to do, is someone tell Intel that they're selling hardware.

      Hardware has had built-in firmware/software for as long as I remember. BIOS is software. Microcode for even consumer CPUs has been done for as long as I remember, Pentium II had it. Apparently, the 8086 had microcode-based instructions.
      • Re: (Score:3, Informative)

        by suv4x4 ( 956391 )
        Hardware has had built-in firmware/software for as long as I remember. BIOS is software. Microcode for even consumer CPUs has been done for as long as I remember, Pentium II had it. Apparently, the 8086 had microcode-based instructions.

        Don't confuse microcode with firmware. Two different things. Microcode isn't intrinsically updateable, and may be placed in a read-only memory block.
    • by jefu ( 53450 )
      According to Andy Grove (Intel co-founder) "Hardware is nothing but frozen software.", so they are selling software, "frozen software".
  • Time for RISC? (Score:3, Insightful)

    by 644bd346996 ( 1012333 ) on Thursday June 28, 2007 @08:11AM (#19674795)
    For years, the x86 instruction set has been implemented on top of RISC cores. That microcode layer has been getting thicker over the years, and now it seems that it may be too complex to be reliably dealt with. I wonder if this means that we should toss out that x86 layer and deal just with the high-performance, straightforward RISC core.
    • Re:Time for RISC? (Score:5, Insightful)

      by Viv ( 54519 ) on Thursday June 28, 2007 @08:23AM (#19674931)
      The market resoundingly rejected that idea when Intel tried to foist IA64 on it.
      • wrong (Score:3, Interesting)

        by nanosquid ( 1074949 )
        IA64 was not a RISC version of x86-like chips, it was a completely new architecture coming out of VLIW work at HP. IA64 architectures had failed once before, and it was a stupid idea for Intel to push them so heavily (personally, I think they will never work because they push too much complexity into software). Switching to IA64 meant that many compilers couldn't be made to work at all, and many other compilers would generate inefficient output.

        Intel should be developing a conservative RISC processor, an
        • Re:wrong (Score:5, Insightful)

          by Sparohok ( 318277 ) on Thursday June 28, 2007 @12:20PM (#19678115)
          Itanium is a lesson in how not to handle technological transitions. Itanium was picked by geeks who had no idea of what the market wanted or needed, and Intel marketing and management blindly believed what they were hearing from the geeks.

          Actually, Itanium was a wildly successful product. Mere rumors of Itanium's capabilities were sufficient to kill DEC Alpha, drive SGI/MIPS out of the high end processor market and disrupt SPARC and PA-RISC development programs. Intel virtually eliminated the threat of competitive RISC architectures for years with the announcement of Itanium.

          (Another company that works like that is Microsoft, which is why they keep churning out such bad software.)

          To much the same effect.
    • Re: (Score:2, Informative)

      by Slashcrap ( 869349 )
      I wonder if this means that we should toss out that x86 layer and deal just with the high-performance, straightforward RISC core.

      Did you know that one of the main reasons that x86 outperforms any similarly specified RISC chip is because those horribly inelegant, variable length x86 instructions allow for considerably higher code density than RISC?

      Elegant does not necessarily equal faster or better, no matter how much you might want it to.
      • Re: (Score:3, Insightful)

        by LWATCDR ( 28044 )
        No, the real reason that x86 has such good performance is that Intel and AMD have spent billions of dollars in R&D to make the x86 the fastest flying pig ever.

        I think the latest Power series will give any Intel CPU a run for its money, as will the latest SPARC.

        Code density is nice, since access to main memory is a bottleneck. CISC does have some advantages over RISC, just as RISC has some over CISC.
        However, the x86 as an ISA really does suck. x86-64 is a lot better, but it could still be better.
        • Re:Time for RISC? (Score:4, Informative)

          by edwdig ( 47888 ) on Thursday June 28, 2007 @10:46AM (#19676729)
          I think the latest Power series will give any Intel CPU a run for its money, as will the latest SPARC.

          Yes, they will. But those chips are designed with a target price of thousands of dollars and without anywhere near as much concern about heat.

          Power has a 128 KB L1 cache (64 KB on Core 2), 4 MB L2 cache per core (4 MB L2 shared on Core 2), and a 32 MB L3 cache (none on Core 2). If you're willing to pay for that, x86 would be a lot faster.

          Oh, don't forget that Power chips run really, really hot. Hotter than Pentium 4s. The market has made it clear that lower power usage / heat generation is a priority now.
    • Re:Time for RISC? (Score:4, Interesting)

      by p3d0 ( 42270 ) on Thursday June 28, 2007 @09:08AM (#19675513)
      What makes you think the bugs are in the "x86 layer"?
  • by N8F8 ( 4562 ) on Thursday June 28, 2007 @08:12AM (#19674797)
    Coming from the government sector, this kind of issue isn't going to be taken lightly. I work at a DoD facility and all our machines were just refreshed with Core 2 Duo machines. It is already almost impossible to get new software approved, if this causes the same paranoia for basic commodity hardware we're really gonna feel some pain.
  • intel issues (Score:3, Interesting)

    by artjunk ( 1088603 ) on Thursday June 28, 2007 @08:16AM (#19674837)
    I am exceedingly ignorant in this area, but even I can grasp how dangerous some of these are. And, as a Mac user (PowerPC Dual G5 - thankfully), I suspect that this will come as really bad news to the Mac community as well. It's unbelievable to me that some of the "Show Stopper" issues are not being addressed by Intel - especially when reports of nation-to-nation cyber wars/cyber attacks are beginning to pepper the news. The fact that some of these are not resolvable through software patches is VERY worrisome to me! I am very appreciative of those who can fully interpret these flaws in chip architecture and bring them to the public's awareness.
    • Why would it be a problem for Mac users? there are literally a handful of viruses and malware for the Mac.

      I doubt these bugs will cause problems for many end users, it is those using servers that will be worried. But the code has to get onto the box first.
      • Re: (Score:2, Informative)

        by artjunk ( 1088603 )
        From what I gather from the article, it's irrelevant what OS you use - as some of these issues are at the lower level (under/before the OS). And, since all newer Macs are Intel Core Duos, I think this could be an issue for them as well.
        • From what I gather from the article, it's irrelevant what OS you use...

          I don't think that is quite true. In order to execute one of these exploits, you need to get something running on the box and that means it has a binary that will run on your OS. Once that happens, you're rooted with no security measures in software able to do anything about it. To be useful, however, the OS will still have to be taken into account at this point as well.

          That is not to say no one will write a cross platform worm or a mac specific exploit for this, since it is as vulnerable (maybe more so

  • AI65. A Thermal Interrupt is Not Generated when the Current Temperature
    is Invalid
    Problem: When the DTS (Digital Thermal Sensor) crosses one of its programmed
    thresholds it generates an interrupt and logs the event
    (IA32_THERM_STATUS MSR (019Ch) bits [9,7]). Due to this erratum, if the
    DTS reaches an invalid temperature (as indicated IA32_THERM_STATUS MSR
    bit[31]) it does not generate an interrupt even if one of the programmed
    thresholds is crossed and the corresponding log bits become set.
    Implication: When the t
    • Please allow me to correct myself before I get jumped on. The fault lies in my Intel motherboard (D975xbx2), using the Intel chip E6600.
    • by Speare ( 84249 )

      To turn to the usual car analogy tactic:

      I don't see why this is an issue. The Intel Desktop temperature monitor doesn't even work on my E6600, so how can it detect an invalid temperature?

      I don't see why this is an issue. The burnt-out low-oil lamp on my dashboard doesn't even work on my car, so how can the oil pump detect an invalid oil level?

      If the input is working right, then a high temperature reading should trigger an interrupt to warn the OS that it should back off for a while. If the input isn't working right, perhaps it's just making up values, most of which are valid (10C, 80C, 345C, etc.) but maybe sometimes those made-up values are not

    • ...wondering WTF an invalid temperature is?

      Surely a temperature is always going to be valid unless these processors only support an extremely small set of possible temperature values?

      Anyone have more insight into what an invalid temperature might be and how it might be caused?
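
For what it's worth, the erratum text quoted above names the relevant fields of IA32_THERM_STATUS (MSR 019Ch): bits 7 and 9 are the threshold log bits and bit 31 says whether the reading is valid. Here is a rough Python sketch of decoding a raw MSR value along those lines. The digital readout field in bits 22:16 comes from my reading of Intel's manuals rather than from the erratum, the function names are made up for illustration, and actually reading the MSR from the hardware is left out entirely.

```python
# Bit positions: 7, 9 and 31 are taken from the AI65 erratum text;
# the readout field (bits 22:16) is an assumption based on Intel's manuals.
THRESH1_LOG = 1 << 7
THRESH2_LOG = 1 << 9
READING_VALID = 1 << 31


def decode_therm_status(value):
    """Split a raw IA32_THERM_STATUS value into the fields of interest."""
    return {
        "reading_valid": bool(value & READING_VALID),
        "threshold1_log": bool(value & THRESH1_LOG),
        "threshold2_log": bool(value & THRESH2_LOG),
        # Assumed: degrees C below the maximum junction temperature.
        "readout": (value >> 16) & 0x7F,
    }


def matches_ai65_condition(value):
    """True when a threshold log bit is set while the reading is flagged invalid.

    That is the combination for which AI65 says no interrupt is generated.
    """
    s = decode_therm_status(value)
    crossed = s["threshold1_log"] or s["threshold2_log"]
    return bool(crossed and not s["reading_valid"])
```

A monitoring tool could poll the MSR and call matches_ai65_condition() instead of relying on the interrupt, though whether polling is practical depends on the OS.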
  • I just bought a laptop with a Core 2, but it's not one of these specific processors. Does that (necessarily) mean mine does not have these issues? I think it's a later model.
  • Before, you could buy a Mac for the open architecture of the Power! Now, what can we buy?

    Read to the end of TFA!

    (While here, I would like to say that AMD is becoming less helpful day by day towards open source operating systems too, perhaps because their serious errata lists are growing rapidly too).

  • And now? Should everyone trash their Core 2s? I for one have neither the money nor the incentive to do this. It's a good thing De Raadt highlights these very serious issues, but unfortunately it comes too late.
  • Quantum effects? (Score:5, Interesting)

    by DFDumont ( 19326 ) on Thursday June 28, 2007 @08:34AM (#19675077)
    My favorite erratum in the list is AI22, Sequential Code Fetch to Non-canonical Address May have Nondeterministic Results. Basically the chip decides that all of the high order bits should be '1', instead of '0' - for no apparent reason, as it's not consistent.
    Did anyone notice these chips are using the 65nm process?
    At what point do the sheer quantum effects overcome the deterministic EE rules that are used to design the chips? I don't know, but Wikipedia defines a nanoparticle as one with at least one dimension less than 100nm. http://en.wikipedia.org/wiki/Nanoparticle [wikipedia.org]
    Given that definition every transistor's source, drain and gate are nanoparticles. And we expect them to behave classically why?
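
For anyone wondering what "non-canonical" means in AI22: on x86-64 parts with 48-bit virtual addresses, bits 63 through 47 of an address must all be copies of each other, either all 0 or all 1. Below is a minimal Python sketch of that check, assuming 48 implemented address bits. The helper names are hypothetical and this says nothing about what the erratum actually does inside the chip.

```python
MASK64 = (1 << 64) - 1


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits-1) of addr are all 0 or all 1."""
    addr &= MASK64
    top = addr >> (va_bits - 1)               # the upper (64 - va_bits + 1) bits
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


def canonicalize(addr, va_bits=48):
    """Sign-extend bit (va_bits-1) into the upper bits of a 64-bit address."""
    low = addr & ((1 << va_bits) - 1)
    if low >> (va_bits - 1):                  # top implemented bit is set
        low |= MASK64 & ~((1 << va_bits) - 1)
    return low
```

An address such as 0x0000800000000000 fails the check, because bit 47 is set while bits 63:48 are clear.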
    • Re: (Score:3, Funny)

      by Reverend528 ( 585549 )

      My favorite erratum in the list is AI22, Sequential Code Fetch to Non-canonical Address May have Nondeterministic Results. Basically the chip decides that all of the high order bits should be '1', instead of '0' - for no apparent reason, as it's not consistent.


      That's not a bug. It's a cryptographically secure random number generator!
  • Rush to conquer? (Score:4, Interesting)

    by Applekid ( 993327 ) on Thursday June 28, 2007 @08:35AM (#19675085)
    How do these errata compare to previous generations, or even AMD's? I wonder if any increase could be from rushing Core 2 to market to kick AMD's flagship CPU off the top of the heap.
  • by davebert ( 98040 ) on Thursday June 28, 2007 @08:44AM (#19675235)
    Link [realworldtech.com]
  • Plate armor was great until someone came up with the longbow. Large bastions/fortresses ruled warfare until mobile artillery became more practical.

    There's a sense that operating systems are getting better with security, so it's no surprise that people are starting to look at hardware.
  • by Anonymous Coward on Thursday June 28, 2007 @09:02AM (#19675439)
    The first Pentium had a floating point bug. Maybe they're working too closely with Microsoft? (I kid! I kid! Put down that flamethrower!) Anyway, here are a few Pentium jokes I dug up. If only the Core 2 bugs were all floating point errors, we could recycle all of these old jokes as Core 2 jokes!

    Q: How many Pentium designers does it take to screw in a light bulb?
    A: 1.99904274017, but that's close enough for non-technical people.

    Q: What do you get when you cross a Pentium PC with a research grant?
    A: A mad scientist.

    Q: What's another name for the "Intel Inside" sticker they put on Pentiums?
    A: The warning label.

    Q: What do you call a series of FDIV instructions on a Pentium?
    A1: Successive approximations.
    A2: A random number generator.

    Q: Complete the following word analogy: Add is to Subtract as Multiply is to:
                    1) Divide
                    2) ROUND
                    3) RANDOM
                    4) On a Pentium, all of the above
    A: Number 4.

    Q: What algorithm did Intel use in the Pentium's floating point divider?
    A: "Life is like a box of chocolates." (Source: F. Gump of Intel)

    Q: Why didn't Intel call the Pentium the 586?
    A: Because they added 486 and 100 on the first Pentium and got 585.999983605.

    Q: According to Intel, the Pentium conforms to the IEEE standards 754
            and 854 for floating point arithmetic. If you fly in aircraft
            designed using a Pentium, what is the correct pronunciation of "IEEE"?
    A: Aaaaaaaiiiiiiiiieeeeeeeeeeeee!

    Q: Did you hear about the new "morning after" pill being developed as a replacement for RU-486???
    A: It's called RU-Pentium. It causes the embryo to not divide correctly.

    TOP TEN NEW INTEL SLOGANS FOR THE PENTIUM:
        9.9999973251 It's a FLAW, Dammit, not a Bug
        8.9999163362 It's Close Enough, We Say So
        7.9999414610 Nearly 300 Correct Opcodes
        6.9999831538 You Don't Need to Know What's Inside
        5.9999835137 Redefining the PC -- and Mathematics As Well
        4.9999999021 We Fixed It, Really
        3.9998245917 Division Considered Harmful
        2.9991523619 Why Do You Think They Call It *Floating* Point?
        1.9999103517 We're Looking for a Few Good Flaws
        0.9999999998 The Errata Inside

    THE TOP TEN REASONS TO BUY A PENTIUM MACHINE:
    10. Your current computer is too accurate
    9. You want to get into the Guinness Book as "owner of most expensive paperweight"
    8. Math errors add zest to life
    7. You need an alibi for the I.R.S.
    6. You want to see what all the fuss is about
    5. You've always wondered what it would be like to be a plaintiff
    4. The "Intel Inside" logo matches your decor perfectly
    3. You no longer have to worry about CPU overheating
    2. You got a great deal from JPL
    1. It'll probably work

    Thank you, thank you, I'll be here all week. Remember to tip the bartender. Let's see, 20% of... divide by...
  • by Anonymous Coward on Thursday June 28, 2007 @09:13AM (#19675571)
    Scariest post on that thread:

    http://marc.info/?l=openbsd-misc&m=118302016430106&w=2 [marc.info]

    AMT is a technology intended to facilitate remote surveillance, maintenance
    and control of computers.

    * Monitor and control (filter) the network traffic - before/under the
    running operating system

    * Send out patches to computers - even if they are turned off.

    * Control, upgrade, change, add and remove software
  • by yAm ( 15181 ) on Thursday June 28, 2007 @09:36AM (#19675819)
    The AMT (Active Management Technology) is the truly frightening bit. Big Brother visits your computer:

    A Swedish ASIC designer explains:
    http://strombergson.com/kryptoblog/?p=311 [strombergson.com]

    (A rough) translation:
    http://marc.info/?l=openbsd-misc&m=118302016430106&w=2 [marc.info]

    • That's actually a bad article about a real issue. A better article is here. [monstersandcritics.com]

      Intel's AMT technology puts special purpose hardware in the network controller which recognizes UDP and TCP packets on ports 16992, 16993, 16994, and 16995. This is completely independent of the operating system. Various system administration functions can be performed. Anybody can inventory the machine and read its ID. Other functions, like power off/on, reboot, user disable (disables keyboard/mouse/on-off switch) and remote

  • Lots of issues (Score:4, Interesting)

    by flyingfsck ( 986395 ) on Thursday June 28, 2007 @10:25AM (#19676403)
    Stack problems, memory management problems, interrupt problems and so on. Many of these bugs will not cause an immediate exception or crash but may look like software bugs, for example a stack problem causing a return to the wrong address.

    I guess MS Windows users will simply blame Microsoft's sloppy code, when it isn't even their fault...
  • by m.dillon ( 147925 ) on Thursday June 28, 2007 @12:21PM (#19678123) Homepage
    OK, let's look at some of these.

    AI65 - Thermal interrupt does not occur if DTS reaches an invalid temperature. What the hell is an invalid temperature? A disconnected sensor or something? It doesn't sound like something a userland thermal-generating loop can exploit, but the erratum is not detailed enough to know for sure.

    AI79 - REP STOS in a specific situation may cause the processor to hang. BIOS patchable. The erratum mentions an uncacheable memory store. If this is a prerequisite then only user programs with access to /dev/io or memory-mapped bus space can exploit it. So e.g. something like XOrg, but not the typical user program. Worst case seems to be a system freeze. Still, this is something to be concerned about.

    AI43 - Concurrent MP writes to non-dirty page may result in unpredictable behavior. This one is extremely serious. It affects any threaded program and possibly even programs which are not threaded. This would cause me to not purchase the CPU. It says that a BIOS workaround is possible (aka microcode update).

    AI39 - Cache access request from one core hitting a modified line in the L1 cache of another core may cause unpredictable system behavior. What the hell? Are they out of their minds? This is a big-time show stopper. It says it can be fixed with the BIOS (aka microcode update). I sure hope so.

    AI90 - Page access bit may be set prior to signaling a code segment limit fault. This one is pretty serious. This cannot occur on most operating systems because the code segment is set to be unlimited and access is governed solely by the page tables. In 64 bit mode emulating 32 bit operation the problem might occur if a bit of code wraps the segment. There are possibly issues in other emulation modes, such as VM86 mode. The effect of setting the page accessed bit will not make a page accessible that was previously inaccessible, but it will result in unexpected modifications to the page table page, and numerous operating systems may free such pages to the page-zeroed page list under the assumption that they cleaned the page out when in fact there may be a page table entry with the access bit set (meaning the page wasn't completely zeroed when freed). That could cause problems.
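
To make the AI90 concern concrete, here is a small Python sketch of the kind of sanity check an OS could run before putting a page-table page on its pre-zeroed list. It is only an illustration under stated assumptions: 4 KiB pages holding 512 little-endian 64-bit entries (as in PAE/long mode), the Accessed flag at bit 5, and made-up function names. No real kernel's free path is being quoted here.

```python
import struct

PAGE_SIZE = 4096
ENTRIES_PER_PAGE = PAGE_SIZE // 8   # 512 eight-byte entries (assumed layout)
PTE_ACCESSED = 1 << 5               # x86 Accessed flag


def nonzero_entries(page):
    """Return (index, value) for every entry in the page that is not zero."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one 4 KiB page")
    entries = struct.unpack("<%dQ" % ENTRIES_PER_PAGE, page)
    return [(i, e) for i, e in enumerate(entries) if e != 0]


def stray_accessed_entries(page):
    """Entries whose only content is the Accessed bit, the case AI90 could leave behind."""
    return [i for i, e in nonzero_entries(page) if e == PTE_ACCESSED]


def safe_for_zeroed_list(page):
    """True only if the page really is all zeroes."""
    return not nonzero_entries(page)
```

The cost is a full scan of every freed page-table page, which is presumably why an OS would rather assume the page is clean.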

    AI99 - Updating code page directory attributes without TLB invalidation may result in improper handling of a page fault exception. This one doesn't look too serious; it just means the wrong exception will be taken first, meaning that the OS will probably seg-fault the program. If the OS corrects the issue and retries, the correct exception will be taken on retry. All BSDs that I know of handle page fault exceptions generically and will not be affected. Of greater concern is what sort of modifications to a page directory entry now require TLB invalidations. On FreeBSD and DragonFly, and I assume most BSDs and probably Linux too, page directory entries usually transition between only two states and a TLB invalidation is made when a page directory entry is invalidated, so they wouldn't be affected by this bug.
    • by m.dillon ( 147925 ) on Thursday June 28, 2007 @12:58PM (#19678635) Homepage
      Now the Core Duo/Solo errata.

      AE1 - CPU to memory copy with FST with numeric and null segment exceptions may cause GP faults to be missed and FP linear address mismatch. In other words, a segmentation violation will be missed and a write will be allowed to proceed. This will not affect OSs using page tables for protection, which is all OSs. Sounds bad but doesn't sound like it will affect existing OSs.

      AE2 - Code segment violation may occur if a code stream wraps around a segment. No program does this on purpose, and OSs will just seg-fault the program if it does. The Intel errata says it could be exploited by a virus but I don't see how by its current description. Maybe there is something they aren't telling us.

      AE3 - POPF/POPFD that sets the trap flag (aka when single-stepping a program) may cause unpredictable behavior. Holy shit. This one is serious.

      AE4 - REP MOVS in fast string mode continues in that mode when crossing into a page with a different memory type. This means that when crossing over from a cacheable page to an uncacheable page, the I/Os remain cacheable. And vice versa. This will never happen on purpose so the question is whether it can be exploited in some way, and the answer to that is not that I can see.

      AE5 - Memory aliasing with inconsistent dirty and Access bits may cause a processor deadlock. This means a PTE with 'D'irty set but with 'A'ccess not set. FreeBSD and DragonFly always set the A bit when setting the D bit and will not be affected, but I don't know about other OSs. This is a very serious bug though.
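
The AE5 condition is easy to state in code. A minimal Python sketch follows, assuming the usual x86 page table entry layout (Accessed at bit 5, Dirty at bit 6); the helper names are invented for illustration and are not taken from FreeBSD or DragonFly source.

```python
PTE_ACCESSED = 1 << 5   # 'A' bit
PTE_DIRTY = 1 << 6      # 'D' bit


def is_inconsistent(pte):
    """True for the combination AE5 describes: Dirty set but Accessed clear."""
    return bool(pte & PTE_DIRTY) and not (pte & PTE_ACCESSED)


def mark_dirty(pte):
    """Set Dirty and Accessed together, so the inconsistent state never arises."""
    return pte | PTE_DIRTY | PTE_ACCESSED
```

An OS that only ever sets D through something like mark_dirty() would never create the bad combination by itself, which is the behaviour described above for FreeBSD and DragonFly.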

      AE6 - VM bit will be cleared on a double fault exception. Double faults are usually fatal for the whole machine, so this should only matter if they can occur in an emulation mode (where the double fault is being emulated). Check your OS. FreeBSD and DragonFly do not try to resume after a double fault and do not take faults in VM mode, so they are not affected.

      AE7 - Incompatible write attributes in page table versus MTRR may consolidate to UC. Not a big deal, doesn't happen unless something has been misprogrammed.

      AE8 - FXSAVE after FNINIT without an intervening FP instruction may save uninitialized values for FDP and FDS. This isn't an issue unless the data being written represents a security leak of some sort, such as a portion of the state of another program's FP unit. This could be a security issue with regards to one program snooping another program's cryptography. Statistical snooping possible through this sort of mechanic has been shown to be effective in recent years.

      AE9 - LTR can result in a system hang. Well, BSDs don't really use LTR all that much and the conditions required just will not happen on BSD or probably Linux either. A break point must be set on the cache line containing the descriptor data? Not from userland!

      AE10 - Invalid entries in the page directory pointer table register may cause a GP fault. Not an issue.

      AE11 - REP MOVS operation in fast string mode continues in that mode when crossing into a page with a different memory type. Not an issue.

      AE12 - FP inexact result exception flag may not be set if the #inexactresult occurs in any FPU instruction with certain instructions occurring afterwards. This is a very serious bug that only compilers can work around (and probably won't).

      AE13 - IFU/BSU deadlock may cause system hang. I've no idea what IFU and BSU are.

      AE14 - MOV with debug register causes a debug exception. Sounds like the worst that happens here is a program seg faults if this condition is hit while the program is being debugged.

      AE15 - INIT does not clear global entries in the TLB. Oh, joy. Intel says that BIOS writers would know of this erratum and code for it, but insofar as I know this could be an issue when starting up APs.

      AE16 - Use of memory aliasing with inconsistent memory type may cause system hang. It shouldn't be possible for this to happen with a modern OS. It means mapping the same physical page of memory with different memory contr
