Slashdot
Erratum Plagues Quad-Core Opterons, Phenoms
Posted by
kdawson
on Tue Dec 04, 2007 07:43 PM
from the correct-or-fast-choose-at-most-one dept.
theraindog writes "Errata are not uncommon with new processors, but a problem with the TLB logic in AMD's quad-core Opteron and Phenom processors appears to be quite serious. The erratum is so severe that AMD has issued a 'stop ship' order on all quad-core Opterons. AMD has also blamed this bug for the delay of the 2.4GHz Phenom, despite the fact that the erratum is unrelated to clock speed. A BIOS-based workaround for the issue has been made available to motherboard makers, but it apparently carries a 10-20% performance penalty. What's more disturbing is that AMD knew of the erratum and the potential performance hit associated with fixing it before it launched the Phenom processor. Hardware provided to the press for reviews did not include the fix, conveniently overstating Phenom performance."
Related Stories
Details of New Intel Dunnington and Nehalem Architectures Leaked 147 comments
Daily Tech is reporting that details about Intel's new processor models were leaked over the weekend. Both the six core Dunnington and Nehalem architectures were featured in this leak. "Dunnington includes 16MB of L3 cache shared by all six processors. Each pair of cores can also access 3MB of local L2 cache. The end result is a design very similar to the AMD Barcelona quad-core processor; however, each Barcelona core contains 512KB L2 cache, whereas Dunnington cores share L2 cache in pairs. [...] Nehalem is everything Penryn is -- 45nm, SSE4, quad-core -- and then some. For starters, Intel will abandon the front-side bus model in favor of QuickPath Interconnect; a serial bus similar to HyperTransport."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
What??? (Score:5, Informative)
But dictionary.com is your friend.
Design errors and mistakes in a CPU's hardwired microcode may also be referred to as an erratum. One well publicised example is Intel's "flag" erratum in early Pentium Pro processors. This made the conversion of floating point numbers to integers unreliable due to an exception not being signaled under certain conditions.
Re:What??? (Score:4, Insightful)
The thing is, the CPU is actually broken a bit and AMD has pulled the Barcelona line but is continuing to sell the Phenom(inal Failure) line to customers and, evidently, doesn't plan to 'fix' the problem later (Intel offered replacements for the Pentium floating point bug after they got dinged on it, for example... I know... I had one and replaced it).
So... if you actually get your hands on (or got your hands on) a Phenom, realize you have a broken CPU and the more you load it, the more likely you'll have stability issues.... and AMD isn't (currently) going to fix it.
Re:What??? (Score:5, Informative)
They may not have been published, but there are at least three:
1) A memory-indirect jump where the address is stored across a 256-byte boundary will read the second byte of the address from the wrong location.
2) The arithmetic status flags are not valid when performing arithmetic in BCD mode.
3) If a hardware interrupt occurs while the processor is fetching a BRK instruction, the BRK instruction is ignored.
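Bug 1 in the list above can be sketched in a few lines. This is only an illustration, and it assumes the chip being described is the NMOS 6502 (the comment does not name it), where the high-byte fetch of an indirect jump wraps around within the same 256-byte page. The function names and the memory contents are made up for the example.

```python
# Sketch of the indirect-jump pointer fetch, assuming NMOS 6502 behaviour.
# read_pointer_correct / read_pointer_buggy are hypothetical helper names.

def read_pointer_correct(memory, ptr):
    """Fetch a 16-bit little-endian address the way the programmer expects."""
    lo = memory[ptr]
    hi = memory[(ptr + 1) & 0xFFFF]
    return (hi << 8) | lo

def read_pointer_buggy(memory, ptr):
    """Fetch the address with the page-wrap bug: only the low 8 bits of the
    pointer are incremented, so the high byte comes from the same page."""
    lo = memory[ptr]
    hi_addr = (ptr & 0xFF00) | ((ptr + 1) & 0x00FF)
    hi = memory[hi_addr]
    return (hi << 8) | lo

# Invented memory image: a pointer stored across the 0x02FF/0x0300 boundary.
memory = bytearray(0x10000)
memory[0x02FF] = 0x34  # low byte of the intended target
memory[0x0300] = 0x12  # high byte of the intended target
memory[0x0200] = 0x56  # unrelated byte that the buggy fetch picks up instead

correct = read_pointer_correct(memory, 0x02FF)
buggy = read_pointer_buggy(memory, 0x02FF)
print(f"intended target: {correct:#06x}, buggy target: {buggy:#06x}")
```

When the pointer does not sit on the last byte of a page, both functions return the same address, which is why the bug only bites in that one case.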
Re:What??? (Score:5, Funny)
Mod me down, call me troll, but please don't claim to be a geek if you can claim to never have heard of erratum or errata. That's as bad as not knowing what a bug is or calling a PC case and its contents a hard drive.
Here's a heartfelt suggestion...read more.
Re:What??? (Score:5, Informative)
The conventional terms used for erratum, however, are usually "error" or "bug".
NDA for patch? (Score:5, Interesting)
Good thing it's just a patch, as opposed to a derived work of someone else's GPLed code. I wonder what the FSF guys would say about that. I also wonder: Red Hat, why?
Re:NDA for patch? (Score:5, Insightful)
There are other possibilities that are more likely. For example, perhaps the patched kernel is doing something like loading microcode into the processor. The kernel code would be GPLed but the microcode would not be.
Re:NDA not enforcible (Score:5, Informative)
The GPL only applies to redistribution. Private-use changes don't have to be GPL'd.
IANAL,TIJHIUI (I Am Not A Lawyer, This Is Just How I Understand It).
Bummer (Score:4, Insightful)
Re:Bummer (Score:5, Funny)
Re:Bummer (Score:5, Funny)
Just thinking out loud here: did you try pushing in the Turbo button?
Cue the intel jokes (Score:5, Funny)
Re:Cue the intel jokes (Score:5, Funny)
--------------
Intel's new motto: "United We Stand, Divided We Fall"
Q: How many Pentium designers does it take to screw in a light bulb?
A: 1.99904274017, but that's close enough for non-technical people.
Q: What do you get when you cross a Pentium PC with a research grant?
A: A mad scientist.
Q: What's another name for the "Intel Inside" sticker they put on Pentiums?
A: The warning label.
Q: What do you call a series of FDIV instructions on a Pentium?
A1: Successive approximations.
A2: A random number generator.
Q: Complete the following word analogy: Add is to Subtract as Multiply is to:
1) Divide
2) Round
3) Random
4) All of the above
Q: What algorithm did Intel use in the Pentium's floating point divider?
A: "Life is like a box of chocolates." (Source: F. Gump of Intel)
Q: Why didn't Intel call the Pentium the 586?
A: Because they added 486 and 100 on the first Pentium and got
585.999983605.
Q: According to Intel, the Pentium conforms to the IEEE standards 754
and 854 for floating point arithmetic. If you fly in aircraft
designed using a Pentium, what is the correct pronunciation of "IEEE"?
A: Aaaaaaaiiiiiiiiieeeeeeeeeeeee!
Q: Did you hear about the new "morning after" pill being developed as a
replacement for RU-486???
A: It's called RU-Pentium. It causes the embryo to not divide correctly.
TOP TEN NEW INTEL SLOGANS FOR THE PENTIUM
9.9999973251 - It's a FLAW, Dammit, not a Bug
8.9999163362 - It's Close Enough, We Say So
7.9999414610 - Nearly 300 Correct Opcodes
6.9999831538 - You Don't Need to Know What's Inside
5.9999835137 - Redefining the PC -- and Mathematics As Well
4.9999999021 - We Fixed It, Really
3.9998245917 - Division Considered Harmful
2.9991523619 - Why Do You Think They Call It *Floating* Point?
1.9999103517 - We're Looking for a Few Good Flaws
0.9999999998 - The Errata Inside
Worth a laugh anyway
"because", not "despite" (Score:5, Insightful)
Why does the summary claim this? I read through both articles, and AMD says this is a hardware issue across both chip models. Since this is a hardware issue, wouldn't it stand to reason that AMD would hold up a related chip because it's a hardware bug across both chip models and not because it's a clock speed issue? I'm not sure where the "despite" comes into play. I didn't see where the article said that AMD is not delaying a different speed Phenom.
Re:"because", not "despite" (Score:4, Interesting)
Re:"because", not "despite" (Score:4, Informative)
If it's a race condition in hardware, there's a good chance it's clock-sensitive. The bug probably exists in the whole line, sure. It'll manifest more as the clock ticks get closer together, because the margin for error without triggering the reversal of steps is smaller. If it's a matter of the wrong signal sometimes being asserted because the edge of a clock line transition was missed, it's logically going to happen more when the clock cycles are shorter.
A bug being in the whole line regardless of clock frequency and that bug becoming more of an issue at higher clock frequencies are not at all mutually exclusive conditions. The higher frequencies and higher rates of the error may not coincide, but there's nothing in the article to logically say they don't.
The erratum probably does apply to the whole line equally, but it probably shows up during some percentage of the time in use, and that percentage is some function of the frequency.
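To put rough numbers on that argument, here is a minimal sketch of how timing margin shrinks with clock speed. It assumes a simple model where a signal has to settle within one clock period minus a setup time. The path delay and setup time are invented for illustration and say nothing about the actual mechanism of AMD's erratum, which the articles don't describe at this level.

```python
# Sketch of timing slack versus clock frequency.
# PATH_DELAY_NS and SETUP_NS are hypothetical values, not AMD figures.

def clock_period_ns(freq_ghz):
    """Length of one clock cycle in nanoseconds."""
    return 1.0 / freq_ghz

def timing_slack_ns(freq_ghz, path_delay_ns, setup_ns):
    """Time left over after the signal has propagated and met setup time.
    A negative value means the latch can sample the wrong value."""
    return clock_period_ns(freq_ghz) - (path_delay_ns + setup_ns)

PATH_DELAY_NS = 0.40  # assumed worst-case propagation delay of the path
SETUP_NS = 0.03       # assumed setup time of the receiving latch

slacks = {f: timing_slack_ns(f, PATH_DELAY_NS, SETUP_NS) for f in (2.2, 2.3, 2.4)}
for freq, slack in slacks.items():
    status = "ok" if slack >= 0 else "VIOLATION"
    print(f"{freq} GHz: slack {slack:+.4f} ns ({status})")
```

The same circuit with the same delay passes at the lower clocks and fails at the higher one, which is the sense in which a bug can be present in the whole line and still matter more for the faster part.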
For any geek wanting a basic understanding of issues like latching times, gate propagation delays, and other analog electrical signaling issues inside a digital CPU, I recommend the first few chapters of Structured Computer Organization [isbn.nu]. The book builds upon basic designs of computers from using TTLs to designing a CPU, then up by layers through microcode, designing an assembly language, and more. I have an older edition at home which covers up through the 68030 and the 80386 as examples. The newer one covers up through the Pentium II, the UltraSparc, and the Java chips. The book won't make you an electrical engineer by any means, but the discussions of the tricky timing issues within even simple CPUs might be useful here.
As for the clock speed not affecting the percentage loss in efficiency due to the microcode fix... well, yeah. The microcode is the same across the line regardless of the clock speed. If you insert two identical strings of instructions A1 and A2 into an identical pair of microcode stores B1 and B2, the resulting patched microcodes C1 and C2 will likewise be identical. The faster processor will decode and execute the microcode at the same clock speed as before, and so will the slower one. They'll each have the same percentage slowdown relative to their own clock speeds, because they're running the same microcode. We're not talking about two different generations of processors or even two different revisions. It's the same processor design at two clock speeds. One is going to get the same nerfs and buffs for any microcode change proportional to their clock speeds as the other.
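The arithmetic behind that is short enough to write out. This sketch assumes a task that is bound purely by core cycles (memory and bus latency are ignored), and the cycle counts are hypothetical, with the 15% overhead picked from the middle of the 10-20% range quoted in the summary.

```python
# Sketch: the same extra cycles cost the same percentage at any clock speed.
# CYCLES_BEFORE and CYCLES_AFTER are invented numbers for illustration.

CYCLES_BEFORE = 1_000_000  # cycles the task takes without the workaround
CYCLES_AFTER = 1_150_000   # cycles with the workaround (assumed 15% more)

def run_time_s(cycles, freq_ghz):
    """Wall-clock time for a purely cycle-bound task."""
    return cycles / (freq_ghz * 1e9)

def relative_slowdown(freq_ghz):
    """Fractional increase in run time caused by the workaround."""
    before = run_time_s(CYCLES_BEFORE, freq_ghz)
    after = run_time_s(CYCLES_AFTER, freq_ghz)
    return after / before - 1.0

for freq in (2.2, 2.4):
    print(f"{freq} GHz: {relative_slowdown(freq):.1%} slower with the workaround")
```

The frequency cancels out of the ratio, so both parts lose the same fraction of their own performance even though the faster part still finishes sooner in absolute terms.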
Old issue, really (Score:4, Interesting)
Re:Old issue, really (Score:5, Informative)
AMD is in a world of hurt right now. The "true" quad-core line appears to be nothing more than marketing hyperbole since year-old q6600's are faster clock-for-clock than Phenom is. AMD will hopefully get these bugs ironed out... by next February. Even then though, AMD will have chips that are MASSIVELY expensive to make, but that they can't sell for the higher prices Intel is able to command. AMD would be fine if they had an expensive chip they could sell at a premium, or a very cheap to produce chip they could sell for the budget crowd, but right now they have Acura production costs coupled with Kia per-unit revenues: bad times.
Let's not forget.. (Score:5, Interesting)
that Intel's Core 2 also had a problem with the TLB when first released, although that problem manifested itself as data corruption instead of a lockup. Here are the two [theinquirer.net] articles [theinquirer.net] from The Inquirer about it - the second one especially. And note that this document was released after Intel had shipped the buggy Core 2's.
However, Intel was able to fix it without incurring a large performance loss. It's a shame for AMD that they weren't able to do the same.
Good thing they bought ATI (Score:5, Funny)
Perfect Linux CPUs (Score:4, Interesting)
The performance hit is probably 10% when patching the microcode which should mean steep price mark-downs on this generation of CPUs. But it's only a 1% performance hit when patching the (Linux) kernel.
So why doesn't every OEM that sells Linux servers and desktops just buy up all of AMD's supplies of defective chips at a big discount, and pass the savings along? I'd buy a couple.
No. (Score:5, Funny)
No, but it looks bad (Score:5, Insightful)
They did (Score:5, Informative)
Re:No, but it looks bad (Score:5, Insightful)
It might not be AMD's doom, but they're really not that many big screwups away.