Linux Shootout: Opteron 150 vs. Xeon 3.6GHz Nocona
danalien writes "Anandtech stirred up a bit of controversy with their previous review, and they've now released a follow-up review where they pit AMD's Opteron 150 against Intel's Xeon 3.6 Nocona (on Linux)."
Short version: Xeon RIP. (Score:5, Interesting)
No message here. Oh, did you know that an Athlon64 3000+ is within 2fps of a P4 3.4 Extreme Edition [anandtech.com] in Doom 3?
Look up the prices for those two items.
Re:Short version: Xeon RIP. (Score:5, Informative)
Athlon64 3000+ (2GHz): $167 [newegg.com]
Pentium 4 3.4GHz Extreme Edition: $1025 [newegg.com]
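For anyone who wants the gap spelled out, here is a quick sketch of the arithmetic using only the two Newegg prices quoted above. The function and constant names are made up for illustration.

```python
# Prices as quoted in the comment above (Newegg listings at the time).
ATHLON64_3000_USD = 167
P4_34_EE_USD = 1025


def price_gap(cheap, expensive):
    """Return (absolute difference, expensive/cheap ratio) for two prices."""
    return expensive - cheap, expensive / cheap


diff, ratio = price_gap(ATHLON64_3000_USD, P4_34_EE_USD)
print(f"difference: ${diff}, ratio: {ratio:.2f}x")
```

This says nothing about performance; it only puts a number on the price side of the comparison.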
Re:Short version: Xeon RIP. (Score:2)
I'd also hesitate to make a choice based on performance on a single program unless that is the only program I ever plan to use.
Re:Short version: Xeon RIP. (Score:2)
I can run in Ultra mode in 1280x1024 with 2xAA and 16xAF and it's perfectly playable. I get 58.9FPS on timedemo 1 with those settings.
I think part of the secret is my FSB is at 265 because I've overclocked my P4 to 3.74GHz; the extra memory bandwidth prevents any of the sudden jerking Ultra mode might give on stock systems. I'd expect an Athlon 64 939 to be even better with the onboard controller.
Re:Short version: Xeon RIP. (Score:2)
Re:Short version: Xeon RIP. (Score:2)
I'm not sure how one could confirm that there is no compression except by what John Carmack has said ultra mode does.
Also when you compare Ultra mode to High detail mode the image quality is definitely higher suggesting that there is no compression (or less compression at least).
Re:Short version: Xeon RIP. (Score:5, Informative)
The design intended to become the Pentium 5 (Tejas) was cancelled [theregister.co.uk] in favour of Pentium M derivatives. Intel basically had to give up on the Netburst micro-architecture and is now concentrating on increased parallelism (multiple cores) rather than extreme clock rates.
Re:Short version: Xeon RIP. (Score:3, Funny)
opteron (Score:1)
Re:opteron (Score:5, Informative)
Re:opteron (Score:3, Informative)
There are other minor differences, but *goes off dreaming about a 4-way processor in a database server*
Re:Is there somewhere that details all the opteron (Score:4, Interesting)
242 = 1.6GHz, +£15 / +14% faster clock
244 = 1.8GHz, +£90 / +28%
246 = 2.0GHz, +£190 / +43%
248 = 2.2GHz, +£345 / +57%
250 = 2.4GHz, +£465 / +71%
First step's a no-brainer; the next one isn't too bad; after that you're hitting significant diminishing returns, with each 200MHz step being a smaller proportion of the total clock, not to mention other things becoming more likely to bottleneck (I/O: memory bandwidth, disk latency, network, PCI bus, etc.).
Core differences are going to be minimal, and HyperTransport has remained at 800MHz across the S940 range afaik, so the clocks *should* be a pretty accurate upper bound on the performance differences within each range.
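The diminishing-returns point can be made concrete with a rough sketch that works out the marginal cost of each step up the range. It assumes the price and clock deltas in the list above are cumulative, i.e. all measured against the same unlisted base model; that is an assumption, as are the names used here.

```python
# (model, clock in GHz, extra cost in GBP, extra clock in percent), from the list above.
# Assumption: every delta is relative to the same unlisted base model.
STEPS = [
    (242, 1.6, 15, 14),
    (244, 1.8, 90, 28),
    (246, 2.0, 190, 43),
    (248, 2.2, 345, 57),
    (250, 2.4, 465, 71),
]


def marginal_cost_per_point(steps):
    """Pounds paid per extra percentage point of clock, for each step up."""
    costs = []
    prev_price, prev_pct = 0, 0
    for _model, _ghz, price, pct in steps:
        costs.append((price - prev_price) / (pct - prev_pct))
        prev_price, prev_pct = price, pct
    return costs


marginal = marginal_cost_per_point(STEPS)
for (model, ghz, _price, _pct), cost in zip(STEPS, marginal):
    print(f"Opteron {model} ({ghz}GHz): about {cost:.2f} GBP per extra % of clock")
```

Clock speed is only an upper bound on the real gain, as noted above, so the true cost per percent of delivered performance can only be worse than these figures.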
Memory (Score:5, Insightful)
This lets you take advantage of the on-die memory controller, by letting each processor do its own memory work, rather than making the Northbridge do all the work.
If you want to use a single processor, you might as well use an FX-Whatever, since they are just an Opteron without MP capability and only one HT bus.
Re:Memory (Score:5, Informative)
Re:Memory (Score:2)
Like Linux 2.6, you mean?
Re:Memory (Score:2)
Re:Memory (Score:2)
Re:Memory (Score:2)
--Eamon
Re:Memory (Score:2, Informative)
What about Solaris, IRIX, AIX, etc.?
UltraSPARCs have been running with memory local to CPUs for quite a while now, for example.
Re:Memory (Score:2, Informative)
Re:Memory (Score:5, Insightful)
What really sets the Opteron apart in MP scenarios is the bandwidth. Each chip gets 6.4G/sec to memory: add more chips, get more bandwidth. The Xeon on the other hand has to share its 6.4G/sec with all the chips in the system, which severely limits its scaling. A quad Opteron has over 25G/sec of aggregate memory bandwidth, while a quad Xeon is stuck with its 6.4G shared 4 ways. That's half the bandwidth of a 400MHz P4 - no wonder the quad Xeons are often barely faster than the duals.
Add to this that cache snoops and other bus traffic all have to share the same FSB on the Xeon whereas on the Opteron local memory accesses don't touch the HT links at all. For a standard 2P system, this frees up 3.2G/sec of HT link bandwidth, and a NUMA-aware OS only increases the efficiency of the system.
Despite Intel's recent marketing push, they really don't have the best CPU, and don't have the best system either. There are still considerable advantages to choosing a Xeon system but these days they have little to do with the chip or the board and a lot to do with Intel backing. That's an advantage that will quickly evaporate as industry gets comfortable with non-Intel parts.
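The bandwidth arithmetic in this comment can be sketched in a few lines. The only inputs are the 6.4G/sec figure quoted above and the 3.2G/sec implied by the "half the bandwidth of a 400MHz P4" remark. The function names are invented for illustration, and these are theoretical peaks rather than measurements.

```python
PER_CONTROLLER_GB_S = 6.4  # per-chip (Opteron) or per-system (Xeon FSB) figure from the comment
P4_400_FSB_GB_S = 3.2      # implied by the comment's "half the bandwidth of a 400MHz P4"


def opteron_aggregate(n_cpus):
    """Each Opteron brings its own memory controller, so peak bandwidth adds up."""
    return PER_CONTROLLER_GB_S * n_cpus


def xeon_share_per_cpu(n_cpus):
    """All Xeons sit on one front-side bus, so the same peak is divided among them."""
    return PER_CONTROLLER_GB_S / n_cpus


for n in (1, 2, 4):
    print(f"{n} CPUs: Opteron aggregate {opteron_aggregate(n):.1f} GB/s, "
          f"Xeon share per CPU {xeon_share_per_cpu(n):.1f} GB/s")
```

Real workloads will see less than this on both platforms, especially when an Opteron has to fetch from another chip's memory over HT.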
Re:Memory (Score:1, Interesting)
simple proof: consider an array of 486 processors connected by an ultra-fast, ultra-low-latency network.
long boring argument for idiots like you: suppose you have a single value in memory, called 'GO!!', it has the value 0 or 1. If it's ever set to 1, all the CPUs have to do some work and write an answer or something. If it's set to 0, the CPUs have to stay idle, otherwise the electricity man comes and s
Re:Memory (Score:4, Informative)
How do they know? By cache coherence signals transferred between the CPUs. This isn't free and consumes bus bandwidth.
The first CPU can't "instantly" write the value either, because it must first obtain exclusive ownership of that cache line by checking with the other CPUs.
On the Opteron architecture (we call this NUMA, or "point-to-point"), as soon as one CPU writes a value to the 'GO!!' area, well, that's _just the beginning_. It has to tell another CPU in the system that it just did that. etc etc
It has to use some communication resource to update the other CPUs on the state of that cache line. Just like the bus-based situation.
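A toy model of the 'GO!!' example may help show why the write is not free on either design. This is only a sketch that counts logical coherence messages for a single cache line. It is not the actual protocol used by the Xeon or the Opteron, and the class and function names are invented.

```python
class SharedLine:
    """Toy cache line: tracks which CPUs hold a copy and counts coherence messages."""

    def __init__(self):
        self.sharers = set()
        self.messages = 0

    def read(self, cpu):
        # A CPU without a valid copy has to fetch the line (one message).
        if cpu not in self.sharers:
            self.messages += 1
            self.sharers.add(cpu)

    def write(self, cpu):
        # The writer needs exclusive ownership: fetch the line if it has no copy,
        # then invalidate every other holder (one message each).
        if cpu not in self.sharers:
            self.messages += 1
        self.messages += len(self.sharers - {cpu})
        self.sharers = {cpu}


def go_flag_messages(n_cpus):
    """All CPUs poll the flag, CPU 0 sets it, then the others re-read it."""
    line = SharedLine()
    for cpu in range(n_cpus):
        line.read(cpu)
    line.write(0)
    for cpu in range(1, n_cpus):
        line.read(cpu)
    return line.messages


print(go_flag_messages(4))
```

The count is the same whether the messages travel over a shared bus or over point-to-point links. What differs between the two designs is which resource carries them and what else is competing for it.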
Re:Memory (Score:2)
If (theoretically) you got all of the CPUs working on an independent dataset then the CPUs would scale much better.
Re:Memory (Score:3, Insightful)
I agree. But at least AMD did something right when designing the AMD64 architecture. Virtual 86 mode and segmentation were eliminated from the 64-bit mode, but they still exist in 32-bit mode of course. Completely eliminating 8086 compatibility was not really so much of an option. Backward compatibility is part of the reason for the AMD64 success. But it would have be
These in-architecture tests are OK, but... (Score:5, Interesting)
Re:These in-architecture tests are OK, but... (Score:2, Interesting)
wtf (Score:2)
Re:These in-architecture tests are OK, but... (Score:3, Informative)
set-up benchmarks? (Score:1, Interesting)
But I've seen some computers which, having only switched from AMD to comparable (in clock frequency) Intel processor, got some boost in speed. Especially in games. And I've seen some changing from Intel to AMD, suffering loss of speed - mainly in games.
I don't know if recent games (I've seen these effects mainly with Neverwinter Nights) are compil
Re:set-up benchmarks? (Score:2, Interesting)
It really depends on what the rest of the hardware in the box is. AMD's CPUs (especially K6-II/III and Duron) tend to be seen as the low-cost alternative and are put in a box with a cheapo mobo, cheap mem and everything that goes with it more often than Intel's CPUs are. This is just my observation in dealing with a lot of SMEs, some who go all out and some who try to save wherever possible.
Shining example. We run an Astaro [astaro.com] firewall for one of our clients. At first they didn't have machine available, when w
Re:set-up benchmarks? (Score:2)
That doesn't make any sense. Current AMD chips ALWAYS perform better at the same clock speed. If you really mean that AMD chips with a performance rating (or whatever they're calling it this week) similar to the Intel tend to perform a bit worse then that might make some sense. In addition recent Intel and AMD chips won't work in the same motherboards so clearly
Opteron cpu hacked (Score:5, Interesting)
so what the hell.
Opteron Exposed: Reverse Engineering AMD K8 Microcode Updates
Summary
This document details the procedure for performing microcode updates on the AMD K8 processors. It also gives background information on the K8 microcode design and provides information on altering the microcode and loading the altered update for those who are interested in microcode hacking.
Source code is included for a simple Linux microcode update driver for those who want to update their K8's microcode without waiting for the motherboard vendor to add it to the BIOS. The latest microcode update blocks are included in the driver.
Background
Modern x86 microprocessors from Intel and AMD contain a feature known as "microcode update", or as the vendors prefer to call it, "BIOS update". Essentially the processor can reconfigure parts of its own hardware to fix bugs ("errata") in the silicon that would normally require a recall.
This is done by loading a block of "patch data" created by the CPU vendor into the processor using special control registers. Microcode updates essentially override hardware features with sequences of the internal RISC-like micro-ops (uops) actually executed by the processor. They can also replace the implementations of microcoded instructions already handled by hard-wired sequences in an on-die microcode ROM.
AMD's U.S. Patent 6438664 ("Microcode patch device and method for patching microcode using match registers and patch routines") goes into substantial detail on this.
Typically microcode update blocks are stored in the BIOS flash ROM and loaded into the processor as the system boots. They can also be loaded by the operating system; for instance, Linux contains a microcode device driver for Intel chips.
AMD recently released a "BIOS fix" to motherboard makers to address Errata 109, in which REP MOVS instructions caused subsequent instructions to be skipped under specific pipeline conditions.
Previously it was not clear if and how AMD even supported microcode updates in the K8 family until this announcement. After analyzing a number of BIOS images, it appears that AMD has secretly used the microcode update facility on several occasions over the past few years, but obviously avoided publicly disclosing that it actually had bugs patchable in this manner.
Early K7 (Athlon) cores initially supported microcode updates as well, until ironically the microcode update mechanism itself was found to be broken and subsequently listed as an erratum!
The following sections describe the microcode update procedure, obtained by clean room reverse engineering various vendors' BIOS code. The actual microcode update blocks are embedded in the BIOS image; the most recent updates (created June 2004) have been included in the Linux driver source code attached to this description.
Microcode Update Procedure
The update procedure expects the 64-bit virtual address of the update data, including the 64 byte header, to be in edx:eax:
edx = high 32 bits of 64-bit virtual address
eax = low 32 bits of 64-bit virtual address
ecx = 0xc0010020 (MSR to trigger update)
Execute wrmsr with these register values. If the address and update block data are valid, wrmsr completes successfully. Otherwise, a GP fault is taken.
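As an illustration of the register setup just described, the sketch below splits a 64-bit address into the edx:eax halves and pairs them with the MSR number given above. It deliberately stops short of executing wrmsr, which needs ring 0 and is out of reach from user-space Python. The function names, the constant name and the example address are hypothetical.

```python
UPDATE_TRIGGER_MSR = 0xC0010020  # MSR number from the text; the constant name is made up


def split_address(addr):
    """Split a 64-bit virtual address into (edx, eax) = (high 32 bits, low 32 bits)."""
    if not 0 <= addr < 2**64:
        raise ValueError("address must fit in 64 bits")
    return (addr >> 32) & 0xFFFFFFFF, addr & 0xFFFFFFFF


def wrmsr_operands(update_block_addr):
    """Register values the procedure above expects before wrmsr is executed."""
    edx, eax = split_address(update_block_addr)
    return {"ecx": UPDATE_TRIGGER_MSR, "edx": edx, "eax": eax}


# Example with an arbitrary, purely illustrative kernel-style address.
regs = wrmsr_operands(0xFFFF880012345000)
print({name: hex(value) for name, value in regs.items()})
```

In a real driver the address would be that of a buffer holding the update block, 64-byte header included, and the wrmsr itself would be issued from kernel code.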
The microcode does not appear to update MSR 0x8B with the new update signature as it does on Intel processors, despite the fact that some BIOS code I have analyzed does seem to check this field. It is possible the MSR is only updated under certain conditions, for instance when microcode is loaded before initializing the cache controller. Nonetheless, as we shall see below, the processor is clearly doing something internally when it claims to accept an update in this manner.
The update generally takes around 5500 clock
Re:Opteron cpu hacked (Score:5, Insightful)
Also, the article admits that it's "very unlikely" that a particular processor could be fried using a dodgy microcode update, so why even mention it? It would be much easier to write a BIOS-flashing virus; I believe a few of these did exist at one point (although the old memory is failing). I doubt the hoops you'd need to jump through to write such a thing for Intel processors are any higher than for AMD processors, and as such, this is just FUD.
Re:Opteron cpu hacked (Score:3, Insightful)
Re:Opteron cpu hacked (Score:5, Insightful)
Linux has for a long time mostly ignored the system BIOS, since BIOSes are notoriously broken for legacy reasons. Supplying known-good microcode is simply another step in eliminating variables that make system testing needlessly complex. I predict we'll see more developments along these lines in general.
Re:Opteron cpu hacked (Score:2)
Original source? (Score:2)
Does the above article have an original source? I'm guessing it didn't just spontaneously appear on half a dozen weblogs [google.com], it was probably written by someone who would like credit for his/her work? Perhaps this is why the story was rejected?
-jim
Re:Original source? (Score:3, Informative)
http://www.realworldtech.com/forums/i
hasn't been rejected yet, still pending, but no doubt will be.
Re:Opteron cpu hacked (Score:2)
Was an interesting read!
--Robert
Re:Opteron cpu hacked (Score:2)
http://www.realworldtech.com/forums/index.cfm?a
I only saw it because Bruce Schneier pointed it out.
This doesn't look good for Intel (Score:5, Interesting)
Re:This doesn't look good for Intel (Score:2)
Re:This doesn't look good for Intel (Score:2)
Re:This doesn't look good for Intel (Score:5, Informative)
While in the past VIA and SIS have been really bad chipsets, modern VIA chipsets (KT266A and up) are rock stable and perform well. I have had both KT333 and KT600 boards which have never failed. SIS, while I have no first hand experience, I am told is similar.
Re:This doesn't look good for Intel (Score:4, Informative)
Re:This doesn't look good for Intel (Score:2)
Re:This doesn't look good for Intel (Score:2)
Other than that, which is not a weakness of the chipset, everything has hummed along smoothly. This was the first high end system I'd purchased in over 8 years, and it was well worth it. The only thing weak about this platform IMO, is the cost, which is still less than a dual Xeon
Difficult to trust? (Score:5, Insightful)
Re:Difficult to trust? (Score:5, Insightful)
Well, the conclusion that the Opteron kicks the Xeon's ass is pretty inescapable to me; finding out that the Opteron is available while the Xeon isn't quite yet, and is more expensive, really closes the deal for me. But the review isn't very scientific, and didn't go very deep.
Re:Difficult to trust? (Score:3, Insightful)
Re:Difficult to trust? (Score:2)
They do, but some performance-oriented flags can either cause instability or mistaken results. I've seen single flags cause a program to core dump or not, and even any optimization at all can cause some programs to crash (probably a very obscure bug regarding program and compiler assumptions...I never found it).
Re:Difficult to trust? (Score:3, Insightful)
Yes, yes we are. :)
You definitely have to be careful about which ones you pick, though, and if you're really worried about performance you have to do what they did in this review, and try different settings with different programs because different flags will produce different results on different programs.
In general I use -O3 on older processors, like Pentium II cores, and -O2 on newer ones. I don't know if it's still true but -O3 was known to cause problems (errors, not just a performance hit) with G
Re:Difficult to trust? (Score:5, Insightful)
You have to evaluate performance (possibly vs price) for your particular application. If you need a faster processor for Doom 3, look at Doom 3 benchmarks. If you need to encode video, look at video benchmarks. If you need to do integer computations, look at integer benchmarks. Xeons probably kick AMD's ass at some applications, and AMD might beat the Xeon at others. You can't just say that one is "better" than the other in general.
Re:Difficult to trust? (Score:2)
Re:Difficult to trust? (Score:2)
Re:Difficult to trust? (Score:2)
Re:Difficult to trust? (Score:2)
Well, considering that the Intel chip is more than 6 times the price of the AMD chip, if $858 matters to you at all, then AMD appears to win hands down.
Considering that the AMD chip is considerably faster on the vast majority of tasks, if you run multiple kinds of software, or even one piece of software with code that does multiple things, then on average the AMD appears to win hands down.
But yeah, you're right. *IF* yo
Re:Difficult to trust? (Score:2)
This is a SERVER processor. If you are running a server, cost matters very little. Hell, you probably pay more than $800 to your janitor. Reliability, performance, and compatibility are what matters. AMD may be a better value, but Intel has a lot more experience with servers. Do you want to buy a $250K cluster to find out it doesn't work reliably with
Re:Difficult to trust? (Score:2)
Not at all; I'm not saying I'm more advanced than an average user. However, I do in fact hate oversimplified and highly distilled articles, boiled down to taglines (or soundbites). These taglines and soundbites often find their way back into Slashdot blurbs, I've noticed, and that irritates me.
I have no probl
Re:Difficult to trust? (Score:2)
Lame conclusion? (Score:4, Informative)
After all is said and done it became difficult (nearly impossible?) to justify the Xeon processor in a UP configuration over the Opteron 150,
Huh? Here are some numbers:
Re:Xeon lose at SPEC too. (Score:2)
Re:Xeon lose at SPEC too. (Score:3, Informative)
Re:Lame conclusion? (Score:2)
Hold on a sec, you do have a valid point within the context of typical windows machines, but when you're talking about high end workstations and servers, which both chips are targeted for, you have a whole different ball game.
I have never relied on
Re:Lame conclusion? (Score:2)
Do you also usually recompile libc? - though I could foresee a distro providing two versions of libc if it was going to make a significant difference...
Obviously if you're using software that can be recompiled, and doing so will make a significant difference then it should be considered, but personally I think having a well tested, stable, and easy to recover in an emergency setup is far more important than tweaking the best performance. So for me, when it came to conside
Re:Lame conclusion? (Score:2)
That sounds about right. Yeah, I can't recompile Oracle and there are some things I don't, but I would recompile MySQL or Apache (to name some more well-known apps), and it is a trivial task
Re:Lame conclusion? (Score:2)
Opterons and Xeons and Prescotts, oh my... (Score:2)
The criticisms were that the Xeon is not a desktop CPU, or vice-versa, the Athlon is not a workstation/server CPU. But are they so different? The Xeon has 1MB L2 cache, and so does the P4 Prescott (and presumably Prescott with x86-64 enabled), and both run at the same speed.
Similarly, the 3500+ runs at 2.2GHz and has a 51
Re:Opteron vs. A64 (Score:4, Interesting)
This still leaves me wondering why an Opteron 250 (2.4GHz, 1MB L2 cache) seems to so seriously outperform an Athlon 64 3500+ (2.2GHz, 512KB L2 cache).
Re:Opteron vs. A64 (Score:5, Informative)
When people say that the first article was bad, it's because it was really bad: 64-bit binaries for Intel vs. 32-bit binaries for AMD, copy-and-pasted benchmark results from previous 32-bit benchmarks, tests (pi digit computation) that measured the libc optimization instead of the actual benchmark (when removing the printf() it got about a 10x boost). People on aceshardware forums were posting TSCP scores about 2x what Anandtech got, on the same processor. So the A64 3500+ scores you saw in that article are trash. Forget them.
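The printf() problem is easy to reproduce in miniature: time the computation on its own, then time it again with per-digit output, and see how much of the total is really I/O. The sketch below is not the benchmark Anandtech ran. It uses Machin's formula with integer arithmetic as a stand-in workload, and every name in it is made up for illustration.

```python
import io
import time


def arctan_inv(x, unity):
    """arctan(1/x) scaled by `unity`, via the alternating Taylor series in integers."""
    total = term = unity // x
    x_squared = x * x
    n, sign = 3, -1
    while term:
        term //= x_squared
        total += sign * (term // n)
        sign = -sign
        n += 2
    return total


def pi_digits(n):
    """'3' followed by the first n decimals of pi (Machin's formula, 10 guard digits)."""
    unity = 10 ** (n + 10)
    pi_scaled = 4 * (4 * arctan_inv(5, unity) - arctan_inv(239, unity))
    return str(pi_scaled // 10 ** 10)


def time_compute_only(n):
    """Seconds spent on the computation alone."""
    start = time.perf_counter()
    pi_digits(n)
    return time.perf_counter() - start


def time_with_output(n, stream):
    """Seconds spent computing and writing one digit at a time, as a naive benchmark might."""
    start = time.perf_counter()
    for digit in pi_digits(n):
        stream.write(digit)
        stream.flush()
    return time.perf_counter() - start


compute = time_compute_only(2000)
total = time_with_output(2000, io.StringIO())
print(f"compute only: {compute:.4f}s, compute + output: {total:.4f}s")
```

With an in-memory stream the difference will be small. Pointing `stream` at a real terminal or a file is where the output path starts to dominate, which is the effect described above.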
Re:Opteron vs. A64 (Score:2)
Re:Opteron vs. A64 (Score:2)
Same benches with the same cache, the Xeon gets killed.
Jonathan
As the proud owner of an Opteron 150... (Score:2)
Hyperthreading is not good for these benchmarks (Score:5, Insightful)
Re:Hyperthreading is not good for these benchmarks (Score:3, Informative)
Re:Hyperthreading is not good for these benchmarks (Score:2)
Re:Hyperthreading is not good for these benchmarks (Score:3, Insightful)
Since it isn't possible to dynamically turn hyperthreading on and off while the system is running, the benchmarks should be run in the mode most systems will use - with (highly touted) HYPErthreading turned on. After all, it is supposed to be a useful feature...
Personally,
Re:Hyperthreading is not good for these benchmarks (Score:2)
However, we may be getting a few multiprocessor Xeons for research use; since there are multiple processors, I'm debating whether HT should be turned OFF, as HT slows things d
Re:Hyperthreading is not good for these benchmarks (Score:2)
Well, you should really look at the number of compute-bound threads you have active on average. If that number is around 4 or greater, and you have a dual-processor box, then you should probably
Re:Hyperthreading is not good for these benchmarks (Score:2)
Re:Hyperthreading is not good for these benchmarks (Score:2)
Re:Hyperthreading is not good for these benchmarks (Score:2)
See the following:
page4 [2cpu.com]
page5 [2cpu.com]
AnandTech Biased (Score:2, Interesting)
Re:AnandTech Biased (Score:2)
We still have Ars.
Re:AnandTech Biased (Score:3, Insightful)
Don't get me wrong--I've liked Anand and company since they first hit the internet. They don't generally have an axe to grind or an ego to boost (both of which TomsHardware suffers from terribly), but they don't have the slightest bloody clue about statistics, or significance.
Fun to read, and not consistently biased, but not a great source of actual benchmarking or review information. (techreport.com is better for that)
Re:AnandTech Biased (Score:2)
The last review by anandtech was "EM64T Xeon vs. Athlon 64 under Linux".
I'll spare you the condescending jab at your last remark.
Those results speak for themselves (Score:3, Informative)
However, this speed increase seems to depend on being able to compile your software from scratch, which is generally unheard of in the Windows world. That should change in the future, but for now it's still a tough call whether or not to buy one. But if you're running Gentoo, let the funroll-loops begin!
Re:Those results speak for themselves (Score:2)
Other Ideas for benchmarks (Score:5, Interesting)
Both Sun (the original innovators) and now Microsoft are putting their money on their bytecode (rather than binary) executables to try to avoid the whole backwards compatibility problem when moving architectures. To get to grips with how important this is - Microsoft has only just recently managed to escape from the 16-bit code hell that it lived in for years (need proof - check out the Win16Lock you needed to get access to the video memory in DirectX).
That said, I can't imagine there are many performance benchmarks (someone might enlighten us here) in which a 64-bit bytecode interpreter would do better than its 32-bit little brother.
What would be interesting here would be to see how Java's bytecode and CIL scale to 64-bit. My first guess would be that Java should scale better (with Sun's heritage of 64-bit platforms), but I wouldn't be surprised if MSFT weren't too far behind, as they were always keeping their eye on this when designing the CIL. This would also be a good chance for the Mono project to try an "ours is better than yours" benchmark for their interpreters.
Re:Other Ideas for benchmarks (Score:2)
?!?! nocona ??! (Score:2, Funny)
If you want controversy... (Score:2)
... and to make it better ... (Score:2)
These benchmarks show the Opteron 150, a $600, real-world-available chip handily beating a $850, non-real-world-available chip. But that's still not the half of it.
Wait until tests are run on multi-CPU machines. Because the Opterons scale so much better than Xeons, the performance advantage of the Opteron will be even greater.
When I've bought a quad Xeon machine, I've never been at all impressed with the scalability. When I bought a quad Opteron, I was blown away.
steve
My own results (Score:2, Interesting)
The day of the Opteron, however, has come at last:
All these were run with stock tools in 32-bit mode, no fancy compiler optimizations. These are the same programs that we run on 2GHz P4s.
Agilent 3
Fair comparison? (Score:2, Insightful)
What about the AMD Opteron 850?
Re:Fair comparison? (Score:3, Informative)
Re:very little grey area (Score:5, Informative)
There are older dual and quad Opteron vs Xeon reviews around [aceshardware.com].
Humorously, they also say this:
Now we know that the Nocona is here, and it's getting slaughtered at the Altar of The Opteron.
Re:I have a great idea (Score:2)
Or it could be something else.
Re:Why this is still a joke (Score:2)
I prefer raw numbers because that's one less thing for the reviewers to get wrong - they may screw up the calculation or be tempted to draw stupid graphs that show nothing.
And I'd prefer to see the time taken to convert a 700MB wave file compared in seconds rather than percentages. Gives me a better feel of how long a system will take to do other sized files, and whether I should fork out the X bucks for the next faster system. Percentages are