Xeons, Opterons Compared in Power Efficiency 98
Bender writes "The Tech Report has put Intel's 'Woodcrest' and quad-core 'Clovertown' Xeons up against AMD's Socket F Opterons in a range of applications, including widely multithreaded tests from academic fields like computational fluid dynamics and proteomics. They've also attempted to quantify power efficiency in terms of energy use over time and energy use per task, with some surprising results." From the article: "On the power efficiency front, we found both Xeons and Opterons to be very good in specific ways. The Opteron 2218 is excellent overall in power efficiency, and I can see why AMD issued its challenge. Yes, we were testing the top speed grade of the Xeon 5100 and 5300 series against the Opteron 2218, but the Opteron ended up drawing much less power at idle than the Xeons ... We've learned that multithreaded execution is another recipe for power-efficient performance, and on that front, the Xeons excel. The eight-core Xeon 5355 system managed to render our multithreaded POV-Ray test scene using the least total energy, even though its peak power consumption was rather high, because it finished the job in about half the time that the four-way systems did. Similarly, the Xeon 5160 used the least energy in completing our multithreaded MyriMatch search, in part because it completed the task so quickly. "
Re:Way to put the conclusion in the article summar (Score:2)
Re:Way to put the conclusion in the article summar (Score:2, Funny)
You must be new here...
AMD needs to get back in the game, quick (Score:5, Insightful)
Business needs to pay attention (Score:5, Insightful)
I know of and have worked with too many organizations that figure it's just a matter of slapping all the computers in an air-conditioned room. Every watt of waste heat adds to the A/C bill.
Old-fashioned water-cooled mainframes and big iron (for its time) often recirculated the waste heat into the heating systems of the surrounding buildings. We've known all along how to be more energy efficient, if companies and management would only place the emphasis on the environment in their budgets.
Re: (Score:2)
I'm surprised there aren't more data centers in places with really cold climates. Must be nice to use waste heat to heat the building, or just put a radiator with a fan blowing through it outside instead of having to use air conditioning.
-Z
Re: (Score:1)
Why would anyone care? (Score:2)
The only ones affected are the tape monkeys, and their jobs were replaced by robotics years ago.
Twenty years ago satellite ground stations were dropped off up north with nothing more than a big tank of diesel, a power generator, and a fault-resilient or fault-tolerant server, left alone for months at a time.
With modern high speed networks and VPN access, it's often hard to tell the difference between being at work and remote access, other than the environment. Don't forget how much sysadmin work has b
Re: (Score:3, Insightful)
Re: (Score:2)
Re: (Score:2)
Re: (Score:1, Interesting)
AMD's path (Score:4, Insightful)
Hmm, so which better reflects real-world usage? (Score:5, Interesting)
the Xeon 5160 used the least energy in completing our multithreaded MyriMatch search, in part because it completed the task so quickly.
So what does this mean for people shopping for servers?
If your servers constantly tick along at nearly 100% CPU use, you might do better going with the Xeon system. If your machines basically sit idle most of the time with an occasional spike for a few seconds when it actually does something, the AMD would save you more on electricity.
Of course, this raises a third possibility - Would running a number of virtual servers on one large Xeon machine waste more energy than it saves, or give a net gain?
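The trade-off above can be sketched numerically. A back-of-envelope Python model with made-up workload numbers (only the idle/load split is the point; the profiles are illustrative, loosely inspired by the article's idle figures, not measurements):

```python
# Hypothetical power profiles (illustrative, not measured): a chip
# with low idle draw wins on a mostly-idle box, while a chip that
# finishes work faster can win when the box runs flat out.
def daily_energy_wh(idle_w, load_w, busy_hours):
    """Watt-hours consumed over one 24-hour day."""
    return load_w * busy_hours + idle_w * (24.0 - busy_hours)

# A job that takes 8 busy hours on the low-idle chip, 4 on the
# faster chip (assumed 2x speedup, higher draw at both idle and load).
low_idle = daily_energy_wh(idle_w=120.0, load_w=260.0, busy_hours=8.0)
fast_finish = daily_energy_wh(idle_w=230.0, load_w=380.0, busy_hours=4.0)
print(f"mostly-idle day: low-idle chip {low_idle:.0f} Wh, "
      f"fast-finish chip {fast_finish:.0f} Wh")
```

With these numbers the low-idle profile wins because the box spends most of the day idling; push the duty cycle toward 24 busy hours and the faster chip's shorter run time starts to dominate.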
Re: (Score:3, Insightful)
Best Practices (Score:5, Insightful)
Re: (Score:2)
A server that's providing services to regular users, sure. But if your server is doing computational work, like many of the scientific computing examples given in the article, it should be spending every minute of every day at 100% utilization.
Re: (Score:2)
Re: (Score:2, Insightful)
Re: (Score:2)
The anti-spam filters at my place of employment (two machines, each with a single 2.6GHz Xeon). That's why we are replacing them with two machines, each using two dual-core Xeons, for 4x the CPU power.
Re: (Score:3, Interesting)
For capacity planning purposes, most of my clients target 40-50% CPU utilization on servers. If it starts creeping above 60% on a consistent basis (or is forecasted to do so soon), they begin the acquisition process to either upgrade or add servers.
Queuing theory (M/M/1) shows that while the average response time doesn't increase that much, the standard
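The M/M/1 point can be made concrete. In an M/M/1 queue the response time is exponentially distributed with rate (mu - lambda), so its standard deviation equals its mean, and both blow up together as utilization approaches 100%. A minimal sketch (Python; the 100 requests/sec service rate is a made-up example):

```python
# M/M/1 queue: response time is exponential with rate (mu - lam),
# so mean and standard deviation are both 1 / (mu - lam).
def mm1_response(mu, lam):
    """Return (mean, std dev) of response time, in seconds.
    mu = service rate, lam = arrival rate (jobs/sec); requires lam < mu."""
    if lam >= mu:
        raise ValueError("queue is unstable when lam >= mu")
    w = 1.0 / (mu - lam)
    return w, w  # exponential distribution: std == mean

service_rate = 100.0  # hypothetical capacity: 100 requests/sec
for util in (0.4, 0.6, 0.8, 0.95):
    mean, std = mm1_response(service_rate, util * service_rate)
    print(f"{util:.0%} utilization: mean {mean*1000:.1f} ms, std {std*1000:.1f} ms")
```

Note how the jump from 80% to 95% utilization multiplies both the mean and the variability, which is why the 40-50% target above leaves headroom.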
Re: (Score:2)
From the article, the idle power consumption of the 8-core Xeon is ~230W; the 4-core Opteron is ~120W.
Which means, at idle, the single 8-way Xeon is better than two 4-way Opterons. Given that the efficiency of the 8-way under load is better than the 4-way's, I would think that stacking onto the 8-way is better.
Of course, having two independent 4-way systems is better redundancy. On the other hand, the 8-way can be utilized to solve SMP multithread problems (without the expense of h
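The consolidation arithmetic, using the idle figures quoted above from the article (~230 W for the 8-core Xeon box, ~120 W per 4-core Opteron box; everything else is a sketch):

```python
# Idle draw quoted from the article; consolidating two mostly-idle
# 4-way boxes onto one 8-way box saves a small amount at idle.
XEON8_IDLE_W = 230.0
OPTERON4_IDLE_W = 120.0

two_opterons = 2 * OPTERON4_IDLE_W          # 240 W idling
one_xeon = XEON8_IDLE_W                     # 230 W idling
saving_w = two_opterons - one_xeon          # 10 W from consolidating
print(f"idle saving: {saving_w:.0f} W, ~{saving_w * 8760 / 1000:.1f} kWh/year")
```

A 10 W idle saving is modest on its own; the bigger wins from consolidation are the shared cooling load and the 8-way's better efficiency under load.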
Re: (Score:3, Interesting)
It's really back
Power = Heat (Score:3, Insightful)
More importantly, I think, is that power consumption translates to heat output. If you have mostly idle servers with occasional spikes, you can either cool them for less or put more in the same space depending on what you need. And don't forget that you actually save money twice with the AMD since you have to pay to power and cool the
Re:Hmm, so which better reflects real-world usage? (Score:5, Insightful)
Or go Xen, OpenVz or whatever does the trick.
But, most important, get rid of the idling boxes.
Re: (Score:1)
Re: (Score:2)
the dual xeon consumes ~280 watts constant and each of the th
Re: (Score:3, Funny)
Couldn't agree more. Oh wait, something's sending an Int. Req., can't type, have to see what it wants.....
Re: (Score:2)
80x86 may be ugly, but it's cheap for the processing power and has an entrenched economy of scale. It sucks. Even Apple switched from PowerPC and is now making glorified Wintel clone boxes (though with a pretty nifty feature set).
-b.
Re: (Score:3, Interesting)
That was my understanding, after reading articles like this one on Ars Technica [arstechnica.com]. If true, it would make fighting over CISC vs. RISC not make a lot of sense.
Re: (Score:2)
The fact that the CPU now runs at 324236GHz and can chew the math nice and fast doesn't alter the fact that the -rest- of the system (A20 gate stuck on the KB controller and such.. ahem..) deserves to go the way of Wang...
I've always been a fan of systems like MIPS and Ultrasparc: Engineered r
Oh really? (Score:2)
Meanwhile my Sun has OH LOOK, a crossbar, and MY GOD! this newfangled PCI bus. WHAT HATH SCIENCE DONE?
Re: (Score:2)
Re: (Score:2)
It does:
push BP
mov BP,SP
sub SP, 10
and
mov SP,BP
pop BP
internally very quickly as RISC instructions. It's still 5 bus cycles.
Re: (Score:2, Funny)
Re:God, I'm sick of this architecture (Score:5, Informative)
RISC worked well when the speeds of memory and CPUs were at parity. The simplified instructions let the CPU be clocked a lot faster, not to mention their shallow pipelines made it less costly when branch prediction failed. The tradeoff was that it usually took more instructions to accomplish a given task.
But as CPUs have spent more and more time waiting for memory, CISC has really come into its own. Think of CISC as a compression algorithm: an x86 instruction which fits in 16-32 bits might take 4 or 5 instructions on a RISC processor, weighing in at 128-160 bits. It's no surprise why CISC processors have destroyed RISC in the past decade.
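The "compression" claim above can be put in rough numbers. A toy code-density model (Python; the byte counts are illustrative averages, not measurements of any real ISA):

```python
# Toy model: one memory-operand op encoded as a single short
# variable-length (CISC-style) instruction vs. a load/op/store
# triple of fixed 32-bit (RISC-style) instructions.
def cisc_bytes(n_ops, avg_instr_bytes=3):
    """n_ops memory-operand instructions, ~2-4 bytes each (assumed)."""
    return n_ops * avg_instr_bytes

def risc_bytes(n_ops, instrs_per_op=3, instr_bytes=4):
    """Each op expands to ~3 fixed-width 4-byte instructions (assumed)."""
    return n_ops * instrs_per_op * instr_bytes

ops = 1000
print(f"i-cache footprint: CISC ~{cisc_bytes(ops)} B, RISC ~{risc_bytes(ops)} B")
```

Under these assumptions the denser encoding fetches a fraction of the instruction bytes for the same work, which is exactly the advantage when the pipeline is memory-bound.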
Re: (Score:3, Insightful)
Re: (Score:2, Insightful)
This coding is more complicated than fixed-width instructions, but
Re:God, I'm sick of this architecture (Score:4, Interesting)
You're forgetting the basic formula from Hennessy and Patterson: execution time = instruction count x cycles per instruction x cycle time.
Yes, CISC has better work per instruction, except for one glaring issue I'll get to in a moment, but - for various reasons explained throughout H&P - it loses on the other two and thus overall. That's why nobody's making new processors that are CISC internally any more; they just couldn't hit the issue widths and clock speeds that are achievable with a RISC core (even if that core has a CISC ISA bolted on the front). What's missing here is that not all work is useful work. As anyone who has accidentally coded an infinite loop knows, executing lots of instructions is not necessarily a good thing. The glaring issue I mentioned earlier is that a lot of the instructions executed on a register-poor architecture like x86 are not doing useful work. Register thrashing means i-cache bandwidth is wasted fetching instructions which are then used to waste d-cache bandwidth, which more than outweighs any advantage from variable-length instructions.
So, you say, wouldn't variable-length instructions on a register-rich processor be the best of both worlds? Not so fast. A regular instruction set makes superscalar execution easier because it means that multiple instructions can be fetched literally at the same time without having to examine the first one to figure out where the second one begins and so on. It also makes deeper pipelines easier because it allows many internal activities (e.g. register allocation, hazard detection) to start after a simple pre-decode stage, in parallel with the remainder of decode. Either way, regular instruction sets allow for more parallelism - and parallelism in some form is generally the key to CPU performance. If you're willing to give up performance by eschewing most modern processor-design techniques, which might be the case for a deeply embedded system with extreme size and/or power requirements, then variable-width instructions might still be a reasonable choice. In that case you might as well use an older architecture; there are plenty to choose from. For new processor designs, though, variable-width instructions are almost invariably a way to lose.
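The H&P "iron law" argument above is easy to sketch with numbers. A minimal Python example with made-up parameters (a CISC-ish design needing fewer instructions but paying in CPI and clock, vs. a RISC-ish design with more instructions but better CPI and clock):

```python
# Iron law of processor performance:
#   CPU time = instruction_count * cycles_per_instruction / clock_rate
# All figures below are illustrative, not taken from real chips.
def cpu_time(instructions, cpi, clock_hz):
    return instructions * cpi / clock_hz

work = 1_000_000  # abstract units of work
cisc = cpu_time(instructions=work, cpi=2.0, clock_hz=2.0e9)
risc = cpu_time(instructions=int(work * 1.4), cpi=1.0, clock_hz=3.0e9)
print(f"CISC-ish: {cisc*1e3:.3f} ms, RISC-ish: {risc*1e3:.3f} ms")
```

With these assumed parameters the 40% instruction-count penalty is more than paid back by the lower CPI and higher clock, which is the comment's point: winning one factor of the product is not enough.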
Do you code in assembly? (Score:1)
Then why do you care?
And we "fixed" this with x86_64. The extended instruction set allows for more orthogonal expression of what you want to do with your ops w/r/t regs and memory (although not all of them are equivalent length, the more common ones are shorter, so what does it matter?)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Do you have evidence to back that up? From the limited amount that I've seen, the opposite seems to be true - one or two instructions on an ARM or MIPS processor can neatly do what takes several instructions of fumbling on an i386. Partly this is because of more registers accessible at the instruction level, and partly because of a more orthogonal instruction set.
You could compare the size of object code spat ou
Re: (Score:1)
Re: (Score:3, Insightful)
Sorry, but CISC, specifically x86 and its children, has won simply by being the architecture for which most software was written. The dominance of CISC is a similar story (but not the same; trying to stave off an off-topic rant) to the dominance of Windows -- backward compatibility is King.
The RISC makers knew this too. Back when RISC was the hot new thing in the early 90s, they were touting that RISC would be so much faster than CISC
Re: (Score:2)
Please don't get the idea that I'm defending the Intel x86 instruction set. When I first saw it in the early 1980s, I thought it was the most gawdawful mess I'd seen in 25 years in the business (I wrote my first assembler code in 1960). It hasn't improved any w
Re: (Score:2)
Amen, brother! While I haven't been coding for quite as long as you (for me, it was 1976 when I started), I've used a hefty number of instruction sets and designed a handful myself. The 6809 was always my favorite. I still have a well-worn copy of the 6800 instruction set manual in my library; so clear, so beautiful. This was back when instruction set design was based purely on merit (what is the
80x86 has the benefit of code size (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:1, Insightful)
Well too bad get used to it (Score:3, Interesting)
Now personally to me you sound like someone who's spent a little too much time in a computer science architecture class soaking up theories about ISAs and too little time actually looking at how chips are made these days and what works. When you get right down to it, x86 work
Re: (Score:1)
Second, the
Re:God, I'm sick of this architecture (WE CPUs) (Score:1)
This just in! (Score:5, Insightful)
Re: (Score:2)
Conclusions converted to $$$ (Score:2, Interesting)
Presumably, the article tests power consumption because businesses are concerned with how much running each o
Re: (Score:2)
"Presumably, the article tests power consumption because businesses are concerned with how much running each of these systems will cost them. If the Xeons managed to win in power consumption because they completed the task in half the time, that has other cost-saving benefits even beyond power consumption. "
The benchmarks chosen have very little to do with the real business world.
They mostly demonstrate the effect of Intel's larger CPU caches on performance.
Choose a series of applications(p
Re: (Score:2)
oracle datacenter (Score:4, Informative)
Re: (Score:2)
it doesn't make any sense to swap out a working and functional server running Intel chips with one running AMD purely for power saving, because electricity is a relatively small part of the lifetime cost of a server, until
it's a similar problem for car users - for
Re: (Score:2)
it doesn't make any sense to swap out a working and functional server running Intel chips with one running AMD purely for power saving, because electricity is a relatively small part of the lifetime cost of a server, until
it's a similar problem for car users - for an average vehicle doing 25mpg, about half the lifetime energy of making, using, and recycling/scrapping it is consumed during manufacture. Environmentally, it's better to fix up an old car so it runs properly with minimal emissions than to generate a lot of scrap metal & plastics and incur the environmental costs of mining/refining metals, drilling for oil for plastics, manufacture etc. of a new car.
Considering that Xeons have been around for years now, for all the parent stated these could be old 1Ghz or slower Xeon based servers. Rather than upgrading to the latest, they decided to switch platforms, which would meet your criteria.
However, I disagree with your statement that the cost to power a server is a small fraction of its cost. A basic server, costing about $4k (nothing fancy), running 24x7x365.25 at about 300 watts, will use 18408.6 KWH in one year. At $0.07/KWH, that's $1288.60 per year just
Re: (Score:2, Informative)
It took me forever to figure out what was wrong with this. I knew your numbers didn't add up but I couldn't put my finger on it until I realized you multiplied out exactly what people say when they mean constant uptime. The problem, of course, is that it should be 300(watts)*24(hours/day)x365(days/year) or 24(hours/day)x7(da
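The correction can be checked in a few lines. "24x7" already describes every hour of the year, so multiplying by 7 again overcounts by a factor of seven (Python; the 300 W and $0.07/kWh figures are from the parent post):

```python
# Annual energy cost: watts * hours/year, where hours/year is
# 24 * 365 = 8760. Multiplying "24 x 7 x 365" counts each day 7 times.
def annual_cost(watts, rate_per_kwh=0.07, hours=24 * 365):
    kwh = watts * hours / 1000.0
    return kwh, kwh * rate_per_kwh

kwh, cost = annual_cost(300)
print(f"correct: {kwh:.0f} kWh/year -> ${cost:.2f}")
wrong_kwh = 300 * 24 * 7 * 365.25 / 1000.0   # the parent's 7x overcount
print(f"mistaken figure: {wrong_kwh:.1f} kWh")
```

The correct figure is roughly $184/year for a 300 W box at $0.07/kWh, still a real cost over a server's life, but a factor of seven below the quoted $1288.60.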
Re: (Score:2)
Meaningful, normalized values of watt/performance? (Score:1)
It's very useful to have some normalized way of measuring watts/performance, as they try to do in this article. But at least they could have used a more general and useful benchmark, like those offered by www.spec.org [spec.org].
Info Power (Score:2)
Chip sets for AMD are better (Score:2)
So with a lot of network use and disk use you can choke up that bus.
Re: Chip sets for AMD are better (No they aren't) (Score:2)
How did you come to the conclusion that AMD has better chipsets? I can get an nforce/crossfire/via motherboard for either AMD or Intel with pretty much identical specs. Intel has the advantage of making their own chipset, so Intel is the one that has the chipse
Re: Chip sets for AMD are better (Score:2)
Also, there are very few Intel workstation boards that can run the new Xeons and have at least one full x16 PCI-e slot.
Test idea (Score:2)
Also take a dual Intel workstation and try to do the same thing; the best that you can find is x8 x8.
Using hacked SLI drivers is OK.
I think that the AMD system will do better.
Re: (Score:2)
The "MyriMatch" benchmark shows intel is slower (Score:2)
Very interesting. The benchmark uses a database and is the only one I've seen that seems to test the limits of the CPU cache with a database.. and lo and behold, at 8 threads, performance degrades for the 5355 and it's actually slower than the Opteron 2218.
Or it could just be that this benchmark isn't coded well - it might use a global lock frequently so as you add more threads there's more contention. In any case someone with more time than
Re: (Score:2)
Take a look at http://tweakers.net/reviews/661/7 [tweakers.net] if you want to see how the performance of the Clovertown Core 2 chips scales with a scalable database and many clients.
HOWTO: save 20W/socket when idle on Opteron or A64 (Score:5, Informative)
All AMD K8 (Opteron and Athlon 64) CPUs have the ability to run the clock at an extra-slow speed when in HLT (idle) mode, saving a bunch more power. Many (most?) BIOSes are not smart enough to enable this. A simple setpci command will turn it on under Linux.
To find out if it's on:
setpci -d 1022:1103 87.b
If that returns 00, it's off. To turn on clock-divide-in-HLT to divide-by-512 mode, use:
setpci -d 1022:1103 87.b=61
(See the above URL for links to the AMD documentation on the PMM7 register; other values can work.)
timekeeping can go bad (Score:2)
Unless your Linux kernel is very recent, this condition will not be detected automatically. Linux will assume that the discrepancy means you are losing clock ticks.
You can try kernel parameters like clocksource=pmtmr to fix it. Good luck, you may need it...
The BIOS vendors disable this power-saving feature because there are Windows games that, like Linux, assume the timestamp counters don't vary in speed.
Re: (Score:2)
I'll check our kernel sources later to see if they include the code from the referenced LKML post or already default to not preferring the TSC for timekeeping.
Watt-seconds ? (Score:2)
How can you subtract a unit of time (seconds) from a unit of power (watts)?
Assuming multiplication was intended instead of subtraction, why use watt-seconds instead of joules? Still, kudos for using SI units and not something like boe.
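For reference, the conversion is trivial: one watt-second is exactly one joule, since a watt is a joule per second. A minimal sketch (Python; the 230 W / 2-minute task is a made-up example):

```python
# 1 W = 1 J/s, so watt-seconds are already joules; dividing by
# 3.6e6 (joules per kilowatt-hour) gives kWh for billing purposes.
def joules(watts, seconds):
    return watts * seconds

task_j = joules(230.0, 120.0)   # e.g. a 230 W box busy for two minutes
print(f"{task_j:.0f} J = {task_j / 3.6e6:.6f} kWh")
```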
Re: (Score:1)
What About Efficiency as a Space Heater (Score:3, Funny)
More importantly, how does that compare to a dedicated space-heater?
Re: (Score:2, Insightful)
The energy in the light radiated from the monitor or from the LEDs in the computer case is very small compared to the energy consumed by the computer. Computers do no useful physical work. The result is that almost all energy consumed by a computer is converted to heat.
Re: (Score:2, Informative)