AMD Intel Hardware

Xeon vs. Opteron Performance Benchmarks 362

Posted by timothy
from the for-your-summer-cottage's-data-farm dept.
QuickSand writes "Anand got his hands on some of Intel and AMD's enterprise processors including 4MB L3 Xeons, and put them to the test. Results were a little varied as 4-way Opteron systems seemed to fare the best, although dual Xeon configurations almost always beat dual Opterons. The exact benchmarks are here."
This discussion has been archived. No new comments can be posted.

  • IA-32e vs IA-32 (Score:5, Interesting)

    by Stonent1 (594886) <(ten.kralctniop.tnenots) (ta) (tnenots)> on Wednesday March 03, 2004 @01:19PM (#8453143) Journal
    Can somebody tell me if the IA-32e processors will be in the socket 478 format to work with existing boards, or will they require a whole new socket and chipset (rather than a BIOS update)? If they really are just "extensions", then I don't see why anything special would need to be on the motherboard, correct? The CPU should switch into 64-bit mode whenever the OS tells it to, right?
    • Re:IA-32e vs IA-32 (Score:5, Informative)

      by irokitt (663593) <[archimandrites-iaur] [at] [yahoo.com]> on Wednesday March 03, 2004 @01:30PM (#8453266)
      I don't really know, but I think Intel's 64-bit chips will probably use a Tejas-style clip system, not pins, which is technology that won't work in a current motherboard. But once AMD upgrades to socket 939 for the FX-51, it won't work in current boards either. AMD and Intel are both set to release the new sockets at about the same time PCI-Express comes out, so upgrade-happy people will need to buy new motherboards anyway.
      • Re:IA-32e vs IA-32 (Score:4, Insightful)

        by Anonymous Coward on Wednesday March 03, 2004 @01:54PM (#8453577)
        need to buy new motherboards

        This makes me want to throw up. The last motherboard purchase I made, it was a chore finding one with the _least_ amount of features. Need an AMR riser slot? Fuck no, I'd rather have the PCI slot back. Need integrated sound? No, integrated sound makes my already bad speakers sound worse. It must've been tough figuring out how to make a decade's worth of improvements in technology amount to nothing. I have an ISA soundblaster from 10 years ago that sounds better than the onboard sound on my last motherboard. Need integrated video? I won't begrudge you this. Some people build clusters with their motherboards, and a video card is needed to boot, but if I have a choice I won't buy a mobo with integrated video.

        In short, I want a motherboard with slots for RAM, an AGP slot, a socket/slot/hole for a CPU, PS/2 hookups, serial and USB connectors, and the rest of the board filled up with PCI (or PCI express) slots. That's the ticket.
        • Re:IA-32e vs IA-32 (Score:3, Interesting)

          by Octorian (14086)
          Well, any basic standardized and commoditized integrated I/O is a good thing. I'd also be fine with on-board Ethernet (good chipset, of course) and on-board SCSI/FC/SATA/etc. But yeah, I'd rather add my own cards for video/sound.
        • by LordHunter317 (90225) <(askutt) (at) (gmail.com)> on Wednesday March 03, 2004 @02:30PM (#8453978)

          This makes me want to throw up. The last motherboard purchase I made, it was a chore finding one with the _least_ amount of features. Need an AMR riser slot? Fuck no, I'd rather have the PCI slot back.
          You do realize it costs much less to put an AMR or CNR slot on a motherboard than a PCI slot, right?

          Need integrated sound? No, integrated sound makes my already bad speakers sound worse. It must've been tough figuring out how to make a decade's worth of improvements in technology amount to nothing. I have an ISA soundblaster from 10 years ago that sounds better than the onboard sound on my last motherboard.
          Now it's obvious you're trolling. Say what you want about AC97-based onboard sound (which nearly everything is), but it's good enough. Your speakers are much more likely the problem. The long and short is that all PC sound cards equally suck until you get to professional grade gear.

          Need integrated video? I won't begrudge you this. Some people build clusters with their motherboards, and a video card is needed to boot, but if I have a choice I won't buy a mobo with integrated video.
          What does having this cost you? It's not like you have to use it, or that boards with onboard video cost significantly more.

          • Re:IA-32e vs IA-32 (Score:5, Informative)

            by Yokaze (70883) on Wednesday March 03, 2004 @03:53PM (#8454981)
            > Now it's obvious you're trolling. [...] The long and short is that all PC sound cards equally suck until you get to professional grade gear.

            You must be joking. Most of the integrated sound I've had the joy to listen to produced noticeable background noise. The most obvious one was the Eden-M board. Most motherboard makers don't care much about separating the analogue part from the digital, so, paired with an integrated graphics card, you can practically hear how a window is restored. It is usually not the quality of the on-board sound which sells it, but purely the capability of producing some sound.
    • Re:IA-32e vs IA-32 (Score:3, Informative)

      by adler187 (448837)
      Technically since the new Prescott chips have the instructions on them, yes you could buy a 478 Prescott. You would just have to hack the chip to activate the instructions (good luck with that!).

      As for a real IA-32e chip with the instructions enabled, Intel has stated that they aren't coming out for a while, and since Intel is moving the P4 to the new socket 775 in a few months, I would bet that they would also be released on this new socket. Heck, they might not even release their IA-32e chips within
    • Re:IA-32e vs IA-32 (Score:4, Insightful)

      by Loki_1929 (550940) on Wednesday March 03, 2004 @02:26PM (#8453931) Journal
      " Can somebody tell me if the IA-32e processors will be in the socket 478 format to work with existing boards, or will they require a whole new socket and chipset (rather than a bios update)"

      They're disabled in all socket 478 chips. The new Pentium 4e chips (Prescott core) supposedly have the extensions, but they remain disabled. Technically, it may be possible to gain access to those instructions through some sort of BIOS hack, but you would also need to use an Operating system that can detect, support, and make use of those new instructions. Also, you risk using unfinished or untested parts of the CPU if you do manage to gain access and use the extensions. There would be no benefit other than simple tinkering.

      "If they really are just "extensions" then I don't see why anything special would need to be on the motherboard correct?"

      You still need a CPU that supports the instructions, and which has them enabled. Technically, if Intel released Prescotts in S478 form with IA-32e enabled, it should work fine with an existing motherboard which would otherwise support the Prescott chip you're using. The probability of Intel taking the time and effort to do this less than a quarter away from a whole new socket is virtually nil.

      "The cpu should switch into 64bit mode whenever the OS tells it to right?"

      That's not entirely accurate. Technically, what happens under AMD64 (the basis for IA-32e), is that specific instructions can be sent to the CPU to have it run code in what's called "Compatibility mode", which essentially allows it to behave as though it were a 32-bit CPU. The difference is that you're not 'switching' to 64-bit mode. You're either in 64-bit mode with the option for compatibility mode when needed (meaning you need a 64-bit capable OS), or you're in 32-bit and you're stuck in 32-bit.

      If you're looking for 64-bitness, you may simply want to get an Athlon64. If you're waiting for 64-bitness on the Intel side of things, you'll be waiting until some time towards the end of this year. Good luck.
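A rough way to see the "you're either 64-bit or you're not" point from user space is to ask a running program how wide its own pointers are. The sketch below is only an illustration and says nothing about how the CPU enters each mode; the helper names `pointer_bits` and `flat_address_space_bytes` are made up for this example. A 32-bit build running in compatibility mode on a 64-bit OS would still report 32.

```python
import struct
import sys


def pointer_bits() -> int:
    """Width, in bits, of a native pointer in the running interpreter."""
    # struct format "P" is a C void*, so its size reflects the build's word size.
    return struct.calcsize("P") * 8


def flat_address_space_bytes(bits: int) -> int:
    """Number of bytes addressable with a flat address of the given width."""
    if bits <= 0:
        raise ValueError("bits must be positive")
    return 2 ** bits


bits = pointer_bits()
print(f"This interpreter was built as a {bits}-bit program")
print(f"A 32-bit flat address space covers "
      f"{flat_address_space_bytes(32) // 2**30} GiB")
```

The process-level answer is fixed when the program is built, which mirrors the comment above: a 32-bit binary stays 32-bit even when the OS underneath it is 64-bit.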

    • Re:IA-32e vs IA-32 (Score:3, Interesting)

      by DJStealth (103231)
      They will most likely require entirely new sockets as they are 64bit chips, as opposed to 32 bit.

      The 32e means it's an extended version of the machine/assembly code modified from IA32 to work on 64bit processors and still have backwards compatibility.
  • by Can it run Linux (664464) on Wednesday March 03, 2004 @01:19PM (#8453145)
    because EVERYONE knows that Intel always wins.
  • by chef_raekwon (411401) on Wednesday March 03, 2004 @01:20PM (#8453151) Homepage
    dual xeons have owned the market for a long time...it will be difficult (although not impossible) for AMD to topple this.

    many people did not upgrade to Intel's Itanium, but rather were upgrading to their high end dualie xeon systems -- they run very reliably, and very fast. a few instances where we've put in dual 2.x ghz xeons for web/mail servers...and only a slashdotting could bring them down...(well, an exaggeration...but you get the point).
    • by ePhil_One (634771) on Wednesday March 03, 2004 @01:28PM (#8453232) Journal
      many people did not upgrade to Intel's Itanium

      Folks were avoiding the Itanium because it was a disaster; slow and expensive. We've been looking at 64 bit computing for a while, because of the seamless > 4GB RAM capabilities. Intel's PAE extensions are OK, but they really didn't solve any of the problems we were having.

      The net result was we went to 64 bit PPC architecture 3 years ago on those critical systems, and everything has been fine. AIX works great, and IBM's embrace of GNU/Linux means an easy learning curve for us Linux users.

      • by hackstraw (262471) * on Wednesday March 03, 2004 @01:55PM (#8453584)
        I have not heard that the Itaniums were slow since Intel came out with the Itanium2. Yes, the Itanium1's were dog slow. I've got 65 Itanium2 processors downstairs, and I'm happy with them. For our purposes (crunching numbers on very large datasets) the Itanium2 was the platform of choice because of its 64bit addressing, high memory bandwidth and good processor speed.

        I wish we could get by with cheap Xeons, but they just don't cut the mustard for our applications.
        • by Loki_1929 (550940) on Wednesday March 03, 2004 @02:39PM (#8454071) Journal
          The Itanium2s and up are pretty decent so long as you're working with code designed for 64-bit/EPIC. Where you run into problems is with 32-bit code, or pretty much any code not designed/optimized for EPIC. There's nothing wrong with Itanium in-and-of itself; it's just not cut out for compatibility the way x86 is. Had Intel stuck with the original plan for IA-64 (which was to replace x86 from top to bottom), this would have been fine. You simply would have lost the ability to use old applications, but new ones would run reasonably fast. 10 years later, Itanium has its niche, does quite well within that niche, and sucks for everything else. :)

          "I wish we could get by with cheap Xeons, but they just don't cut the mustard for our applications."

          This is exactly why Opteron DOES compete with Itanium - if only indirectly. Opteron will never hit the big-tin niche, simply because it was never designed, nor intended to do so. What Opteron does is bring 64-bitness, and all the benefits therein to the mid-range crowd. This forces Intel to choose between giving up on Itanium as anything other than a big-tin chip, or losing half its mid-range customers to AMD. Losing such a lucrative market would be far worse for Intel in the long run than losing the 10 years of R&D sunk into Itanium, so they've chosen to bring the Xeon line to the 64-bit world. With the new Potomac core (Q1/H1 '05), the XeonMP will be the CPU of choice for Intelphile mid-range customers in need of Itanium's benefits, but conscious of cost. The result will be that Itanium's legs will finally be completely taken out from under it, and it will be resigned to little more than a handful of extremely high-end big-tin servers each year.

          Does this mean Intel should continue to develop Itanium, even if it becomes clear it can no longer sustain its own R&D? I don't know - I think that's a question for Intel's board to answer. What I do know is that AMD had it right in '98/'99 when they decided to help transition people to 64-bit CPUs without losing x86's incredible compatibility. The bottom line is that someone like you would have gladly gone with either Opterons or Xeons had the choice been given to you. Unfortunately for Intel's margins, you and those in your position now have that choice.

          • Opteron will never hit the big-tin niche, simply because it was never designed, nor intended to do so.

            Heh, I guess the Cray Red Storm [amd.com] system kind of shoots down that theory... ;-)

            Actually the design of Opteron beats Itanium for HPC, and the relative number of Opteron vs. Itanium HPC design wins bears that out nicely.

            • Except that theory takes a severe beating. Compare entries 4, 5 and 6 here at Top 500 [top500.org], especially the number of CPUs. Notably, entries 4 and 6 use the same kind of high-speed interconnects between the nodes, so the difference can't be blamed on that.

              The problem with the AMD approach is that you get the NUMA drawbacks not only against other nodes, but internally on the node. If the data CPU 1 needs isn't in its own memory banks, it's got to request it from CPU 2, 3 or 4, with a latency penalty (Letting t
      • Itaniums are expensive, but not outrageous compared to other high-end processors like Power4 or UltraSPARC. They also perform quite well. They are definitely better performers than Xeons for most of our apps.

        The problem with itanium is not that they aren't a good technology, but rather that intel is trying to shove them into the high-end of the market, which is a difficult place to compete. sparc, power, pa-risc, alpha have all been around for years, have established customer bases, and lots of businesses
  • by Pingular (670773) on Wednesday March 03, 2004 @01:20PM (#8453152)
    Xeons are almost always for servers, whereas Opterons can be for anything. Try running a Windows XP workstation on a dual Xeon system and you'll be very disappointed.
    • by Anonymous Coward
      I call bullshit. You make a blanket statement without anything to support it or any logical argument at all. Of course you will get modded up, though, because your post is anti-Intel.
    • by hackstraw (262471) * on Wednesday March 03, 2004 @01:41PM (#8453401)
      Xeons are almost always for servers, whereas Opterons can be for anything. Try running a Windows XP workstation on a dual Xeon system and you'll be very disappointed.

      OK, let's go over this again. There is nothing really special about Xeons vs a P4 except the P4 is crippled so that it cannot do SMP, and there may be more cache options on a Xeon. Performance-wise they are the same at the same clock speed. FWIW, I've been disappointed with XP regardless of the hardware :)

      Now, back to this benchmark thingy. First, I would have appreciated it if the article writeup had said that this was only a simple read/write database benchmark and nothing more, but we don't come to Slashdot for the stories, right? Also, in my opinion there was no significant difference between the two platforms regarding their speed on this benchmark. The difference between 1st and 2nd place, regardless of who won that test, was between 5 and 12%. I don't start to get interested until there is at least a 20% difference, and even then that would only determine my choice for an initial purchase. I would never upgrade a system unless there was at least a 100% speedup; preferably, 200 to 400% is worthy of doing an upgrade.

      It would have been interesting to see results like this for more platforms, because I have not seen any significant numbers from the Opteron yet. For example, the memory bandwidth of the Opteron is 1/2 that of the Itanium2's.
      • by Laur (673497) on Wednesday March 03, 2004 @02:00PM (#8453634)
        Also, in my opinion there was no significant difference between the two platforms regarding their speed on this benchmark. The difference between 1st and 2nd place, regardless of who won that test, was between 5 and 12%. I don't start to get interested until there is at least 20% difference

        How about cost? The Xeons cost twice as much as the Opterons, and the Opterons give equivalent or better performance! Although you are correct that the performance difference may not be staggering (and between top of the line chips, who would expect it to be?), the price/performance ratio certainly is.
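To put rough numbers on the price/performance argument, here is a small sketch. The prices and throughput figures are hypothetical placeholders, not values from the Anandtech article; the only things taken from this thread are "the Xeon costs about twice as much" and "the gap between 1st and 2nd place was 5 to 12%". The function name `perf_per_dollar` is made up for the example.

```python
def perf_per_dollar(throughput: float, price: float) -> float:
    """Benchmark throughput delivered per unit of money spent on the CPU."""
    if price <= 0:
        raise ValueError("price must be positive")
    return throughput / price


# Hypothetical inputs: give the Xeon the full 12% lead and a 2x price tag.
opteron = perf_per_dollar(throughput=100.0, price=1000.0)
xeon = perf_per_dollar(throughput=112.0, price=2000.0)

# How many times more benchmark throughput per dollar the Opteron delivers.
ratio = opteron / xeon
print(f"Opteron delivers {ratio:.2f}x the throughput per dollar")
```

Even when the Xeon is handed the best case from the quoted 5 to 12% range, doubling the price leaves the Opteron well ahead per dollar under these assumed figures.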

        • Not only that, but when they were comparing prices, they said that the 2MB cache Xeons were twice the price of the Opterons. However, in the article they tested the *4MB cache Xeons*, which you can expect to be even more expensive.

          In terms of bang for your buck Opterons rock.
      • by cnkeller (181482) <cnkellerNO@SPAMgmail.com> on Wednesday March 03, 2004 @02:04PM (#8453676) Homepage
        I don't start to get interested until there is at least 20% difference, and even then that would only determine my choice for an initial purchase, I would never upgrade a system unless there was at least 100% speedup, preferably 200 -> 400% is worthy of doing an upgrade.

        Good post. However the one comment that I didn't agree with was the above.

        My guess is that you aren't involved with any applications where compute time = money. When you are running simulations (say large CFD runs for example) that can take days or weeks per run, a 50% improvement in speed is a major breakthrough if you get it by not touching code, i.e. hardware upgrades. Optimizing code is great and all, but it can introduce bugs and other unexpected behavior. Plus, us development people are pricey. Hardware is relatively cheap. Add in the fact that you generally get charged for CPU time on these big machines (or clusters of little ones), then *any* speed that you get is a major breakthrough, i.e. you can run more simulations in the same time for the same money.

        In your environment, it's probably okay for you to only upgrade every three years when you get a doubling or more of performance, but there are environments where any speed increase is sought after highly, even if it's 20%. I suspect this is true of the special effects industry too, guys like Pixar, ILM, etc. If they can render more frames in the same time or even render the frames in the same time at a higher level of detail, that's worth paying for. Perhaps someone who knows more would care to enlighten us, I'm curious if I'm interpreting that correctly.

        • My guess is that you aren't involved with any applications where compute time = money.

          You're right. I work with scientists that run programs up to 5 days over 10 to 20 processors. We get our money upfront, but everyone wants their answers quickly.

          When you are running simulations (say large CFD runs for example) that can take days or weeks per run, a 50% improvement in speed is a major breakthrough if you get it by not touching code, i.e. hardware upgrades.

          So you're saying that it's more cost effective for
          • So you're saying that it's more cost effective for you to upgrade every 6 to 9 months? That's fine if it pays off for you. You probably don't have that many processors to worry about either. Trust me, it's not trivial to upgrade 60 to 120 processors that often, even if the machines were given to me.

            It depends, I'm not trying to make a blanket statement that this is always the case, but yes, I can certainly envision scenarios in which the benefit to customers is worth the price of the upgrade when you get less

  • by hng_rval (631871) on Wednesday March 03, 2004 @01:21PM (#8453165)
    Test Results can be found here:
    http://www.anandtech.com/IT/showdoc.html?i=1982&p=6 [anandtech.com]
  • by SuperBanana (662181) on Wednesday March 03, 2004 @01:22PM (#8453167)
    for-your-summer-cottage's-data-farm

    Ah, so for all our college-student friends, that would be "the parents' house"?

  • Cache always help (Score:5, Informative)

    by Anonymous Coward on Wednesday March 03, 2004 @01:22PM (#8453168)
    I remember AMD's K6-3 would blow away the K6-2 at the same clock speed with the major difference being the cache.
    • Re:Cache always help (Score:3, Interesting)

      by Vaystrem (761)
      Cache may always help but this is not as straightforward a statement as you indicate. It is highly dependent upon the architecture of the processor.

      The reason the 4MB Xeons are significantly outperforming the 2MB Xeons is due to the shared bandwidth architecture of the Xeons. The cache makes up for the lack of access to data via the FSB and keeps the very deep pipeline of the P4 series processors full. The long pipeline is the reason that cache misses impact the speed of the P4s so much - despite Inte
  • I recommend Glasses (Score:5, Informative)

    by Avrice (237283) on Wednesday March 03, 2004 @01:22PM (#8453170) Homepage Journal
    Whoever is citing Anandtech as claiming the dual Xeons almost always beat the dual Opterons needs to read the article again. Both architectures in a dual configuration tended to perform about the same, with Opteron and Xeon each winning some of the time. The Opteron scales better above dual configurations. However, the Opteron is HALF the price of a Xeon! Cost/performance (or else we would all have 12th generation DEC Alphas or Power5s by now) is easily handed to Opteron. Nice spin!
    • by embarcadero (568047) on Wednesday March 03, 2004 @01:47PM (#8453482)
      In addition, Anand used sub-optimal memory in the Opteron, and a non-NUMA config. Looks like he had some Intel "assistance" in designing the "benchmarks" as well... the database read/write ratio is not at all realistic, and favors the Xeon.
      • by brucmack (572780) on Wednesday March 03, 2004 @02:12PM (#8453779)
        The purpose of the test is not to test the memory, but to test the processors. Thus, they used the same memory in testing each processor configuration.

        One of the purposes of the test was to show how the memory bandwidth bottleneck of the Xeons limits their effectiveness in 4-way configurations, a problem the Opterons do not have. Doing this comparison with different memories would make things more complicated.

        Additionally, you'll notice that Anand's final words recommend the Opteron for being at least equivalent and much cheaper than Xeon. This was also the selection process for their new forum servers, so you can bet that they aren't getting any kickback from Intel, or those would be Xeons.

        If you still have doubts about the validity of Anandtech's testing, check out the benchmarks from their AMD vs. Intel web server test in December: http://www.anandtech.com/IT/showdoc.html?i=1935&p=9 [anandtech.com]. All on dual processor configurations. There is definitely no Intel bias in that test.

        Really, I think some people ought to think before they flame like this. The benchmarks are showing the Opterons to be equivalent or faster in 2-way configurations and definitely faster in 4-way configurations, so what is there to complain about? The fact that Anandtech has consistently recommended AMD's processors just makes it doubly silly.
  • by Anonymous Coward on Wednesday March 03, 2004 @01:24PM (#8453190)
    At two processors Xeon is still ok because the bandwidth of the memory coherency still isn't in serious contention. However, as the systems scale larger support for NUMA is critical to reducing memory latency because it means that memory does not have to flow in from the controllers on other processors.

    That is why Opteron is required for good performance with eight to sixteen processors, and you can even see the improvement on the four way tests that Anand ran.
    • by Anonymous Coward
      NUMA is rarely employed in systems at the 8 to 16 processor size. Those are typically SMP.

      Saying Opteron is better for 16 way means nothing as those systems do not exist.

      NUMA architectures are not without issue. I emphasize the "NON UNIFORM" aspect of the acronym. Even if SGI wants to change it to "nearly uniform". Sounds like you've been reading too much of their propaganda.

    • by flaming-opus (8186) on Wednesday March 03, 2004 @01:59PM (#8453617)
      You will find that most high-end Xeon systems are also NUMA systems. IBM, Unisys, HP all construct their really big xeon boxes as NUMA-clusters of 4-processor SMPs. They create a distributed memory machine at the chip-set level. This is actually what the opteron does, except that the chip-set (well, the memory controller part of it) is built into the processor.

      I think the above poster had the correct idea about NUMA, but worded it in a misleading way. A NUMA design (either of opterons, or of Xeon-quads) will have to do some memory access through the memory controllers on other nodes. This increases the latency of memory access, and can clog up the inter-processor links if lots of memory loads/stores go to remote memory. Thus NUMA-aware operating systems and system libraries are necessary to maximize the amount of memory access that is local, and minimize the usage of the inter-processor links.

      While the Opteron design is elegant, and fast, it is not the only smart way to do things. It offers great aggregate memory bandwidth, but can slow things down in the worst case. Most large NUMA systems are created by linking 4-way SMP nodes. (Examples: Sunfire, HP alphaservers, Cray X1, NEC SX-6, Unisys 7000, IBM xseries 4xx xeon, IBM xseries 4xx itanium,...) Apart from Opteron systems, the only systems I can think of that do NUMA per processor are the Cray T3E, SGI Origin, and Intel Paragon, all of which are massively parallel supercomputers.

      It is safe to say, however, that a shared bus system does not scale well beyond a few processors. This is best demonstrated by the 36 processor SGI-challengeXL, which was significantly bottle-necked at the memory bus.

      food for thought.

    • by Fnord (1756) <joe@sadusk.com> on Wednesday March 03, 2004 @03:42PM (#8454819) Homepage
      I don't think you really understand how NUMA works. The whole point of NUMA is that memory DOES have to flow from controllers on other processors. Xeons use a uniform memory architecture, meaning that they all share a memory controller. Even if that memory controller is faster than a single proc system's memory controller, they're still contending for it. However, they do all have equal access to it.

      NUMA is a tradeoff. Each processor has its own memory controller and its own bank of memory. Therefore each processor has preferential access to that bank of memory. If, however, a processor attempts to access memory in another processor's bank, it's slower. This cost is mitigated by intelligent VMs that attempt to put the memory for processes in the memory bank controlled by the processor running that process. If this is done efficiently enough, the benefit of having 4 memory controllers far outweighs the cost of possibly having to get memory slowly from another processor.
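The tradeoff described above can be sketched as a weighted average of local and remote access times. This is a toy model only: the 80 ns and 140 ns latencies are invented for illustration and are not measurements of any Opteron or Xeon system, and the function name `average_latency_ns` is hypothetical.

```python
def average_latency_ns(local_fraction: float,
                       local_ns: float,
                       remote_ns: float) -> float:
    """Expected memory latency when a share of accesses hits the local bank."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


LOCAL_NS = 80.0    # assumed latency to the processor's own memory bank
REMOTE_NS = 140.0  # assumed latency via another processor's controller

# Naive placement on a 4-way box: only 1 access in 4 happens to be local.
naive = average_latency_ns(0.25, LOCAL_NS, REMOTE_NS)

# NUMA-aware placement: the OS keeps most of a process's pages local.
aware = average_latency_ns(0.90, LOCAL_NS, REMOTE_NS)

print(f"naive placement:      {naive:.0f} ns average")
print(f"NUMA-aware placement: {aware:.0f} ns average")
```

The model leaves out contention entirely, which is the other half of the argument: four controllers also mean four independent paths to memory, so the real gain over a shared bus can be larger than the latency figures alone suggest.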
  • by SlashingComments (702709) on Wednesday March 03, 2004 @01:26PM (#8453213)
    I saw this yesterday on his site. Pretty good.

    One thing I did not understand is how come the 3MB cache is helping with big database queries? I thought that would thrash the cache and there would not be much performance gain if you are working with a bigger code/data set. Also, for the four-CPU Opteron, do they have HyperTransport going from every CPU to every CPU? Is it like a mesh, or like a ring where every CPU has only two connections to its neighbours?

    Another thing I did not get is how Linux is handling (or not handling) the memory local to each CPU. This thing looks like a mini-NUMA type system. Does Linux actually try to keep the data in the RAM attached to the CPU that processes it? How does this really work?

    Maybe you guys can help clear up my ideas.

    • by chef_raekwon (411401) on Wednesday March 03, 2004 @01:43PM (#8453435) Homepage
      One thing I did not understand is how come the 3MB cache is helping with big database query

      this is interesting, and i don't have an answer, except to say that SQL servers generally try to load all of the tables into available RAM. If the data is too large, then simply the indexes(??). If the server ever has to go back to the hard drive for data (which would make it bloody slow for a query) it will check recently cached stuff first - and larger caches mean reduced time pulling data sets from a raid array or single hard drive.

      that is at least my take...it generally differs from server to server....MySQL does not run exactly the same way as MSSQL or Oracle. (which means I've generalized.)
      • by Twillerror (536681) * on Wednesday March 03, 2004 @02:59PM (#8454297) Homepage Journal
        SQL servers generally use a page caching system. That is, the database exists on the hard drive as a series of xK pages (8K pages for MS SQL Server).

        As a page gets loaded from the hard disk it is loaded into the server's cache. Any reads/writes are done to memory and not the disk. Background processes write the dirty pages to disk. All transactions are written to the transaction log, so if the server crashes before this happens, recovery can happen when the db starts back up.

        This means that a large portion of the data is already in memory. Servers usually pre-allocate gigs of memory for this purpose, the more the better and a big reason for 64 bit on large dbs.

        Under x86 caching schemes, the CPU does speculative loads. It "guesses" what memory the processor is going to need and starts loading it into high speed cache. This is perfect for a db since most of the time the db pages you need are in sequential order. Especially when you are talking about pages that only include indexing data. The query usually does most of its work using indexes, and then at the end will actually "lookup" the data.

        So bigger caches mean that these big binary trees get loaded into cache and the algorithms can loop through them faster and pull off the cache.
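The page caching scheme described above (reads and writes go to memory, dirty pages are flushed later, and changes are logged first) can be sketched roughly as follows. This is a toy model, not how MS SQL Server or any real engine is implemented: the `BufferPool` class, its LRU eviction policy, and the plain list standing in for the transaction log are all assumptions made for illustration.

```python
from collections import OrderedDict

PAGE_SIZE = 8192  # 8K pages, the MS SQL Server size mentioned above


class BufferPool:
    """Toy page cache: all I/O hits memory; dirty pages reach disk later."""

    def __init__(self, disk: dict, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1 page")
        self.disk = disk            # page_id -> bytes; stands in for the data file
        self.capacity = capacity    # maximum number of pages held in memory
        self.pages = OrderedDict()  # page_id -> bytearray, least recently used first
        self.dirty = set()          # pages changed in memory but not yet on disk
        self.log = []               # stands in for the transaction log

    def _evict(self) -> None:
        """Drop the least recently used page, writing it out first if dirty."""
        victim, page = self.pages.popitem(last=False)
        if victim in self.dirty:
            self.disk[victim] = bytes(page)
            self.dirty.discard(victim)

    def _load(self, page_id: int) -> bytearray:
        """Return the in-memory copy of a page, fetching it from disk on a miss."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return self.pages[page_id]
        if len(self.pages) >= self.capacity:
            self._evict()
        page = bytearray(self.disk.get(page_id, bytes(PAGE_SIZE)))
        self.pages[page_id] = page
        return page

    def read(self, page_id: int, offset: int, length: int) -> bytes:
        return bytes(self._load(page_id)[offset:offset + length])

    def write(self, page_id: int, offset: int, data: bytes) -> None:
        # Record the change in the log before touching the page in memory.
        self.log.append((page_id, offset, bytes(data)))
        page = self._load(page_id)
        page[offset:offset + len(data)] = data
        self.dirty.add(page_id)

    def flush(self) -> None:
        """What the background writer does: push every dirty page to disk."""
        for page_id in sorted(self.dirty):
            self.disk[page_id] = bytes(self.pages[page_id])
        self.dirty.clear()
```

A short usage example: after `pool = BufferPool({}, capacity=2)` and `pool.write(1, 0, b"abc")`, the change is visible through `pool.read(1, 0, 3)` and recorded in `pool.log`, but the backing `disk` dict stays untouched until `pool.flush()` runs or the page is evicted.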

        Take this into the Itanium world and we can start to get even better performance. The thing people tend to forget about Itanium is that you can tell it to load your data into cache. So an optimized DB server can have it read this large section of code into the cache while it does calculations. Itanium allows 3 instructions to be loaded. Once hyperthreading is put into Itanium you will see these DB apps really fly. Itanium is showing good promise in this arena, even at 1.5 GHz. Clock that up to 2 to 3 GHz with multiple hyperthreaded cores and we are going to have one fast chip for dbs.

        The big issue is the price; if you are going to spend that much money, go with the proven Sun/IBM. Itanium is set to replace the Xeon by 2007 (I'm guessing before then because of scaling issues on the Xeon and x86 emulation software giving decent 32 bit for legacy apps).

        I really think Intel needs to push their Itanium. MS likes it, Linux likes it, a few db servers like it, and a slew of other high performance, server based things. I don't see how Intel is going to scale against the Oppie. Not unless they stick in an on-die memory controller. Hyperthreading and the new thread based instructions will help though. Should be an interesting battle. I'll be happy if AMD can get 10% to 20% marketshare, then we will see some true competition and innovation like we have on the desktop.

  • by Anonymous Coward on Wednesday March 03, 2004 @01:27PM (#8453227)
    Believe it or not, Intel's compiler generates very good code for the Opteron. Far better than GCC or generic IA32 compilers.

    So in any evaluation, which compiler and binaries were used is an important question.

    There was no mention of this in the article.
    • by Aardpig (622459) on Wednesday March 03, 2004 @01:44PM (#8453441)

      Believe it or not, Intel's compiler generates very good code for the Opteron. Far better than GCC or generic IA32 compilers.

      This is the experience I've had with the Intel Fortran compiler (ifort) on an Athlon XP. Codes compiled with ifort are around twice as fast as those compiled with GNU g77 (for Fortran 77), and around 1.5 times faster than those compiled with Lahey lf95 (for Fortran 95).

  • OS (Score:4, Interesting)

    by millahtime (710421) on Wednesday March 03, 2004 @01:27PM (#8453228) Homepage Journal
    So I see that M$ Windows was used as the OS. Unless this was a prerelease of the 64-bit XP, they were running a 32-bit OS on the chips. So, wouldn't that mean that this isn't a true test of the power? You're not taking full advantage of the 64-bit power.
  • Back to Intel Fanboy (Score:3, Interesting)

    by superpulpsicle (533373) on Wednesday March 03, 2004 @01:28PM (#8453236)
    Alright I have had about 3 AMD processors die on me. I have owned about 4 Intel processors all the way back from original Pentium. Not one has ever had a problem.

    Now... given this kind of statistics, as sad as it may sound I'd say I am willing to pay anything for an Intel just to avoid the headaches.
    • by tuffy (10202)
      I've owned 5 AMD processors from the K5 to an Athlon64 and all are still in perfect working order. But these sorts of anecdotes aren't very helpful in determining average chip reliability.
    • by hng_rval (631871) on Wednesday March 03, 2004 @01:39PM (#8453371)
      Alright I have had about 3 AMD processors die on me. I have owned about 4 Intel processors all the way back from original Pentium. Not one has ever had a problem.
      Now... given this kind of statistics, as sad as it may sound I'd say I am willing to pay anything for an Intel just to avoid the headaches.


      That is an interesting use of the word statistics. In order to determine if your next processor is likely to break, you should look at thousands or hundreds of thousands of Intel procs and AMD procs. Your 7-processor study is inherently flawed.
    • by hyperstation (185147) on Wednesday March 03, 2004 @01:45PM (#8453462)
      you know, you're supposed to outfit those AMD processors with fans, heatsinks, and some of that thermal paste....
    • by John Courtland (585609) on Wednesday March 03, 2004 @01:45PM (#8453464)
      Wow, I have a working AMD 386/40. Yet I have a score of dead Intel 286/386/486's. I just evened out your "statistics". Not to mention the 5th gen and above x86 class processors I have.
    • by Mr Guy (547690) on Wednesday March 03, 2004 @01:54PM (#8453566) Journal
      And, as we all know, the plural of anecdote is indeed "data".
  • by eddy (18759) on Wednesday March 03, 2004 @01:28PM (#8453237) Homepage Journal

    Some info here [techreport.com]. SSE3 is the big thing.

    • Some info here. SSE3 is the big thing.

      I doubt SSE3 will make a difference for very many applications. A quick overview of x86 vector instructions:

      MMX: primarily vector integer instructions on 64-bit registers; main flaw is that they use the floating-point registers, so you can't mix MMX and FP code. Biggest win for image processing, which is usually 8-bit data and perfect for MMX.
      3Dnow: adds vector floating-point instructions on 64-bit registers (introduced by AMD).
      SSE: primarily vector floating-point
  • by Fallen Kell (165468) on Wednesday March 03, 2004 @01:30PM (#8453268)
    The gist of the whole thing is that Intel's architecture has a huge bottleneck in its FSB. All the processors share the same FSB and quickly max it out if there are more than 2 processors. So anyone building or buying systems with more than 2 processors will get much better performance out of an AMD Opteron system than an Intel one.
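
    A toy sketch of that scaling argument, for illustration only. The bandwidth figures are made-up units rather than measurements, both function names are hypothetical, and the model ignores the cost of one Opteron reaching memory attached to another:

```python
# Illustrative model only: bandwidth is in arbitrary units, not measured values.

def shared_bus_share(total_bandwidth: float, cpus: int) -> float:
    """Per-CPU bandwidth when every CPU contends for one shared front-side bus."""
    return total_bandwidth / cpus


def per_cpu_controller_total(per_cpu_bandwidth: float, cpus: int) -> float:
    """Aggregate bandwidth when each CPU brings its own on-die memory controller."""
    return per_cpu_bandwidth * cpus


for n in (1, 2, 4):
    # Shared bus: each CPU's slice shrinks as CPUs are added.
    # Per-CPU controllers: total bandwidth grows with the CPU count.
    print(n, shared_bus_share(1.0, n), per_cpu_controller_total(1.0, n))
```

    With four CPUs the shared-bus slice drops to a quarter of the bus, while the per-CPU-controller total is four times a single controller. Real systems will land somewhere less extreme than either line.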
  • by maharito (626909) on Wednesday March 03, 2004 @01:32PM (#8453287)
    I attend a university that is currently building a beowulf cluster, and when it came down to making a decision, the deciding factor was price/performance ratio. While it may make sense for enterprises to go with the Xeon, the Opteron is a clear winner, in my mind, when money is an object. Of course, if you have the money to burn, the Xeon may seem to be the more obvious choice.
    • by Pingular (670773) on Wednesday March 03, 2004 @01:37PM (#8453354)
      I attend a university that is currently building a beowulf cluster, and when it came down to making a decision, the deciding factor was price/performance ratio. While it may make sense for enterprises to go with the Xeon, the Opteron is a clear winner, in my mind, when money is an object. Of course, if you have the money to burn, the Xeon may seem to be the more obvious choice.
      Even if someone has money to burn, wouldn't it be better to get more performance anyway?
  • memory controllerS? (Score:5, Interesting)

    by Bender Unit 22 (216955) on Wednesday March 03, 2004 @01:33PM (#8453304) Journal
    But these days, with all the virtualization getting hot (VMware etc.), a server architecture with a single memory bus/controller is getting old.
    I'd like to see some test on servers like the IBM x445 [ibm.com] with NUMA.

  • The Usual Problem (Score:5, Interesting)

    by Sloppy (14984) * on Wednesday March 03, 2004 @01:37PM (#8453345) Homepage Journal
    We've seen this same type of benchmark over and over. It wasn't interesting then, and it's not interesting now.

    The tests in this article involved running the same exact binaries (out-of-the-box Microsoft 386 stuff) on both types of CPUs, rather than the code being compiled to run natively. The Opterons were fighting with one hand tied behind their backs.

    In other words, this benchmark is mainly only of interest to Microsofties. If that's what you run, then fine, the article may be useful to you and you may get something out of reading it.

    If you are trying to maximize speed, though, then the software constraints that this test took place under are totally contrary to what you'd actually be doing (running code that is appropriate for the hardware).

    BTW, another weird thing I noticed about this article: these guys use flash for static images of bar graphs. WTF? Anandtech, your w3b d3$1gn3rz R S0 31337!!!1

    • -5, Clueless (Score:5, Insightful)

      by Gothmolly (148874) on Wednesday March 03, 2004 @01:47PM (#8453491)
      Firstly, Anandtech uses flash for its images so that people w/o the plugin can't see the data. This forces you to install it, so that you can see their OTHER Flash pieces... ads.
      Secondly, you are not going to get MS to recompile an MS-SQL for Opteron. You're not going to get IBM to support a Linux installation, after you've rolled your own ueber-NUMA-patch-level-42 kernel.
      The test was clear - out of the box, plug in servers, load OS, load app, run benchmark.
      And the outcome was clear, the Opteron architecture is vastly superior, both performance and price-wise.
      The MHz myth is over, at least in Slashdot and Anandtech circles.
      • Re:-5, Clueless (Score:3, Insightful)

        by MikeBabcock (65886)
        1) Microsoft is a major Opteron supporter; they had a freely downloadable Opteron Windows XP beta available for some time now that I have an ISO of here.

        2) IBM would probably support uber-numa patched kernels as you put it, since they are one of the main proponents of Linux-on-massively-parallel-supercomputers anyway.

        Do some research.
      • Re:-5, Clueless (Score:3, Informative)

        by Junta (36770)
        Hmm... moderated yourself nicely.

        IBM supports native x86_64 distros (SLES8/AMD64 now, and looks like RHEL3/AMD64). So IBM supports you running 64-bit today.

        Windows is working towards 64-bit for Opteron, so yes, MS is recompiling their crap for Opteron.

        Simply because there is a logical reason for images being flash makes it no less annoying as hell.

        Opteron is a fascinating platform, and very cool, *especially* with respect to 64 bit computing.
  • Article on one page (Score:4, Informative)

    by mulle (30054) on Wednesday March 03, 2004 @01:37PM (#8453355)
    Full text [anandtech.com]
  • Hmm (Score:4, Funny)

    by Trailer Trash (60756) on Wednesday March 03, 2004 @01:44PM (#8453447) Homepage

    ...4-way Opteron systems seemed to fair the best...

    Did they fare well, also?

  • Conclusion... (Score:5, Informative)

    by ERJ (600451) on Wednesday March 03, 2004 @01:51PM (#8453554)
    Anand seems to conclude something a bit different than the submitter:

    The comparison we've made here is a very important one; it identifies Intel's strengths and their weaknesses with Xeon, and it crowns Opteron a clear multiprocessor winner. An area that we didn't touch on is cost, which is where AMD truly shines. The Opteron 848 processors we tested are around 1/2 the price of Intel's 2MB L3 Xeon MPs and we have not seen retail data on how expensive the 4MB parts will be.

    In a 4-way configuration AMD's Opteron cannot be beat, and thus it is our choice for the basis for our new Forums database server.
  • by Phaid (938) on Wednesday March 03, 2004 @02:11PM (#8453758) Homepage
    It's interesting to notice that in these tests, the Opterons were clocked quite a bit slower and had a lot less cache than the Intel CPUs, yet performed comparably in 2-way and better in 4-way than the Intel chips.

    The Opteron clocked at 2.2GHz with 1MB of cache was very close in 2-way performance to the Xeons at 3.0GHz and 3.2GHz, with 4MB and 2MB of cache respectively. The 1.8GHz Opteron compared well with the 2.8GHz Xeon with 2MB of cache. The Opterons were typically within 3% or so of their Intel counterparts in 2-way benchmarks and closer to 10% ahead in 4-way.

    If nothing else, this says a lot about the efficiency of the Opteron's design. Less silicon, and more importantly for AMD, less expensive silicon, manages to achieve very close results.
  • Comparing Prices (Score:5, Insightful)

    by gbulmash (688770) <semi_famous@yahooBLUE.com minus berry> on Wednesday March 03, 2004 @02:37PM (#8454048) Homepage Journal
    4 AMD Opteron 248's at Newegg: $5876 ($1469 ea)
    4 Xeons (@Intel's announced pricing): $14768 ($3692 ea)

    Did the quad Xeon system outperform the quad Opteron by a factor of 2.5:1? No. In fact, in some cases, the quad Opteron outperformed the quad Xeon. The Xeon had the advantages of hyperthreading, 4x as much cache, and a clock speed 800MHz higher than the Opteron, and still got beat.
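
    A quick sketch of where the 2.5:1 figure comes from, using only the per-chip prices quoted above. The variable names are just for illustration:

```python
# Per-chip prices as quoted in this comment (USD); four sockets each.
opteron_each = 1469
xeon_each = 3692
sockets = 4

opteron_total = opteron_each * sockets
xeon_total = xeon_each * sockets

# How much faster the quad Xeon would have to be to match on price/performance.
price_ratio = xeon_total / opteron_total

print(f"Opteron x{sockets}: ${opteron_total}")
print(f"Xeon x{sockets}: ${xeon_total}")
print(f"Xeon costs {price_ratio:.2f}x as much")
```

    On these numbers the Xeon box would need roughly two and a half times the throughput just to break even on CPU cost alone, before counting the rest of the system.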

    Clock speed may sell in the consumer market ("Me want bigger!"), but in the server market, Opterons getting better performance for half the price are going to win more and more converts.

    - Greg

    • Re:Comparing Prices (Score:3, Informative)

      by tjw (27390)
      4 AMD Opteron 248's at Newegg: $5876 ($1469 ea) 4 Xeons (@Intel's announced pricing): $14768 ($3692 ea)
      Actually, you'd need Opteron model 848 to do 4-way SMP. Obviously this is what you meant though, since you have the right price.
  • by DarkHelmet (120004) * <[mark] [at] [seventhcycle.net]> on Wednesday March 03, 2004 @03:07PM (#8454402) Homepage
    Here's a part that I can't help but laugh at:

    In our infinite desire to please everyone we worked very closely with a company that could provide us with a truly Enterprise Class SQL stress application. We cannot reveal the identity of the Corporation that provided us with the application because of non-disclosure agreements in place.

    Okay... So we know what kind of hardware they're testing against, but not knowing what kind of software they're benchmarking? "We're using an enterprise scenario" isn't good enough.

    It's nice to look at pretty charts and all, but I imagine anyone who is going to investigate enterprise level solutions is going to want to know EXACTLY what this is being benchmarked on.

    Even though I typically tend to trust Anandtech's outlook on things, I'm still kind of so-so on this review. Their forum test is not really externally reproducible and their enterprise test is too vague. I doubt any IT person would weigh this review too heavily when making a decision.

    Then again, I could be wrong.

  • by Loki_1929 (550940) on Wednesday March 03, 2004 @03:09PM (#8454427) Journal
    "although dual Xeon configurations almost always beat dual Opterons."

    Perhaps the submitter's screen reader doesn't work well with Flash, but in the 2-way benchmarks, Opteron was on top twice, and Xeon was on top 3 times. All the 2-way benchmarks were fairly close (within 5%), and the Xeons never beat the Opterons by a margin greater than 1.7%. I don't quite know how 40% wins translate into "almost always" losses. In other words, the story submitter is a moron, or simply didn't look at the article.

    "Results were a little varied as 4-way Opteron systems seemed to fare the best,"

    Seemed? Let's see, out of five 4-way benchmarks, Opteron won... all of them - performing about 10% better than Xeons each time.

    Since when did we start letting Tom Pabst submit articles to /. ?

    Note to editors: When the submission is non-sequitur, either reject it or edit it.

  • Not Surprising (Score:3, Interesting)

    by RAMMS+EIN (578166) on Wednesday March 03, 2004 @03:10PM (#8454436) Homepage Journal
    ``Results were a little varied as 4-way Opteron systems seemed to fare the best, although dual Xeon configurations almost always beat dual Opterons.''

    Varied, perhaps, but not surprising. AMD has integrated the memory controller on the CPU, which could explain their getting better when the number of CPUs increases (the Intels being held back by having to go through the same memory controller).

    As for Intel winning out on the dual CPU systems, well, they are ahead of AMD in the CPU speed race, aren't they?
  • by prisoner-of-enigma (535770) on Wednesday March 03, 2004 @03:12PM (#8454460) Homepage
    Opteron systems seemed to fare the best, although dual Xeon configurations almost always beat dual Opterons.

    Perhaps the benchmarks show the 2P Xeons doing OK against 2P Opterons, but for the price of two Xeon MP chips you can buy five Opteron 848s. Rounding that down, I wonder how well the 2P Xeon does against the 4P Opteron? Oops, Anand already thought of that. He says "it would not be pretty." Indeed.
    • by NerveGas (168686) on Wednesday March 03, 2004 @03:22PM (#8454577)
      Yes, the tests weren't exactly apples-to-apples - the outcomes are actually much better for AMD than the graphs would initially appear.

      The graphs mean that Opterons with a "measly" 1 meg of cache are beating out Xeons that have (a) four times the cache, (b) 50% higher clock speed, and (c) a price tag that's three times higher.

      Hats off to AMD. In times past (K2/K3), price was the only thing they had better than Intel. Now they've got both price and performance.

      steve

"Gotcha, you snot-necked weenies!" -- Post Bros. Comics
