First 16-Core Opteron Chips Arrive From AMD
angry tapir writes "After a brief delay and more than a year of chatter, Advanced Micro Devices has announced the availability of its first 16-core Opteron server chips, which pack the largest number of cores available on x86 chips today. The new Opteron 6200 chips, code-named Interlagos, are 25 per cent to 30 per cent faster than their predecessors, the 12-core Opteron 6100 chips, according to AMD."
Compared to Intel? (Score:2)
Re:Compared to Intel? (Score:5, Interesting)
This would compete with the Xeon-E chips that aren't out yet. But in terms of performance, each core delivers about 75% as much, so this is the equivalent of a 12-core Intel chip.
Re: (Score:2)
If they aren't out yet, how can you know? I wouldn't trust the performance benchmarks from either manufacturer.
Re: (Score:2)
This assumes that performance is not significantly different from the desktop line, which is usually the case.
Re: (Score:3)
Slight correction, on threaded workloads, we'd be talking about a 6-core chip, intel runs 2 threads per core.
Re: (Score:2)
I think that Intel's hyperthreads and AMD's Bulldozer 'cores' both use a resource sharing arrangement, and in neither case are full cores. The benchmarks bear this out: intel's hyperthreading is nearly as good as AMD's.
Re: (Score:2)
Not sure what Tyan has planned and what the chips can do, but Tyan had boards that supported 4 quad-core Opterons, plus you could add a "daughter board" that allowed you to add 4 more (plus more RAM slots).
Now that setup using 16 core cpus in an eatx format would be crazy
Re: (Score:2)
Yeah. I could ditch my furnace in the winter with a computer like that... Might even have to open a few Windows.
Re: (Score:3)
The idle heat would be sufficient, no? I don't see why you would need to open some windows just to ramp up the temperature unless you're using this thing to feed heat to a sauna.
Re: (Score:3)
"Honey, it's kinda cold. Can you fire up Linpack on the server?"
Re:Compared to Intel? (Score:5, Interesting)
Put simply, the AMD ones are slower than the intel ones by about 2 fold per core. This isn't because AMD sucked at design, so much as their marketing department sucked at telling the truth. In reality, we're looking at 8 core AMD CPUs with 2 integer units per core - i.e. no more 16 core than intel's are 16 core chips because of hyperthreading.
Once that's ironed out, the AMD chips turn out to have rather good performance if you want lots of integer work done, and the Intel chips to have rather good performance if you want anything else done.
fool. (Score:2)
Re: (Score:3)
The problem is, while this is true, bulldozer also suffers from being a fairly crappy arch design compared to sandy bridge. The result is that AMD's 8 "core" bulldozer is only roughly as fast as intel's 4 core i5 without hyperthreading. Extrapolate this to bolting two 8 "core" bulldozers together and you get to... well, that would only be about as fast as an 8 core sandy bridge with no hyperthreading, or a 6 core with hyperthreading. Given that Intel is selling 6 core E5 Xeons with hyperthreading for les
Re: (Score:2)
BENCHMARK!
In the meantime, *please* STFU.
Re: (Score:3)
Given that an 8-core Bulldozer already needs its own power station to operate, I can't imagine Intel could have a worse TDP than a 16-core.
Re:Compared to Intel? (Score:5, Informative)
Re: (Score:2, Interesting)
While I do agree that AMD is *well* behind Intel's latest and greatest in the 1P / desktop world, I fail to see how you could make such a bold statement, unless you have had the chance to compare an AMD 4S machine to an Intel 4S machine (say, Opteron 62xx based HP DL585G7 vs. Xeon 75xx/E7 based HP DL580G7).
In my experience (and I venture to guess that is just as good as yours, if not better) the picture is far from being black-and-white and greatly (!!!) depends on the application that is being tested. The pic
Re:Compared to Intel? (Score:5, Interesting)
What's the Xeon E5-2650L, 2650, 2660, 2665, 2670, 2680, 2690 and 2687W then?
Hint: they're all 8 core SNB-E chips. Second hint - AMD's 16 "core" CPUs don't have 16 cores – they have 16 integer units. They only have 8 instruction fetch units, 8 decode units, 8 L2 caches, etc. That is, they're 8 core CPUs with strong integer support. SNB-E's particular strength is floating point, but it tends to beat the opterons at pretty much anything that isn't heavily integer biased.
Re: (Score:3)
Not really, no – databases and web servers don't spend their time doing parallel integer work, they spend their time doing logic work. Sandy Bridge kicks the snot out of it there.
Re: (Score:2)
If the logic is parallelizable, then the AMD chips could be a good choice. A webserver would be a good example of parallel logic in run-of-the-mill software were it not hampered by all that pesky I/O.
-l
Re: (Score:2)
How about we take all the energy we are putting into this pissing contest and do some actual benchmarking?? Put up or shut up.
COME ON EVERYONE! BENCHMARK! BENCHMARK! BENCHMARK!
Re: (Score:2)
Databases that are working on in-RAM workloads (so not I/O bound) spend most of their time moving data pages around in memory. There are few computational components to database work, compared with how often chunks of data are touched. Neither the floating point or integer speed is the real limiting factor on how fast that can happen. The size of the CPU caches and the speed of the CPU->RAM interconnect are the important factors.
I've been working on a memory oriented benchmark [github.com] aimed at testing for thi
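The point about memory movement, rather than integer or floating-point speed, being the limit can be illustrated with a tiny copy-bandwidth probe. This is only a rough sketch and is not the commenter's GitHub benchmark: the function name `copy_bandwidth_gbps`, the buffer size, and the repeat count are all assumptions chosen for illustration, and a Python-level copy will understate what a tuned native benchmark reports.

```python
import time


def copy_bandwidth_gbps(size_mb: int = 64, repeats: int = 5) -> float:
    """Estimate memory copy bandwidth in GB/s (hypothetical helper).

    Copies a buffer much larger than typical CPU caches and keeps the
    best (shortest) time, counting one read plus one write per byte.
    """
    # Non-zero fill so the source pages are really allocated and touched.
    src = bytearray(b"\x01") * (size_mb * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # full read of src, full write of dst
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    return (2 * len(src)) / best / 1e9


print(f"approx. copy bandwidth: {copy_bandwidth_gbps():.1f} GB/s")
```

Running it with a buffer that fits in cache versus one that does not is one way to see the cache-size effect described above, though the exact numbers will vary a lot between machines.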
Re: (Score:2)
Render farms don't use GPUs. Good luck fitting a 3D scene into 1GB of memory!
Maybe for a couple specialty applications custom written for a few narrow pipeline tools but certainly not the backbone which is still all PRman, Arnold, Vray, Brazil and Mental Ray. None of which use the GPU yet. Only production renderer nearing GPU acceleration is Final Render and maybe Brazil/Vray for specialized passes.
Re:Compared to Intel? (Score:5, Interesting)
sandy bridge ep 95W (Score:4, Informative)
There will be server versions as well...I've seen specs (publicly available) for an 8-core (16-thread) sandy bridge EP with a 95W TDP. I suspect it's clocked a bit lower and maybe binned for efficiency.
Re: (Score:3)
Even the fastest Sandy Bridge-E draws less power than a Bulldozer even at much higher performance. It also costs 3-4 times as much, so performance/$ is quite shitty (hey, it's an extreme $999 proc) but the winner in performance is clear [pcper.com]. But thanks for trolling, come again.
Re: (Score:3)
Really? Given that an 8 "core" bulldozer FX-8150 gets beaten by a 4 core i5 2500, you would reasonably expect that this 16 "core" bulldozer would get beaten by an 8 core sandy bridge chip with no hyperthreading at roughly the same clock speed. A little bit of imagination might convince you that a 6 core with hyperthreading might perform similarly too.
AMD – 16 "core" bulldozer – $1000
Intel – 6 core + HT Xeon E5-1650 at much higher clock – $583.
Alternatively, if you want to be able t
Re: (Score:2)
Well, except the 130W TDP of the 3960X is less than the 140W TDP of the (almost equivalent) 6282 SE from AMD. Don't let facts get in the way of your beliefs.
Re: (Score:3)
1: You can buy your new sandy bridge from newegg or such right now, while those new bulldozers are nowhere to be found.
2: Overclocking any chip is bound to require a lot more power than the TDP no matter which one you are using.
3: Dozer's cores, as you said, feel like they are dozing on the job.
yes (Score:3)
Re: (Score:2)
No, they aren't.
Re: (Score:2)
Really? Because this looks like the FX-8150 getting beaten 3 ways silly by even an i5-2500 at photoshop:
http://images.anandtech.com/graphs/graph4955/41688.png [anandtech.com]
Re: (Score:2)
http://www.tomshardware.com/reviews/fx-8150-zambezi-bulldozer-990fx,3043-15.html [tomshardware.com]
radial blur, shape blur, median, polar coordinates.
This test employs threaded filters, taxing as many cores as we throw at it. Zambezi’s eight integer units capitalize, flying past the Core i5 and Core i7, outright trouncing the six-core Phenom II X6 1100T, too.
Re: (Score:2)
No, no it's not, logic work includes all kinds of things like branch prediction, pipeline length and hence amount flushed when it all goes titsup, etc. Notably Bulldozer [anandtech.com], does [anandtech.com] terribly [anandtech.com] at this [anandtech.com], but not so badly at pure integer work.
Re: (Score:2)
Bottom line – Bulldozer isn't good at multithreading, it's good at integer work. Unfortunately, servers are mostly logic work, so sandy bridge is likely to destroy it.
oh boy. i just saw this. you dont know shit.
'servers are mostly logic work' hahahahaa. luckily someone else gave your answer.
next time, dont talk without knowing shit. 'servers' mean heavily multithreaded integer work. in these, bulldozer excels. and that is also one of the reasons why there have been 3 amd opteron (bulldozer 16 core) supercomputer orders in the past 3 weeks. NOT intel. amd. opteron, bulldozer. SUPERcomputer.
Re: (Score:2)
even the i5 beats the shit out of it
are you aware that the tooling process and silicon cutting in the factories for this chip, has not matured yet ? do you even know what these mean ?
Re:how do they compare ? (Score:5, Informative)
-mainconcept http://www.lostcircuits.com/mambo//i...&limitstart=17 [lostcircuits.com]
-mediashow http://www.guru3d.com/article/amd-fx...ssor-review/14 [guru3d.com]
-h.264 http://www.guru3d.com/article/amd-fx...ssor-review/14 [guru3d.com]
-vp8 http://www.guru3d.com/article/amd-fx...ssor-review/17 [guru3d.com]
-sha1 http://www.guru3d.com/article/amd-fx...ssor-review/17 [guru3d.com]
-photoshop cs5 http://www.lostcircuits.com/mambo//i...&limitstart=14 [lostcircuits.com]
-photoshop cs5 http://www.tomshardware.com/reviews/...x,3043-15.html [tomshardware.com]
-winrar, faster than 2600k http://www.techspot.com/review/452-a...pus/page7.html [techspot.com]
-winrar, improves over x6 http://www.tomshardware.com/reviews/...x,3043-16.html [tomshardware.com]
-7-zip better than 2600k here: http://images.anandtech.com/graphs/graph4955/41698.png [anandtech.com] http://www.anandtech.com/show/4955/t...x8150-tested/7 [anandtech.com]
-7-zip same perf as 2600k http://www.tomshardware.com/reviews/...x,3043-16.html [tomshardware.com]
-POV-ray, faster than 2600k http://www.legitreviews.com/article/1741/10/ [legitreviews.com]
-POV-ray http://www.nordichardware.se/test-la...art=15#content [nordichardware.se]
-x264(2nd pass AVX enabled) http://www.anandtech.com/show/4955/t...x8150-tested/7 [anandtech.com]
-x264 (2nd pass, better overall than 2600k) http://www.bjorn3d.com/read.php?cID=2125&pageID=11108 [bjorn3d.com]
-x264 (2nd pass +.3 than SB2600k) http://www.legitreviews.com/article/1741/7/ [legitreviews.com]
-handbrake; http://www.legitreviews.com/article/1741/9/ [legitreviews.com]
-truecrypt; http://www.bjorn3d.com/read.php?cID=2125&pageID=11111 [bjorn3d.com]
-solidworks; faster than 2600k http://www.techspot.com/review/452-a...pus/page7.html [techspot.com]
-ABBYY FineReader http://www.tomshardware.com/reviews/...x,3043-16.html [tomshardware.com]
-C-Ray, as fast as $1k i7-990X, http://i664.photobucket.com/albums/v.../c-rayir38.png [photobucket.com]
Re: (Score:2)
Good work digging up all the graphs where Bulldozer manages to get between the i5 and the i7 (which, based on its price point *it damn well should*, being priced half way between the two). Unfortunately, while you've dug up a nice bunch of places it just about holds its own, there are many times more where the Sandy Bridge chip eats it for breakfast, including heavily multithreaded work. As I said above – Bulldozer is good at very multithreaded integer work, and pretty much nothing else.
Re: (Score:2)
there are many times more
yes. then instead of shooting from the hip, recount those times and occasions.
Re: (Score:2)
Bottom line – Bulldozer isn't good at multithreading, it's good at integer work. Unfortunately, servers are mostly logic work, so sandy bridge is likely to destroy it.
excuse me but you have posted the same bullshit without knowing SHIT about what you are talking on the second time here. apparently you havent read what you have been told about how logic work being integer work by another slashdotter.
i replied to you on your ignorance in the other post. 3 supercomputers that are bulldozer based, in the past 3 weeks. a supercomputer a week. yes. sandy bridge e must be 'LIKELY' to destroy bulldozer in heavily multithreaded workloads.
how about not talking on stuff you d
Re: (Score:2)
Good work digging up all the graphs where Bulldozer manages to get between the i5 and the i7 (which, based on its price point *it damn well should*, being priced half way between the two). Unfortunately, while you've dug up a nice bunch of places it just about holds its own, there are many times more where the Sandy Bridge chip eats it for breakfast, including heavily multithreaded work. As I said above – Bulldozer is good at very multithreaded integer work, and pretty much nothing else.
Nice Trolling
Bulldozer Cores are not that Great (Score:5, Interesting)
The "cores" in Bulldozer are not your typical first-class x86 core. Bulldozer "cores" are worth 2/3 of a modern x86 core. The 6200 is more like a 10 core. Add to that the crappy IPC and I'm not impressed.
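The "more like a 10 core" figure is just the marketed core count scaled by an assumed per-core factor. A minimal sketch of that arithmetic follows; the helper name `effective_cores` is hypothetical, and the 2/3 factor is the commenter's estimate, not a measured value.

```python
def effective_cores(marketed_cores: int, per_core_factor: float) -> float:
    """Scale a marketed core count by an assumed per-core performance factor."""
    return marketed_cores * per_core_factor


# 2/3 is the estimate from the comment above; treat it as an assumption.
print(round(effective_cores(16, 2 / 3), 1))  # 10.7
```

Plugging in the 75% figure quoted earlier in the thread gives 12 instead, which shows how sensitive the "equivalent core count" is to the factor you assume.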
I was excited about Bulldozer before it was released. It's not often that CPU makers take chances on radical new architectures. Too bad this one turned out to be a huge pile of fail.
Re:Bulldozer Cores are not that Great (Score:5, Informative)
Your description is inaccurate, but that's not surprising since most slashdot readers don't know much about CPU architecture.
Bulldozers are essentially full-fledged cores, where the two cores in each module are mostly independent. There are two completely independent integer pipelines, so people seem to want to harp on the fact that the FPU is "shared". It's really a single split FPU, where each half can execute independent instructions, as long as the data width is 128 bits or less. Only when it is executing 256-bit AVX instructions is there any competition for resources. This is a very sensible design decision, since you don't find enough AVX software right now to justify completely dedicated AVX logic. (Plus, IIRC sandy bridge's FPU is only 128 bits wide and issues AVX instructions in two cycles, so what's the difference?) Moreover, even with AVX-heavy workloads, most software won't issue AVX instructions every cycle, and two AVX-heavy tasks on the same module won't really run into much contention. Assuming my memory of Sandy Bridge's FPU is correct, then Bulldozer has the advantage of having lower latency within the FPU on isolated AVX instructions.
The PROBLEM with Bulldozer is that they just have not done some of the really aggressive and costly things that Intel has done in their design. Bulldozer is still a 3-issue design. While going to 4-issue doesn't help that much that often, it still gives Sandy Bridge a slight edge. But where SB REALLY gets its advantage is the huge instruction window. Intel found clever ways to shrink the logic for various components so that they could make room for a much larger physical register file and reorder buffer. As a result, SB can have many more decoded instructions in flight, which exposes more instruction-level parallelism and, critically, absorbs more memory access latency.
A Sun engineer (discussing Rock, among other things) once described modern CPU execution as a race between last-level cache misses. When you have a miss on your L3 cache, it can cost hundreds of cycles, upwards of 1000. During that miss, the CPU fills up its reservation station with other instructions and then stalls, waiting on something to retire. This won't happen for a long time. Because of the disparity in speed (and latency) between compute and memory access, this is typically the most significant bottleneck. By enlarging the instruction window, SB can achieve much higher throughput, and it shows in the benchmarks.
This is Bulldozer's Achilles' heel. I know there are a few benchmarks where Bulldozer is faster than SB, but they're not typical workloads with typical memory footprints. Anyhow, so if you're going to rag on Bulldozer, rag on it for the right reasons. Bulldozer's "shared" FPU is a red herring.
Re:Bulldozer Cores are not that Great (Score:5, Informative)
The OP is right, and seems to understand the issues far better than you. It isn't that the FPU is shared, it's that nearly _everything_ is shared: Instruction cache, fetch and decode, FPU, L2 data cache. The only things that aren't shared are L1 data and integer operations (scheduler and ALU).
Instruction issuing and cache misses are big performance areas, but these are precisely the resources the cores share! You're running two threads off (with the exception of L1 data) the same caches and instruction fetches. So, in reality, the second core in bulldozer is much more like ultra-hyperthreading than it is a second core. I think the fact that they're even listed as cores is a marketing strategy that has backfired pretty hard.
P.S. L3 cache has proven to be quite useless in many workloads... It helps a bit in servers, IIRC, but that's about it. So it's more a race to L2 cache, which, again, is a shared resource. AMD, in fact, has indicated that it may drop the L3 from desktop parts.
Re: (Score:2)
My SSE code converted to AVX runs two times faster (not all of it though -- certain instructions do run in two cycles)
Wish List (Score:4, Informative)
When are multiple cores going to help me? (Score:5, Interesting)
I just got a fancy 8 core T7500 Dell workstation and only one of my compilers actually takes advantage of the multiple cores when it is compiling. As a result this expensive desktop is only 15% faster in terms of time to compile than the 4 year old PC it replaced (the new PC has twice the ram as the old though which may account for some of that speed increase). I am seriously unimpressed with all these cores. Maybe they are useful for something, but I've not found anything that I do that shows significant improvement. Putting my development projects on a SSD did much more for my work flow performance than this fancy new computer, that is for certain.
Re:When are multiple cores going to help me? (Score:4, Informative)
You're doing it wrong.
make -j8
Re: (Score:3)
And if it's too much for one machine, use distcc.
Re: (Score:3)
Try doing DSLR image editing with Lightroom or Aperture. Those cores make one hell of a difference.
Re: (Score:2)
Yeah, they do in image editing.
However, there will always be things that must be done in series, and always a maximum speed-up you can get from multiprocessing. (Amdahl's Law comes to mind) Plus, you'll often hit other bottlenecks, especially if you have an obscene number of cores. Memory, disk, video, network..
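For reference, Amdahl's Law puts a number on that maximum speed-up. A minimal sketch, assuming a fixed fraction of the work is perfectly parallel and the rest is strictly serial (the function name and the 90% figure are illustrative assumptions, not measurements from this thread):

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when `parallel_fraction` of the work scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: 90% parallel, 10% serial.
for n in (1, 4, 8, 16, 1024):
    print(f"{n:>4} cores -> {amdahl_speedup(0.9, n):.2f}x")
```

With 10% serial work the curve can never exceed 10x no matter how many cores are added, which is the ceiling the comment is pointing at.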
Memory has always been a problem after the 6502 era. Even single core systems splat into the performance barrier that is main memory.
I'd rather have a single-core system that's 8x faster than an
Re: (Score:2)
I just got a fancy 8 core T7500 Dell workstation and only one of my compilers actually takes advantage of the multiple cores when it is compiling.
If your compiler isn't threaded, then at least run multiple compile jobs simultaneously--this is probably better anyway. If your build system can't do this, your tools are broken.
Re:When are multiple cores going to help me? (Score:4, Informative)
What do you mean by "only one of my compilers actually takes advantage of the multiple cores when it is compiling"?
Are you on Windows? Because any compiling done in linux with a "make" based (or similar) build system can use as many cores as you can throw in a machine (regardless of the actual compiler it's running). It should be the same in Windows...
Don't look to your compiler to be multithreaded... look at the build system (i.e. in Visual Studio there should be an option somewhere to tell it how many processors to use while compiling). For make you just do "make -j8" to use 8 "jobs" total for compiling (i.e. 8 instances of the compiler will be running).
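The idea behind a job count like "-j8" is that the build tool, not the compiler, runs several independent compiler processes at once. The sketch below illustrates that idea only; it is not how make is implemented, the `run_jobs` helper is hypothetical, and the "compile" commands are stand-ins that just launch the Python interpreter.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_jobs(commands, max_jobs=None):
    """Run independent commands concurrently; return exit codes in input order."""
    workers = max_jobs or os.cpu_count() or 1

    def run_one(cmd):
        # Each worker thread just waits on its own child process.
        return subprocess.run(cmd, capture_output=True).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, commands))


# Stand-in jobs: in a real build these would be compiler invocations.
jobs = [[sys.executable, "-c", f"print('compiled unit {i}')"] for i in range(4)]
exit_codes = run_jobs(jobs, max_jobs=4)
print(exit_codes)
```

A real build system also has to respect dependencies between targets, which this sketch deliberately ignores.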
Here is a test for one of my software projects doing "make -j#" where # is 1,4,8,12,16,24:
1 : 15m9.614s
4 : 3m57.947s
8 : 2m6.354s
12 : 1m33.426s
16 : 1m25.559s
24 : 1m17.345s
That is on my dual 6-core hyperthreaded Mac workstation (so it had 12 "real" cores and 12 "hyperthreads"). You can see that hyperthreads definitely aren't as good as real cores... but do provide some speedup. That said, I thank God every time I compile (which is all day long) for the cores he has bestowed upon me...
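The timings above can be turned into speedup and parallel-efficiency figures with a few lines. This is just a sketch over the numbers as posted; the helper names `to_seconds` and `speedup_table` are made up for illustration.

```python
# Wall-clock times from the post above, keyed by the -j value.
timings = {
    1: "15m9.614s",
    4: "3m57.947s",
    8: "2m6.354s",
    12: "1m33.426s",
    16: "1m25.559s",
    24: "1m17.345s",
}


def to_seconds(stamp: str) -> float:
    """Convert a 'XmY.YYYs' time string into seconds."""
    minutes, seconds = stamp.rstrip("s").split("m")
    return int(minutes) * 60 + float(seconds)


def speedup_table(times: dict) -> dict:
    """Map each job count to (speedup vs. -j1, speedup per job)."""
    baseline = to_seconds(times[1])
    table = {}
    for jobs, stamp in times.items():
        speedup = baseline / to_seconds(stamp)
        table[jobs] = (speedup, speedup / jobs)
    return table


for jobs, (speedup, efficiency) in speedup_table(timings).items():
    print(f"-j{jobs:<2} speedup {speedup:5.2f}x  efficiency {efficiency:6.1%}")
```

Going from 12 to 24 jobs lifts the speedup only from roughly 9.7x to roughly 11.8x, which matches the observation that the hyperthreads help but are not worth a real core each.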
Good to hear that you are already on SSD... because parallel compiling does need speedy disk to keep the processors humming. The timings above are for two 256GB SSD's in RAID0.
Re: (Score:2)
I mostly work with Eclipse doing Java, Android, and GWT. Only GWT offered an effective way to use those cores. It is VERY possible that I just don't know how to use Eclipse to the best of its ability, but I can tell you that Eclipse never pushes more than one core during a build except when it's building GWT projects for me (I had to tell it explicitly to do that though).
Re: (Score:2)
Yeah, you're screwed, sorry. Eclipse integrates nicely with Ant, but Ant doesn't do multi-core builds either. And Ant tasks are very heavy, so parallelizing them wouldn't help much anyway. You might try rebuilding your build process in plain 'make' and try that -j option.
Also, I'm sorry you have to use GWT. That thing was just absurdly slow last time I used it, to the point that it would be faster to hand-code JavaScript.
Re: (Score:2)
That's what I thought. I research it every couple months when I get annoyed by multi minute builds. I never get any answers.
GWT is slow and deployment is a little cumbersome, but the code is so elegant I just don't care. I love GWT. I wish Google provided more libraries, but I'm pleased with it. I'm not certain it has a future though.
I loathe Ruby. What's a fella to do if he wants a strongly typed object oriented website?
Re: (Score:2)
What's a fella to do if he wants a strongly typed object oriented website?
I think JBoss is the usual answer to that. That only takes care of the back end, but GWT has your front end covered anyway, and the more code you can move into JBoss, the less you have to crank through GWT's slow processor.
Re: (Score:2)
Well, you could consider using a better compiler, or a better configuration for it. Many parts of compilation parallelize reasonably well, especially if you have a lot of source files. Some things will have dependencies on other parts (which limits parallelism) and some have dependencies on the entire previous stage (which severely limits or prevents it, for that stage).
Besides, unless you're just building a pure build machine (and I doubt it, if your compilation setup is so bad), multiple cores can help a
Re: (Score:2)
Send your octo-core my way, I'll see that it gets some use...
For any RPM based Linux distro, just edit your RPM macros file to add eg. -j8 option to make, and every "rpmbuild" will max-out all 8 cores with 8 instances of gcc operating on different files each.
And if you're lzma compressing the RPMs in question, and they're a non-trivial size, you can get a pretty good speed-up using either parallel-xz or p7zip across
Re: (Score:2)
Neither my Java nor Android compilers do a good job of taking advantage of the multiple cores from within Eclipse. I am able to get significant improvement when compiling GWT projects by giving it a 6 core directive. I save about 40% of the time it used to take. Plain old Java and Android showed little improvement though.
Re: (Score:2)
Doesn't really matter until developers get off their asses and start including multi-threading code. You'd think that after multicore and multiprocessor usage started jumping through the roof, that you'd see it.
Re:Only 16? (Score:4, Informative)
Re:Only 16? (Score:5, Interesting)
Pfft, how much harder can it be to design one with 32 :)
Design? Easy.
Manufacture? Tricky.
Make work? Trickier.
To read about? Interesting.
Re: (Score:2)
So what are you waiting for? Hop to it and corner the market!
Go ahead, I'll just wait over here and read the paper.
Re: (Score:2, Informative)
Pffft, it's only 8 cores anyway, 8 cores each with 2 integer units. It's no more 16 core than intel's 8 cores with hyperthreading.
Re:Only 16? (Score:4, Informative)
No, 8 integer cores per chip, but 4 actual real cores. For a total of 8 cores across 2 chips.
Re:Only 16? (Score:5, Insightful)
Re:Only 16? (Score:5, Interesting)
The basic point is that it has a total of 8 instruction fetch units, it has a total of 8 instruction decode units that they feed, and it has a total of 8 chunks of L2 cache. The fact that each of these 8 cores has 2 integer units on it is neither here nor there – hell, for years cores have had several floating point units on them, it didn't make them more than one core. Not only that, but this CPU behaves badly when the scheduler treats it as 16 cores instead of 8. The bottom line is that this chip in every single way behaves like an 8 core CPU; moreover, it's slower than intel's 8 core CPUs at a similar clock even with hyperthreading disabled.
Re: (Score:3)
What are you basing this on? As someone that runs both database and web servers using both AMD and Intel I find your conclusions to be completely counter to my experience and to the experience of almost everyone I know that does virtualized infrastructure.
I ran into a number of problems when I first tried to deploy them because SQL 2005 wouldn't install on it. SQL 2008 runs just great with 24 cores as they were dual processor 12 core servers. I have no reason to think the 16 cores variants would be much di
Re:Only 16? (Score:5, Funny)
Re: (Score:2)
pffff why the troll mod, it's funny and on topic :) :)
probably not very accurate, but still quite enjoyable
Re: (Score:2)
Pfft, how much harder can it be to design one with 32 :)
To run at the same speed - very difficult. Think about twice the heat unless you make major changes
Re: (Score:2)
Hmmm... According to the article, these new chips seemed to be based on the bulldozer architecture, so it might be better to think of these opterons as 8 core chips that have really good hyperthreading.
Hold your horse, cowpoke.
Just because it's based upon doesn't mean it will suffer the same issues as the Bulldozer. Perhaps this is the core which really works well, while the more consumer oriented Bulldozer is the red-headed stepchild.
Re:really 16 core? (Score:5, Interesting)
Maybe...
It'll be interesting. Most server applications are integer-only and never touch the floating point units. That should mean that Bulldozer designs work close to the full core count in contrast to the poor benchmarking results it puts out in Photoshop filters and video encode.
Re: (Score:2)
Well, as Intel hyperthreading is basically brain-dead (had to disable it for decent performance as some things were glacially slow), really good hyperthreading just means usable hyperthreading for me. If Intel did not have so much money, AMD would have blown them away a long time ago. Intel technology sucks badly.
Re: (Score:3)
This isn't the point. You get 16 cores (slowish compared to top of the line, they may be) that will fit in a single socket on a single motherboard, with a single power supply. This is a *huge* cost saving for machines that it makes sense to use them in...servers, where single core performance is relatively stupid to consider.
Re:Poor performance (Score:4, Insightful)
If multithreaded performance was all that matters, the Sun Niagara chips would have done a lot better than they did.
Re:Poor performance (Score:4, Insightful)
Umm, joins can be done in parallel in lots and lots of cases. ERP and CRP are applications that ought to see big improvements from more cores, if you have more than a few users anyway. It also simplifies things: you don't have to figure out how to architect the thing to run across 10 hosts anymore. Good multi-core systems deliver the performance these apps need if you can get the disk I/O solved. A good SAN with multipath support and multiple HBAs can get there.
Niagara failed because each individual core was too slow: a comparably priced Intel CPU could run two jobs in series on one core in less time than Niagara could run one job on one core. The question here, for heavily parallel workloads like a database where all cores will be used, is whether AMD's 16-core chips are at least 62% the speed of Intel's 10-core chips on a core vs. core basis. If so, then other things being equal, these Opterons will be better for *some* workloads.
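The 62% threshold comes from the ratio of core counts. A minimal sketch, assuming the workload scales perfectly across every core on both chips (an idealization) and using a made-up helper name:

```python
def break_even_ratio(many_cores: int, few_cores: int) -> float:
    """Per-core speed the many-core chip needs, relative to the few-core
    chip, to match its total throughput under perfect scaling."""
    return few_cores / many_cores


# 16-core Opteron vs. a 10-core Xeon, as in the comment above.
print(f"{break_even_ratio(16, 10):.1%}")  # 62.5%
```

Anything that limits scaling, such as shared front ends or memory contention, pushes the real break-even point higher than this.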
Re: (Score:2)
The big question I have is if it will be like AMD's previous 12-core chips, where you could get 4 of them crammed into a 2U server for not all that much money. 4-Xeon configurations are way more expensive.
Re:Intel vs AMD's philosophy as of late (Score:5, Interesting)
Intel: I know, let's try to see just how many features/cores/cache we can fuse off in our dies and different socket combinations to try to make *puts pinky finger to mouth* one MILLION SKUs! Oh, and while we're at it, let's add a FOURTH memory channel, because more is better! Sure, we could get all the bandwidth we need with two DDR3-1866 or -2133 channels and that you really only get about three channels' worth of bandwidth because we have to clock the IMC down to DDR3-1333 with two modules per channel- but we still have FOUR channels! Oh, and we forgot, it's the start of a new quarter so we need to release a new socket. Can't let those socket suppliers get lazy making last quarter's socket design. What, you guys want us to release Sandy Bridge-based Xeon MPs because MP platforms actually need that much bandwidth and core count? We just released the Westmere-based ones a few months ago! Don'tcha know that Xeon MPs run two years behind everything else? Geez, what did you do, wake up yesterday? Next you'll want us to stop crippling our chips, stop using a new socket every other month or something ridiculous like that. Where do you guys get those ideas?
AMD: Based on market analysis, most server applications use primarily integer code and require a lot of bandwidth, memory capacity, and a high core count. We don't have over a hundred billion dollars in market cap to fund several parallel R&D teams to design a specific CPU for every edge use case, so we will design a CPU that is highly modular, has good integer performance (because that's what our research indicated most server apps are), and has a lot of cores. Experience with Intel's HyperThreading is less than stellar with regards to predictable performance, so we will use our CMT approach that leads to better integer performance than HyperThreading but doesn't increase the die size by a huge amount, since we can't afford to make 400-600 mm^2 dies like Intel does to have a lot of physical cores. Oh, and we'll continue to use the existing server platforms out there so our customers can drop-in upgrade and we'll also not change any feature sets in the SKU stack other than the clock speed and number of enabled modules and their associated caches. We do apologize for being "late" with these parts since we usually release server and client at about the same time...
Crippling chips (Score:3)
It's common, live with it. Every Cell processor in a PS3 comes with eight cell processing units, with one disabled. That way they can set the standard for seven and use most of the chips that come off the line.
Even AMD had a problem with too-good yield about ten years ago, so they restricted the clock and sold "crippled" low-end chips that were technically rated to run at much higher speeds.
Re: (Score:2)
Don't forget Intel pricing Xeon MP at thousands of dollars per CPU while AMD brags about the lack of this "4P tax".
Re: (Score:2)
AMD already had the on die memory controller. Their answer to intel's Hyperthreading was real cores. The QPI bus that intel uses is very similar to the one AMD pioneered with Hypertransport. Let's not forget that AMD64 (oh, did you want me to call it EM64T or x86_64?) was a product of AMD's engineering effort rather than forcing people toward the EPIC architecture which seems to be niche based.
Re: (Score:2)
2005 called; they want their list of bragging rights back. Oh and hypertransport is mostly technology that AMD bought from Digital along with parts of the Alpha team. They get some credit for bringing their version to market first, but it's hardly like they came up with the idea for Hypertransport out of thin air. As for x86-64, AMD brought it to market first, but Intel had internal builds of 64 bit enabled X86 chips around for some time, which is why they could bolt it onto the P4 and not require a brand
Re: (Score:2)
Oh.. and integrated memory controller:
1. The 486 had one too.
2. Look at Bulldozer's atrocious memory performance: There's a difference between slapping any memory controller on-die and slapping a *good* memory controller on-die. Intel has the good one.
Re: (Score:2)
--> Call me when you have actual benchmarks of Interlagos,
You don't have them either. What's amusing is that I'm using known data from 1/2 of an interlagos chip (Bulldozer) at much higher clockspeeds than what Interlagos will operate at to make my assumptions. There's plenty of data from just the 6 core 3960 and 3930 chips that came out today that indicate that even desktop Bulldozer x 2 with theoretically perfect scaling won't beat the upcoming Xeons. You ain't gonna get perfect scaling and you ain't