C't Throws Athlons And P4s In The Gladiator Pit
An unnamed correspondent writes: "In the most recent c't ("Computertechnik") there is a great benchmark with a Pentium 4 (1,5 and 1,4 GHz) vs. an Athlon Thunderbird (1,2 GHz, and 1,2 GHz with DDR memory on the 760 chipset).
If you think that that isn't a fair race ... then read it now here and here in English.
You should get a copy of the German paper version anyway -- great magazine, even better benchmark.
Now does anyone know where to get a 760 mainboard ;-)" Unnamed's cousin Noname also contributes a link to GamePC, which reviews in grand 13-page SE-style the 1.4 and 1.5 GHz P4 chips.
Re:Pipeline depth and clock rate. (Score:1)
Re:I really think this will end up hurting intel (Score:1)
> And people seem really happy with AMD
I was an all-intel guy since the 80286 through PPro200.
What turned me off was the CPUID thing. As soon as they released their chips with the CPUID, I decided not to buy any Intel chips any more.
I've had a lot of problems with my AMDs. But at least there is now competition in the PC arena. I am amazed at the perf of recent Athlons.
I tend to buy 1 personal machine per year, 3 or 4 for work, and a couple for friends. Now that Intel is trying to fuck everyone with the RAMBUS thing, it is very unlikely that I will buy any Intel CPU again.
Cheers,
--fred
Re:For those who haven't heard... (Score:1)
Give me a Transmeta-style design any day...
Re:Athlon vs. P4 (Score:1)
Re:AMD clearly the better choice (Score:1)
Look at the POV-Ray scores.. (Score:1)
Man, look at the POV-Ray scores.. Athlon 1.2GHz beats P4 1.6GHz by 960 to 834. Intel claims that "The Pentium 4 processors that we're announcing Monday have the highest performing floating point of any PC processor that's out there", and yet the Athlon is more than 50% faster at 3D rendering, clock for clock? What an embarrassment. [zdnet.com]
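The "clock for clock" claim in this comment can be checked by dividing each score by its clock speed. The sketch below uses only the figures quoted above (960 for the 1.2GHz Athlon, 834 for the P4 at the stated 1.6GHz); the function name is made up for illustration.

```python
def score_per_ghz(score: float, clock_ghz: float) -> float:
    """Benchmark score normalized by clock speed (points per GHz)."""
    return score / clock_ghz


# Figures exactly as quoted in the comment above.
athlon = score_per_ghz(960, 1.2)
p4 = score_per_ghz(834, 1.6)

# Athlon's per-clock lead over the P4, as a fraction.
advantage = athlon / p4 - 1.0
print(f"Athlon: {athlon:.0f}/GHz, P4: {p4:.0f}/GHz, lead: {advantage:.0%}")
```

With those inputs the per-clock lead comes out at a little over 50%, which matches the comment's claim as long as the quoted clock speeds are right.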
Interesting... (Score:1)
However, my 800MHz, non-RAMBUS Athlon should last me quite a while, thankyouverymuch.
---
pb Reply or e-mail; don't vaguely moderate [ncsu.edu].
Re:For those who haven't heard... (Score:1)
Most consumers wouldn't be able to tell the difference in performance between the two. Those benchmarks are showing a 5-10% difference when running Quake at 150 FPS. If it weren't for the frame counter/speedometer, you wouldn't be able to tell the difference.
The PIV is clearly designed to go to ultra-high clocks; that has been the business. Intel has also done poorly, historically, with first revs of a new architecture. I'm guessing that by summer there will be a refined PIV, and by this time next year they will have tuned the caching and probably changed the branch prediction hardware enough to eke out enough performance to sit on top. This is Intel.
The one real thing that article does show is that AMD has put together a very competitive processor. That has been pretty clear for some time, but it was nice to see a new-generation Intel part come out that didn't destroy everything in its path. Moore is catching up.
alpha (Score:1)
Cheap though they are not. In fact they're damn expensive new - you'll never get a 21264 for anything less than 3k.
However, the previous generation of Alpha can still be had secondhand, even from some vendors, for a much more reasonable price. E.g. you could get a 600MHz 21164A for about IEP£1k to £1.5k, or motherboard+chip for about £600.
Re:Which chip will you actually be able to buy? (Score:1)
This is funny... If I looked hard -- even not SO hard -- I could prolly find 8 places in south KCMO that sell an athlon... maybe even at a competitive price to online retailers! :o)
Of course, Intel shot themselves in both feet (as well as their head) when they went RAMBUS, so no money from me!
--
Re:Once again, benchmarks hardly tell the whole st (Score:1)
Correction. The initial generation of P4 doesn't have SMP support enabled. It's not that they're not designed for it (any more than P3 was).
The issue is that they're focusing on single-processor systems for roll-out. This is for ease of troubleshooting. They'll work out most of the bugs in the single-processor solution before compounding any problems by prematurely releasing SMP boards to market.
But if they did that, you'd want to gripe about how "buggy" their SMP boards were right?
At least try to appear impartial here. Brand-zealotry makes it too easy for inaccurate statements like yours to be made with impunity.
Chas - The one, the only.
THANK GOD!!!
Re:"Not available"??? (Score:1)
Chas - The one, the only.
THANK GOD!!!
Re:Both Good and Bad for Intel (Score:1)
The processor was just released. So arguments about availability are kind of meaningless.
How available were Thunderbird processors when they were released?
How available was the Athlon?
How available was the P3?
How about the P2?
Of course components that have had more time to penetrate the market will be more available.
Chas - The one, the only.
THANK GOD!!!
Re:MMX support way better on Athlon (Score:1)
4x and 12x compared to what?
Re:Athlon vs. P4 (Score:1)
Re:Power Hungry (Score:1)
Problem is this: the man who dies w/the most toys wins..
A lot of people have a lot of money and are willing to buy systems w/it. They do it to beat the Joneses.
Like someone said before, this was a marketing decision (to run up the clock speed but not the performance). People are going to rush towards the GHz machines just b/c of that.
Just my worthless rambling.
Re:Misleading Benchmark (Score:1)
Re:Misleading Benchmark (Score:1)
Alpha 166 UDB -- 184k keys/s
Intel 400mhz Celeron -- 1.1m keys/s (2.2 for dual
etc..
Re:Power Hungry (Score:1)
How's that old 486SX holding up?
Re:Misleading Benchmark (Score:1)
Until recently.
Re:well well well. (Score:1)
For older (read: non-pipelined) machines, you might use the inverse: clocks per instruction, or CPI (a 68HC11 might take 3 clock cycles to execute an ADD instruction). It's generally an average over a plausible mix of instructions that would make up a significant part of a normal software program.
Well, how do you make CPI into IPC? First, you pipeline the processor. Theoretically, you could get your CPI or IPC to near 1.0. However, certain things such as loads from memory might require you to wait a clock or two, pausing the entire pipeline. That'll keep your IPC from being big (big generally = better for IPC).
Okay, great. You did your best trying to get it near 1.0 and you couldn't quite do it. How about starting multiple instructions in each clock? Of course there are complications, you can't do an ADD that requires an answer before the multiply that supplies that answer has completed. So there are limitations. But for many instructions which are not inter-dependent, you can issue them all at the same time. Surprise! Now you can get an IPC > 1.0.
Yes, this is a grossly simplified computer architecture answer condensed in a can, but maybe it clears things up a little.
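The CPI-to-IPC progression described in this comment can be put into numbers. This is a minimal sketch, not a real processor model: the instruction mixes are hypothetical, and treating a two-wide machine as simply doubling IPC assumes every pair of instructions is independent, which the comment itself notes is not always true.

```python
def average_cpi(mix):
    """Weighted average clocks per instruction.

    mix: list of (fraction_of_instructions, cycles_each) pairs
    whose fractions sum to 1.
    """
    return sum(fraction * cycles for fraction, cycles in mix)


def ipc(cpi, issue_width=1):
    """Instructions per clock: the inverse of CPI, scaled by how many
    independent instructions can start each clock (idealized)."""
    return issue_width / cpi


# Hypothetical mixes, chosen only for illustration.
non_pipelined = average_cpi([(1.0, 3)])        # every instruction takes 3 clocks
pipelined = average_cpi([(0.8, 1), (0.2, 3)])  # 20% are loads that stall 2 extra clocks

print("non-pipelined IPC:", round(ipc(non_pipelined), 2))
print("pipelined IPC:    ", round(ipc(pipelined), 2))
print("2-wide issue IPC: ", round(ipc(pipelined, issue_width=2), 2))
```

Under these assumed numbers the pipelined machine stays below 1.0 because of the load stalls, and only issuing more than one instruction per clock pushes IPC above 1.0.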
Re:Power Hungry (Score:1)
Re:well well well. (Score:1)
But as for the price of the actual CPUs... Intel needs to get off their high horse and realize that they need to win back their market share, even if it means not living on 40%-60% margins on their high-end chips...
Re:Misleading Benchmark (Score:1)
Re:Misleading Benchmark WHAT? (Score:1)
They seem to be doing well in Quake 3... (Score:1)
Then again, I don't think the nVidia video drivers actually USE much DirectX. The only two other things that would use DirectX in Quake3, as far as I can tell, would be DirectInput and DirectSound. Would they make THAT much of a difference?
Maybe we should benchmark a P4 using an A3D soundcard, thus bypassing DirectSound also.
:)
- Ed.
Re:They seem to be doing well in Quake 3... (Score:1)
I was, in fact, trying to make excuses for the P4. Guess you don't like me doing that. I'll stop now.
Re:Why? (Score:1)
Re:Misleading Benchmark (Score:1)
Re:PPro all over again (Score:1)
Re:What makes you think.... (Score:1)
Hardly fair... (Score:1)
English language eqv to C'T (Score:1)
Can anyone recommend an equivalent magazine available in the States, either on the "high street" or by subscription? PC Magazine and PC World just don't cut it. Byte magazine was OK but is now web only.
Suggestions anyone?
Thanks
Re:Fun with ambiguous decimal commas. (Score:1)
Since c't is German, I would say that's OK.
john
Re:just my 64 bits... (Score:1)
Compiler Technology (Score:1)
The branch prediction is supposed to be about 94% accurate and can be made near 100% with the use of some intelligent compiling technologies. Don't forget the performance of the new SSE-2 instructions as well.
I love AMD, but they imitate rather than initiate. Even their plan for a 64-bit processor refuses to step off the beaten path (64-bit with x86 instructions). The vast majority of this site's readers run Linux and are therefore mostly instruction-set independent. I would think this population would embrace a new architecture if it provided a performance increase.
FunOne
Re:Athlon bad at SPECfp, good on FP apps, why? (Score:1)
Most all manufacturers do this; I wouldn't be surprised if AMD had done this in the past.
SPECfp, for the most part though, is useless. Sorta like the old Mac benchmarks that showed it to be better (ByteMark, was it?). They were written by Apple, and pulled all sorts of shortcuts.
Re:I really think this will end up hurting intel (Score:2)
The average Joe - when he sees a commercial from a major brand selling PCs, or an ad from Dell - sees quite clearly the animated logo: "Intel Inside".
This little animated logo makes the difference for the average Joe. He'll see a flashy PC in the commercial, and he will go to the store and buy it - with the tiny little logo that he saw on TV.
That's how you sell Intel PCs. Maybe AMD will do this some day...
"Not available"??? (Score:2)
Well. From what I learned back in grade school, "unavailable" means you can't get it. Period.
Having it priced beyond your means doesn't make it unavailable. A better term would have been "less readily available".
I DO agree with the basic sentiment of it, though: focus on the best price/performance ratio rather than the best price or the highest rated speed. After all, a P4 system with the mandatory RAMBUS *GACK!* easily outpaces an Athlon... in terms of price.
Chas - The one, the only.
THANK GOD!!!
Re:well well well. (Score:2)
Doesn't matter. The media is frantically pumping Intel's "next big thing," to the exclusion of AMD's *already existing* big thing.
It's been most interesting to watch the mainstream (C|Net, ZDNews) suck up to Intel. Two months ago, they were mentioning AMD about as much as Intel. Today, they talk exclusively about Intel's "to be" chips, and ignore AMD's existing superiority.
The winner in the chip wars will be the company that best manipulates the popular media.
Which means that Intel, no matter how badly it shoots itself in the foot with poor designs, poor performance and poor planning, will succeed -- because Intel is a Master of Marketing.
--
Re:just my 64 bits... (Score:2)
There's no problem with defining time_t to be 64 bits on a 32-bit architecture. There can be a small performance penalty if it isn't done carefully, but that's it.
Mozilla and VMware (Score:2)
VMware is also quite CPU hungry - it's fine for most things, running Windows on Linux on this hardware with 256 MB RAM, but it could still do with a speed boost when booting (which doesn't run at full speed).
This isn't a trend I particularly like, but it seems to be happening in the Linux world as the GUI applications get better...
P4 performance thoughts (Score:2)
1. Why doesn't the P4 blow away the P3 on normal integer code, despite having double clocked ALUs?
Two reasons. First is the issue that everyone keeps bringing up: mispredicted branches causing part of the pipeline to be flushed. At 20+ stages, there's a fairly large penalty for this. I don't think this is as big a problem as most people have been led to believe. The branch predictor is 33% better than the one on the P3 according to Intel, which helps a lot with this. With the use of a P4-aware compiler, branches are also laid out in such a way as to help out the static predictor (in case the branch is not in the branch history table, or is unpredictable), and the P4 can use branch hints emitted by the compiler or assembly writer. So at least with newly compiled programs, branching shouldn't be a huge issue. With older programs, the improved branch predictor should help a lot.
The real problem with the P4's integer performance (and why it performs dismally on RC5) is that shifts, rotates, and multiplies all have increased latencies (although the throughput remains either the same or very close per clock) compared to the P3. So code that expects to get the result of a shift or rotate back very soon is not gonna be happy when the P4 takes 4 cycles to do so. Small shifts can be replaced with a series of adds, which have both higher throughput and 8x lower latency than shifts on the P4, but the compiler has to know about this in order to optimize it.
2. Why does the FPU perform so poorly in some things, and great in Quake3?
My answer here would be that the FPU in the P4 is almost identical to the one in the P3 in terms of throughput, but the operations have a longer latency. So again, code that expects P3 or Athlon instruction latencies will get a rude awakening. Quake3 (and this is total speculation) probably uses a large unrolled loop of FPU ops that would hide the latency issue, but benefit from the same per-clock throughput as a P3... at 1.5x the clock speed. If you check out JC News (www.jc-news.com/pc), there is someone claiming that Quake3 has no SSE/SSE2 optimizations whatsoever, so the FPU routines are probably just tuned to get as much throughput out of a pipelined FPU as possible. Use of the scalar SSE2 FPU ops on a P4 (by the compiler, or the assembly writer) can decrease latency, while using the SIMD ops can increase FPU throughput.
Obviously, Intel has decided to make some tradeoffs here. Shifts, rotates, multiplies are all slower, while add/sub/not/or/xor/and are all faster. The trace cache is perhaps the most interesting thing about the P4, as decoding the x86 instruction set becomes a one-time cost for the core most of the time. It is, in essence, a very simple code-morpher (x86 ops -> cached uOps).
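The shift-to-add substitution mentioned under point 1 can be sketched as follows. The latency constants are taken from the comment's own figures (4 cycles for a shift, adds with 8x lower latency), not from Intel documentation, and the helper names are hypothetical. Python integers do not wrap at 32 bits, so this only illustrates the arithmetic identity, not register behaviour.

```python
SHIFT_LATENCY = 4.0              # cycles, as stated in the comment
ADD_LATENCY = SHIFT_LATENCY / 8  # "8x lower latency" -> 0.5 cycles


def shift_left_by_adds(x: int, n: int) -> int:
    """Compute x << n using only additions (each add doubles x)."""
    for _ in range(n):
        x = x + x
    return x


def add_chain_latency(n: int) -> float:
    """Latency of n dependent adds under the comment's figures."""
    return n * ADD_LATENCY


# The add chain gives the same result as the shift.
assert shift_left_by_adds(5, 3) == 5 << 3

for n in (1, 2, 3):
    print(f"shift by {n}: {add_chain_latency(n)} cycles via adds "
          f"vs {SHIFT_LATENCY} cycles via shift")
```

On these assumed figures a chain of dependent adds stays ahead of a single shift for any shift count below 8, which is why the comment limits the trick to small shifts.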
Speculation time
Digital video will chew up CPU power (Score:2)
Re:just my 64 bits... (Score:2)
Some software assumes time_t is int. If that were not the case, we could simply define time_t as unsigned int on 32-bit systems (provided systems don't use dates from before 1970) and continue along for an extra seven decades.
If we just make ints 64 bits we won't need to clean up such brokenness. Of course there is still 64-bit uncleanness (stuff that assumes sizeof(int) == 4), but that will have to be fixed regardless.
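The limits being discussed can be computed directly. This sketch assumes the usual convention that time_t counts seconds since 1970-01-01 00:00:00 UTC; the function name is made up for illustration. It is only called for 32-bit widths because a 64-bit limit lies far beyond what Python's `datetime` can represent.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits: int, signed: bool) -> datetime:
    """Latest UTC time a time_t of the given width can hold."""
    max_seconds = 2 ** (bits - 1) - 1 if signed else 2 ** bits - 1
    return EPOCH + timedelta(seconds=max_seconds)


signed_32 = last_representable(32, signed=True)
unsigned_32 = last_representable(32, signed=False)

print(signed_32.isoformat())    # 2038-01-19T03:14:07+00:00
print(unsigned_32.isoformat())  # 2106-02-07T06:28:15+00:00

# Extra range gained by reinterpreting the same 32 bits as unsigned.
extra_years = (unsigned_32 - signed_32).days / 365.25
print(f"extra range: about {extra_years:.0f} years")
```

The gain from going unsigned is about 68 years, in line with the "extra seven decades" mentioned above.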
Re:Athlon vs. P4 (Score:2)
Then the Pentium Pro came out. The Pentium ran Windows faster. Didn't help the Pentium-level cloners like AMD when Intel came out with the Pentium II and started cranking the clock up.
Now the first revision of the P4 is being outclassed by AMD and perhaps even the PIII. Let's give a year to see what the outcome is.
--
Re:Hardly fair... (Score:2)
Bollocks, Intel has now released the chip. Officially. Game is over. And yes, at the same frequency Athlon beats the s**t out of it. So as long as Athlon manages to climb up to 1.7 P4 will be unable to beat it.
This of course does not mean that P4 will not sell. It will. And it will sell like hell. And the fact that it is more expensive does not matter either. Corporate IT is usually ruled by irrational mathematics and the cost and performance are a factor that is inferior to other more "important" ones.
More reviews... (Score:2)
Anandtech [anandtech.com]
HardOCP [hardocp.com] - on HardOCP's frontpage [hardocp.com] you can find more links to reviews.
Tom's Hardware [tomshardware.com] hasn't got his review up yet, but I bet it will be soon...
Greetings Joergen
Not a total demolition job by AMD (Score:2)
The Athlons rule in everything but raw FP speed, where the Intel parts actually win. That translates over to things like Quake 3 and Unreal Tournament running faster on Intel.
However, the Intel parts have an unfair advantage there, being clocked at 1.4 and 1.5 GHz rather than 1.2.
That really means Intel and AMD have equally capable FP units; maybe AMD has a slightly better one.
If money is no object and all you care about is FP speed, buy the Intel. Otherwise buy the Athlon.
Re:I really think this will end up hurting intel (Score:2)
That would explain why Microsoft is so close to bankruptcy
MMX support way better on Athlon (Score:2)
Re:MMX or 3DNow!? (Score:2)
Re:MMX support way better on Athlon (Score:2)
On the contrary. Each ATAPI CDROM reads the tracks at somewhere between 0.8x and 2.4x, depending on the amount of error correction cdparanoia has to do. GoGo then encodes at 4x and 12x, for PII and Athlon, respectively. Encoding nearly always finishes faster than ripping on both machines.
Re:well well well. (Score:2)
Re:For those who haven't heard... (Score:2)
I seem to recall from some class that, when you have a deep pipeline, there are benefits from being able to make instructions "conditional". That is, given a conditional branch over only a couple of instructions, it is more effective if you can make the test, set some bit in a register, and have the next two instructions execute or turn into a NOP based on that bit. The two NOPs cost less in performance than the hit you take when the branch prediction fails and the pipeline gets you in trouble. This is also an alternative (don't remember if it's all that good) to spending lots of transistors on branch prediction.
Anything like this make it into the P4? You'd need P4-optimized code, but hey, that's one of the things I like about running Open Source stuff...
Re:Are you sure about the mobo? (Score:2)
But you can get a new MB for $120, so it shouldn't be that big of a deal.. Except that you'll probably want to get PC133 memory (since you most likely had PC100 on that older board).
-Michael
Re:well well well. (Score:2)
Given that it's brand new, you're also going to be at the mercy of the MB manufacturers in terms of features.. If you like ultra-SCSI-RAID on your MB, it might be a while. There is a similar argument to be made for your case.
Expect to pay premium dollars for this combo.
The problem with the heatsink is logistics and ergonomics.. Where do you put everything? Unless you have a monster tower case, you're not going to be able to fit too many full-length boards.. You have a massively over-heated PAIR of RDRAM RIMMs, a MASSIVE CPU, a larger-than-normal power supply.. And then you get to have your first AGP card. A mid-tower is probably too small for these guys.. That might not be a problem for some.. But personally, I like stacking my computers, so it's a problem for people like me.
The heat issue also means real estate for lots of cooling.. And an increased risk of heart failure.. So to speak.
My guess is that you're looking at $300 - $400 for the MB + case alone. That's more than I usually spend on an entire bare-bones system. Throw in roughly $600 for each of CPU and memory, and you've got a rather large handicap for an almost minuscule performance gain.
As for the pricing of their CPUs... To be fair, they _have_ to depreciate their costs.. It's not cheap to design an entire CPU from scratch.. Remember that this is their FIRST CPU redesign since the Pentium Pro some 5 years ago!! Everything else (with the exception of the vapor-ware Itanium) has been an add-on to their old architecture.
I don't know what their margins are per CPU - I wouldn't be surprised if it was 100% - but when you take in the cost of multiple billions of dollars, it's not all money in the bank. AT&T used to depreciate their hardware costs over 40 years.. Intel doesn't have such a luxury.
-Michael
Re:just my 64 bits... (Score:2)
Most of the RISC/UNIX camp are already in 64 bit land, having just emerged from several years of teething pains.
Check out:
In many cases the chip hardware has reached 64 bit before the OS has. Most of this development has been below the radar in the mainstream press because the solutions are not Wintel.
Extrapolating the story here suggests the Transmeta approach of using VLIW to emulate lower bit width processors can help to compensate for the slow rate of change in OS.
Also, it suggests there might be practical merit in the AMD K8 approach of abusing 64 bits for double 32 bit processing.
Power Hungry (Score:2)
But the whole performance rating really doesn't matter much once you get past the point of human usability. I mean, unless you're running more programs than you can possibly use at once, who can truly use all that computing power personally?
Both companies should just forget pushing processors farther and faster, and go for manufacturing. Once they can mass produce those processors, the whole world can have a computer. (Hence, the government has more to worry about
Re:Athlon bad at SPECfp, good on FP apps, why? (Score:2)
Your numerical program in all likelihood takes advantage of neither.
Also don't forget that Intel has far greater resources to make sure all the compilers, etc are fully optimized.
zdnet - "P4 willy waving" (Score:2)
Re:For those who haven't heard... (Score:2)
Re:just my 64 bits... (Score:2)
Re:Athlon vs. P4 (Score:2)
If only this were the case. It's MHz, or in this case GHz, that sells to Joe Consumer. If the general public bought machines based on pure speed, then Motorola would have a larger userbase with its PowerPC, as until recently it ran all over the Intel/AMD offerings...
So whilst the educated buyer will look at the benchmarks and see that a 1.2GHz Athlon (+760 chipset) beats the P4 1.5GHz in a lot of the benchmarks, the uneducated majority will see that 1.5GHz is a lot faster than 1.2GHz and go out and line Intel's pockets, paying a premium in the process.
Re:Blue Men will sell the PIV (Score:2)
They really exist and they use Macs!!!
So the ad should really say "No pentiums were harmed during the making of this advertisement"
Next question... Will they come over to Europe soon?
Re:Benchmarks not surprising. (Score:2)
Hmm... let's see. Comparison of Z80 and P3 at 1 MHz.
P3 has two 32-bit wide ALUs (arithmetic-logical units) and one floating point unit. Each one of them can do one operation in one clock cycle (excluding multiply, divide, legacy instructions and branches), therefore in theory being able to execute 2 instructions per clock (in perfect conditions, perfect pipelining (no mem/reg dependencies) and no cache misses, page faults or interrupts).
Z80 has just one 8-bit wide ALU and no floating point OR multiply/divide instruction (and yes, it has two 16 bit registers too, but still it's an 8-bit system internally). Z80 is not pipelined, so it has to spend one (two?) complete cycles just for memory access and then two cycles for execution (minimum cycle time being 4 cycles).
Even in basic situations, at 1 MHz, the P3 would beat the Z80 by 800%. But in general each P3 instruction does a *lot* more useful work than a Z80 instruction. If you code a multiply routine on the Z80, it probably needs at least C+N*8 to C+N*16 cycles (C = some initialization, N = number of bits), probably more. The original Pentium needs 9 or 11 cycles for a multiply (AMD K5, K6 and K7 have a throughput of one cycle per multiply, IIRC). With N usually at least 16 and C maybe 32, the Z80 would probably be able to do one 16-bit multiply in 200-400 cycles; all modern processors would be a *lot* ahead in this case. Even for general code, a 1 MHz P3 would probably come out some 2000%, or at least 1000%, faster than a 1 MHz Z80.
But I wouldn't be surprised if Z80 performance per transistor per MHz would be faster than P3's counterpart, though...
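The kind of software multiply this comment has in mind is a shift-and-add loop, one iteration per multiplier bit. The sketch below is written in Python for readability rather than Z80 assembly, and the cycle estimate just evaluates the comment's own C + N*k formula with its suggested values (C = 32, N = 16); none of the figures are measured Z80 timings.

```python
def shift_add_multiply(a: int, b: int, bits: int = 16) -> int:
    """Multiply two unsigned integers one multiplier bit at a time,
    the way a software routine on a CPU without a multiply
    instruction would."""
    product = 0
    for _ in range(bits):
        if b & 1:      # low multiplier bit set: add the multiplicand
            product += a
        a <<= 1        # the next bit is worth twice as much
        b >>= 1
    return product


def estimated_cycles(bits: int, init: int, per_bit: int) -> int:
    """The comment's cost formula: C + N * per_bit."""
    return init + bits * per_bit


assert shift_add_multiply(300, 200) == 300 * 200

low = estimated_cycles(16, init=32, per_bit=8)
high = estimated_cycles(16, init=32, per_bit=16)
print(f"formula estimate: {low} to {high} cycles per 16-bit multiply")
```

Taken literally, the formula gives 160 to 288 cycles, a little under the 200-400 cycle figure in the comment, which already allows for "probably more". Either way it is an order of magnitude or more beyond the 9 or 11 cycles quoted for the original Pentium.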
MMX or 3DNow!? (Score:2)
--
Re:Benchmarks (Score:2)
No vendor is going to go to this trouble unless you are buying at least a hundred machines.
Re:well well well. (Score:2)
Cheapest P4 1.4 GHz = $920US
Cheapest Thunderbird 1.2 GHz = $488US.
Sounds like a significant difference to me.
Re:Benchmarks (Score:2)
Still Wrong.
I think Intel won that round (Score:2)
Re:Athlon vs. P4 (Amiga comparison, etc.) (Score:2)
64-bit may seem too high-end and too far down the road from the present, but the fact is that 64-bit systems will be commonplace, probably in as soon as 3 years. They already exist in great numbers in engineering workstations and servers, from a variety of vendors.
--
Mammoth Transition - Poetic! (Score:2)
Intel Set to Unveil Next-Generation, Speedy Pentium 4
By Duncan Martell
SAN FRANCISCO (Reuters) - About every five years, Intel Corp. (NasdaqNM:INTC - news), the world's No. 1 chip maker, undertakes a mammoth transition.
After viewing the benchmarks, the world mammoth does come to mind. Old, huge, slow (well, slower than a sabretoothed tiger)
Granted, these benchmarks are not ground to the finest tuning on either platform, but the benchmark _is_ fair.
Why? Because when I acquire software for an x86 platform I'm not getting something tuned specifically for that processor, that memory, those controllers; all I get is an approximation that's "good enough". Therefore, unless there was some specific nefarious activity to downgrade the P4 results, Intel is producing a throwback. Should the press be so unkind as to publish Athlon DDR benches and translate them into layman's terms, Intel could be seeing the next big hit on their credibility.
Chances are, Intel will escape unscathed, as the press are either ignorant savages or too afraid to stake their names to a story to rain on Intel's mediablitz parade. For the New York Times to proclaim "so what?" at the bottom of Page 1 would be removing the crumbling keystone from Intel and sending them into the "get serious" restructuring they are so badly in need of. Anyone with doubts need only look at the Merced project to see where a Bay of Pigs mentality took root and manifested itself at Intel.
--
Re:Mammoth Transition - Poetic! - DOH! (Score:2)
Beginning at "After viewing the benchmarks, the world" (typo included), the words are my own. Sorry about that.
--
Re:Athlon vs. P4 (Score:2)
In a year it'll be a 64 bit processor from AMD that'll be making Intel groan. Screw 32 bits, that is SOOO 5 years ago.
--
Re:For those who haven't heard... (Score:2)
Re:well well well. (Score:2)
just my 64 bits... (Score:3)
I'm not a very demanding fellow. I just want 64-bit systems to be everywhere before 32-bit time_t overflows in 2038.
Personally, I think it's gonna be tight. We've been hearing about 64-bit for a long time now and yet most of us are still stuck with 32-bit.
And I really hope MS moves their OS from 32->64 in less time than it took them to go 16->32. Wasn't the 80386 released some time around 1987? Past experience suggests that the "fully 64-bit" Windows 2015 will still run some 32-bit code under the hood.
Re:just my 64 bits... (Score:3)
Okay, I'm no CPU architecture expert, so take this with salt...
Re:Benchmarks (Score:3)
(e.g. c't and Ars Technica). It isn't always so easy to do your own benchmarks since one needs access to all the sets of hardware you want to compare.
Re:well well well. (Score:3)
I believe AMD uses a fully pipelined FPU (multiple ones at that). I'm sure that the P4 also has a fully pipelined FPU, BUT they added several additional stages to the basic instruction pipeline as well.
The addition of stages does two things: first, it increases your max clock rate, all else held equal. Second, it increases the penalty for missed branch predictions. Again, all things being equal, the missed branches will tend to hurt more than the higher clock helps, except in a few special cases.
Intel therefore included with those extra stages a highly advanced branch predictor that lives in the MIDDLE of the pipe. Moreover, successfully predicted branches can skip the first several stages thanks to the branch cache. Thus, well-behaved code will get the benefit of heavier pipelining with fewer of the pitfalls. To make things even more tantalizing, they're using a 2x clocked integer unit. Thus they have a 3 GHz integer unit in these benchmarks. That means that they can further extend their pipelines with almost no visible penalty (even on branch misses).
Unfortunately, they still seem to be plagued with branch misses (the only logical explanation for why AMD can still keep up with or even surpass them). Obviously the memory played an important role in these benchmarks.. The KT133 vs. AMD 760 really only differ in memory speed, and that was enough to sway the results by several percentage points. A fairer comparison would use VIA's up-and-coming DDR-SDRAM P4 chipset.
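The clock-rate versus branch-miss trade-off described above can be explored with a simple textbook-style stall model. Everything here is an assumption for illustration: the model (base CPI plus a flush penalty on each mispredicted branch), the parameter names, and all of the numbers. It is not a description of the actual P4 or Athlon pipelines.

```python
def time_per_instruction_ns(clock_ghz, base_cpi, branch_fraction,
                            mispredict_rate, penalty_cycles):
    """Average time per instruction under a simple stall model:
    CPI = base CPI + (branch fraction * mispredict rate * flush penalty)."""
    cpi = base_cpi + branch_fraction * mispredict_rate * penalty_cycles
    return cpi / clock_ghz  # cycles / (cycles per ns) = ns


# All numbers below are hypothetical, picked only to show the trade-off.
shallow = time_per_instruction_ns(1.0, 1.0, 0.2, 0.10, 10)
deep = time_per_instruction_ns(1.2, 1.0, 0.2, 0.10, 20)
deep_better_predictor = time_per_instruction_ns(1.2, 1.0, 0.2, 0.05, 20)

print(f"shallow pipeline:        {shallow:.3f} ns/instruction")
print(f"deep pipeline:           {deep:.3f} ns/instruction")
print(f"deep + better predictor: {deep_better_predictor:.3f} ns/instruction")
```

With these made-up inputs, doubling the flush penalty eats most of a 20% clock advantage, and halving the misprediction rate wins it back. How the balance falls on real hardware depends on the actual branch behaviour of the code being run.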
But, as was pointed out; if Intel can get the P4 up to 2GHZ before AMD can (last rumor I heard was that AMD was going to hit
Unfortunately, as several sites pointed out, buyers don't look at benchmarks, they look at CPU speed, so Intel should be able to wrongfully win people over on this synthetic basis. Thankfully, the only people that are going to be willing to buy P4s are people needing servers (or maybe even Q3). We'd have to see NT ASP/SQL Server and/or Linux+Apache+PHP+Oracle, etc. to determine who's king (including memory types). Unfortunately I rarely see benchmarks on these grounds.
Sooner or later AMD is going to come out with their 64-bit proc. With Mustang gone, this is their only next-great-hope. An all-new design - hopefully without a tremendous cost - that has started from scratch (as the P4 did). I'm sure it too will have a heavy pipeline, but several of its new features (such as the flat-memory architecture) should enhance the playing field.
By then, however, the P4 will have found a new chipset that handles DDR-SDRAM and will have enough volume MB's and cases that it'll be cheap enough for the hard core gamer and possibly even casual gamer to purchase. A 2 - 3GHZ processor running at
I totally agree with another poster that said we should be rooting for both Intel AND AMD since competition is good.
-Michael
But about public perception.. (Score:3)
(That's another thing that I find interesting: they need to use a *dual*-channel 800MHz Rambus setup to be able to compete with a single-channel SDRAM setup. That's pathetic.)
Benchmarks (Score:3)
Extensions and Optimisations (Score:3)
I have AMD machines now and when the dual-Athlon DDR SMP motherboards come out (purr purr) I will be getting another one.
A lot of the benchmarks, etc. claim that certain things are bettered by optimisations, saying that recompiling or rebuilding with P4 or Athlon optimising in place will radically change the numbers.
So for the Linux/FreeBSD crowd in the know: given that we rebuild kernels, what are going to be the chances that gcc and/or buildscripts are going to support/offer optimisations for either the P4 or the Athlon? I think the PentiumGCC people are working on K6/Pentium optimisation, any chance of it going further?
I'd hate to think that but for the want of code optimisation options for my silicon, I'd be unable to take full advantage...
Re:well well well. (Score:3)
So what you're saying is that if I could buy the P4, It would cost twice as much as an Athlon that can beat the hell out of it in most performance benchmarks.
Thanks for clearing that up for me.
This proves it once and for all: (Score:3)
(And I was getting worried...)
Benchmarks not surprising. (Score:3)
For example, if you compared a Z80 running at 1MHz with a P3 running at 1MHz, you would find that the Z80 does much more work.
KTB:Lover, Poet, Artiste, Aesthete, Programmer.
Athlon bad at SPECfp, good on FP apps, why? (Score:4)
Thanks for any information!
Dual Boards (Score:4)
Ah well. Right now I'm really waiting for Itanium anyway. Once that comes out, I'm hoping someone'll do a price/performance comparison of the assorted 64-bit chips on the market that will run Linux. I'm also hoping it'll push 64-bit processor prices down a bit. I'll happily go for whoever offers me the most bang for my buck.
Misleading Benchmark (Score:4)
I really think this will end up hurting intel (Score:4)
And people seem really happy with AMD. Sure, there was a minor flap when they were accused of covering up a bug with their chipset not doing full AGP speeds with Nvidia boards, but overall I see people who know what they are doing with hardware drooling over AMD and raising their eyebrows at Intel.
Until now, it's been kind of hard to tell someone who doesn't understand technology why they should like AMD (except for the price). Not everyone is willing to listen to a lecture about the evils of unfair patent law and the whole Rambus affair. Not everybody can even understand, or care, how schizophrenic Intel has become as a company: shipping chips with no decent chipset support (i820, anyone?), announcing releases of high-speed chips that they can't supply in any reasonable quantity, and ignoring the needs of not only large accounts (I worked at a school board last year and had to fight tooth and nail to guarantee a supply of Celerons for student workstations) but also niche accounts that hold the strength of their image in their hands.
I'm a big believer that, public perception be damned, if you're dealing in tech and you lose the respect of the technical community then, over time, you will lose the respect of the rest of the market.
I don't think the public has had much bad Intel publicity that they can fully understand. However, I think that Intel's move to increase clock speed at the expense of performance will ultimately have a negative effect across all segments of the market.
What is the message going to be from you people when you are asked which computer people should buy? It will be, "yeah, you could get the Intel system, but it's actually way slower than the AMD that costs a lot less too." And the pseudo-experts who read the free computer monthly will pick the argument up and spread it even further.
And then Joe Lunchpail goes to work and tells his buddies, "yeah, I never heard of this AMD, but I guess they are making faster computers than intel, even though intel says they're faster, so that's what I bought."
And this kernel of information, meme if you must, will start to weaken the Intel brand and the public's perception of MHz. It isn't hard to understand. Faster clock speeds are just for marketing. Even John Dvorak could bold that entire line in his ZDNet column.
And for all the PowerPC zealots, give it up, you can't actually buy those chips yet either.
I'm not saying that this issue will kill Intel, but it will damage them. It is a short-sighted and ignorant move by their marketing department, which, like most marketing departments, overestimates word-of-mouth when it's in their favour and underestimates it when it's potentially negative.
Once again, benchmarks hardly tell the whole story (Score:4)
What I want to know is this: how will these two processors perform against each other in dual-processor configurations?
The answer is that the Pentium 4s were designed to not be SMP capable, while the Athlons will be using the same SMP architecture that is used currently on DEC Alpha systems, which means that each processor has two dedicated connections to the North Bridge of the motherboard, as opposed to Intel's Xeon SMP configurations, which require all the processors to share bandwidth to the North Bridge.
well well well. (Score:4)
Re:For those who haven't heard... (Score:4)
Re:Athlon bad at SPECfp, good on FP apps, why? (Score:5)
Some other answers to your question were really uneducated. SPEC is an organisation producing open benchmarks, and the whole industry has a say in the benchmarks, not just Intel as someone thought.
Hope this sheds some light on the issue.
/peter
For those who haven't heard... (Score:5)
The problem is that it's a bit of a false gain. Most of the performance gained in clock speed is lost again to the serious hit you take at each branch misprediction. If you could keep your ultra-long pipe full, you'd be cruising, but you can't. Occasionally you will mispredict, and have to flush that pipe. Once your pipe becomes as deep as the P4's, that performance hit starts eating your lunch. Suddenly, most of your processor is sitting empty most of the time.
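A back-of-the-envelope sketch of that argument, in Python. Every number here (branch fraction, mispredict rate, flush penalties, clock rates) is a made-up assumption chosen only to show the shape of the effect, not a measurement of any real chip:

```python
def effective_ips(clock_hz, base_cpi, branch_fraction,
                  mispredict_rate, flush_penalty_cycles):
    """Instructions per second once pipeline flushes are charged."""
    # Average extra cycles per instruction lost to mispredicted branches.
    stall_cpi = branch_fraction * mispredict_rate * flush_penalty_cycles
    return clock_hz / (base_cpi + stall_cpi)


# Hypothetical "short pipe" chip: 1.0 GHz, 10-cycle flush penalty.
short_pipe = effective_ips(1.0e9, base_cpi=1.0, branch_fraction=0.2,
                           mispredict_rate=0.1, flush_penalty_cycles=10)

# Hypothetical "long pipe" chip: 50% higher clock, 20-cycle flush penalty.
long_pipe = effective_ips(1.5e9, base_cpi=1.0, branch_fraction=0.2,
                          mispredict_rate=0.1, flush_penalty_cycles=20)

print(f"short pipe: {short_pipe / 1e6:.0f} MIPS")
print(f"long pipe:  {long_pipe / 1e6:.0f} MIPS")
print(f"speedup from the 50% higher clock: {long_pipe / short_pipe:.2f}x")
```

With these assumed inputs, the deeper pipe keeps only part of its clock advantage. Raise the mispredict rate or the flush penalty and the gap narrows further.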
So, clock-for-clock, P4s get slaughtered by Athlons or PIIIs. But Intel doesn't care. They know that the majority of consumers buy based solely on that magical MHz/GHz number. Most consumers are not sophisticated enough to realize that there is more to performance than clock rate.
This move on Intel's part was motivated by marketing rather than management. They are playing on the uneducated masses. It is all but directly deceptive, and I hope they get their clock cleaned by the press for it.
Buyer beware.
--Lenny
Pipeline depth and clock rate. (Score:5)
...Or move to a finer linewidth.
Pipelining lets you increase the clock rate at a given linewidth. It isn't a requirement for faster clock rates in general.
Sometimes it's a good idea to use a larger pipeline, and sometimes not. For a given linewidth, increasing the pipeline depth will increase the clock speed at the expense of mispredict penalty and hardware complexity and timing sensitivity. Sometimes the increase in clock speed is enough to offset the disadvantages, but beyond a certain point, a deeper pipeline makes performance _worse_. What pipeline depth makes sense depends on your branch predictor, your cache, and a few other things.
However, shrinking linewidth will always let you increase clock speed, regardless of pipeline depth. It gives a straight factor-of-x speedup of all logic, no matter how the pipeline of the chip is set up.
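To make the tradeoff concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the logic delay, per-stage latch overhead and mispredict rate are invented numbers, the mispredict penalty is taken to equal the pipeline depth, and a linewidth shrink is assumed to scale every delay by the same factor (wire delay is ignored):

```python
def time_per_instruction_ns(depth, logic_delay_ns=10.0, latch_overhead_ns=0.5,
                            base_cpi=1.0, mispredicts_per_instr=0.05,
                            shrink=1.0):
    """Average time per instruction for a pipeline with `depth` stages."""
    # Splitting the logic across more stages shortens the cycle, but each
    # stage still pays a fixed latch overhead. `shrink` scales all delays.
    cycle_ns = shrink * (logic_delay_ns / depth + latch_overhead_ns)
    # Assumption: one mispredict costs roughly `depth` cycles to refill.
    cpi = base_cpi + mispredicts_per_instr * depth
    return cycle_ns * cpi


depths = range(1, 61)

# Best depth at the original linewidth.
best = min(depths, key=time_per_instruction_ns)

# Best depth after a hypothetical shrink that makes all logic 30% faster.
best_shrunk = min(depths, key=lambda d: time_per_instruction_ns(d, shrink=0.7))

for d in (5, 10, best, 40, 60):
    print(f"depth {d:2d}: {time_per_instruction_ns(d):.3f} ns/instr, "
          f"shrunk: {time_per_instruction_ns(d, shrink=0.7):.3f} ns/instr")
print("best depth:", best, "| best depth after shrink:", best_shrunk)
```

In this model, going deeper helps up to a point and then hurts, while the shrink speeds up every depth by the same factor and leaves the best depth where it was. A better branch predictor (a smaller `mispredicts_per_instr`) moves the sweet spot deeper.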
I've been studying this for five years, so I have a good idea of what the tradeoffs are.
Which chip will you actually be able to buy? (Score:5)
Since the release of the Athlon, AMD's chips have been more readily available at higher clock speeds. Right now, there are 4 full pages of vendors selling the AMD 1.1 GHz Thunderbird. The chip sells for as low as $341. That is an available chip.
Intel's 1.4 and 1.5 GHz chips are available from 8 vendors and will cost you between $950 and $1100. In my book that chip is not available.
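For a rough sense of scale, the prices quoted above work out as follows in dollars per GHz. This small Python sketch assumes the $950 figure goes with the 1.4 GHz part and $1100 with the 1.5 GHz part (only the range is given above), and, as the rest of this thread argues, clock rate is not the same thing as performance:

```python
def dollars_per_ghz(clock_ghz, price_usd):
    """Price divided by nominal clock rate; a crude value metric."""
    return price_usd / clock_ghz


# (name, clock in GHz, price in USD). The P4 price-to-model pairing is assumed.
offers = [
    ("Athlon Thunderbird 1.1 GHz", 1.1, 341),
    ("Pentium 4 1.4 GHz", 1.4, 950),
    ("Pentium 4 1.5 GHz", 1.5, 1100),
]

for name, ghz, price in offers:
    print(f"{name:28s} ${price:5d}  ${dollars_per_ghz(ghz, price):.0f} per GHz")
```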
Comparison of pricing (Score:5)
AMD T-Bird 1.2 GHz: $488 (Pricewatch)
Marketing to convince consumers that Pentium 4 is faster: $4 million
Look on Intel managers' face after seeing sales statistics: Priceless
------------------
A picture is worth 500 DWORDS.
Re:Misleading Benchmark (Score:5)
It's only a level playing field if it's pre-tilted in Intel's favor?