Intel Dismisses 'x86 Tax', Sees No Future For ARM
MrSeb writes "In an interview with ExtremeTech, Mike Bell — Intel's new mobile chief, previously of Apple and Palm — has completely dismissed the decades-old theory that x86 is less power efficient than ARM. 'There is nothing in the instruction set that is more or less energy efficient than any other instruction set,' Bell says. 'I see no data that supports the claims that ARM is more efficient.' The interview also covers Intel's inherent tech advantage over ARM and the foundries ('There are very few companies on Earth who have the capabilities we've talked about, and going forward I don't think anyone will be able to match us' Bell says), the age-old argument that Intel can't compete on price, and whether Apple will eventually move its iOS products from ARM to x86, just like it moved its Macs from Power to x86 in 2005."
Speed versus complexity (Score:5, Interesting)
You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly. And the reason for that is because the bandwidth outside the processor, the I/O, is so damnably slow compared to what's possible on the die itself. That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can. Besides, look at Nvidia's GPU cores: They throw hundreds of cores onto the die, but it eats hundreds of watts as well. Massively parallel and simple instruction sets don't appear to translate into energy savings.
Re:Speed versus complexity (Score:5, Insightful)
For example, taking your point about data bandwidth, because the x86 has so few registers, it has to do data IO a lot more compared to something like the PowerPC or SPARC.
To make up for that, Intel built a lot of logic in microcode and pipelining. It was a lot of work, but they did it well, so the x86 gets acceptable performance. All that extra logic takes power, though. So Intel has a tradeoff between power consumption and performance that they can make. This guy seems to be saying they will switch to reduce power consumption, and then make up for it by having the best manufacturing process once again.
And they do. For probably as long as chips continue to get smaller, Intel will have the advantage.
Re:Speed versus complexity (Score:5, Informative)
The instruction decoder is such an absurdly tiny part of a modern CPU that it really doesn't matter. CISC often has the ultimate advantage simply because it makes better use of the code cache.
Re:Speed versus complexity (Score:5, Informative)
The instruction set decoder should be an absurdly tiny part, but in modern Intel processors it's not necessarily small. It dynamically converts an archaic x86 instruction set into an internal RISC-like set.
Re:Speed versus complexity (Score:4, Interesting)
Any superscalar processor is going to be doing instruction conversion, and this includes processors with RISC instruction sets. The micro-ops that Intel processors convert to are lower-level than RISC instructions. Once you start implementing things like Tomasulo's algorithm, the traditional advantages of RISC are eroded. If this weren't the case, Intel would never have been able to leverage its process advantage to get better performance whilst retaining the x86 instruction set.
In a high-performance processor, the instruction set is irrelevant, since 80%+ of the die area is cache anyway.
Re: (Score:2)
Re:Speed versus complexity (Score:5, Interesting)
This isn't nearly so true in a super-low-power mobile design. The instruction decoder size for a given instruction set architecture is pretty much a fixed size per decode pipe. This means that in one of these tiny mobile chips the relative size of the decoders is dramatically larger. A super-low-power chip dramatically reduces the sizes of the caches and branch prediction, reduces the size of the regfiles, and often eliminates the issue queue. It probably also removes a decode pipe, but the relative reduction in decode size is much smaller than the relative size reduction in other areas.
The limited register set absolutely hurts x86 on power usage, perhaps more than the decoders do, since it forces more data cache accesses for register spills and fills.
Now, I'm not saying that x86 is necessarily worse than arm on power usage, as the richer instruction set may have other advantages such as reducing instruction cache miss rate which can be used to improve IPC which can be spent to lower frequency and reduce power. Also, microcoded instructions may turn out to be more power efficient because they don't have to access the instruction cache every cycle.
None of this considers the fact that Intel has the best fab technology in the world. This means their processors will be a generation more efficient than everyone else's, which is probably more than enough to counter any "x86 tax" which the instruction set incurs.
Comment removed (Score:5, Interesting)
Re:Speed versus complexity (Score:5, Informative)
Three watts isn't even close to usable for a mobile phone. At that level of power consumption, you would either have to charge your phone every half hour (by the time you add in the chipset consumption) or build a phone that looks like one of those old portable phones from the 1980s with the small suitcase attached....
Intel's latest Atom offerings, however, claim to draw about two orders of magnitude less power than that at idle, and are thus in the ballpark for being usable for phones and similar devices. It remains to be seen who will adopt it.
BTW, last I read, a 2GHz Cortex A9 CPU based on a 40 nm process drew about 250 mW max, not 2W, though those numbers could easily be wrong.
Re: (Score:3)
BTW, last I read, a 2GHz Cortex A9 CPU based on a 40 nm process drew about 250 mW max, not 2W, though those numbers could easily be wrong.
The answers are really all at the site the GP linked.
Performance optimized: 1.9W
Power optimized: 0.5W (250 mW/core)
Anyway, Anandtech has a pretty good overview [anandtech.com] of actual phones. If you look at the normalized hours/watthour figures Medfield (the Xolo X900) is decidedly middle of the pack. It's not better than the ARM phones, but it's not terrible either. Of course newer ARM designs will beat it, but then again Intel isn't going to stand still either.
Re:Speed versus complexity (Score:4, Informative)
Re:Speed versus complexity (Score:4, Insightful)
As far as the IPC difference between Intel and ARM, I'm going to side with Intel this time and say that architecture doesn't really matter. The back-end of these chips all run RISC-like. Cache sizes are going to be similar and the Intel core isn't all that sophisticated. There is no reason to believe that, at a given frequency, x86 performance will be significantly better than ARM performance. The argument is whether or not, at a given frequency, the added area required to decode x86 represents a significant additional power draw (or, worse yet, additional pipeline stages, which would have a detrimental impact on x86 performance.)
As far as fabs go, Intel is playing this in an interesting way. Intel seems to be using mobile chips as a way to keep their older fabs busy. This makes the mobile chips very nearly free for them to manufacture. They're just keeping up with ARM, rather than moving to their current process and absolutely blowing them away. So, let's be clear. Intel could be a die shrink ahead of where they are, which probably would make the x86 cores on a newer process better than the ARM ones on an older process. Intel is staying on the old process for cost reasons, not performance ones.
AMD doesn't really have anything that plays in the mobile space, but their closest comparison is Bobcat. Bobcat is a pretty good core for the power envelope it works in. I think AMD could build an x86 core for the mobile space, if they wanted to. The real problem is that they couldn't maintain current performance while using a back-level process to compete with Intel on cost. In some ways Intel might prefer that they could, as it might make x86 in the mobile space seem less like locking yourself into a single vendor, indirectly helping Intel sell Medfield.
Re: (Score:3, Interesting)
Re: (Score:3)
Intel has no GPU? Are you kidding? Intel has a GPU. It may not be the greatest, but they certainly have one. If you don't like it, then you do the same thing with Intel kit that they do with ARM kit.
You grab 3rd party parts like Nvidia.
Re:Speed versus complexity (Score:4, Interesting)
The instruction decoder is such an absurdly tiny part of a modern CPU that it really doesn't matter.
Not true. It is quite a small part, but it is the part that you can not turn off or put in a low power state as long as the CPU is doing anything. This is why it becomes important on low-power systems: it's a constant power drain. Big FPUs and SIMD units draw a lot more power, but they draw almost nothing when executing scalar integer code.
CISC often has the ultimate advantage simply because it makes better use of the code cache.
If you're comparing to something like the Berkeley RISC or Alpha architecture, yes. If you're comparing to ARM... not so much. In the comparisons I've done, on both compiler-generated code and hand-written assembly, ARM and x86 are within 10% of each other in terms of code size, with ARM smaller in most cases. Note that this was comparing ARM to x86 and x86-64. For a modern ARM core, you would use the Thumb-2 instruction set, which is typically about 30% smaller, and 50% smaller in the best case.
Re:Speed versus complexity (Score:5, Interesting)
Intel won the CPU wars because of manufacturing, not because of a superior instruction set.
There's nothing inherently "superior" about ARM or PPC instruction sets.
Each has its strengths and weaknesses and prescribed methods of capitalizing on the former while working around the latter.
Is x86, possibly, more inelegant than ARM or PPC? Maybe. Then again, what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?
x86 may be ugly and hackish. But it's probably THE best documented platform in history and has very VERY few platform segregation points.
Re:Speed versus complexity (Score:5, Informative)
The processor architecture is not wildly different between manufacturers. The System On Chip designs in which the CPU is just one element is what makes them different. Should Intel produce custom x86 SoC you can expect the same.
Re: (Score:3, Interesting)
The processor architecture is not wildly different between manufacturers. The System On Chip designs in which the CPU is just one element is what makes them different. Should Intel produce custom x86 SoC you can expect the same.
Intel is producing x86 SoCs (medfield) and yes, they are not PC compatible.
Re:Speed versus complexity (Score:5, Informative)
And the most insightful post of the thread is from an AC... if you had posted non-AC I might have modded you up ;)
It also points out how the GP post talking about slow off-die IO is way overrated and really not all that relevant to the mobile/embedded space.
ARM is winning the embedded STB/TV/BD/phone wars because their core is tiny and integrates well in SoCs. Many of these SoCs have graphics, Ethernet, Wifi, USB, SATA, HW crypto, MPEG decoding, etc all on die, on a $10-20 part. Intel may have something a bit faster, but they don't have anything close in overall features for that price.
Re: (Score:3)
Re:Speed versus complexity (Score:5, Insightful)
No one who had never seen x86 would design an instruction set like it. It exists this way not because someone designed it from scratch but because it is the end result of a long series of backward-compatible decisions, stretching all the way back to the 4004. Every time Intel tries to start from a clean slate, those CPUs do not take off or get enough time in the marketplace to prove themselves. The customers always demand that the new CPUs be able to run old software.
It's actually a surprise that ARM is taking off more in higher end systems (higher end meaning tablets and smart phones). I think this is precisely because backward compatibility is not necessary there.
Backward compatibility is there... (Score:3)
It's actually a surprise that ARM is taking off more in higher end systems (higher end meaning tablets and smart phones).
Since the iPhone and iPad are in effect the start of those becoming really widespread things, they are the definition of backwards compatible, the base... that's what will make it difficult to move the market away from them.
The Motorola chips never had a totally massive market penetration the way Arm does now in mobile/tablet worlds... I am not sure even slightly superior chips from Intel
Re: (Score:3)
I think this is precisely because backward compatibility is not necessary there.
It's actually quite funny. One of Intel's main problems in smartphones is that their chips aren't compatible with existing software, so they have to use dynamic translation. (There are some incorrect benchmarks out there that reckon it's as fast as native code but that's because they didn't realise that Intel had paid the manufacturer of the Android benchmarking suite they'd used to include a native x86 version and that it was using that instead of the ARM one.)
Re:Speed versus complexity (Score:5, Insightful)
The GP didn't say anything of the sort. He was pointing out that to say "CISC won" is only true if you consider that x86 is CISC and Intel spent gobs of money to be at the forefront of CPU manufacturing technology, both in shrinking die size/increasing clock speed and shoehorning all the negative characteristics of the x86 design into a form that was more RISC-like so it could allow for superscalar and deep-pipeline designs. Intel deserves a lot of credit for proving just how far CISC design can go. But it certainly wasn't that CISC won because it had greater strengths.
Sounds like Linux on the x86, actually. Seriously, though, RISC design tends to have a few very strong design elements: it tends to have a good many registers, which avoids a lot of cache/stack work; it tends to have a fixed opcode size and to require aligned memory, which usually improves throughput and allows for a much more streamlined instruction decoding engine; and precisely because there's a lot less need to support legacy platforms, there's a lot more leeway to segment memory for power considerations.
Well, you can thank MS's monopolistic actions for that. Seriously, "ugly and hackish"* might well describe nearly everything MS and Intel are known for, in their quest to maintain backwards compatibility. And if Intel had started out with an 8-bit RISC design, I'm certain there'd be the same problems, so it's not really an x86/CISC thing. Nevertheless, it's precisely the fact that Intel is unlikely to allow platform segregation points that means x86 will probably never be low power.
*And please realize, I say this with a great deal of respect towards both Intel and MS for maintaining performance, given how many hacks they've put in over the years to compensate for not only their own bugs but the bugs of other developers. So, as pretty and clever as a lot of the hacks may be, it's still ugly overall to have the hacks in the first place, to have so many in so many places, and to be so incapable of removing any without the risk of significant backlash or simply losing their customer base. I.e., the code may be pretty, but it's put them in an ugly place.
Re: (Score:3)
What killed them was the binary-only nature of most windows software...
At a time, MIPS, PPC and Alpha were all considerably faster than x86, but except for a few specialist applications none of the existing windows software ran on them, making the hardware utterly useless.
There is no incentive for a software developer, especially a commercial one to port to a platform with very few users, and there is no incentive for an end user to buy a platform for which there is no software.
I don't remember seeing NT fo
Re: (Score:3)
NT/Alpha at least had an emulator that worked well.
What killed all these RISC PCs was the Intel Pentium Pro chip. It offered 90% of the performance of the DEC Alpha chip without any of the downsides of running some weird platform. (keep in mind NT was 32-bit only) In retrospect, the RISC PC groups had more prescience than the server crowd. They saw the writing on the wall and got the fuck out quickly.
And you have to laugh at trolls knocking MS for not making software for dead platforms. As if Windows is th
Re: (Score:3)
The reason MS made NT for RISC was not because RISC vendors paid them anything, as the GGP alleged. SGI, for instance, which owned MIPS, was still very solidly in the Unix camp, and made only one MIPS workstation for NT. DEC's initial Turbochannel Alphas were OVMS and OSF/1 only. The reason NT was made to be portable was that at that time, RISC CPUs were way ahead of the Pentiums, and so MS thought that they needed to have RISC platforms ready in order to participate in the workstation and server markets.
Re: (Score:2)
what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?
Transistor efficiency.
Re:Speed versus complexity (Score:4, Insightful)
Early ARM chips didn't have an integer divide instruction because it took up to 12 cycles[1] to perform integer division, and you could get the same performance without the instruction and without complicating the pipeline. Integer division is often cited as one of the main reasons why RISC had problems, because newer techniques reduced the number of cycles required to perform integer division, so newer CISC chips just used those in place of the old microcoded loops, while RISC code got no benefit unless the instruction set was extended and the code recompiled. Modern RISC chips - including ARM - do have integer division instructions though, and compilers use them, so this is something of a moot point.
[1] It was a variable number, which made life very difficult for hardware designers. One of the benefits early RISC architectures had was the fact that their instructions took the same length of time to execute, so the pipeline could be very simple.
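For readers who have not seen what "division without a divide instruction" looks like, here is a minimal sketch of the classic shift-and-subtract (restoring) method, written in Python for readability. It is an illustration only, not the routine any particular ARM compiler or runtime library shipped; the function name and the 32-bit default width are assumptions made for the example.

```python
def udiv_shift_subtract(dividend: int, divisor: int, bits: int = 32) -> tuple[int, int]:
    """Unsigned restoring division: produces one quotient bit per loop iteration.

    Hypothetical helper for illustration. `dividend` is assumed to fit in `bits` bits.
    """
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for i in range(bits - 1, -1, -1):
        # Shift the next dividend bit (most significant first) into the remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # If the divisor fits, subtract it and record a 1 in this quotient position.
        if remainder >= divisor:
            remainder -= divisor
            quotient |= 1 << i
    return quotient, remainder


print(udiv_shift_subtract(100, 7))
```

The loop only needs shift, compare, subtract and OR, which is why a software routine built from ordinary integer instructions was a workable substitute on a core with no divider.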
Re: (Score:3)
When it was originally designed, it didn't matter much because memory accesses weren't much slower than register accesses, so people did arithmetic directly from RAM. It was more convenient that way. As a result, x86 has a lot of interesting, convenient addressing modes, which were really great when it was built.
In a modern computer, RAM is significantly slower than registers, so having more regis
Re: (Score:3)
I think that on a modern x86 implementation, with the CISC instructions you can use about a cacheline worth of BP-relative RAM just as if it were registers. It's no slower than using registers, or so it seems. There's some instruction rewriting going on that makes it so, I bet.
Re:Speed versus complexity (Score:4, Insightful)
Unfortunately every time you add circuitry like that, you also increase power consumption. Which is where difficulty comes in for Intel, when it's trying to make the tradeoff between power consumption and performance.
Re:Speed versus complexity (Score:4, Interesting)
If you read the article, Bell keeps on going back to the manufacturing process as Intel's main advantage. He says things like, "our competitors are going to have trouble making it to the 9nm scale." That's where their advantage is, and he knows it.
So basically he has a more efficient engine, but rather than give customers a more efficient car he adds lots of unnecessary weight that provides no benefit to users, so that the overall package isn't any better than what everyone else is offering.
If he put that more efficient engine in a car as lightweight as everyone else's, then customers would benefit from a superior product.
Re: (Score:3)
The x86 has four general purpose registers. No one in their right mind would design a chip like that today.
x86 has eight general purpose registers. In 64 bit mode, it's 16 general purpose registers. Plus 16 vector registers of 256 bit each, holding 64 double precision, or 128 single precision floating point numbers, or up to 512 bytes. (That's the current versions).
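The vector-register totals quoted above follow from simple arithmetic on the two figures given in the comment (16 registers, 256 bits each). A quick check, with variable names chosen only for this example:

```python
VECTOR_REGISTERS = 16     # count quoted in the comment above
BITS_PER_REGISTER = 256   # width quoted in the comment above

total_bits = VECTOR_REGISTERS * BITS_PER_REGISTER
total_bytes = total_bits // 8    # 8 bits per byte
doubles = total_bits // 64       # 64-bit double precision values
singles = total_bits // 32       # 32-bit single precision values

print(total_bytes, doubles, singles)  # prints: 512 64 128
```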
Re:Speed versus complexity (Score:5, Interesting)
Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die. That's real estate that could be spent on bigger cache or more registers. ARM is a much better instruction set by comparison.
Re: (Score:3)
Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die.
Actually the most obscure instructions are implemented in software (microcode) and don't take up any hardware at all except the storage space. This makes them hideously slow but modern compilers avoid them and if you're running very old legacy code it runs fast enough anyway. Anyway, I heard these arguments back in the 90s when processors had 5 million transistors. Now they have 1.5 billion transistors and you still keep talking about the few thousand - yes, thousands - of transistors required. Sigh.
Re:Speed versus complexity (Score:5, Informative)
x86 is ugly. It's one of the most screwed up, inconsistent, crufty architectures ever created. Motorola's 68000 architecture was a lot cleaner. But Intel, through sheer brute force, has managed to patch up many of its shortcomings and make x86 perform well in spite of itself.
They went with a load and execute architecture for the x86 instructions. Then they didn't stick to that model for the floating point instructions, going with a stack for that. And remember they split the CPU into 2 parts. If you wanted the floating point instructions, you had to get a very expensive matching x87 chip. I still remember the week when 80387 prices collapsed from $600 to $200, and still no one would buy, not with free emulators and the 486DX nearing release. Another major bit of ugliness was the segment. Rather than a true 32bit architecture, they used this segmented architecture scheme, then buggered it up even more by having different modes. In some modes, the segment and address were simply concatenated for a 32bit address space, and in others 12 bits overlapped to give only a 20bit address space. Then you had all this switching and XMS and EMS to access memory above 1M. Nasty.
x86 has been bashed for years for not having enough registers. And for making them special purpose. For instance, only one, AX, can be used for integer multiplication. Ask some compiler designers about the x86 sometime. Bet you'll get an earful.
Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes. Then there's junk like the ASCII adjust and decimal adjust instructions: AAA, AAS, AAD, and AAM, and DAA, and DAS. Nobody uses packed decimal any more! And hardly anyone ever used it. Those instructions were a crappy way to support decimal anyway. If they were going to do it at all, should have just had AA for ASCII Add instead of "adjusting" after a regular ADD instruction. Then there's the string search instructions, REPNE CMPSW and relatives. They're hopelessly obsolete. We have much better algorithms for string search than that. They also screwed up the instructions intended for OS support on the 286. That's one reason why the lowest common denominator is i386 and not i286. 286 is also only 16bit.
You might be tempted to think x86 was good for its time. Nope. Even by the standards and principles of the 1970s, x86 stinks.
Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density. Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.
That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.
Re:Speed versus complexity (Score:4, Informative)
Then they didn't stick to that model for the floating point instructions, going with a stack for that. And remember they split the CPU into 2 parts. If you wanted the floating point instructions, you had to get a very expensive matching x87 chip.
... The same as Motorola (http://en.wikipedia.org/wiki/Motorola_68881). They began to integrate an FPU about the same time (68040/486DX).
Another major bit of ugliness was the segment. Rather than a true 32bit architecture, they used this segmented architecture scheme, then buggered it up even more by having different modes.
You mean, having a 16-bit cpu support a FULL MEGABYTE instead of the usual 64Kb of Ram? In 1979? Pure evil.
In some modes, the segment and address were simply concatenated for a 32bit address space, and in others 12 bits overlapped to give only a 20bit address space. Then you had all this switching and XMS and EMS to access memory above 1M. Nasty.
You do know that XMS memory is just linear memory above real memory, right? And that EMS was just a PC-compatible paging memory layout, right? Because you seem to lack basic understanding of the architecture.
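Since the exchange above turns on how a 16-bit segment and a 16-bit offset yield a 20-bit address, a small sketch of the 8086 real-mode calculation may help. The function name is made up for the example, and the 20-bit wraparound models the original 8086 only (later CPUs with the A20 line enabled behave differently at the top of the range).

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address for a segment:offset pair in 8086 real mode.

    Hypothetical helper for illustration; models an 8086 with 20 address lines.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The segment is shifted left by 4 bits (multiplied by 16) and added to the
    # offset, so 12 bits of the two values overlap. The result is truncated to
    # 20 bits, which is where the 1 MB limit comes from.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_address(0x1235, 0x0000)))  # 0x12350, same byte as the line above
```

The last two calls show the aliasing that the overlap produces: many different segment:offset pairs name the same physical byte.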
Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes.
Thank god. I can still run FreeDOS.
They're hopelessly obsolete. We have much better algorithms for string search than that.
While the instructions you mentioned are used for string comparison, that's not their sole purpose. They compare bytes, not strings.
We have much better algorithms for string search than that.
Please do tell. Because null detection in a couple of opcodes isn't something easy to come by.
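As a rough illustration of the "null detection in a couple of opcodes" point, the sketch below models in Python what a REPNE SCASB scan does when used to find a string's terminator. That the poster had this particular instruction in mind is an assumption; the model ignores segment registers, assumes the direction flag is clear, and the function names are invented for the example.

```python
def repne_scasb(memory: bytes, di: int, al: int, cx: int) -> tuple[int, int, bool]:
    """Simplified model of REPNE SCASB with the direction flag clear.

    Returns the final (DI, CX, ZF). ZF is reported as False when no iteration
    runs, which is a simplification: real hardware leaves the flags unchanged.
    """
    zf = False
    while cx != 0:
        zf = memory[di] == al   # SCASB: compare AL with the byte at [DI]
        di += 1                 # DI advances by one byte per comparison
        cx -= 1                 # the REP prefix decrements CX each iteration
        if zf:                  # REPNE stops as soon as a match sets ZF
            break
    return di, cx, zf


def strlen_via_scasb(memory: bytes, start: int = 0) -> int:
    """Length of a NUL-terminated string found by scanning for a zero byte."""
    limit = len(memory) - start  # bound the scan to the buffer (model-only guard)
    di, _, found = repne_scasb(memory, start, 0x00, limit)
    if not found:
        raise ValueError("no NUL terminator in buffer")
    return di - start - 1        # DI ends one past the terminator


print(strlen_via_scasb(b"hello\x00junk"))  # prints: 5
```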
They also screwed up the instructions intended for OS support on the 286.
If you are talking about the MMU, they didn't screw up. Nobody cared about 16-bit support.
That's one reason why the lowest common denominator is i386 and not i286. 286 is also only 16bit.
Nobody cared about the i386 MMU either, up to Windows 3.0. That's why early versions of the 386 were buggy as hell (such as skipping the first GDT entry; yup, it's a 386 bug, not a feature).
Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density
That is the first thing on your comment that is right on the spot.
Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.
...And then you lose it. Vector instructions were an FPU feature (the 487 ITT had it), and Intel also had a peek with their RISC CPUs, i860/i960. With the advent of DSPs, this kind of technology became even more common.
That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.
Older programs will run faster on new CPUs. In many cases, they won't take advantage of SSE at all if both the algorithm and the compiler aren't optimized for the use of those instructions.
Re: (Score:3)
You mean, having a 16-bit cpu support a FULL MEGABYTE instead of the usual 64Kb of Ram? In 1979? Pure evil.
In 1979, in a rather crufty way... As opposed to the 68000 which was also released in 1979, that supports 16 FULL MEGABYTES of ram, and doing so using 32 bit addressing such that even tho only 24 address lines are connected, unless you do something crufty like use the upper 8 bits for storing data (as some software did) your code for the 68000 should run just fine on the 68020 which could support up to 4GB of ram.
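The memory sizes being traded in this sub-thread all come from the same formula: a byte-addressed CPU with n address lines can reach 2^n bytes. A short check, using the line counts mentioned in the comments (20 for the 8086's 1 MB, 24 connected on the 68000, 32 on the 68020); the function name is just for the example.

```python
def addressable_bytes(address_lines: int) -> int:
    """Bytes reachable with the given number of address lines (byte addressing assumed)."""
    return 1 << address_lines


MIB = 1 << 20  # one binary megabyte

for label, lines in [("20 lines (8086)", 20), ("24 lines (68000)", 24), ("32 lines (68020)", 32)]:
    print(label, addressable_bytes(lines) // MIB, "MB")
```

This prints 1 MB, 16 MB and 4096 MB (4 GB) respectively, matching the figures quoted above.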
Re: (Score:3)
The topic with the *architecture* was about the simple and clean elegance of ARM vs x86 with its tons of old shit.
And the topic with the *processors* was about efficiency.
ARM processors are 10 times as efficient as Intel ones. The architecture isn’t even mentioned in that.
Those are two completely separate things!
And yet Intel's first real entry into the phone processor market, Medfield, is equivalent to ARM in terms of power efficiency. ARM is 1x as efficient as x86, not 10x.
Re: (Score:2, Informative)
x86-64 has 16 (64-bit) general purpose registers, but ARM has 8 (32-bit) general purpose registers, and a few specialized ones, some of which are only available in certain operating modes. PowerPC and SPARC both have 32 64-bit registers but can only do register-register type operations (load/store) which quickly forces the registers to be cycled, while x86-64 can do register-memory type operands which is much more efficient.
And yes, Intel does do a lot of microcode, pipelining, and micro-ops. The great pa
Re: (Score:3)
That's not correct.
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0245a/index.html [arm.com]
2. Register set
The ARM register set consists of 37 general-purpose registers, 16 of which are usable at any one time. The subset which is usable is determined by the current operation mode.
Re:Speed versus complexity (Score:4, Informative)
Yeah, right. 37 of which you can only ever use at most 16. Of which, 5 are taken up already, and personally I wouldn't call the flags register a general purpose register, nor the stack pointer, etc, but apparently they do, lol. Also, look down at the nice graph right below your quote, you will also notice that during Fast Interrupt Routines, you have only 5 registers free to use (R8-R12), and during user mode, you only have 13 (R0-R12) free for your use, and during an IRQ you have 0 free? lol.
So you have R0-R7 which are what most would consider general purpose, R8-R12 are special and only available in certain operating modes, and R13,14,15 aren't what most would consider general purpose.
Re: (Score:2)
And yes, Intel does do a lot of microcode, pipelining, and micro-ops. The great part is that because of it, the instruction set appears CISC externally, but internally through micro-ops, it gets 95% of the benefit that you would see through RISC. Today's x86 chips are more a CISC/RISC hybrid than they are of a pure CISC design.
It does that, but at the cost of power. The question is whether they can get the same performance while cutting power. Intel says they can. We'll see.
Re: (Score:3)
Intel won the CPU wars because of manufacturing, not because of a superior instruction set. They are always able to get a smaller manufacturing process.
When Intel was up against the 68000 they outperformed it at the same process size because of more compact instructions. This happened again with RISC which relied on the suboptimal premise that saving transistors in the processor trumps memory bandwidth and cache efficiency. Fail.
ARM fixed that issue with its thumb instruction set, a 16 bit instruction encoding without which Intel surely would have squashed it too. To be sure, Intel has a few single byte instructions, mainly register inc/decs, but in genera
Re: (Score:2)
I'm not sure I get where you're going. I think the more logical course of action, given your argument that it's faster inside than outside the CPU, would be to move everything inside the CPU. I know "that" would be a hell of a lot more problem to fix/debug, but if you want
Re: (Score:2)
parts that send bits of electrons back and forth
Listen man, we don't all have CPUs that incorporate significant quantities of Sr2CuO3. Damn kids these days and their newfangled electrons, decaying into goshdarned component bits. It's just shameful.
Re: (Score:2, Interesting)
In terms of market share CISC isn't even close to touching RISC. Every ARM processor is RISC. It's not just smartphones and tablets you need to consider, but PMPs, consumer routers, and an unfathomable number of other devices that all use ARM (Advanced RISC Machine, previously Acorn RISC Machine).
Re: (Score:2)
Intel really is the last hold out for CISC, and I don't think it even wants to be in that position. It does create newer CPUs that are RISC based but the customers demand x86 compatibility in the desktop which is their cash cow. Everywhere else you see RISC dominating to a ridiculous degree. Sure there are a few 68000 based SoCs around, some people actually use 8051 or 8086 here and there, but they're such tiny parts of the market compared to PowerPC, ARM, PIC, AVR, MIPS, and so forth.
Comment removed (Score:4, Interesting)
Re: (Score:3)
oooooooooookay.
alking about tiny low margin embedded that while might be good from a numbers game frankly isn't a market one should be chasing
Given that there are many high volume, low margin companies out there, including ARM which are vastly more successful than anything you or I have done, I won't be taking your word for it that I should be doing something else.
Its like how Apple doesn't make any of the low rent
You know, you're right! I could never stand to be only Michael Dell level of rich when I coul
Re:Speed versus complexity (Score:5, Insightful)
And we know who lost that one. Badly.
We do? The world's fastest supercomputer (K computer [wikipedia.org]) is RISC based, and ARM is RISC, so it seems very much alive. Also CISC now has pipelining which was the thing that originally made RISC awesome, and RISC has gotten more complex, so they have evolved to be closer to each other. I am sure there are other factors that are more important for energy efficiency (mainly transistor size) and I don't have an opinion on that, but I don't understand where you are coming from.
Re: (Score:3)
FWIW, ARM isn't a pure RISC instruction set.
I think that was his point. There is nothing that is "pure" CISC or pure RISC these days as they have been borrowing tech from each other for years now.
Re: (Score:2)
Re: (Score:3, Insightful)
You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.
Wait, which one lost?
RISC lost because instructions took too much space and caused cache misses.
CISC lost because it couldn't perform -- practically every CISC processor designed today is a RISC processor + instruction set translation.
As is frequently the case between two pure ideas (that are both legitimate enough to be seriously considered long enough for a decent flame war), the winner is actually a clever but "impure" choice combining the merits of both.
And the reason for that is because the bandwidth outside the processor, the I/O, is so damnably slow compared to what's possible on the die itself. That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can.
Yes, which is why ARM, despite/because of being fa
Re:Speed versus complexity (Score:5, Interesting)
Power-wise the argument is right. There's very little difference between the two instruction sets that makes one more power efficient than the other. However, in practice the difference is that most Intel x86 family chips are optimized for high performance (desktop), whereas most ARM chips are optimized for cost and efficiency (low power embedded systems, phones, etc). ARM probably has more experience designing small chips, but as it ramps up into faster desktop or tablet oriented CPUs it is going to lose more of that advantage.
It really does come down to software ultimately I think. Software needs to do minimal work if it wants to save power; stop checking the net every minute to see if there's an update, put the CPU to sleep when not in use, use interrupts instead of polling, do more in a compiled low level language and less in a byte code interpreted language or scripting language, keep things small, and don't let Microsoft touch you. As soon as you start demanding the ability to run MS Office then you are giving up on power savings.
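To make the "interrupts instead of polling" point concrete, here is a minimal sketch that counts how often a waiting thread wakes up under each strategy. The function names, the 1 ms poll interval and the 50 ms delay are arbitrary choices for illustration, and Python is used only to keep it short; the point is the wait strategy, not the language.

```python
import threading
import time

def wait_by_polling(flag, interval=0.001):
    """Wake up every `interval` seconds to check the flag; return wakeup count."""
    wakeups = 0
    while not flag.is_set():
        time.sleep(interval)  # the thread wakes on every tick, changed or not
        wakeups += 1
    return wakeups

def wait_by_blocking(flag):
    """Block until signalled; counted here as a single wakeup."""
    flag.wait()  # the thread stays asleep until set() is called
    return 1

def run(waiter, delay=0.05):
    """Run `waiter` while a timer thread sets the flag after `delay` seconds."""
    flag = threading.Event()
    timer = threading.Timer(delay, flag.set)
    timer.start()
    result = waiter(flag)
    timer.join()
    return result

polling_wakeups = run(wait_by_polling)
blocking_wakeups = run(wait_by_blocking)
print("polling wakeups:", polling_wakeups)
print("blocking wakeups:", blocking_wakeups)
```

The exact polling count depends on the OS timer granularity, but every one of those wakeups is a chance for the CPU to leave a sleep state for nothing.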
Re: (Score:3)
There's very little difference between the two instruction sets that makes one more power efficient than the other
There are two aspects of an instruction set that affect power consumption. One is density: how much instruction cache do you need for a given algorithm. ARM does about as well as i386 here, and Thumb-2 does better (typically about 20% smaller code than x86). Smaller instruction cache means less power consumption. The other is decoder complexity: how many transistors do you need to decode the instructions. x86 instructions are somewhere between 1 and 15 bytes, and the encoding scheme is highly non-othro
Re: (Score:3)
You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one.
CISC did. ARM is RISC, and there are far more ARM chips in use in the world than X86 chips.
Re: (Score:3)
And there are far more 8-bit and 16-bit CPUs that use CISC instruction sets than ARM chips. The quantities mean nothing, it is who is making the most money and Intel certainly won that one.
Re: (Score:2)
The Nvidia example is not convincing. Nvidia has to produce very fast chips in a limited time-frame in order to maintain their image, and market-share. There is no time for power-optimization at all. For the right workload, though, Nvidia (or AMD GPUs) are massively better in power consumption with regard to computing power than x86 CPUs. Just look, for example, at breaking encrypted passwords. You get speed-ups of 100...1000 compared to a normal PC CPU, while power consumption is only 1...10 that of the CP
Re: (Score:3)
You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.
We do? Aside from the fact that the distinction is becoming less relevant as chips become more complex, it seems to me that pretty much any market that isn't dependent on MS Windows has gone with RISC.
The x86 has a monopoly on desktop and laptop PCs and business servers not because it is a better architecture, but because of a huge, legacy code base - so big that even Intel failed when they tried to move to a new 64 bit instruction set (Itanium) and had to fall back on the current, backward-compatible so
Slashdot, please do something ! (Score:2)
This is getting too serious
I will not mention the name, but the post I'm replying to, is littered with links to that joint
I am not asking for censorship, but what those guys are doing (I am not sure it's one person or several) is too much
Being parasitic is one thing, being parasitic _and_ annoying is a totally different beast altogether !!
Do something, Slashdot, please do something !!
Re: (Score:2)
Which is why I'm able to search for it on Slashdot and find it, eh?
No, SEO bombing is the way to go. Also notify google.
Re: (Score:3)
rel="nofollow" is what you use with a link to indicate that it should not be considered for page rank. Slashdot already uses that as you note.
It will still show up as a hit in a search.
Re: (Score:3)
My examination of the link content (using Chrome, right-click the link and pick "Inspect element"), shows that there is NO rel=nofollow attribute for any link
Maybe Chrome just sucks then, since the rel=nofollow is in fact there for all the links.
Re:Make mycleanpc reference shit eating (Score:4, Insightful)
Oh come on moderators.
That link is the 2nd most disgusting thing besides Goatse and I am sick and tired of that Mycleanx troll (won't say it as it will increase his SEO and page ranking).
The only way we can stop that dipshit is to lower his Google ranking or the more he spams the more we will bring troll sites for his potential customers instead.
Well... (Score:5, Insightful)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Exactly. Intel seems like a great company with intelligent engineers, but look how long it's taken for them to come even close to ATI or Radeon discrete graphics. They're not going to be in the cell phone game anytime soon.
And an intel based iphone? Not soon. Maybe in a few years. MAYBE. I'll believe it when I see it.
Re: (Score:2)
Re: (Score:2)
They aren't going to pull any design wins on the strength of their GPU; because it's the same damn GPU as a number of others; but they also aren't going to be put out in the cold by it...
Re: (Score:2)
Funny that Intel once made the XScale ARM processor.
Turn that boat around (Score:5, Insightful)
Now they are putting all their engineering muscle into minimizing power requirements, while maintaining high performance.
I don't see any reason to think they won't succeed, and if they do, then ARM will end up a niche architecture.
Re: (Score:2)
They worried a lot about power draw and leakage current. They were just worried about somewhat arbitrary targets of 45, 65 and 130 W TDP. If you give their engineers an equally arbitrary 4.5W power envelope they'll work on that.
The thing for intel has always been that the easiest way to reduce power consumption is a die shrink. Which it is. If they can stay one node ahead of the competition and transistor for transistor match performance more or less they'll have a big advantage. And as you say, they'
Re: (Score:2)
A niche that already sells more CPUs per year than Intel does. The high end computing market such as the desktop, smart phones, netbooks, tablets, those are just a fraction of the total CPUs sold. Every automobile has at least one CPU now, every home is going to have a CPU or two in the electric meter (even dumb ones), every microwave oven has one, every new appliance will have one, etc. Even your phone will have one front end CPU for the display and apps but probably a couple behind the scenes CPUs to do
Re: (Score:2)
Already too late. Intel is about 2 decades late and it will take even them a long time to catch up. Also note that for devices running a Linux kernel, using ARM is not that much effort and is already well known to the developers, so these people do not need x86 for anything. The only people that would desperately need x86 with low power is Microsoft, because they have this basically x86-only monster of an OS, just look at all the limitations of Win8 on ARM.
He's mostly right (Score:5, Insightful)
Compounding this fact, ARM isn't that great of an architecture. It's got variable length instructions, not enough registers, microcoded instructions, and a horrible, horrible virtual memory architecture.
The big thing that ARM has is the licensing model. ARM will give you just about everything you need for a decent applications SOC. Processor, bus, and now even things like GPU and memory controllers. Sprinkle in your own company's special sauce, and you have a great product. All they ask is for a little bit of royalty money for every chip you sell. And since everyone is using pretty much the same ARM core, the tools and "ecosystem" are pretty good.
But there's not much of an advantage to the architecture... the advantage is all in the business model, where everyone can license it on the cheap and make a unique product out of it.
And nowadays, the CPU is becoming less important. Everything around it -- graphics, video, audio, imaging, telecommunications -- is what makes the difference.
Re: (Score:3, Interesting)
Phoronix just did an article on 6 clustered Panda boards (Cortex A9) [phoronix.com] VS the other guys. It's worth a read.
Re:He's mostly right (Score:5, Informative)
ARM has fixed length instructions. Thumb is a separate instruction set from ARM and is also a fixed size set. You can't easily interchange ARM and Thumb without making a function call. There is Thumb 2 that interchanges them more easily now. However the instruction set decoder for Thumb to ARM is so very very simple that it could even be a standard project in an undergrad CS class. Thumb really is for people who are willing to give up some performance to save space anyway. ARM has plenty of registers compared to Thumb. I think it has the sweet spot of 16 registers which is enough to not feel cramped but not so many that context switching or interrupts get in your way. ARM is not micro-coded in any model as far as I know, it is RISC and there's no reason to do any micro-coding (maybe in an FPU coprocessor?).
It does have a goofy MMU at times, but this is treated as a separate coprocessor and is not intrinsic to the ARM (a different ARM system-on-chip will handle memory mapping and VM differently, it is not standardized).
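To illustrate what fixed-length instructions buy the decoder, here is a toy sketch of finding instruction boundaries. The length lists are made up for illustration (the variable one just stays within x86's 1 to 15 byte range), and the function names are hypothetical; this models only the boundary problem, not a real decoder.

```python
from itertools import accumulate

def fixed_width_offset(n, width=4):
    """Byte offset of instruction n when every instruction is `width` bytes."""
    return n * width

def variable_width_offsets(lengths):
    """Byte offsets of each instruction given their individual lengths.

    Each offset depends on every earlier length, so boundaries have to be
    found in order (or guessed in parallel and discarded when wrong).
    """
    return [0] + list(accumulate(lengths))[:-1]

# Hypothetical instruction streams, lengths in bytes.
arm_like = [4, 4, 4, 4, 4]   # classic ARM: always 4 bytes
x86_like = [1, 3, 6, 2, 15]  # made-up mix within the 1-15 byte range

print([fixed_width_offset(i) for i in range(5)])  # [0, 4, 8, 12, 16]
print(variable_width_offsets(arm_like))           # [0, 4, 8, 12, 16]
print(variable_width_offsets(x86_like))           # [0, 1, 4, 10, 12]
```

With a fixed width the n-th boundary is a multiply; with variable lengths you cannot know where instruction 4 starts until you have sized instructions 0 through 3.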
Re:He's mostly right (Score:4, Funny)
You can't easily interchange ARM and Thumb without making a function call.
ARMs weakness lies in the ELBOW implementation. Whereas Thumb is opposable to 4finGer which some see as a strength, but I find that the pinKey shadow architecture complements Thumb nicely with hAnd holding the whole set together in a CRISP burrito.
Re: (Score:2)
I think the ARM revolution is greater than the license issue. I think it has to do with minimizing cost of the total product that
Re: (Score:2)
And nowadays, the CPU is becoming less important. It's everything around it -- graphics, video, audio, imaging, telecommunications -- is what makes the difference.
The CPU gets important again when you start multiplying cores.
Nice post.
He's missing the point... (Score:5, Insightful)
Re: (Score:3)
Actually, they answer that in the article. He claims that, even if Intel chips *are* more expensive, a) the price of the processor is pretty much negligible compared to the price of the full unit (particularly the screen), and b) the performance advantage is worth the cost.
And he kind of has a point. The Raspberry Pi has been described as "a smartphone minus the screen". It's $25-$35. A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor.
Re: (Score:3)
Re: (Score:2)
A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor.
Smartphone prices are overdue for a precipitous drop. And ten times better battery life would be nice.
Totally agree --- only SoC price matters (Score:3)
I think that you are totally right about this. Maintaining x86 compatibility may hurt Intel a little, but it's not the key issue.
ARM-based SoCs cost under $10 in volume, and Intel simply cannot compete in that space. It doesn't want to. It likes large prices and huge profit margins.
Meanwhile, ARM keeps improving the performance of their cores, while the SoC manufacturers keep improving the capabilities of their SoCs, including (critically)
Definition of "efficient" (Score:5, Insightful)
From Intel: Work done per watt
From ARM: System power draw small enough for handheld & long battery life
A year or two ago, I read a study that the most ops/watt were still done by high-end Intel processors sucking tons of power each. They did so much work so fast that the per-watt work done was still beyond the tiny-power-sipping ARMs that were relatively slow but still quite capable. Has this changed in the last generation or two of CPUs?
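The two definitions can point in opposite directions, which a little arithmetic shows. The sketch below uses two completely made-up chips (the wattages, throughput figures and the 5 Wh battery are assumptions for illustration, not measurements of any real part).

```python
def ops_per_watt(ops_per_second, power_watts):
    """Throughput per watt, which is the same thing as operations per joule."""
    return ops_per_second / power_watts

def battery_hours(capacity_wh, power_watts):
    """Hours a battery of `capacity_wh` watt-hours lasts at a constant draw."""
    return capacity_wh / power_watts

# Hypothetical chips; all numbers are invented for the example.
big = {"power_watts": 95.0, "ops_per_second": 100e9}
small = {"power_watts": 1.0, "ops_per_second": 0.5e9}

big_eff = ops_per_watt(big["ops_per_second"], big["power_watts"])
small_eff = ops_per_watt(small["ops_per_second"], small["power_watts"])

print("big chip wins on work per watt:", big_eff > small_eff)
print("small chip on a 5 Wh battery, hours:", battery_hours(5.0, small["power_watts"]))
print("big chip on a 5 Wh battery, hours:", battery_hours(5.0, big["power_watts"]))
```

With these invented numbers the big chip does about twice the work per joule, yet at full load it would drain a phone-sized battery in minutes, so both camps can claim to be "more efficient" and both be right.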
Re:Definition of "efficient" (Score:5, Insightful)
Re: (Score:2)
and not have to recharge it more than once a day.
That was back then when I had a Dell Axim with an extra battery. These days I only want to recharge it once a week, if that.
Re: (Score:2)
ARM has some advantages (Score:5, Insightful)
They don't have all the legacy instruction set issues to deal with. Intel must be backward compatible with all previous versions. Remember, the 8080 subset is still alive and well in the INTEL architecture. This comes with a cost.
It's easier to move up from a lower power system to a higher power system. In this context power can be thought of as both electrical power consumption and as compute power. Moving down means something must be simplified/eliminated, and the backwards compatibility issues make this much harder.
When it comes to mobile devices, ARM owns the market and has the network effect working for it. This is how INTEL kept a stranglehold on the PC market, but it works against them for mobile.
ARM is not monolithic in the same way as INTEL. Because of the license based IP model, there are many more variations of ARM chips than INTEL chips. The resources to make variations come from the IP user base, not from ARM. A single company, no matter how dominant, cannot afford to support that many variants. If some of the versions fail, the cost is not borne by ARM. If INTEL guesses wrong and makes a dud, they have to absorb the cost.
INTEL is no pushover, but I think ARM has the advantage.
Caveat lector (Score:4, Interesting)
Simply put, as Intel has no standing in the ARM market (and AMD has now), Intel has every motivation to distort the facts.
That said, there is indication that while x86 is not in principle more power-hungry than ARM, in practice, on silicon, it is today. The main reason is that it requires more chip area and more complex circuitry, which in practice leads to higher power consumption because of communication and signal distribution overheads and because complex circuits are far harder to optimize, not only for power consumption. Again, that does not mean that in principle it is infeasible. But note that larger chip area is also a strong argument against x86 if size matters.
There is also the fact that low-power ARM is more energy efficient than low-power x86 when you look at the market. So maybe this person is just saying that Intel messed up and failed to make good low-power x86 implementations while ARM did not. Looking back at power-disasters like the P4, this would be plausible as well. If, on the other hand, I look at CPUs like the AMD LX800 x86 offering, (e.g. used in the Alix boards), these are pretty power efficient and may even get into ARM ranges. They are pretty slow at full load though and have a large chip area.
So my impression is that the Intel person just said that while they do not have any offering comparable to ARM, it is their fault and not a fundamental problem of x86. I am unsure this is right, although I certainly agree that Intel does not have a leg to stand on in the market for power-efficient CPUs.
Re: (Score:3)
Simply put, as Intel has no standing in the ARM market (and AMD has now), Intel has every motivation to distort the facts.
Did you know that Intel used to make ARM processors (StrongARM, XScale)? And that they are (probably) still an ARM licensee?
I see nothing (Score:2)
Me thinks Sgt Schultz doth protest too much. Since my first post here in the early days of URL speak-and-spell, I've propounded that the disadvantages of x86 to RISC in performance were almost entirely illusory (brazen bubbles in the fabric of reality now feeding the worms notwithstanding).
That said, on the power front, x86 bites. Possibly it bites like an undershot chihuahua in some small way that a billion dollars of doggy dentistry could adequately rectify—but it most certainly bites. Jumbles of
Re: (Score:2)
Re: (Score:2)
So far this year Intel has basically finished off AMD from the high-end of the desktop CPU market, while advancing into the useful mobile desktop GPU market via their 22nm mobile Ivy Bridge HD 4000 chipset. There's nothing really competitive from them yet for under 15W of TDP, but it's obvious they intend to battle more on the mobile and SoC markets. Only new market to expand into at this point, and the only one still growing usefully. They're not there yet.
But it wasn't that long ago that Intel's integr
Re: (Score:2)
Oh, what they've got is interesting now if they'd drop Windows like the bad habit it is and give us a decent Intel Android tablet. You'd think they'd leap at it - bigger tablets mean more room for a bigger battery.
It's not like Microsoft is holding back on the Tegra 3 WinRT tablets to give them a leg up.
Re: (Score:2)
The topic with the "architecture" was about the simple and clean elegance of 680x0 vs x86 with its tons of old shit.
Oh, wait, am I in the wrong century?