Intel Hardware

Intel Dismisses 'x86 Tax', Sees No Future For ARM

MrSeb writes "In an interview with ExtremeTech, Mike Bell — Intel's new mobile chief, previously of Apple and Palm — has completely dismissed the decades-old theory that x86 is less power efficient than ARM. 'There is nothing in the instruction set that is more or less energy efficient than any other instruction set,' Bell says. 'I see no data that supports the claims that ARM is more efficient.' The interview also covers Intel's inherent tech advantage over ARM and the foundries ('There are very few companies on Earth who have the capabilities we've talked about, and going forward I don't think anyone will be able to match us' Bell says), the age-old argument that Intel can't compete on price, and whether Apple will eventually move its iOS products from ARM to x86, just like it moved its Macs from Power to x86 in 2005."
This discussion has been archived. No new comments can be posted.

  • by girlintraining ( 1395911 ) on Thursday June 14, 2012 @09:08PM (#40331149)

    You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly. And the reason for that is because the bandwidth outside the processor, the I/O, is so damnably slow compared to what's possible on the die itself. That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can. Besides, look at Nvidia's GPU cores: They throw hundreds of cores onto the die, but it eats hundreds of watts as well. Massively parallel and simple instruction sets don't appear to translate into energy savings.

    • by phantomfive ( 622387 ) on Thursday June 14, 2012 @09:19PM (#40331225) Journal
      Intel won the CPU wars because of manufacturing, not because of a superior instruction set. They are always able to get a smaller manufacturing process.

      For example, taking your point about data bandwidth, because the x86 has so few registers, it has to do data IO a lot more compared to something like the PowerPC or SPARC.

      To make up for that, Intel built a lot of logic in microcode and pipelining. It was a lot of work, but they did it well, so the x86 gets acceptable performance. All that extra logic takes power though. So Intel has a tradeoff between power consumption and performance that they can make. This guy seems to be saying they will switch to reduce power consumption, and then make up for it by having the best manufacturing process once again.

      And they do. For probably as long as chips continue to get smaller, Intel will have the advantage.
      • by Man On Pink Corner ( 1089867 ) on Thursday June 14, 2012 @09:30PM (#40331275)

        The instruction decoder is such an absurdly tiny part of a modern CPU that it really doesn't matter. CISC often has the ultimate advantage simply because it makes better use of the code cache.

        • by Darinbob ( 1142669 ) on Thursday June 14, 2012 @10:01PM (#40331427)

          The instruction set decoder should be an absurdly tiny part, but in modern Intel processors it's not necessarily small. They're dynamically converting an archaic x86 instruction set into an internal RISC-like set.

          • by msgmonkey ( 599753 ) on Friday June 15, 2012 @01:58AM (#40332449)

            Any superscalar processor is going to be doing instruction conversion; this includes RISC instruction set processors. The micro-ops that Intel processors convert to are less than RISC instructions. Once you start implementing things like Tomasulo's algorithm, the traditional advantages of RISC are eroded. If this wasn't the case, Intel would never have been able to leverage their process advantage to get better performance whilst retaining the x86 instruction set.

            In a high-performance processor the instruction set is irrelevant, since 80%+ of the die area is cache anyway.

        • That's why ARM has the compact Thumb instruction subset.
        • by yakovlev ( 210738 ) on Thursday June 14, 2012 @10:27PM (#40331533) Homepage
          For a "modern" CPU the instruction decoder is an absurdly tiny part. This is because the branch prediction, caches, issue queue, regfiles, etc. are all much larger or at least the same size.

          This isn't nearly so true in a super-low-power mobile design. The instruction decoder size for a given instruction set architecture is pretty much a fixed size per decode pipe. This means that in one of these tiny mobile chips the relative size of the decoders is dramatically larger. A super-low-power chip dramatically reduces the sizes of the caches and branch prediction, reduces the size of the regfiles, and often eliminates the issue queue. It probably also removes a decode pipe, but the relative reduction in decode size is much smaller than the relative size reduction in other areas.
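
          To put rough numbers on that argument, here is a minimal sketch. Every area figure in it is hypothetical (arbitrary units, not measurements of any real chip); it only shows how a roughly fixed per-pipe decoder cost becomes a larger share once everything else shrinks.

```python
# All area figures are hypothetical, in arbitrary units.
DECODER_AREA_PER_PIPE = 1.0

def decoder_share(decode_pipes, other_area):
    """Fraction of core area spent on instruction decode."""
    decoder_area = decode_pipes * DECODER_AREA_PER_PIPE
    return decoder_area / (decoder_area + other_area)

# Hypothetical big core: 3 decode pipes, lots of cache/predictor/queue area.
big_core = decoder_share(decode_pipes=3, other_area=97.0)

# Hypothetical low-power core: one pipe removed, everything else cut hard.
small_core = decoder_share(decode_pipes=2, other_area=8.0)

print(f"big core decoder share:   {big_core:.0%}")
print(f"small core decoder share: {small_core:.0%}")
```

          With these made-up inputs the decoders go from a few percent of the core to a fifth of it, even though one decode pipe was dropped.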

          The limited register set absolutely hurts x86 on power usage, perhaps more than the decoders do, since it forces more data cache accesses for register spills and fills.

          Now, I'm not saying that x86 is necessarily worse than arm on power usage, as the richer instruction set may have other advantages such as reducing instruction cache miss rate which can be used to improve IPC which can be spent to lower frequency and reduce power. Also, microcoded instructions may turn out to be more power efficient because they don't have to access the instruction cache every cycle.
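
          The "spend IPC to lower frequency" trade can be sketched with the usual first-order model, where throughput is IPC times clock frequency and dynamic power scales as C * V^2 * f. The capacitance, voltage, IPC and clock values below are hypothetical, and the model ignores leakage and any voltage reduction the lower clock might allow.

```python
def dynamic_power(capacitance, voltage, frequency_hz):
    """First-order CMOS dynamic power: P = C * V^2 * f."""
    return capacitance * voltage ** 2 * frequency_hz

def throughput(ipc, frequency_hz):
    """Instructions retired per second."""
    return ipc * frequency_hz

# Hypothetical baseline core: IPC 1.0 at 1 GHz.
base_perf = throughput(ipc=1.0, frequency_hz=1.0e9)
base_power = dynamic_power(capacitance=1.0e-9, voltage=1.0, frequency_hz=1.0e9)

# Hypothetical core with 25% higher IPC, clocked 20% lower.
tuned_perf = throughput(ipc=1.25, frequency_hz=0.8e9)
tuned_power = dynamic_power(capacitance=1.0e-9, voltage=1.0, frequency_hz=0.8e9)

print("relative throughput:", tuned_perf / base_perf)
print("relative dynamic power:", tuned_power / base_power)
```

          Under these assumptions the throughput is unchanged while dynamic power drops in proportion to the clock; if the voltage could also be lowered, the quadratic term would help further.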

          None of this considers the fact that Intel has the best fab technology in the world. This means their processors will be a generation more efficient than everyone else's, which is probably more than enough to counter any "x86 tax" which the instruction set incurs.
          • Comment removed (Score:5, Interesting)

            by account_deleted ( 4530225 ) on Thursday June 14, 2012 @10:50PM (#40331633)
            Comment removed based on user account deletion
            • by dgatwood ( 11270 ) on Thursday June 14, 2012 @11:44PM (#40331911) Homepage Journal

              Three watts isn't even close to usable for a mobile phone. At that level of power consumption, you would either have to charge your phone every half hour (by the time you add in the chipset consumption) or build a phone that looks like one of those old portable phones from the 1980s with the small suitcase attached....

              Intel's latest Atom offerings, however, claim to draw about two orders of magnitude less power than that at idle, and are thus in the ballpark for being usable for phones and similar devices. It remains to be seen who will adopt it.

              BTW, last I read, a 2GHz Cortex A9 CPU based on a 40 nm process drew about 250 mW max, not 2W, though those numbers could easily be wrong.

              • by Kjella ( 173770 )

                BTW, last I read, a 2GHz Cortex A9 CPU based on a 40 nm process drew about 250 mW max, not 2W, though those numbers could easily be wrong.

                The answers are really all at the site the GP linked.
                Performance optimized: 1.9W
                Power optimized: 0.5W (250 mW/core)

                Anyway, Anandtech has a pretty good overview [anandtech.com] of actual phones. If you look at the normalized hours/watthour figures Medfield (the Xolo X900) is decidedly middle of the pack. It's not better than the ARM phones, but it's not terrible either. Of course newer ARM designs will beat it, but then again Intel isn't going to stand still either.

              • by Ginger Unicorn ( 952287 ) on Friday June 15, 2012 @05:09AM (#40333189)
                There are two versions [arm.com] - speed optimised @ 2GHz and power optimised @ 800MHz to ~1GHz. Speed optimised draws 1.9W and power optimised draws 0.5W.
            • by yakovlev ( 210738 ) on Friday June 15, 2012 @08:12AM (#40334057) Homepage
              Mobile processors (even those made by Intel) are NOT desktop processors. While it's pretty clear you know this, you make a mistake by trying to count the hardware decoders on the ARM but not on the x86. I don't care who makes the processor, no general-purpose mobile phone processor is going to be able to do 1080p video decoding in software. Intel couldn't even do it on Atom, which has substantially higher power draws than a mobile phone CPU. This is true of anything in the current generation of processors and should be true for the next few die shrinks. With technology scaling not providing the performance gains it once did, this really means it won't be possible for the foreseeable future. Even if the x86 cores could do 1080p video decoding, you'd still rather have the dedicated hardware, as dedicated hardware will use substantially less power doing it than the x86 core, and video decoding is one of the cases where power draw matters on mobile phones. The point of all this is that the graphics hardware comes out to be a wash when comparing Intel and ARM systems for mobile phones. Both of them need one, so you can't assume the x86 chip can get by without one. Thus, it really is comparing apples to apples to compare just the x86 cores and the ARM cores.

              As far as the IPC difference between Intel and ARM, I'm going to side with Intel this time and say that architecture doesn't really matter. The back-end of these chips all run RISC-like. Cache sizes are going to be similar and the Intel core isn't all that sophisticated. There is no reason to believe that, at a given frequency, x86 performance will be significantly better than ARM performance. The argument is whether or not, at a given frequency, the added area required to decode x86 represents a significant additional power draw (or, worse yet, additional pipeline stages, which would have a detrimental impact on x86 performance.)

              As far as fabs go, Intel is playing this in an interesting way. Intel seems to be using mobile chips as a way to keep their older fabs busy. This makes the mobile chips very nearly free for them to manufacture. They're just keeping up with ARM, rather than moving to their current process and absolutely blowing them away. So, let's be clear. Intel could be a die shrink ahead of where they are, which probably would make the x86 cores on a newer process better than the ARM ones on an older process. Intel is staying on the old process for cost reasons, not performance ones.

              AMD doesn't really have anything that plays in the mobile space, but their closest comparison is Bobcat. Bobcat is a pretty good core for the power envelope it works in. I think AMD could build an x86 core for the mobile space, if they wanted to. The real problem is that they couldn't maintain current performance while using a back-level process to compete with Intel on cost. In some ways Intel might prefer that they could, as it might make x86 in the mobile space seem less like locking yourself into a single vendor, indirectly helping Intel sell Medfield.
        • by TheRaven64 ( 641858 ) on Friday June 15, 2012 @03:45AM (#40332845) Journal

          The instruction decoder is such an absurdly tiny part of a modern CPU that it really doesn't matter.

          Not true. It is quite a small part, but it is the part that you can not turn off or put in a low power state as long as the CPU is doing anything. This is why it becomes important on low-power systems: it's a constant power drain. Big FPUs and SIMD units draw a lot more power, but they draw almost nothing when executing scalar integer code.

          CISC often has the ultimate advantage simply because it makes better use of the code cache.

          If you're comparing to something like the Berkeley RISC or Alpha architecture, yes. If you're comparing to ARM... not so much. In the comparisons I've done, on both compiler-generated code and hand-written assembly, ARM and x86 are within 10% of each other in terms of code size with ARM smaller in most cases. Note that this was comparing ARM to x86 and x86-64. For a modern ARM core, you would use the Thumb-2 instruction set, which is typically about 30% smaller, and 50% smaller in the best case.

      • by Chas ( 5144 ) on Thursday June 14, 2012 @09:35PM (#40331303) Homepage Journal

        Intel won the CPU wars because of manufacturing, not because of a superior instruction set.

        There's nothing inherently "superior" about ARM or PPC instruction sets.

        Each has its strengths and weaknesses and prescribed methods of capitalizing on the former while working around the latter.

        Is x86, possibly, more inelegant than ARM or PPC? Maybe. Then again, what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?

        x86 may be ugly and hackish. But it's probably THE best documented platform in history and has very VERY few platform segregation points.

        • by Anonymous Coward on Thursday June 14, 2012 @09:59PM (#40331415)

          The processor architecture is not wildly different between manufacturers. The System On Chip designs in which the CPU is just one element is what makes them different. Should Intel produce custom x86 SoC you can expect the same.

          • Re: (Score:3, Interesting)

            by Anonymous Coward

            The processor architecture is not wildly different between manufacturers. The System On Chip designs in which the CPU is just one element is what makes them different. Should Intel produce custom x86 SoC you can expect the same.

            Intel is producing x86 SoCs (medfield) and yes, they are not PC compatible.

          • by Dahamma ( 304068 ) on Friday June 15, 2012 @12:04AM (#40331987)

            And the most insightful post of the thread is from an AC... if you had posted non-AC I might have modded you up ;)

            It also points out how the GP post talking about slow off-die IO is way overrated and really not all that relevant to the mobile/embedded space.

            ARM is winning the embedded STB/TV/BD/phone wars because their core is tiny and integrates well in SoCs. Many of these SoCs have graphics, Ethernet, Wifi, USB, SATA, HW crypto, MPEG decoding, etc all on die, on a $10-20 part. Intel may have something a bit faster, but they don't have anything close in overall features for that price.

          • It's worth noting that with the Cortex A9 and newer, ARM has done a lot to standardise things. For example, interrupt controllers now have a standard well-defined interface. This means that once you have one Cortex A9 SoC working, getting the next working is about as hard as getting a new x86 laptop working: you may need device drivers for the GPU and a few other things, but the core functionality will be the same.
        • by Darinbob ( 1142669 ) on Thursday June 14, 2012 @10:05PM (#40331443)

          No one who had never seen x86 would design an instruction set like it. It exists this way not because someone designed it from scratch but because it is the end result of a long series of backward-compatible decisions, stretching all the way back to the 4004. Every time Intel tries to start from a clean slate, those CPUs do not take off or get enough time in the marketplace to prove themselves. The customers always demand that the new CPUs be able to run old software.

          It's actually a surprise that ARM is taking off more in higher end systems (higher end meaning tablets and smart phones). I think this is precisely because backward compatibility is not necessary there.

          • It's actually a surprise that ARM is taking off more in higher end systems (higher end meaning tablets and smart phones).

            Since the iPhone and iPad are in effect the start of those becoming really widespread things, they are the definition of backwards compatible, the base... that's what will make it difficult to move the market away from them.

            The Motorola chips never had a totally massive market penetration the way Arm does now in mobile/tablet worlds... I am not sure even slightly superior chips from Intel

          • by makomk ( 752139 )

            I think this is precisely because backward compatibility is not necessary there.

            It's actually quite funny. One of Intel's main problems in smartphones is that their chips aren't compatible with existing software, so they have to use dynamic translation. (There are some incorrect benchmarks out there that reckon it's as fast as native code but that's because they didn't realise that Intel had paid the manufacturer of the Android benchmarking suite they'd used to include a native x86 version and that it was using that instead of the ARM one.)

        • by 10101001 10101001 ( 732688 ) on Thursday June 14, 2012 @10:23PM (#40331509) Journal

          There's nothing inherently "superior" about ARM or PPC instruction sets.

          The GP didn't say anything of the sort. He was pointing out that to say "CISC won" is only true if you consider that x86 is CISC and Intel spent gobs of money to be at the forefront of CPU manufacturing technology, both in shrinking die size/increasing clock speed and shoehorning all the negative characteristics of the x86 design into a form that was more RISC like so it could allow for super-scalar and deep pipeline designs. Intel deserves a lot of credit in proving just how far CISC design can go. But it certainly wasn't that CISC won because it had greater strengths.

          Is x86, possibly, more inelegant than ARM or PPC? Maybe. Then again, what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?

          Sounds like Linux on the x86, actually. Seriously, though, RISC design tends to have a few very strong design elements: it tends to have a good many registers which absolves a lot of cache/stack work, it tends to have a fixed opcode size and requires aligned memory which usually improves throughput and allows for a much more streamlined instruction decoding engine, and precisely because there's a lot less need to support legacy platforms there's a lot more leeway to segment memory for power considerations.

          x86 may be ugly and hackish. But it's probably THE best documented platform in history and has very VERY few platform segregation points.

          Well, you can thank MS's monopolistic actions for that. Seriously, "ugly and hackish"* might well describe near everything MS and Intel can be known for, in their quest to maintain backwards compatibility. And if Intel had started out with an 8-bit RISC design, I'm certain there'd be the same problems, so it's not really an x86/CISC thing. Nevertheless, it's precisely because Intel is unlikely to allow platform segregation points that x86 will probably never be low power.

          *And please realize, I say this with a great deal of respect towards both Intel and MS in maintaining performance given how many hacks they've put in over the years to compensate for not only their own bugs but the bugs of other developers. So, as pretty and clever as a lot of the hacks may be, it's still ugly overall to have the hacks in the first place and to have so many over so many places and to be so incapable of removing any without the risk of significant backlash or simply to lose their customer base. I.e., the code may be pretty but it's put them in an ugly place.

        • what exactly is so elegant about a "catch all" platform where the basic processor architecture can change wildly between manufacturers, leading one to require many "flavors" of code simply to cover multiple vendor platforms?

          Transistor efficiency.

        • The x86 has four general purpose registers. No one in their right mind would design a chip like that today.

          When it was originally designed, it didn't matter much because memory accesses weren't much slower than register accesses, so people did arithmetic directly from RAM. It was more convenient that way. As a result, x86 has a lot of interesting, convenient addressing modes, which were really great when it was built.

          In a modern computer, RAM is significantly slower than registers, so having more regis
          • by tibit ( 1762298 )

            I think that on a modern x86 implementation, with the CISC instructions you can use about a cacheline worth of BP-relative RAM just as if it were registers. It's no slower than using registers, or so it seems. There's some instruction rewriting going on that makes it so, I bet.

            • by phantomfive ( 622387 ) on Friday June 15, 2012 @12:15AM (#40332031) Journal
              In that case, it would mean the CPU is doing the optimization instead of the compiler. I am unfamiliar with that particular optimization, but it sounds like a good idea.

              Unfortunately every time you add circuitry like that, you also increase power consumption. Which is where difficulty comes in for Intel, when it's trying to make the tradeoff between power consumption and performance.
          • by Bert64 ( 520050 ) <.moc.eeznerif.todhsals. .ta. .treb.> on Friday June 15, 2012 @01:16AM (#40332303) Homepage

            If you read the article, Bell keeps on going back to the manufacturing process as Intel's main advantage. He says things like, "our competitors are going to have trouble making it to the 9nm scale." That's where their advantage is, and he knows it.

            So basically he has a more efficient engine, but rather than give customers a more efficient car he adds lots of unnecessary weight that provides no benefit to users, so that the overall package isn't any better than what everyone else is offering.

            If he put that more efficient engine, in a car as lightweight as everyone else's then customers would benefit from a superior product.

          • The x86 has four general purpose registers. No one in their right mind would design a chip like that today.

            x86 has eight general purpose registers. In 64 bit mode, it's 16 general purpose registers. Plus 16 vector registers of 256 bit each, holding 64 double precision, or 128 single precision floating point numbers, or up to 512 bytes. (That's the current versions).
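
            The vector register totals quoted above follow directly from 16 registers of 256 bits each; a quick check of the arithmetic (the constant names are just for illustration):

```python
VECTOR_REGISTERS = 16
BITS_PER_REGISTER = 256

total_bits = VECTOR_REGISTERS * BITS_PER_REGISTER
total_bytes = total_bits // 8
doubles = total_bits // 64   # 64-bit double precision values
singles = total_bits // 32   # 32-bit single precision values

print(total_bytes, doubles, singles)
```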

        • by naasking ( 94116 ) <(naasking) (at) (gmail.com)> on Thursday June 14, 2012 @11:22PM (#40331791) Homepage

          There's nothing inherently "superior" about ARM or PPC instruction sets.

          Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die. That's real estate that could be spent on bigger cache or more registers. ARM is a much better instruction set by comparison.

          • by Kjella ( 173770 )

            Superior to x86? Sure there is. x86 is a mish mash of instructions many of which hardly anyone uses except for backwards compatibility, but that still cost real estate on the CPU die.

            Actually the most obscure instructions are implemented in software (microcode) and don't take up any hardware at all except the storage space. This makes them hideously slow but modern compilers avoid them and if you're running very old legacy code it runs fast enough anyway. Anyway, I heard these arguments back in the 90s when processors had 5 million transistors. Now they have 1.5 billion transistors and you still keep talking about the few thousand - yes, thousands - of transistors required. Sigh.

        • by bzipitidoo ( 647217 ) <bzipitidoo@yahoo.com> on Thursday June 14, 2012 @11:28PM (#40331821) Journal

          x86 is ugly. It's one of the most screwed up, inconsistent, crufty architectures ever created. Motorola's 68000 architecture was a lot cleaner. But Intel, through sheer brute force, has managed to patch up many of its shortcomings and make x86 perform well in spite of itself.

          They went with a load and execute architecture for the x86 instructions. Then they didn't stick to that model for the floating point instructions, going with a stack for that. And remember they split the CPU into 2 parts. If you wanted the floating point instructions, you had to get a very expensive matching x87 chip. I still remember the week when 80387 prices collapsed from $600 to $200, and still no one would buy, not with free emulators and the 486DX nearing release. Another major bit of ugliness was the segment. Rather than a true 32bit architecture, they used this segmented architecture scheme, then buggered it up even more by having different modes. In some modes, the segment and address were simply concatenated for a 32bit address space, and in others 12 bits overlapped to give only a 20bit address space. Then you had all this switching and XMS and EMS to access memory above 1M. Nasty.
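
          For readers who haven't met the overlapping scheme: in 8086 real mode the 16-bit segment is shifted left by 4 bits and added to the 16-bit offset, so 12 bits of the two values overlap and the result is a 20-bit address. A minimal sketch of that calculation (the function name is made up for illustration, and the 20-bit wraparound assumes an original 8086-style address bus):

```python
ADDRESS_MASK_20_BIT = 0xFFFFF

def real_mode_address(segment, offset):
    """Physical address for a segment:offset pair in 8086 real mode."""
    return ((segment << 4) + offset) & ADDRESS_MASK_20_BIT

# Two different segment:offset pairs can name the same byte.
print(hex(real_mode_address(0x1234, 0x0010)))
print(hex(real_mode_address(0x1235, 0x0000)))

# The top of the address space wraps around on a 20-bit bus.
print(hex(real_mode_address(0xFFFF, 0x0010)))
```

          The aliasing in the first two lines is part of what makes the scheme awkward for compilers: pointer equality can't be decided by comparing the raw segment and offset values.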

          x86 has been bashed for years for not having enough registers. And for making them special purpose. For instance, only one, AX, can be used for integer multiplication. Ask some compiler designers about the x86 sometime. Bet you'll get an earful.

          Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes. Then there's junk like the ASCII adjust and decimal adjust instructions: AAA, AAS, AAD, and AAM, and DAA, and DAS. Nobody uses packed decimal any more! And hardly anyone ever used it. Those instructions were a crappy way to support decimal anyway. If they were going to do it at all, should have just had AA for ASCII Add instead of "adjusting" after a regular ADD instruction. Then there's the string search instructions, REPNE CMPSW and relatives. They're hopelessly obsolete. We have much better algorithms for string search than that. They also screwed up the instructions intended for OS support on the 286. That's one reason why the lowest common denominator is i386 and not i286. 286 is also only 16bit.
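
          As an illustration of what "adjusting after a regular ADD" means, here is a rough Python emulation of an 8-bit ADD followed by DAA on packed decimal (two digits per byte). It is a sketch based on the documented behaviour of DAA as I understand it, not cycle-accurate hardware modelling, and the function name is hypothetical.

```python
def bcd_add(a, b):
    """Add two packed-BCD bytes the way ADD followed by DAA would.

    Returns (result_byte, carry_flag).
    """
    raw = a + b
    al = raw & 0xFF
    carry = raw > 0xFF                       # CF after the binary ADD
    aux = (a & 0x0F) + (b & 0x0F) > 0x0F     # AF after the binary ADD

    # DAA step 1: fix the low digit if it left the 0-9 range.
    old_al, old_carry = al, carry
    if (al & 0x0F) > 9 or aux:
        al = (al + 0x06) & 0xFF

    # DAA step 2: fix the high digit and report the decimal carry.
    if old_al > 0x99 or old_carry:
        al = (al + 0x60) & 0xFF
        carry = True
    else:
        carry = False
    return al, carry

result, carry = bcd_add(0x38, 0x45)   # decimal 38 + 45
print(hex(result), carry)
```

          The binary sum 0x7D is not valid packed decimal; the adjust step turns it into 0x83, which is why the instruction only makes sense immediately after an ADD.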

          You might be tempted to think x86 was good for its time. Nope. Even by the standards and principles of the 1970s, x86 stinks.

          Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density. Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.

          That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.

          • by rev0lt ( 1950662 ) on Friday June 15, 2012 @12:42AM (#40332163)

            Then they didn't stick to that model for the floating point instructions, going with a stack for that. And remember they split the CPU into 2 parts. If you wanted the floating point instructions, you had to get a very expensive matching x87 chip.

            ... The same as Motorola.(http://en.wikipedia.org/wiki/Motorola_68881). They began to integrate an FPU about the same time (68040/486DX).

            Another major bit of ugliness was the segment. Rather than a true 32bit architecture, they used this segmented architecture scheme, then buggered it up even more by having different modes.

            You mean, having a 16-bit CPU support a FULL MEGABYTE instead of the usual 64KB of RAM? In 1979? Pure evil.

            In some modes, the segment and address were simply concatenated for a 32bit address space, and in others 12 bits overlapped to give only a 20bit address space. Then you had all this switching and XMS and EMS to access memory above 1M. Nasty.

            You do know that XMS memory is just linear memory above real memory, right? And that EMS was just a PC-compatible paging memory layout, right? Because you seem to lack basic understanding of the architecture.

            Few platform segregation points? Maybe, but one price is lots of legacy garbage. x86 still has to support those ancient segmented modes.

            Thank god. I can still run FreeDOS.

            They're hopelessly obsolete. We have much better algorithms for string search than that.

            While the instructions you mentioned are used for string comparison, that's not their sole purpose. They compare bytes, not strings.

            We have much better algorithms for string search than that.

            Please do tell. Because null detection in a couple of opcodes isn't something easy to come by.
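
            The "null detection in a couple of opcodes" idiom is presumably the classic REPNE SCASB string-length scan. A rough Python emulation of its semantics, as a sketch only: it scans forward, ignores real register widths, and the function names are made up for illustration.

```python
def repne_scasb(memory, di, al, cx):
    """Emulate REPNE SCASB scanning forward.

    Compares AL with memory[di], advances di, decrements cx, and stops
    when a match is found or cx reaches zero.
    Returns (di, cx, zero_flag).
    """
    zero_flag = False
    while cx != 0:
        zero_flag = memory[di] == al
        di += 1
        cx -= 1
        if zero_flag:
            break
    return di, cx, zero_flag

def strlen(memory, start=0):
    """Length of the NUL-terminated string starting at `start`."""
    di, _, found = repne_scasb(memory, start, al=0, cx=len(memory) - start)
    if not found:
        raise ValueError("no terminating NUL byte")
    return di - start - 1   # di has already stepped past the NUL

print(strlen(b"hello\x00world"))
```

            Note that the scan leaves the index one past the matching byte, which is why the length calculation subtracts one.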

            They also screwed up the instructions intended for OS support on the 286.

            If you are talking about MMU, they didn't screw up. Nobody cared about 16-bit support.

            That's one reason why the lowest common denominator is i386 and not i286. 286 is also only 16bit.

            Nobody cared about i386 MMU either, up to Windows 3.0. That's why early versions of 386 were buggy as hell (such as skipping the first GDT entry - yup, it's a 386 bug, not a feature).

            Someone mentioned CISC, as if that beat out RISC? It didn't. Under the hood, modern x86 CPUs actually translate each x86 instruction to several RISC instructions. So why not just use the actual RISC instruction set directly? One argument in favor of the x86 instruction set is that it is denser. Takes fewer bytes than the equivalent action in RISC instructions. Perhaps, but that's accidental. If that is such a valuable property, ought to create a new instruction set that is optimized for code density

            That is the first thing on your comment that is right on the spot.

            Then, as if x86 wasn't CISC enough, they rolled out the MMX, SSE, SSE2, SSE3, SSE4 additions.

            ...And then you lose it. Vector instructions were a FPU feature (the 487 ITT had it), and Intel also had a peek with their RISC cpus, i860/i960. With the advent of DSPs, this kind of technology became even more common.

            That makes a powerful argument in favor of open source. Could drop all the older SSE versions if only all programs could be easily recompiled.

            Older programs will run faster on new CPUs. In many cases, they won't take advantage of SSE at all if both the algorithm and the compiler aren't optimized for the use of those instructions.

            • by Bert64 ( 520050 )

              You mean, having a 16-bit CPU support a FULL MEGABYTE instead of the usual 64KB of RAM? In 1979? Pure evil.

              In 1979, in a rather crufty way... As opposed to the 68000, which was also released in 1979, supports 16 FULL MEGABYTES of RAM, and does so using 32-bit addressing, such that even though only 24 address lines are connected, unless you do something crufty like use the upper 8 bits for storing data (as some software did), your code for the 68000 should run just fine on the 68020, which could support up to 4GB of RAM.
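
              A small sketch of why stashing data in the upper 8 bits was a trap. It assumes the simple model described above: the 68000 puts only the low 24 bits of a pointer on the bus, while the 68020 uses all 32. The function names and the tag value are hypothetical.

```python
def bus_address_68000(pointer):
    """Only 24 address lines are connected; the top byte is ignored."""
    return pointer & 0x00FFFFFF

def bus_address_68020(pointer):
    """All 32 address bits are significant."""
    return pointer & 0xFFFFFFFF

clean_pointer = 0x00001234
tagged_pointer = (0xA5 << 24) | clean_pointer   # data hidden in the top byte

# On the 68000 both pointers reach the same memory location.
print(hex(bus_address_68000(clean_pointer)), hex(bus_address_68000(tagged_pointer)))

# On the 68020 the tagged pointer lands somewhere else entirely.
print(hex(bus_address_68020(clean_pointer)), hex(bus_address_68020(tagged_pointer)))
```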

      • Re: (Score:2, Informative)

        by KingMotley ( 944240 )

        x86-64 has 16 (64-bit) general purpose registers, but ARM has 8 (32-bit) general purpose registers, and a few specialized ones, some of which are only available in certain operating modes. PowerPC and SPARC both have 32 64-bit registers but can only do register-register type operations (load/store) which quickly forces the registers to be cycled, while x86-64 can do register-memory type operands which is much more efficient.

        And yes, Intel does do a lot of microcode, pipelining, and micro-ops. The great pa

        • by Pulzar ( 81031 )

          but ARM has 8 (32-bit) general purpose registers, and a few specialized ones, some of which are only available in certain operating modes

          That's not correct.

          http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dai0245a/index.html [arm.com]

          2. Register set

          The ARM register set consists of 37 general-purpose registers, 16 of which are usable at any one time. The subset which is usable is determined by the current operation mode.

          • by KingMotley ( 944240 ) on Friday June 15, 2012 @12:07AM (#40332001) Journal

            Yeah, right. 37 of which you can only ever use at most 16. Of which, 5 are taken up already, and personally I wouldn't call the flags register a general purpose register, nor the stack pointer, etc, but apparently they do, lol. Also, look down at the nice graph right below your quote, you will also notice that during Fast Interrupt Routines, you have only 5 registers free to use (R8-R12), and during user mode, you only have 13 (R0-R12) free for your use, and during an IRQ you have 0 free? lol.

            So you have R0-R7 which are what most would consider general purpose, R8-R12 are special and only available in certain operating modes, and R13,14,15 aren't what most would consider general purpose.
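
For what it's worth, the "37 registers, 16 usable" figure can be reproduced with some simple bookkeeping. This is a sketch assuming the classic ARMv4/v5 banking scheme as I understand it (User/System share one set, FIQ banks R8-R14, the other exception modes bank R13-R14, and the count includes the status registers). The names `SHARED`, `BANKED`, `STATUS` and `visible` are only for illustration.

```python
# Register bookkeeping, assuming the classic ARMv4/v5 banking scheme.
SHARED = [f"R{i}" for i in range(16)]            # R0-R15 as seen in User/System mode
BANKED = {
    "fiq": [f"R{i}_fiq" for i in range(8, 15)],  # FIQ gets private R8-R14
    "irq": ["R13_irq", "R14_irq"],               # the rest bank only SP and LR
    "svc": ["R13_svc", "R14_svc"],
    "abt": ["R13_abt", "R14_abt"],
    "und": ["R13_und", "R14_und"],
}
STATUS = ["CPSR"] + [f"SPSR_{mode}" for mode in BANKED]

physical = len(SHARED) + sum(len(regs) for regs in BANKED.values())
total = physical + len(STATUS)

def visible(mode: str) -> list[str]:
    """The 16 physical registers that R0-R15 map to in the given mode."""
    banked = {name.split("_")[0]: name for name in BANKED.get(mode, [])}
    return [banked.get(name, name) for name in SHARED]

print(physical, len(STATUS), total)
print(visible("fiq"))
```

Under these assumptions the modes differ in which physical registers sit behind R8-R14, not in how many registers an instruction can name: every mode sees 16, and an IRQ handler shares R0-R12 with the interrupted code, so it has to save whatever it uses.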

        • And yes, Intel does do a lot of microcode, pipelining, and micro-ops. The great part is that because of it, the instruction set appears CISC externally, but internally through micro-ops, it gets 95% of the benefit that you would see through RISC. Today's x86 chips are more a CISC/RISC hybrid than they are of a pure CISC design.

          It does that, but at the cost of power. The question is whether they can get the same performance while cutting power. Intel says they can. We'll see.

      • Intel won the CPU wars because of manufacturing, not because of a superior instruction set. They are always able to get a smaller manufacturing process.

        When Intel was up against the 68000 they outperformed it at the same process size because of more compact instructions. This happened again with RISC which relied on the suboptimal premise that saving transistors in the processor trumps memory bandwidth and cache efficiency. Fail.

        ARM fixed that issue with its thumb instruction set, a 16 bit instruction encoding without which Intel surely would have squashed it too. To be sure, Intel has a few single byte instructions, mainly register inc/decs, but in genera

    • That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can.

      I'm not sure I get where you're going. I think the more logical course of action, given your argument that it's faster inside than outside the CPU, would be to move everything inside the CPU. I know "that" would be a hell of a lot more problem to fix/debug, but if you want

      • parts that send bits of electrons back and forth

        Listen man, we don't all have CPUs that incorporate significant quantities of Sr2CuO3. Damn kids these days and their newfangled electrons, decaying into goshdarned component bits. It's just shameful.

    • Re: (Score:2, Interesting)

      by Anonymous Coward

      In terms of market share CISC isn't even close to touching RISC. Every ARM processor is RISC. It's not just smartphones and tablets you need to consider, but PMPs, consumer routers, and an unfathomable number of other devices that all use ARM (Advanced RISC Machine, previously Acorn RISC Machine).

      • Intel really is the last hold out for CISC, and I don't think it even wants to be in that position. It does create newer CPUs that are RISC based but the customers demand x86 compatibility in the desktop which is their cash cow. Everywhere else you see RISC dominating to a ridiculous degree. Sure there are a few 68000 based SoCs around, some people actually use 8051 or 8086 here and there, but they're such tiny parts of the market compared to PowerPC, ARM, PIC, AVR, MIPS, and so forth.

      • Comment removed (Score:4, Interesting)

        by account_deleted ( 4530225 ) on Thursday June 14, 2012 @11:35PM (#40331855)
        Comment removed based on user account deletion
        • oooooooooookay.

          talking about tiny low margin embedded that while might be good from a numbers game frankly isn't a market one should be chasing

          Given that there are many high volume, low margin companies out there, including ARM which are vastly more successful than anything you or I have done, I won't be taking your word for it that I should be doing something else.

          Its like how Apple doesn't make any of the low rent

          You know, you're right! I could never stand to be only Michael Dell level of rich when I coul

    • by danlip ( 737336 ) on Thursday June 14, 2012 @09:40PM (#40331337)

      And we know who lost that one. Badly.

      We do? The world's fastest supercomputer (K computer [wikipedia.org]) is RISC based, and ARM is RISC, so it seems very much alive. Also CISC now has pipelining which was the thing that originally made RISC awesome, and RISC has gotten more complex, so they have evolved to be closer to each other. I am sure there are other factors that are more important for energy efficiency (mainly transistor size) and I don't have an opinion on that, but I don't understand where you are coming from.

    • Depends on what you are doing. For some things GPUs are much more efficient. Say bitcoin mining. Best ratio of Mhash/Joule was .3 for the best Intel processor, and 2.5 for some AMD cards. So even 3-4 years down the road there are other factors that may stop Intel. Switching speeds for silicon have really met the limit; significantly higher speeds will require entirely new processes. Main memory is becoming faster and cache is getting bigger. In reality the designs have moved towards each other. RISC almost alw
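
To put the Mhash/Joule figures quoted above into energy terms, here is a quick back-of-the-envelope calculation. The 0.3 and 2.5 values are the ones from the comment; the workload size is an arbitrary number picked for illustration.

```python
# Energy needed for the same amount of hashing at two efficiency levels.
cpu_mhash_per_joule = 0.3   # figure quoted in the comment above
gpu_mhash_per_joule = 2.5   # figure quoted in the comment above

def joules_for(mhashes: float, mhash_per_joule: float) -> float:
    """Energy in joules to compute `mhashes` at the given efficiency."""
    return mhashes / mhash_per_joule

work = 1_000_000  # Mhash, arbitrary workload for illustration
cpu_joules = joules_for(work, cpu_mhash_per_joule)
gpu_joules = joules_for(work, gpu_mhash_per_joule)
ratio = cpu_joules / gpu_joules

print(f"CPU: {cpu_joules:.0f} J, GPU: {gpu_joules:.0f} J, ratio: {ratio:.2f}x")
```

Mhash/Joule is just (Mhash/s) divided by watts, so the ratio does not depend on the workload size chosen.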
    • Re: (Score:3, Insightful)

      by Anonymous Coward

      You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      Wait, which one lost?
      RISC lost because instructions took too much space and caused cache misses.
      CISC lost because it couldn't perform -- practically every CISC processor designed today is a RISC processor + instruction set translation.
      As is frequently the case between two pure ideas (that are both legitimate enough to be seriously considered long enough for a decent flame war), the winner is actually a clever but "impure" choice combining the merits of both.

      And the reason for that is because the bandwidth outside the processor, the I/O, is so damnably slow compared to what's possible on the die itself. That's why the data transfers to and from the CPU are only about 1/30th or less the speed at which the CPU runs internally. The only logical course of action is to do as much as you can on each byte of data coming off the bus as you can.

      Yes, which is why ARM, despite/because of being fa

    • by Darinbob ( 1142669 ) on Thursday June 14, 2012 @09:59PM (#40331411)

      Power-wise the argument is right. There's very little difference between the two instruction sets that makes one more power efficient than the other. However in practice the difference is that most Intel x86 family chips are optimized for high performance (desktop) whereas most ARM chips are optimized for cost and efficiency (low power embedded systems, phones, etc). ARM probably has more experience in the chip design in making things smaller but as it ramps up into faster desktop or tablet oriented CPUs it is going to lose out more.

      It really does come down to software ultimately I think. Software needs to do minimal work if it wants to save power; stop checking the net every minute to see if there's an update, put the CPU to sleep when not in use, use interrupts instead of polling, do more in a compiled low level language and less in a byte code interpreted language or scripting language, keep things small, and don't let Microsoft touch you. As soon as you start demanding the ability to run MS Office then you are giving up on power savings.
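
The "interrupts instead of polling" point can be illustrated with a toy count of how often the CPU has to wake up while waiting for one event. This is a simplified model with made-up function names and no real hardware or OS calls; the 60-second wait and 10 ms poll interval are arbitrary example values.

```python
import math

def polling_wakeups(event_at_ms: int, poll_every_ms: int) -> int:
    """Wakeups when the CPU checks on a timer until the first tick at or after the event."""
    return max(1, math.ceil(event_at_ms / poll_every_ms))

def interrupt_wakeups(event_at_ms: int) -> int:
    """Wakeups when the CPU sleeps until the device signals the event."""
    return 1

wait_ms = 60_000        # event arrives after one minute (example value)
poll_interval_ms = 10   # example polling period

print(polling_wakeups(wait_ms, poll_interval_ms), interrupt_wakeups(wait_ms))
```

Each wakeup keeps the core out of its low-power state for a while, so in this model the polling version pays that cost thousands of times for a single event, whatever the instruction set.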

      • There's very little difference between the two instruction sets that makes one more power efficient than the other

        There are two aspects of an instruction set that affect power consumption. One is density: how much instruction cache do you need for a given algorithm. ARM does about as well as i386 here, and Thumb-2 does better (typically about 20% smaller code than x86). Smaller instruction cache means less power consumption. The other is decoder complexity: how many transistors do you need to decode the instructions. x86 instructions are somewhere between 1 and 15 bytes, and the encoding scheme is highly non-othro
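
A small sketch of why variable-length encodings complicate the decoder. The variable-length format below is a toy invented for illustration (the low nibble of the first byte gives the instruction length, 1 to 15 bytes); it is not the x86 encoding, which is considerably messier.

```python
FIXED_WIDTH = 4  # bytes per instruction in the fixed-width toy ISA

def fixed_boundaries(code: bytes) -> list[int]:
    """Every instruction start is known up front, so all can be decoded in parallel."""
    return list(range(0, len(code), FIXED_WIDTH))

def variable_boundaries(code: bytes) -> list[int]:
    """Toy variable-length ISA: each start is only known after the
    previous instruction's length has been decoded."""
    offsets = []
    pos = 0
    while pos < len(code):
        length = code[pos] & 0x0F   # hypothetical length field
        if length == 0:
            raise ValueError(f"invalid length at offset {pos}")
        offsets.append(pos)
        pos += length
    return offsets

stream = bytes([0x01, 0x03, 0xAA, 0xBB, 0x02, 0xCC, 0x0F] + [0] * 14)
print(fixed_boundaries(bytes(12)))
print(variable_boundaries(stream))
```

The fixed-width case is a simple range; the variable-width case is a serial dependency chain. Hardware works around that by speculatively decoding at many byte offsets at once, and that extra logic is where the transistor and power cost comes from.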

    • You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one.

      CISC did. ARM is RISC, and there are far more ARM chips in use in the world than X86 chips.

      • And there are far more 8-bit and 16-bit CPUs that use CISC instruction sets than ARM chips. The quantities mean nothing, it is who is making the most money and Intel certainly won that one.

    • by gweihir ( 88907 )

      The Nvidia example is not convincing. Nvidia has to produce very fast chips in a limited time-frame in order to maintain their image, and market-share. There is no time for power-optimization at all. For the right workload, though, Nvidia (or AMD GPUs) are massively better in power consumption with regard to computing power than x86 CPUs. Just look, for example, at breaking encrypted passwords. You get speed-ups of 100...1000 compared to a normal PC CPU, while power consumption is only 1...10 that of the CP

    • You know, we had the same argument with RISC versus CISC architecture. And we know who lost that one. Badly.

      We do? Aside from the fact that the distinction is becoming less relevant as chips become more complex, it seems to me that pretty much any market that isn't dependent on MS Windows has gone with RISC.

      The x86 has a monopoly on desktop and laptop PCs and business servers not because it is a better architecture, but because of a huge, legacy code base - so big that even Intel failed when they tried to move to a new 64 bit instruction set (Itanium) and had to fall back on the current, backward-compatible so

  • Well... (Score:5, Insightful)

    by QuietLagoon ( 813062 ) on Thursday June 14, 2012 @09:11PM (#40331163)
    What did you expect him to say... that an Intel product was not suitable for the mobile marketplace? That would have been career suicide for him. He is singing from the Intel songbook. Those songs may not be sung with what is best for the customer in mind.
    • At least he settled one thing quite clearly. We need not hold off our purchases of a quad-core Android tablet like the new Nexus 7" tablets to be released soon, in hopes of getting a cool Intel Android tablet instead. Because they're not going Android on tablets anytime soon. He thinks tablets are for Windows. BWaaaa hahaha.
    • Exactly. Intel seems like a great company with intelligent engineers, but look how long it's taken for them to come even close to ATI or Radeon discrete graphics. They're not going to be in the cell phone game anytime soon.

      And an intel based iphone? Not soon. Maybe in a few years. MAYBE. I'll believe it when I see it.

      • To be fair, Intel's bread and butter is their CPU, not the GPU. Obviously someone who specializes in GPUs should have the edge.
      • Given that Intel's mobile graphics strategy has simply been 'license the same stuff from PowerVR as most of the ARM licencees that don't have an in-house design' there doesn't seem to be anything obviously uncompetitive about it.

        They aren't going to pull any design wins on the strength of their GPU; because it's the same damn GPU as a number of others; but they also aren't going to be put out in the cold by it...
    • by yuhong ( 1378501 )

      Funny that Intel once made the XScale ARM processor.

  • by busyqth ( 2566075 ) on Thursday June 14, 2012 @09:19PM (#40331219)
    Intel spent many years chasing performance with little thought of power draw.
    Now they are putting all their engineering muscle into minimizing power requirements, while maintaining high performance.
    I don't see any reason to think they won't succeed, and if they do, then ARM will end up a niche architecture.
    • by Sir_Sri ( 199544 )

      They worried a lot about power draw and leakage current. They were just worried about somewhat arbitrary targets of 45, 65 and 130 W TDP. If you give their engineers an equally arbitrary 4.5W power envelope they'll work on that.

      The thing for intel has always been that the easiest way to reduce power consumption is a die shrink. Which it is. If they can stay one node ahead of the competition and transistor for transistor match performance more or less they'll have a big advantage. And as you say, they'

    • A niche that already sells more CPUs per year than Intel does. The high end computing market such as the desktop, smart phones, netbooks, tablets, those are just a fraction of the total CPUs sold. Every automobile has at least one CPU now, every home is going to have a CPU or two in the electric meter (even dumb ones), every microwave oven has one, every new appliance will have one, etc. Even your phone will have one front end CPU for the display and apps but probably a couple behind the scenes CPUs to do

    • by gweihir ( 88907 )

      Already too late. Intel is about 2 decades late and it will take even them a long time to catch up. Also note that for devices running a Linux kernel, using ARM is not that much effort and is already well known to the developers, so these people do not need x86 for anything. The only people that would desperately need x86 with low power is Microsoft, because they have this basically x86-only monster of an OS, just look at all the limitations of Win8 on ARM.

  • He's mostly right (Score:5, Insightful)

    by Erich ( 151 ) on Thursday June 14, 2012 @09:19PM (#40331221) Homepage Journal
    All those scalar processors look the same. You can trade energy efficiency for performance and end up with a lower power processor that's a lot slower. When you push the performance, the architecture doesn't matter as much, because most of the energy is spent figuring out what to run and when to run it.

    Compounding this fact, ARM isn't that great of an architecture. It's got variable length instructions, not enough registers, microcoded instructions, and a horrible, horrible virtual memory architecture.

    The big thing that ARM has is the licensing model. ARM will give you just about everything you need for a decent applications SOC. Processor, bus, and now even things like GPU and memory controllers. Sprinkle in your own company's special sauce, and you have a great product. All they ask is for a little bit of royalty money for every chip you sell. And since everyone is using pretty much the same ARM core, the tools and "ecosystem" are pretty good.

    But there's not much of an advantage to the architecture... the advantage is all in the business model, where everyone can license it on the cheap and make a unique product out of it.

    And nowadays, the CPU is becoming less important. It's everything around it -- graphics, video, audio, imaging, telecommunications -- is what makes the difference.

    • Re: (Score:3, Interesting)

      by Anonymous Coward

      Phoronix just did an article on 6 clustered Panda boards (Cortex A9) [phoronix.com] VS the other guys. It's worth a read.

    • Re:He's mostly right (Score:5, Informative)

      by Darinbob ( 1142669 ) on Thursday June 14, 2012 @10:22PM (#40331505)

      ARM has fixed length instructions. Thumb is a separate instruction set from ARM and is also a fixed size set. You can't easily interchange ARM and Thumb without making a function call. There is Thumb 2 that interchanges them more easily now. However the instruction set decoder for Thumb to ARM is so very very simple that it could even be a standard project in an undergrad CS class. Thumb really is for people who are willing to give up some performance to save space anyway. ARM has plenty of registers compared to Thumb. I think it has the sweet spot of 16 registers which is enough to not feel cramped but not so many that context switching or interrupts get in your way. ARM is not micro-coded in any model as far as I know, it is RISC and there's no reason to do any micro-coding (maybe in an FPU coprocessor?).

      It does have a goofy MMU at times, but this is treated as a separate coprocessor and is not intrinsic to the ARM (a different ARM system-on-chip will handle memory mapping and VM differently, it is not standardized).

      • by pitchpipe ( 708843 ) on Thursday June 14, 2012 @11:50PM (#40331925)

        You can't easily interchange ARM and Thumb without making a function call.

        ARMs weakness lies in the ELBOW implementation. Whereas Thumb is opposable to 4finGer which some see as a strength, but I find that the pinKey shadow architecture complements Thumb nicely with hAnd holding the whole set together in a CRISP burrito.

    • by fermion ( 181285 )
      I see it this way. As processor cycles and memory became cheaper, it became less economical to pay humans to write efficient code. It also frees up cycles that can drive all the eye candy in the modern OS. There is a limit to this as we saw with MS Vista Aero. People are not going to pay just for eye candy. The purpose of a faster processor is to reduce the overall cost.

      I think the ARM revolution is greater than the license issue. I think it has to do with minimizing cost of the total product that

    • And nowadays, the CPU is becoming less important. It's everything around it -- graphics, video, audio, imaging, telecommunications -- is what makes the difference.

      The CPU gets important again when you start multiplying cores.

      Nice post.

  • by romanval ( 556418 ) on Thursday June 14, 2012 @09:21PM (#40331233)
    ARM works because 1) it's good enough while being 2) cheap enough. As far as I know, ARM is getting license royalties in the pennies per chip or SoC core using their design. However much better Intel can make their low power x86 CPUs, it's going to have to compete with dozens of foundries churning out millions of ARM devices when it comes to pricing... and that's where I see Intel having a hard time.
    • Actually, they answer that in the article. He claims that, even if Intel chips *are* more expensive, a) the price of the processor is pretty much negligible compared to the price of the full unit (particularly the screen), and b) the performance advantage is worth the cost.

      And he kind of has a point. The Raspberry Pi has been described as "a smartphone minus the screen". It's $25-$35. A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor.

      • Yes, a smartphone without the screen, GSM radio, WCDMA radio, Bluetooth, WiFi, GPS, battery, case... You can buy a 700 MHz smartphone for $100.
      • A smartphone is in the range of $300-$600. Order of magnitude difference, and that's not because of the processor.

        Smartphone prices are overdue for a precipitous drop. And ten times better battery life would be nice.

    • ARM works because 1) it's good enough while being 2) cheap enough.

      I think that you are totally right about this. Maintaining x86 compatibility may hurt Intel a little, but it's not the key issue.

      ARM-based SoCs cost under $10 in volume, and Intel simply cannot compete in that space. It doesn't want to. It likes large prices and huge profit margins.

      Meanwhile, ARM keeps improving the performance of their cores, while the SoC manufacturers keep improving the capabilities of their SoCs, including (critically)

  • by White Flame ( 1074973 ) on Thursday June 14, 2012 @09:30PM (#40331279)

    From Intel: Work done per watt
    From ARM: System power draw small enough for handheld & long battery life

    A year or two ago, I read a study that the most ops/watt were still done by high-end Intel processors sucking tons of power each. They did so much work so fast that the per-watt work done was still beyond the tiny-power-sipping ARMs that were relatively slow but still quite capable. Has this changed in the last generation or two of CPUs?
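
The two metrics really are different things, and a chip can win one while losing the other. A quick sketch with entirely hypothetical numbers (neither `Chip` instance is a real part, and the 5 Wh battery is just a phone-sized example):

```python
from dataclasses import dataclass

@dataclass
class Chip:
    name: str
    ops_per_second: float
    watts: float

    @property
    def ops_per_joule(self) -> float:
        """Work done per unit of energy (ops/s divided by W)."""
        return self.ops_per_second / self.watts

    def hours_on_battery(self, watt_hours: float) -> float:
        """Runtime at full load on a battery of the given capacity."""
        return watt_hours / self.watts

# Hypothetical figures chosen only to illustrate the trade-off.
big = Chip("hypothetical desktop x86", ops_per_second=100e9, watts=100.0)
small = Chip("hypothetical phone ARM", ops_per_second=1.5e9, watts=2.0)

for chip in (big, small):
    print(chip.name, chip.ops_per_joule, chip.hours_on_battery(5.0))
```

With these made-up numbers the big chip does more work per joule, yet it would drain a 5 Wh battery in minutes, while the small one runs for hours. Which metric matters depends on whether you are buying a server rack or a handset.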

  • by Required Snark ( 1702878 ) on Thursday June 14, 2012 @10:38PM (#40331579)
    The ARMed camp has intrinsic advantages over INTEL.

    They don't have all the legacy instruction set issues to deal with. Intel must be backward compatible with all previous versions. Remember, the 8080 subset is still alive and well in the INTEL architecture. This comes with a cost.

    It's easier to move up from a lower power system to a higher power system. In this context power can be thought of as both electrical power consumption and as compute power. Moving down means something must be simplified/eliminated, and the backwards compatibility issues make this much harder.

    When it comes to mobile devices, ARM owns the market and has the network effect working for it. This is how INTEL kept a stranglehold on the PC market, but it works against them for mobile.

    ARM is not monolithic in the same way as INTEL. Because of the license based IP model, there are many more variations of ARM chips than INTEL chips. The resources to make variations come from the IP user base, not from ARM. A single company, no matter how dominant, cannot afford to support that many variants. If some of the versions fail, the cost is not borne by ARM. If INTEL guesses wrong and makes a dud, they have to absorb the cost.

    INTEL is no pushover, but I think ARM has the advantage.

  • Caveat lector (Score:4, Interesting)

    by gweihir ( 88907 ) on Thursday June 14, 2012 @11:03PM (#40331707)

    Simply put, as Intel has no standing in the ARM market (and AMD has now), Intel has every motivation to distort the facts.

    That said, there are indications that while x86 is not in principle more power-hungry than ARM, in practice, on silicon, it is today. The main reason is that it requires more chip area and more complex circuitry, which in practice leads to higher power consumption because of communication and signal distribution overheads and because complex circuits are far harder to optimize, not only for power consumption. Again, that does not mean that in principle it is infeasible. But note that larger chip area is also a strong argument against x86 if size matters.

    There is also the fact that low-power ARM is more energy efficient than low-power x86 when you look at the market. So maybe this person is just saying that Intel messed up and failed to make good low-power x86 implementations while ARM did not. Looking back at power-disasters like the P4, this would be plausible as well. If, on the other hand, I look at CPUs like the AMD LX800 x86 offering, (e.g. used in the Alix boards), these are pretty power efficient and may even get into ARM ranges. They are pretty slow at full load though and have a large chip area.

    So my impression is that the Intel person just said that while they do not have any offering comparable to ARM, it is their fault and not a fundamental problem of x86. I am unsure this is right, although I certainly agree that Intel does not have a leg to stand on in the market for power-efficient CPUs.

    • by dkf ( 304284 )

      Simply put, as Intel has no standing in the ARM market (and AMD has now), Intel has every motivation to distort the facts.

      Did you know that Intel used to make ARM processors (StrongARM, XScale)? And that they are (probably) still an ARM licensee?

  • Me thinks Sgt Schultz doth protest too much. Since my first post here in the early days of URL speak-and-spell, I've propounded that the disadvantages of x86 to RISC in performance were almost entirely illusory (brazen bubbles in the fabric of reality now feeding the worms notwithstanding).

    That said, on the power front, x86 bites. Possibly it bites like an undershot chihuahua in some small way that a billion dollars of doggy dentistry could adequately rectify—but it most certainly bites. Jumbles of
