
Research Shows RISC vs. CISC Doesn't Matter 161

Posted by timothy
from the just-a-couple-of-letters dept.
fsterman writes The power advantages brought by the RISC instruction sets used in Power and ARM chips are often pitted against the x86's efficiencies of scale. It's difficult to assess how much the difference between instruction sets matters, because teasing out the theoretical efficiency of an ISA from the proficiency of a chip's design team, the technical expertise of its manufacturer, and support for architecture-specific optimizations in compilers is nearly impossible. However, new research examining the performance of a variety of ARM, MIPS, and x86 processors gives weight to Intel's conclusion: the benefits of a given ISA to the power envelope of a chip are minute.
  • by alen (225700) on Thursday August 28, 2014 @09:09AM (#47773753)

I've read that the legacy x86 instructions were virtualized in the CPU a long time ago, and modern Intel processors are effectively RISC cores that translate x86 instructions internally.

    • by Z80a (971949) on Thursday August 28, 2014 @09:13AM (#47773771)

As far as I'm aware, since the Pentium Pro line the Intel CPUs have been RISCs with translation layers, and AMD has been on this boat since the original Athlon.

Actually AMD did that way back in the K5 time. The K5 was a 29k RISC processor with an x86 frontend.

        • by Z80a (971949)

Interesting.
I was thinking it was the Athlon that was the 29k.

          • by cheesybagel (670288) on Thursday August 28, 2014 @09:57AM (#47774121)

After AMD lost the license to manufacture Intel i486 processors, together with other people, they were forced to design their own chip from the ground up. So they basically took one of the 29k RISC processors and put an x86 frontend on it. Cyrix did more or less the same thing at the time, also coming up with their own design. Since the K5 had good performance per clock but could not clock very high and was expensive, AMD was stuck; to get their next processor they bought a company called NexGen, which had designed the Intel-compatible Nx586. AMD then worked on the successor of the Nx586 as a single chip, which became the K6. The K7 Athlon was yet another design, made by a team headed by Dirk Meyer, who used to be a chip designer at Digital Equipment Corporation, i.e. DEC. He was one of the designers of the Alpha series of RISC CPUs, and the Athlon resembles an Alpha chip internally a lot because of that.

Actually, NexGen was the first to do x86 -> RISC. Then Intel with the Pentium Pro. Then AMD with the K5. As far as I recall, Cyrix never did x86 -> RISC back then until they were acquired by VIA (i.e., Cyrix M series chips executed x86 directly, but VIA Epia and later translate).

Right, NexGen released their chip first, but it is hard to compare because it was a dual-chip solution where the FPU came in a separate package, while the K5 had its own FPU. NexGen only did a single-chip product later. The K5 development process was highly protracted and difficult, and the release was delayed many times. The Cyrix 5x86 was also similar in a lot of regards to the Pentium Pro. In fact I remember the Pentium Pro designer himself stating that they had a lot of interesting insights during Pentium Pro c

          • When the DEC Alpha was killed, many of the engineers were picked up by AMD.

      • by Anonymous Coward on Thursday August 28, 2014 @09:36AM (#47773941)

Yes. As noted by the study (which, by the way, isn't very good): "When every transistor counts, then every instruction, clock cycle, memory access, and cache level must be carefully budgeted, and the simple design tenets of RISC become advantageous once again."
Essentially meaning: "If you want as few transistors as possible, it doesn't help to have the CISC-to-RISC translation layer in x86."

They also claim things like "The report notes that in certain, extremely specific cases where die sizes must be 1-2mm2 or power consumption is specced to sub-milliwatt levels, RISC microcontrollers can still have an advantage over their CISC brethren," which clearly indicates that their idea of "embedded" systems is limited to smartphones.
A battery that can't be recharged on a daily basis is hardly an extremely specific case. Not that any CPU they tested is suitable for those applications anyway. They have essentially limited themselves to applications where "not as bad as P4" is acceptable.

You make a lot of good points. But any time you create a processor with cache-miss prediction as well as branch prediction and execution parallelization, the instruction set has almost no effect at all.

You're suggesting that the instruction set interpretation is a separate unit from the out-of-order execution core. In reality, the first stage of any highly optimized modern CPU able to minimize pipeline misses must actually "recompile the code" (an oversimplification) in hardware before the actual execut
          • by Darinbob (1142669)

ARM is RISC through and through, though it complicates things somewhat with multiple simple instruction sets. The basic ARM ISA is all 32-bit instructions, very much RISC from every angle you look at it, every bit as pure as MIPS. The ARM Thumb ISA is 16-bit instructions only, and the machine translation from Thumb to ARM is very simple, just a fraction of the chip. Thumb2 gets slightly more complex, allowing 16- and 32-bit instructions intermixed, but again it's not that complicated. It's just RISC like a

            • by unixisc (2429386)
Nonetheless, they've lost some major markets - game consoles first to PPC and later Intel, and then the embedded market to ARM.
    • Re: (Score:2, Interesting)

      by Anonymous Coward

x86 instructions are, in fact, decoded to micro-opcodes, so the distinction isn't as useful in this context.

      • by RabidReindeer (2625839) on Thursday August 28, 2014 @09:42AM (#47773997)

x86 instructions are, in fact, decoded to micro-opcodes, so the distinction isn't as useful in this context.

        They're not the only ones. The IBM mainframes have long been VMs implemented on top of various microcode platforms. In fact, one of the original uses of the 8-inch floppy disk was to hold the VM that would be loaded up during the Initial Microprogram Load (IMPL), before the IPL (boot) of the actual OS. So in a sense, the Project Hercules mainframe emulator is just repeating history.

        Nor were they unusual. In school I worked with a minicomputer which not only was a VM on top of microcode, but you could extend the VM by programming to the microcode yourself.

The main differences between RISC and CISC, as I recall, were lots of registers and the simplicity of the instruction set. Both the Intel and zSeries CISC instruction sets have lots of registers, though. So the main difference between RISC and CISC would be that you could - in theory - optimize "between" the CISC instructions if you coded RISC instead.

        Presumably somebody tried this, but didn't get benefits worth shouting about.

        Incidentally, the CISC instruction set of the more recent IBM z machines includes entire C stdlib functions such as strcpy in a single machine-language instruction.

        • by Guy Harris (3803) <guy@alum.mit.edu> on Thursday August 28, 2014 @03:55PM (#47778789)

          They're not the only ones. The IBM mainframes have long been VMs implemented on top of various microcode platforms.

          But the microcode implemented part or all of an interpreter for the machine code; the instructions weren't translated into directly-executed microcode. (And the System/360 Model 75 did it all in hardware, with no microcode).

          And the "instruction set" for the microcode was often rather close to the hardware, with extremely little in the way of "instruction decoding" of microinstructions, although I think some lower-end machines might have had microinstructions that didn't look too different from a regular instruction set. (Some might have been IBM 801s [wikipedia.org].)

          So that's not exactly the same thing as what the Pentium Pro and successors, the Nx586, and the AMD K5 and successors, do.

          Currently mainframe processors, however, as far as I know 1) execute most instructions directly in hardware, 2) do so by translating them into micro-ops the same way current x86 processors do, and 3) trap some instructions to "millicode", which is z/Architecture machine code with some processor-dependent special instructions and access to processor-dependent special registers (and, yes, I can hear the word PALcode [wikipedia.org] being shouted in the background...). See, for example, " A high-frequency custom CMOS S/390 microprocessor" [ieee.org] (paywalled, but the abstract is free at that link, and mentions millicode) and "IBM zEnterprise 196 microprocessor and cache subsystem" [christianjacobi.de] (non-paywalled copy; mentions microoperations). I'm not sure those processors have any of what would normally be thought of as "microcode".

          The midrange System/38 and older ("CISC") AS/400 machines also had an S/360-ish instruction set implemented in microcode. The compilers, however, generated code for an extremely CISCy processor [ibm.com] - but that code wasn't interpreted, it was translated into the native instruction set by low-level OS code and executed.

          For legal reasons, the people who wrote the low-level OS code (compiled into the native instruction set) worked for a hardware manager and wrote what was called "vertical microcode" (the microcode that implemented the native instruction set was called "horizontal microcode"). That way, IBM wouldn't have to provide that code to competitors, the way they had to make the IBM mainframe OSes available to plug-compatible manufacturers, as it's not software, it's internal microcode. See "Inside the AS/400" [amazon.com] by one of the architects of S/38 and AS/400.

          Current ("RISC") AS/400s^WeServer iSeries^W^WSystem i^WIBM Power Systems running IBM i are similar, but the internal machine language is PowerPC^WPower ISA (with some extensions such as tag bits and decimal-arithmetic assists, present, I think, in recent POWER microprocessors but not documented) rather than the old "IMPI" 360-ish instruction set.

The main differences between RISC and CISC, as I recall, were lots of registers and the simplicity of the instruction set. Both the Intel and zSeries CISC instruction sets have lots of registers, though.

          Depends on which version of the instruction set and your definition of "lots".

          32-bit x86 had 8 registers (many x86 processors used register renaming, but they still had only 8 programmer-visible registers, and not all were as general as one might like), and they only went to 16 registers in x86-64. System/360 had 16 general-purpose registers (much more regular than x86, but that's not setting the bar all that high :-)), and that continues to z/Architecture, althoug

          • CISC ISAs may have individual "complex" instructions, such as procedure call instructions, string manipulation instructions, decimal arithmetic instructions, and various instructions and instruction set features to "close the semantic gap" between high-level languages and machine code, add extra forms of data protection, etc. - although the original procedure-call instructions in S/360 were pretty simple, BAL/BALR just putting the PC of the next instruction into a register and jumping to the target instruction, just as most RISC procedure-call instructions do. A lot of the really CISCy instruction sets may have been reactions to systems like S/360, viewing its instruction set as being far from CISCy enough, but that trend has largely died out.

I know you say "current", but one of the original ideas behind RISC was also to make each instruction "short", i.e. make each instruction take one cycle, and reduce cycle times as much as possible so that you could have really deep pipelines (MIPS) or increase clock speed. Now, while most "RISCs" today sort of follow this idea, by virtue of the ISA having been made with that in mind in the old days, i.e. load-store etc., they're typically not as strict about it (if they in fact ever were). I guess the CIS

          • by unixisc (2429386)
The AMD-64 architecture - is that also register limited? Or did AMD toss something like 32-64 program-accessible registers at the problem? And if they did, would Intel have limited theirs?
      • by perpenso (1613749) on Thursday August 28, 2014 @10:03AM (#47774185)

x86 instructions are, in fact, decoded to micro-opcodes, so the distinction isn't as useful in this context.

Actually it is. Modern performance tuning has a lot to do with cache misses and such, and CISC's denser encoding can fit more instructions into each cache line. The strategy of a hybrid design, a CISC external architecture with a RISC internal architecture, definitely has some advantages.

That said, the point of RISC was not solely execution speed. It was also simplicity of design, a simplicity that allowed organizations with less money and fewer resources than Intel to design very capable CPUs.

        • by Darinbob (1142669)

RISC came out when Intel was only doing tiny microchips; the RISC market was not competing with it. One of the advantages of CISC at the time was indeed that it was easy to implement: once you built the microarchitecture, most of the rest was microcoding the instructions and putting that into ROM. If you needed to add a couple of new instructions for the next release to stay competitive, it could be done very quickly (and you could patch your computers on the fly to get the new instructions too).

          Yes,

          • by perpenso (1613749)

            RISC came out when Intel was only doing tiny microchips, the RISC market was not competing with it.

            The reference CISC platform, the "competition", at the time was the VAX. Same arguments, different target.

          • Yes, simplicity of design was important, but the simplicity was to free up chip resources to use elsewhere, not to make it easier for humans to design it.

Well, yes. I think we're forgetting one of the main drivers for RISC, and that was making the hardware more compatible with what the then-current compilers could actually fruitfully use. Compilers couldn't (and typically didn't) actually use all the hideously complex instructions that did "a lot" and were instead hampered by lack of registers, lack of orthogonality, etc. So there was a concerted effort to develop hardware that fit the compilers, instead of the other way around, which had been the dominating p

            • by Darinbob (1142669)

Some of what you say is true. Register allocation was simple in RISC, but most of the competing CISC machines also had very orthogonal registers (PDP and VAX were the classic CISC machines; x86 isn't even in the picture yet). Also, some CISC machines were adding instructions that compilers had trouble using, often because the compilers had to fit in a small amount of memory.

However, many RISC machines required much more advanced compilers if you wanted optimization. I think for basic compilation RI

              • Much of the later complexity didn't exist in the late 70s.

Yes, I should have said that I put RISC as beginning with Hennessy & Patterson's work, which became MIPS and SPARC respectively. So we're a bit later than that. And of course when I said "compiler" I meant "optimizing compiler". Basic compilation, as you say, was not a problem on CISC, but everybody observed that the instruction set wasn't really used. I remember reading VAX code from the C compiler (on BSD 4.2) when I was an undergrad and noting that the enter/leave instructions weren't used. My betters

This. The x86 ISA is roughly analogous to ARM Thumb compressed instructions. It is just a front end to a register-rich RISC core.

      • by Darinbob (1142669)

I don't see it. The ARM Thumb instruction set is vastly simpler and more regular than even the 286 instruction set. Thumb is already a reduced instruction set. There are no special-purpose string instructions, and it has general-purpose registers that can be used as anything, whereas the 286 has only special-purpose registers (AX is the only accumulator, BX is the only base register, CX is the only register for counting instructions, etc.). Yes, SP and PC are special purpose, but that's true of all the early RISC mach

    • I have to assume the wisc.edu folks know this and somebody gummed up the headlines along the way.

    • by drinkypoo (153816) <martin.espinoza@gmail.com> on Thursday August 28, 2014 @09:34AM (#47773925) Homepage Journal

      That is correct. Every time this comes up I like to spark a debate over what I perceive as the uselessness of referring to an "instruction set architecture" because that is a bullshit, meaningless term and has been ever since we started making CPUs whose external instructions are decomposed into RISC micro-ops. You could switch out the decoder, leave the internal core completely unchanged, and have a CPU which speaks a different instruction set. It is not an instruction set architecture. That's why the architectures themselves have names. For example, K5 and up can all run x86 code, but none of them actually have logic for each x86 instruction. All of them are internally RISCy. Are they x86-compatible? Obviously. Are they internally x86? No, nothing is any more.

      • by hkultala (69204) on Thursday August 28, 2014 @10:17AM (#47774375)

        That is correct. Every time this comes up I like to spark a debate over what I perceive as the uselessness of referring to an "instruction set architecture" because that is a bullshit, meaningless term and has been ever since we started making CPUs whose external instructions are decomposed into RISC micro-ops. You could switch out the decoder, leave the internal core completely unchanged, and have a CPU which speaks a different instruction set. It is not an instruction set architecture. That's why the architectures themselves have names. For example, K5 and up can all run x86 code, but none of them actually have logic for each x86 instruction. All of them are internally RISCy. Are they x86-compatible? Obviously. Are they internally x86? No, nothing is any more.

        This same myth keeps being repeated by people who don't really understand the details on how processors internally work.

You cannot just change the decoder; the instruction set affects the internals a lot:

1) Condition handling is totally different in different instruction sets. This affects the backend a lot. x86 has a flags register; many other architectures have predicate registers, some with different conditions.

2) There are totally different numbers of general-purpose and floating-point registers. The register renamer makes this a smaller difference, but then there is the fact that most RISCs use the same registers for both FPU and integer, while x86 has separate registers for each. And this totally separates them: the internal buses between the register files and function units in the processor are done very differently.

3) Memory addressing modes are very different. x86 still does relatively complex address calculations in a single micro-operation, so it has more complex address calculation units.

4) Whether there are operations with more than 2 inputs, or more than 1 output, has quite a big impact on what kind of internal buses are needed and how many register read and write ports are needed.

5) There are a LOT of more complex instructions in the x86 ISA which are not split into micro-ops but handled via microcode. The microcode interpreter is totally missing on pure RISCs (but exists on some not-so-pure RISCs like Power/PowerPC).

6) The instruction set dictates the memory alignment rules. Architectures with stricter alignment rules can have simpler load-store units.

7) The instruction set dictates the multicore memory ordering rules. This may affect the load-store units, caches and buses.

8) Some instructions have different bitnesses in different architectures. For example x86 has N x N -> 2N wide multiply operations which most RISCs don't have, so x86 needs a bigger/different multiplier than most RISCs.

9) x87 FPU values are 80 bits wide (truncated to 64 bits when storing/loading). Practically all other CPUs have a maximum of 64-bit wide FPU values (though some versions of Power support 128-bit FP numbers too).

        • This same myth keeps being repeated by people who don't really understand the details on how processors internally work.

          Actually, YOU are wrong.

You cannot just change the decoder; the instruction set affects the internals a lot:

All the reasons you list could be "fixed in software". The fact that silicon designed by Intel handles opcodes in a way that's a little better optimized toward being fed from an x86-compatible frontend is just a specific optimisation. Simply doing the same stuff with another RISCy back-end, i.e. interpreting the same ISA fed to the front-end, will simply require each x86 instruction being executed as a different set of micro-instructions. (some that are handled as a single ALU opcode on Intel's si

          • by hkultala (69204)

            This same myth keeps being repeated by people who don't really understand the details on how processors internally work.

            Actually, YOU are wrong.

You cannot just change the decoder; the instruction set affects the internals a lot:

All the reasons you list could be "fixed in software".

No, they cannot. Or the software will be terribly slow, like a 2-10x slowdown.

The fact that silicon designed by Intel handles opcodes in a way that's a little better optimized toward being fed from an x86-compatible frontend is just a specific optimisation.

Opcodes are irrelevant. They are easy to translate. What matters are the differences in the semantics of the instructions.
X86 instructions update flags. This adds dependencies between instructions. Most RISC processors do not have flags at all.
This is semantics of instructions, and they differ between ISAs.

Simply doing the same stuff with another RISCy back-end, i.e. interpreting the same ISA fed to the front-end, will simply require each x86 instruction being executed as a different set of micro-instructions. (some that are handled as a single ALU opcode on Intel's silicon might require a few more instructions, but that's about the difference).

The backend micro-instructions in x86 CPUs are different from the instructions in RISC CPUs. They differ in the smal

            • by DrYak (748999) on Thursday August 28, 2014 @02:48PM (#47778011) Homepage

All the reasons you list could be "fixed in software".

The quotes around "software" mean that I'm referring to the firmware/microcode as a piece of software designed to run on top of the actual execution units of a CPU.

No, they cannot. Or the software will be terribly slow, like a 2-10x slowdown.

              Slow: yes, indeed. But not impossible to do.

What matters are the differences in the semantics of the instructions.
X86 instructions update flags. This adds dependencies between instructions. Most RISC processors do not have flags at all.
This is semantics of instructions, and they differ between ISAs.

Yeah, I pretty well know that RISCs don't (all) have flags.
Now, again, how does that prevent the micro-code swap that drinkypoo refers to (and that was actually done on Transmeta's Crusoe)?
You'll just end up with a bigger, clunkier firmware that, for a given front-end instruction from the same ISA, will translate into a big bunch of back-end micro-ops.
Yup, a RISC's ALU won't update flags. But what's preventing the firmware from dispatching *SEVERAL* micro-ops? First one to do the base operation, then additional instructions to update some register emulating flags?
Yes, it's slower. But no, that doesn't make a micro-code-based change of supported ISA impossible, only less efficient.

The backend micro-instructions in x86 CPUs are different from the instructions in RISC CPUs. They differ in the small details I tried to explain.

Yes, and please explain how that makes it *definitely impossible* to run x86 instructions, and not merely *somewhat slower*.

Intel did this, they added an x86 decoder to their first Itanium chips. {...} But the performance was still so terrible that nobody ever used it to run x86 code, and then they created a software translator that translated x86 code into Itanium code, and that was faster, though still too slow.

              Slow, but still doable and done.

              Now, keep in mind that:
- Itanium is a VLIW processor. That's an entirely different beast, with an entirely different approach to optimisation, and back during Itanium development the logic was "the compiler will handle the optimising". But back then no such magical compiler existed, and anyway it didn't have the necessary information at compile time (some types of optimisation require information only available at run time, hence doable in microcode but not in a compiler).
Given the compilers available back then, VLIW sucked for almost anything except highly repetitive tasks. Thus it was a bit popular for cluster nodes running massively parallel algorithms (and at some point in time VLIW was also popular in Radeon GFX cards). But VLIW sucks for pretty much anything else.
(Remember that, for example, GCC has had auto-vectorisation and well-performing Profile-Guided Optimisation only since recently.)
So "supporting an alternate x86 instruction set on Itanium was slow" has as much to do with "supporting an instruction set on a back-end that's not tailored for the front-end is slow" as it has to do with "Itanic sucks for pretty much everything which isn't a highly optimized kernel function in HPC".

But still, it proves that running a different ISA on a completely alien back-end is doable.
The weirdness of the back-end won't prevent it, only slow it down.

Luckily, by the time the Transmeta Crusoe arrived:
- knowledge of how to handle VLIW had advanced a bit; Crusoe had a back-end better tuned to run a CISC ISA.

Then by the time Radeon arrived:
- compilers had gotten even better; GPUs are used for the same (only) class of tasks at which VLIW excels.

The backend of Crusoe was designed completely with x86 in mind; all the execution units contained the small quirks in a manner which made it easy to emulate x86 with it. The backend of Crusoe contains things like {...} All these were made to make binary translation from x86 eas

        • by Guy Harris (3803)

but then there is the fact that most RISCs use the same registers for both FPU and integer

          With the minor exceptions of Alpha [hp.com], PA-RISC 1.x [hp.com] and 2.0 [hp.com], POWER/PowerPC/Power ISA [power.org], MIPS [imgtec.com], and SPARC [sparc.org].

      • by enriquevagu (1026480) on Thursday August 28, 2014 @11:27AM (#47775259)

        This is why we use the terms "Instruction Set Architecture" to define the interface to the (assembler) programmer, and "microarchitecture" to refer to the actual internal implementation. ISA is not bullshit, unless you confuse it with the internal microarchitecture.

    • by morgauxo (974071)

      Does that mean that the Transmeta Crusoe wasn't anything special?

    • by Darinbob (1142669)

RISC was not supposed to be a religion, which is what it seems to have turned into. No one should even be arguing the point today because modern chips are so different from when the term 'RISC' was new. The whole premise behind RISC is being used extensively in modern CISC machines. The problem with trying to keep a RISC vs CISC debate alive is that it harms the education of the students.

RISC is primarily, at its core, about eliminating complex infrastructure where you can and reusing the resources for thi

  • by i kan reed (749298) on Thursday August 28, 2014 @09:10AM (#47773765) Homepage Journal

    Back when compilers weren't crazy optimized to their target instruction set, people coding things in assembler wanted CISC, and people using compilers wanted RISC.

    But nowadays almost no one still does the former, and the latter uses CISC chips a lot better.

    This is now a question for comp sci history, not engineers.

    • by TWX (665546) on Thursday August 28, 2014 @09:24AM (#47773835)
      You mean, my Github project on Ruby on Rails with node.js plugins isn't optimized to use one versus the other?
    • by Rich0 (548339)

      I actually wonder how relevant CISC even is to people doing assembly programming these days. There is no reason you can't pull an LLVM-like move and target something other than the native instruction set when you are programming.

      That is basically all the x86 does anyway - convert CISC instructions into microcode. There is no reason that an assembler couldn't do the same thing, further blurring the lines between assembly and compiled code. If the whole point of writing assembly is to optimize your code, a

      • Usually, if you're coding in assembly, it's because you're trying to bring some very basic core functionality into a system, be it an OS component, a driver, or a compiler, and usually that means that you're engaged in enough system-specific behaviors that virtualization does you no good.

        Java and .NET benefit from a virtual machine language precisely because they're high level languages, and it's easier to compile assembly to assembly in multiple ways than to compile high level languages to multiple assembl

        • easier to compile assembly to assembly in multiple ways than to compile high level languages to multiple assembly languages.

          In other words, they don't want to be bothered writing real compilers.

        • virtual machine language

          Don't mind me; I'm just twitching over here and fighting down the urge to vomit.

          • Oh no. A technology exists.

            Let me rephrase that. I cannot comprehend your objections.

            • "Virtual" anything means there's at least one layer of abstraction between the thing and anything the layperson would consider remotely close to the hardware. "Machine" would imply something that is inversely quite close to the hardware. To my ears, it sounds like saying "pure hybrid"...you can't be both at the same time.

              Maybe I'm mixing up (virtual (machine language)) and ((virtual machine) language). From the perspective of the Java/.NET compiler it conceptually resembles machine language but it sure does

              • by psmears (629712)

                I can see how Java being in a VM to begin with presents a similar model to running assembly on the actual machine but comparing the two in terms of efficiency and overhead is laughable. I was signalling my cognitive dissonance of conflating Java and assembly so directly.

                You are aware that there are CPUs capable of executing Java bytecode directly? [wikipedia.org] I.e. that use Java bytecode as (one of) their native assembly instruction set(s)?
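                The stack-machine execution model that such bytecode chips implement can be sketched in a few lines. The opcode mnemonics below mirror real JVM instruction names, but the tuple encoding and the interpreter itself are invented for this sketch, not how any actual chip or JVM works internally:

                ```python
                # A toy stack-machine interpreter for a few JVM-style bytecodes.
                # Opcode names (iconst, iadd, imul, ireturn) echo the JVM spec;
                # the (op, *args) tuple encoding is made up for illustration.

                def run(bytecode):
                    stack = []
                    for op, *args in bytecode:
                        if op == "iconst":      # push an int constant
                            stack.append(args[0])
                        elif op == "iadd":      # pop two ints, push their sum
                            b, a = stack.pop(), stack.pop()
                            stack.append(a + b)
                        elif op == "imul":      # pop two ints, push their product
                            b, a = stack.pop(), stack.pop()
                            stack.append(a * b)
                        elif op == "ireturn":   # return top of stack
                            return stack.pop()
                        else:
                            raise ValueError(f"unknown opcode {op}")

                # (2 + 3) * 4
                program = [("iconst", 2), ("iconst", 3), ("iadd",),
                           ("iconst", 4), ("imul",), ("ireturn",)]
                print(run(program))
                ```

                A chip that executes bytecode directly is essentially doing this dispatch in hardware instead of in a software loop.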

    • by Nyall (646782) on Thursday August 28, 2014 @09:50AM (#47774071) Homepage

      I think a large part of the confusion is that CISC often means accumulator architectures (x86, z80, etc.) vs. RISC, which means general-purpose-register architectures (ppc, sparc, arm, etc.). In between you have variable-width RISC like Thumb-2.

      As an occasional assembly programmer (PowerPC currently) I far prefer these RISC instructions. With x86 (12+ years ago) I would spend far more instructions juggling values into the appropriate registers, then doing the math, then juggling the results out so that more math could be done. With RISC, especially with 32 GPRs, that juggling is nearly eliminated outside the prologue/epilogue. I hear x86 kept taking on more instructions and that AMD64 made it a more GPR-like environment.

      -Samuel

      • by mlts (1038732)

        Even though Itanium is all but dead, I did like the fact that you had 128 GP registers to play with. One could do all the loads in one pass, do the calculations, then toss the results back into RAM. The amd64 architecture is a step in the right direction, and I'd say that even though it was considered a stopgap measure at the time, it seems to have been well thought out.

      • A fully CISC architecture would allow both operands of an instruction to be pointers to memory; the CPU would have the circuitry to load values into registers in the background. That would eliminate manual register loading. You would also have 3-operand versions of the instructions as well. x86 is not the most CISC architecture out there and doesn't take the concept as far as it could go. This would be a very programmer-friendly environment. AMD expanded the number of registers, they could add eve
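      The instruction-count difference between a load/store ISA, x86's one-memory-operand style, and a fully memory-to-memory CISC can be made concrete with a toy comparison. The mnemonics below are illustrative sketches, not taken from any real assembler manual:

      ```python
      # Instruction sequences for "a = b + c" under three ISA models.
      # Mnemonics are illustrative; no real assembler syntax is implied.

      SEQUENCES = {
          # RISC load/store: operands must first be moved into registers
          "load-store (RISC)": [
              "lw   r1, b", "lw   r2, c", "add  r3, r1, r2", "sw   r3, a",
          ],
          # x86-style: at most one memory operand, 2-operand destructive form
          "one mem operand (x86-like)": [
              "mov  eax, [b]", "add  eax, [c]", "mov  [a], eax",
          ],
          # Fully orthogonal CISC: 3-operand, memory-to-memory in one shot
          "mem-to-mem (full CISC)": [
              "add  [a], [b], [c]",
          ],
      }

      for model, seq in SEQUENCES.items():
          print(f"{model}: {len(seq)} instruction(s)")
      ```

      Fewer, denser instructions are exactly the CISC trade: the register juggling doesn't disappear, it just moves from the programmer into the decode and load/store machinery.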

    • by mlts (1038732)

      With Moore's law flattening out, the pendulum might end up swinging back that way.

      Right now, for a lot of tasks, we have CPU to burn, so the ISA doesn't really matter as much as it did during the 680x0 era.

      But who knows... Rock's law may put the kibosh on Moore's law eventually, so we might end up seeing speed improvements ending up being either better cooling (so clock speeds can be cranked up), or adding more and more special purpose cores [1]. At this point, it might be that having code optimized by a c

  • by Anonymous Coward

    The CPU ISA isn't the important aspect. Reduced power consumption mostly stems from not needing a high end CPU because the expensive tasks are handled by dedicated hardware. What counts as top of the line ARM hardware can barely touch the processing power of a desktop CPU, but it doesn't need to be faster because all the bulk processing is handled by graphics cores and DSPs. Intel has for a long time tried to stave off the barrage of special purpose hardware. The attempts to make use of ever more general pu

  • by nimbius (983462) on Thursday August 28, 2014 @09:35AM (#47773935) Homepage
    as a greybeard I remember when choosing Intel over Sun meant the project wasn't completed on time, and your electrical/mechanical engineering group lived in the breakroom while their jobs chugged along. Intel was a toy train compared to the power you'd get with RISC. However, I can somewhat confidently say the RISC/CISC battle is moot these days because x86 has largely caught up to POWER, SPARC, and others. A competent argument could be made, however, that if it weren't for AMD, most servers would probably still be running some flavour of RISC. The foolhardy nature of Sun and SGI can also be argued as a cause of their demise, but I'll not flame. Intel wouldn't have bothered to get off their duff without a poke in the ribs from AMD; they had partnerships with RISC manufacturers anyhow and their own RISC-ish processor called Itanium. Outside of performance, though, there is another reason people stick with POWER and others just as they have in the past: lock-in.

    you see, applications like Oracle Business Objects and JD Edwards come with a quid pro quo of exacting standards to which most businesses must adhere, namely IBM or Sun/Oracle hardware. You may only need accounting and payroll, but you'll have to clear a corner of the room for the circus to set up their hardware and make sure everything is "just so." Their hope is that their quiet mandate becomes your quiet mandate, and before you know it other systems that interact with JDE are now required to be POWER-based because "that's what runs JDE." The only way out of this is to realize that any business that doesn't explicitly do payroll or metrics for profit doesn't need the kind of horsepower decreed by things like SAP.
    • Intel has had several RISC chips (e.g. the i960), but Itanium is VLIW (i.e. neither RISC nor CISC).

      Certainly it can be argued that AMD is responsible for the current dominance of x86, but it's also true that none of the other architectures were cheap enough for the general populace to adopt; hence the abundance of ARM nowadays, while POWER and SPARC are currently on life support

  • by bzipitidoo (647217) <bzipitidoo@yahoo.com> on Thursday August 28, 2014 @09:52AM (#47774093) Journal

    This study looks seriously flawed. They throw up their hands at doing a direct comparison of architectures: they test extremely complicated systems and merely do their best to beat down and control all the factors that introduces. One of the basic principles of a scientific study is that independent variables are controlled. It's very hard to say how much the instruction set architecture matters when you can't tell what pipelining, out-of-order execution, branch prediction, speculative execution, caching, register shadowing, and so on are doing to speed things up. An external factor that could influence the outcome is temperature: maybe one computer was in a hotter corner of the test lab than the other, and had to spend extra power just overcoming the higher resistance that higher temperatures cause.

    It might have been better to approach this from an angle of simulation. Simulate a more idealized computer system, one without so many factors to control.

  • RISC architecture is going to change everything.

  • They are seriously comparing a 90nm process with Intel's much better 32nm and 45nm processes.

    They have taken some random cores made on random (and incomparable) manufacturing technologies, thrown a couple of benchmarks at them, and tried to declare universal results from that.

    A few facts about the benchmark setup and the cores:

    1) They use an ancient version of GCC. ARM suffers from this much more than x86.
    2) Bobcat is a relatively balanced core with no bad bottlenecks. Its mfg tech is cheap, not high performance but relat

    • by WorBlux (1751716)
      Yes, the Ingenic MIPS chips are a good example of modern MIPS design, with a quad core at 1GHz coming in at a one-watt TDP. The Loongson 3B probably could have beaten the i7 on some workloads if it had come out on schedule and on a modern process. Now that Intel has OpenCL support it's less likely, as half the cores in the Loongson 3 series are/were supposed to be dedicated vector processing units. Another interesting comparison would be the MIPS Tilera design vs. Intel's x86 Knights/table Many Integrated Cores (x86-
  • 20 years ago, RISC vs CISC absolutely mattered. The x86 decoding was a major bottleneck and transistor budget overhead.

    As the years have gone by, the x86 decode overhead has been dwarfed by the overhead of other units like functional units, reorder buffers, branch prediction, caches etc. The years have been kind to x86, making the x86 overhead appear like noise in performance. Just an extra stage in an already long pipeline.

    All of which paints a bleak picture for Itanium. There is no compelling reason to keep Itanium alive other than existing contractual agreements with HP. SGI was the only other major Itanium holdout, and they basically dumped it long ago. And Itaniums are basically just glorified space heaters in terms of power usage.

    • by trparky (846769)
      You can't add too many stages to the pipeline or you end up with the Intel NetBurst Pentium 4 Prescott mess again. It had a horrendously long 31-stage pipeline. Can you say... branch prediction failure? The damn thing produced more branch prediction failures than actual work. That's essentially why NetBurst was completely scrapped and why they went back to the drawing board.
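      The cost of a deep pipeline is easy to estimate with the standard back-of-the-envelope CPI formula. The branch frequency and miss rate below are illustrative assumptions, not Prescott measurements, and the flush penalty is crudely approximated as the full pipeline depth:

      ```python
      # Rough effect of pipeline depth on branch-misprediction cost.
      # Assumes (hypothetically) that a misprediction flushes the whole
      # front end, costing about the pipeline depth in cycles; real
      # penalties vary with where in the pipe the branch resolves.

      def effective_cpi(base_cpi, branch_frac, miss_rate, flush_cycles):
          """Average cycles per instruction with mispredictions folded in."""
          return base_cpi + branch_frac * miss_rate * flush_cycles

      # Assumed workload: 20% branches, 5% mispredicted, base CPI of 1.0
      short_pipe = effective_cpi(1.0, 0.20, 0.05, 14)  # P6-style depth
      long_pipe  = effective_cpi(1.0, 0.20, 0.05, 31)  # Prescott-style depth

      print(f"14-stage: {short_pipe:.2f} CPI, 31-stage: {long_pipe:.2f} CPI")
      ```

      Under these toy numbers the deeper pipe pays more than twice the misprediction tax per instruction, which is why NetBurst leaned so hard on clock speed to compensate.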
    • As the years have gone by, the x86 decode overhead has been dwarfed by the overhead of other units like functional units, reorder buffers, branch prediction, caches etc. The years have been kind to x86, making the x86 overhead appear like noise in performance. Just an extra stage in an already long pipeline.

      And all that long pipeline takes power to run (recently this argument comes up in discussions of mobile devices more than in the server room, because battery life is hugely important, and ARM in the server room is still a joke). ARM chips sometimes don't even have a cache, let alone reorder buffers and branch prediction. When you remove all that stuff, the ISA becomes more important in terms of power consumption.

      Of course, as someone else pointed out, they were comparing 90nm chips to 32nm and 45nm. Why tha

    • All of which paints a bleak picture for Itanium. There is no compelling reason to keep Itanium alive other than existing contractual agreements with HP. SGI was the only other major Itanium holdout, and they basically dumped it long ago. And Itaniums are basically just glorified space heaters in terms of power usage.

      Itanium was dead on arrival.

      It ran existing x86 code much slower. So if you wanted to move up to 64bit (and use Itanium to get there), you had to pay a lot more for your processors, just t
      • by unixisc (2429386)

        Itanium was first conceived as a VLIW CPU. As its development progressed, it was found that the real estate savings from moving everything into the compiler were minimal, while in the meantime the compiler was a bitch to write. Also, under the original VLIW vision, software would need to be recompiled for every new CPU, which could be a dream for the GNU world, which requires the availability of source code, but practically a bitch for the real world

        Today's Itanium, unlike Merced, is now more

    • by Curate (783077)

      All of which paints a bleak picture for Itanium.

      Wow, that is a rather bold prediction to be making in 2014. If Itanium does eventually start to falter in the marketplace, then you sir are a visionary.

  • As mentioned over many years of Slashdot posts, x86 as a hardware instruction set no longer truly exists, and its decoding represents a fraction of the overall die space. The real bread and butter of CPU architecture and trade secrets rests in the microcode that is unique to every generation or edition of a processor. Today all Intel processors are practically RISC.

  • by enriquevagu (1026480) on Thursday August 28, 2014 @11:38AM (#47775417)

    It is really surprising that neither the linked ExtremeTech article nor the Slashdot summary cites the original source. This research was presented at HPCA'13 [carch.ac.cn] in a paper titled "Power Struggles: Revisiting the RISC vs. CISC Debate on Contemporary ARM and x86 Architectures", by Emily Blem [wisc.edu] et al, from the University of Wisconsin's Vertical Research group [wisc.edu], led by Dr. Karu Sankaralingam. You can find the original conference paper [wisc.edu] on their website.

    The ExtremeTech article indicates that there are new results for some additional architectures (MIPS Loongson and AMD processors were not included in the original HPCA paper), so I assume they have published an extended journal version of this work, which is not yet listed on their website. Please add a comment if you have a link to the new work.

    I do not have any relation with them, but I knew the original HPCA work.

  • These days the instruction set matters less than the underlying chip architecture and customization. Here ARM has an advantage in that their business model allows for a higher degree of customization. While some companies can work with Intel or AMD on their designs, for the most part ARM allows licensees to change the design as much as they need, depending on the licensing.
  • Intel funded research to show their processors are not also-rans in the tablet market, so manufacturers could feel comfortable using Intel chips in tablets?
  • And actually so is RISC to a degree on POWER processors.

    Back in the 80's going RISC was a big deal. It simplified decode logic (which was a more appreciable portion of the circuit area), reduced the number of cycles and logic area necessary to execute an instruction, and was more amenable (by design) to pipelining. But this was back in the days when CISC processors actually directly executed their ISAs.

    Today, CISC processors come with translation front-ends that convert their external ISA into a RISC-like
