


RISC Vs. CISC Is the Wrong Lens For Comparing Modern x86, ARM CPUs (extremetech.com)
Long-time Slashdot reader Dputiger writes: Go looking for the difference between x86 and ARM CPUs, and you'll run into the idea of CISC versus RISC immediately. But 40 years after the publication of David Patterson and David Ditzel's 1981 paper, "The Case for a Reduced Instruction Set Computer," CISC and RISC are poor top-level categories for comparing these two CPU families.
ExtremeTech writes:
The problem with using RISC versus CISC as a lens for comparing modern x86 versus ARM CPUs is that it takes three specific attributes that matter to the x86 versus ARM comparison — process node, microarchitecture, and ISA — crushes them down to one, and then declares ARM superior on the basis of ISA alone. The ISA-centric argument acknowledges that manufacturing geometry and microarchitecture are important and were historically responsible for x86's dominance of the PC, server, and HPC market. This view holds that when the advantages of manufacturing prowess and install base are controlled for or nullified, RISC — and by extension, ARM CPUs — will typically prove superior to x86 CPUs.
The implementation-centric argument acknowledges that ISA can and does matter, but that historically, microarchitecture and process geometry have mattered more. Intel is still recovering from some of the worst delays in the company's history. AMD is still working to improve Ryzen, especially in mobile. Historically, both x86 manufacturers have demonstrated an ability to compete effectively against RISC CPU manufacturers.
Given the reality of CPU design cycles, it's going to be a few years before we really have an answer as to which argument is superior. One difference between the semiconductor market of today and the market of 20 years ago is that TSMC is a much stronger foundry competitor than most of the RISC manufacturers Intel faced in the late 1990s and early 2000s. Intel's 7nm team has got to be under tremendous pressure to deliver on that node.
Nothing in this story should be read to imply that an ARM CPU can't be faster and more efficient than an x86 CPU.
Slow news year (Score:2)
Somebody at ExtremeTech got tired of bitching about chip shortages and put together a word salad about instruction sets.
Re:Slow news year (Score:4, Interesting)
TFA is just babbling about theory and the semantics of what "CISC" and "RISC" really mean. Who cares?
The reality is that the M1 delivers twice the performance on half the power.
Intel has the excuse that their chips suck because of bad fabs rather than bad architecture.
But AMD has no such excuse because their chips are fabbed in the same TSMC facilities as the M1.
The three really ARE one (Score:2)
The article tries to say that the ISA doesn't matter as much "because what about the microarchitecture?". Well, the microarchitecture is an *implementation* of the ISA. It's the ISA that drives the microarchitecture.
"And that about process node?", the article says. As you said, they can be, and are, fabbed in the exact same fabs. They can use exactly the same node. TSMC and the process node doesn't care if the transistors in the mask implement CISC or RISC. So that's not a difference.
Re: (Score:2)
"It's the ISA that drives the microarchitecture."
No it doesn't. The objective drives the microarchitecture, the ISA influences it. There are examples of processors of different ISAs built on a common microarchitecture, for example Centaur x86 processors.
Re: (Score:2)
I wonder which Centaur processors, specifically, you have in mind when you say a single microarchitecture implements different ISAs.
For the classic, well-known Centaur x86 processors, they've been very public about their design goals and the fact that one of those is that the commonly-used x86 instructions must execute in a single clock cycle, by being implemented directly in the hardware. Essentially, their design process is that the ISA *is* the circuit, for the commonly-used instructions.
Their newest chi
Dang typos (Score:2)
That should say "separate circuits implementing separate ISAs", on different parts of the wafer, at different clock speeds, with WILDLY different bittedness (register width).
Re: (Score:3)
The article tries to say that the ISA doesn't matter as much "because what about the microarchitecture?". Well, the microarchitecture is an *implementation* of the ISA. It's the ISA that drives the microarchitecture.
I wouldn't say that at all about any Intel chip starting from the Pentium Pro. The entire point of the Pentium Pro line was to design a fast processor, then tack on a small piece that converted x86 instructions into something the processor liked. The part of a modern x86 chip that cares about the x86 ISA is very, very tiny.
Re: (Score:2)
And yet shockingly, all of the commonly-used x86 instructions execute in a single clock. Which means that the hardware directly implements that exact instruction. That didn't happen by random chance.
There are some less commonly used instructions which can be implemented by a series of other instructions. That's where the microcode comes in: it implements an x86 instruction by running multiple internal instructions, which therefore take multiple clock cycles.
Re: (Score:2)
And yet shockingly, all of the commonly-used x86 instructions execute in a single clock. Which means that the hardware directly implements that exact instruction. That didn't happen by random chance.
Yes, all modern CPUs execute the load/store/integer math instructions that make up the bulk of code in a single cycle. Most code is simple instructions that execute very quickly on pipelined CPUs. No, it's not chance, it's just the bare minimum design necessary to have a viable CPU nowadays.
Re: Slow news year (Score:2)
AMD is an entire process node behind Apple.
Apple can afford to just buy the first 12+ months of TSMC's smallest node as a marketing exercise.
Re: (Score:3)
Compared to Ryzen? The M1 is about half the speed and while it uses a bit less power, in practice it's the difference between a laptop with 20 hours battery life and 15 hours. Meaningless to most people.
We shall see what M2 is like but I can't see it matching mobile Ryzen unless they go nuts with the number of cores, but that might give them cache issues given how reliant M1 is on having a massive L1 cache.
Re: (Score:1)
Process Node? (Score:3)
Does this guy know what he's talking about?
Re: (Score:2)
I think what they mean is that previously RISC CPU manufacturers were at a disadvantage because Intel was ahead in terms of shrinking its process down, but now, because of TSMC and Intel getting stuck on 14nm++++, that isn't the case any more.
So now we are closer to being able to do a reasonable comparison of RISC and CISC manufactured on the same process.
Re: Process Node? (Score:2)
Assuming they could afford to bid enough that TSMC sells them capacity of the most recent node. Intel probably could, AMD probably not.
Instruction decode (Score:2)
One thing that RISC has going for it that works against legacy CISC ISAs is the lack of variable length instructions. If the aim is to have as many concurrent non-interdependent instructions in flight as possible, then the instruction decode logic becomes very complex for CISC. In RISC the fact that all instructions are the same size means that reading and analyzing the next N instructions is comparatively easy.
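To make the decode point concrete, here is a rough C sketch of finding the starts of the next N instructions; the variable-length encoding is hypothetical, not any real ISA:

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-length (RISC-style): instruction k starts at byte 4*k. All N start
 * addresses are known immediately and independently of each other. */
void find_starts_fixed(size_t n, size_t starts[])
{
    for (size_t k = 0; k < n; k++)
        starts[k] = 4 * k;                 /* trivially parallel */
}

/* Hypothetical variable-length encoding: low 2 bits of the first byte give a
 * length of 1, 2, 4 or 8 bytes. */
size_t decode_length(const uint8_t *bytes, size_t pos)
{
    return (size_t)1 << (bytes[pos] & 0x3);
}

/* Variable-length (CISC-style): the start of instruction k+1 depends on the
 * decoded length of instruction k, so the scan is inherently serial unless
 * extra hardware speculates on every possible boundary. */
void find_starts_variable(const uint8_t *bytes, size_t n, size_t starts[])
{
    size_t pos = 0;
    for (size_t k = 0; k < n; k++) {
        starts[k] = pos;                   /* needs the previous length */
        pos += decode_length(bytes, pos);
    }
}
```

Real front-ends work around the serial scan with tricks like predecode information in the instruction cache, which is exactly the extra decode complexity being described.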
Re: (Score:1)
For the ARM, please look up Thumb Instructions. The ARM very much has variable length instructions.... And an instruction decoder... (very CISC-like).
Re:Instruction decode (Score:5, Informative)
The ARM very much has variable length instructions....
Not really. An ARM normally executes fixed length 32-bit instructions.
You can switch it into "thumb" mode and execute fixed-length 16-bit instructions.
But they are two separate instruction decoders, both fixed length. You switch modes by executing the "BLX" instruction.
In addition to fixed-length instructions, ARM also benefits from aligned instructions. 32-bit instructions are always stored on a 4-byte boundary and thumb instructions on a 2-byte boundary. So, unlike x86 you don't need to waste time and silicon on logic to deal with a page fault in the middle of fetching an instruction.
Re: (Score:2)
Correct me if I'm wrong but when ARMv9 cores go 64 bit-only in the next 18 months, Thumb disappears from silicon?
'legacy'...
Re: (Score:2)
Re:Instruction decode (Score:4, Interesting)
That's just a technicality. Rather than having to "switch" to a different instruction length, lots of RISC processors mix 16 and 32-bit instructions in the same stream, such as RISC-V, MicroMIPS, PPC VLE, and SH-4. The idea that RISC processors use fixed-length instructions is a couple decades out of date, and they have adapted some of the properties of CISC designs because... well, CISC wasn't really that wrong, after all.
When people talk about the inefficiencies of variable-length instructions, what they mean is the use of instruction extensions, where immediates and displacements are encoded separately from the opcode as a number of extra word reads. "Compressed" RISC instructions do the same thing, except their extra words are misaligned/scrambled so they need a separate decoder for those shortened instructions. CISC processors always align the words. So, ironically, a properly designed CISC can handle variable-length instructions more efficiently than RISC.
I'm working on my own hobby CISC processor that does just this with "normal" and "compressed" versions of every memory instruction. I have an 8-bit design with 16 registers that supports up to 20-bit immediates and 17-bit displacements, and a 32-bit design with 32 registers that supports up to 32-bit immediates and 23-bit displacements. Both are RISC-like internally, support shifting of immediates on load, and can perform computes on load (as long as they are ALU operations).
Pure RISC processors are pretty boring, all effectively work the same way, and are a pain to program in assembly. By combining the properties of both RISC and CISC, not only is it possible to make something even better, but it's also a lot of fun.
Re: (Score:2)
Problem long since solved. Literally last century.
Re: (Score:2)
RISC's fixed length instructions simplify the decoder.
CISC's variable length instructions mean a lot more instructions fit in the processor's cache.
When decoders were a significant portion of the chip, that was a win. Now the decoder is negligible, and a huge portion of performance comes down to putting in the best cache you can.
Re: (Score:2)
Those reading all these posts who don't know that instruction fetch is a thing, not just semantically but as a proper unit within the CPU, should think long and hard about their passion for RISC.
How many more proper units within a CPU can I surprise you RISC fans with? How about the instruction retirement unit? Didja know that thi
Re: (Score:3)
That depends on whether you're talking about instruction extensions or opcode pages. The 68K architecture used extensions, which allowed the CPU to offer 16 or 32-bit immediate values for many instructions, but all opcode words were fixed at 16-bits in length. By comparison, the x86 uses opcode pages with extension bytes, which make things incredibly messy. This alone is the biggest reason why x86 is so horrible, but very, very few CISC processors do things this way.
RISC processors have supported variabl
Re: (Score:3)
Coldfire took that even farther by simply eliminating every 68000 instruction more than 3 words long. Few if any of those were used by compilers, for instance move from absolute address to absolute address. But the 68000 has an illegal instruction trap, and it's possible to add emulation of those instructions for legacy code.
Some 8-bit 68xx processors (6809, 68HC11, 68HC12) used a few pre-bytes for opcode pages, but from what I hear, x86 went nuts with it. There was also a 68HC16 that wasn't very popular,
Re: (Score:2)
Coldfire was a way of saying the 68020 was a mistake. The '020 added a whole ton more address modes over the 68000 that I'm not sure even 68K fans understood or wanted. One nice thing x86 has going for it is that it only allows one memory access per instruction.
Re: (Score:2)
Re: (Score:2)
If the aim is to have as many concurrent non-interdependent instructions in flight as possible, then the instruction decode logic becomes very complex for CISC.
The CISC decoder in a modern CPU is literally smaller than some of the functional units. It is a minor footnote in CPU area compared to basically everything else. This doesn't mean it isn't complicated, it's just not that complicated compared to the rest of the CPU.
Compliers are critical (Score:5, Informative)
Compliers these days are very complex beasts, more than just translators. They perform lots of analysis (and some heuristics) to determine instruction scheduling, operation sequencing, register allocation, etc. So in part, the 'best ISA' is one that enables the most powerful compiler.
(I suspect that the tight coupling between processor architecture and implementation, language/compiler, and OS/applications infrastructure are key parts of Apple's performance advantage with the new M1 and successors.)
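As a rough illustration of "more than just translators": the second function below is only a hand-written sketch of the kind of rewriting (register allocation, strength reduction, a scheduling-friendly loop body) an optimizing compiler does internally before picking instructions for a given ISA:

```c
#include <stddef.h>

/* What the programmer writes. */
long dot(const long *a, const long *b, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Roughly what the optimizer turns it into before code generation: the index
 * is gone (strength-reduced to pointer increments), sum stays in a register,
 * and the loads are separated from the multiply-add so the instruction
 * scheduler can overlap them. */
long dot_optimized(const long *a, const long *b, size_t n)
{
    long sum = 0;
    const long *end = a + n;
    while (a != end) {
        long x = *a++;      /* load */
        long y = *b++;      /* load */
        sum += x * y;       /* multiply-accumulate, register-to-register */
    }
    return sum;
}
```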
Re: (Score:2)
Remember that RISC stands for Relegate Impossible Stuff to Compiler.
Re: (Score:2)
What about the runtime? I guess RISRT doesn't have a nice ring to it. I believe ARM performance was one of the motivating reasons why Dalvik was designed as a register machine. A lot of developers never touch compilers anymore.
Re: (Score:2)
The tight compiler/CPU coupling thing was tried with Itanium and it failed hard. The main issue is that the compiler can only consider a single thread executing on a single core, for a single model of CPU with a given number of ALUs and FPUs, a fixed pipeline. In practice the actual execution environment of a CPU is much less predictable.
Modern CPUs, including ARM ones like the M1, do things like out-of-order execution and speculative execution to mitigate these issues. As such it's generally better for the
Re: (Score:3)
Well, the truly optimizing comment compiler would catch and flag my spelling error :-(
Re: (Score:2)
Remember that for modern machines, part of the compiler is on the chip. It does not actually execute the raw byte code that is the horror that is X86. But that hardware compiler needs power.
Re: (Score:3)
It does not actually execute the raw byte code that is the horror that is X86.
What you call horror, is a performance-beneficial feature that RISC cannot have due to the entire ethos of it being a stupid religion.
Re: (Score:1)
Blame it on a branch prediction bug
Comment removed (Score:5, Interesting)
Re: (Score:2)
My beef is that people think that x86 = CISC. Since x86 sucks, that obviously means CISC sucks.
It'd probably be easier to get his point across if the whole industry hadn't drunk the RISC Kool-Aid and there were more CISC designs around. If done properly, CISC can indeed be just as efficient as RISC, even if that's not possible with Intel's wretched instruction set.
Not RISC vs CISC. VAX and 68k vs everyone else (Score:5, Interesting)
von Neumann vs Capability machines... (Score:4, Interesting)
Actually, the most interesting machines were the capability machines, particularly the stillborn BiiN processor. Intel learned a lot from the i432, and developed a capability engine with an otherwise 'RISC' flavor. The capability engine was ripped out and the remaining RISC chip was sold as the I80960. That had a long run in embedded computing, in part because as a system component it was easier to verify. But from Intel's perspective, this happened at the height of the WinTel x86 boom, and there was MUCH MORE money to be made selling Pentiums etc than to look at a radically different approach to computing. Too bad, the BiiN capability machine might have made a very good and extremely secure execution environment for C++ and other languages. (I worked in that project in the mid '80s.)
So if you want to get much more esoteric, the debate between "von Neumann vs non-von Neumann" ISAs would be worth investigating. Given the continuing cybersecurity problems, I believe it's time to take another look at capability machines.
Re:von Neumann vs Capability machines... (Score:5, Informative)
> the debate between "von Neumann vs non-von Neumann" ISAs would be worth investigating. Given the continuing cybersecurity problems, ...
The "non" von Neumann" architecture you refer to is called Harvard architecture [wikipedia.org].
As you point out, yes, the von Neumann architecture allows for self-modifying code and all sorts of code injection tricks, whereas the Harvard architecture doesn't. It is indeed tragic that it never really caught on. Early consoles had ROM cartridges, but there was still only a single address space from the CPU's POV, since it was "good enough".
Re: (Score:2)
It is indeed tragic that it never really caught on.
Both x86 and ARM use Harvard Architecture internally.
Data and instructions are stored in separate L1 caches.
Re: (Score:2)
It's not true Harvard Architecture, since what the programmer sees is one single area of memory that can contain code or data.
How would you implement a true Harvard Architecture on a modern computer with many GB of RAM? Would you need one stick of RAM for code and a totally separate stick for data?
Re: (Score:2)
On BiiN, the memory had an extra bit for each 32 bit word, that indicated that word was a 'capability' rather than just plain old data. The HW and OS worked tightly together to control how that bit got set.
So yeah, not commodity RAM. How much is security worth these days?
Re: (Score:3)
How would you implement a true Harvard Architecture on a modern computer with many GB of RAM?
Why would you want to?
Separate I/D caches give you the benefits of a Harvard Architecture without the drawbacks.
Would you need one stick of RAM for code and a totally separate stick for data?
I suppose so, which is why it makes no sense at that level.
If you are concerned about malicious code modification, you can just mark code pages as read-only in the MMU ... which is what many OSes already do.
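A minimal sketch of that idea using the POSIX mmap/mprotect calls (assume a Unix-like OS; the page contents and error handling are only illustrative):

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Get a writable page, as a loader would while copying code into memory. */
    void *buf = mmap(NULL, page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; }

    /* ... copy machine code into buf here ... */

    /* Lock it down: readable and executable, but no longer writable (W^X). */
    if (mprotect(buf, page, PROT_READ | PROT_EXEC) != 0) {
        perror("mprotect");    /* strict W^X policies may refuse PROT_EXEC */
        return EXIT_FAILURE;
    }

    /* Any write to buf now faults, which is the practical payoff the parent
     * comment is describing, without separate code and data memories. */
    munmap(buf, page);
    return 0;
}
```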
Re: (Score:2)
A combination of CPU controlled allocation/division of ram assets, via a mapper chip, and an ISA.
Basically, on initial program load, a small rom contains the bootup program, and another small rom contains the bootup configuration data.
The initial program in the rom gives the facility to instruct the mapper chip, and the data stored in the configuration rom contains the configuration data to feed to the mapper chip.
The mapper chip then takes the unified memory, (which appears as flat from its perspective), a
Re: (Score:2)
As I recall, a BiiN 'access descriptor' (capability) had several different parts, including some access bits and a type descriptor, as well as the actual object identifier. So you'd use the type information to obtain (in the class object-oriented sense) the code (method) for the per-type implementation of the requested operation. As I said, a potential great match for OO language execution at the OS level.
One of the things I remember was a generalization of 'file directory' operations. You could re-impl
Re: (Score:2)
That's only true of some consoles. NES cartridges have separate spaces for CPU and PPU memory, with completely separate signals. Neo Geo cartridges have separate spaces for main CPU program, sound CPU program, graphics, and two sets of sound samples. Atari 2600, SNES, Master System and Mega Drive had single address space, but that wasn't universal.
Re: (Score:2)
Great examples! I didn't dive into consoles since there are tons of examples, counter-examples, and pseudo-Harvard architectures, such as the PS2's SPU (2 MB audio RAM), the PS3's SPUs which had their own address space for execution, etc.
Re: (Score:1)
I never did assembly on a Vax, though I worked at places that had them running BSD Unix 4.2 or 4.3. I did, however, do a lot with the Motorola 68000 and yeah, I liked it. It's a pity it didn't handle page faults fully, so you couldn't do memory-mapped page swapping with it, even if you had a memory management chip to go with it. They fixed that with the 68010, but I think by then the IBM PC (ugh) had come out, so the market for home PCs was taken over by that inferior machine.
However, some of the older 8 bit
Re: (Score:3, Informative)
RISC doesn't work. It can't. Because you cannot make a fundamental programming object on a pure RISC architecture.
The big fundamental property of a RISC architecture is load-store. If you want to access memory, you use dedicated memory access instructions (load and store) to read and write memory. CISC architectures allow memory operands to be given, so you can perform an addition with a register and a memory location. This is fundamentally impossible on RISC, you have to load both operands from memory into
Re: (Score:2)
68k is a great architecture, but it is probably a bit *too* orthogonal. There are half a dozen ways to do some things, and knowing the fastest was part of the challenge for assembler coders back in the day.
6502, you could memorize every instruction and timing quite easily. Modern AVR is quite similar.
Re: (Score:2)
Speak for yerself (Score:1)
Re: (Score:2)
"This is huge in my experience."
And what is your experience? Posting on /.?
RISC vs RISC with translation layer. (Score:4, Insightful)
CISC is basically dead since forever, acting more like a crude data compression/decompression circuit rather than an actual architecture.
x86 has not been CISC since the Pentium Pro, or even earlier.
Re: (Score:2)
RISC is also "basically dead since forever". Nothing is "reduced" about modern "RISC" processors. There is little differentiation between ISAs, they are not particularly significant, and the design approaches of all modern processors are similar.
Re: (Score:2)
So "not as large as x86" processors?
Re: (Score:3)
Pretty much. That's because there's no hard line of separation between CISC and RISC, and over the years, RISC machines have adopted most of the features of CISC, though they pretend that they haven't.
Most of the arguments are, as usual, silly religious wars.
This sounds similar to the debate over (Score:1)
...phonetic versus pictogram alphabets. There are pros and cons of each, and throughout history each has had ebbs and flows in position. Pictograms proved difficult in the typewriter era, but are making a comeback of sorts because computers offer more input options, and also for being more spoken-language-independent: think international road signs and emojis.
And so (Score:2)
35,000 foot executive summary:
Risc -- generates machine code of the form load from memory to processor, perform math, store results back
Cisc -- perform processor operation "add number at location x to number in processor already, store results at location y"
Hence the reduced vs. complex instruction sets. With Risc you only have load, store, and math, not math operations bundled with loads and stores in one line.
The benefit is that even though you have to generate more machine code with Risc (3 vs. 1 comman
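As a toy illustration of that split (a hypothetical register machine sketched in C, not any real ISA):

```c
#include <stdint.h>

enum { NREGS = 8, MEMWORDS = 1024 };
int32_t reg[NREGS];
int32_t mem[MEMWORDS];

/* RISC style: memory is touched only by explicit load/store steps, and the
 * arithmetic is register-to-register. One addition = three "instructions". */
void risc_add_to_mem(uint32_t dst, uint32_t src, int r)
{
    int32_t tmp = mem[src];     /* LOAD  tmp, [src]  */
    tmp = tmp + reg[r];         /* ADD   tmp, tmp, r */
    mem[dst] = tmp;             /* STORE [dst], tmp  */
}

/* CISC style: a single "instruction" is allowed a memory operand, so the same
 * work looks like one read-modify-write operation to the programmer. */
void cisc_add_to_mem(uint32_t addr, int r)
{
    mem[addr] += reg[r];        /* ADD [addr], r */
}
```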
CISC vs RISC was not why x86 became dominant (Score:4, Interesting)
There is exactly one reason that x86 became dominant. It is not because of CISC vs RISC. It is because IBM picked it in 1981 for their PC.
The first IBM PC was pretty boring, but had enough expansion capability that it became THE architecture that everybody wanted. Then it got extended like crazy, every now and then replacing an entire part like the bus (ISA, EISA, VESA, MCA, PCI, PCI-E) or the video (MDA, CGA, HGA, EGA, VGA, SVGA, DVI) with something new that lived side-by-side for a few years before the old thing vanished. Similarly, the CPUs upgraded with Intel's designs, with the 80286 as particularly dark years because Intel had decided everybody needed a protected mode operating system that couldn't access blocks of memory larger than 64K, when people were already fucking with segment register math to access larger blocks of memory. People worked around that stupidity simply because it was faster than an 8088, then Intel got a clue and made the 80386 with a virtual 8086 mode.
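For reference, the real-mode segment arithmetic people were playing games with is just this (a minimal C sketch):

```c
#include <stdint.h>

/* 8086/8088 real mode: a 16-bit segment and a 16-bit offset combine into a
 * 20-bit physical address. Each segment spans only 64K, but by adjusting the
 * segment register in steps of 16 bytes you can reach the whole 1 MB space. */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* segment * 16 + offset */
}

/* e.g. real_mode_physical(0xB800, 0x0000) == 0xB8000, the color text buffer,
 * and many different segment:offset pairs alias the same physical address. */
```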
Why IBM picked the 8088 over the other main choice (68k) is the subject of much debate. What I've seen indicates IBM wanted the 68008 rather than the 68000 because it needed fewer RAM chips, and Motorola didn't want to commit to IBM's schedule. Motorola management at that time only wanted the 68000 in expensive Unix server boxes (as documented in the DTACK Grounded newsletter), and their lack of giving a fuck certainly didn't help. Without the 68008, the 8088 won by default.
Re: (Score:3)
The other theory is that it was all because of bureaucracy inside IBM.
Quoting (MIPS architect) John Mashey:
"In dealing with external supplier, there would generally be one IBM ... and there was already another division using 68Ks ..."
division that would be the lead in dealing with that supplier, and
if you were in another division, you had to work through that first division.
Needless to say, a fast-track effort like the IBM PC wouldn't care for that
much
https://www.yarchive.net/comp/... [yarchive.net]
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
But you can't ignore that the commodity DRAMs of the day were all 1-bit parts. They would have needed 16 chips just for the RAM with a 68000, or design some crazy bus multiplexing as a workaround. I'm not sure if the 8088 needed that too, but there would be no point in using it with 8-bit wide RAM. And the 68000 was definitely in a plastic package when the Macintosh was released in early 1984.
So you've given another argument about why IBM would have wanted the 68008, but I think the RAM argument is big eno
Transistor Counts vs Memory Bandwidth (Score:4, Insightful)
Another view, that I heard from IBM architects, was that RISC made sense when transistors were expensive (as the article points out, processors requiring multiple chips to implement) but memory bandwidth was comparable to processor speeds. At that point in history, fixed length instructions and fixed length operands fetched by a simpler processor made a lot of sense.
Skip forward a couple decades, and transistors are (essentially) free, so there isn't a huge cost for complexity. Processor speeds (200/250 ps clocks) are much higher than memory speeds. Now variable length instructions and variable length operands, feeding a more complex decode engine, make a lot of sense.
Re: (Score:2)
Still wins my Favourite Mnemonic award for an instruction - Enforce In Order Execution of I/O Operations, or somesuch. EIEIO.
I know that... (Score:2)
Brand worshiping, err sorry, brand loyalty is a thing these days and customers will defend the companies, instead of themselves.
That said, I personally wish that Intel simply died and disappeared.
I hate how they kept us in what I call four-core hell for over a decade, simply because they didn't have any competition.
And talking about competition, they almost killed AMD and got away with all the illegal crap they pulled on them.
Let the white knights come with their swords to defend their hopefully soon to be dea
Learn to use the tool you are given. (Score:2)
The CISC x86 has a lot of people who know how to code for it, and thus has coding styles and practices that are optimal for the CPU. Knowing that particular types of calculations work better one way versus another allows for a better running program. Then when you port it over to a different platform, you find that the port, while it will run fine, is less optimal than on the first system.
RISC is like a Marathon runner, and CISC is like a sprinter.
A RISC System are actually ideal for systems that run at a constant high load, as
Re:Who cares? (Score:5, Interesting)
x86 ... the worst ISA ever created.
If you believe that x86 is the worst ever, then you should learn about the iAPX 432 [wikipedia.org].
All you can say about the x86 is that it was the worst ISA that was a massive global success. Plenty of worse designs died in infancy.
Re:Who cares? (Score:4, Insightful)
Re: (Score:2)
Yes, the 8051 is horrible. 8051s are often used in American mil-spec projects. 8051s are rad-hard, manufactured with depleted boron dopants. They have been used for many decades, so no bureaucrat wants to stick their neck out and authorize a more modern micro-controller like, say, an AVR.
Re: (Score:3)
Re: (Score:3)
8051 is one of the shortage chips that has the automotive sector crying.
Somebody out there loves the damn thing.
Re: (Score:2)
Have you seen the smaller Microchip PIC? Compared to them, MCS-51 (8051) looks like a nice, clean architecture.
They have a hardware stack (i.e. a hard limit on call depth, often just 2 or 4).
Like some other microcontrollers, some can have more physical memory than the logical addresses allow, and use an extra register for the upper bits on the address bus. But on the pic, that register can be written using bit instructions only. I.e. how much effort it takes to access a known address depends on the Hamming
Re: (Score:2)
Re: (Score:2)
It was originally designed in the 1970s as an I/O processor (typically multiple per computer) for a PDP-11 clone that used multiplexed address and data busses.
It was and is cheap, so it found widespread use in some embedded systems where a 8051 would have been too expensive.
Re: (Score:2)
Re: (Score:2)
I'm not sure of the details; possibly it was more a PDP-11-like system rather than a clone, since it used the CP1600 processor, which has nearly, but not fully, the same instruction set as the PDP-11.
I don't even know if that system was ever actually finished in the end, or whether they just developed it to the point that they had the PIC1650 meant for use in that system.
Either way, the PIC1650 became the ancestor of a large family of cheap microcontrollers, and while they are still made by Microchip, today there are also
Re: (Score:3)
Re: (Score:2)
AMD x64 was designed. Okay, within the constraints of being somewhat x86 compatible, but it did sort out a lot of the worst parts of the x86.
In some ways the lack of design is a strength. x86 basically underwent evolution, with new instructions being introduced and some dying out because they proved to be unpopular or were superseded by better ones.
No modern CPUs actually execute x86 directly anyway, it's basically an intermediate language these days. Bytecode that the CPU translates into internal instructi
Re:Who cares? (Score:5, Interesting)
Nah.
The problem, as stated in the article, is a hyperfocus on one aspect of CPUs (the ISA) rather than things like process node or thermal properties.
Is RISC better? Better at what? I'd say that the reason we didn't switch to MIPS or DEC Alpha or ARM back in 1999 or so was that nothing justified it. You could put an ARM chip in a mobile device, but mobile devices all relatively sucked until 2011. Smartphones came out in 2008, but those first three or so models everyone put out were relatively weak and crappy. But they played music and some crappy-quality videos.
It's only in the last 3 years that ARM parts have beaten Intel parts in the same category: mobile, laptops in particular. They don't beat laptops in every aspect, but what has historically been a garbage part (Intel iGPUs), which made iGPU-only Intel laptops utter garbage to use for everything, is now having its lunch stolen by ARM. An ARM part can do all of this better than Intel can, and it's not the ISA that makes it that way. It's simply that Apple designed a CPU that was DESIGNED for a phone, tablet, laptop, or all-in-one Mac, sized reasonably for that device. Whereas Intel takes a top-down approach where they design Xeon and i7/i9 parts and then carve off (bin) parts into lower performance categories. This is why the H parts seem like desktop parts, but the U and Y parts have been utter trash in every laptop.
The ARM parts from Apple are better than those U and Y parts, and run laps around them. That's a testament to designing a good CPU+GPU part, and that part comes in at a higher performance level and lower TDP than the Intel part because, again, the Intel part is really more of a cut-down desktop part than it is a laptop part. Intel's incompetence at die-shrinking is how they lost Apple's business. Everyone was predicting that Apple would switch to their own parts for everything, but it took Intel falling too far behind, repeating the mistakes of IBM and Motorola (which sent Apple to Intel in the first place), before Apple finally decided to just switch with what they had. Hence the M1.
Now what about Samsung or Qualcomm, or someone else? Well, they don't sell CPUs, they sell SoCs. Samsung is the only one in a position to actually build a CPU for a device they sell (Samsung Galaxy phones, smart TVs, etc.), and the only reason they've been kind of awful is that they are chained to terrible operating systems like Android and Windows that don't use the features in the CPU, or lack software (such as ARKit on iOS) to make those fancy hardware features useful.
It needs to be said how badly Android has dropped the ball on ARCore, and how Windows, horribly, has no AR at all. For the last two years there has been a huge uptick in AR face and body tracking for game development and entertainment (because the pandemic sent people looking for other ways to be entertained remotely), and Android is being left behind. The hardware Samsung makes is capable of doing something close to ARKit (Apple), but Android has nothing like ARKit. ARKit supports 52 shapes and works perfectly with Unity. ARCore (Android) supports 3.
Basically, the lack of AR camera hardware in any Android smartphone is going to push people who want to use that stuff to either iPhones or software-only solutions on Windows/macOS on higher-end CPUs.
Comment removed (Score:5, Interesting)
Re: (Score:2)
The point of RISC is *not* the instruction set. The point is to free up chip resources being used for instruction decoding so that they can be used for much more important stuff, like more registers, a cache, pipelines, etc. Intel doesn't bother solving that problem on the x86 line because it has vast amounts of resources to shove into a chip; it doesn't care that it spends a lot of chip space on instruction decoding (which decodes x86 into RISC-like micro-instructions). But if you have limited resources and a fixed price point to compete with, then you do care about how to get performance out of a small chip.
All that stuff mattered decades ago, but for a long time now, the instruction decoding portion of the chip has been completely negligible. The vast majority of the transistors on an Intel chip make up the cache. Even within a single core, the instruction decoder is trivial.
It's even more trivial on Apple's chips. The big performance trick Apple pulled on the M1 was putting the main memory inside the CPU package. The RAM inside the CPU package dwarfs the size of the CPU. The transistors for instruction decodi
Re: (Score:2)
Yet those transistors for decoding x86 are always active and sucking a non-trivial amount of power.
It's a trivial amount of power compared to the rest of the CPU, and it's paid back by the effective instruction compression provided by variable-length instructions in x86.
Re:Who cares? (Score:5, Informative)
Almost every modern ARM implementation does instruction decoding too, and the x86 front-end of most Intel or AMD CPUs is not something much more complex than that. The complex x86 instructions in modern x86 CPUs are handled by ROM lookup tables.
And ARM is anything but a "simple" instruction set -- especially prior to v8. And most (except Apple) modern ARM implementations still have to support the older v7 ARM instructions. That means things like Thumb, load-multiple, NEON's more... eccentric vector load/stores, etc.
And you know what? There's a reason ISA's can be complex. Instructions get added because someone needed them and it was more performant and efficient to let the compiler specify (through an instruction) what to do rather than the uArch figuring it out at runtime.
The "purest" RISC ISA today is RISC-V (brainchild of Patterson, who loves simple RISC ISAs) but that simplicity has led to all sorts of limitations of where it can be used. There's no security model, for example. Or virtualization. Or wide SIMD...
Re: (Score:3)
The "purest" RISC ISA today is RISC-V (brainchild of Patterson, who loves simple RISC ISAs) but that simplicity has led to all sorts of limitations of where it can be used. There's no security model, for example. Or virtualization. Or wide SIMD...
Ah, now there is, if you look at the latest drafts (Volume 2, Privileged Spec v. 20190608 mainly [riscv.org]): you can see that they are going to support a multi-layered privilege spec, not to mention multiple MMU types depending on application. It's why I am starting to fall in love with RISC-V. You can make a stupidly simple ISA with just RV32IMC or go all the way to 64-bit with a stacked privilege system and an MMU.
Re: (Score:2)
The point is to free up chip resources beign used for instruction decoding so that they can be used for much more important stuff; like more registers, a cache, pipelines, etc.
Which made sense when designers were struggling to get enough gates onto the chip, but look at modern CPU designs. Most of the die area is cache. The Apple M1 has a massive 320k L1 cache per core. That's bigger than AMD and Intel x86 cores use.
There is still some advantage in terms of power saving, which is why ARM has efficient cores (with shorter pipelines and less cache) and performance cores. But x86 does that too now, you don't need RISC to make it work.
Performance ARM has picked up the longer pipeline
Comment removed (Score:4, Interesting)
Re: (Score:2)
Is RISC better? Better at what.
RISC is easier to implement as a superscalar [wikipedia.org] design. You can go wider with fewer resources, which results in a faster CPU. This is why the M1 is as fast as the Intel/AMD parts despite only running at ~3GHz. I read that the M1 has 8 parallel execution units compared to 4 for Intel/AMD. And it is not like Intel or AMD can just add more - well, they could, but they currently have 4 because there is no benefit to going any wider. And this is largely due to being a CISC ISA where instructions can come in any
Re:Who cares? (Score:5, Interesting)
This is insanely, embarrassingly false. There's no limitation to the number of parallel execution pipes and it has absolutely nothing to do with how "complex" the instructions are. I mean for fuck sakes, you just have to look at Zen3 (link: https://www.anandtech.com/show... [anandtech.com]) for a counter-example to some mythical "limit" to parallel x86 pipes.
If all that was necessary for better performance was more parallel execution pipes, they'd have shoved 69 of those in there long ago. One of the wider designs I've ever worked on was Project Denver (link: https://www.anandtech.com/show... [anandtech.com]) with 7 wide superscalar pipes. Embarrassingly, that thing performed like a schizophrenic sloth.
There's much more to CPU performance than just how much execution resources you have. And that's why Intel hasn't spent a lot of time increasing them in their designs: there are bigger bottlenecks to be battled.
Re: Who cares? (Score:3)
In the same way a cyclist needs a bike to be able to keep up with me running? That the cyclist can match my speed with ease using far less energy is incidental I suppose.
Re:Who cares? (Score:4, Informative)
MIPS and Alpha were around in the early 90s, and were beating x86 hands down on performance. The reasons for not switching to them were primarily cost, with the secondary issues being lack of marketing and lack of direct compatibility.
The only thing x86 had was economies of scale. Selling millions of cheaper chips allowed them to invest a lot more into R&D and eventually surpass the other superior architectures. The same thing is happening with ARM, millions of cheap low power chips allows a lot of R&D into higher performance processors.
We also now have better marketing; Microsoft and especially Apple are pushing their ARM-based systems far more heavily than was ever done for other architectures. Binary compatibility is now less important due to many things being web based and/or open source, but also because both Apple and MS have created seamless emulation layers allowing older binaries to still run. Apple did that before with the move from m68k to PPC, and MS actually had an x86 emulation layer for the Alpha, but it was not really marketed very aggressively.
Re: (Score:2)
MIPS and Alpha were around in the early 90s, and were beating x86 hands down on performance.
At the end, it was the Alpha's performance that doomed it.
Everyone is worried about instruction decode and how RISC blah blah blah, but that's a disjointed theory. There is no reason to believe that the easiest-to-decode-and-dispatch ISA would have fixed-size instructions, and every reason to believe that it would have the fewest instructions, so, you guessed it, a single Turing-complete instruction.
The problem with fixed-size instructions is that they end up being pretty big. 32 bits, for instance. That
Re: (Score:2)
Why are you talking about read-modify-write cycles when this is a fairly atypical operation in a processor? With modern compilers, you are much more likely to be doing read-modify-modify-modify.... - write, assuming storing in registers doesn't count as a write.
Re: (Score:2)
Re: (Score:2)
That's really what annoys me the most about people who say RISC is better than CISC. All of their arguments boil down to, "x86 is bad, therefore CISC bad!"
I'm working on my own hobby CISC processor for a retro computer, which of course is intended to be programmed only in assembly. My solution to "fix" the complexity problem is to limit orthogonality to ALU operations, so you can Load and Add, or Load and Compare, but you can't Load and Multiply. Rather than have a ton of special-purpose instructions to
Re: (Score:2)