
A History of PowerPC

A reader writes: "There's an article about chipmaking at IBM up at DeveloperWorks. While IBM-centric, it talks a lot about the PowerPC, but really dwells on their common ancestry in the IBM 801." Interesting article, especially for people interested in chips and chip design.
  • by Erect Horsecock ( 655858 ) on Wednesday March 31, 2004 @03:47PM (#8728195) Homepage Journal
    IBM also announced [tmcnet.com] a ton of new PPC information and tech today at an event in New York, opening up the ISA to third parties including Sony.
  • by Anonymous Coward on Wednesday March 31, 2004 @03:56PM (#8728338)
    They also have a very good article about the PowerPC's three instruction levels and how to use implementation-specific deviations while keeping code compatible. This introduction to the PowerPC application-level programming model [ibm.com] will give you an overview of the instruction set, important registers, and other details necessary for developing reliable, high-performing PowerPC applications and maintaining code compatibility among processors.
  • by Erect Horsecock ( 655858 ) on Wednesday March 31, 2004 @03:57PM (#8728346) Homepage Journal
    Yes, they're all PPC "based" now. The PS3 will be using what is called the Cell CPU, which is derived from the POWER ISA.

    There's a pantload of info here [ibm.com].
  • Re:Motorola (Score:4, Informative)

    by Kiryat Malachi ( 177258 ) on Wednesday March 31, 2004 @03:59PM (#8728381) Journal
    Motorola didn't give up on PPC.

    They gave up on desktop PPC. They still do a lot of new PPCs, just working on improving MIPS/watt instead of pure MIPS. The embedded space is a lot higher volume and higher profit than Apple.
  • by Erect Horsecock ( 655858 ) on Wednesday March 31, 2004 @03:59PM (#8728391) Homepage Journal
    The new POWER5s, although I have no idea how they work, are said to support virtual microprocessors that let you run multiple OSes at once. That could make for some pretty useful Linux apps/distros for Windows technicians (think cleaning up viruses and such).


    This is really cool stuff. IBM is a little late to the game in some regards; SGI has been doing this stuff for years in IRIX on their MIPS machines. But hey, better late than never...
  • Re:Big Endian (Score:5, Informative)

    by Mattintosh ( 758112 ) on Wednesday March 31, 2004 @04:00PM (#8728403)
    PPC is big endian, which is normal.

    x86 is little endian, which is chunked-up and backwards.

    Example:
    View the stored number 0x12345678.

    Big endian: 12 34 56 78
    Little endian: 78 56 34 12

    Clear as mud?
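
    If you want to see which flavor your own machine speaks, here's a minimal C sketch (my illustration, not from the article) that prints the bytes of 0x12345678 in memory order:

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        uint32_t value = 0x12345678;
        const unsigned char *bytes = (const unsigned char *)&value;
        /* Memory order: "12 34 56 78" on big endian, "78 56 34 12" on little endian. */
        for (size_t i = 0; i < sizeof value; i++)
            printf("%02x ", bytes[i]);
        printf("\n");
        return 0;
    }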
  • Nice PowerPC Roadmap (Score:5, Informative)

    by bcolflesh ( 710514 ) on Wednesday March 31, 2004 @04:01PM (#8728419) Homepage
    Motorola has a nice overview graphic [motorola.com] - you can also check out a more generalized article at The Star Online [star-techcentral.com].
  • Both Endians (Score:2, Informative)

    by bsd4me ( 759597 ) on Wednesday March 31, 2004 @04:05PM (#8728476)

    The PPC ISA has support for both big- and little-endian modes. However, the little-endian mode is a bit screwy. There are some appnotes on the Motorola website on using little-endian mode.

  • by bhtooefr ( 649901 ) <[gro.rfeoothb] [ta] [rfeoothb]> on Wednesday March 31, 2004 @04:09PM (#8728536) Homepage Journal
    If it's what I think it is, then Intel has been doing this since the 80386 (try VMWare, which uses your box's CPU in this way, then Bochs, which emulates an x86 CPU), Motorola (and therefore IBM, because of the AIM alliance) has been doing this since the PPC 601 (Mac-on-Linux only runs on PPCs, pretty damn obvious here, isn't it?), and it just goes on and on.
  • by afidel ( 530433 ) on Wednesday March 31, 2004 @04:10PM (#8728545)
    The core has a full mips-3 instruction set, with extensions from mips-4 and mips-5
    link [uiuc.edu]

    So yes, it is in a way MIPS-derived, but the MIPS core does very little of the actual processing; it's more of a bootloader and I/O coprocessor.
  • by Snocone ( 158524 ) on Wednesday March 31, 2004 @04:10PM (#8728552) Homepage
    P.S. Does anyone know why Windows has never been adapted to run under PPC?

    Errm, actually, it WAS. See for instance

    http://home1.gte.net/res008nh/nt/ppc/default.htm [gte.net]
  • by Anonymous Coward on Wednesday March 31, 2004 @04:10PM (#8728554)
    Had to be a typo, meant "Playstation 3"
  • by levram2 ( 701042 ) on Wednesday March 31, 2004 @04:10PM (#8728556)
    Intel has also shown virtual micropartitions, rebooting Windows XP while running a DVD without a hitch. The SMT being added to the Power5 is called Hyperthreading by Intel PR. I hope IBM, Intel, AMD and others keep competing.
  • by sirinek ( 41507 ) on Wednesday March 31, 2004 @04:12PM (#8728575) Homepage Journal
    They did briefly for WinNT 3.51, but then shit-canned it pretty quickly. They had a MIPS version as well, and an Alpha version that lasted even to 4.0 IIRC.

  • Re:Motorola (Score:3, Informative)

    by Kiryat Malachi ( 177258 ) on Wednesday March 31, 2004 @04:13PM (#8728588) Journal
    They're spinning it off, actually, not selling it. Going to be called Freescale Semiconductor.

    So, you could say Motorola is giving up on semiconductors... but the division that worked on the G4 will continue to work on PPC. Just under a different name.
  • by MrIrwin ( 761231 ) on Wednesday March 31, 2004 @04:14PM (#8728598) Journal
    OTOH it would be difficult to write computer history pre-late-'60s **with** IBM. Apart from sponsoring the Harvard Mark I, they were pretty oblivious to what computers would do to their market.

    It was the Lyons Tea Shop Company, of all unlikely contenders, who married "electronic programmable devices" to IT.

    Of course, when they realised their mistake they went hell for leather to redress the balance. But... amazingly... they were totally off the ball **again** with microcomputer technology.

  • by Thaidog ( 235587 ) <slashdot753@@@nym...hush...com> on Wednesday March 31, 2004 @04:14PM (#8728609)
    Actually, think MOL: Linux running the Mac OS in its own runtime... at native speed. This technology comes from the mainframe chips. A hand-me-down, so to speak.
  • Well sort of (Score:4, Informative)

    by Erect Horsecock ( 655858 ) on Wednesday March 31, 2004 @04:16PM (#8728640) Homepage Journal
    It's actually closer to Intel's Vanderpool [xbitlabs.com] technology, which allows you to partition the CPU through firmware.

    Example: Windows is running on slice 1, BSD on slice 2, and Linux on slice 3.

    BSD gets a kernel panic and crashes; the slice is restarted without affecting the remaining running OSes. It's, for lack of a better term, Hyperthreading for the whole computer.
  • Re:Big Endian (Score:4, Informative)

    by Pius II. ( 525191 ) <PiusII@nospAM.gmx.de> on Wednesday March 31, 2004 @04:17PM (#8728647)
    Motorola's PPC implementation is only partly dual-endian. The G3s are byte-sexual, and most G4s are, but some G4 chipsets are not.
  • by Morologous ( 201459 ) on Wednesday March 31, 2004 @04:17PM (#8728652)
    Or, you could always settle for an RS/6000.

    RS/6000 [ibm.com]

    Or, a Power-based IBM workstation,

    Workstation [ibm.com]
  • by sam_van ( 602963 ) on Wednesday March 31, 2004 @04:20PM (#8728676) Homepage
    When I was working on the embedded IBM PowerPCs (400 series), we used Verilog primarily...though there were a few VHDL hold-outs.
  • Re:Big Endian (Score:5, Informative)

    by Anonymous Coward on Wednesday March 31, 2004 @04:20PM (#8728678)
    Big-endian appeals to people because they learned to do their base-10 arithmetic in big-endian fashion. The most significant digit is the first one encountered. It's habit.

    Little-endian has some nice hardware properties, because it isn't necessary to change the address due to the size of the operand.

    Big Endian:
    uint32 src = 0x00001234; // at address 1000, say
    uint32 dst1 = src; // fetch from 1000 to get 00001234
    uint16 dst2 = src; // fetch from 1000 + 2 to get 1234

    Little Endian:
    uint32 src = 0x00001234; // at address 1000, say
    uint32 dst1 = src; // fetch from 1000
    uint16 dst2 = src; // fetch from 1000

    The processor doesn't have to modify register values and funk around with shifting the data bus to perform different read and write sizes with a little-endian design. Expanding the data to 64 bits has no effect on existing code, whereas the big-endian case will have to change all the pointer values.

    To me, this seems less "chunked up" than big endian storage, where you have to jump back and forth to pick out pieces.

    In any event, it seems unnecessary to use prejudicial language like "normal" and "chunked up". It's just another way of writing digits in an integer. Any competent programmer should be able to deal with both representations with equal facility.

    Being unable to deal with little-endian representation is like being unable to read hexadecimal and insisting all numbers be in base-10 only. (Dotted-decimal IP numbers, anyone?)

    Big-endian has one big practical advantage other than casual programmer convenience. Many major network protocols (TCP/IP, Ethernet) define the network byte order as big-endian.
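
    The uint32/uint16 snippets above are pseudocode; here's a hedged, runnable C version of the same experiment (my addition) -- read a 16-bit value from the same address as a 32-bit one:

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    int main(void) {
        uint32_t src = 0x00001234;
        uint16_t dst2;
        /* Copy the first two bytes at &src. On little endian this yields
           0x1234 (the low half sits at the lowest address); on big endian
           it yields 0x0000, because the low half sits at address + 2. */
        memcpy(&dst2, &src, sizeof dst2);
        printf("0x%04x\n", dst2);
        return 0;
    }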

  • by Anonymous Coward on Wednesday March 31, 2004 @04:25PM (#8728736)
    They actually didn't shit-can it until NT4. The MIPS version (AFAIK) got shit-canned as 2000 went into alpha, and Alpha got shit-canned as 2000 was coming out of alpha. Itanium came into the picture between Whistler (AKA WinXP) alphas and W2K final, and some W2K Itanium alphas exist (they obviously got shit-canned, and the tech went into WinXP 64-bit for IA64).
  • MVS... (Score:2, Informative)

    by DAldredge ( 2353 ) <SlashdotEmail@GMail.Com> on Wednesday March 31, 2004 @04:35PM (#8728843) Journal
    Yeah, IBM is a little late to the virtual processor market. About -10 to -20 years.

    Damn them! Damn them to HELL!!!!

  • Re:Yeah, I remember (Score:5, Informative)

    by Billly Gates ( 198444 ) on Wednesday March 31, 2004 @04:44PM (#8728972) Journal
    Yes

    What Intel did was wrap RISC architecture around the x86 instruction set to create the Pentium Pro, Pentium II, III, etc. Otherwise they would have been killed.

    In fact, IBM was correct: CISC was dying. The Pentium 1 could not compete against the PowerPC unless it had a very high clock speed. All chips today are either pure RISC or a hybrid CISC/RISC like today's Athlons/Pentiums. The exception is the nasty Itanium, which is not doing too well.
  • by Zo0ok ( 209803 ) on Wednesday March 31, 2004 @04:48PM (#8729030) Homepage
    The concept of RISC (that each instruction takes one cycle) is what makes pipelining possible in the first place. If you have instructions that take 2-35 cycles to execute, it's very hard to produce an efficient pipeline.

    Also, things like out-of-order execution and branch prediction make more sense for a RISC instruction set (so I was told ;).

    But I more or less agree with you that a long pipeline is somewhat contradictory to the idea of RISC.
  • by Steveftoth ( 78419 ) on Wednesday March 31, 2004 @05:00PM (#8729178) Homepage
    When my buddy first told me about this exciting new RISC idea one of the design goals was each instruction was to take a single instruction cycle to execute. Isn't this completely contrary to a deep pipeline? The Pentium 4 has a 20-stage pipeline IIRC.
    Not really, the idea is to make every instruction simple.
    Reduced Instruction Set Computer
    The side effect of this is that every instruction can be the same length, thus simplifying the complex decoding process of a CPU (see the sketch at the end of this comment).
    x86 instructions can be multiple bytes in length, while all PPC (and most RISC) instructions are 32 bits long (yes, even the PPC-64 instructions).
    The simplified instruction set allows more instructions to be processed in fewer cycles, but generally you need more instructions to do the same thing. Since it's easier to decode the PPC instructions, it's also easier to pipeline them and easier to build superscalar cores (since fewer transistors are required to do the same thing).

    This doesn't always translate into more performance, since RISC computers generally need more memory (the code is less dense) and thus sometimes more bandwidth to achieve the same performance. While some x86 instructions are hard for the decoder to crack, the memory saved storing the instruction can make it worthwhile.

    If I am not mistaken the Transmeta was a very wide instruction word. And if I am not mistaken, doesn't that make it the opposite of a RISC?

    Yep, but the problem is that you're asking the compiler to extract the parallelism from the instruction stream, which is not always possible. Usually, there is more thread-level parallelism than instruction-level parallelism.
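
    As a toy illustration of fixed-width decode (my sketch; the 6-bit primary opcode in the top bits follows the PowerPC encoding, and the example word should be "add r1,r2,r3" if I have the XO-form fields right):

    #include <stdio.h>
    #include <stdint.h>

    /* Every PPC instruction is exactly 32 bits, so the decoder grabs fields
       with fixed shifts and masks -- no length-guessing as on x86. */
    static unsigned primary_opcode(uint32_t insn) {
        return insn >> 26;                  /* top 6 bits */
    }

    int main(void) {
        uint32_t insn = 0x7C221A14;         /* add r1,r2,r3 (XO-form, opcode 31) */
        printf("primary opcode: %u\n", primary_opcode(insn));
        return 0;
    }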
  • by Abcd1234 ( 188840 ) on Wednesday March 31, 2004 @05:05PM (#8729253) Homepage
    Yeah. It's a good thing that the processors in the POWER line have unbelievable branch prediction logic. For example, the branch prediction rate for the POWER4 is in the mid-to-high 90s percent for most workloads (as high as 98%, IIRC). In fact, quite a large number of transistors are dedicated to this very task, which allows the processor to do a pretty good job of achieving something close to its theoretical IPC.

    Although, it should be noted that the pipeline depth for the POWER4 is just 15 stages (as opposed to the P4, which has, IIRC, 28 stages), so while a branch misprediction is quite bad, it's not as bad as on some architectures. My understanding is that, in order to achieve that 200 IPC number, the POWER4 is just a very wide superscalar architecture, so it simply reorders and executes a lot of instructions at once. Plus, that number may in fact be 200 micro-ops rather than real "instructions" (although that's just speculation on my part... it's been quite a while since I read up on the POWER4), as the POWER4 has what they term a "cracking" stage, similar to most Intel processors, where the opcodes are broken down into smaller micro-ops for execution.
  • by Zathrus ( 232140 ) on Wednesday March 31, 2004 @05:28PM (#8729612) Homepage
    When my buddy first told me about this exciting new RISC idea one of the design goals was each instruction was to take a single instruction cycle to execute. Isn't this completely contrary to a deep pipeline?

    No, in fact pipelining is central to the entire concept of RISC.

    In traditional CISC there was no pipelining and operations could take anywhere from 2-n cycles to complete -- at the very least you would have to fetch the instruction (1 cycle) and decode the instruction (1 cycle; no, you can't decode it at the same time you fetch it -- you must wait 1 cycle for the address lines to settle, otherwise you cannot be sure of what you're actually reading). If it's a NOOP, there's no operation, but otherwise it takes 1+ cycles to actually execute -- not all operators ran in the same amount of time. If it needs data then you'd need to decode the address (1 cycle) and fetch (1 cycle -- if you're lucky). Given that some operators took multiple operands you can rinse and repeat the decode/fetch several times. Oh, and don't forget about the decode/store for the result. So, add all that up and you could expect an average instruction to run in no less than 7-9 cycles (fetch, decode, fetch, decode, execute, decode, store). And that's all presuming that you have a memory architecture that can actually produce instructions or data in a single clock cycle.

    In RISC you pipeline all of that stuff and reduce the complexity of the instructions so that (optimally) you are executing 1 instruction/cycle as long as the pipelines are full. You have separate modules doing the decodes, fetches, stores, etc. (and in deep-pipeline architectures, like the P4, these steps are broken up even more). This lets you pump the hell out of the clockrate since there's less for each stage of the pipeline to actually do.

    Modern CPUs have multiple everything -- multiple decoders, fetchers, execution units, etc. -- so it's actually possible to execute more than 1 instruction per cycle. Of course, the danger with pipelining is that if you branch (like when a loop runs out, or an if-then-else case) then all those instructions you've been decoding go out the window and you have to start all over from wherever the program is now executing (this is called a pipeline stall and is very costly; once you consider the memory delays it can cost hundreds of cycles). Branch prediction is used to try and mitigate this risk -- generally by guessing which way the branch will go and speculatively executing down that path, keeping the results only if the guess turns out to be valid.
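
    You can watch the misprediction penalty yourself with a quick C experiment (mine, not from the article; compile without aggressive optimization, since a clever compiler may turn the branch into a conditional move): the same loop runs much faster over sorted data, where the branch is predictable, than over random data.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 10000000

    /* Sum the elements above 127; the if() is the branch under test. */
    static long sum_big(const unsigned char *a, size_t n) {
        long s = 0;
        for (size_t i = 0; i < n; i++)
            if (a[i] > 127)     /* random data: ~50% mispredicts; sorted: nearly 0 */
                s += a[i];
        return s;
    }

    static int cmp(const void *x, const void *y) {
        return *(const unsigned char *)x - *(const unsigned char *)y;
    }

    int main(void) {
        unsigned char *a = malloc(N);
        for (size_t i = 0; i < N; i++)
            a[i] = rand() & 0xFF;

        clock_t t0 = clock();
        long s1 = sum_big(a, N);
        clock_t t1 = clock();

        qsort(a, N, 1, cmp);    /* sorting makes the branch predictable */

        clock_t t2 = clock();
        long s2 = sum_big(a, N);
        clock_t t3 = clock();

        printf("random: sum=%ld %.2fs   sorted: sum=%ld %.2fs\n",
               s1, (double)(t1 - t0) / CLOCKS_PER_SEC,
               s2, (double)(t3 - t2) / CLOCKS_PER_SEC);
        free(a);
        return 0;
    }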

    Was I wrong to laugh when I heard hardware manufacturers claim, "sure, we make a CISC, but it has RISC-like elements"?

    Yes, because neither one exists anymore. CISC absorbed useful bits from RISC (like cache and pipelining) and RISC realized there was more to life than ADD/MUL/SHIFT/ROTATE (oversimplification of course). The PowerPC is allegedly a RISC chip, but go check on how many operators it actually has. And note that not all of them execute in one cycle. x86 is allegedly CISC, but, well... read on.

    how wide are the Pentium 4 and Athlon microcode?

    The x86 ISA has varying width. It's one of the many black marks against it. Of course, in reality, the word "microcode" isn't really applicable to most CPUs nowadays -- at least not for commonly used instructions. And to further muddy the picture both AMD and Intel don't actually execute x86 ISA. Instead there's a translation layer that converts x86 into a much more RISC-y internal ISA that's conducive to running at more than a few megahertz. AFAIK, the internal language is highly guarded by both companies.

    If I am not mistaken the Transmeta was a very wide instruction word. And if I am not mistaken, doesn't that make it the opposite of a RISC?

    Transmeta and Intel's Itanium use VLIW (very long instruction word) computing, which is supposed to make the hardware capable of executing multiple independent operations in one cycle. It does so by putting the onus on the compiler to find that parallelism ahead of time.
  • by Pope ( 17780 ) on Wednesday March 31, 2004 @05:37PM (#8729731)
    The 604 had better floating point performance than the 601, so a number of audio apps I used to use had different specific versions that were installed when the installer ran.

    You'd go into its folder and see "Peak (604)" or "Deck II (604)" to let you know that it was going to use your particular processor to its best performance.
  • by Wudbaer ( 48473 ) on Wednesday March 31, 2004 @06:08PM (#8730163) Homepage
    Points taken, but I think they owe(d) this more to their absolutely overwhelming market presence and domination (as well as doing things like calling your boss to make sure you got fired for not buying IBM) than to supreme marketing. For a long time, to most people, computers meant IBM. IBM was always there, and everyone thought they would stick around as they always had: unchanged, untouched, invincible. Their style of selling was apparently less marketing as it is usually done than shock and awe with sales people, threats and promises, one-to-one.

    Then came the PC, Unix, the fiascos with OS/2 (OS/2 marketing especially was pretty bad) and Micro Channel, and IBM changed. They certainly still are one of the largest (if not the largest), but they are only a shadow of their former might and of the terror they could inflict on people who dared not to choose IBM.
  • by Schlaefer ( 673275 ) on Wednesday March 31, 2004 @06:13PM (#8730244)
    In the DSP range, VLIW gets more attention. Take the TI C6000 series, for example: pure VLIW (8 instructions/cycle for the 8 exec units) RISC (dedicated load-store architecture, etc.) with no pipeline interlocks and very short pipelines, giving you impressive performance at a low clock rate. In addition you get the advantage of compiling once and having deterministic behavior at runtime. Unlike CISC CPUs, which have to rearrange instructions at runtime, you can (if you want) literally assign any assembler instruction, at compile time, to the cycle/exec unit you want it to use at runtime. -- Schlaefer (sorry for my poor English)
  • Re:Too scary! (Score:2, Informative)

    by Feynman ( 170746 ) on Wednesday March 31, 2004 @06:23PM (#8730377)
    What kind of design tools did you use?

    Mostly IBM-developed schematic capture, simulation, and physical design tools. I also did some work on test structure verification using an IBM-designed tool.

    Tools available [ibm.com] in the current ASIC methodology are on the IBM website. Some of these would have been used back then, too.

  • Impressive (Score:3, Informative)

    by 1000101 ( 584896 ) on Wednesday March 31, 2004 @06:39PM (#8730584)
    "What do the Nintendo GameCube's Gekko, Transmeta's first Crusoe chips, Cray's X1 supercomputer chips, Xilinx Virtex-II Pro processors, Agilent Tachyon chips, and the next-generation Microsoft XBox processors-which-have-yet-to-be-named all have in common? All of them were or will be manufactured by IBM."

    That's quite impressive. Throw the 970 in that mix and it's even more impressive. The bottom line is that Intel isn't alone at the top of the mountain when it comes to producing high quality, fast, and reliable chips. On a side note, as a soon-to-be-graduating CS major, I dream about working at a place like IBM.

  • Re:Big Endian (Score:5, Informative)

    by karlm ( 158591 ) on Wednesday March 31, 2004 @08:04PM (#8731539) Homepage
    What kind of strange CPU implementation modifies register values when addressing sub-word values? This is done most commonly by the programmer at write-time (or maybe by some strange compiler or assembler at compile-time). This is not a hardware advantage in any architecture I'm aware of. Are you perhaps talking about the extra hardware burden associated with unaligned memory access? Unaligned memory access is not a consequence of byte ordering.

    One more big advantage of the big-endian byte order is that 64-bit big-endian CPUs can do string comparisons 8 bytes at a time. This is a big advantage where the length of the strings is known (Java strings, Pascal strings, the Burrows-Wheeler transform for data compression) and still an advantage for null-terminated strings.

    I'm not aware of any such performance advantages for the little-endian byte order.

    The main advantage of little-endian byte order is ease of modifying code written in assembly or raw opcodes if you later decide to change your design and go with larger or smaller data fields. The main uses for assembly programming are very low-level kernel programming (generally the most stable part of the kernel code base) and performance enhancement of small snippets of code that have been well tested and profiled and are unlikely to change a lot.

    I agree that a decent programmer should be able to deal with either endianness, but the advantages of the little-endian byte order seem to be becoming less and less relevant.
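
    For the curious, a sketch of the 8-bytes-at-a-time trick (my illustration, and hedged: the word-compare shortcut is only order-correct on a big-endian host; little endian would need a byte swap per word):

    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>

    /* Compare two length-n byte strings a word at a time. On a big-endian
       CPU, comparing the uint64s orders them exactly like memcmp would. */
    static int cmp8(const unsigned char *a, const unsigned char *b, size_t n) {
        size_t i = 0;
        for (; i + 8 <= n; i += 8) {
            uint64_t wa, wb;
            memcpy(&wa, a + i, 8);
            memcpy(&wb, b + i, 8);
            if (wa != wb)
                return wa < wb ? -1 : 1;    /* valid ordering on big endian only */
        }
        for (; i < n; i++)                  /* leftover tail, byte by byte */
            if (a[i] != b[i])
                return a[i] < b[i] ? -1 : 1;
        return 0;
    }

    int main(void) {
        const char *x = "Burrows-Wheeler!";
        const char *y = "Burrows-Wheelers";
        printf("%d\n", cmp8((const unsigned char *)x,
                            (const unsigned char *)y, 16));
        return 0;
    }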

  • Re:Motorola (Score:2, Informative)

    by mrand ( 147739 ) on Thursday April 01, 2004 @12:01AM (#8733531)
    As much as it pains me to, I must agree with your general theme that something is missing in Motorola's processor development. As an embedded hardware engineer, I've watched them stumble over themselves time and again on the PowerQUICC II [motorola.com]:

    - Just last year they reached core speeds they promised back in 2000 (or was it 1999?).

    - PCI support was two years late (or was it three)?

    - Power dissipation has been higher than expected.

    - Some clock speeds require you to run a different voltage, while other clock speeds don't work at all (if you use certain clock multipliers).

    We still actively design in their parts because they are a perfect fit, but we don't trust them to deliver their next feature on time (last October they promised the 8270 and related devices would be in production by December; here we are in March and now they are promising May). I hope they can get their act together, 'cause when they finally release a product, it works like a horse.

"A car is just a big purse on wheels." -- Johanna Reynolds

Working...