Larrabee Based On a Bundle of Old Pentium Chips

arcticstoat writes "Intel's Pat Gelsinger recently revealed that Larrabee's 32 IA cores will in fact be based on Intel's ancient P54C architecture, last seen in the original Pentium chips, such as the Pentium 75, in the mid-1990s. The chip will feature 32 of these cores, each of which will include a 512-bit wide SIMD (single instruction, multiple data) vector processing unit."
  • I doubt it (Score:5, Interesting)

    by Bender_ ( 179208 ) on Monday July 07, 2008 @05:49PM (#24090219) Journal

    I doubt it. Maybe they mentioned the Pentium as an example to explain an in-order superscalar architecture, as opposed to more modern CPUs.

    - There is a lot of overhead in the P54C to execute complex CISC operations that are completely useless for graphics acceleration.

    - The P54C was manufactured in a 0.6-micron BiCMOS process. Shrinking this to 0.045-micron CMOS (more than 100x smaller in area!) would require a serious redesign up to the RTL level. Circuit design has had to evolve with process technology.

    -a lot more...
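    For scale, the shrink the parent describes can be quantified with a quick calculation (a sketch using only the two feature sizes mentioned above):

```python
# Linear vs. area scaling for the shrink described above:
# 0.6 micron BiCMOS down to 0.045 micron (45 nm) CMOS.
old_node, new_node = 0.6, 0.045

linear_shrink = old_node / new_node   # ~13.3x in linear feature size
area_shrink = linear_shrink ** 2      # ~178x in area

print(round(linear_shrink, 1))  # 13.3
print(round(area_shrink))       # 178
```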

  • Manycore GPU (Score:5, Interesting)

    by DrYak ( 748999 ) on Monday July 07, 2008 @05:54PM (#24090293) Homepage

    Larrabee [wikipedia.org] is going to be Intel's next creation in the GPU world: a many-core GPU with the following peculiarities:

    - Fully compatible with the x86 instruction set (whereas other GPUs use different architectures, often with instruction sets far less suited to general-purpose computing).
    Thus, the Larrabee could *also* be used as a many-core main processor (if popped into a QuickPath socket) to run a decent multicore OS. That's not achievable with any current GPU - both ATI's and nVidia's completely lack some control structures: neither can use subroutines, so everything must be inlined at compile time.

    - Unlike most current Intel x86 CPUs, it features a shallow pipeline and executes instructions in order. Hence the Larrabee (and the Silverthorne, which shares these characteristics) have been regularly compared with old Pentiums ever since the initial announcement, including in TFA.

    - More cores with narrower SIMD: 32 cores, each able to handle 16 32-bit floats simultaneously. By comparison, nVidia's CUDA-compatible GPUs have at most 16 cores, but each can execute 32 threads over 4 cycles and keep up to 768 threads in flight.
    This enables Larrabee to cope with slightly more divergent code than traditional GPUs and makes it a good candidate for running stuff like GPU-accelerated ray tracing.

    Hence all the recent technical demos running Quake 4 with ray tracing mentioned on /.
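    The lane arithmetic in the comparison above can be sketched like this (these are the figures quoted in the comment, not confirmed hardware specs for either chip):

```python
# Peak 32-bit SIMD lanes per clock, using the comment's figures
# (illustrative only -- not confirmed specs).
larrabee_lanes = 32 * 16     # 32 cores, each with a 16-wide 32-bit SIMD unit
cuda_lanes = 16 * 32 // 4    # 16 cores, 32 threads issued over 4 cycles

print(larrabee_lanes)  # 512
print(cuda_lanes)      # 128
```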

    That's what Intel tells you.

    Now, the old and experienced geek will also notice that Intel has so far only made press releases and technical demos running on plain regular multi-chip, multi-core Intel Cores (while promising that the real chip will be even better than the demoed stuff).

    Meanwhile, ATI and nVidia are churning out new "half"-generations every six months.

    And the whole Larrabee is starting to sound like one big piece of vaporware.
     

  • Re:I doubt it (Score:4, Interesting)

    by Enleth ( 947766 ) <enleth@enleth.com> on Monday July 07, 2008 @06:12PM (#24090493) Homepage

    It's unlikely but not impossible - don't forget that the Pentium M and, subsequently, the Core line of processors were based on the Pentium III Coppermine, whereas the Pentium 4's NetBurst architecture, developed in the meantime, was abandoned completely. Going back to the Pentium I would be a bit extreme, but it's possible they meant some basic design principles of the Pentium I, not the whole core as it was. Maybe they will make something from scratch but keep it similar to the original Pentium's inner RISC core, or maybe redo it as a vector processor, or hell knows what. It was a quote from a translated interview with some press monkey, so you can expect anything.

  • Re:I doubt it (Score:4, Interesting)

    by Chip Eater ( 1285212 ) on Monday July 07, 2008 @06:17PM (#24090567)
    A process shrink, even a deep one like 0.6 um to 45 nm, shouldn't require too many RTL changes if the design was done right. But I don't think they are using "soft" or RTL cores; most likely this P54C was a custom design. Shrinking a custom design is a lot more tedious, which might help explain why they chose such an old, small core.
  • by Antony T Curtis ( 89990 ) on Monday July 07, 2008 @06:18PM (#24090593) Homepage Journal

    If anyone remembers those old original Pentiums, their 16-bit processing sucked - so much so that a similarly clocked 486 could outperform them. I guess it would be reasonably trivial for Intel to slice off the 16-bit microcode on this old chip to make a 'pure' 32-bit-only processor. I am sure they will be using the designs with a working FPU... but for many visual operations, occasional math errors would largely go unnoticed. Remember when some graphics chip vendors were cheating on benchmarks by reducing quality... and how long it took for people to notice?

    Although, if I had Intel's resources and were designing a 32-core CPU, I would probably choose the core from the later 486 chips... I don't think a graphics pipeline processor would benefit much from the Pentium's dual instruction pipelines, and I doubt it would be worth the silicon real estate. The 486 has all the same important instructions useful for multi-core work - the CMPXCHG instruction debuted on the 486.

  • Re:I doubt it (Score:4, Interesting)

    by georgewilliamherbert ( 211790 ) on Monday July 07, 2008 @06:24PM (#24090739)

    One does not "shrink" a chip by taking photomasks and shrinkenating. One redoes the design / layout process, generally. The P5 series went from 0.8 um to 0.25 um over its lifetime (through Tillamook), stepping through 0.6, 0.35, and finally 0.25 um.

    It was 148 mm^2 at 0.6 um, so the process shrink should bring it down to a floorplan of around a square millimeter or so per core. Not sure how big the die will be for Larrabee, but the extra space will probably support the simple wide data unit per core and more cache. If the SIMD unit is simple, it could be another 3-4 million transistors / 1 square mm or so per core. For a 100 mm^2 chip, that leaves another 30 mm^2 or so for I/O and cache (either shared, or parceled out to the cores).
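    The per-core area estimate above follows from simple quadratic scaling (a back-of-the-envelope sketch using the parent's own figures, not official die-size data):

```python
# Quadratic die-area scaling for the P54C core, using the parent
# comment's figures: 148 mm^2 at 0.6 um, shrunk to 0.045 um (45 nm).
old_area_mm2 = 148.0
old_node_um = 0.6
new_node_um = 0.045

# Area scales roughly with the square of the linear feature size.
scaled_area = old_area_mm2 * (new_node_um / old_node_um) ** 2
print(round(scaled_area, 2))  # 0.83 -- "around a square millimeter" per core
```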

  • by hattig ( 47930 ) on Monday July 07, 2008 @06:49PM (#24091137) Journal

    Right. It clearly isn't using the Pentium design, but a Pentium-like design.

    To that, they will have added SMT, because (a) in-order designs adapt well to SMT, since they have a lot of pipeline bubbles, and (b) there will be a lot of latency in the memory system, and SMT helps hide that. I would assume 4-way SMT, but maybe 8. Larrabee would therefore support 128 or 256 hardware threads. nVidia's GTX 280 supports 768.
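    The thread totals follow directly from the SMT guess above (the 4-way and 8-way figures are the comment's assumptions, not announced specs):

```python
# Hardware-thread totals implied by the SMT guesses above
# (4-way and 8-way SMT are assumptions, not announced specs).
cores = 32
for smt_ways in (4, 8):
    print(cores, "cores x", smt_ways, "threads/core =", cores * smt_ways)
```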

    The closest chips I can think of right now are Sun's Niagara and Niagara 2 processors, except with a really beefy SIMD unit on each core, and a large number of cores on the die thanks to 45nm. I think Niagara 3 is going to be a 16-core device with 8 threads/core - can anyone confirm?

    Note that this is pretty much what Sony wanted with Cell, but Cell came two process shrinks too early. A 45nm PowerXCell32 will have 32 SPUs and 2 PPUs (whereas Larrabee looks like it pairs the equivalent of a weak PPU with each SPU equivalent). It could run at 5GHz too... power/cooling notwithstanding.

  • by greywire ( 78262 ) on Monday July 07, 2008 @06:57PM (#24091247) Homepage

    At least 20 years ago, I thought: hey, with the density and speed of transistors these days, and with RISC being popular, why not go all the way and make a chip with literally hundreds of (wait for it...) Z80 CPUs?

    Of course, I and others dismissed the idea as being just slightly ludicrous. But then, at the time, I also thought there would eventually be Amiga emulators and interpreted versions of the C language, for which I was also called crazy...

  • Re:I doubt it (Score:2, Interesting)

    by Anonymous Coward on Monday July 07, 2008 @07:30PM (#24091691)

    Actually, I used to work at Intel (around the time of 0.6um), and one could, and indeed did, sometimes shrink chips just by "shrinkenating", or perhaps shrinkenating followed by a design rule check. The result was a chip that was cheaper to manufacture and, in most cases, ran faster.

    Of course, to really take advantage of the smaller process node, one could revisit the cell library, circuit design, and logic, or any subset of the above, depending on what you were after. Often, time was of the essence, so you didn't do everything possible.

    I was not on the Pentium team, but I'd guess the P54 logic model was written in iHDL, which would mean that getting it through a modern synthesizer like Physical Compiler would require first converting it to Verilog. (They probably have a translator now.) But to get an efficient result, some serious changes to the RTL would almost certainly be required. Because wire delay matters much more at 0.045um than at 0.6um, the analysis of what work can be done "close by" or "far away" within a clock cycle will be quite a bit different.

    The real question is, if this core is going to spend 98% of its time cracking away with its super-sexy SIMD FP unit, why are they bothering with x86 cores anyway rather than something slimmer? It's not like they need to boot windows -- I hope.

  • by TransEurope ( 889206 ) <eniac&uni-koblenz,de> on Monday July 07, 2008 @08:04PM (#24092137)
    No, I'm using a German keyboard layout. The ' and the # are on the same key; I simply missed the shift key when typing the '.
  • by mbessey ( 304651 ) on Monday July 07, 2008 @08:47PM (#24092569) Homepage Journal

    According to the diagram in the article, the Larrabee has 8 GDDR memory interfaces, which will supply rather a lot of bandwidth. Presumably those are GDDR4 or GDDR5 interfaces, so that's 4.5 Gb/s x 8 = 36 Gb/s, or 4.5 GB/s of bandwidth.

    Getting data onto and off the board will still be a challenge - you're limited by PCI Express transfers.
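    The bandwidth figure above works out as follows (the 4.5 Gb/s per-interface rate is the parent's assumption, not a confirmed Larrabee spec):

```python
# Aggregate memory bandwidth from the comment's assumption:
# 8 GDDR interfaces at 4.5 Gbit/s each (not a confirmed spec).
interfaces = 8
gbit_per_interface = 4.5

total_gbit_s = interfaces * gbit_per_interface  # 36 Gbit/s aggregate
total_gbyte_s = total_gbit_s / 8                # 8 bits per byte

print(total_gbit_s, "Gbit/s =", total_gbyte_s, "GB/s")  # 36.0 Gbit/s = 4.5 GB/s
```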

  • by f8l_0e ( 775982 ) on Monday July 07, 2008 @08:57PM (#24092673)
    You should look at this [rapportincorporated.com]: 256 8-bit processing engines. Their product lineup used to include a product called the K1024, which had a PPC core with 1024 of these 8-bit processing engines.
  • Re:Why Not Atom? (Score:2, Interesting)

    by Intelista ( 1187985 ) on Monday July 07, 2008 @09:05PM (#24092777)
    I don't think Atom was done when they started Larrabee. Just a thought.
  • by JebusIsLord ( 566856 ) on Monday July 07, 2008 @11:52PM (#24094677)

    What I'm confused about: I believe around 40% of the original Pentium was x86 translation layer... it was the first chip to use a RISC-like internal setup. Nowadays that percentage is way lower, since the rest of the chip has received all the new transistors. Is this chip going to have 32 x86 translation units?

  • by greywire ( 78262 ) on Tuesday July 08, 2008 @12:10AM (#24094849) Homepage

    You know, I was actually going to note that in my post. Yep, the Z80 was probably the antithesis of RISC at the time. It had a lot of instructions for its day. I don't think any instruction took fewer than 4 clock cycles, and many or most took more than two of those 4-cycle machine cycles (for 8 or more total clock ticks). If I remember right.

    Much more RISC-like would have been the 6502 or something. But it had few internal registers, whereas the Z80 had lots... and I think RISC designs all have lots of registers.

    I figured the Z80 would work better in such an extremely high-core-count device for that reason. The 6502 needed a lot more memory accesses to get things done.

    Of course, the final conclusion of this line of thinking was: how simple a core could you possibly make? And then how many could you fit into a modern chip?

    I don't remember the name, but there were some people who made a many-core CPU whose processors ran Forth as their language. That was interesting...

  • Re:Pentium 75? (Score:3, Interesting)

    by DurendalMac ( 736637 ) on Tuesday July 08, 2008 @01:52AM (#24095843)
    While you're right that it only affected some specific operands, it was still a pretty glaring error that should never have made it into production. Worse still was Intel's response, which was a big "Meh, we'll replace your chip if you can show that you need one that works right." They only went ahead with a full replacement program after a big load of public outrage.
  • Re:I doubt it (Score:2, Interesting)

    by waferbuster ( 580266 ) on Tuesday July 08, 2008 @02:26AM (#24096067)
    OK, so let's just forget about the process-dependent part (Px54, which in reality was P854, since 12-inch wafers weren't in use yet). The P860 process came out with dual-damascene copper, while P854 still used aluminum metal interconnects. In the era of P854, hafnium was used in nuclear power (control rods) much more than in semiconductor manufacturing. There was no high-k dielectric for P854.
    He was talking about the Processor, not the Process. While it's nice to know Intel is resuscitating an old processor from the boneyard, the process to be used will be nothing like the original. Nowadays we're printing at 45nm-equivalent gate widths.

    The interesting part is that Intel is going to be doing a mashup of a grunch of old processors for parallel processing. Each of these sub-processors is going to make an Atom look massive, but collectively (with appropriate programming) they should be quite cool.
