IBM Releases Cell SDK

Posted by Zonk on Thursday November 10, 2005 @12:27PM from the toys-while-waiting-for-the-next-gen-consoles dept.

derek_farn writes "IBM has released an SDK running under Fedora Core 4 for the Cell Broadband Engine (CBE) Processor. The software includes many GNU tools, but the underlying compiler does not appear to be GNU-based. For those keen to start running programs before they get their hands on actual hardware, a full system simulator is available. The minimum system requirement specification has obviously not been written by the marketing department: 'Processor - x86 or x86-64; anything under 2GHz or so will be slow to the point of being unusable.'"

This discussion has been archived. No new comments can be posted.

Well . . . (Score:2, Funny)

by Yocto Yotta ( 840665 ) * writes:

But does it run Linux?

Oh. Well, okay then.
- Re:Well . . .Next question (Score:2)
  
  by Nom du Keyboard ( 633989 ) writes:
  
  But does it run Linux?
  Well, we know the answer to that. Next we want to know, will it kill Intel?
Wikipedia article question (Score:2, Insightful)

by goofyheadedpunk ( 807517 ) writes:

Not knowing too much about the cell processor I read the wikipedia article. I came across this: "In other ways the Cell resembles a modern desktop computer on a single chip."

Why?
- Re:Wikipedia article question (Score:2)
  
  by Surt ( 22457 ) writes:
  
  Because they are offering audio, video, networking on the same chip as the general purpose processing.
- Re:Wikipedia article question (Score:5, Insightful)
  
  by AKAImBatman ( 238306 ) writes: <akaimbatman&gmail,com> on Thursday November 10, 2005 @12:43PM (#13998565) Homepage Journal
  
  Um. That's kind of a weird statement. I think they mean to say that it encompasses much of the multiprocessing capabilities of a modern PC in a single chip. i.e. It's your CPU and GPU rolled into one.
  
  Cell processors aren't really anything all that new per se. The multi-core design makes them superficially similar to GPUs (which are also vector processors) with the difference that GPUs use multiple pipelines for parallel processing whereas each cell is a self-contained pipeline capable of true multi-threaded execution. In theory, the interplay between these chips could accelerate a lot of the work currently done through a combination of software and hardware. e.g. All the work that graphics drivers do to process OpenGL commands into vector instructions could be done on one or two cells, thus allowing those cells to feed the other cells with data.
  
  I guess you could say that the cell processor is the start of a general purpose vector processing design. I'm not really sure if it will take off, but unbroken throughput on these things is just incredible.
  
- Re:Wikipedia article question (Score:5, Insightful)
  
  by l33t-gu3lph1t3 ( 567059 ) writes: <arch_angel16.hotmail@com> on Thursday November 10, 2005 @12:45PM (#13998589) Homepage
  
  Easy answer - the wiki article on "Cell" isn't that good. Cell isn't a System-On-A-Chip. It's just a stripped-down, in-order power pc core coupled to 8 single-purpose in-order SIMD units, using an unconventional cache/local memory architecture. It can run perfectly optimized code very, very fast, at extremely low power consumption to boot, but optimization will be/is a bitch. For instance, you have to unroll your "for" loops to start, since those SIMD co-processors can't do loops.
  
  I'm sure IBM and Sony have much better documentation on the CPU than I do, but that's it in a nutshell. Everything else you hear about it is just marketing. Oh yeah, almost forgot. Microsoft's "Xenon" processor for the Xbox360 is pretty much just 3 of those stripped down, in-order PPC cores in one cpu die.
  
  - Re:Wikipedia article question (Score:2, Interesting)
    
    by AKAImBatman ( 238306 ) writes:
    
    Cell isn't a System-On-A-Chip. It's just a stripped-down, in-order power pc core coupled to 8 single-purpose in-order SIMD units, using an unconventional cache/local memory architecture
    
    You know, I'm looking back at all these replies to the poor guy, and I can't help but think that he's sitting in front of his computer wondering, "Can't anyone explain it in ENGLISH?!?" :-P
    
    For instance, you have to unroll your "for" loops to start, since those SIMD co-processors can't do loops.
    
    Actually, we need a new program
    - Re:Wikipedia article question (Score:3, Funny)
      
      by Jellybob ( 597204 ) writes:
      
      Looks like Ruby to me, although it's a little too verbose ;)
      
      (0..9).each { |i| puts i }
    - Re:Wikipedia article question (Score:2, Informative)
      
      by morgan_greywolf ( 835522 ) writes:
      
      That looks more like syntactic sugar to me. How is that different? More importantly, how would that translate differently into assembler code? You pretty much will wind up with the same thing, that is: "do your thang, increment the accumulator, if the accumulator equals the count, jump to do your thang."
      
      gcc and other compilers have options such as -funroll-loops, which will unroll loops (no matter how they were specified) for you if the count can be determined at compile time. So you wind up with "Do yo
      - Re:Wikipedia article question (Score:3, Informative)
        
        by tomstdenis ( 446163 ) writes:
        
        GCC can unroll all loops if you want, including those with variable iteration counts. In those cases it uses a variant of Duff's device. [well on x86 anyways].
        
        As for the other posters, the real reason you want to unroll loops is basically to avoid the cost of managing the loop, e.g.
        
        a simple loop like
        
        for (a = i = 0; i < b; i++) a += data[i];
        
        In x86 would amount to
        
        mov ecx,b
        loop:
        add eax,[ebx]
        add ebx,4
        dec ecx
        jnz loop
        
        So you have a 50% efficiency at best. Now if you unroll it to
        
        mov ecx,b
        shr ecx,1
        loop:
        add eax,[e
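The unrolling idea in the assembly above can also be sketched in C. This is only an illustration, not the code from the truncated post: the function names `sum_simple` and `sum_unrolled4` are made up here, and the unroll factor of four is an arbitrary choice. The cleanup loop handles counts that are not a multiple of four, which the assembly versions in this thread quietly assume away.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: one add per iteration plus the loop bookkeeping. */
int sum_simple(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    int a = 0;
    for (size_t i = 0; i < n; i++)
        a += data[i];
    return a;
}

/* Same sum, manually unrolled by four: four adds share one counter update
   and one branch. A cleanup loop handles the n % 4 leftover elements. */
int sum_unrolled4(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    int a = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a += data[i];
        a += data[i + 1];
        a += data[i + 2];
        a += data[i + 3];
    }
    for (; i < n; i++)      /* leftover 0..3 elements */
        a += data[i];
    return a;
}
```

Both functions return the same value for any input; whether the unrolled one is actually faster depends on the compiler and the target, so it is worth measuring rather than assuming.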
        
        Re:Wikipedia article question (Score:3, Interesting)
        
        by AKAImBatman ( 238306 ) writes:
        
        mov ecx,b
        shr ecx,2
        loop:
        add eax,[ebx]
        add eax,[ebx+4]
        add eax,[ebx+8]
        add eax,[ebx+12]
        add ebx,16
        dec ecx
        jnz loop
        
        With SIMD instructions, you can execute all four of those adds in one instruction. I wish I knew SSE a bit better, then I could rewrite the above. Sadly, I haven't gotten around to learning the precise syntax. :-(
        
        However, there's a fairly good (if a bit dated) explanation of SIMD here [mackido.com].
        
        Re:Wikipedia article question (Score:2)
        
        by tomstdenis ( 446163 ) writes:
        
        Yes, and unrolling would speed that up in the same fashion.
        
        iirc the instruction is "paddd", and you'd do four parallel adds then shuffle and add twice to get the single sum.
        
        Tom
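For readers who would rather not write the assembly by hand, the "four parallel adds, then shuffle and add twice" recipe can be sketched with SSE2 intrinsics in C. The function name `sum_paddd` is invented for this sketch, the unaligned load is a conservative choice, and the scalar fallback is only there so the code still builds on machines without SSE2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Sum n 32-bit integers, wrapping on overflow like the paddd instruction. */
int32_t sum_paddd(const int32_t *data, size_t n)
{
    assert(data != NULL || n == 0);
    size_t i = 0;
    uint32_t total = 0;
#if defined(__SSE2__)
    __m128i acc = _mm_setzero_si128();                 /* like pxor xmm1, xmm1 */
    for (; i + 4 <= n; i += 4) {
        /* unaligned 128-bit load (movdqu) */
        __m128i v = _mm_loadu_si128((const __m128i *)(data + i));
        acc = _mm_add_epi32(acc, v);                   /* paddd: four adds at once */
    }
    /* Horizontal add: swap the 64-bit halves and add, then swap neighbouring
       32-bit lanes and add. Lane 0 then holds the sum of all four lanes. */
    acc = _mm_add_epi32(acc, _mm_shuffle_epi32(acc, _MM_SHUFFLE(1, 0, 3, 2)));
    acc = _mm_add_epi32(acc, _mm_shuffle_epi32(acc, _MM_SHUFFLE(2, 3, 0, 1)));
    total = (uint32_t)_mm_cvtsi128_si32(acc);
#endif
    for (; i < n; i++)                                 /* scalar tail or fallback */
        total += (uint32_t)data[i];
    return (int32_t)total;
}
```

The two shuffles are one way to do the horizontal add; other instruction sequences give the same result.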
        
        Re:Wikipedia article question (Score:2)
        
        by ameline ( 771895 ) writes:
        
        Try:
        
        mov ecx, b
        shr ecx, 2
        pxor xmm1, xmm1
        loop_: // you can unroll this loop...
        movdqa xmm0, [ebx] // aligned-- use movdqu for unaligned
        paddd xmm1, xmm0
        add ebx, 16
        dec ecx
        jnz loop_
        // now just need to do a horizontal add...
        // but there
    - Re:Wikipedia article question (Score:2)
      
      by jcnnghm ( 538570 ) writes:
      
      mov ecx, COUNT
      LOOP_START:
      ;IIRC this is the underlying assembly
      ;construct for looping
      ;
      ;excluding conditional jumps
      LOOP LOOP_START
    - Re:Wikipedia article question (Score:2)
      
      by MatD ( 895409 ) writes:
      
      In most cases, I think template metaprogramming (in C++) is pedantic garbage. In this case however, you could probably use it to great effect (ie, the compiler will unroll your loops for you). The syntax is still pretty horrible though.
    - - Re:Wikipedia article question (Score:2)
        
        by AKAImBatman ( 238306 ) writes:
        
        Excellent! Now all we need are SIMD optimized LISP Compilers.
        
        (Must (resist (temptation (to (joke (about (syntax))))))) :-P
      - Re:Wikipedia article question (Score:2, Funny)
        
        by hr raattgift ( 249975 ) writes:
        
        (dotimes i (code-goes-here))
        
        Ack, pfft, says the evil Schemer. This is just insipid syntactic sugar for what you really mean:
        (let loop ((i number-of-iterations))
          (if (= i 0)
              #f ;; because CommonLisp dotimes returns NIL
              (begin (code-goes-here) (loop (- i 1)))))
        
        instead of whatever dark magic your buggy
        (dotimes (i number-of-iterations) (code-goes-here))
        
        ends up being mangled into by your CommonLis
    - - Re:Wikipedia article question (Score:2)
        
        by pdbogen ( 596723 ) writes:
        
        Reportedly, the SIMD processors can't do loops. Okay, this probably just means they can't Branch. A loop in assembly basically looks like:
        
        loop: /* do some stuff */
        branch to loop
        
        However, you can "unroll" loops. If you have a loop that always runs 8 times, instead of doing a for loop you can just put the statement there 8 times. It makes the code larger in memory, but it saves processing time since you don't have to check exit conditions or jump around. This would be something done by the compiler, so OP's po
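The "always runs 8 times" case can be written out in C as a rough sketch. The dot-product workload and the names `dot8_loop` and `dot8_unrolled` are assumptions picked for illustration; a compiler may well produce the second form from the first on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Loop form: a compare and a branch on each of the 8 iterations. */
int dot8_loop(const int *x, const int *y)
{
    assert(x != NULL && y != NULL);
    int acc = 0;
    for (int i = 0; i < 8; i++)
        acc += x[i] * y[i];
    return acc;
}

/* Fully unrolled form: the body is written 8 times and no loop branch
   remains, at the cost of more code in memory. */
int dot8_unrolled(const int *x, const int *y)
{
    assert(x != NULL && y != NULL);
    int acc = 0;
    acc += x[0] * y[0];
    acc += x[1] * y[1];
    acc += x[2] * y[2];
    acc += x[3] * y[3];
    acc += x[4] * y[4];
    acc += x[5] * y[5];
    acc += x[6] * y[6];
    acc += x[7] * y[7];
    return acc;
}
```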
        
        Re:Wikipedia article question (Score:2)
        
        by BarryNorton ( 778694 ) writes:
        
        This would be something done by the compiler[...] It might still be necessary to make your loops more unrollable than otherwise
        
        Perhaps something like writing in tail recursive style to help out an optimising compiler?...
        
        Re:Wikipedia article question (Score:3, Informative)
        
        by hr raattgift ( 249975 ) writes:
        
        Perhaps something like writing in tail recursive style to help out an optimising compiler?...
        You have this backwards. Optimizing compilers will turn tail-recursive style source into "normal" loops.
        
        You can write a loop recursively, so that:
        foo() { int x=8; int b=1; while(x > 0) { b <<= 1; --x; } return b; }
        
        becomes
        foo() { return foo_helper(8, 1); } foo_helper(int x, int b) {
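Since the post is cut off, here is one possible way to write the loop and its tail-recursive twin in C. It is a sketch rather than the poster's code: the names `shift_loop` and `shift_tail` are hypothetical, and the counter and accumulator are passed in as parameters instead of being fixed inside `foo`.

```c
#include <assert.h>

/* Iterative form: shift b left once per iteration, x times in total. */
int shift_loop(int x, int b)
{
    assert(x >= 0);
    while (x > 0) {
        b <<= 1;
        --x;
    }
    return b;
}

/* Tail-recursive form: the recursive call is the very last thing the
   function does, so an optimizing compiler is free to turn it back into
   the loop above instead of growing the stack. */
int shift_tail(int x, int b)
{
    assert(x >= 0);
    if (x == 0)
        return b;
    return shift_tail(x - 1, b << 1);
}
```

With x = 8 and b = 1 both versions return 256.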
        
        Re:Wikipedia article question (Score:2)
        
        by BarryNorton ( 778694 ) writes:
        
        You have this backwards. Optimizing compilers will turn tail-recursive style source into "normal" loops.
        
        Thanks, no I don't. I said "like writing in tail recursive style" - I know what it means.
        My point is that, just like one can write recursions in a form that a compiler can turn them into something more (stack) efficient, so one might write iterations in a style that they can be unwound more easily (like using a primitive type as the counter, rather than an OO-style iterator)...
        
        Re:Wikipedia article question (Score:2, Interesting)
        
        by hr raattgift ( 249975 ) writes:
        
        Ah, OK, I had to think about this a bit... please correct me if I'm still misunderstanding you.
        
        I now think you were using a simile or making an analogy to argue that compilers can benefit from careful construction of loops in the source code.
        
        If so, then of course I agree with you.
        
        Saying this in a much more general way: careful choice of syntax can make the semantics more clear to the compiler.
        
        A high level language with "dotimes (count) { action }" syntax lets the compiler make good choices about loop unrol
        
        Re:Wikipedia article question (Score:2)
        
        by BarryNorton ( 778694 ) writes:
        
        For what it's worth, when I say your sentence to myself, I want to make the like bold, I guess to emphasize the simile
        
        Quite (hence my second post) - I couldn't work out what you thought I meant - at first I wondered if you thought I meant that iterations could be turned into recursions by the Cell compiler (i.e. the opposite to the normal optimisation, which is why I was trying to make it clear that I know what direction this happens in), then I realised you'd mistaken my analogy for an example... Rather
        
        Re:Wikipedia article question (Score:2)
        
        by BarryNorton ( 778694 ) writes:
        
        Did you perhaps misunderstand 'like' as introducing an example, rather than an analogy? ('Optimising compiler' was the clue that I was talking about a different scenario - the compiler tricks being discussed in the general context are not optimisations...)
        
        Re:Wikipedia article question (Score:2)
        
        by BarryNorton ( 778694 ) writes:
        
        I could have been clearer, looking back :)
      - Re:Wikipedia article question (Score:2)
        
        by JWW ( 79176 ) writes:
        
        I'm pretty sure he spelled frak right.
        
        Frak is actually a made up swear word from Battlestar Galactica.
        
        It is sometimes used by the super geeky. Like er, um, er.. me.
    - - Re:Wikipedia article question (Score:2)
        
        by LWATCDR ( 28044 ) writes:
        
        You don't write code do you?
        Think about it just in broad terms. Computer programming is like math. It really is best expressed visually. Think of a math class with no white board and just someone lecturing. Pretty useless. So even if you have an AI as smart or smarter than a person they will probably still want to see what you're talking about.
        Not to mention that AIs as smart as a Hamster are still years or decades away.
  - Re:Wikipedia article question (Score:5, Informative)
    
    by plalonde2 ( 527372 ) writes: on Thursday November 10, 2005 @12:55PM (#13998686)
    
    You are wrong. These SIMD processors do loops just fine. There's a hefty hit for a mis-predicted branch, but the branch hint instruction works wonders for loops.
    The reason you want to unroll loops is because of various other delays. If it takes 7 cycles to load from the local store to a register, you want to throw a few more operations in there to fill the stall slots. Unrolling can provide those operations, as well as reduce the relative importance of branch overheads.
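One common way to "throw a few more operations in there" is to unroll with several independent accumulators, so the work for one element can proceed while another load or add is still in flight. The sketch below is plain C, not SPE code; the names `sum_chain` and `sum_four_chains` and the choice of four accumulators are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* One accumulator: every add depends on the result of the previous add,
   so the latency of each step is fully exposed. */
float sum_chain(const float *data, size_t n)
{
    assert(data != NULL || n == 0);
    float a = 0.0f;
    for (size_t i = 0; i < n; i++)
        a += data[i];
    return a;
}

/* Four independent accumulators: the four adds in the loop body do not
   depend on each other, which gives the hardware something to do while
   earlier loads and adds are still completing. */
float sum_four_chains(const float *data, size_t n)
{
    assert(data != NULL || n == 0);
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a0 += data[i];
        a1 += data[i + 1];
        a2 += data[i + 2];
        a3 += data[i + 3];
    }
    float total = (a0 + a1) + (a2 + a3);
    for (; i < n; i++)          /* leftover 0..3 elements */
        total += data[i];
    return total;
}
```

Note that the second version adds the floats in a different order, so for general inputs the result can differ in the last bits from the single-chain sum.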
    
    - Re:Wikipedia article question (Score:2)
      
      by AKAImBatman ( 238306 ) writes:
      
      You are wrong. These SIMD processors do loops just fine. There's a hefty hit for a mis-predicted branch, but the branch hint instruction works wonders for loops.
      
      Um... I'm not sure that's what he's trying to say. SIMD by definition is Single Instruction, Multiple Data. i.e. You give it a couple of instructions and watch it perform them on every item in the stream of data. By definition, a loop is an iteration over each instruction, multiple times. a.k.a. Multiple Instruction Multiple Data (MIMD).
      
      What's neede
      - Re:Wikipedia article question (Score:2)
        
        by AKAImBatman ( 238306 ) writes:
        
        a.k.a. Multiple Instruction Multiple Data (MIMD).
        
        Minor correction. That's supposed to be Single Instruction, Single Data. (SISD) My bad.
      - Re:Wikipedia article question (Score:2)
        
        by plalonde2 ( 527372 ) writes:
        
        The SIMD in question here is the Altivec/SSE style also called SWAR (SIMD Within A Register); the instruction set has many ops for applying the same operation over each of the 4 (or 8, or 16) elements within a 128 bit register. It's not the streaming type of SIMD.
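The "within a register" idea can be shown even without vector hardware. The sketch below treats an ordinary 32-bit word as four 8-bit lanes and adds them all with a handful of scalar operations; it is a toy illustration of the concept, not how Altivec, SSE, or the SPEs implement it, and the name `swar_add_u8x4` is made up.

```c
#include <stdint.h>

/* Add four unsigned 8-bit lanes packed in a 32-bit word. Each lane wraps
   modulo 256 and no carry leaks into the neighbouring lane. */
uint32_t swar_add_u8x4(uint32_t a, uint32_t b)
{
    const uint32_t high = 0x80808080u;             /* top bit of every lane */
    uint32_t low_sum = (a & ~high) + (b & ~high);  /* add the low 7 bits per lane */
    return low_sum ^ ((a ^ b) & high);             /* fold the top bits back in */
}
```

A real SIMD instruction set does the same job in one instruction on a much wider register, with the lane boundaries enforced by the hardware instead of by masking.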
  - Re:Wikipedia article question (Score:2)
    
    by adam31 ( 817930 ) writes:
    
    You were off to a really good start! But a couple of things:
    Optimization won't be a problem. At least it won't be the main problem. The instruction set is rich enough to provide scalar and vector integer/fp/dp operations along with both conditional branching and conditional assignment. And it can be programmed in C using intrinsics for SIMD instead of assembly. That brings up the really important part-- 128 128-bit registers. Current x86 compilers suck balls at intrinsics mostly because SSE is such
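To make "conditional assignment" concrete, here is a small sketch in portable C of picking one of two values with a mask instead of a branch. It only illustrates the general technique; the names `select_i32`, `max_branch` and `max_select` are hypothetical and nothing here is taken from the Cell SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Branching version: the processor has to guess which way the test goes. */
int32_t max_branch(int32_t a, int32_t b)
{
    if (a > b)
        return a;
    return b;
}

/* Conditional assignment: turn the condition into an all-ones or all-zeros
   mask and use it to pick a or b, with no branch in the source. */
int32_t select_i32(int cond, int32_t a, int32_t b)
{
    assert(cond == 0 || cond == 1);
    uint32_t mask = (uint32_t)0 - (uint32_t)cond;   /* 0xFFFFFFFF or 0 */
    return (int32_t)(((uint32_t)a & mask) | ((uint32_t)b & ~mask));
}

int32_t max_select(int32_t a, int32_t b)
{
    return select_i32(a > b, a, b);
}
```

Whether the mask form wins depends on how predictable the branch is; for a branch that is almost always taken the same way, the plain `if` may be just as good.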
  - - - Re:Wikipedia article question (Score:2)
        
        by be-fan ( 61476 ) writes:
        
        IIRC, the SPEs have 18 stage pipelines. Okay for an FP pipe, but quite long for an INT pipe.
- Re:Wikipedia article question (Score:2, Informative)
  
  by stienman ( 51024 ) writes:
  
  The Cell processor is essentially a multi-core chip. It has, IIRC, one "master" CPU, and then multiple slave CPUs on the same die.
  
  A modern desktop computer has one master CPU, then several smaller CPUs each running their own software. Graphics, Sound, CD/DVD, HD, not to mention all the CPUs in all the peripherals.
  
  But the analogy ends there. The Cell has certain limitations and wouldn't be able to operate as a full computer system with no other processors very efficiently. I believe the PS3 has a s
- Re:Wikipedia article question (Score:2)
  
  by Jerry Coffin ( 824726 ) writes:
  
  I suspect the author of the Wikipedia article knows a bit more than he's being given credit for elsethread.
  Each cell processor includes not only the multiple processors mentioned elsethread, but addressable memory, DMA controller, and a controller for what is essentially a proprietary network. The last is somewhat open to argument -- for example, current AMD CPUs include HyperTransport controllers, which are somewhat similar.
  In any case, IBM does (e.g. here [ibm.com]) talk about the Cell as a System on a Chip, t
- Re:Wikipedia article question (Score:2)
  
  by adisakp ( 705706 ) writes:
  
  Not knowing too much about the cell processor I read the wikipedia article. I came across this: "In other ways the Cell resembles a modern desktop computer on a single chip."
  
  Why?
  
  Actually each of the SPUs resembles a system-on-a-chip. They each have local memory, CPU and I/O. The Cell itself actually resembles a network-on-a-chip (or in slashdotology, a Beowulf-Cluster-on-a-Chip) if you consider main memory to be I/O storage.
Is this the same Cell processor used in the PS3? (Score:2)

by Spy der Mann ( 805235 ) writes:

Just to clarify.
- Re:Is this the same Cell processor used in the PS3 (Score:5, Funny)
  
  by Spazntwich ( 208070 ) writes: on Thursday November 10, 2005 @12:39PM (#13998524)
  
  No. In our insanely litigious society, a company has graciously allowed another to create and market a different processor by the same exact name.
  
- - - Re:Is this the same Cell processor used in the PS3 (Score:2, Funny)
      
      by Anonymous Coward writes:
      
      I not get mine run. Please send exact instruction how downloaded PS3 games play can?
Unproductive? (Score:5, Funny)

by RManning ( 544016 ) writes: on Thursday November 10, 2005 @12:40PM (#13998532) Homepage

My favorite quote from TFA...

...in addition, the ILAR license states that "You are not authorized to use the Program for productive purposes" -- so make sure that your time spent with these downloads is as unproductive as possible.

- Re:Unproductive? (Score:2)
  
  by Kayamon ( 926543 ) writes:
  
  Sounds like my job. I don't think there'll be any problems there. :-)
Since the submitter didn't bother to explain... (Score:5, Informative)

by frankie ( 91710 ) writes: on Thursday November 10, 2005 @12:41PM (#13998540) Journal

...the Cell processor is an upcoming PowerPC variant that will be used in the PlayStation 3. It's great at DSP but terrible at branch prediction, and would not make a very good Mac. If you want to know full tech specs, Hannibal is da man [arstechnica.com].

- Re:Since the submitter didn't bother to explain... (Score:2)
  
  by imroy ( 755 ) writes:
  
  I'm just wondering what information you have on the Cell being "terrible at branch prediction"? I don't know about using it in a Mac, but IBM seems eager to run Linux on it. They've even demonstrated a prototype cell-based blade server system [slashdot.org] running Linux, back in May.
  - Re:Since the submitter didn't bother to explain... (Score:2)
    
    by nutshell42 ( 557890 ) writes:
    
    Even though the GP linked to an article that greets you with Inside the Xbox 360, Part II: the Xenon CPU there are links to some informative articles about the Cell architecture further down.
    Short story: The cool thing about the Cell are the SPEs that are the best thing since sliced bread if you have lots of matrix-vector operations to perform but more or less useless otherwise.
    IBM is eager to run Linux on it because the Cell could make one hell of a supercomputing grid. (Although it loses lots of flops i
  - - Re:Branch Prediction (Score:2)
      
      by imroy ( 755 ) writes:
      
      Wow, you could not be more wrong. See the wikipedia article on branch misprediction [wikipedia.org]. You should probably read up on exactly what RISC [wikipedia.org] means as well. I have the "SPU assembly language" document here from IBM (can't remember where I got it from, sorry). The branch instructions (not JUMP) can jump to any location stored in any 32-bit register, minus the two least significant bits. It is a RISC CPU after all. Or it can branch relative to the current PC using an 18-bit direct value. Considering the first generat
    - - Re:Branch Prediction (Score:2)
        
        by TurkishGeek ( 61318 ) writes:
        
        This is exactly what the Cell SPEs have. The SPE toolchain uses "branch hints", which the compiler inserts based on the GCC builtin "__builtin_expect". Take a look at the "SPU C/C++ Language Extensions" document that was released a while back by the Cell team.
        
        Most of the other posters have no idea what they are talking about. The PPE is a fully PowerPC compliant two-way SMT processor and absolutely has a branch predictor. It is the SPEs (SIMD vector units) that do not have branch prediction, but they do ha
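For anyone who has not used it, `__builtin_expect` is a GCC builtin that tells the compiler which outcome of a condition is expected. The sketch below shows typical usage in plain C; the `LIKELY`/`UNLIKELY` macro names and the `count_negative` function are illustrative assumptions, and how a given compiler turns the hint into SPE branch-hint instructions is not something this example demonstrates.

```c
#include <assert.h>
#include <stddef.h>

/* Wrap the builtin so the code still compiles on other compilers. */
#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Count negative entries, telling the compiler they are expected to be rare
   so it can lay out the common path as the fall-through case. */
size_t count_negative(const int *data, size_t n)
{
    assert(data != NULL || n == 0);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(data[i] < 0))
            count++;
    }
    return count;
}
```

The hint never changes the result, only the code layout and any hint instructions the compiler chooses to emit, so a wrong guess costs speed but not correctness.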
        
        Re:Branch Prediction (Score:2)
        
        by TurkishGeek ( 61318 ) writes:
        
        Agreed, the PPE core only has a 4KB by 2-bit BHT (branch history table). Note that the PPE pipeline depth is only 23 stages (i.e. branch misprediction penalty is 23 cycles), so a misprediction penalty is comparable to designs that run at far, far slower clocks. I am not sure if the main motivation was making chip real estate available for other things: The recent IBM Journal of R & D paper by Kahle et al. is an excellent read to gain insight into the design decisions they took, and I believe they were co
        
        Re:Branch Prediction (Score:2)
        
        by be-fan ( 61476 ) writes:
        
        I'd hardly call a CPU with a 23-stage pipeline and no out of order execution 'elegant'. Maybe to a hardware guy, but all I see when I look at Cell is absolutely atrocious integer performance.
        
        Re:Branch Prediction (Score:2)
        
        by TurkishGeek ( 61318 ) writes:
        
        I am a hardware guy, and the design is far more elegant and simpler than most of the competing CPUs out there; mainly as a result of the push to get it to work at the 4GHz+ frequency range.
        
        I think it's very early to talk about the integer performance of Cell. I have been working on Cell for a few months now, and all I can say is that the integer performance of the PPE core is on par with the competition; and it beats them handily using hand-written code to take advantage of the SPEs.
        
        Re:Branch Prediction (Score:2)
        
        by be-fan ( 61476 ) writes:
        
        On par with what? I'm a Lisp guy/compiler enthusiast. I like processors with out-of-order execution that don't care about code scheduling, have excellent branch prediction, have low memory latency, etc. Basically, my ideal processor is an Opteron. It's all about perspective, hence my criticism of your use of the word "elegant".
- Re:Since the submitter didn't bother to explain... (Score:2)
  
  by poot_rootbeer ( 188613 ) writes:
  
  It's great at DSP but terrible at branch prediction
  
  With 8 or more semi-independent "Synergistic Processing Unit" pipelines, it doesn't really need to have a lot of complex branch prediction logic. It could adopt a bit of a quantum methodology and assign a different SPU to proceed for each possible outcome of a compare/branch instruction, and then once the correct outcome has been established, discard the "dead-end" pipelines.
  
  Then again, I learned microprocessor design principles back when the PPC 601 was s
- Re:Since the submitter didn't bother to explain... (Score:2)
  
  by MaestroSartori ( 146297 ) writes:
  
  That's not entirely true. The PPE can do branch prediction, the SPEs can't. Whether the PPE's branch prediction is any good or not, I don't know... :)
- - Re:Since the submitter didn't bother to explain... (Score:2)
    
    by Glock27 ( 446276 ) writes:
    
    3) Steve goes out on stage and pretends like he has made the 'choice' to move to Intel
    You're bitter about something. Care to share?
    Steve most certainly made a decision to go Intel. No "pretending" involved. Just what dollar value do you ascribe to "5% of IBM's chip volume", BTW?
    4) With Cell processors in Macs no longer an option for Apple, the sour grapes meme that the idiot above parroted starts to make its rounds in Mac circles.
    Cell wouldn't be that great for its clock speed, but it would certain
Source for actual chips? (Score:4, Interesting)

by mustafap ( 452510 ) writes: on Thursday November 10, 2005 @12:47PM (#13998605) Homepage

That's great news, but as an embedded systems designer and eternal tinkerer, where will I be able to buy a handful of these processors to experiment with? Without having to dismantle loads of games machines ;o)

- I should have added... (Score:2)
  
  by mustafap ( 452510 ) writes:
  
  That I am in the UK, although I don't think that will make much difference :o)
  
  But I would like to know.
  
  Mike.
- Re:Source for actual chips? (Score:2)
  
  by Wesley Felter ( 138342 ) writes:
  
  Unfortunately for you, you don't "tinker" with Cell. Since all the I/O is multi-GHz exotic Rambus signaling, you probably have to be an expert board designer to do anything with it. Not to mention that you have to get the processor, southbridge, and RAM from three different companies, probably signing a stack of NDAs in the process.
  - Re:Source for actual chips? (Score:2)
    
    by mustafap ( 452510 ) writes:
    
    Ah. When I read 'the size of your fingernail' I assumed it would be like an ARM core. Oh well. :o(
    
    Thanks,
    
    Mike.
    - Re:Source for actual chips? (Score:2)
      
      by AKAImBatman ( 238306 ) writes:
      
      All CPUs are the size of your fingernail. It's the packaging that makes them appear large. :-)
      - Re:Source for actual chips? (Score:2)
        
        by mustafap ( 452510 ) writes:
        
        The Intel P4 is 15x15mm. You must have very large fingernails.
        
        Re:Source for actual chips? (Score:2)
        
        by AKAImBatman ( 238306 ) writes:
        
        I don't have a ruler with me at the moment, but that looks pretty close to right for my thumbnail. What can I say? I'm a big guy. :-)
        
        Re:Source for actual chips? (Score:2)
        
        by mustafap ( 452510 ) writes:
        
        Well, I won't argue with a big guy :o)
        
        I'd forgotten that these processors are not made on the 3 micron processes like the chips I used to work on!
        
        Re:Source for actual chips? (Score:2)
        
        by AKAImBatman ( 238306 ) writes:
        
        I'd forgotten that these processors are not made on the 3 micron processes like the chips I used to work on!
        
        3 microns? Wow. That's huge! The top of the line chips these days are easily below 0.5 microns. (The PIV chips are 0.18 and 0.13 microns!) I know I was just shocked when I got my Spartan III FPGA kit. I couldn't believe how small the thing was in its packaging. I can't even imagine how small the actual die must be!
        
        Re:Source for actual chips? (Score:2)
        
        by mustafap ( 452510 ) writes:
        
        >I know I was just shocked when I got my Spartan III FPGA kit.
        
        cool! I just got a Spartan III dev board in the post last week too. First thing I did was hook it up to a monitor and twiddle a few buttons :o)
        Fancy chatting about it by email?
        
        mikehibbett at oceanfree (dot) net
        
        Mike.
What about a PPC SDK and simulator? (Score:5, Interesting)

by kuwan ( 443684 ) writes: on Thursday November 10, 2005 @12:49PM (#13998619) Homepage

As the Cell is basically a PPC processor I find it strange that the SDK is for x86 processors. Fedora Core 4 (PowerPC), also known as ppc-fc4-rpms-1.0.0-1.i386.rpm is listed as one of the files you need to download. Maybe it's just because of the large installed base of x86 machines.

It'd be nice if IBM released a PPC SDK for Fedora, it would have the potential to run much faster than an x86 SDK and simulator.

- Re:What about a PPC SDK and simulator? (Score:2)
  
  by antifoidulus ( 807088 ) writes:
  
  Not sure how much faster it would be really (though I'm writing this from a powerbook and I really wish they would release some ppc stuff). A PPC chip acts as the controller but the actual processing is done on chips with architectures vastly different from both x86 and PPC. For instance, they aren't superscalar so they do no branch prediction like both x86 and PPC do...so really the emulation speed is pretty independent of its host architecture. I suppose they could use the Altivec found on Apple's CPU
- Re:What about a PPC SDK and simulator? (Score:2)
  
  by Jozer99 ( 693146 ) writes:
  
  I don't know how much of a performance boost you would get. Despite the fact that it has a PowerPC processor on it, it is a very different PowerPC. It does not have out of order execution, and has all those pretty vector processors dangling off of it. I think emulation would be 2x faster, at most.
- Another question about the simulator (Score:2)
  
  by Weaselmancer ( 533834 ) writes:
  
  I wonder if it'll take advantage of multi-core chips? Might make sense to do so, especially since that's also (sort of) similar to the hardware being simulated.
- Re:What about a PPC SDK and simulator? (Score:2)
  
  by jmorris42 ( 1458 ) * writes:
  
  > Maybe it's just because of the large installed base of x86 machines.
  
  Got it in one try. Anyone interested in actually using this thing has a spare PC to load FC4 on, almost none has a spare top of the line PowerMac in the closet. Heck, most don't have a top of the line Powermac period.
  - Re:What about a PPC SDK and simulator? (Score:2)
    
    by bhima ( 46039 ) writes:
    
    Well, I have a dual 2.5 G5 and as easy as it is to dual boot with OS X I'd devote a firewire disk to it for a while.
    I keep having this fantasy that a PCI-E development board will come out and I'll be able to do something interesting with it (what I have no idea but I'm open to suggestions). I'd really like OS X development environment for it to tinker with.
- Not a PPC Processor (Score:2, Informative)
  
  by MJOverkill ( 648024 ) writes:
  
  Once again, the Cell is not a PPC processor. It is not PPC-based. The Cell going into the PlayStation 3 has a POWER-based PPE (Power Processor Element) that is used as a controller, not a main system processor. Releasing an SDK for Macs would not give any advantage over an x86-based SDK because you are still emulating another platform.
  
  Wiki [wikipedia.org]
  - Re:Not a PPC Processor (Score:2)
    
    by Wesley Felter ( 138342 ) writes:
    
    Well, the PPE is a PowerPC. Whether you call the PPE a "controller" or "main system processor" is really just a matter of definitions (I think both terms are simultaneously applicable).
    - - Re:Not a PPC Processor (Score:3, Informative)
        
        by Wesley Felter ( 138342 ) writes:
        
        "Power Architecture" is PowerPC.
        
        What is Power Architecture technology? [ibm.com]
        
        "Power Architecture is an umbrella term for the PowerPC® and POWER4(TM) and POWER5(TM) processors produced by IBM, as well as PowerPC processors from other suppliers."
- Re:What about a PPC SDK and simulator? (Score:3, Informative)
  
  by pbohrer ( 930124 ) writes:
  
  The simulator is actually maintained on a number of different platforms within IBM. Since the rest of the SDK team (xlc, cross-dev gcc, sample & libs, etc.) chose Fedora Core 4 on x86 as a means of enabling the largest number of people, we didn't want to confuse too many people by supplying the simulator on a variety of platforms for which the rest of the SDK is not supported. This was somewhat of a big-bang release of quite a bit of software to enable exploration of Cell. Now that we have this released a
Comment removed (Score:5, Interesting)

by account_deleted ( 4530225 ) writes: on Thursday November 10, 2005 @12:50PM (#13998628)

Comment removed based on user account deletion

- Re:GNU toolchain (Score:4, Informative)
  
  by Have Blue ( 616 ) writes: on Thursday November 10, 2005 @01:24PM (#13999002) Homepage
  
  IBM may have run into the same problems with the Cell that they did with the PowerPC 970: the chip breaks some fundamental assumptions GCC makes, and to add the best optimization possible it would be necessary to modify the compiler more drastically than the GCC leads would allow (to keep GCC completely platform-agnostic).
  
  - Re:GNU toolchain (Score:2)
    
    by advocate_one ( 662832 ) writes:
    
    > IBM may have run into the same problems with the Cell that they did with the PowerPC 970: the chip breaks some fundamental assumptions GCC makes, and to add the best optimization possible it would be necessary to modify the compiler more drastically than the GCC leads would allow (to keep GCC completely platform-agnostic).
    
    Who the heck says they have to keep the GCC they distribute with the software development kit platform-agnostic? What a stupid concept. It has absolutely NOTHING to do with the GCC leads.
- Re:GNU toolchain (Score:4, Informative)
  
  by Wesley Felter ( 138342 ) writes: <wesley@felter.org> on Thursday November 10, 2005 @01:32PM (#13999098) Homepage
  
  The SDK includes both GCC and XLC. GCC's autovectorization isn't the greatest, but Apple and IBM have been working on it. I think if you want fast SPE code you'll end up using intrinsics anyway.
  
Echoes of Redhat (Score:3, Insightful)

by delire ( 809063 ) writes: on Thursday November 10, 2005 @12:51PM (#13998649)

Why Fedora is so often considered the default target distribution I don't know. Even the project page [redhat.com] states it's an unsupported, experimental OS, and one now comparatively marginal when tallied [distrowatch.com].

Must be a case of 'brand leakage' from a distant past, one that held Redhat as the most popular desktop Linux distribution.

Shame, I guess IBM is missing out on where the real action is.

- I agree! (Score:2)
  
  by BobPaul ( 710574 ) * writes:
  
  Give me a nice clean distro like Gentoo any day. I can't stand that a Fedora install requires 5 CDs and installs some 600 packages that I will never use. Why do I need so many text editors, etc.? I get lost and nervous in the Applications menu. Sure, I tried 30 text editors before I found the one I wanted, but that's all I install on my box during a reinstall or upgrade.
  
  BTW, the parent might be offtopic, but he is no troll. Shame on you mods!
  - - Re:I agree! (Score:2)
      
      by drinkypoo ( 153816 ) writes:
      
      If "your" (sic) doing anything other than little kid stuff, use a real distro like RHEL. Fedora is unsupported and is permanently unstable because it's a testing ground... for RHEL.
    - - Re:I agree! (Score:2)
        
        by BobPaul ( 710574 ) * writes:
        
        Gentoo [secunia.com] has significantly more vulnerabilities than Fedora [secunia.com], even if you add up all the vulnerabilities for all 4 cores (not that those raw numbers really matter in the end as long as they all get patched)
        
        Well, first I'd like to reiterate what you already pointed out, that neither has unpatched vulnerabilities.
        
        Second, you're comparing EVERY release of Gentoo ever to Fedora Core 4.0. Notice how Fedora Core 4.0 doesn't have any vulnerabilities before Feb 2005? That's because it didn't exist much before then.
        
        Yo
- Re:Echoes of Redhat (Score:4, Insightful)
  
  by LnxAddct ( 679316 ) writes: <sgk25@drexel.edu> on Thursday November 10, 2005 @02:10PM (#13999548)
  
  Fedora overtook Suse within a year and a half in terms of users. It is now a close third to Debian, which is a distant second to Red Hat (Red Hat and Fedora together have around 3 times the market share of Debian; check Netcraft to confirm those numbers). The numbers on Distrowatch are not downloads or users; that number is how many people clicked on the link to read about Ubuntu. Mark Shuttleworth is obscenely good at getting press about Ubuntu, so the Ubuntu link gets a lot of click-throughs, and now that it is at the top, it is kind of self-fulfilling, as interested people want to read about the top distro so they click on that more.
  
  When it comes down to it, Fedora is the most advanced Linux distribution out there. It comes standard with SELinux and virtualization. It uses LVM by default and integrates exec-shield and other code-fortifying techniques into all major services. It has the latest and greatest of everything. Things just work in Fedora because a large portion of that code was written by Red Hat. Red Hat maintains GCC and glibc, they commit more kernel code than anyone else, and they play a large role in everything from Apache and GNOME to creating GCJ to get Java to run natively under Linux. Whether you like it or not, Fedora is the distro most professionals go with, despite what the popular Slashdot opinion is and despite the large amounts of noise that a few Ubuntu users create.
  
  Out of the big two, Novell and Red Hat, Novell has never been worse off and Red Hat has never been healthier. Red Hat doesn't officially provide support for Fedora, but it is built and paid for by Red Hat and their engineers (in addition to the community contributions). By targeting Fedora, IBM knows that they are targeting a stable platform with the largest array of hardware support. IBM is in bed with both Novell and Red Hat; they didn't choose Fedora because they were paid to or something... they chose Fedora based on technical merits. Claiming that Fedora is unstable is no different than claiming Gmail is in beta; both products are still the best in their respective industries. Why do people go spreading FUD about such a good product when they've never used it themselves? Whether you want to admit it or not, Fedora is the platform to target for most. It is compatible in large part with RHEL, so you're getting the most bang for your buck.
  
  IBM doesn't just shit around, or make decisions for dumb reasons. If Fedora is good enough for IBM, it is good enough for anyone. Apparently this is a common opinion, as more and more businesses switch to Fedora desktops. Here [computerworld.com.au] is one recent story of a major Australian company, Kennards, replacing 400 desktops with Fedora. Don't be so closed-minded or you might be left behind.
  Regards,
  Steve
  
- Re:Echoes of Redhat (Score:2)
  
  by dominator ( 61418 ) writes:
  
  Fedora's #4 ranking [distrowatch.com] on Distrowatch can hardly be called "marginal". Never mind that one should also question the site's "page hit ranking" methodology before passing it off as representative, much less authoritative.
- - - Re:Echoes of Redhat (Score:2)
      
      by CyricZ ( 887944 ) writes:
      
      Your ad hominem attack actually proves my point. Fedora is inferior to Ubuntu.
      
      You cannot prove Fedora better because it just plain is not. It is impossible to debate against the truth. Thus you must resort to ad hominem attacks, which instantly prove that I am the victor in this debate.
New & Improved (Score:2, Funny)

by Doc Ruby ( 173196 ) writes:

I dunno - telling people they have to upgrade their PC to run the SDK for a new PC architecture seems like a marketer's job.
Linux on PS3? (Score:2)

by deadline ( 14171 ) writes:

This is very interesting. The Cell has a very non-standard architecture, but it can be used in a very powerful way. The key is software, and thus an emulation SDK is really important for a new architecture. From an HPC (High Performance Computing) perspective, these chips could be very powerful.
The real question is whether the PS3 will have a Linux hard disk option like the PS2. If that is the case, it may be the cheapest way to get actual development hardware.
- Re:Linux on PS3? (Score:2, Interesting)
  
  by MaskedSlacker ( 911878 ) writes:
  
  Almost definitely. A cheap beowulf of PS3s.
Cell Hardware... (Score:4, Informative)

by GoatSucker ( 781403 ) writes: on Thursday November 10, 2005 @01:35PM (#13999133)

From the article:
How does one get a hold of a real CBE-based system now? It is not easy: Cell reference and other systems are not expected to ship in volume until spring 2006 at the earliest. In the meantime, one can contact the right people within IBM [ibm.com] to inquire about early access.

By the end of Q1 2006 (or thereabouts), we expect to see shipments of Mercury Computer Systems' Dual Cell-Based Blades [mc.com]; Toshiba's comprehensive Cell Reference Set development platform [toshiba.co.jp]; and of course the Sony PlayStation 3 [gamespot.com].

- Sony has poisoned (Score:2)
  
  by quarkscat ( 697644 ) writes:
  
  the "Cell" well, as far as I am concerned. They seem to be totally unremorseful regarding their music CD DRM (aka rootkit). At one point I considered the purchase of a PS3 in order to gain experience with the Cell Processor. Today, I would not consider the purchase of ANYTHING with Sony's name on it, regardless of how "geeky" it might be.
  
  Purchasing IBM's (or perhaps Mercury Computer's) reference CBE-based platform are now my only choices. Sony's NRE for the PS3 might make their platform a "best buy" pri
Re: (Score:2, Interesting)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
- Re:Rosetta to the rescue? (Score:2)
  
  by Synic ( 14430 ) writes:
  
  You ever think that Rosetta is more like Wine than it is like Virtual PC? I.e., it handles the upper API calls by translating them to native lower-level ones?
  - - Re:Rosetta to the rescue? (Score:2, Interesting)
      
      by Hal_Porter ( 817932 ) writes:
      
      Apple wrote a great 68K emulator for the PowerPC Macs. It was non-JIT, and worked like a big jump table. So you took a 16-bit 68K instruction, shifted it and jumped to the base of the table + the shifted offset. The code there would essentially be a PowerPC version of the 68K code.
      
      http://www.mactech.com/articles/mactech/Vol.10/10.09/Emulation/ [mactech.com]
      
      So you end up doing four instructions to decode the 68K instruction, and then whatever it takes to actually do the operation, typically 2-4.
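
The shift-and-jump dispatch described above can be sketched roughly as follows. This is only an illustration: the 16-bit encoding here (4-bit opcode, 4-bit register, 8-bit immediate) and the handler names are made up, not real 68K decoding, and a Python list of functions stands in for a table of native code stubs.

```python
# Toy table-driven interpreter: a sketch of dispatch-by-opcode.
# The instruction encoding is hypothetical, NOT the 68K encoding.

def op_nop(regs, reg, imm):
    pass                                      # unused opcodes do nothing

def op_load(regs, reg, imm):
    regs[reg] = imm                           # load immediate into register

def op_add(regs, reg, imm):
    regs[reg] = (regs[reg] + imm) & 0xFFFF    # add immediate, wrap at 16 bits

# One slot per possible 4-bit opcode; unassigned slots fall back to a no-op.
DISPATCH = [op_nop] * 16
DISPATCH[0x1] = op_load
DISPATCH[0x2] = op_add

def run(program, num_regs=4):
    regs = [0] * num_regs
    for word in program:
        opcode = (word >> 12) & 0xF           # shift the instruction to get the table index
        reg = (word >> 8) & 0xF
        imm = word & 0xFF
        DISPATCH[opcode](regs, reg, imm)      # "jump" through the table
    return regs

# r0 = 5; r0 += 3; r1 = 7
print(run([0x1005, 0x2003, 0x1107]))
```

There is no per-instruction search or chain of comparisons: the decode cost is a shift, a mask, and an indexed jump, which is why this style stays cheap without a JIT.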
      
      JIT emulators would profile
- Re:Rosetta to the rescue? (Score:2)
  
  by seebs ( 15766 ) writes:
  
  It's not the PPE cores that are slow to emulate. It's the 8 additional vector-only processors.
  
  This is a sim, not just an emulator. It's not just vaguely implementing the output; it is at least to some extent modeling the instruction pipelining, branch miss penalties, and so on.
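
To make the emulator/simulator distinction concrete, here is a minimal sketch of what "modeling branch miss penalties" means: besides tracking what each instruction does, a timing model charges cycles for how it would behave in the pipeline. The trace format and the cycle costs below are invented for illustration and are not taken from the Cell simulator.

```python
# Hypothetical costs for illustration only; these are not Cell timings.
BASE_CYCLES = 1
BRANCH_MISS_PENALTY = 18

def simulate(trace):
    """Count cycles for a trace of (kind, taken, predicted_taken) tuples."""
    cycles = 0
    for kind, taken, predicted in trace:
        cycles += BASE_CYCLES                  # every instruction costs at least one cycle
        if kind == "branch" and taken != predicted:
            cycles += BRANCH_MISS_PENALTY      # mispredicted branch flushes the pipeline
    return cycles

trace = [
    ("alu", False, False),
    ("branch", True, False),   # mispredicted
    ("branch", True, True),    # predicted correctly
]
print(simulate(trace))
```

A pure emulator would skip this bookkeeping entirely, which is a large part of why a cycle-aware simulator runs so much slower on the host.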
"cell" architecture is all about local memory (Score:5, Informative)

by Animats ( 122034 ) writes: on Thursday November 10, 2005 @02:25PM (#13999729) Homepage

The "cell" processors have fast access to local, unshared memory, and slow access to global memory. That's the defining property of the architecture. You have to design your "cell" program around that limitation. Most memory usage must be in local memory. Local memory is fast, but not large: 256KB per SPE.
The cell processors can do DMA to and from main memory while computing. As IBM puts it, "The most productive SPE memory-access model appears to be the one in which a list (such as a scatter-gather list) of DMA transfers is constructed in an SPE's local store so that the SPE's DMA controller can process the list asynchronously while the SPE operates on previously transferred data." So the cell processors basically have to be used as pipeline elements in a messaging system.
That's a tough design constraint. It's fine for low-interaction problems like cryptanalysis. It's OK for signal processing. It may or may not be good for rendering; the cell processors don't have enough memory to store a whole frame, or even a big chunk of one.
This is actually an old supercomputer design trick. In the supercomputer world, it was not too successful; look up the nCube and the BBN Butterfly, both of which were a bunch of non-shared-memory machines tied to a control CPU. But the problem was that those machines were intended for heavy number-crunching on big problems, and those problems didn't break up well.
The closest machine architecturally to the "cell" processor is the Sony PS2. The PS2 is basically a rather slow general purpose CPU and two fast vector units. Initial programmer reaction to the PS2 was quite negative, and early games weren't very good. It took about two years before people figured out how to program the beast effectively. It was worth it because there were enough PS2s in the world to justify the programming headaches.
The small memory per cell processor is going to be a big hassle for rendering. GPUs today let the pixel processors get at the frame buffer, dealing with the latency problem by having lots of pixel processors. The PS2 has a GS unit which owns the frame buffer and does the per-pixel updates. It looks like the cell architecture must do all frame buffer operations in the main CPU, which will bottleneck the graphics pipeline. For the "cell" scheme to succeed in graphics, there's going to have to be some kind of pixel-level GPU bolted on somewhere.
It's not really clear what the "cell" processors are for. They're fine for audio processing, but seem to be overkill for that alone. The memory limitations make them underpowered for rendering. And they're a pain to program for more general applications. Multicore shared-memory multiprocessors with good caching look like a better bet.
Read the cell architecture manual. [ibm.com]
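
The access pattern IBM describes (fetch the next chunk while computing on the one already in local store) can be sketched like this. It is a host-side illustration only: `dma_get` and `process` are hypothetical stand-ins, a background thread plays the role of the DMA controller, and none of this is the SDK's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # elements per "local store" buffer; tiny on purpose

def dma_get(main_memory, start, size):
    """Stand-in for an asynchronous DMA transfer into a local buffer."""
    return list(main_memory[start:start + size])

def process(buffer):
    """Stand-in for the SPE kernel: touches only local data."""
    return sum(x * x for x in buffer)

def sum_of_squares(main_memory, chunk=CHUNK):
    total = 0
    with ThreadPoolExecutor(max_workers=1) as dma:
        # Prime the pipeline with the first transfer.
        pending = dma.submit(dma_get, main_memory, 0, chunk)
        for start in range(0, len(main_memory), chunk):
            current = pending.result()         # wait for the buffer in flight
            nxt = start + chunk
            if nxt < len(main_memory):
                # Start the next transfer before computing on the current buffer.
                pending = dma.submit(dma_get, main_memory, nxt, chunk)
            total += process(current)
    return total

print(sum_of_squares(list(range(10))))
```

Only two chunk-sized buffers are ever live at once, which is the point: the working set is bounded by the local store, not by the size of the data in main memory.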

- Re:"cell" architecture is all about local memory (Score:2, Informative)
  
  by taracta ( 217357 ) writes:
  
  I think too much emphasis is being placed on "slow" access to system memory for the CELL processor when it is "slow" only relative to access to local memory of the SPUs. Please remember that system memory for the CELL is about 8 times faster than the memory in today's high-end PCs, with lower latency. XDR is by far the best memory type available; unfortunately nobody likes RAMBUS the company. So please, when you are speaking about access to system memory, keep in mind that the CELL processor has about the same
- Re:"cell" architecture is all about local memory (Score:2, Informative)
  
  by frostfreek ( 647009 ) writes:
  
  > It's not really clear...
  
  There was a Toshiba demo, showing 8 Cells; 6 used to decode forty-eight HDTV MPEG4 streams, simultaneously, 1 for scaling the results to display, and one left over. A spare, I guess?
  
  This reminds me of the Texas Instruments 320C80 processor; 1 RISC general purpose cpu, plus four DSP-oriented CPUs. Each had an on-chip memory chunk. 4KB. 256KB would be fantastic, after the experience of programming for the C80. 256KB will be plenty of memory to work on a tile of framebuffer.
  
  1.
- - Re:"cell" architecture is all about local memory (Score:2)
    
    by Naysayer ( 71120 ) writes:
    
    That 256k has to hold the program that the SPE is running, as well as all the data. For fast DMA, though, your data is probably double-buffered so divide your data space in half, and hey, you might want a little space for stack / other dynamic memory usage.
    
    Suppose your program is 48k and you use 32k of memory dynamically; that leaves 176k for data, which is double-buffered, which means the program can only process 88k of data at a time.
    
    But it sure can do it fast.
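
Working that budget through with the sizes assumed in this comment (256k local store, 48k of code, 32k of stack and other dynamic use, two buffers), the subtraction comes to 176k of data space, or 88k per buffer. The function name and defaults below are just for illustration.

```python
def local_store_budget_kb(local_store_kb=256, program_kb=48, dynamic_kb=32, buffers=2):
    """Return (total data space, space per buffer) in KB for one SPE."""
    data_kb = local_store_kb - program_kb - dynamic_kb   # what is left after code and stack/heap
    return data_kb, data_kb // buffers                   # split evenly across the DMA buffers

print(local_store_budget_kb())
```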
- - - The NVidia GPU in the PS3 (Score:3, Informative)
      
      by Animats ( 122034 ) writes:
      
      That's not what Sony is saying:
      SCEA press release:
      SONY COMPUTER ENTERTAINMENT INC. AND NVIDIA ANNOUNCE JOINT GPU DEVELOPMENT FOR SCEI'S NEXT-GENERATION COMPUTER ENTERTAINMENT SYSTEM [playstation.com].
      TOKYO and SANTA CLARA, CA
      DECEMBER 7, 2004
      "Sony Computer Entertainment Inc. (SCEI) and NVIDIA Corporation (Nasdaq: NVDA) today announced that the companies have been collaborating on bringing advanced graphics technology and computer entertainment technology to SCEI's highly anticipated next-generation computer enterta
why? (Score:2)

by PopCulture ( 536272 ) writes:

I'm very excited about this project, even spec'd out a new Dell to handle it. But before I can lay down the cash, I just wonder: why?

Why? Is the Cell processor expected to go anywhere past the PS3? There is obviously no OS port planned, and I have no access to the PS3 game SDK. I have read some pretty awesome posts regarding the technical details of Cell vs. x86 or Mac architectures, but none that would encourage me to download, install, and play around with this with the hope of ever making a buck.
- Re:why? (Score:2)
  
  by seebs ( 15766 ) writes:
  
  Blade servers have already been announced.
  
  I would buy one of these, and no, I don't plan to get a PS3.
