IBM Releases Cell SDK
derek_farn writes "IBM has released an SDK running under Fedora Core 4 for the Cell Broadband Engine (CBE) Processor. The software includes many GNU tools, but the underlying compiler does not appear to be GNU-based. For those keen to start running programs before they get their hands on actual hardware, a full system simulator is available. The minimum system requirement specification has obviously not been written by the marketing department: 'Processor - x86 or x86-64; anything under 2GHz or so will be slow to the point of being unusable.'"
Source for actual chips? (Score:4, Interesting)
What about a PPC SDK and simulator? (Score:5, Interesting)
It'd be nice if IBM released a PPC SDK for Fedora; it would have the potential to run much faster than an x86 SDK and simulator.
Re:Wikipedia article question (Score:2, Interesting)
You know, I'm looking back at all these replies to the poor guy, and I can't help but think that he's sitting in front of his computer wondering, "Can't anyone explain it in ENGLISH?!?"
For instance, you have to unroll your "for" loops to start, since those SIMD co-processors can't do loops.
Actually, we need a new programming model. Instead of using FOR loops, we need a model under which you can say, "Perform these instructions X number of times." One could probably do a bit of guesswork in the compiler based on loops like "for(i=0;i<COUNT;i++)", but that doesn't help in cases where the loop uses a more complex conditional statement (or where the test is affected by the loop itself). Thus the language needs to be changed to force the programmer to pre-compute the loop length for maximum performance. For example:
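One way to sketch the "pre-compute the loop length" idea in plain C, without any new language feature: work out the trip count before the loop starts, then run a loop whose only test is that count. The names trip_count and sum_strided are made up for this illustration, and the sketch assumes a positive step.

```c
#include <stddef.h>

/* Hypothetical helper: how many times would
 *     for (i = start; i < end; i += step)
 * run? Computed up front. Assumes step > 0. */
size_t trip_count(size_t start, size_t end, size_t step)
{
    if (start >= end)
        return 0;
    return (end - start + step - 1) / step;   /* ceiling division */
}

/* Sum every step-th element of data in [start, end).
 * The loop condition depends only on the pre-computed count n,
 * so the number of iterations is known before the first pass. */
long sum_strided(const int *data, size_t start, size_t end, size_t step)
{
    long total = 0;
    size_t i = start;
    for (size_t n = trip_count(start, end, step); n > 0; n--) {
        total += data[i];
        i += step;
    }
    return total;
}
```

With the count known up front, a compiler (or a programmer) is free to unroll by any factor that divides it and handle the remainder separately.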
Re:Since the submitter didn't bother to explain... (Score:0, Interesting)
1) Apple tries to lowball IBM on the mobile 970 design
2) IBM gives Apple the finger - they account for less than five percent of IBM's chip volume
3) Steve goes out on stage and pretends like he has made the 'choice' to move to Intel
4) With Cell processors in Macs no longer an option for Apple, the sour grapes meme that the idiot above parroted starts to make its rounds in Mac circles.
5) Intel's processor roadmap fiasco continues, but what is funny is how Intel's roadmap for future chips years down the road has chip designs that look very close to STI's Cell chips that are being made today.
Enjoy your h.264 encoding times on those wonderful Intel SSE chips Mac crazies!
Re:Linux on PS3? (Score:2, Interesting)
Re:Wikipedia article question (Score:3, Interesting)
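shr ecx,2            ; ecx = element count (assumed) / 4 = number of passes
sum_loop:            ; renamed: "loop" is an x86 instruction mnemonic
add eax,[ebx]        ; add four consecutive dwords to the running sum
add eax,[ebx+4]
add eax,[ebx+8]
add eax,[ebx+12]
add ebx,16           ; advance the pointer by four dwords
dec ecx
jnz sum_loop         ; repeat until the pass counter reaches zero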
With SIMD instructions, you can execute all four of those adds in one instruction. I wish I knew SSE a bit better, then I could rewrite the above. Sadly, I haven't gotten around to learning the precise syntax.
However, there's a fairly good (if not a bit dated) explanation of SIMD here [mackido.com].
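For what it's worth, here is a rough sketch of how the unrolled loop above might look with SSE2, written with C intrinsics rather than raw assembly. The function name sum_sse2 is made up, and like the assembly it assumes the element count is a multiple of four. One difference: instead of four adds into a single register, it keeps four partial sums in one 128-bit register and combines them at the end.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Sum n 32-bit integers, four per iteration.
 * Assumes n is a multiple of 4, matching the "shr ecx,2" loop. */
int32_t sum_sse2(const int32_t *data, size_t n)
{
    __m128i acc = _mm_setzero_si128();          /* four partial sums, all 0 */
    for (size_t i = 0; i < n; i += 4) {
        /* Unaligned load of four consecutive 32-bit values. */
        __m128i v = _mm_loadu_si128((const __m128i *)(data + i));
        acc = _mm_add_epi32(acc, v);            /* four adds in one instruction */
    }
    /* Spill the four lanes and add them up with ordinary scalar code. */
    int32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

The unaligned load keeps the sketch simple; aligned data would allow _mm_load_si128 instead.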
Re:Echoes of Redhat (Score:1, Interesting)
Must be a case of 'brand leakage' from a distant past, one that held Redhat as the most popular desktop Linux distribution.
uh, or maybe... 1) it's because IBM has a partnership with RedHat, 2) Fedora runs on PPC (which CBE is based on) so i'm sure it's easy for them to modify, 3) there's a good chance this was developed using FC4, so it's just easy to release it for FC4
Re:Rosetta to the rescue? (Score:2, Interesting)
http://www.mactech.com/articles/mactech/Vol.10/10
So you end up doing four instructions to decode the 68K instruction, and then whatever it takes to actually do the operation, typically 2-4.
JIT emulators would profile the code and check which bits were frequently executed. Then they would essentially copy the table entries into a buffer. So in a loop, you'd just execute the 2-4 native instructions and skip the table dispatch.
There's another benefit too: you can skip things like condition code updates if you know that they will be overwritten by another instruction before they are checked. Plus you can do peephole optimisations, constant folding and so on.
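A toy illustration of the difference between dispatching through the table every time and doing the lookup once. This is only a sketch: the guest machine is a made-up one-accumulator CPU rather than the 68K, every name in it is hypothetical, and it caches handler pointers instead of copying real native code into a buffer the way an actual JIT would.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy guest CPU with one accumulator; a stand-in, not the 68K. */
typedef struct { int32_t acc; } Cpu;

typedef void (*Handler)(Cpu *cpu, int32_t operand);

void op_add(Cpu *cpu, int32_t operand) { cpu->acc += operand; }
void op_mul(Cpu *cpu, int32_t operand) { cpu->acc *= operand; }

enum { OP_ADD, OP_MUL, OP_COUNT };

/* One table entry per guest opcode. */
const Handler dispatch_table[OP_COUNT] = { op_add, op_mul };

typedef struct { uint8_t opcode; int32_t operand; } GuestInsn;
typedef struct { Handler fn; int32_t operand; } TranslatedInsn;

/* Interpreter: decode and look up the handler every time an
 * instruction executes, even on the thousandth pass of a loop. */
void interpret(Cpu *cpu, const GuestInsn *code, size_t n, int passes)
{
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            dispatch_table[code[i].opcode](cpu, code[i].operand);
}

/* Translation: do the table lookup once per instruction and
 * store the result in a buffer. */
void translate(const GuestInsn *code, size_t n, TranslatedInsn *out)
{
    for (size_t i = 0; i < n; i++) {
        out[i].fn = dispatch_table[code[i].opcode];
        out[i].operand = code[i].operand;
    }
}

/* Run the translated block: no decode, no table lookup. */
void run_translated(Cpu *cpu, const TranslatedInsn *block, size_t n, int passes)
{
    for (int p = 0; p < passes; p++)
        for (size_t i = 0; i < n; i++)
            block[i].fn(cpu, block[i].operand);
}
```

Both paths must leave the guest CPU in the same state; the translated one just does less work per pass. A real translator would also get the chance to drop dead condition-code updates while filling the buffer.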
There's a wonderful article here -
http://www.gtoal.com/sbt/ [gtoal.com]
I can easily believe that CPU-intensive code like image processing can run at a very impressive speed, especially as top of the range x86 chips have better SPECint performance than a top of the range PPC.
Incidentally, I read about Apple's second generation 68K emulator being a "dynamic recompiler", so they've been working on this sort of thing for ages.
Re:Wikipedia article question (Score:2, Interesting)
I now think you were using a simile or making an analogy to argue that compilers can benefit from careful construction of loops in the source code.
If so, then of course I agree with you.
Saying this in a much more general way: careful choice of syntax can make the semantics more clear to the compiler.
A high level language with "dotimes (count) { action }" syntax lets the compiler make good choices about loop unrolling and the counter's type.
A language where you have to test and modify your own counter lets the writer make good or incredibly awful choices about loop unrolling and the counter's type.
This version: is semantic brain-damage on a system with very slow very IEEE doubles, and loop-unrolling this naively is not going to help.
A compiler which realizes that this is a loop whose length is constant can unroll the loop fully, partially, or simply use a better/faster iterator like an integer. But should we end up with 0x400 or 0x800?
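To make the point about the counter's type concrete, here is a small C sketch. The function names are invented for the illustration and are not from the thread. Both loops look like they should run ten times, but the one driven by a double typically runs eleven times with IEEE 754 doubles, because ten additions of 0.1 land just below 1.0.

```c
/* Loop driven by a floating-point counter. 0.1 is not exactly
 * representable in binary, so the running value drifts. */
int count_with_double_counter(void)
{
    int iterations = 0;
    for (double x = 0.0; x < 1.0; x += 0.1)
        iterations++;
    return iterations;
}

/* The same intent with an integer counter: the trip count is
 * exact and easy for a compiler to see and unroll. */
int count_with_int_counter(void)
{
    int iterations = 0;
    for (int i = 0; i < 10; i++)
        iterations++;
    return iterations;
}
```

A compiler can replace the second loop with a constant; the first one forces it to reason about floating-point rounding before it can do anything clever.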
Haha, now throw side effects at your smart compiler by inserting a debugging statement into the while loop.
Anyway, I think we're not really disagreeing. You can write loops stupidly, whether they're iterative (as above) or whether they're recursive. A compiler probably can't save you if you are particularly stupid. It might even make things worse.
For what it's worth, when I say your sentence to myself, I want to make the "like" bold, I guess to emphasize the simile.