FASTRA II Puts 13 GPUs In a Desktop Supercomputer
An anonymous reader writes "Last year, tomography researchers in the ASTRA group at the University of Antwerp developed a desktop supercomputer with four NVIDIA GeForce 9800 GX2 graphics cards. The performance of the FASTRA GPGPU system was amazing: it was slightly faster than the university's 512-core supercomputer and cost less than 4,000 EUR. Today the researchers announced FASTRA II, a new 6,000 EUR GPGPU computing beast with six dual-GPU NVIDIA GeForce GTX 295 graphics cards and one GeForce GTX 275. Development of the new system was more complicated, and there are still some stability issues, but tests reveal that the 13 GPUs deliver 3.75x the performance of the old system. For the tomography reconstruction calculations these researchers need to run, the compact FASTRA II is four times faster than the university's supercomputer cluster, while being roughly 300 times more energy efficient."
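For readers wondering what these calculations actually look like: tomographic reconstruction builds an image from projections taken at many angles. Below is a minimal NumPy sketch of unfiltered backprojection, the core operation; the array shapes and the nearest-bin lookup are simplifications of my own, and production code like the researchers' would filter the projections and run this per-GPU in CUDA:

```python
import numpy as np

def backproject(sinogram, angles, size):
    """Unfiltered backprojection: smear each projection back across
    the image plane along its acquisition angle and sum."""
    recon = np.zeros((size, size))
    # Pixel-center coordinates with the origin in the middle of the image.
    coords = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(coords, coords)
    for proj, theta in zip(sinogram, angles):
        # Detector coordinate of every pixel for this projection angle.
        t = xx * np.cos(theta) + yy * np.sin(theta)
        # Nearest-detector-bin lookup (real code interpolates linearly).
        idx = np.clip(np.round(t + (size - 1) / 2.0).astype(int), 0, size - 1)
        recon += proj[idx]
    return recon / len(angles)
```

A sanity check: backprojecting the sinogram of a single point source reconstructs a peak at that point, since every smeared projection line passes through it.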
Re:Silly (Score:2, Insightful)
It's not silly: (1) this is a research project, not production medical equipment, so the funds to buy Tesla cards were probably not available, and they aren't particularly worried about occasional bit errors. (2) Their particular application needs little or no inter-GPU communication, so that bandwidth is not an issue. They just need each GPU to load datasets, chew on them, and spit out the results.
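The load/chew/spit-out pattern described above can be sketched with a thread pool standing in for the 13 GPUs; the `reconstruct_chunk` workload is a made-up stand-in, and real code would dispatch one CUDA context per device:

```python
from concurrent.futures import ThreadPoolExecutor

def reconstruct_chunk(dataset):
    # Stand-in for one GPU's work: a pure function of its own input,
    # with no communication to any other worker.
    return sum(x * x for x in dataset)

def run_farm(datasets, workers=13):
    # Each dataset goes to exactly one worker and results come back
    # independently, which is why inter-GPU bandwidth doesn't matter here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct_chunk, datasets))
```

Because the workers never exchange data, throughput scales with the number of devices until the host runs out of PCIe lanes to feed them.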
How much does your proposed GPU supercomputer cost for 13 GPUs?
Re:That's why I have a problem with the comparison (Score:5, Insightful)
That was always true of supercomputers. In fact the stuff that runs well on CUDA now is almost precisely the same stuff that ran well on Cray vector machines - the classic stereotype of "Supercomputer"! Thus I do not see your point. The best computer for any particular task will always be one specialized for that task, and thus compromised for other tasks.
BTW, newer GPUs support double precision [herikstad.net].
Re:That's why I have a problem with the comparison (Score:1, Insightful)
E X A C T L Y ! ! ! I always read about how fast the Cell Broadband Engine(tm) is and how anyone is a FOOL for not using it. No. It's terrible at branches: there's no branch prediction, memory access is limited to a fast but very small local store, and branch-heavy code performs awfully unless you massively rewrite it to avoid the branches. For embarrassingly parallel problems it's a dream; for problems that aren't parallel, it's quite slow. An old supercomputer isn't as fast as a new one.

If ordinary processors, especially multi-core ones, had two or four stream processors per core, parallel operations would be much faster too, and that's likely one of the improvements Intel and AMD (and others) are looking at. It would make general-purpose processors much more like the Cell, and would make the Cell somewhat obsolete. The Cell also suffers from only handling problems that fit in 256 MB of memory (it uses proprietary memory that is very fast but only available up to 256 MB; no one else makes this kind of memory, and the chips don't come in sizes bigger than what amounts to 256 MB). GPUs are limited by memory size too (although 1 GB is bigger than 256 MB), and a GPU still has all the problems of a specialty processor. If you can use it, great. I can't get any performance boost out of them, because my programs are full of unpredictable branches, and I get better performance from a general-purpose CPU.
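The "massive rewrite" the parent alludes to usually means replacing divergent branches with predication: compute both sides for every element, then select with a mask. A rough NumPy illustration (function names and the toy arithmetic are made up):

```python
import numpy as np

def branchy(xs):
    # Branch-per-element style: fine on a CPU with branch prediction,
    # poison for a SIMD/stream processor whose lanes run in lockstep.
    out = []
    for x in xs:
        if x < 0:
            out.append(-x * 2)
        else:
            out.append(x + 1)
    return out

def predicated(xs):
    # Stream-processor style: evaluate both arms for every element,
    # then select with a mask -- no divergent control flow at all.
    xs = np.asarray(xs)
    return np.where(xs < 0, -xs * 2, xs + 1)
```

Both functions compute the same result; the predicated form wastes arithmetic on the untaken arm, which is exactly the trade a stream processor wants and a branch-heavy general-purpose workload does not.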