Supercomputing Hardware

FASTRA II Puts 13 GPUs In a Desktop Supercomputer

An anonymous reader writes "Last year, tomography researchers from the ASTRA group at the University of Antwerp developed a desktop supercomputer with four NVIDIA GeForce 9800 GX2 graphics cards. The performance of the FASTRA GPGPU system was amazing; it was slightly faster than the university's 512-core supercomputer and cost less than 4000EUR. Today the researchers announce FASTRA II, a new 6000EUR GPGPU computing beast with six dual-GPU NVIDIA GeForce GTX 295 graphics cards and one GeForce GTX 275. Development of the new system was more complicated and there are still some stability issues, but tests reveal that the 13 GPUs deliver 3.75x more performance than the old system. For the tomography reconstruction calculations these researchers need to do, the compact FASTRA II is four times faster than the university's supercomputer cluster, while being roughly 300 times more energy efficient."
  • by Chirs ( 87576 ) on Wednesday December 16, 2009 @07:26PM (#30466356)

    Um...read the article?

    The motherboard is an ASUS P6T7 WS Supercomputer.

  • by jandrese ( 485 ) <kensama@vt.edu> on Wednesday December 16, 2009 @07:37PM (#30466500) Homepage Journal
    If you read the article, it tells you that the supercomputer has 256 Opteron 250s (2.4GHz) and was built 3 years ago. If you have a parallelizable problem that can be solved with CUDA, you can get absolutely incredible performance out of off-the-shelf GPUs these days.
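
    A minimal sketch of the kind of data-parallel CUDA kernel being described above, assuming the standard CUDA runtime API; the SAXPY operation, sizes, and launch configuration are illustrative and not taken from the article:

        #include <cuda_runtime.h>

        // Each thread computes one element of y[i] = a*x[i] + y[i]; this
        // one-thread-per-element structure is why such problems map so well
        // onto the hundreds of cores in a commodity GPU.
        __global__ void saxpy(int n, float a, const float *x, float *y)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i < n)
                y[i] = a * x[i] + y[i];
        }

        int main()
        {
            const int n = 1 << 20;
            float *x, *y;
            cudaMalloc((void **)&x, n * sizeof(float));
            cudaMalloc((void **)&y, n * sizeof(float));
            // (fill x and y with cudaMemcpy from host buffers here)

            saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);  // one thread per element
            cudaDeviceSynchronize();

            cudaFree(x);
            cudaFree(y);
            return 0;
        }
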
  • Re:GPU accuracy (Score:5, Informative)

    by kpesler ( 982707 ) on Wednesday December 16, 2009 @07:48PM (#30466606)
    Presently, the GT200 GPUs in this machine support double precision, but at about 1/8 the peak rate of single precision. In practice, since most codes tend to be bandwidth-limited and pointer arithmetic is the same for single and double precision, double-precision performance is usually closer to 1/2 that of single precision, but not always. With the Fermi GPUs to be released early next year, double-precision peak FLOPS will be 1/2 of the single-precision peak, just like on present x86 processors. Also note that many scientific research groups, such as my own, have found that, contrary to dogma, single precision is good enough for most of the computation, and that a judicious mix of single- and double-precision arithmetic gives high performance with sufficient accuracy. This is true for some, but not all, computational methods.
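
    A rough sketch of the "judicious mix" idea: keep the bulk data in single precision to save bandwidth and memory, and do only the rounding-sensitive accumulation in double. This is a generic mixed-precision dot product, not the poster's code; the kernel name and the fixed 256-thread block size are assumptions:

        #include <cuda_runtime.h>

        // Inputs stay in single precision; accumulation is done in double.
        // One partial sum per block is written out; the host finishes the sum.
        // Requires double-precision support (compute capability 1.3+, e.g. the
        // GT200 parts discussed above). Assumes blockDim.x == 256.
        __global__ void dot_mixed(int n, const float *x, const float *y,
                                  double *block_sums)
        {
            __shared__ double partial[256];
            int tid = threadIdx.x;
            double sum = 0.0;

            // Grid-stride loop: each thread accumulates its share in double.
            for (int i = blockIdx.x * blockDim.x + tid; i < n;
                 i += blockDim.x * gridDim.x)
                sum += (double)x[i] * (double)y[i];

            partial[tid] = sum;
            __syncthreads();

            // Tree reduction within the block, still in double precision.
            for (int s = blockDim.x / 2; s > 0; s >>= 1) {
                if (tid < s)
                    partial[tid] += partial[tid + s];
                __syncthreads();
            }

            if (tid == 0)
                block_sums[blockIdx.x] = partial[0];
        }
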
  • by jstults ( 1406161 ) on Wednesday December 16, 2009 @08:08PM (#30466846) Homepage

    you can get absolutely incredible performance out of off-the-shelf GPUs these days.

    I had heard this from folks, but didn't really buy it until I read this paper [nasa.gov] today. They get a speed-up (wall clock) using the GPU even though they have to go to a worse algorithm (Jacobi instead of SSOR). Pretty amazing.
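
    To make the Jacobi-vs-SSOR point concrete: every point of a Jacobi update depends only on the previous iterate, so all grid points can be computed by independent threads, whereas an SSOR sweep uses values already updated within the same sweep and therefore serializes. A hypothetical 5-point Laplace smoother, not taken from the linked paper:

        // One Jacobi relaxation sweep on an nx-by-ny grid: each thread writes one
        // interior point of the new iterate and reads only the old iterate, so
        // there are no dependencies between threads within a sweep.
        __global__ void jacobi_step(int nx, int ny, const float *u_old, float *u_new)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            int j = blockIdx.y * blockDim.y + threadIdx.y;

            if (i > 0 && i < nx - 1 && j > 0 && j < ny - 1) {
                int idx = j * nx + i;
                u_new[idx] = 0.25f * (u_old[idx - 1] + u_old[idx + 1] +
                                      u_old[idx - nx] + u_old[idx + nx]);
            }
        }
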

  • Re:Silly (Score:3, Informative)

    by modemboy ( 233342 ) on Wednesday December 16, 2009 @08:22PM (#30466998)

    The difference between GeForce and Quadro cards is almost always purely driver-based; it is the exact same hardware with different software.
    This is basically a roll-your-own Tesla, and considering that Teslas connect to the host system via an 8x or 16x PCI-e add-in card, I'm gonna say you are wrong when it comes to the bandwidth issue as well...

  • by Sycraft-fu ( 314770 ) on Wednesday December 16, 2009 @09:08PM (#30467434)

    Because it only applies to the kind of problems that CUDA is good at solving. While there are plenty of those, there are plenty that it isn't good for. Take a problem that is all 64-bit integer math and has a branch every couple of hundred instructions, and GPUs will do for crap on it. However, a supercomputer with general-purpose CPUs will do about as well on it as on basically anything else.

    That's why I find these comparisons stupid. "Oh, this is so much faster than our supercomputer!" No it isn't. It is so much faster for some things. Now if you are doing those things, wonderful, please use GPUs. However, don't then try to pretend you have a "supercomputer in a desktop." You don't. You have a specialized computer with a bunch of single-precision stream processors. That's great so long as your problem is 32-bit fp, highly parallel, doesn't branch much, and fits within the memory on a GPU. However, not all problems are like that, so GPUs are NOT a general replacement for a supercomputer.
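
    A caricature, in code, of the workload described above: 64-bit integer arithmetic with data-dependent branches. Threads in the same 32-thread warp that take different sides of a branch are serialized, and on 2009-era GPUs 64-bit integer operations are themselves emulated with multiple 32-bit instructions. The kernel below is purely illustrative:

        // Data-dependent branching on 64-bit integers: a poor fit for a GPU,
        // since divergent threads within a warp execute one side of the branch
        // at a time while the others idle.
        __global__ void branchy_int64(int n, const long long *in, long long *out)
        {
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            if (i >= n)
                return;

            long long v = in[i];
            if (v & 1)                        // warp divergence happens here
                v = v * 2654435761LL + 17;
            else
                v = (v >> 3) ^ (v << 7);
            out[i] = v;
        }
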

  • by cheesybagel ( 670288 ) on Wednesday December 16, 2009 @09:15PM (#30467514)

    Really? Care to share any results that support that? I'm quite sure the peak flops you can achieve on the GPU are much higher than the limited SIMD capability of the CPU.

    IIRC they claim 2.5-3x more performance using a Tesla than using the CPUs in their workstation, ignoring load time.

    SSE enables a theoretical peak performance enhancement of 4x for SIMD-amenable codes (e.g. you can do 4 parallel adds using vector SSE in the time it takes to do 1 add using scalar SSE). In practice, however, you usually get around 3x more performance (see the sketch after this comment).

    Theoretical SIMD performance for the GPU looks fine and nice on paper, but in practice the small caches in current GPUs limit performance. CPUs also often have out-of-order execution support and other hardware that is too expensive, in terms of transistors, to implement in a GPU.

    IMO the main problem here is that the programming model for the CPU is too complex, since you need to combine several different ways of expressing parallelism (SIMD/multicore/cluster) to get top performance.
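
    A host-side sketch of the 4-wide SSE claim above: the vector version processes four packed floats per instruction where the scalar version processes one. Plain C with SSE intrinsics, compilable alongside the CUDA examples; the function names are made up for illustration:

        #include <xmmintrin.h>

        // Scalar: one float add per loop iteration.
        void add_scalar(int n, const float *a, const float *b, float *c)
        {
            for (int i = 0; i < n; ++i)
                c[i] = a[i] + b[i];
        }

        // SSE: four float adds per _mm_add_ps, plus a scalar tail for leftovers.
        void add_sse(int n, const float *a, const float *b, float *c)
        {
            int i = 0;
            for (; i + 4 <= n; i += 4) {
                __m128 va = _mm_loadu_ps(a + i);
                __m128 vb = _mm_loadu_ps(b + i);
                _mm_storeu_ps(c + i, _mm_add_ps(va, vb));
            }
            for (; i < n; ++i)
                c[i] = a[i] + b[i];
        }
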

  • Re:Silly (Score:3, Informative)

    by jpmorgan ( 517966 ) on Wednesday December 16, 2009 @09:16PM (#30467536) Homepage

    The hardware is the same, but the quality control is different. Teslas and Quadros are held to rigorous standards; GeForces only have to meet an error rate that is acceptable for gaming, which falls flat in scientific computing.
