Supercomputing Hardware

FASTRA II Puts 13 GPUs In a Desktop Supercomputer

Posted by timothy
from the lucky-number dept.
An anonymous reader writes "Last year tomography researchers of the ASTRA group at the University of Antwerp developed a desktop supercomputer with four NVIDIA GeForce 9800 GX2 graphics cards. The performance of the FASTRA GPGPU system was amazing; it was slightly faster than the university's 512-core supercomputer and cost less than 4000EUR. Today the researchers announce FASTRA II, a new 6000EUR GPGPU computing beast with six dual-GPU NVIDIA GeForce GTX 295 graphics cards and one GeForce GTX 275. The development of the new system was more complicated and there are still some stability issues, but tests reveal the 13 GPUs deliver 3.75x more performance than the old system. For the tomography reconstruction calculations these researchers need to do, the compact FASTRA II is four times faster than the university's supercomputer cluster, while being roughly 300 times more energy efficient."
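A rough back-of-the-envelope check on the summary's numbers, as a sketch only: the GPU counts are derived from the card lists above, and the old system's cost is taken as exactly 4000 EUR even though the summary says "less than", so the performance-per-euro figure is an upper bound. The function and variable names are illustrative.

```python
# Figures quoted in the summary; GPU counts follow from the card lists.
old_gpus = 4 * 2        # FASTRA: four dual-GPU GeForce 9800 GX2 cards
new_gpus = 6 * 2 + 1    # FASTRA II: six dual-GPU GTX 295 cards plus one GTX 275
speedup = 3.75          # FASTRA II vs. FASTRA, as reported
old_cost_eur = 4000     # assumption: "less than 4000EUR" treated as 4000
new_cost_eur = 6000


def per_gpu_gain(speedup, old_gpus, new_gpus):
    """Speedup left over after dividing out the increase in GPU count."""
    return speedup / (new_gpus / old_gpus)


def perf_per_euro_gain(speedup, old_cost, new_cost):
    """Speedup left over after dividing out the increase in price."""
    return speedup / (new_cost / old_cost)


print(round(per_gpu_gain(speedup, old_gpus, new_gpus), 2))                # 2.31
print(round(perf_per_euro_gain(speedup, old_cost_eur, new_cost_eur), 2))  # 2.5
```

Under these assumptions, most of the 3.75x comes from faster GPUs rather than from simply having more of them.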

  • by Hatta (162192) on Wednesday December 16, 2009 @07:22PM (#30466302) Journal

    Oh, I read that wrong, it's 7 graphics cards. Who makes such a motherboard?

  • by Ziekheid (1427027) on Wednesday December 16, 2009 @07:43PM (#30466560)
    I'll admit that, and thanks for the info, though you'd think this was crucial information for the summary too. Putting everything in perspective, it will only outperform the cluster on specific calculations, so overall it's not faster, right?
  • I've got a pair of 9800gx2 in my rig. The cards turn room temperature air into ~46C air. Without proper ventilation, these things will turn a chassis into an easy bake oven.

    For those not familiar with the 9800gx2 cards, it essentially is two 8800gts video cards linked together to act as a single card - something called SLI on the NVidia side of marketing. SLI typically required a mainboard/chipset that would allow you to plug in two cards and link them together. This model allowed any mainboard to have two 'internal' cards linked together, with the option of linking another 9800gx2 if your board actually supported SLI.

    The pictures did not show any SLI bridge, so it looks like they are just taking advantage of multiple GPUs per card.

  • by raftpeople (844215) on Wednesday December 16, 2009 @07:55PM (#30466688)
    It's all a continuum and depends on the problem. For problems with enough parallelism that the GPUs are a good choice, they are faster. For a completely serial problem, the current fastest single core is faster than both the supercomputer and the GPUs.
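One common way to put numbers on that continuum is Amdahl's law, which the comment does not name; the sketch below is an illustration under that assumption. The function name `amdahl_speedup` and the sample fractions are made up for the example, and the model ignores differences in per-core speed, memory bandwidth and transfer overhead.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Ideal speedup when only `parallel_fraction` of the work can be
    spread over `workers` processors (Amdahl's law)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical workloads on 13 workers (one per GPU, as a crude stand-in).
for fraction in (0.0, 0.5, 0.95, 1.0):
    print(fraction, round(amdahl_speedup(fraction, 13), 3))
```

With a parallel fraction of 0.0 the extra workers give no speedup at all, which is the "completely serial" end of the continuum; in this model the full 13x only appears when the fraction reaches 1.0.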
  • Re:GPU accuracy (Score:4, Interesting)

    by Beardo the Bearded (321478) on Wednesday December 16, 2009 @08:00PM (#30466752)

    First, a gaming card is going to get fast firmware. A workstation card is going to get accurate firmware. I imagine that supercomputer cards would get specialized firmware. (I only skimmed the summary.)

    GPUs are excellent at solving certain types of problems and excel at solving matrices. (That's what your video card is doing while it's rendering.) The best part of that is that most, if not all, mathematical problems can be expressed as a matrix, meaning that your super-fast GPU can solve most math problems super-fast.

    Next, GPUs love working together since they don't care about what the OS is doing. All they do is take raw data and respond with an answer. Usually we're putting that answer onto the display, since otherwise wtf are we doing with a GPU? In this case, the results are returned instead of using the flashy display. So what you end up with is a set of really fast, specialized, parallel engines solving broken down matrices.

    They're also not subject to the marketing whims of Moore's Law, so you can often get faster cards sooner than faster CPUs. To break down a supercomputer so that you get this kind of performance for 4000 EURO is a fantastic achievement. It's almost, but not quite, hobby range. (I'd still put money on someone trying to evolve this into a gaming rig...)
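To make the "rendering is matrix math" point concrete, here is a minimal sketch in plain Python (not GPU code): the same 2x2 rotation matrix is applied to every vertex, and no vertex depends on any other, which is the kind of independence that lets a GPU hand each one to a separate thread. The `rotate` helper and the vertex list are illustrative only.

```python
import math


def rotate(vertex, angle):
    """Multiply one 2D vertex by a rotation matrix for `angle` radians."""
    x, y = vertex
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)


vertices = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Each call touches only its own vertex, so the calls could run in any
# order or all at once; here they simply run one after another on the CPU.
rotated = [rotate(v, math.pi / 2) for v in vertices]
print(rotated)
```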

  • by CityZen (464761) on Wednesday December 16, 2009 @09:49PM (#30467826) Homepage

    Apparently, the regular BIOS can't boot with more than 5? graphics cards installed due to the amount of resources (memory & I/O space) that each one requires. So the researchers asked ASUS to make a special BIOS for them which doesn't set up the graphics card resources. However, the BIOS still needs to initialize at least one video card, so they agreed that the boot video card would be the one with only a single GPU. Presumably, they could have also chosen a dual GPU card that happened to be different from the others in some way.

  • Re:GPU accuracy (Score:3, Interesting)

    by hairyfeet (841228) <bassbeast1968.gmail@com> on Thursday December 17, 2009 @12:40AM (#30469142) Journal
    Question: Since you seem to be pretty knowledgeable on the subject, have you or any of your colleagues used or tried the AMD Stream SDK [gpgpu.org]? Those ATi 5870s look pretty scary as far as raw power goes, and the AMD SDK supports OpenCL on both the CPU and GPU. AMD has also opened up their code and supports both Windows and Linux, 32- and 64-bit, so I was just curious whether you or anyone else here has tried it.
  • Re:GPU accuracy (Score:2, Interesting)

    by kpesler (982707) on Thursday December 17, 2009 @02:39PM (#30476618)
    I have not tried it for two reasons. First, to my knowledge there are no large public machines in the US being planned using AMD GPUs, so there is relatively little incentive to port the code to OpenCL. We run on large clusters and it appears for the moment that NVIDIA has the HPC cluster market tied up. Second, while OpenCL is quite similar to CUDA in many respects, it's also significantly less convenient from a coding perspective. NVIDIA added a few language extensions that make launching kernels nearly as simple as a function call. As a pure C library, OpenCL requires much more setup code for each kernel invocation. If there were a strong incentive, such as the construction of a large NSF or DOE machine with AMD GPUs, I'd probably port it anyway, but without such a machine, it's not worth the time and effort. It's important to note that on GPUs, peak performance data often doesn't translate into actual performance numbers. The 4870 had a higher peak floating point rate than the G200, but in graphics and some other benchmarks, the G200 usually came out ahead. I don't know if this will also be the case with Fermi vs. 5870s. Finally, another large consideration is that AMD is pretty far behind on the software end. Besides mature compilers for both CUDA and OpenCL, NVIDIA provides profilers and debuggers that can debug GPU execution in hardware, and there is a growing ecosystem of CUDA libraries. For the sake of competition, I hope AMD adoption grows, but I've gotten the impression they are just not investing that much in general-purpose GPU computing.
