AMD Microsoft Programming Hardware

Microsoft Demos C++ AMP At AMD Developers Summit

MojoKid writes "The second day of the AMD Fusion Developer Summit began with a keynote from Herb Sutter, Microsoft's Principal Architect for Native Languages and resident C++ guru. The gist of Herb's talk centered on heterogeneous computing and the changes coming in future versions of Visual Studio and C++. One of the main highlights was a demo of a C++ AMP application that seamlessly took advantage of all the compute resources in a range of demo systems, from workstations to netbooks. The physics demo switched between CPU, integrated GPU, and discrete GPU resources, showcasing the performance of each. As additional bodies are added, the workload ramps up to over 600 GFLOPS of compute performance."
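For readers who haven't seen C++ AMP, here is a minimal sketch of what such a program looks like, based on the publicly documented parts of the API (amp.h, the concurrency namespace, array_view, parallel_for_each, and restrict(amp)). The vector-add kernel is a hypothetical stand-in; the code for the physics demo itself has not been published.

    #include <amp.h>
    #include <iostream>
    #include <vector>
    using namespace concurrency;

    int main() {
        // Enumerate every accelerator C++ AMP can target on this machine
        // (CPU fallback, integrated GPU, discrete GPU, ...).
        for (const accelerator &acc : accelerator::get_all())
            std::wcout << acc.description << L"\n";

        // A trivial data-parallel kernel: element-wise vector addition.
        std::vector<float> a(1024, 1.0f), b(1024, 2.0f), c(1024);
        array_view<const float, 1> av(1024, a), bv(1024, b);
        array_view<float, 1> cv(1024, c);
        cv.discard_data();                     // c's initial contents need not be copied over

        parallel_for_each(cv.extent, [=](index<1> i) restrict(amp) {
            cv[i] = av[i] + bv[i];             // runs on the default accelerator
        });
        cv.synchronize();                      // copy the result back into c
        return 0;
    }

An overload of parallel_for_each also accepts an accelerator_view, so the same kernel can be directed at any of the enumerated devices, which is broadly what the demo's switching between CPU, integrated GPU, and discrete GPU amounts to.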
  • by Anonymous Coward on Wednesday June 15, 2011 @11:30PM (#36458618)

    As someone actually at the event, someone who attended both the keynote and the later (and more in-depth) technical session, and someone who is employed as a GPU programmer, I would say that it's being vastly overblown. It is very easy to look at the examples in the keynote (dense matrix multiplication with very little code modification, and an N-body simulation for which the code is not shared) and believe that this is finally some panacea for the difficulties involved in GPU computing and massively parallel computing in general.

    But the reality is, much like some approaches before it, C++ AMP simply elides some of the verbosity in the CUDA/OpenCL APIs regarding memory allocation, thread configuration, etc. The matrix multiplication example appears dead simple because matrix multiplication on a GPU is dead simple.

    As soon as you start trying to write more advanced applications with this, you find that you need to take advantage of a fast shared memory to get worthwhile performance gains -- to do that, you add "tiles" to your "grid" (in CUDA terms, "blocks" and "grid"; in OpenCL terms, "local workgroups" and "global workgroup"). As soon as your output starts getting more complicated than a nice, deterministic matrix multiplication or N-body simulation, you may find that you have potential race conditions that you have to address yourself. And when you've broken up your problem into a tiled grid, taken fast local shared memory and slow global shared memory into account, and ensured that you have no race conditions, you've basically done all of the work of writing a CUDA or OpenCL kernel. Only now you've done it in a way that is very proprietary, instead of the (comparatively open) CUDA and (way the fuck more open) OpenCL.

    It's unfortunate that it is being sold as some amazing, world-changing breakthrough, because although it is nothing of the sort, it is in fact quite a nice concept. Like Microsoft's PPL, it can be used to parallelize existing code very easily, provided the code is parallelizable and written in a parallel-friendly manner. It is not something, however, that will do the work of parallelization for you, or even the work of optimizing parallel-friendly code for GPU hardware.
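    Concretely, once you add tiling, the matrix multiplication example looks roughly like this in C++ AMP. This is only a sketch: the tile size, the square-matrix assumption, and the function name are illustrative, not code from the keynote.

        #include <amp.h>
        using namespace concurrency;

        static const int TS = 16;   // tile size: a CUDA "block" / OpenCL "local workgroup"

        // C = A * B for N x N matrices; N is assumed to be a multiple of TS.
        void tiled_matmul(array_view<const float, 2> A,
                          array_view<const float, 2> B,
                          array_view<float, 2> C, int N)
        {
            C.discard_data();
            parallel_for_each(C.extent.tile<TS, TS>(),
                              [=](tiled_index<TS, TS> idx) restrict(amp)
            {
                // tile_static is the fast on-chip shared memory discussed above
                // (CUDA __shared__, OpenCL __local).
                tile_static float a[TS][TS];
                tile_static float b[TS][TS];

                float sum = 0.0f;
                for (int k = 0; k < N; k += TS) {
                    // Each thread stages one element of A and one of B into the tile.
                    a[idx.local[0]][idx.local[1]] = A(idx.global[0], k + idx.local[1]);
                    b[idx.local[0]][idx.local[1]] = B(k + idx.local[0], idx.global[1]);
                    idx.barrier.wait();   // synchronize the tile before reading shared data

                    for (int t = 0; t < TS; ++t)
                        sum += a[idx.local[0]][t] * b[t][idx.local[1]];
                    idx.barrier.wait();   // don't restage the tile while others still read it
                }
                C[idx.global] = sum;
            });
        }

    Structurally that is the same tiling, staging, and barrier discipline a CUDA or OpenCL kernel needs; what C++ AMP saves you is mostly the host-side boilerplate, not the hard part.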

  • Re:AMP? (Score:4, Interesting)

    by slimjim8094 ( 941042 ) on Thursday June 16, 2011 @12:28AM (#36458924)

    To be fair, doing any nontrivial assembly will put some serious hair on your balls. But it's just not very good for (almost*) any real work.

    I use C for performance-critical code, C++ for complex, performance-significant components (like an OpenGL million-poly renderer), Java or C# (depending on the target platform(s)) for large but otherwise modest programs, and scripting languages (mostly Python) for one-off programs or little tools that don't justify the involvement of a more heavyweight language.

    Use the right tool for the job, as always. Can't go too far wrong with Java, and if you're going to hit its performance wall, you should know up front.

    * The only large-scale assembly coding I've ever had to do was for a compilers class, where there was obviously no way around it. Fascinating to learn and do, but I sure hope I'm done with it...
