Nvidia Will Support ARM Hardware For High-Performance Computing (venturebeat.com) 24
An anonymous reader quotes a report from VentureBeat: At the International Supercomputing Conference (ISC) in Frankfurt, Germany this week, Santa Clara-based chipmaker Nvidia announced that it will support processors architected by British semiconductor design company Arm. Nvidia anticipates that the partnership will pave the way for supercomputers capable of "exascale" performance -- in other words, of completing at least a quintillion floating point computations ("flops") per second, where a flop equals two 15-digit numbers multiplied together. Nvidia says that by 2020 it will contribute its full stack of AI and high-performance computing (HPC) software to the Arm ecosystem, which by Nvidia's estimation now accelerates over 600 HPC applications and machine learning frameworks. Among other resources and services, it will make available CUDA-X libraries, graphics-accelerated frameworks, software development kits, PGI compilers with OpenACC support, and profilers. Nvidia founder and CEO Jensen Huang pointed out in a statement that, thanks to this commitment, Nvidia will soon accelerate all major processor architectures: x86, IBM's Power, and Arm. "As traditional compute scaling has ended, the world's supercomputers have become power constrained," said Huang. "Our support for Arm, which designs the world's most energy-efficient CPU architecture, is a giant step forward that builds on initiatives Nvidia is driving to provide the HPC industry a more power-efficient future."
Weyland-Yutani is real (Score:2)
Re: (Score:2)
Says the guy who clearly knows nothing about FPGAs.
FPGAs are more expensive to produce, more difficult to program (read: more expensive again), and less universally useful, because far more complex logic can be executed in software on a universal logic controller (e.g. a CPU or GPU) than can be programmed into an FPGA - you're limited by the number of logic gates you can fit on the chip, where software running on a CPU is not - and FPGAs use many times more power than CPUs or GPUs.
That last bit is th
Wonder if this is a response to some markets shrinking (Score:2)
Since Apple steadfastly refuses to support Nvidia in its systems (or is it the other way around? Not sure), and Tesla has dropped Nvidia in favor of custom hardware that will probably have other self-driving car makers following suit (tremendous power/performance savings), I wonder if Nvidia is looking to expand into new markets so it isn't dependent on gamers and localized machine-learning boxes?
Re: (Score:2)
I wonder if this has anything to do with the strange, ridiculous anti-Kendall AC group (just one guy??) that keeps smearing him over incredibly minor grammar/spelling errors and rather meaningless differences of opinion.
I suppose there's no way we'll ever know.
British company? (Score:2)
What's this about a British company? As I understand it, ARM is now a division of a Japanese company.
Re: (Score:2)
I have to take issue with the summary (Score:3)
1st: FLOPS: FLoating point Operations Per Second. Not FLoating point cOmPutationS.
2nd: A flop is not two 15-digit numbers multiplied together. If that were true, my 32-bit MCU without an FPU would have a better FLOPS measurement than it does. The MCU can easily compute 123456789012345 x 123456789012345 using 64-bit integer arithmetic in software on its 32-bit ALU, much faster than it can compute 1.23456789012345 x 123.456789012345 with software floating point.
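For illustration, here's a minimal C sketch (mine, not the parent poster's) of the two paths being compared; the function names are arbitrary, the soft-float routine named in the comment is an assumption about a GCC-style no-FPU target, and since a full 15-digit by 15-digit product needs more than 64 bits, the integer path keeps only the low 64 bits:

    #include <stdint.h>
    #include <stdio.h>

    /* Integer path: on an FPU-less 32-bit core the compiler lowers this
     * 64-bit multiply to a handful of 32-bit multiply/add instructions.
     * The full 15-digit x 15-digit product needs ~94 bits, so only the
     * low 64 bits survive here -- enough to make the point about speed. */
    static uint64_t int_mul(uint64_t a, uint64_t b) {
        return a * b;                      /* wraps modulo 2^64 */
    }

    /* Floating point path: with no FPU, every '*' on doubles becomes a
     * call into a soft-float library routine (e.g. libgcc's __muldf3)
     * that unpacks sign/exponent/mantissa, multiplies the mantissas,
     * adds the exponents, then rounds and repacks -- far more work. */
    static double soft_mul(double a, double b) {
        return a * b;
    }

    int main(void) {
        printf("%llu\n", (unsigned long long)int_mul(123456789012345ULL,
                                                     123456789012345ULL));
        printf("%.15g\n", soft_mul(1.23456789012345, 123.456789012345));
        return 0;
    }

On a desktop with an FPU both multiplies compile to a single instruction each; the gap the parent describes only appears when the floating point work has to be emulated in software.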
Re: (Score:2)
Indeed! I just read up on it, since it's fascinating. My impression was that FLOPS was maybe an average-use measurement (e.g. an FPU or CPU calculating at full speed will generally perform some number of multiplies and adds, resulting in that many FLOPS), but that is completely wrong! And it was a bit wrong-headed to begin with, and the reason should be obvious - a floating point operation is a step in the process (e.g. get value from registers, add x to y and save in z, etc), not a full computation, so of cours
Re: (Score:2)
multiply and addition take the exact same number of steps in floating point arithmetic
Rubbish. This depends on the exact nature of the hardware and floating point algorithms used. Your handwavy argument above is entertaining but has no basis in fact, logic or basic computer architecture knowledge.
Re: (Score:2)
It's great that you supplied any information at all that explained where my information was incorrect, rather than just saying "nuh-uh".
Oh wait that's not what you did at all! My mistake.
Since you don't seem to believe it's really four steps, I'll work it out for you in binary logic*. Let's say we've got a simple problem, 2.25 + 134.0625 (S is sign, E is exponent, M is mantissa, with the implied (1) on the front of it):
S  E          M
0  1000 0000  (1) 001 0000 0000 0000 0000 0000  +
0  1000 0110  (1) 000 0110 0001 0000 0000 0000
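For anyone who wants to check those bit patterns, here's a minimal C sketch (mine, not the parent poster's; the function and variable names are arbitrary) that unpacks the same two values into their single-precision fields:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Print the IEEE 754 single-precision fields of a float:
     * 1 sign bit, 8 exponent bits (bias 127), and 23 mantissa bits
     * with an implied leading 1 for normal numbers. */
    static void show_fields(float f) {
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);    /* bit-for-bit copy, no UB */
        uint32_t sign = bits >> 31;
        uint32_t exp  = (bits >> 23) & 0xFFu;
        uint32_t man  = bits & 0x7FFFFFu;
        printf("%10g  S=%u  E=", (double)f, sign);
        for (int i = 7; i >= 0; i--) putchar('0' + (int)((exp >> i) & 1u));
        printf("  M=(1)");
        for (int i = 22; i >= 0; i--) putchar('0' + (int)((man >> i) & 1u));
        putchar('\n');
    }

    int main(void) {
        show_fields(2.25f);              /* S=0 E=10000000 M=(1)001 0000 ... */
        show_fields(134.0625f);          /* S=0 E=10000110 M=(1)000 0110 0001 ... */
        show_fields(2.25f + 134.0625f);  /* the result of the addition, 136.3125 */
        return 0;
    }

The addition itself then aligns the exponents (shifting the smaller mantissa right by the difference), adds the mantissas, and renormalizes and rounds the result, which is where the step counting in this thread comes from.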
Re: (Score:2)
And of course I flubbed the sign on the multiply Step 3, oh well. Should be 0 not 1.
Re: (Score:2)
You flubbed more than that. See, the result is already normal in this case. Don't be such an idiot. You aren't the only one in possession of an implementation manual for the IEEE operations; in fact, I doubt you even possess one.
FLOPS are theoretical (Score:2)
The number of FLOPS was almost always defined as the maximum number of floating point opera