Supercomputing Businesses Hardware Technology

Nvidia Will Support ARM Hardware For High-Performance Computing (venturebeat.com) 24

An anonymous reader quotes a report from VentureBeat: At the International Supercomputing Conference (ISC) in Frankfurt, Germany this week, Santa Clara-based chipmaker Nvidia announced that it will support processors architected by British semiconductor design company Arm. Nvidia anticipates that the partnership will pave the way for supercomputers capable of "exascale" performance -- in other words, of completing at least a quintillion floating point computations ("flops") per second, where a flop equals two 15-digit numbers multiplied together. Nvidia says that by 2020 it will contribute its full stack of AI and high-performance computing (HPC) software to the Arm ecosystem, which by Nvidia's estimation now accelerates over 600 HPC applications and machine learning frameworks. Among other resources and services, it will make available CUDA-X libraries, graphics-accelerated frameworks, software development kits, PGI compilers with OpenACC support, and profilers. Nvidia founder and CEO Jensen Huang pointed out in a statement that, thanks to this commitment, Nvidia will soon accelerate all major processor architectures: x86, IBM's Power, and Arm. "As traditional compute scaling has ended, the world's supercomputers have become power constrained," said Huang. "Our support for Arm, which designs the world's most energy-efficient CPU architecture, is a giant step forward that builds on initiatives Nvidia is driving to provide the HPC industry a more power-efficient future."
This discussion has been archived. No new comments can be posted.

  • It's currently known as ARM Holdings.
  • Since Apple steadfastly is not supporting Nvidia in systems (or is the other way around? Not sure), and Tesla has dropped Nvidia in favor of custom hardware design that will probably have other self driving car makers following suit (tremendous power/performance savings), I wonder if Nvidia is looking to expand to new markets so they aren't dependent on gamers and localized machine learning boxes?

  • What's this about a British company? As I understand it, ARM is now a division of a Japanese company.

  • by Durrik ( 80651 ) on Monday June 17, 2019 @06:36PM (#58778958) Homepage
    >least a quintillion floating point computations ("flops") per second, where a flop equals two 15-digit numbers multiplied together

    1st: FLOPS: FLoating point Operations Per Second. Not FLoating point cOmPutationS.

    2nd: A flop is not two 15 digit numbers multiplied together. If that was true my 32-bit MCU without a FPU would have a better FLOPS measurement than it does. The MCU can easily run 123456789012345 x 123456789012345 using 64-bit integer software on the 32-bit ALU, much faster than it can run 1.23456789012345 x 123.456789012345 with floating point software.
    • Indeed! I just read up on it, since it's fascinating. My impression was that maybe FLOPS was an average use type measurement (e.g. an FPU or CPU calculating full-speed will generally perform X number multiplies and adds, resulting in X number of FLOPS), but that is completely wrong! And was a bit wrong-headed to begin with, and the reason should be obvious - a floating point operation is a step in the process (e.g. get value from registers, add x to y and save in z, etc), not a full computation, so of cours

      • multiply and addition take the exact same number of steps in floating point arithmetic

        Rubbish. This depends on the exact nature of the hardware and floating point algorithms used. Your handwavy argument above is entertaining but has no basis in fact, logic or basic computer architecture knowledge.

        • It's great that you supplied any information at all that explained where my information was incorrect, rather than just saying "nuh-uh".

          Oh wait that's not what you did at all! My mistake.

          Since you don't seem to believe it's really four steps, I'll work it out for you in binary logic*. Let's say we've got a simple problem, 2.25 + 134.0625 (S is sign, E is exponent, M is mantissa, with the implied (1) on the front of it):
          S E M
          0 1000 0000 (1) 001 0000 0000 0000 0000 0000 +
          0 1000 0110 (

          • And of course I flubbed the sign on the multiply Step 3, oh well. Should be 0 not 1.

            • You flubbed more than that. See already normal in this case. Don't be such an idiot. You aren't the only one in possession of an implementation manual for the IEEE operations, in fact, I doubt you even possess one.

  • FLOPS is an informal term that gained traction in the supercomputer community in the 80's. It never meant much back then because not all computers implemented the IEEE floating point standard. For IBM it was either 32 or 64 bit floating point while for CDC and later Cray it was 60 bits. IBM would quote faster 32 bit values while their competitors would say the real comparison should be 60 bit vs 72 bit. It was a mess.

    The number of flops was almost always defined as the maximum number of floating point opera
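
The sign/exponent/mantissa layout used in the worked addition example in the thread (2.25 + 134.0625) can be checked directly. The following is a minimal sketch, assuming IEEE 754 single precision (1 sign bit, 8 exponent bits, 23 stored mantissa bits with an implied leading 1); the helper name `float32_fields` is hypothetical and not taken from any comment. It only decodes the bit fields of the operands and of their sum; it does not reproduce the step-by-step procedure argued about above.

```python
import struct

def float32_fields(x):
    """Return (sign, exponent, mantissa) bit strings for x packed as an
    IEEE 754 single-precision value. Hypothetical helper for illustration."""
    # Pack as a big-endian 32-bit float, then reinterpret as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 stored bits; the leading 1 is implied
    return f"{sign:01b}", f"{exponent:08b}", f"{mantissa:023b}"

# The two operands from the thread, followed by their sum.
for value in (2.25, 134.0625, 2.25 + 134.0625):
    s, e, m = float32_fields(value)
    print(f"{value:>10} -> S={s} E={e} M=(1){m}")
```

All three values are exactly representable in single precision, so packing them loses nothing. The first two rows match the S, E, and M fields written out in the comment above as far as that comment survives.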
