AI Hardware Science

NVIDIA Creates a 15B-Transistor Chip With 16GB Bandwidth Memory For Deep Learning (venturebeat.com) 128

An anonymous reader cites a report on VentureBeat: NVIDIA chief executive Jen-Hsun Huang announced that the company has created a new chip, the Tesla P100, with 15 billion transistors and 16GB of high-bandwidth memory for deep-learning computing. It's the biggest chip ever made, Huang said. "We decided to go all-in on A.I.," Huang said. "This is the largest FinFET chip that has ever been done." The chip's 15 billion transistors are roughly three times as many as in other processors or graphics chips on the market, and the die takes up 600 square millimeters. The chip can run at 21.2 teraflops. Huang said that several thousand engineers worked on it for years. Jim McGregor, writing for Forbes (the link is not accessible to users of ad-blocking tools): It features NVIDIA's new Pascal GPU architecture, the latest memory and semiconductor process and packaging technology -- all to create the densest compute platform to date. In addition, it combines 16GB of die-stacked second-generation High-Bandwidth Memory (HBM2). The memory and GPU are combined into a multichip module on a state-of-the-art silicon substrate. The P100 has NVIDIA's NVLink interface technology to connect to multiple Tesla P100 GPU modules.
  • by Anonymous Coward on Tuesday April 05, 2016 @02:06PM (#51847895)

    Please enjoy hunting me with your time machine.

  • by dargaud ( 518470 )
    can it make me a sandwich ?
    • I'm always amazed that it takes so many engineers. What the heck do they all do? How does one organize this many contributions? Isn't this sort of thing highly automated, with largely repetitive subunits?

      • Re:1000 engineers (Score:5, Informative)

        by cowtamer ( 311087 ) on Tuesday April 05, 2016 @03:07PM (#51848309) Journal

        There are several factors. First of all, what they are building is a HUGE engineered system which would have taken up a couple of buildings a decade or two ago. The fact that the end product is small doesn't change the complexity. The second part is the fact that it IS so small, which brings its own complications. In addition, semiconductor manufacturing is a very tricky business where even making the simplest thing (e.g., a transistor) takes an enormous amount of planning, characterization, and tool design.

        Part of it is the R&D -- nothing like this has been done before, so certain things have to be figured out (heat dissipation, how the proximity of the components affects the other components, stuff neither of us will understand, etc., etc.). Another huge part is tooling and process -- someone has to design, test, and characterize the fabrication tools and processes (the "automation" you speak of has to be built by someone -- a device this complicated probably can't be built without the automation). The chip is divided into subsystems, each of which needs to be designed, simulated, and optimized. Someone has to integrate all the subsystems and simulate them together. The 1000 people probably include material scientists, process engineers, electrical engineers of various stripes, semiconductor physicists, mechanical engineers (heat dissipation, packaging, etc.), systems engineers, engineering project managers, etc.

      • To some degree yes, but making a physical chip requires working within the confines of the physical limitations of the fabrication process, so it's never as simple as designing some ideal chip. You can certainly try to do it that way, but the yields will suck. It's also not trivial to design the sub-units, and there are always new instruction sets or other such things to be supported.
      • Re:1000 engineers (Score:5, Informative)

        by crgrace ( 220738 ) on Tuesday April 05, 2016 @03:20PM (#51848403)

        One organizes many contributions using any number of industry-standard design methodologies. Designing airplanes and cars uses even more engineers.

        I suspect NVIDIA is slightly exaggerating and counting the contributions of many "overhead" engineers who provide value for the whole engineering organization, such as people who work on design tools, design kits, methodology, and the like.

        You're right, there are many repeated subunits, but each one needs a team to optimize it.

        For a chip this complex you need:

        Logic Designers (who come up with high-level models for the chip and define the instruction set / hardware interface)
        Front-end engineers that write Verilog and/or VHDL (I have no idea what NVIDIA uses)
        Implementation engineers (who do place and route and parasitic extraction)
        Verification engineers (who use various tools to see if everything is as it should be)
        Packaging engineers (who work closely with vendors to develop a custom package for the chip/module)
        Module engineers (since we have 3D stacked memories on this device the module engineering is far from trivial)
        Thermal Engineers (3D modules typically have very complex thermal requirements)
        Signal Integrity engineers (since we're going so fast just getting a signal from point A to point B is hard)
        Analog/Mixed Signal engineers (for clocking, serial I/O development)
        Integration Engineers (for modeling how to put all this together)
        System Engineers (for figuring out if this is all going to work)
        Software Engineers (for low-level software dev)
        CAD Engineers (for developing and maintaining an appropriate computer-aided design flow)
        Foundry Engineers (for working with the foundry on the physical production of the wafers... anything this big and complex will need process customization)
        ESD engineers (for figuring out and implementing an ESD strategy)
        Library Engineers (for customizing and optimizing the standard cell library used in the chip)
        Product Engineers (for solving production problems as they arise)
        Test Engineers (for developing and implementing tests to show the chip is working as expected)
        Application Engineers (who work with early adopters to integrate this chip into their systems)

        and on and on and on...

        As you can see, an army of engineers is required for a chip this complex to see the light of day. On simpler chips, many of these roles can be played by the same people, but in a chip this big, they need to divide the work or it would never get done.

        • For a chip this complex you need:

          How many of those job titles and descriptions actually correspond to a college major that an American citizen can learn?

          It's a problem mostly seen in the U.S., say labor-market experts, thanks to a rapidly evolving economy and a divide between the country's educational institutions and employers that isn't there in other advanced economies. In Germany and Denmark, for example, the two groups collaborate to ensure training and apprenticeships lead to jobs after graduation. The gap has helped push U.S. job vacancies to 5.5 million, near historic highs. For most of the past year the number of job openings has exceeded the number of new hires, a reflection of employers' difficulty in filling positions.

          http://www.wsj.com/articles/colleges-drill-down-on-job-listing-terms-1459704268 [wsj.com]

          • by crgrace ( 220738 )

            I see what you're getting at and I don't disagree. However, as you know, this stuff is really complicated and you need to be specialized in your career to be effective. Most of these jobs are for Electrical Engineers; a few could also be held by people who studied Computer Science or Mechanical Engineering. I'm an Analog/Mixed-Signal Engineer, and while I know Verilog and how to run verification tools, I'm frankly not as competent at those roles as specialists are. It is the way of the world.

            I agree you coul

          • by Pulzar ( 81031 )

            How many of those job titles and descriptions actually correspond to a college major that an American citizen can learn?

            They almost all correspond to an electrical or computer engineering major; however, the vast majority are not really available to new grads. Three or four on that list will accept new grads who have already specialized a bit in their master's programs, and then once they have a better handle on the big picture they can transfer into other roles.

            For example, you don't design systems without k

        • Wow, "and on and on and on..."

          It is amazing how much I take for granted.

      • by KGIII ( 973947 )

        I've actually been wondering and asking, for a while now, why we can't just buy like a 5 cubic inch block of CPU and stuff it in our computers. Yes, I know it will get hot. Yes, I know it'll suck down juice like an arc welder. I'm okay with that. I've got solar and wind. I can cool that down - there are things to do that with.

        Seriously, am I the only one that envisions a 5^3" chunk of CPU and all the glorious things I could do with it? Coupled with stacks of those NVRAM critters, piped right next to it, and

        • Except you can't remove the heat from inside the block. The reason CPUs are basically flat 2D pieces is so we can glue on cooling fins and cool that bad boy down. Until we figure out how to do micro liquid cooling, under pressure, and interweave that cooling into the chip's design, we won't get real 3D stacks of processors.

          Though what we can do is create 4-6 CPUs stacked vertically around a coolant tube. But I believe that runs afoul of pushing parts of the PCB too far away from the rest of the componen

        • Because we can only make things in layers. So if you want to stack a million layers of silicon, and each layer is a dozen process steps, manufacturing a wafer will take years. Even with current 2D designs, the process already takes many weeks. The chance of a tiny error in one of the layers ruining the whole chip would be huge.
    • by Grog6 ( 85859 )

      Only if you have admin privileges, are a superuser, and enter the right password.

      Or say "sudo make me a sandwich".
      It works on geek girls, anyway, from what I hear. :)

  • Maybe you should use some of those engineers to fix your drivers, to, you know, support the people that have already paid you for a product you already produce. Seems that deep learning tech hasn't taught you anything. Spoken as an owner of GTX 660 SLI setup, not some rabid other team fanboi.
    • by halivar ( 535827 )

      Yes, let's have the chip designers drop everything and help write the device drivers. Brilliant. I should have the network engineers come upstairs and help me with my Excel spreadsheets.

      • Make sure you have a couple of them so they can argue among themselves and reach an agreement.
        Worst case, send some of them off to fetch you a coffee!

      • I am sure they could help if YOU are the one doing the spreadsheet work, since reading comprehension eludes you. Mr. NVidia said 'engineers', which is a broad term, versus chip designers which is pretty specific. If they needed 100,000 chip designers then they DO have problems. Now crawl back under your bridge.
        • by halivar ( 535827 )

          Oh God, okay, I suppose you're right, and they had software engineers design this new chip instead of actual chip designers, thus stealing the precious resources you're bitching about.

    • Perhaps the value here is the AI to overcome that driver OOPS! Replace those lame driver devs with this AI chip!
      • Artificial Intelligence sponsored/created by a selfish corporation. Let me get some Mellow Yellow and Pringles so I can enjoy the show.
  • This should provide some astonishing porn.

    • LOL, why am I suddenly picturing millions of horny guys getting blown off by a porn AI which has developed attitude?

      Except for the people into that whole humiliation thing, I just don't see that being a big selling point. :-P

      I think sentient porn is the last thing we want.

    • Well, it could learn to categorize it and learn your preferences very fast, so it's not out of the question...

  • In the Tesla's firmware http://jalopnik.com/a-hacker-m... [jalopnik.com] That would be interesting if it was a chip reference and not a car reference --- tinfoil hat.
  • by Anonymous Coward on Tuesday April 05, 2016 @02:14PM (#51847957)

    From what my friends who work at nVidia tell me, most engineers work on all projects. They get sent problems from one GPU, after fixing that, start working on issues from a CPU or some other project.

  • by Anonymous Coward

    16GigaBillion

  • by etash ( 1907284 ) on Tuesday April 05, 2016 @02:27PM (#51848053)
    Yes, it can do 21.2 teraflops... at FP16... half precision... it can "only" do 10.6 teraflops at single precision and 5.3 teraflops at double (64-bit) precision. Also, it doesn't have the 1TB/sec HBM2 memory speed that was advertised (for months), but only 720GB/sec.
    • by Anonymous Coward on Tuesday April 05, 2016 @02:35PM (#51848121)

      It turns out that for deep learning, half precision is very commonly used. You are using floats for numbers in a fairly small range, and accuracy isn't key. Half precision speeds up processing and, more importantly, lets you work with twice as much data.
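
      (A quick numpy sketch of the "twice as much data" half of that claim; the array size is arbitrary. A float16 tensor holds the same element count in half the bytes, so twice as much fits in a given memory or bandwidth budget.)

```python
import numpy as np

n = 1_000_000                        # arbitrary element count
fp32 = np.zeros(n, dtype=np.float32)
fp16 = np.zeros(n, dtype=np.float16)

print(fp32.nbytes)                   # 4000000 bytes
print(fp16.nbytes)                   # 2000000 bytes: same count, half the footprint
```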

      • Is there really any advantage over 16 bit integer, which would be faster and less complex?
        • by Mandrel ( 765308 )

          Is there really any advantage over 16 bit integer, which would be faster and less complex?

          Yes, artificial (and real) neurons deliver a weighted average of positive and negative inputs. So you usually have large positive and negative inputs which subtractively cancel to a moderate output. Integer doesn't handle this subtractive cancellation nearly as well as floating point, which can keep the same precision over large changes in scale.
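
          (A hedged numpy sketch of the scale argument; the values here are made up. A single int16 fixed-point scale chosen to cover the largest input has one fixed absolute resolution, so tiny values round away to zero, while float16 keeps roughly three decimal digits of relative precision at every magnitude.)

```python
import numpy as np

# Hypothetical tensor mixing a large activation with a tiny gradient.
values = np.array([150.0, 0.0001])

# int16 fixed point: one scale must cover the largest magnitude,
# so the absolute resolution is the same for every element.
scale = 0.01                                # max representable ~327.67
quantized = np.round(values / scale).astype(np.int16)
restored = quantized * scale                # -> [150.0, 0.0]: the small value is gone

# float16: relative precision at every scale, so both values survive.
as_f16 = values.astype(np.float16)          # -> [150.0, ~1.0e-04]
```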

      • Shaders for mobile GPUs use half precision quite a bit; however, the small range (-2 to 2) is a problem for many operations, so you end up only being able to use it for less than half of the code.

  • by tekrat ( 242117 ) on Tuesday April 05, 2016 @02:28PM (#51848063) Homepage Journal

    Just imagine a beowulf clust........
    Oh, never mind.....

  • Isn't that a little close to the other Tesla? http://jalopnik.com/a-hacker-m... [jalopnik.com]
  • I'm still waiting for the game AI that can match a human brain for strategy in an open world, especially in an RPG, but really in anything beyond well-studied board games. It's so frustrating to have the computer win by cheating. Not to mention the implications for new expert systems. This technology can't mature soon enough.
  • Hmmm... NVIDIA. Giant chip.

    Bill Dally, are you going for a "jump approximate" instruction again?

  • Is this right?? 2' x 2' chip?
  • Note it's 21.2 Tflops at half precision. Single is 10.6, and double is 5.3.
  • Deep learning leads to Deep Thought leads to forty two.

  • by BrendaEM ( 871664 ) on Tuesday April 05, 2016 @08:27PM (#51850079) Homepage
  • How do you make 16 GB memory from 15 G transistors?
  • So can it run 600 km on a single charge?

  • An aside from the article: "Huang showed a demo from Facebook that used deep learning to train a neural network how to recognize a landscape painting. They then used the network to create its own landscape painting."

    So long for such jobs... How about deep learning about post-scarcity economics?
    https://en.wikipedia.org/wiki/... [wikipedia.org]
    https://en.wikipedia.org/wiki/... [wikipedia.org]

    Also: ""Our strategy is to accelerate deep learning everywhere," Huang said."

    How about some deep learning about morality? Imagine training children (or
