NVIDIA Creates a 15B-Transistor Chip With 16GB Bandwidth Memory For Deep Learning (venturebeat.com) 128

Posted by msmash on Tuesday April 05, 2016 @03:01PM from the affinity-for-transistors dept.

An anonymous reader cites a report on VentureBeat: NVIDIA chief executive Jen-Hsun Huang announced that the company has created a new chip, the Tesla P100, with 15 billion transistors, 16GB high-bandwidth memory for deep-learning computing. It's the biggest chip ever made, Huang said. "We decided to go all-in on A.I.," Huang said. "This is the largest FinFET chip that has ever been done." The chip has 15 billion transistors, or three times as much as many processors or graphics chips on the market. It takes up 600 square millimeters. The chip can run at 21.2 teraflops. Huang said that several thousand engineers worked on it for years. Jim McGregor, writing for Forbes (the link is not accessible to ad-blocking tool users): It features NVIDIA's new Pascal GPU architecture, the latest memory and semiconductor process, and packaging technology -- all to create the densest compute platform to date. In addition, it combines 16GB of die stacked second-generation High-Bandwidth Memory (HBM2). The memory and GPU are combined into a multichip module on a state-of-the-art silicon substrate. The P100 has NVIDIA's NVLink interface technology to connect to multiple Tesla P100 GPU modules.

This discussion has been archived. No new comments can be posted.

Welcome SkyNET overlords! (Score:3, Funny)

by Anonymous Coward writes: on Tuesday April 05, 2016 @03:06PM (#51847895)

Please enjoy hunting me with your time machine.

- Re: (Score:2)
  
  by ShanghaiBill ( 739463 ) writes:
  
  Please enjoy hunting me with your time machine.
  https://xkcd.com/652 [xkcd.com].
- Re: (Score:2)
  
  by U2xhc2hkb3QgU3Vja3M ( 4212163 ) writes:
  
  (spoiler) Since the Doctor is Skynet, he already has a time machine.
So, (Score:2)

by dargaud ( 518470 ) writes:

can it make me a sandwich ?
- 1000 engineers (Score:2)
  
  by goombah99 ( 560566 ) writes:
  
  I'm always amazed how it takes so many engineers. What the heck do they all do? How does one organize this many contributions? Isn't this sort of thing highly automated, with largely repetitive subunits?
  - - Re: (Score:2)
      
      by U2xhc2hkb3QgU3Vja3M ( 4212163 ) writes:
      
      Dr. Ellie Sattler: Woman inherits the earth.
      - Re: (Score:3)
        
        by slew ( 2918 ) writes:
        
        (with apologies to Michael Crichton)
        Ian - God creates intelligence, god destroys intelligence. God creates man, man destroys god. Man creates AI.
        Ellie - AI destroys man, women inherit the earth...
        Perhaps a different, more historical view from the 1950's http://www.alteich.com/oldsite... [alteich.com]
        Dwar Ev threw the switch. There was a mighty hum, the surge of power from ninety-six billion planets. Lights flashed and quieted along the miles-long panel. Dwar Ev stepped back and drew a deep breath. "The honor of asking the first question is yours, Dwar Reyn."
        "Thank you," said Dwar Reyn. "It shall be a question that no single cybernetics machine has been able to answer." He turned to face the machine. "Is there a God?"
        The mighty voice answered without hesitation, without the clicking of a single relay.
        "Yes, now there is a God."
        Sudden fear flashed on the face of Dwar Ev. He leaped to grab the switch.
        A bolt of lightning from the cloudless sky struck him down and fused the switch shut.
    - Re: (Score:2)
      
      by AK Marc ( 707885 ) writes:
      
      This won't get us closer to AI. A Sperm whale has a brain about 5x the size of a human, and isn't any smarter. So presuming that a larger brain will be smarter is contrary to the known facts.
      
      The solution to hard AI will not be direct.
  - Re:1000 engineers (Score:5, Informative)
    
    by cowtamer ( 311087 ) writes: on Tuesday April 05, 2016 @04:07PM (#51848309) Journal
    
    There are several factors. First of all, what they are building is a HUGE engineered system which would have taken up a couple of buildings a decade or two ago. The fact that the end product is small doesn't change the complexity. The second part is the fact that it IS so small, which brings its own complications. In addition, semiconductor manufacturing is a very tricky business where even making the simplest thing (e.g., a transistor) takes an enormous amount of planning, characterization, and tool design.
    Part of it is the R&D -- nothing like this has been done before, so certain things have to be figured out (heat dissipation, how the proximity of the components affects the other components, stuff neither of us will understand, etc.). Another huge part is tooling and process -- someone has to design, test, and characterize the fabrication tools and processes (the "automation" you speak of has to be built by someone -- a device this complicated probably can't be built without the automation). The chip is divided into subsystems, each of which needs to be designed, simulated, and optimized. Someone has to integrate all the subsystems and simulate them together. The 1000 people probably include material scientists, process engineers, electrical engineers of various stripes, semiconductor physicists, mechanical engineers (heat dissipation, packaging, etc.), systems engineers, engineering project managers, etc.
    
    - - Re: (Score:2)
        
        by Coren22 ( 1625475 ) writes:
        
        You could probably train the AI to play Quake 1 pretty effectively.
  - Re: (Score:2)
    
    by alvinrod ( 889928 ) writes:
    
    To some degree yes, but making a physical chip requires working within the confines of the physical limitations of the fabrication process, so it's never as simple as designing some ideal chip. You can certainly try to do it that way, but the yields will suck. It's also not trivial to design the sub-units, and there are always new instruction sets or other such things to be supported.
  - Re:1000 engineers (Score:5, Informative)
    
    by crgrace ( 220738 ) writes: on Tuesday April 05, 2016 @04:20PM (#51848403)
    
    One organizes many contributions using any number of industry-standard design methodologies. Designing airplanes and cars uses even more engineers.
    I suspect NVIDIA is slightly exaggerating and is counting the contribution of many "overhead" engineers who provide value for the whole engineering organization, such as people who work on design tools, design kits, methodology and the like.
    You're right, there are many repeated subunits, but each unit needs a team to be optimized.
    For a chip this complex you need:
    Logic Designers (who come up with high-level models for the chip and define the instruction set / hardware interface)
    Front-end engineers that write Verilog and/or VHDL (I have no idea what NVIDIA uses)
    Implementation engineers (who do place and route and parasitic extraction)
    Verification engineers (who use various tools to see if everything is as it should be)
    Packaging engineers (who work closely with vendors to develop a custom package for the chip/module)
    Module engineers (since we have 3D stacked memories on this device the module engineering is far from trivial)
    Thermal Engineers (3D modules typically have very complex thermal requirements)
    Signal Integrity engineers (since we're going so fast just getting a signal from point A to point B is hard)
    Analog/Mixed Signal engineers (for clocking, serial I/O development)
    Integration Engineers (for modeling how to put all this together)
    System Engineers (for figuring out if this is all going to work)
    Software Engineers (for low-level software dev)
    CAD Engineers (for developing and maintaining an appropriate computer-aided design flow)
    Foundry Engineers (for working with the foundry on the physical production of the wafers... anything this big and complex will need process customization)
    ESD engineers (for figuring out and implementing an ESD strategy)
    Library Engineers (for customizing and optimizing the standard cell library used in the chip)
    Product Engineers (for solving production problems as they arise)
    Test Engineers (for developing and implementing tests to show the chip is working as expected)
    Application Engineers (who work with early adopters to integrate this chip into their systems)
    and on and on and on...
    As you can see, an army of engineers is required for a chip this complex to see the light of day. On simpler chips, many of these roles can be played by the same people, but in a chip this big, they need to divide the work or it would never get done.
    
    - Re: (Score:2)
      
      by __aaclcg7560 ( 824291 ) writes:
      
      For a chip this complex you need:
      How many of those job titles and descriptions actually correspond to a college major that an American citizen can learn?
      It's a problem mostly seen in the U.S., say labor-market experts, thanks to a rapidly evolving economy and a divide between the country's educational institutions and employers that isn't there in other advanced economies. In Germany and Denmark, for example, the two groups collaborate to ensure training and apprenticeships lead to jobs after graduation. The gap has helped push U.S. job vacancies to 5.5 million, near historic highs. For most of the past year the number of job openings has exceeded the number of new hires, a reflection of employers' difficulty in filling positions.
      http://www.wsj.com/articles/colleges-drill-down-on-job-listing-terms-1459704268 [wsj.com]
      - Re: (Score:2)
        
        by crgrace ( 220738 ) writes:
        
        I see what you're getting at and I don't disagree. However, as you know, this stuff is really complicated and you need to be specialized in your career to be effective. Most of these jobs are for Electrical Engineers; a few could also be held by people who studied Computer Science or Mechanical Engineering. I'm an Analog/Mixed-Signal Engineer and while I know Verilog and how to run verification tools, I'm frankly not as competent at those roles as specialists are. It is the way of the world.
        I agree you coul
      - Re: (Score:2)
        
        by Pulzar ( 81031 ) writes:
        
        How many of those job titles and descriptions actually correspond to a college major that an American citizen can learn?
        They almost all correspond to an electrical or computer engineering major, however the vast majority are not really available to new grads. Three or four on that list will accept new grads who have already specialized a bit in their masters programs, and then once they have a better handle on the big picture they can transfer into other roles.
        For example, you don't design systems without k
    - Re: (Score:1)
      
      by X-Ray Artist ( 1784416 ) writes:
      
      Wow, "and on and on and on..."
      It is amazing how much I take for granted.
  - Re: (Score:2)
    
    by KGIII ( 973947 ) writes:
    
    I've actually been wondering and asking, for a while now, why we can't just buy like a 5 cubic inch block of CPU and stuff it in our computers. Yes, I know it will get hot. Yes, I know it'll suck down juice like an arc welder. I'm okay with that. I've got solar and wind. I can cool that down - there are things to do that with.
    Seriously, am I the only one that envisions a 5^3" chunk of CPU and all the glorious things I could do with it? Coupled with stacks of those NVRAM critters, piped right next to it, and
    - Re: (Score:2)
      
      by peragrin ( 659227 ) writes:
      
      Except you can't remove the heat from inside the block. The reason CPUs are basically flat 2D pieces is so we can glue on cooling fins and cool that bad boy down. Until we figure out how to do micro liquid cooling, under pressure, and interweave that cooling into the chip's design, we won't get real 3D stacks of processors.
      Though what we can do is create 4-6 CPUs stacked vertically around a coolant tube. But I believe that runs afoul of pushing parts of the pct too far away from the rest of the componen
    - Re: (Score:2)
      
      by religionofpeas ( 4511805 ) writes:
      
      Because we can only make things in layers. So if you want to stack a million layers of silicon, and each layer is a dozen process steps, manufacturing a wafer will take years. Even with current 2D designs, the process already takes many weeks. The chance of a tiny error in one of the layers messing up the whole chip will be huge.
- Re: (Score:3)
  
  by Grog6 ( 85859 ) writes:
  
  Only if you have admin privileges, are a superuser, and enter the right password.
  Or say "sudo make me a sandwich".
  It works on geek girls, anyway, from what I hear. :)
Now We Know Why Drivers Suck (Score:2)

by zenlessyank ( 748553 ) writes:

Maybe you should use some of those engineers to fix your drivers, to, you know, support the people that have already paid you for a product you already produce. Seems that deep learning tech hasn't taught you anything. Spoken as an owner of GTX 660 SLI setup, not some rabid other team fanboi.
- - Re: (Score:2)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    I'm guessing that the mention of 'SLI' might be the key point here. Taking problems not explicitly designed to be parallelized and attempting to parallelize them at the driver level after the fact is...a bit of a mixed bag...in terms of actually working. It's not clear that Nvidia is holding out on us here; given that they sell fancy multi-GPU systems to high end customers I'm sure that they would be delighted to also offer tools that make using those expensive multi-GPU systems really easy; but that doesn'
    - - Re: (Score:2)
        
        by fuzzyfuzzyfungus ( 1223518 ) writes:
        
        I think I did a terrible job of explaining it: Shader programs are indeed designed to be parallelized across however many shader units a given GPU provides. However, SLI (presumably because the niche market doesn't justify a nicer interconnect, or because it may simply not be feasible to provide the same level of integration between multiple PCBs as it is between elements on the same die) doesn't provide particularly close integration of the participant GPUs.
        
        SLI-ed GPUs can't even share VRAM(with some lim
        
        Re: (Score:2)
        
        by Coren22 ( 1625475 ) writes:
        
        SLI is an acronym, it stands for Scan Line Interleave. What this means, is that each GPU does half the screen worth of work by running a line at a time on each GPU. I am not sure what the OP's issue with drivers is, but my assumption would be the age of the hardware, 6xx is pretty old, and might not be enough to support modern games anymore. I highly doubt the OP's issue is with the driver having parallelization problems.
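A toy sketch of the scan-line interleaving idea described above, assuming two GPUs. The helper name `assign_scanlines` is hypothetical and is not an NVIDIA or 3dfx API; it only shows how alternating lines would split a frame between cards.

```python
def assign_scanlines(height: int, num_gpus: int = 2) -> dict[int, list[int]]:
    """Map each GPU index to the scanlines it would render (hypothetical helper)."""
    work = {gpu: [] for gpu in range(num_gpus)}
    for y in range(height):
        # Line y goes to GPU (y mod num_gpus), so neighbouring lines alternate cards.
        work[y % num_gpus].append(y)
    return work


if __name__ == "__main__":
    # An 8-line "frame" split across two GPUs.
    print(assign_scanlines(8))
```

With two GPUs, one card gets the even lines and the other the odd lines, so each renders half the frame.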
        
        Re: (Score:2)
        
        by Blaskowicz ( 634489 ) writes:
        
        SLI was Scan Line Interleave back in 1998 (and the Voodoo5 used small horizontal bands of pixels rather than straight scanlines). Then NVIDIA decided to revive it years later, deciding the letters would stand for Scalable Link Interface, which doesn't really mean anything in particular. It never interleaved scanlines again.
  - Re: (Score:2)
    
    by Hognoxious ( 631665 ) writes:
    
    You're box is messed up
    One item on a long, long list.
- Re: (Score:2)
  
  by halivar ( 535827 ) writes:
  
  Yes, let's have the chip designers drop everything and help write the device drivers. Brilliant. I should have the network engineers come upstairs and help me with my Excel spreadsheets.
  - Re: (Score:2)
    
    by angel'o'sphere ( 80593 ) writes:
    
    Make sure you have a couple of them so they can argue among themselves and find an agreement.
    Worst case, send random ones off to fetch you a coffee!
  - Re: (Score:2)
    
    by zenlessyank ( 748553 ) writes:
    
    I am sure they could help if YOU are the one doing the spreadsheet work, since reading comprehension eludes you. Mr. NVidia said 'engineers', which is a broad term, versus chip designers which is pretty specific. If they needed 100,000 chip designers then they DO have problems. Now crawl back under your bridge.
    - Re: (Score:2)
      
      by halivar ( 535827 ) writes:
      
      Oh God, okay, I suppose you're right, and they had software engineers design this new chip instead of actual chip designers, thus stealing the precious resources you're bitching about.
- Re: (Score:1)
  
  by martinfb ( 743607 ) writes:
  
  Perhaps the value here is the AI to overcome that driver OOPS! Replace those lame driver devs with this AI chip!
  - Re: (Score:2)
    
    by zenlessyank ( 748553 ) writes:
    
    Artificial Intelligence sponsored/created by a selfish corporation. Let me get some Mellow Yellow and Pringles so I can enjoy the show.
- - Re: (Score:2, Informative)
    
    by Anonymous Coward writes:
    
    It's a FinFET device. You can represent more than 1 binary bit per transistor by using multi-gate transistors.
    This is not a factually correct statement. Multi-gate transistors are used because they are more energy-efficient, perform better, and can be scaled to smaller dimensions than traditional planar CMOS devices. The extra gates give better electrostatic control over the MOSFET channel, but they do not allow the device to perform operations on more than one bit of data at once.
    https://en.wikipedia.org/wiki/Multigate_device
  - Re:15B transistors = 16 GB ? (Score:5, Informative)
    
    by DRJlaw ( 946416 ) writes: on Tuesday April 05, 2016 @04:45PM (#51848563)
    
    It's a FinFET device. You can represent more than 1 binary bit per transistor by using multi-gate transistors.
    Oh, for God's sake, I ignored this at first but now it's been modded up.
    15 billion is the transistor count for the GPU logic. It's not the transistor count for the HBM2 memory installed alongside the GPU on the interposer. Adding an interposer does not suffice to make it all the same chip (hint from TFS: "multichip module").
    FinFET is neither necessary nor sufficient for multi-level-cell-like bit representation. That's also a flash storage technology, not a logic or volatile memory technology (at least in mass-produced products).
    It's 15 days to Weed Day. Put down whatever you're smoking and get back to work.
    
    - Re: (Score:2)
      
      by rahvin112 ( 446269 ) writes:
      
      15 billion is the transistor count for the GPU logic.
      I've seen nothing that would indicate either way that this could be the truth. It's pure speculation. Others have speculated that to reach that 15 billion number they have to be counting the memory transistors as well. Though this is big at 600mm2, it isn't that much bigger than previous dies that held a fraction of that number of transistors.
      - Re: (Score:2)
        
        by Agripa ( 139780 ) writes:
        
        I've seen nothing that would indicate either way that this could be the truth. It's pure speculation. Others have speculated that to reach that 15 billion number they have to be counting the memory transistors as well. Though this is big at 600mm2, it isn't that much bigger than previous dies that held a fraction of that number of transistors.
        It has about 50% more transistors than the Oracle SPARC M7 at 10.2 billion, so the increase is reasonable.
      - Re: (Score:2)
        
        by DRJlaw ( 946416 ) writes:
        
        You do realize that 15 billion transistors, if you assume that each holds one bit of stored information (HA!), is less than 2GB of storage?
        BTW, it's not speculation, it's from NVIDIA's own press release.
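The arithmetic above is easy to check. This is a minimal sketch under the same deliberately generous assumption of one stored bit per transistor; it also assumes the 16GB figure means 16 GiB. The variable and function names are illustrative only.

```python
TRANSISTORS = 15_000_000_000  # transistor count quoted for the GPU
BITS_PER_BYTE = 8


def bytes_if_one_bit_each(transistors: int) -> float:
    """Storage in bytes if every transistor held exactly one bit (an upper bound)."""
    return transistors / BITS_PER_BYTE


# 15e9 bits / 8 = 1.875e9 bytes, i.e. under 2 GB.
capacity_gb = bytes_if_one_bit_each(TRANSISTORS) / 1e9

# Bits needed to hold 16 GiB, for comparison with the transistor count.
hbm2_bits = 16 * 2**30 * BITS_PER_BYTE

if __name__ == "__main__":
    print(f"one bit per transistor: {capacity_gb} GB")
    print(f"bits in 16 GiB: {hbm2_bits:,}")
    print(f"ratio of bits needed to transistors: {hbm2_bits / TRANSISTORS:.2f}")
```

Even under that assumption, 16 GiB needs roughly nine times more bits than the chip has transistors, so the memory cannot be part of the 15 billion.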
        
        Re: (Score:2)
        
        by rahvin112 ( 446269 ) writes:
        
        There are two sets of memory on this chip if you read the reports. The die itself has an additional layer that is HBM (High Bandwidth Memory) linked directly to the CPU. Think of it like the L1 and L2 cache in x86 chips. There is nothing that indicates how much memory this is (as the quoted memory sizes are for the memory chips attached to the boards). I'm willing to bet the chip has around 10 billion transistors and the remaining 5 billion are the HBM layer that sits on top.
        
        Re: (Score:2)
        
        by DRJlaw ( 946416 ) writes:
        
        There are only two sets of memory if you consider the register file to be memory instead of cache (which you apparently do). The problem is, the published specifications demonstrate that you are simply wrong.
        4MB of L2 cache and ~14MB of register file space per GPU [nvidia.com] means that there is about 151 million bits associated with cache and "memory." On a chip with 15.3 billion transistors, that comfortably means that you have about 15 billion transistors for GPU logic.
        There is everything to indicate the specs of
- Re: (Score:2)
  
  by CastrTroy ( 595695 ) writes:
  
  Since they are talking about bandwidth, I would guess that what they really mean is 16 GB/s. Although I don't see any reference to bandwidth in the article and the only reference I see to 16 is the 16 nm fabrication process.
- Re: (Score:1)
  
  by Anonymous Coward writes:
  
  The chip has 15B transistors and no RAM. (it does have 4MB L2 cache and 14MB worth of register files)
  The entire Tesla P100 package is comprised of many chips not just the GPU, that collectively add up to over 150 billion transistors and features 16GB of stacked HBM2 VRAM.
- Re: (Score:2)
  
  by OrangeTide ( 124937 ) writes:
  
  I can address 16GByte with only 34 bits. That leaves more than 14.9B transistors left over to do whatever.
Traditional early adopter killer app (Score:2)

by Impy the Impiuos Imp ( 442658 ) writes:

This should provide some astonishing porn.
- Re: (Score:2)
  
  by gstoddart ( 321705 ) writes:
  
  LOL, why am I suddenly picturing millions of horny guys getting blown off by a porn AI which has developed attitude?
  Except for the people into that whole humiliation thing, I just don't see that being a big selling point. :-P
  I think sentient porn is the last thing we want.
- Re: (Score:2)
  
  by Perky_Goth ( 594327 ) writes:
  
  Well, it could learn to categorize it and learn your preferences very fast, so it's not out of the question...
The P100 was already discovered.. (Score:2)

by HumanWiki ( 4493803 ) writes:

In the Tesla's firmware http://jalopnik.com/a-hacker-m... [jalopnik.com] That would be interesting if it was a chip reference and not a car reference --- tinfoil hat.
- Re: (Score:1)
  
  by michelcolman ( 1208008 ) writes:
  
  The reference was P100D, so maybe the car is using two of them?
most nVidia engineers work on all projects (Score:4, Informative)

by Anonymous Coward writes: on Tuesday April 05, 2016 @03:14PM (#51847957)

From what my friends who work at nVidia tell me, most engineers work on all projects. They get sent problems from one GPU, after fixing that, start working on issues from a CPU or some other project.

Units (Score:1)

by Anonymous Coward writes:

16GigaBillion
somewhat deceiving numbers.... (Score:5, Informative)

by etash ( 1907284 ) writes: on Tuesday April 05, 2016 @03:27PM (#51848053)

Yes, it can do 21.2 teraflops... at FP16, half precision. It can "only" do 10.6 teraflops at single precision and 5.3 teraflops at double (FP64) precision. Also, it doesn't have the 1TB/sec HBM2 memory speed that was advertised for months, only 720GB/sec.

- Re:somewhat deceiving numbers.... (Score:5, Informative)
  
  by Anonymous Coward writes: on Tuesday April 05, 2016 @03:35PM (#51848121)
  
  It turns out that for deep learning, half precision is very commonly used. You are using floats for numbers in a fairly small range, and accuracy isn't key. Half precision speeds up processing and, more importantly, lets you work with twice as much data.
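To make the storage half of that claim concrete, here is a minimal sketch using Python's standard `struct` module, whose `e` format code packs IEEE 754 half-precision values. The sample weights are made up for illustration, and nothing here measures GPU throughput.

```python
import struct

weights = [0.1, -0.25, 0.5, 0.75]  # made-up sample values


def pack(values, code):
    """Pack floats little-endian using the given struct format code."""
    return struct.pack(f"<{len(values)}{code}", *values)


half_bytes = pack(weights, "e")    # half precision: 2 bytes per value
single_bytes = pack(weights, "f")  # single precision: 4 bytes per value


def roundtrip_half(x: float) -> float:
    """Return x after being stored as a half-precision float and read back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


if __name__ == "__main__":
    print(len(half_bytes), len(single_bytes))
    print(roundtrip_half(0.1))  # close to 0.1, but not exactly equal
```

The same buffer size holds twice as many half-precision values, at the cost of a small rounding error on values such as 0.1.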
  
  - Re: (Score:2)
    
    by ChrisMaple ( 607946 ) writes:
    
    Is there really any advantage over 16 bit integer, which would be faster and less complex?
    - Re: (Score:2)
      
      by Mandrel ( 765308 ) writes:
      
      Is there really any advantage over 16 bit integer, which would be faster and less complex?
      Yes, artificial (and real) neurons deliver a weighted average of positive and negative inputs. So you usually have large positive and negative inputs which subtractively cancel to a moderate output. Integer doesn't handle this subtractive cancellation nearly as well as floating point, which can keep the same precision over large changes in scale.
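A rough illustration of the scale argument, assuming a signed 16-bit fixed-point format whose full scale is chosen to cover values up to 1000. The helpers `to_half` and `to_fixed16` are hypothetical names, and the full-scale choice is an assumption made only for this comparison.

```python
import struct

INT16_MAX = 32767


def to_half(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fixed16(x: float, full_scale: float) -> float:
    """Quantize x to a signed 16-bit grid spanning [-full_scale, +full_scale]."""
    step = full_scale / INT16_MAX
    q = max(-INT16_MAX, min(INT16_MAX, round(x / step)))
    return q * step


if __name__ == "__main__":
    for value in (0.001, 1.0, 1000.0):
        print(value, to_half(value), to_fixed16(value, full_scale=1000.0))
```

With this grid the fixed-point step is about 0.03, so 0.001 collapses to zero, while half precision keeps a relative error below about 0.1% at both ends of the range.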
    - - Re: (Score:3)
        
        by 0100010001010011 ( 652467 ) writes:
        
        Of course they do and do it with 0.00001525878 precision to boot. If that's all you need then you can get by with 16 bit numbers just fine.
      - Re:somewhat deceiving numbers.... (Score:5, Informative)
        
        by fizzup ( 788545 ) writes: on Tuesday April 05, 2016 @05:24PM (#51848795)
        
        I don't think that's a very good explanation. If sigmoid or step neurons only used numbers in the range [0,1], then you could divide the range into 65,536 individual states and use 16 bits to translate [0,65535] into [0,1]. However, sigmoid neurons have many inputs of many different weights, so the total input to a sigmoid neuron can be greater than one. In fact, any one input, after weighting, can be greater than one. The weights themselves can be greater than one. Only the output is constrained to [0,1] by the sigmoid or step function.
        In order to represent a number without a lot of accuracy, but keep the ability to represent large and small values, you need a floating-point number. I'm no expert in deep learning, but it does pass the sniff test that a 16-bit float would be good enough for neurons. I assume that NVIDIA has done their homework and determined that FP-8 numbers have too much rounding error to be useful in a neural network.
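The point about ranges can be seen in a minimal sigmoid neuron. This is a sketch with made-up inputs and weights, not code from any deep learning framework.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function: squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias=0.0):
    """Return (weighted input sum, activation) for one sigmoid neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return z, sigmoid(z)


# Illustrative values: inputs lie in [0, 1], but two of the weights exceed 1.
z, out = neuron([0.9, 0.8, 0.7], [2.5, -1.0, 3.0])

if __name__ == "__main__":
    print(f"weighted sum = {z:.2f}, output = {out:.3f}")
```

Here the weighted sum is about 3.55, well outside [0,1], and only the output is squashed back into that range.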
        
    - - Re: (Score:2)
        
        by DrJimbo ( 594231 ) writes:
        
        Now with floating point 0.5*0.5 = 0.25 which is a smaller number as expected. If you multiply two positive integers like 50*50 you get 2500, so a larger value which requires further operations on it for it to be useful.
        The only "further operation" needed is to look at the higher word of the result which takes zero extra effort. For example, if you multiply two 16-bit words then you get a 32-bit result. The "extra effort" is taking the upper 16-bits of the result and ignoring the lower 16-bits.
        There may well be good reasons for FP16 to be preferred over using integers but scaling the result of multiplications isn't one of them.
        
        Re: (Score:2)
        
        by religionofpeas ( 4511805 ) writes:
        
        The only "further operation" needed is to look at the higher word of the result which takes zero extra effort. For example, if you multiply two 16-bit words then you get a 32-bit result. The "extra effort" is taking the upper 16-bits of the result and ignoring the lower 16-bits.
        
        So, multiplying 100 by 100 equals 0, but starting at 0 and adding 100 for 100 times equals 10000 ?
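Both posts can be checked with a small sketch, assuming unsigned 16-bit words. The helper names are hypothetical.

```python
MASK16 = 0xFFFF


def mul_hi16(a: int, b: int) -> int:
    """Multiply two unsigned 16-bit words and keep the upper 16 bits of the 32-bit product."""
    return ((a & MASK16) * (b & MASK16)) >> 16


def repeated_add(step: int, times: int) -> int:
    """Add step to zero the given number of times, wrapping at 16 bits."""
    total = 0
    for _ in range(times):
        total = (total + step) & MASK16
    return total


if __name__ == "__main__":
    # Words read as fractions of 65536: 0x8000 is 0.5 and 0x4000 is 0.25.
    print(hex(mul_hi16(0x8000, 0x8000)))
    # Words read as plain integers: the product 10000 fits entirely in the low word.
    print(mul_hi16(100, 100), repeated_add(100, 100))
```

Taking the upper word is consistent when the words are read as fractions of 65536 (0.5 times 0.5 gives 0.25), but read as plain integers it turns 100 times 100 into 0 while repeated addition gives 10000.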
  - Re: (Score:1)
    
    by edxwelch ( 600979 ) writes:
    
    Shaders for mobile GPUs use 1/2 precision quite a bit, however the small range (-2 - 2) is a problem for many operations, so you end up only being able to use them for less than half of the code.
I for one welcome our () Overlords (Score:5, Funny)

by tekrat ( 242117 ) writes: on Tuesday April 05, 2016 @03:28PM (#51848063) Homepage Journal

Just imagine a beowulf clust........
Oh, never mind.....

- Re: (Score:2)
  
  by avandesande ( 143899 ) writes:
  
  Actually I believe they are going to simulate Natalie Portman's mind using this chip.
  - - Re: (Score:2)
      
      by avandesande ( 143899 ) writes:
      
      If this processor runs hot we can use it to make the grits!
Tesla P100? (Score:1)

by rickyb ( 898092 ) writes:

Isn't that a little close to the other Tesla? http://jalopnik.com/a-hacker-m... [jalopnik.com]
- - Re: (Score:3)
    
    by ShanghaiBill ( 739463 ) writes:
    
    Musk can't copyright the name of a famous scientist.
    TRADEMARK, not copyright ... and yes he can, but only for a narrow commercial purpose. Elon owns the trademark "Tesla" as a car brand. NVIDIA owns the trademark "Tesla" as a GPU brand.
    - Re: (Score:2)
      
      by stealth_finger ( 1809752 ) writes:
      
      Musk can't copyright the name of a famous scientist.
      TRADEMARK, not copyright ... and yes he can, but only for a narrow commercial purpose. Elon owns the trademark "Tesla" as a car brand. NVIDIA owns the trademark "Tesla" as a GPU brand.
      And C&C owns "Tesla" as a tank
The Most Advanced Hyperscale Datacenter GPU Ever (Score:2)

by Grismar ( 840501 ) writes:

But can it run Crysis?
My brain has unlimited storage capacity and speed (Score:2)

by JoeyRox ( 2711699 ) writes:

Yet my learning ain't so deep.
Decent game AI when? (Score:1)

by Iamthecheese ( 1264298 ) writes:

I'm still waiting for the game AI that can match a human brain for strategy in an open world, especially in an RPG game but anything beyond well-studied board games really. It's so frustrating to have the computer win by cheating. Not to mention implications for new expert systems. This technology can't mature soon enough.
History repeats? (Score:2)

by dlleigh ( 313922 ) writes:

Hmmm... NVIDIA. Giant chip.
Bill Dally, are you going for a "jump approximate" instruction again?
600 square millimeters ???? (Score:2)

by colin_faber ( 1083673 ) writes:

Is this right?? 2' x 2' chip?
- Re:600 square millimeters ???? (Score:5, Funny)
  
  by Waffle Iron ( 339739 ) writes: on Tuesday April 05, 2016 @05:52PM (#51848955)
  
  Is this right?? 2' x 2' chip?
  That's right.
  And the package is shaped like Stonehenge.
  
- Re: (Score:2)
  
  by Janthkin ( 32289 ) writes:
  
  Is this right?? 2' x 2' chip?
  No, it's not right.
  600mm^2 is a chip just under 25mm on a side.
- Re: (Score:2)
  
  by TomGreenhaw ( 929233 ) writes:
  
  2 feet X 2 feet - my guess is no
  
  Square root of 600 mm^2 = 24.4948974278 mm
  
  24.4948974278 mm = 0.964366 inches (0.964366")
  
  Even still .. that's a friggin HUGE chip
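The conversion above can be reproduced in a few lines. This sketch assumes a perfectly square die, which the article does not state.

```python
import math

AREA_MM2 = 600.0    # die area from the article
MM_PER_INCH = 25.4

side_mm = math.sqrt(AREA_MM2)          # edge length of a square with that area
side_in = side_mm / MM_PER_INCH

two_feet_mm = 2 * 12 * MM_PER_INCH     # edge of the 2' x 2' chip asked about
two_feet_area_mm2 = two_feet_mm ** 2

if __name__ == "__main__":
    print(f"side: {side_mm:.2f} mm = {side_in:.3f} in")
    print(f"a 2' x 2' chip would be about {two_feet_area_mm2 / AREA_MM2:.0f} times the area")
```

So 600 square millimeters is a square a little under an inch on a side, hundreds of times smaller in area than two feet by two feet.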
20 Tflops at half precision (Score:2)

by rfengr ( 910026 ) writes:

Note it's 20 Tflops at half precision. Single is 10, and double is 5.
We all know how this ends (Score:2)

by U2xhc2hkb3QgU3Vja3M ( 4212163 ) writes:

Deep learning leads to Deep Thought leads to forty two.
Or.... (Score:3)

by BrendaEM ( 871664 ) writes: on Tuesday April 05, 2016 @09:27PM (#51850079) Homepage

https://www.youtube.com/watch?... [youtube.com]

Numbers don't add up (Score:2)

by flyingfsck ( 986395 ) writes:

How do you make 16 GB memory from 15 G transistors?
Tesla P100 (Score:2)

by hackertourist ( 2202674 ) writes:

So can it run 600 km on a single charge?
Deep learning about morality and post-scarcity? (Score:2)

by Paul Fernhout ( 109597 ) writes:

An aside from the article: "Huang showed a demo from Facebook that used deep learning to train a neural network how to recognize a landscape painting. They then used the network to create its own landscape painting."
So long for such jobs... How about deep learning about post-scarcity economics?
https://en.wikipedia.org/wiki/... [wikipedia.org]
https://en.wikipedia.org/wiki/... [wikipedia.org]
Also: ""Our strategy is to accelerate deep learning everywhere," Huang said."
How about some deep learning about morality? Imagine training children (or
Imagine what a BEOWULF CLUSTER of these could do!! (Score:1)

by ChoosyBeggar ( 2969823 ) writes:

:P
