AI Hardware Technology

Cerebras Systems Unveils a Record 1.2 Trillion Transistor Chip For AI (venturebeat.com) 67

An anonymous reader quotes a report from VentureBeat: New artificial intelligence company Cerebras Systems is unveiling the largest semiconductor chip ever built. The Cerebras Wafer Scale Engine has 1.2 trillion transistors, the basic on-off electronic switches that are the building blocks of silicon chips. Intel's first processor, the 4004, had 2,300 transistors in 1971, and a recent Advanced Micro Devices processor has 32 billion transistors. Samsung has actually built a flash memory chip, the eUFS, with 2 trillion transistors. But the Cerebras chip is built for processing, and it boasts 400,000 cores on 42,225 square millimeters. It is 56.7 times larger than the largest Nvidia graphics processing unit, which measures 815 square millimeters and has 21.1 billion transistors. The WSE also contains 3,000 times more high-speed, on-chip memory and has 10,000 times more memory bandwidth.
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • It's now only a matter of time...

    • Oh God, please tell me you are not one of those "AI is going to kill us all!" kooks.

       

  • I call bullshit. (Score:1, Informative)

    by fifirebel ( 137361 )

    I call bullshit on that one.

They claim 42,225 mm^2.
    That's over 4 dm^2, a rectangle 20 cm x 20 cm, roughly 7 in x 7 in.

    And the biggest available wafer only sports 64,000 mm^2. [wikipedia.org]

    So it cannot be a single chip. Now they may package a bunch of smaller chips into a single package, but that's been done before and is much less impressive. It cannot be the largest chip ever built. Possibly the largest number of chips in a single package, but they lost all cred at this stage.

      They claim 42,225 mm^2..... And the biggest available wafer only sports 64,000 mm^2. [wikipedia.org]

      Now, last I checked, 42,225 is less than 64,000. Not sure I see the problem?

      • The only issue aside from needing perfection is a fault-tolerant design with redundancy.

        • Perfection just impacts yield... ;) But the claimed size is clearly within the realm of reality.
          • by Matheus ( 586080 )

            This article seems to indicate that there are other limitations that make your likelihood of getting *any* successful chips quite low...
            https://www.quora.com/Why-does... [quora.com]

            "Perfection impacts yield" is generally acceptable because a given wafer can produce numerous die so getting some(hopefully high) % success is just a cost factor. When you need nearly your entire wafer to make a single die And your % success is negatively affected by the size of the die itself you're shooting yourself in the foot

    • SQRT(42225) is 205.487, or 200mm on a side.

      A wafer is over 300mm these days at the big foundries, so quite doable.

      Those kind of metric conversions have crashed spacecraft, so I wouldn't feel too bad. :)

      • by suutar ( 1860506 )

        That's what he said. 200mm = 20cm ~= 7in.

        It'll fit on a wafer, but it won't have much room for error.

        • Ok, so you would fit one chip on a 300mm wafer... your expected yield would be near zero. TFA doesn't say it's a single die, though. Current high-end chip fabs use 6'' x 6'' masks (minus a margin for mounting, barcodes, etc.), and all the high-end layers are printed at a 4x reduction, meaning ~1,000 mm^2 is near the maximum that can be achieved. Oh, and building a whole new tool set... not without multiple $bn of investment. I'm guessing (if this isn't complete vapourware) it's a "chiplet" approach as seen on all
          • by marcelk ( 224477 )

            It's possible, albeit expensive, to stitch multiple mask images together at the wafer level by careful projection and position control. That's how large field camera chips are made. 36x24 mm doesn't fit otherwise. Wafer-sized chips are also made for x-ray detection applications.

          • by suutar ( 1860506 )

            I didn't say I'd fit one chip on a wafer; I said that Grog6 is calling out a math error and yet coming up with the same numbers.
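
    For anyone checking the numbers in this sub-thread, here is a quick back-of-the-envelope geometry check (a minimal sketch only: the square die shape and the 300 mm wafer are assumptions taken from the comments above, not from the article):

    ```python
    import math

    # Does a 42,225 mm^2 square die fit on a 300 mm wafer?
    area_mm2 = 42_225
    side = math.sqrt(area_mm2)            # ~205.5 mm per side
    diagonal = side * math.sqrt(2)        # ~290.6 mm corner to corner

    wafer_diameter_mm = 300.0             # standard leading-edge wafer (assumed)
    wafer_area_mm2 = math.pi * (wafer_diameter_mm / 2) ** 2   # ~70,686 mm^2

    print(f"die side:     {side:.1f} mm")
    print(f"die diagonal: {diagonal:.1f} mm (wafer is {wafer_diameter_mm:.0f} mm across)")
    print(f"die uses ~{100 * area_mm2 / wafer_area_mm2:.0f}% of the wafer area")
    # A perfect square would just barely fit inside the circle, which squares with
    # "it'll fit on a wafer, but it won't have much room for error" above.
    ```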

    • If it's a square chip, then the corners are going to be outside the disk of a 64k mm^2 wafer, but who says it has to be square?

      Where I see issues is when we talk about interconnects between the cores. A little more information, like the size of the cores on the chip, as well as more explanation of the core interconnect strategy, would be nice. How many other cores (and memory) does each core connect to? Then, of course, there is power distribution and dissipation for the chip.

      If it is what they say it is,

    • Re:I call bullshit. (Score:5, Informative)

      by r2kordmaa ( 1163933 ) on Monday August 19, 2019 @06:21PM (#59103740)
      Not bullshit, they really made a wafer-sized chip; here are pics: https://www.bbc.com/news/techn... [bbc.com]
    • According to Wolfram Alpha.... half the size of a medium Domino's pizza :-)
      • ...and all the single-threaded performance of a used tampon. Plus they'll need advanced laser cooling if they want to ratchet it up past 500MHz... no, I did not RTFA.
    • It's not bullshit (Score:5, Informative)

      by StandardCell ( 589682 ) on Monday August 19, 2019 @09:30PM (#59104266)
      What they did was completely insane, but possibly not bullshit. This topic is covered in better detail at EE Times [eetimes.com], and they're not bullshit artists, though anything is technically possible.

      Most modern process technologies run on either 12" or 18" wafers. I don't know what the wafer map looks like and I'm too lazy to do the geometry, but it may literally be one die off a wafer at that size.

      The things that would concern me personally as a former lead ASIC designer are:

      - Design - over that kind of silicon area, how many possible clock domains and PLLs do they have? There'd have to be a lot of clock domains asynchronously latching data because even at low speeds you'd have enough clock skew to choke a blue whale. And what tool could actually place and route and what kind of memory/CPU did it have? How was power and signal integrity closed at this scale as well? And what type of front-end and back-end back-annotated simulation did they conduct, or was this all reliant on formal verification + static timing analysis? I'm even curious how long physical design and design-rule checks took on this.

      - Testability - in several respects, this is a monster to test, so how long does scan/BIST/memBIST take to run, what kind of probe card and load board was designed to test this at "wafer sort" (and I use that term loosely), and how do they deal with things like gross IDD (i.e. dead shorts between power and ground)? I get that they have some kind of built-in self-repair, but one gross IDD failure and you're literally cooked. Yields must be utterly dreadful even with a stable process at a Tier 1 fab.

      - Packaging - again in several respects, including how was the packaging designed, and what type of I/O and power distribution scheme was used (at 15kW no less!). I'd also be really concerned about what type of heat dissipation they have at that much power, and how they prevent warping of the package substrate because of thermal differences across the area of the die/package. Is this even possible with FR4 or did they go to PTFE or some other material? Same once it's placed on a board.

      - Product - what kind of I/O is this thing supporting? How many layers of PCB did they use for this? What actually feeds this thing data coherently? Where does it all go?

      Bluntly, that's a lot of questions. The fact that nobody heard anything about this up to now may be a factor of NDAs, but this monstrosity is so beyond the pale from a design perspective that I don't know that I could take someone seriously if they even told me to work on this. Again, I'm not saying it's impossible, but I'm saying that truly nothing like this has ever been attempted, and I would be much more reliant on a subdivided design with fast interconnect even when they're talking about the type of computing problem they're trying to solve. Let's see the package alone, and it'll answer some more questions for us.
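
      To put the power question in perspective, here is a rough power-density sketch (the 15 kW figure is taken from the parent comment, the die area from the summary, and the desktop-CPU numbers are ballpark assumptions for comparison, not figures from the article):

      ```python
      # Rough power-density comparison; comparison CPU numbers are assumptions.
      wse_power_w = 15_000.0
      wse_area_cm2 = 42_225 / 100.0          # ~422 cm^2, area from the summary
      cpu_power_w = 150.0                    # assumed high-end desktop CPU
      cpu_area_cm2 = 2.0                     # assumed die area

      print(f"wafer-scale engine: ~{wse_power_w / wse_area_cm2:.0f} W/cm^2 over {wse_area_cm2:.0f} cm^2")
      print(f"desktop CPU:        ~{cpu_power_w / cpu_area_cm2:.0f} W/cm^2 over {cpu_area_cm2:.0f} cm^2")
      # Per unit area the wafer actually runs cooler than a desktop CPU (~36 vs
      # ~75 W/cm^2), but the total heat to remove is two orders of magnitude
      # larger, which is why packaging, warpage, and power delivery dominate
      # the questions above.
      ```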
      • by hAckz0r ( 989977 )

        Since this is a "Sparse Array" device it may be possible that they plan on selling devices graded on how many quadrants are properly functioning, and merely marking or deactivating those bad cells/cores that were found to have problems during testing. I understand companies like Intel will crank up the GHz during testing and scale the frequency back until the chip passes the test, and the part is then labeled and sold for use at that GHz processing speed. Other types of failures just wind up on the recycle

    • And what does it matter that it's single wafer vs multiple separate dies?

        Do kids typically complain when mom buys Dino Bites instead of Fruity Pebbles?

        At least separate dies can be connected in 3D, reducing the length of the signal path and (theoretically) getting a much better throughput between dies.

  • They finally built a chip that can run Crysis! ;)

  • by stabiesoft ( 733417 ) on Monday August 19, 2019 @05:41PM (#59103568) Homepage
    WSI has been tried numerous times and always failed. Maybe it will work, but it has been more effective to do thin-film and mount known functional parts on the substrate, like the new AMD processor is doing.
    • by gTsiros ( 205624 )

      what's a "wsi" ? wafer sized ic ?

    • On the other hand, we haven't gone back to having a separate IC for each core and L2 cache and memory controller, have we? What's the optimum? Perhaps the highly parallel / repetitive structure of this chip makes it relatively painless to route around defective cores.
    • by necro81 ( 917438 )
      As I understand it, the (potential) benefit for going the wafer-scale route is that the power and overhead for inter-processor communications is drastically lower than if you are going between discrete chips, or even between separate dies in the same package/module. The distances are shorter, you don't have material discontinuities (i.e., going from silicon to the ball-grid to the die carrier and back), and so you don't have to drive the signals as hard. From the article:

      Typical messages traverse one hard

  • Imagine (Score:2, Funny)

    by HighOrbit ( 631451 )
    Imagine a beowulf cluster of these.....
  • "...which measures 815 square millimeters..."

    So, it's roughly the size of a side salad plate? Better be chilled.

  • For a system this size, the interconnect architecture is much more critical than the sheer number of cores. Anyone know what's going on there?

  • by Grog6 ( 85859 ) on Monday August 19, 2019 @05:55PM (#59103638)

    The bigger it is, the more likely a catastrophic defect becomes.

    That's why bigger chips are more expensive, because they have to throw so many away.

    The basic silicon is relatively cheap; but by the time you put 18 layers of diffusion, metal, and oxide on it at fine linewidths, it's pricier than platinum.

    Bondwires are made of gold, so think of the relation there. :)

    This is probably a $5million chip, and I'm likely to be off by more than one zero. :)
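
    For a rough sense of how die area drives yield, here is a minimal sketch using the classic Poisson yield model Y = exp(-A * D0); the defect density below is an illustrative assumption, not a figure from the article:

    ```python
    import math

    # Illustrative Poisson yield model: Y = exp(-A * D0)
    D0 = 0.1  # assumed defects per cm^2 for a mature process

    dies = {
        "large GPU (815 mm^2)": 815 / 100.0,        # area in cm^2
        "wafer-scale (42,225 mm^2)": 42_225 / 100.0,
    }

    for name, area_cm2 in dies.items():
        y = math.exp(-area_cm2 * D0)
        print(f"{name}: expected defect-free yield ~ {y:.2e}")

    # Roughly: the GPU-sized die comes out around 44% defect-free, while the
    # chance of a completely defect-free wafer-scale die is effectively zero --
    # hence the need for built-in redundancy rather than perfection.
    ```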

    • Plus, after you've made the thing, you need to come up with some way to keep it from catching on fire when you're using it. Something that size is going to draw some serious wattage that you need to get out, and even with that big a chip it's got to be running into serious challenges dissipating all of it.

    • by Kjella ( 173770 )

      While that's true, not all chips are created equal; it also depends on how many parts of the chip are critical and how much can be gated off as defective. For example, GPUs often have versions with some SPs/ROPs disabled. If this is an all-compute chip with a huge array of identical compute blocks, that could be easier than a GPU with unique elements like display outputs and such.

    • by tlhIngan ( 30335 )

      This is probably a $5million chip, and I'm likely to be off by more than one zero. :)

      Probably.

      I've seen systems built with $30,000 FPGAs (that's $30,000 EACH in 1000-piece quantities!). They were fairly large chips already (the package was around 8cm x 8cm, though the die is comparatively small). The system had 4 of them in most configurations, though I think we had one with two boards, so 8 of those FPGAs (think of it - nearly a quarter million dollars in FPGAs).

      Given the size of this chip, it wil

    • by fintux ( 798480 )

      The article says, however (emphasis mine):

      If you have only one chip on a wafer, the chance it will have impurities is 100%, and the impurities would disable the chip. But Cerebras has designed its chip to be redundant, so *one* impurity won’t disable the whole chip.

      I'm pretty sure by "one" they mean "a few". Of course there are almost certainly some critical areas where defects might make the whole chip fail, but probably it is tolerant to a big portion of the defects. Even without defects, this will certainly not be a cheap chip, as a whole wafer for typical CPUs costs somewhere around $10,000 or more, depending on the process used, and any more specific techniques can of course drive the prices up significantly (and those ty

    • by necro81 ( 917438 )

      The bigger it is, the more likely a catastrophic defect becomes.

      A defect may take out a portion of the chip, but the remainder should be configurable for use. With 10^5 cores and an inter-processor communication "fabric" (their term), you should be able to reconfigure or reroute around the damaged area.

      I expect that, much like with microprocessors, they intend to bin these chips/wafers according to how many defects they have and how well they perform at the wafer-scale testing. That is, the wafe
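
      As a sketch of how that redundancy changes the math (all numbers here are illustrative assumptions, not Cerebras's actual repair scheme): if defects land randomly and each one kills at most a single core out of the 400,000, the question stops being "is the wafer perfect?" and becomes "how many cores survive?":

      ```python
      # Illustrative redundancy sketch; defect density and "one defect kills at
      # most one core" are assumptions, not figures from the article.
      cores = 400_000
      wafer_area_cm2 = 42_225 / 100.0
      D0 = 0.1                                  # assumed defects per cm^2
      expected_defects = wafer_area_cm2 * D0    # ~42 random defects per wafer

      surviving = cores - expected_defects
      print(f"expected defects: ~{expected_defects:.0f}")
      print(f"surviving cores:  ~{surviving:.0f} "
            f"({100 * surviving / cores:.3f}% of the array)")
      # Under these assumptions a handful of defects costs a vanishing fraction
      # of the cores, which is consistent with binning wafers by how many cores
      # test good rather than demanding a perfect wafer.
      ```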

  • by dryriver ( 1010635 ) on Monday August 19, 2019 @05:55PM (#59103642)
    Remember how we all thought in the 1980s that by 2020 or so we'd have "VR that looks like real life"? Well, Nvidia and AMD/ATI failed miserably to make this happen with their "we'll give you 10% extra performance every year" BS business model. So let's get a GPU this size and see whether we can do "reality" in games and VR content. Nobody will buy a GPU that expensive, you say? Some people spend tens of thousands of dollars a year on "entertainment". I bet they'll buy it, and then gradually the price will come down for the rest of us. Also, 3D content creators will buy it. No more waiting hours for shit to render. Also, architects and CAD people will buy it. A GPU that size would let you walk around a not-yet-built building complex as though it is real. Seriously. Give us GPUs that are this big, and maybe we'll get "reality" in games and VR before we all die of old age or ill health. As for Nvidia/AMD? You guys keep selling us 15-year-old tech on little electronics boards with noisy fans on them.
    • So, you imagined something that doesn't exist. Then you call the plans and products of companies that are making money "BS".

      Then you say you wish you had a cool gaming rig gizmo.

      Yup, you're a kid.

    • This would be great.
      I could then play Civ 6.
    • Hmm. Maybe you need to go show Nvidia and AMD how it's done? Don't you think that if it were that easy, someone with the resources of Intel would have been doing it in order to gain market share? No - instead you see Nvidia and AMD slugging it out with modest gains because, as it turns out, doing that kind of processing massively faster is really fucking hard without making other unacceptable compromises. Like having multiple power supplies in your system to power up some mammoth processing engine

    • Remember how we all thought in the 1980s that by 2020 or so we'd have "VR that looks like real life"?

      No; most people had never even heard of "Vee Argh" in the 80's, not until Lawnmower Man came out. Anyone weaned on Gibson (or Stephenson's Snow Crash) was familiar with the concept, but it wasn't until after the 90's that we began to do more than hope.

    • You should learn about Physics. Moore's Law is dead, despite best efforts. You just never noticed.

      • Moore's law says nothing more than that the number of transistors on an integrated circuit doubles every two years.

        While on a log plot the curve for mainstream CPUs has bent a little flatter over the past 10 years, this chip, with over 1e12 transistors, puts a dot solidly on or even above the original main line going back to the 1960s.

        In other words, Moore's law isn't dead quite yet.
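
        A quick sanity check of that claim, taking the 4004's 2,300 transistors in 1971 from the summary and a strict two-year doubling (a rough sketch; the comparison figures also come from the summary above):

        ```python
        # Extrapolate a two-year doubling from the Intel 4004 (2,300 transistors,
        # 1971) to 2019 and compare against the chips mentioned in the summary.
        base_year, base_transistors = 1971, 2_300
        year = 2019
        doublings = (year - base_year) / 2            # 24 doublings
        trend = base_transistors * 2 ** doublings     # ~3.9e10

        print(f"two-year-doubling trend for {year}: ~{trend:.2e} transistors")
        print("recent AMD CPU:  3.2e10")
        print("Cerebras WSE:    1.2e12")
        # The plain trend lands around 3.9e10 -- close to a big conventional
        # CPU/GPU -- while the wafer-scale part sits well above the line, by
        # brute-force area rather than transistor density.
        ```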

        • Moore's law says nothing more than that the number of transistors on an integrated circuit doubles every two years.

          Moore's mildly favoured formulation chose to couch it in those terms.

          In addition to his having been inside the loop enough to get first dibs on a new economic paradigm of sustained exponential growth, Moore was canny enough to choose the formulation with maximal wiggle room (so as to gather more credit, while risking less blame). Many high school students are equally canny. This is not the toga

          • It doesn't matter: He said what he said.

            If you really dislike it so much, you can promulgate your own law.

  • I don't know why, but whenever I see posts like this, I always think of those old 3dfx commercials, where they paint this majestic dream of using their tech to solve the world's biggest problems, then throw in the "wait a minute... we could use this to play games!" :)

  • Pics or it didn't happen!

  • This thing might be able to heat your home all winter, but I would not want the electric bill. Maybe there should be a chute in the top of the case so dry ice can be fed in at all times.
  • by larryjoe ( 135075 ) on Monday August 19, 2019 @06:25PM (#59103752)

    The article makes it clear that this device is a wafer and not a chip, even though the title still makes claims about a chip. The wafer-scale device is still potentially impressive, but the claims of chip-level size and transistor count are misleading.

    • The article makes it clear that this device is a wafer and not a chip, even though the title still makes claims about a chip.

      It's both. It's a chip the size of a full wafer, instead of one of many chips cut out of a wafer.

  • I read the fine article. Typical closed-kimono orgasmic gush.

    While I didn't actually learn anything much at all, the penny finally did drop on Master Sergeant Schultz's likely career trajectory after the war concluded.

    John Banner was born to Jewish parents in Stanislav, Austria-Hungary (now Ivano-Frankivsk, Ukraine).

    He studied for a law degree at the University of Vienna, but decided instead to become an actor. In 1938, when he was performing with an acting troupe in Switzerland, Adolf Hitler annexed Austri

  • IEEE Spectrum has a post [ieee.org] regarding this chip. Yes, it really is the size of an entire wafer (lopped off to make a square-ish die a few hundred mm across).
