Tesla Unveils Dojo Supercomputer: World's New Most Powerful AI Training Machine (electrek.co) 32
New submitter Darth Technoid shares a report from Electrek: At its AI Day, Tesla unveiled its Dojo supercomputer technology while flexing its growing in-house chip design talent. The automaker claims to have developed the fastest AI training machine in the world. For years now, Tesla has been teasing the development of a new in-house supercomputer optimized for neural net video training. Tesla handles an enormous amount of video data from its fleet of over 1 million vehicles, which it uses to train its neural nets.
The automaker found itself unsatisfied with current hardware options for training its computer vision neural nets and believed it could do better internally. Over the last two years, CEO Elon Musk has been teasing the development of Tesla's own supercomputer, called "Dojo." Last year, he even teased that Dojo would have a capacity of over an exaflop, which is one quintillion (10^18) floating-point operations per second, or 1,000 petaflops. That could potentially make Dojo the most powerful supercomputer in the world.
Ganesh Venkataramanan, Tesla's senior director of Autopilot hardware and the leader of the Dojo project, led the presentation. He started by unveiling Dojo's D1 chip, which uses 7-nanometer technology and delivers breakthrough bandwidth and compute performance. Tesla designed the chips to "seamlessly connect without any glue to each other," and the automaker took advantage of that by connecting 500,000 nodes together. It adds the interface, power, and thermal management, and the result is what it calls a training tile: 9 petaflops of compute and 36 TB per second of bandwidth in a package smaller than one cubic foot. Tesla still has to assemble those training tiles into a compute cluster to truly build the first Dojo supercomputer. It hasn't put that system together yet, but CEO Elon Musk claimed that it will be operational next year.
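Taking the claimed 9 petaflops per training tile at face value, a quick back-of-the-envelope sketch (figures as reported from the presentation, not independently verified) shows roughly how many tiles an exaflop-class Dojo would need:

```python
# Figures as reported from Tesla's AI Day presentation (unverified claims).
PFLOPS_PER_TILE = 9            # claimed compute per training tile
TARGET_EXAFLOPS = 1.0          # the teased exaflop-class capacity

target_pflops = TARGET_EXAFLOPS * 1000
tiles_needed = target_pflops / PFLOPS_PER_TILE
print(f"Tiles for {TARGET_EXAFLOPS} EFLOPS: ~{tiles_needed:.0f}")  # ~111 tiles
```

So on the claimed numbers, an exaflop cluster is on the order of a hundred-plus tiles, before any efficiency losses from interconnect or software.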
Matrix Dojo. (Score:2)
A rather impressive piece of hardware [youtu.be].
Re: (Score:2)
Re: (Score:2)
Except, newer and faster.
sure if you ignore TPU, Cerebras and others (Score:3, Informative)
Re: (Score:2)
google's alphago was a really big deal in the field and they basically brute-forced the first release to beat everyone else who had to be clever about it. (they later developed a much more efficient training algorithm.) i suspect it paid off in terms of attracting talent and advertising for their cloud/ML services.
it's easy to say "throwing compute at it isn't the solution," except for when it is. as for wasting energy, well, all competition wastes energy strictly speaking; nonetheless it's what humans do.
I thought Musk was afraid of AI? (Score:1)
Re: (Score:2)
But he is building one? I am confused.
Musk is scared of AGI [wikipedia.org].
A chip for image processing using gradient descent DL is unlikely to lead to AGI.
Re: (Score:2)
He did say that they're going to make their androids run slower than humans and also make sure we can over-power them physically.
Re: (Score:2)
Technical analysis (Score:2)
https://semianalysis.com/tesla... [semianalysis.com]
The article claims 11 cabinets are enough to reach 1.1 exaflops.
Re: (Score:1)
Beyond hype (Score:4, Informative)
Let us compare with Fugaku supercomputer:
Fugaku: ~100k general-purpose chips --> 500 petaflops, actually measured.
Tesla: 3k custom AI chips --> 1.1 exaflops, but at 8/16-bit precision. With a 32-bit to 8/16-bit throughput ratio of about 1:16, the 32-bit equivalent is roughly 70 petaflops.
That works out to close to a 4x theoretical per-chip advantage over a two-year-old chip (Fugaku's was introduced in 2020; Tesla's is expected to be available next year).
So still good performance, but not as much as hyped.
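The commenter's arithmetic can be sketched as follows; every input here is the poster's own assumption (the chip counts, the 1:16 precision ratio), not a verified spec:

```python
# Reproducing the commenter's back-of-the-envelope comparison.
# All inputs are the commenter's assumed figures, not verified specs.
fugaku_pflops = 500          # measured, across ~100k general-purpose chips
fugaku_chips = 100_000
dojo_eflops_low_prec = 1.1   # claimed, assumed to be 8/16-bit throughput
dojo_chips = 3_000
ratio_fp32_to_low = 16       # assumed 32-bit : 8/16-bit throughput ratio

dojo_pflops_fp32 = dojo_eflops_low_prec * 1000 / ratio_fp32_to_low
per_chip_advantage = (dojo_pflops_fp32 / dojo_chips) / (fugaku_pflops / fugaku_chips)

print(f"Dojo at FP32: ~{dojo_pflops_fp32:.0f} PFLOPS")      # ~69 PFLOPS
print(f"Per-chip advantage: ~{per_chip_advantage:.1f}x")    # ~4.6x
```

The exact figure depends heavily on the assumed 1:16 precision ratio, which the thread later disputes.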
Re: (Score:2)
Re: (Score:3)
Sure, if your math is poor.
3k CPUs x 400 W per CPU = 1.2 MW. Its performance is about 1/7th of Fugaku's, so at Fugaku speed that's 8.4 MW, and that's just the CPU chips. Fugaku is 30 MW total. Go figure. At most a 2x improvement over a two-year-old machine. That is not what I'd call "absolutely destroys".
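Sketching that power arithmetic, again using only the poster's assumed figures:

```python
# Reproducing the commenter's power estimate. All inputs are the
# commenter's assumptions, not official specifications.
dojo_chips = 3_000
watts_per_chip = 400
dojo_chip_power_mw = dojo_chips * watts_per_chip / 1e6   # 1.2 MW, chips only

perf_deficit = 7   # Dojo assumed to be ~1/7th of Fugaku's FP32 throughput
scaled_power_mw = dojo_chip_power_mw * perf_deficit      # 8.4 MW at Fugaku speed

# Fugaku's ~30 MW is total system power, while the 8.4 MW above covers
# only the chips, so the commenter pegs the full-system gain at ~2x at most.
print(round(dojo_chip_power_mw, 2), round(scaled_power_mw, 2))
```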
It's FP32 Re:Beyond hype (Score:2)
These are not "8/16 bits". It's exaflop at FP32. (According to Elon Musk's tweet.)
Re: (Score:3)
Well, looking at the details, he may have exaggerated in that tweet. I'll have to look at the calculation.
Re: (Score:2)
Either you didn't read the tweet or he got it wrong. The article says 22 TFLOPS (32-bit) per chip, and there are 3,000 chips, so that's 66 petaflops. Anyway, I did further research on Fugaku. See https://www.fujitsu.com/global... [fujitsu.com] Based on that, it's 6.8 TFLOPS per CPU (32-bit). So Dojo is about 3 times faster at 32-bit, and maybe 2 times more power efficient. But this is in comparison to a two-year-old general-purpose 64-bit chip (Tesla doesn't even mention 64-bit for its chip).
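That per-chip comparison works out as follows (all figures as cited by the poster from the linked article and Fujitsu's page; treat them as approximate):

```python
# Commenter's FP32 per-chip comparison (figures from the linked article
# and Fujitsu's page as cited in the thread; approximate, not verified).
dojo_tflops_per_chip = 22       # FP32, per the semianalysis article
dojo_chips = 3_000
dojo_total_pflops = dojo_tflops_per_chip * dojo_chips / 1_000   # 66 PFLOPS

fugaku_tflops_per_cpu = 6.8     # FP32, per Fujitsu's spec page
speedup = dojo_tflops_per_chip / fugaku_tflops_per_cpu

print(f"Dojo total: {dojo_total_pflops:.0f} PFLOPS, per-chip speedup ~{speedup:.1f}x")
```

66 petaflops FP32 is well short of the tweeted "exaflop at FP32," which is the discrepancy this subthread is arguing about.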
See the image https://electrek.co/wp-content... [electrek.co] It says
Will this prevent . . . (Score:1)
his cars from plowing into emergency vehicles [cnn.com] at crash sites?
Re: (Score:2)
I think the plan is to understand why his cars are plowing into emergency vehicles at crash sites.
Re: (Score:2)
"Musk tweeted last month that Tesla's advanced camera-only driver assistance system, known as "Tesla Vision," will soon "capture turn signals, hazards, ambulance/police lights & even hand gestures."
https://www.reuters.com/busine... [reuters.com]
Re: (Score:2)
It was also NOT in general use either. All of the publicity to date has been around idiots with autopilot thinking THAT was FSD...
Re: (Score:2)
But call it autopilot, FSD, or driver assist, you are agreeing that Tesla's system was blind to turn signals, hazards, and ambulance/police lights right ?
Re: (Score:2)
The radar has to ignore objects that are stationary, and the camera doesn't seem to be contributing to the model of cars. I can take my car down the road and it'll see almost every car next to or in front of me, even at stop lights. If I take it down a residential road with cars parked on the side, it completely ignores them. Non-radar targets are fine, however, so it can correctly map trash cans, traffic cones, and a few other objects that are also stationary.
https://youtu.be/jQioNtg4oq4?t... [youtu.be]
A few-second clip of driving.
Re: (Score:2)
"Non-radar targets are fine however so it can correctly map trash cans, traffic cones, and a few other objects that are also stationary."
It sees those hazards. It may be intermittent at times, but largely any non-car object is detected, which is why I'm confused that it doesn't seem to see cars that aren't visible to the radar system.
Re: (Score:2)
Is this approach fundamentally flawed? (Score:2)
Re: (Score:2)
Their new system is doing predictive 3d modeling, unlike their old one. It's taking images and finding embeddings in a physical 3d space; it's rather impressive.
Solving that problem is a primary goal of the newer modeling process they discussed at their "AI day". Their system is now trained to assume object constancy despite intermittent occlusion, and there are planning neural networks that make predictions for the car itself as well as for other objects.
I think there is enough in the da