

Startup Claims Its Upcoming (RISC-V ISA) Zeus GPU is 10X Faster Than Nvidia's RTX 5090 (tomshardware.com)
"The number of discrete GPU developers from the U.S. and Western Europe shrank to three companies in 2025," notes Tom's Hardware, "from around 10 in 2000." (Nvidia, AMD, and Intel...)
No company in recent years — at least outside of China — has been bold enough to enter into competition with these three contenders, so the emergence of Bolt Graphics seems like a breakthrough. However, the major focuses of Bolt's Zeus are high-quality rendering for the movie and scientific industries as well as high-performance supercomputer simulations. If Zeus delivers on its promises, it could establish itself as a serious alternative for scientific computing, path tracing, and offline rendering. But without strong software support, it risks struggling against the dominant market leaders.
This week the Sunnyvale, California-based startup introduced its Zeus GPU platform designed for gaming, rendering, and supercomputer simulations, according to the article. "The company says that its Zeus GPU not only supports features like upgradeable memory and built-in Ethernet interfaces, but it can also beat Nvidia's GeForce RTX 5090 by around 10 times in path tracing workloads, according to a slide published by technology news site ServeTheHome." There is one catch: Zeus can only beat the RTX 5090 GPU in path tracing and FP64 compute workloads. It's not clear how well it will handle traditional rendering techniques, as that was less of a focus. According to Bolt Graphics, the card does support rasterization, but there was less emphasis on that aspect of the GPU, and it may struggle to compete with the best graphics cards when it comes to gaming. And when it comes to data center options like Nvidia's Blackwell B200, it's an entirely different matter.
Unlike GPUs from AMD, Intel, and Nvidia that rely on proprietary instruction set architectures, Bolt's Zeus relies on the open-source RISC-V ISA, according to the published slides. The Zeus core relies on an open-source out-of-order general-purpose RVA23 scalar core mated with FP64 ALUs and an RVV 1.0 (RISC-V Vector Extension Version 1.0) unit that can handle 8-bit, 16-bit, 32-bit, and 64-bit data types, as well as Bolt's additional proprietary extensions designed to accelerate scientific workloads... Like many processors these days, Zeus relies on a multi-chiplet design... Unlike high-end GPUs that prioritize bandwidth, Bolt is evidently focusing on greater memory capacity to handle larger datasets for rendering and simulations. Also, the built-in 400GbE and 800GbE ports, which enable faster data transfer across networked GPUs, indicate the data center focus of Zeus.
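To make the RVV 1.0 angle a bit more concrete, here is a minimal, hypothetical sketch of vector-length-agnostic FP64 code written against the standard RVV 1.0 C intrinsics (riscv_vector.h). It assumes only a generic RVV-capable toolchain; Bolt's proprietary extensions have not been published, so nothing below reflects their actual API, and the daxpy routine is just an illustrative kernel.

/* Hypothetical illustration only: vector-length-agnostic FP64 AXPY using the
 * standard RVV 1.0 C intrinsics (riscv_vector.h). Bolt's proprietary
 * extensions are not public, so nothing here reflects their actual API; this
 * just shows the kind of RVA23 scalar + RVV vector code path described above.
 * Builds with an RVV-capable toolchain, e.g. clang --target=riscv64 -march=rv64gcv. */
#include <stddef.h>
#include <riscv_vector.h>

void daxpy(size_t n, double a, const double *x, double *y) {
    for (size_t i = 0; i < n;) {
        size_t vl = __riscv_vsetvl_e64m1(n - i);            /* elements this pass */
        vfloat64m1_t vx = __riscv_vle64_v_f64m1(x + i, vl); /* load x[i..i+vl) */
        vfloat64m1_t vy = __riscv_vle64_v_f64m1(y + i, vl); /* load y[i..i+vl) */
        vy = __riscv_vfmacc_vf_f64m1(vy, a, vx, vl);        /* y += a * x */
        __riscv_vse64_v_f64m1(y + i, vy, vl);               /* store back */
        i += vl;
    }
}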
High-quality rendering, real-time path tracing, and compute are key focus areas for Zeus. As a result, even the entry-level Zeus 1c26-32 offers significantly higher FP64 compute performance than Nvidia's GeForce RTX 5090 — up to 5 TFLOPS vs. 1.6 TFLOPS — and considerably higher path tracing performance: 77 Gigarays vs. 32 Gigarays. Zeus also features a larger on-chip cache than Nvidia's flagship — up to 128MB vs. 96MB — and lower power consumption of 120W vs. 575W, making it more efficient for simulations, path tracing, and offline rendering. However, the RTX 5090 dominates in AI workloads with its 105 FP16 TFLOPS and 1,637 INT8 TFLOPS compared to the 10 FP16 TFLOPS and 614 INT8 TFLOPS offered by a single-chiplet Zeus...
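Taking the quoted figures at face value, the efficiency claim is easy to sanity-check. The sketch below uses only the peak numbers above (vendor-supplied, not measured), so treat it as back-of-the-envelope arithmetic rather than a benchmark:

/* Back-of-the-envelope FP64 efficiency from the figures quoted above. */
#include <stdio.h>

int main(void) {
    double zeus_tflops = 5.0, zeus_watts = 120.0;  /* Zeus 1c26-32, as claimed */
    double rtx_tflops  = 1.6, rtx_watts  = 575.0;  /* GeForce RTX 5090 */
    printf("Zeus: %.1f GFLOPS/W FP64\n", zeus_tflops * 1000.0 / zeus_watts); /* ~41.7 */
    printf("5090: %.1f GFLOPS/W FP64\n", rtx_tflops * 1000.0 / rtx_watts);   /* ~2.8 */
    return 0;
}

On paper that is roughly a 15x gap in FP64 performance per watt, which is where the efficiency framing comes from; the FP16/INT8 numbers in the same slide point the other way.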
The article emphasizes that Zeus "is only running in simulation right now... Bolt Graphics says that the first developer kits will be available in late 2025, with full production set for late 2026."
Thanks to long-time Slashdot reader arvn for sharing the news.
About time we can upgrade GPU RAM (Score:4, Insightful)
GPU boards have been missing RAM slots for a long time.
With Nvidia setting memory limits that third-party OEMs aren't allowed to exceed, GPU RAM capacity has been fixed, and there is not enough competition at the high end. AMD is at least looking viable now for enthusiast gamers.
With x86 licenses in short supply, the APU market does not have enough competition either.
Re: (Score:3)
There will never be RAM slots or sockets on a GPU. It's impossible from a signal integrity standpoint.
Re: (Score:2)
At least AMD has an APU that can share most of its possible 128GB with the CPU part.
Unfortunately, as far as I am aware, only Framework has a product announced in that regard.
Re: (Score:2)
Its memory bandwidth and performance are still a far cry from Apple APUs. It's a step in the right direction, but I think they could have made it better than they did.
The only thing it does better than an Apple Silicon part of equivalent loadout (M4 Max, 128GB) is tasks that can only be done on x86.
I really, really, really want an x86 part that competes.
Re: (Score:3)
There will never be RAM slots or sockets on a GPU. It's impossible from a signal integrity standpoint.
The picture of the graphics card on their website clearly shows two SODIMM slots.
According to their presentation the GPU card has soldiered LPDDR5X memory running at 273GB/s and card slots for DDR5 at 90GB/s. This isn't total bandwidth available to the GPU as a whole but rather the dedicated memory/bandwidth available to each chiplet within the GPU /w high speed interconnects between the various chiplets.
Re: About time we can upgrade GPU RAM (Score:1)
I'm not sure soldiers are the answer here.
Re: (Score:2)
Re: (Score:2)
That sounds suboptimal, given that there's a reason GPUs tend to use GDDR (or HBM) rather than normal DDR.
HBM is currently being driven by AI rather than rendering. I don't think pushing the dial on uniform memory is optimal either, from a cost or an energy perspective. It certainly makes some things easier, and there are certainly workloads that benefit, but NUMA schemes are more energy efficient and scalable. The metric that matters is the bandwidth available to each core.
For comparison, a 5090 has around 1.8TB/s of memory bandwidth. Maybe better for some tasks, but that makes it a sidegrade to the 5090 at best.
What is the 5090? ... 30% faster than a 4090 @ 1TB/s when actually rendering graphics? There is no shortage of applications for which bandwidth is not
Re: (Score:2)
The number of connector points required to achieve this sort of thing (the H100 memory interface is 5120 bits wide - that's 10K pins) is physically impossible for a pluggable connection that's also going to run at 6.4 gigabaud, NOT consume a thousand watts driving the bus lines and NOT be the siz
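For rough scale, here is a sketch using only the figures in the comment above (5120-bit interface, 6.4 gigabaud per pin); these are the commenter's numbers, not an official spec:

/* Implied peak bandwidth from the parent comment's figures. */
#include <stdio.h>

int main(void) {
    double width_bits = 5120.0;  /* data width claimed for the H100 memory interface */
    double rate_gbps  = 6.4;     /* per-pin signaling rate (gigabaud) */
    double gb_per_s   = width_bits * rate_gbps / 8.0;
    printf("Implied peak: %.0f GB/s (~%.1f TB/s)\n", gb_per_s, gb_per_s / 1000.0); /* ~4096 GB/s */
    return 0;
}

No pluggable, SODIMM-style connector carries thousands of signal pins at those per-pin rates, which is the point being made about socketed memory on a wide-bus GPU.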
Re: (Score:2)
It's an interesting setup. Slow third tier expandable memory. Is it worth it over just using system memory? Maybe for some things, and this thing seems to be pretty focused on specific problems.
Let's see (Score:4, Informative)
This is only in "simulation," so the proof and the TDP numbers will be in the tape-out prototypes.
Re:Let's see (Score:4, Insightful)
Even 10% slower would put them in the running if there is a lower initial capital investment for someone putting together a data center. I don't see a need to exaggerate if your fundamentals are in order.
I understand Elon Musk (Score:2)
Re: (Score:1)
You're off your meds again, I see. I know it's hard, but take them despite how you think they make you "feel." It will help rein in your TDS and get you that much closer to reality. You might want to take an extra dose of the antipsychotics to start, because you aren't a danger to anyone else while you're busy staring at a wall.
Not for gaming (Score:3)
It sounds like they just put a lot of cores on it, enough to handle ten times as many rays in a ray tracer, which can be parallelized down to each pixel on screen.
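That per-pixel parallelism is the classic embarrassingly parallel structure of a path tracer. A minimal sketch in C with OpenMP, where trace_pixel() is a hypothetical stand-in for the real ray generation, traversal, and shading work:

#include <stddef.h>

typedef struct { float r, g, b; } color_t;

/* Dummy stand-in for real per-pixel ray tracing work; returns a gradient. */
static color_t trace_pixel(int x, int y, int width, int height, int samples) {
    (void)samples;
    color_t c = { (float)x / (float)width, (float)y / (float)height, 0.5f };
    return c;
}

/* Every pixel is independent, so the loop scales across however many cores
 * (or chiplets) the hardware provides. Build with -fopenmp. */
void render(color_t *framebuffer, int width, int height, int samples) {
    #pragma omp parallel for collapse(2) schedule(dynamic, 16)
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            framebuffer[(size_t)y * (size_t)width + x] =
                trace_pixel(x, y, width, height, samples);
}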
Unverified claims (Score:5, Informative)
I'll believe it when I see it. I hate it when a piece of corporate hype finds its way into the Slashdot feed.
Re: (Score:3)
Re: (Score:2)
True, but they're claiming to be ten times faster for some workloads, and ten times slower for others. While the claims are pretty wild, at least they're not claiming they're better at everything.
Re: (Score:2)
And specifically FP64, which is not something the 5090 is especially good at: NVidia claim only a few percent of the FP32 performance, actually somewhat less than a high end threadripper.
They're claiming 3x the FP64 performance of a threadripper, presumably without the general purpose CPU performance of that.
Re: (Score:2)
And specifically FP64, which is not something the 5090 is especially good at: NVidia claim only a few percent of the FP32 performance, actually somewhat less than a high end threadripper.
Nvidia goes out of its way to make sure none of the xx90 GPUs come anywhere close to the performance and usability of the much more expensive data center GPUs for compute. This market segmentation is very much intentional.
Re: Unverified claims (Score:2)
Re: (Score:2)
Re: (Score:2)
On the other hand, if they use too large a silicon die, their yield may get funny. But if I can think of that, so can they.
Rookie Numbers! (Score:5, Funny)
My upcoming Custom GPU with integrated AI will be 1,000 times faster than NVidia's puny RTX5090. Better still, it only draws 5 watts!
Right now, I'm a little behind schedule on delivery to market due to some manufacturing delays caused by capital issues. If you're an investor, this is your chance to get in on the ground floor of what is certain to be the market and industry disruptor of the century. Don't miss out on this once-in-a-lifetime opportunity. Send me your life savings via BitCoin or Ethereum TODAY!
Can't go tits up!
Re: (Score:3)
Yesterday's news. I recently published a slide that says MY upcoming GPU will be 10,000 times faster than Nvidia - and draw only 3.5 watts!
Re:Rookie Numbers! (Score:4, Funny)
Re: Rookie Numbers! (Score:2)
I was going to do that too.
Re: (Score:2)
Proprietary RISC-V (Score:2)
Sure would be nice to see a fully open source hardware implementation of a GPU with RISC-V, so that an SBC could be built without any licensing limits on use.
Plenty of GPU opportunities (Score:2)
Re: (Score:2)
It's about FP64 (Score:4, Informative)
This is a contrived comparison. All GeForce cards have terrible FP64 performance. It's rarely used in gaming, so they don't bother trying to make it fast. Data center cards like the H100 and B200 have much better FP64 performance. They're designed for compute applications where it's more important.
So they compared their own GPU designed for compute against an NVIDIA GPU designed for gaming, and found theirs is better at compute. I'm shocked, shocked to hear that! If they're going to test against a GPU designed for gaming, they need to test on gaming benchmarks. If they're going to test on compute benchmarks, they need to compare to a GPU designed for compute.
Re: (Score:2)
I've also found that AMD GPUs have much better FP64 performance than comparable Nvidias (using my own OpenCL code, so not a general benchmark). It's just another instance of Nvidia's optimizations [slashdot.org] in their gaming cards.
As for the better FP64 of data center cards, it's interesting that AI uses very low precision such as 4 or 8 bits. So I wouldn't be surprised to see AI-optimized GPUs that lose the higher precision parts.
Finally, if you can make a chip with a ton of RISC-V cores, why not make a massively
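On the FP64-in-OpenCL point above: a generic, minimal kernel sketch (not the poster's code) showing the cl_khr_fp64 extension gate that determines whether a device exposes double precision at all; the throughput of that one line of math is what differs so much between gaming and data-center parts.

/* Generic OpenCL C sketch, not the parent poster's code.
 * Double precision is optional and gated by the cl_khr_fp64 extension;
 * on devices that don't report it, this kernel won't compile. */
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void daxpy_fp64(const double a,
                         __global const double *x,
                         __global double *y,
                         const uint n)
{
    size_t i = get_global_id(0);
    if (i < n)
        y[i] += a * x[i];  /* FP64 multiply-add: fast on compute cards, heavily
                              throttled on most gaming GPUs */
}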
Proprietary extensions are bad (Score:2)
"Bolt's additional proprietary extensions designed for acceleration of scientific workloads"
When GPGPU in supercomputers was an up and coming thing 20 years ago, the big roadblock to acceptance by the big government funding agencies was the need to rewrite software. The scientific programs were written, debugged, and optimized over many years. Switching to CUDA or any other new thing requires rewriting the code, including debugging and optimizing. Often the time to redevelop the software swamps the actua
Optimized for one specific metric? (Score:2)
There is one catch: Zeus can only beat the RTX 5090 GPU in path tracing and FP64 compute workloads. It's not clear how well it will handle traditional rendering techniques, as that was less of a focus.
It's relatively easy to optimize for one or two metrics. But that's not the same as being "10x faster" in general.