

Startup Claims Its Upcoming (RISC-V ISA) Zeus GPU is 10X Faster Than Nvidia's RTX 5090 (tomshardware.com)
"The number of discrete GPU developers from the U.S. and Western Europe shrank to three companies in 2025," notes Tom's Hardware, "from around 10 in 2000." (Nvidia, AMD, and Intel...)
No company in recent years — at least outside of China — has been bold enough to compete against these three contenders, so the very emergence of Bolt Graphics seems like a breakthrough. However, the major focuses of Bolt's Zeus are high-quality rendering for the movie and scientific industries as well as high-performance supercomputer simulations. If Zeus delivers on its promises, it could establish itself as a serious alternative for scientific computing, path tracing, and offline rendering. But without strong software support, it risks struggling against dominant market leaders.
This week the Sunnyvale, California-based startup introduced its Zeus GPU platform designed for gaming, rendering, and supercomputer simulations, according to the article. "The company says that its Zeus GPU not only supports features like upgradeable memory and built-in Ethernet interfaces, but it can also beat Nvidia's GeForce RTX 5090 by around 10 times in path tracing workloads, according to a slide published by technology news site ServeTheHome." There is one catch: Zeus can only beat the RTX 5090 GPU in path tracing and FP64 compute workloads. It's not clear how well it will handle traditional rendering techniques, as that was less of a focus. In speaking with Bolt Graphics, Tom's Hardware learned that the card does support rasterization, but there was less emphasis on that aspect of the GPU, and it may struggle to compete with the best graphics cards when it comes to gaming. And when it comes to data center options like Nvidia's Blackwell B200, it's an entirely different matter.
Unlike GPUs from AMD, Intel, and Nvidia that rely on proprietary instruction set architectures, Bolt's Zeus relies on the open-source RISC-V ISA, according to the published slides. The Zeus core pairs an open-source out-of-order general-purpose RVA23 scalar core with FP64 ALUs and the RVV 1.0 (RISC-V Vector Extension Version 1.0) unit that can handle 8-bit, 16-bit, 32-bit, and 64-bit data types, as well as Bolt's additional proprietary extensions designed for accelerating scientific workloads... Like many processors these days, Zeus relies on a multi-chiplet design... Unlike high-end GPUs that prioritize bandwidth, Bolt is evidently focusing on greater memory size to handle larger datasets for rendering and simulations. Also, the built-in 400GbE and 800GbE ports, which enable faster data transfer across networked GPUs, indicate the data center focus of Zeus.
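To make the RVV 1.0 part concrete: the defining property of the vector extension is that code is vector-length-agnostic, i.e. it asks the hardware how many elements fit per pass instead of hard-coding a width. A rough conceptual sketch in Python (the chunk size below stands in for whatever vector length a Zeus core would report via vsetvl; this illustrates the strip-mining idea only and is not Bolt's code):

```python
# Conceptual sketch of RVV-style vector-length-agnostic "strip-mining".
# vlmax is a stand-in for the hardware vector length reported by vsetvl;
# real RVV code never hard-codes it, which is what lets the same binary
# run on cores with different vector register widths.
import numpy as np

def fp64_axpy(a, x, y, vlmax=8):
    """y += a * x over FP64 arrays, processed one vector chunk at a time."""
    n = len(x)
    i = 0
    while i < n:
        vl = min(vlmax, n - i)          # vsetvl: elements handled this pass
        y[i:i + vl] += a * x[i:i + vl]  # one vector fused multiply-add
        i += vl
    return y

x = np.arange(16, dtype=np.float64)
y = np.ones(16, dtype=np.float64)
print(fp64_axpy(2.0, x, y))
```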
High-quality rendering, real-time path tracing, and compute are key focus areas for Zeus. As a result, even the entry-level Zeus 1c26-32 offers significantly higher FP64 compute performance than Nvidia's GeForce RTX 5090 — up to 5 TFLOPS vs. 1.6 TFLOPS — and considerably higher path tracing performance: 77 Gigarays vs. 32 Gigarays. Zeus also features a larger on-chip cache than Nvidia's flagship — up to 128MB vs. 96MB — and lower power consumption of 120W vs. 575W, making it more efficient for simulations, path tracing, and offline rendering. However, the RTX 5090 dominates in AI workloads with its 105 FP16 TFLOPS and 1,637 INT8 TFLOPS compared to the 10 FP16 TFLOPS and 614 INT8 TFLOPS offered by a single-chiplet Zeus...
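Taking those vendor-supplied, simulation-only figures at face value, the ratios and the implied efficiency gap work out as follows (a back-of-the-envelope sketch using only the numbers quoted above; the per-watt figures are derived here, not published by Bolt):

```python
# Ratios derived only from the numbers quoted in the summary above
# (entry-level, single-chiplet Zeus 1c26-32 vs. GeForce RTX 5090).
zeus = {"fp64_tflops": 5.0, "gigarays": 77, "fp16_tflops": 10, "int8_tops": 614, "watts": 120}
rtx  = {"fp64_tflops": 1.6, "gigarays": 32, "fp16_tflops": 105, "int8_tops": 1637, "watts": 575}

for key in ("fp64_tflops", "gigarays", "fp16_tflops", "int8_tops"):
    print(f"{key:12s} Zeus/5090 = {zeus[key] / rtx[key]:.2f}x")

# Efficiency on the two metrics Bolt leads on (my derivation, not Bolt's):
fp64_per_watt = (zeus["fp64_tflops"] / zeus["watts"]) / (rtx["fp64_tflops"] / rtx["watts"])
rays_per_watt = (zeus["gigarays"] / zeus["watts"]) / (rtx["gigarays"] / rtx["watts"])
print(f"FP64 per watt: {fp64_per_watt:.1f}x, gigarays per watt: {rays_per_watt:.1f}x")
```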
The article emphasizes that Zeus "is only running in simulation right now... Bolt Graphics says that the first developer kits will be available in late 2025, with full production set for late 2026."
Thanks to long-time Slashdot reader arvn for sharing the news.
About time we can upgrade GPU RAM (Score:5, Interesting)
We have been missing RAM slots on GPU boards for a long time.
With Nvidia setting memory limits that third-party OEMs aren't allowed to exceed, GPU RAM has effectively been fixed at the factory, and there is not enough competition at the high end. AMD is at least looking viable now for enthusiast gamers.
With x86 licenses in short supply, the APU market does not have enough competition either.
Re:About time we can upgrade GPU RAM (Score:4, Interesting)
There will never be RAM slots or sockets on a GPU. It's impossible from a signal integrity standpoint.
Re: (Score:2)
At least AMD has an APU that can share most of its possible 128GB with the CPU part.
Unfortunately, as far as I am aware, only Framework has a product announced in that regard.
Re: (Score:2)
Its memory bandwidth and performance are still a far cry from Apple APUs. It's a step in the right direction, but I think they could have made it better than they did.
The only thing it does better than an Apple Silicon part of equivalent loadout (M4 Max, 128GB) is tasks that can only be done on x86.
I really, really, really want an x86 part that competes.
Re: (Score:1)
Apple's memory solutions have an unusually large number (mostly 2, 3, 4 or 8 for plain, Pro, Max and Ultra chips respectively) of high-speed memory channels to provide that much bandwidth. It's only practical because they use something like a package-on-package design. It would take a huge number of circuit traces to do that with replaceable memory, meaning a lot of power loss and board space and probably lower clock rates because of the additional distance.
I would also love to see a good competitor to that solution, but I don't expect other companies to have that much integration for desktop or workstation designs. Others do it for cell phones and tablets, but the optimization goals there are very different.
Re: (Score:2)
Apple's memory solutions have an unusually large number (mostly 2, 3, 4 or 8 for plain, Pro, Max and Ultra chips respectively) of high-speed memory channels to provide that much bandwidth.
Correct.
It's only practical because they use something like a package-on-package design.
Incorrect.
It would take a huge number of circuit traces to do that with replaceable memory
Strix Halo will not have replaceable memory.
meaning a lot of power loss and board space and probably lower clock rates because of the additional distance.
Power loss isn't really a general problem, it's a problem that's specific to LPDDR. You can have lots of bandwidth with replaceable memory too, but these days we're using LPDDR, which makes replaceable memory difficult at higher speeds due to the low voltage. But still not relevant, as no Strix Halo part will have replaceable DRAM.
I would also love to see a good competitor to that solution, but I don't expect other companies to have that much integration for desktop or workstation designs. Others do it for cell phones and tablets, but the optimization goals there are very different.
Nah. Strix Halo is a direct attempt at competing with Apple Silicon, and it's a step toward it, but they stopped early.
Re: (Score:1)
Power loss isn't really a general problem
It absolutely is a general problem, because physics are real. It takes energy to toggle a bit. It takes more energy to make that transition visible at a longer distance. It takes more energy to do that if there are -- or just can be -- more devices per line. It takes more power to toggle a bit more times in a second. All of those things relate to resistance and capacitance, and the length of a trace is a major driver for both resistance and capacitance.
Strix Halo is nice, but it has less memory bandwidth than Apple's higher end chips and doesn't compete very well on power efficiency. The top end "AMD Ryzen AI Max+ 395" is comparable to the M4 Pro in most specs but uses a lot more power to get slightly less memory bandwidth, in significant part because of the memory layout. There's no way for it to meet even M1 Max levels of bandwidth (400 GB/sec) with off-package RAM because, with current memory data rates, that would require twice as many pins on the SoC and twice as many traces on the motherboard. The whole package would be so much bigger that it wouldn't be cost-competitive.
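For readers who want the "toggling a bit costs energy" argument made concrete: dynamic switching power scales as activity x capacitance x voltage squared x frequency, and trace length is a major driver of the capacitance. A rough sketch with purely illustrative values (real DDR PHYs, termination and signaling losses are more complicated than this):

```python
# Dynamic switching power of a single signal line: P = alpha * C * V^2 * f
# All values below are illustrative placeholders, not measurements.
def line_power(capacitance_pf, voltage, toggle_rate_ghz, activity=0.5):
    c = capacitance_pf * 1e-12           # farads
    f = toggle_rate_ghz * 1e9            # toggles per second
    return activity * c * voltage ** 2 * f  # watts

short_trace = line_power(capacitance_pf=1.0, voltage=0.5, toggle_rate_ghz=4.0)  # on-package-ish
long_trace  = line_power(capacitance_pf=5.0, voltage=1.1, toggle_rate_ghz=4.0)  # socketed-DIMM-ish
print(f"per line: {short_trace*1e3:.2f} mW vs {long_trace*1e3:.2f} mW")
print(f"x a 256-bit bus: {short_trace*256:.2f} W vs {long_trace*256:.2f} W")
```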
Re: (Score:2)
It absolutely is a general problem, because physics are real. It takes energy to toggle a bit. It takes more energy to make that transition visible at a longer distance. It takes more energy to do that if there are -- or just can be -- more devices per line. It takes more power to toggle a bit more times in a second. All of those things relate to resistance and capacitance, and the length of a trace is a major driver for both resistance and capacitance.
The implication was not that it does not take power, lol
The implication is that you are wrong that it's somehow a limiting factor.
Strix Halo is nice, but it has less memory bandwidth than Apple's higher end chips and doesn't compete very well on power efficiency.
No shit. That was the point of my post that you replied to.
The top end "AMD Ryzen AI Max+ 395" is comparable to the M4 Pro in most specs but uses a lot more power to get slightly less memory bandwidth
Yup.
in significant part because of the memory layout.
Negative. It's limited by its 256-bit worth of DRAM bus.
They could have made it more if they had wanted.
There's no way for it to meet even M1 Max levels of bandwidth (400 GB/sec) with off-package RAM because, with current memory data rates, that would require twice as many pins on the SoC and twice as many traces on the motherboard.
It's a mobile part- it's not like it's socketed.
Yes- you add more pins.
AMD makes a 512-bit x86 APU right now. They just don't sell it to the consumer market.
The whole package would be so much bigger that it wouldn't be cost-competitive.
The package isn't the expensive part. The silicon is.
Re: (Score:2)
At least AMD has an APU that can share most of its possible 128GB with the CPU part.
That's not a positive. The performance of such systems is pathetic compared to what we expect from high-end GPUs.
Re: (Score:3)
There will never be RAM slots or sockets on a GPU. It's impossible from a signal integrity standpoint.
The picture of the graphics card on their website clearly shows two SODIMM slots.
According to their presentation the GPU card has soldiered LPDDR5X memory running at 273GB/s and card slots for DDR5 at 90GB/s. This isn't total bandwidth available to the GPU as a whole but rather the dedicated memory/bandwidth available to each chiplet within the GPU /w high speed interconnects between the various chiplets.
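Those two figures are consistent with simple bus-width math. A quick sketch, assuming a 256-bit LPDDR5X-8533 interface and two 64-bit DDR5-5600 SODIMM channels per chiplet (the transfer rates are an inference from the bandwidth numbers, not something Bolt has confirmed):

```python
# Peak bandwidth = (bus width in bytes) * (transfers per second)
def peak_gbps(bus_bits, mt_per_s):
    return bus_bits / 8 * mt_per_s / 1000  # GB/s

lpddr5x = peak_gbps(bus_bits=256, mt_per_s=8533)     # soldered memory
ddr5    = peak_gbps(bus_bits=2 * 64, mt_per_s=5600)  # two SODIMM channels
print(f"LPDDR5X: {lpddr5x:.0f} GB/s, DDR5 slots: {ddr5:.0f} GB/s")
# -> roughly 273 GB/s and 90 GB/s, matching the presentation's numbers
```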
Re: About time we can upgrade GPU RAM (Score:2)
I'm not sure soldiers are the answer here.
Re: (Score:2)
That sounds suboptimal, given that there's a reason GPUs tend to use GDDR (or HBM) rather than normal DDR.
HBM is currently being driven by AI rather than rendering. I don't think pushing the dial on uniform memory is optimal either, from a cost or energy perspective. It certainly makes some things easier, and there are certainly some workloads that benefit, but NUMA schemes are more energy efficient and scalable. The metric that matters is bandwidth available to each core.
For comparison, a 5090 has around 1.8TB/s of memory bandwidth. Maybe better for some tasks, but that makes it a sidegrade to the 5090 at best.
What is the 5090? ... 30% faster than a 4090 @ 1TB/s when actually rendering graphics? There is no shortage of applications for which bandwidth is not
Re: (Score:2)
The number of connector points required to achieve this sort of thing (the H100 memory interface is 5120 bits wide - that's 10K pins) are physically impossible for a pluggable connection that's also going to run at 6.4 gigabaud, NOT consume a thousand watts driving the bus lines and NOT be the size of two premium server CPU sockets.
Re: (Score:2)
Unfortunately this only reinforces GP's claim that achieving the main memory bandwidth required of a "serious" modern GPU (the H100 SXM5 module has 3.5TBps for reference) is not possible outside of soldered HBM.
The high bandwidth shit is primarily for AI. I think the niche here is more likely to be less bandwidth-intensive applications than AI. Having said that, if Bolt cards are cheap and available I wouldn't be surprised to see people pick them up for batch-mode inference across large sparse models. This would be a kick-ass, cost-effective option for a lot of people.
The number of connector points required to achieve this sort of thing (the H100 memory interface is 5120 bits wide - that's 10K pins) are physically impossible for a pluggable connection that's also going to run at 6.4 gigabaud, NOT consume a thousand watts driving the bus lines and NOT be the size of two premium server CPU sockets.
This is a false choice. If you look at DDR4, DDR5, DDR6, and DDR7, pin counts remain more or less constant while bandwidth doubles each time. Where you
Re: (Score:2)
It's an interesting setup. Slow third tier expandable memory. Is it worth it over just using system memory? Maybe for some things, and this thing seems to be pretty focused on specific problems.
Re: (Score:2)
The picture of the graphics card on their website clearly shows two SODIMM slots.
And yet this is *NOT* a typical gaming GPU. The GP's post stands. You're comparing run-of-the-mill LPDDR5X memory at 273GB/s to the current standard GDDR7 at 1,792GB/s. Expandable RAM is great for some compute scenarios, which is what this card is targeted at. But this is objectively not a good thing for a desktop GPU - this is a huge tradeoff in speed for available VRAM. For virtually all desktop / gaming scenarios the amount of VRAM isn't nearly as relevant as the speed of it. ... within reason (the low
Re: (Score:2)
It seems this thing isn't really designed to be a GPU, it's a compute accelerator for tasks that are typically done on a GPU like AI and raytracing.
Really more of a very specialist CPU, with tiered RAM, connectivity like ethernet, and so on.
But is it... (Score:1)
...10x faster at the same price? Anyone can design a better item, but the claim means nothing if it is that much more expensive as well.
Re: (Score:2)
Looking at the specs, a fully kitted-out 1c26-032 with 160GB of RAM would probably cost more than a 5090. Certainly not $20000, but it would be pricey.
Let's see (Score:5, Informative)
This is only in "simulation," so the proof and the TDP numbers will come with the tape-out prototypes.
Re:Let's see (Score:5, Insightful)
Even 10% slower would put them in the running if there is a lower initial capital investment for someone putting together a data center. I don't see a need to exaggerate if your fundamentals are in order.
Re: (Score:2)
If an established company announces a graphics card that is 10x faster, that's news.
Not really. Read the story. It's 10x faster in a very specific scenario. That happens frequently enough. This is not going to replace the GPU in your gaming rig.
Re: (Score:1)
You're off your meds again, I see. I know it's hard, but take them despite how you think they make you "feel." It will help rein in your TDS and get you that much closer to reality. You might want to take an extra dose of the antipsychotics to start, because you aren't a danger to anyone else while you're busy staring at a wall.
Not for gaming (Score:3)
It sounds like they just put a lot of cores on it, enough to handle ten times as many rays in a ray tracer, which can be parallelized down to each individual pixel on screen.
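That per-pixel independence is the whole reason path tracing scales with core count: no pixel's rays touch any other pixel's state. A stripped-down sketch of the parallelization pattern (trace_pixel is a placeholder, not a real renderer):

```python
# Each pixel is an independent work item, so the image parallelizes trivially
# across however many cores (or chiplets) are available.
from multiprocessing import Pool

WIDTH, HEIGHT = 320, 240

def trace_pixel(xy):
    x, y = xy
    # Placeholder: a real path tracer would shoot many rays per pixel
    # and average the radiance they return.
    return (x ^ y) & 0xFF

if __name__ == "__main__":
    pixels = [(x, y) for y in range(HEIGHT) for x in range(WIDTH)]
    with Pool() as pool:
        image = pool.map(trace_pixel, pixels)
    print(len(image), "pixels rendered")
```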
Unverified claims (Score:5, Informative)
I'll believe it when I see it. I hate it when a piece of corporate hype finds its way into the Slashdot feed.
Re: (Score:2)
this "simulation only"-vapor-ware is not worth reporting on at this time
Some of us might be interested in this, even if it's vapor ware, at the moment. The best ideas all start out as vapor ware.
Re: (Score:2)
True, but they're claiming to be ten times faster for some workloads, and ten times slower for others. While the claims are pretty wild, at least they're not claiming they're better at everything.
Re: (Score:2)
And specifically FP64, which is not something the 5090 is especially good at: NVidia claim only a few percent of the FP32 performance, actually somewhat less than a high end threadripper.
They're claiming 3x the FP64 performance of a threadripper, presumably without the general purpose CPU performance of that.
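Working that out with the figures in the summary plus Nvidia's published FP32 number for the RTX 5090 (roughly 105 TFLOPS, an assumption on my part that happens to match the FP16 figure quoted above):

```python
# FP64 as a fraction of FP32 on the RTX 5090, and the implied Threadripper gap.
rtx5090_fp32 = 105.0   # TFLOPS, assumed from Nvidia's published spec
rtx5090_fp64 = 1.6     # TFLOPS, from the article summary
zeus_fp64    = 5.0     # TFLOPS, from the article summary

print(f"5090 FP64/FP32 ratio: {rtx5090_fp64 / rtx5090_fp32:.1%}")          # ~1.5%
print(f"Implied Threadripper FP64 (Zeus / 3): ~{zeus_fp64 / 3:.1f} TFLOPS")
# ~1.7 TFLOPS, slightly above the 5090's 1.6, consistent with the claim.
```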
Re: (Score:2)
And specifically FP64, which is not something the 5090 is especially good at: NVidia claim only a few percent of the FP32 performance, actually somewhat less than a high end threadripper.
Nvidia goes out of its way to make sure none of the xx90 GPUs come anywhere close to the performance and usability of the much more expensive data center GPUs for compute. This market segmentation is very much intentional.
Re: (Score:2)
On the other hand, if they use too large a die, their yield may get funny. But if I can think of that, so can they.
Rookie Numbers! (Score:5, Funny)
My upcoming Custom GPU with integrated AI will be 1,000 times faster than NVidia's puny RTX5090. Better still, it only draws 5 watts!
Right now, I'm a little behind schedule on delivery to market due to some manufacturing delays caused by capital issues. If you're an investor, this is your chance to get in on the ground floor of what is certain to be the market and industry disruptor of the century. Don't miss out on this once-in-a-lifetime opportunity. Send me your life savings via BitCoin or Ethereum TODAY!
Can't go tits up!
Re: (Score:3)
Yesterday's news. I recently published a slide that says MY upcoming GPU will be 10,000 times faster than Nvidia - and draw only 3.5 watts!
Re: Rookie Numbers! (Score:2)
I was going to do that too.
Proprietary RISC-V (Score:2)
Sure would be nice to see a fully open source hardware implementation of a GPU with RISC-V, so that an SBC could be built without any licensing limits on its use.
Re: (Score:2)
The thing with RISC-V is that its vector extension had been designed from the start to be useful as a foundation to design a GPU around.
Then there are many ways to do the actual microarchitecture, of course, and any RVV implementation is not necessarily particularly GPU-like.
I don't want to see just open source hardware, I would like to see open source GPU frameworks based around RVV -- and GPU designers embracing such an open source system. More portability between vendors could lead to more competition.
Plenty of GPU opportunities (Score:2)
It's about FP64 (Score:5, Informative)
This is a contrived comparison. All GeForce cards have terrible FP64 performance. It's rarely used in gaming and they don't bother trying to make it fast. Data center cards like the H100 and B200 have much better FP64 performance. They're designed for compute applications where it's more important.
So they compared their own GPU designed for compute against an NVIDIA GPU designed for gaming, and found theirs is better at compute. I'm shocked, shocked to hear that! If they're going to test against a GPU designed for gaming, they need to test on gaming benchmarks. If they're going to test on compute benchmarks, they need to compare to a GPU designed for compute.
Re: (Score:2)
I've also found that AMD GPUs have much better FP64 performance than comparable Nvidias (using my own OpenCL code, so not a general benchmark). It's just another instance of Nvidia's optimizations [slashdot.org] in their gaming cards.
As for the better FP64 of data center cards, it's interesting that AI uses very low precision such as 4 or 8 bits. So I wouldn't be surprised to see AI-optimized GPUs that lose the higher precision parts.
Finally, if you can make a chip with a ton of RISC-V cores, why not make a massively multicore CPU?
Re: (Score:2)
That was true up through Radeon VII. For any consumer AMD card beyond that, it's no longer true. And Radeon VII was released in 2019.
Proprietary extensions are bad (Score:3)
"Bolt's additional proprietary extensions designed for acceleration of scientific workloads"
When GPGPU in supercomputers was an up and coming thing 20 years ago, the big roadblock to acceptance by the big government funding agencies was the need to rewrite software. The scientific programs were written, debugged, and optimized over many years. Switching to CUDA or any other new thing requires rewriting the code, including debugging and optimizing. Often the time to redevelop the software swamps the actual runtime (e.g., it may take 1-2 years to optimize a complex program to get it to run a few days faster), which wipes out the benefit of the new hardware.
Re: (Score:2)
When GPGPU in supercomputers was an up and coming thing 20 years ago, the big roadblock to acceptance by the big government funding agencies was the need to rewrite software. The scientific programs were written, debugged, and optimized over many years. Switching to CUDA or any other new thing requires rewriting the code, including debugging and optimizing.
Switching to OpenCL would give you a lot more hardware/vendor options than CUDA. OTOH, if this new company can put a lot of RISC-V cores on one die, why don't they just make a massively multicore CPU?
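On the portability point, this is roughly what a vendor-neutral FP64 kernel looks like with pyopencl; it runs unchanged on AMD, Intel, or Nvidia devices that expose the cl_khr_fp64 extension (a minimal sketch, assuming pyopencl and a working OpenCL runtime are installed):

```python
# Minimal vendor-portable FP64 vector add with pyopencl.
import numpy as np
import pyopencl as cl

src = """
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
__kernel void vadd(__global const double *a,
                   __global const double *b,
                   __global double *out) {
    int i = get_global_id(0);
    out[i] = a[i] + b[i];
}
"""

a = np.random.rand(1024)          # float64 by default
b = np.random.rand(1024)
out = np.empty_like(a)

ctx = cl.create_some_context()    # picks whatever OpenCL device is available
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)

prg = cl.Program(ctx, src).build()
prg.vadd(queue, a.shape, None, a_buf, b_buf, out_buf)
cl.enqueue_copy(queue, out, out_buf)
assert np.allclose(out, a + b)
```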
Optimized for one specific metric? (Score:2)
There is one catch: Zeus can only beat the RTX 5090 GPU in path tracing and FP64 compute workloads. It's not clear how well it will handle traditional rendering techniques, as that was less of a focus.
It's relatively easy to optimize one metric or two. But that's not the same as being "10x faster" generally.
Only Up, Never Down (Score:3)
Oh great. A competitor. This means NVIDIA GPUs are going to double in price, right?
So this GPU can run 32-bit PhysX? (Score:3)
Apple (Score:2)
So I've been wondering how good Apple's neural engine is at computing transformers compared to Nvidia, and whether there's a chance they could improve training by beefing it up, i.e. adding more neural cores? Looking at buying an M4 64GB MBP and having to use the cloud for training if needed. Is there a shred of a possibility that Apple could release coprocessor modules that could be added to the motherboard or even connected via USB cables, even if it is a box with Nvidia chips in it, to improve their training performance?