NVIDIA GTX 970 Specifications Corrected, Memory Pools Explained
Vigile writes: Over the weekend NVIDIA sent out its first official response to the claims of hampered performance on the GTX 970 and a potential lack of access to 1/8th of the on-board memory. Today NVIDIA clarified the situation again, this time with some important changes to the specifications of the GPU. First, the ROP count and L2 cache capacity of the GTX 970 were incorrectly reported at launch (last September). The GTX 970 has 56 ROPs and 1792 KB of L2 cache, compared to the GTX 980's 64 ROPs and 2048 KB of L2 cache; previously both GPUs were claimed to have identical specs. Because of this change, one of the 32-bit memory channels is accessed differently, forcing NVIDIA to create 3.5GB and 0.5GB pools of memory to improve overall performance for the majority of use cases. The smaller, 500MB pool operates at 1/7th the speed of the 3.5GB pool and thus will lower total graphics system performance by 4-6% when pulled into the memory system. That occurs only when games request MORE than 3.5GB of memory allocation, which happens only in extreme combinations of resolution and anti-aliasing. Still, the jury is out on whether NVIDIA has answered enough questions to temper the fire from consumers.
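For those wondering where the two pools and the 1/7th figure come from, here is a back-of-the-envelope sketch of the arithmetic, assuming the commonly reported GM204 configuration (eight 32-bit GDDR5 channels at 7 Gbps effective, seven of which can be striped together); treat it as an illustration, not NVIDIA's exact accounting:

    # Back-of-the-envelope model of the GTX 970 memory split (illustrative only;
    # assumes the commonly reported figures: 7 Gbps GDDR5 on a 256-bit bus).
    GBPS_PER_PIN = 7            # effective GDDR5 data rate per pin
    CHANNEL_WIDTH_BITS = 32     # one GDDR5 channel
    CHANNELS_TOTAL = 8          # 256-bit bus = 8 x 32-bit channels, 0.5 GiB each
    CHANNELS_FAST = 7           # channels that can be striped together on the 970

    per_channel_gbs = GBPS_PER_PIN * CHANNEL_WIDTH_BITS / 8    # 28.0 GB/s
    fast_pool_gib = CHANNELS_FAST * 0.5                        # 3.5 GiB
    slow_pool_gib = (CHANNELS_TOTAL - CHANNELS_FAST) * 0.5     # 0.5 GiB
    fast_pool_gbs = CHANNELS_FAST * per_channel_gbs            # 196.0 GB/s
    slow_pool_gbs = 1 * per_channel_gbs                        # 28.0 GB/s

    print(slow_pool_gbs / fast_pool_gbs)                       # 0.1428... = 1/7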
Re: (Score:2)
Re: (Score:2)
(And before you say it, Intel is only the choice of idiots when it comes to graphics for gamers. Maybe someday they'll get their header out of their asterisk, but it hasn't happened yet.)
Re: Don't worry, AMD would never lie to us... (Score:2)
Re: (Score:2)
There is a $200 price difference between the cards on Newegg... You don't stay in business by undercutting yourself. Anyone who thought the cards would be identical is beyond retarded.
Ever heard of the FirePro and Quadro series...?
They are partnering with SOE on a fix (Score:3, Insightful)
You pay for an airdrop containing the extra ROPs and cache. It's contested, though, so you may or may not get it.
Option? (Score:2)
How about giving us the option to either always be able to run at maximum speed (disable that last 0.5GiB) or always let the software use the full 4GiB (at the cost of speed if more than 3.5GiB is required)?
Re: (Score:2)
Re: (Score:2)
There might be cases where an application queries how much memory is available, then allocates all of it to use as caching. If the driver doesn't manage that memory well (i.e., doesn't put the least-used data in the slower segment), that could cause performance to be lower than if the card were forced to 3.5GB only.
That said, nobody seems to have found any applications where the memory management malfunctions like that, so it's more a theoretical quibble than an actual one at this point. And, knowing Nvidia, they'd just patch the driver.
Re: (Score:2)
There might be cases where an application queries how much memory is available, then allocates all of it to use as caching.
As you say, that is really just theoretical. Doing that would be a very poor memory management system. Assuming that just because there is free memory you can allocate all of it for caching would be a silly thing to do. Even in the case where you can assume that your process has exclusive control and ownership of the memory pool, no middleware is going to do that, as code outside that middleware but within that process could allocate GPU memory for some other use. So I doubt this would happen except in rare edge cases.
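For what it's worth, the pattern being debated looks something like the sketch below, in Python, with gpu_mem_info() and gpu_alloc() as purely hypothetical stand-ins for whatever API a real engine or middleware would use:

    # Hypothetical stand-ins for a real GPU API -- not actual library calls.
    def gpu_mem_info():
        """Pretend query: returns (free_bytes, total_bytes) of video memory."""
        return int(3.8 * 2**30), 4 * 2**30

    def gpu_alloc(nbytes):
        """Pretend allocation: returns a handle to nbytes of video memory."""
        return object()

    # The naive pattern under discussion: treat all free VRAM as yours to cache in.
    free_bytes, _total = gpu_mem_info()
    cache = gpu_alloc(free_bytes)   # on a 970, the tail of this lands in the slow segment

    # The defensive variant: leave headroom for other allocations in the process
    # (which, on a 970, also happens to keep the cache out of the last 0.5 GiB).
    HEADROOM = 512 * 2**20
    free_bytes, _total = gpu_mem_info()
    cache = gpu_alloc(max(0, free_bytes - HEADROOM))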
Re: (Score:2)
Most OSes will allocate the vast majority of RAM to a disk cache, because unused RAM is wasted RAM.
Yes, but that is completely different. That is the operating system and it is dealing with system RAM - which the operating system controls access to anyway. What we are talking about here is video memory and processes that do not exclusively control that video memory.
Re: (Score:2)
Re: (Score:2, Informative)
There's really no point to doing that. If you disable the memory and run at high resolutions with ultra textures and AA that would cause you to break the 3.5 GB barrier, your performance would just tank because you are exchanging with main memory. In other words, the performance of the card is at least what you would get from a 3.5 GB card. That extra 500 MB isn't hurting anything.
Re: (Score:2)
How about giving us the option to either always be able to run at maximum speed (disable that last 0.5GiB) or always let the software use the full 4GiB (at the cost of speed if more than 3.5GiB is required).
Because if it doesn't use more than 3.5GB then the performance is no different and if it does use more than 3.5GB then it will use that slower 0.5GB of additional video memory rather than using the even slower system memory. Disabling that extra 0.5GB will do nothing for performance except make it worse in cases where more than 3.5GB is used.
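A toy step-function model makes that ordering explicit. These are the peak figures from the AnandTech article further down (196 GB/s striped segment, 28 GB/s slow segment, roughly 16 GB/s once you spill to system RAM over PCIe 3.0 x16); actual behavior depends on what the driver places where:

    def peak_bandwidth_gbs(working_set_gib: float) -> float:
        """Toy model: peak bandwidth of the slowest tier a working set spills into."""
        if working_set_gib <= 3.5:
            return 196.0   # fast, 7-channel striped VRAM segment
        if working_set_gib <= 4.0:
            return 28.0    # slow single-channel VRAM segment
        return 16.0        # spilling to system RAM over PCIe 3.0 x16

    for gib in (3.0, 3.7, 4.5):
        print(gib, peak_bandwidth_gbs(gib))
    # Disabling the 0.5GB segment would just drop the 3.5-4.0 GiB band to ~16 GB/s.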
Re: (Score:2)
Re: (Score:2)
has not answered the important question (Score:2)
What about those users (more than one; anecdotes are data, not anomalies!) whose use causes the GPU to attempt to address more than 3.5GB of VRAM, causing them to crash out? If what NVidia is claiming here, according to TFS, is accurate, then this should not be happening. It is happening: the 3.5GB ceiling is being hit hard and people are feeling it. What say you, NVidia?
Re: (Score:3, Informative)
>> causing them to crash out
This is a blatantly misleading thing to say. The cards don't crash at all. The only thing that happens as a result of this is a properly handled decrease in real-world performance compared to the 980.
Are you seriously trying to claim that the 970 _should_ have the same performance as the 980?
Re: (Score:1)
No, I did not say that. The claim is that these cards are, in the first instance, being sold as 4GB cards. That's as may be, but the top 500MB is deliberately crippled (to turn a 980 into a 970? I don't get the logic) to the point where it can and does cause repeated and repeatable crashes when the 3.5GB ceiling is hit under certain conditions, which is ENTIRELY doable when you have a multiple-screen setup. That is the issue.
Re: (Score:1)
The fix here would be to issue a firmware update that disables that slow 500MB*, and update the labelling on the card to reflect the fact that it's a 3.5GB card, not a 4GB one.
*Ever run RAM of two different speeds in a desktop? Wonder where those crashes are coming from? It's not a case of the faster RAM waiting for the slower RAM, the slower stuff is tripping over trying to keep up with the faster stuff. It doesn't work in the same way as a PATA channel where the bus runs at the speed of the slowest device.
Re: (Score:2)
>> *Ever run RAM of two different speeds in a desktop? Wonder where those crashes are coming from?
No, I'm not that stupid. And yours is a *terrible* analogy. You're not meant to mix different-speed RAM in your system. At least it says so in every motherboard manual I've ever read, which is a LOT in over 30 years of building my own PCs. If you do, then you are not only operating outside the design parameters of the motherboard maker, but clueless about computers.
Also, GPU memory is functionally totally different...
Re: (Score:2)
Credible citations required. As in, direct links to pages in motherboard manuals that do NOT specify ECC memory modules.
Re: (Score:2)
Stop trying to change the subject. The issue here is graphics cards and your claims that the 970 is causing crashes. I say prove it.
Re: (Score:2)
Re: (Score:2)
Possibly, but if so it's still a stupid thing to say, as it's obviously going to confuse the meaning.
Re: (Score:2)
Re: (Score:3)
>> where it can and does cause repeated and repeatable crashes
I call bullshit. Please post credible references to people actually experiencing crashes while gaming as a result of this.
Re: (Score:2)
Uh, the fucking Slashdot thread from yesterday?
Re: (Score:2)
The only graphics-related article posted yesterday that I'm seeing is about OpenCL on Linux.
Re: (Score:2)
Re: (Score:2)
...Which again makes no mention at all of crashes occurring, as he stated several times.
Seems he's just trolling with no actual evidence. Probably just another rabid ATI fanboi.
Consumers? No just whiny fanboys (Score:4, Insightful)
Consumers are fine. The only benchmark that matters to a normal consumer is "How fast does it run my games?" and the answer for the 970 is "Extremely damn fast." It offers performance quite near the 980, for most games so fast that your monitor's refresh rate is the limit, and does so at half the cost. It is an extremely good buy, and I say this as someone who bought a 980 (because I always want the highest end toy).
Some people on forums are trying to make hay about this because they like to whine, but if you STFU and load up a game the thing is just great. While I agree companies need to keep their specs correct, the idea that this is some massive consumer issue is silly. The spec heads on forums are outraged because they like to be outraged; regular consumers are playing their games happily, amazed at how much power $340 gets you these days.
Re:Consumers? No just whiny fanboys (Score:4, Insightful)
Honestly, even something like that 970 is overkill for me. I've still got an 8800 in my old machine that runs plenty of games just fine, especially many of the older ones that I'm finally getting around to playing.
Re: (Score:2)
>> there are some people who do have cause to complain if they would have changed their purchasing decision based on having the correct information at the time of their purchase.
While I agree that nVidia should have got the published details correct, do you seriously think a customer exists who would not have bought this card had they known the memory handling strategy was slightly different than published, even though the overall performance of the card was advertised correctly?
That would be as insane...
Re: (Score:2)
As an owner of a GTX 970 card, all I can say is I can run Shadow of Mordor at full 1920x1080 res with the "ultra" texture setting and it never dips below 30fps, usually getting 45-60.
The additional fact that I got the card as an open-box return at the local computer store for $220 makes it a no-brainer for me even if the allegations about the 3.5GB of VRAM were true.
There is no game in existence that a 980 or Titan card can play that my 970 couldn't, even if I had to bump the settings down to just "very high".
If I bought...
Re: (Score:2)
The issue here is that you're playing at lower resolutions that generally require a lot less VRAM.
But for that very reason, you don't buy a ~350EUR card for 1080p; you buy a 200EUR one, which is the 960.
The 970 is meant for 1440p or higher, and there games like SoM start to ask for more VRAM on ultra. A lot more. And at 2160p you're looking at being capped, though for that you'd probably want a 980.
Re: (Score:1)
False advertising requires that the false information, amongst other things, would have affected the purchase decision and resulted in some sort of loss -- usually the amount spent on the item that would otherwise have been used differently.
So given that requirement I'll go ahead and fix your statement.
If you bought the 970 because it has 64 ROPs and it only has 56, then you have a claim.
Re: (Score:1)
I spent $800 on a pair of 970s, and as I use massive resolutions, I use a lot of VRAM. They basically fucked me and took the cash. This is not what I expected, having come from a pair of actual 4GB 670s. So fuck you. I am not a whiny fanboy, I am a pissed CUSTOMER who paid good fucking cash and lots of it.
Re: (Score:2)
Hey, I'm still happy with my purchase, but when I bought it I looked at the specs and thought: hmm, they've disabled 3/16ths of the shaders, but it has the same ROPs, same cache, same RAM... if I buy two for SLI it should perform like the GTX 980 except for having 2x13 = 26 shader blocks instead of 16/32 for a single/double 980. Now I find out that's just not true; it has 0.5 GB of quasi-RAM it can't access at the listed memory bandwidth. I feel I've got a very legitimate reason to feel cheated.
Apparently the ROP/cache...
Re: (Score:2)
Kind of.
You're right, the performance - as long as it stays within 3.5GB - is fine. The thing is, given that the extra 0.5GB brings the performance down, they probably should have just made it a 3GB card - which again would have given the same amount of performance in the majority of cases - and shaved a bit off the cost.
Re:1/7th the speed? (Score:5, Insightful)
Re: (Score:2)
The factor is 1/7.
Use this formula: 1/7 * speed
Better article (Score:5, Informative)
As usual, AnandTech's article is generally the best technical reporting on the matter [anandtech.com]
Key takeaways (aka tl;dr version):
* Nvidia's initial announcement of the specs was wrong, but only because the technical marketing team wasn't notified that you could partially disable a ROP unit with the new architecture. They overstated the number of ROPs by 8 (was 64, actually 56) and the amount of L2 cache by 256KB (was 2MB, actually 1.75MB). This was quite unlikely to be a deliberate deception, and was most likely an honest mistake.
* The card effectively has two performance cliffs for exceeding memory usage. Go over 3.5GB, and it drops from 196GB/s to 28GB/s; go over 4GB and it drops from 28GB/s to 16GB/s as it goes out to main memory. This makes it act more like a 3.5GB card in many ways, but the performance penalty isn't quite as steep, and it intelligently prioritizes which data to put in the slower segment.
* The segmented memory is not new; Nvidia previously used it with the 660 and 660 Ti, although for a different reason.
* Because, even with the reduced bandwidth, the card is bottlenecked elsewhere, this is unlikely to cause actual performance issues in real-world cases. The only things that currently show it are artificial benchmarks that specifically test memory bandwidth, and most of those were written specifically to test this card.
* As always, the only numbers that matter for buying a video card are benchmarks and prices. I'm a bigger specs nerd than most, but even I recognize that the thing that matters is application performance, not theoretical. And the application performance is good enough for the price that I'd still buy one, if I were in the market for a high-end but not top-end card.
Not a shill or fanboy for Nvidia - I use and recommend both companies' cards, depending on the situation.
Car Analogy (Score:2, Insightful)
A particular high-performance car has a premium 8-cylinder engine with 32 valves making 400 hp. They also sell a non-premium version which is also 8 cylinders but only 30 valves, and makes 350 hp but is a lot cheaper. The difference is that one cylinder is missing two valves, which lowers its maximum power compared to the premium version. The engine's computer correctly controls the engine to compensate for the one weird cylinder, but someone in the marketing department sold the car as having 32 valves when it only has 30.
Re: (Score:1)
That's a REALLY bad car analogy.
It's more like:
Chevy makes a Z-28 Camaro with a 5.7L 525HP V8 engine and sells it as such, with all sales brochures and commercials stating it has a 5.7L 525HP V8 engine.
They also make a Base model Camaro with a 5 liter V8 making 450HP.
Both models hit their estimated performance targets and all performance testing results show the Z-28 out-performs the base model by the pre-determined percentage. As it should.
Chevy dealers around the nation receive and start selling base models...
Re: (Score:2)
>> window stickers stating the base model has a 5.7L 525HP V8 engine....They discover it's only putting out 450HP
Not at all, Nvidia has never advertised it as having more performance than it actually has.
Re: (Score:3)
Both of you suck at car analogies.
Let's say Nissan makes an engine. V6, 3.8L. They advertise it as being 250HP, promote it mainly by putting it in racecars and winning races, and a whole lot of other technical specs get handed out to reviewers to gush over, but nobody really reads them except nerds.
They then make a variant engine. Same V6, but they cut the stroke down so it's only 3.0L. They advertise it as being 200HP, promote it with some more racecars that don't win the overall race but are best in their class, and again they hand out a small book worth of technical specs, this one with a minor error in the air flow rates on page 394. Somebody forgot to edit the numbers from the 3.8L engine, so even though the actual airflow is more than enough for the smaller engine, the numbers originally given look bigger.
Re: (Score:2)
Both of you suck at car analogies.
Let's say Nissan makes an engine. V6, 3.8L. They advertise it as being 250HP, promote it mainly by putting it in racecars and winning races, and a whole lot of other technical specs get handed out to reviewers to gush over, but nobody really reads them except nerds.
They then make a variant engine. Same V6, but they cut the stroke down so it's only 3.0L. They advertise it as being 200HP, promote it with some more racecars that don't win the overall race but are best in their class, and again they hand out a small book worth of technical specs, this one with a minor error in the air flow rates on page 394. Somebody forgot to edit the numbers from the 3.8L engine, so even though the actual airflow is more than enough for the smaller engine, the numbers originally given look bigger.
Except memory, for whatever reason, is what most laymen measure graphics card performance by. So it is not an obscure little number. This is like claiming the engine is 3.8L and forgetting to say that it's a 3.8L that has been cut to use only 3.0L, and therefore performs as a 3.0L.
Re: (Score:2)
Unlikely to be deliberate? For this NOT to be deliberate you have to assume that everyone involved in the design never once looked at the marketing material, which is absurd. Not only that, but even after the card was released the incorrect information was what the company maintained as fact until there was objective proof that it was a lie.
Of course none of this would be a big deal at all had they not lied about it. I'm a long-time nvidia user (I don't consider myself a fanboi), but the only way you can say...
Re: (Score:2)
Re: (Score:1)
Yeah; in this case, the initial error at publishing appears to be an honest mistake. As the specs lined up and the tech marketing guys were told that nothing changed on that hardware, they kept the same ROP specs they'd already been using in their marketing materials.
I have to admit that back when I was publishing specs like these, I made that mistake myself a few times.
However, it's what happened next that's a bit odd: I find it difficult to believe that it took 4 months for it to come to the attention...
Re: (Score:3)
This wasn't "marketing material", it was "technical marketing material" - the stuff given to review sites, not the general public. And it was a relatively obscure portion that was incorrect, not something that most consumers would even understand, let alone care about. The technical marketing staff (a distinct group from the consumer marketing department) made the assumption that every enabled ROP/MC functional unit has two 8px/clock ROPs, two L2 cache units of 256KB, two links into the memory crossbar, and so on...
Re: (Score:2)
How the information was disseminated, and to whom, is a red herring.
Technical information was published publicly. No matter how esoteric and incomprehensible it was, or how small an audience it received, it was out there, and the ONLY way you could argue this is a genuine mistake (in that it remained uncorrected until now) is if the people inside nvidia who knew the actual capabilities never saw this published public information. I find the idea that their team, who knew it was false, didn't see this published information...
Re: (Score:2)
I won't make any assumptions about you, but I've *never* looked at the marketing for the product I work on. I don't check to make sure their numbers are accurate, because my job is to build the damn thing, not proofread. If someone from marketing *asks* me to check something, I will, but I don't go around reading reviews to make sure all the numbers are right.
Further, it's a compromise in a part that's already compromised. In any video card, there are several parts that need to be roughly proportionate in performance...
Re: (Score:1)
I still have to criticize Nvidia. I don't know if this has been done before on previous cards or not (which doesn't make it ok), but when you advertise a certain amount of memory with no indication that part of that memory performs differently/worse, then that's just wrong.
Would it be okay if 0.5GB of the 4GB was the only pool that actually ran at full speed? I mean, technically you still get 4GB... no sane person would say yes. It's not much different from what is happening with the GTX 970, and I think Nvidia should...
Re: (Score:2)
I would imagine that it gets most troublesome only if you use the card for computing and rely on homogeneous memory access...
Re: (Score:2)
... and if you actually have completely random memory access, and if you're using a 970 instead of an actual compute card from the Quadro or Tesla line...
Re: (Score:2)
Yes. I think if I'm doing SW development, a price difference of 3 man-hours or less does not justify the trouble...
Re: (Score:2)
What "fire from consumers"? (Score:1)
I just bought a GTX 970 and I'm chuffed with it: 89% of the performance of a GTX 980 for 60% of the price. I couldn't justify spending the thick end of five hundred quid on a graphics card, and the GTX 970 will run pretty much anything I throw at it.
Captcha : illusion
Just bought two of these cards (Score:2)
I thought about shelling out the dough for the 980s, but didn't, because, well - same CPU, lower clock, right?
Wrong.
Less than happy about this.
Re: (Score:2)
>> Less than happy about this.
Why? How does it adversely affect you in the real world? You're still getting the same GPU performance you paid for, right?
Re: (Score:2)
I run three 30" 2560x1600s and fully intend on maxing out the VRAM on those cards.
I get a big hit in doing so. Boo.
Looking into returning them on principle and getting the 980s... not sure if that hurts nVidia any, but it might piss off their vendors a bit.
Re:Just bought two of these cards (Score:5, Informative)
>> I run three 30" 2560x1600s
That's pretty much irrelevant. GPU RAM isn't used that way at all. It's used to hold the 3D geometry, bitmaps, bump maps, etc. of assets and other processing data, which is largely if not completely independent of screen resolution / number of screens.
Do the math:
2560 x 1600 x 4 (4 bytes per pixel for 32-bit color) = 15.625 MiB per screen; x 3 monitors = 46.875 MiB total for all three screen buffers.
Even triple-buffering, your total screen buffer requirement for all 3 monitors is less than 150 MiB.
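The same arithmetic as a quick sanity check in Python (assuming plain 32-bit color buffers, no MSAA):

    width, height, bytes_per_pixel = 2560, 1600, 4
    per_screen_mib = width * height * bytes_per_pixel / 2**20   # 15.625 MiB
    three_screens_mib = 3 * per_screen_mib                      # 46.875 MiB
    triple_buffered_mib = 3 * three_screens_mib                 # 140.625 MiB
    print(per_screen_mib, three_screens_mib, triple_buffered_mib)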
Re: (Score:2)
>> You people clearly have no idea what a frame buffer does.
They're probably the same people who also think buying/adding more RAM speeds up the processor.
Re: (Score:2)
Re: (Score:2)
That's pretty much irrelevant. GPU RAM isn't used that way at all. It's used to hold the 3D geometry, bitmaps, bump maps, etc. of assets and other processing data, which is largely if not completely independent of screen resolution / number of screens.
For real-time rendering of a simulated environment - that is, gaming - textures are generally stored as mipmaps [wikipedia.org], so the more pixels a surface takes up on the screen, the more detailed version of the texture is used, and thus memory use rises accordingly through the entire pipeline. It's pretty easy to see if you keep resolution or texture quality constant and vary the other. If you're doing some other kind of simulation that might not hold, but for gaming what you said is pretty much false.
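To put rough numbers on that: a full mip chain costs about a third extra over the base level (the geometric series 1 + 1/4 + 1/16 + ... converges to 4/3), and higher output resolutions cause the sampler to pull from larger mip levels, so working-set pressure scales with resolution. A small sketch, with purely illustrative texture sizes:

    def mip_chain_bytes(base_width, base_height, bytes_per_texel=4):
        """Total bytes for a full mipmap chain, halving each dimension down to 1x1."""
        total, w, h = 0, base_width, base_height
        while True:
            total += w * h * bytes_per_texel
            if w == 1 and h == 1:
                break
            w, h = max(1, w // 2), max(1, h // 2)
        return total

    for size in (1024, 2048, 4096):
        print(f"{size}x{size}: {mip_chain_bytes(size, size) / 2**20:.1f} MiB")
    # 1024x1024: 5.3 MiB, 2048x2048: 21.3 MiB, 4096x4096: 85.3 MiB (~4/3 of base)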
Re: (Score:2)
Yeah, I know. It's basically what I meant when I listed bitmaps.
The whole point of using mipmaps is as a strategy to reduce GPU memory/bandwidth usage. Consequently I don't think differentiating between bitmaps and mipmaps is actually relevant to the larger discussion here.
Re: (Score:3)
Yeah, you return those cards and give nvidia more money for the more expensive cards. That'll show 'em!
Re: (Score:2)
... except they WEREN'T the "same CPU". They were the same GPU die (GM204), but three of the sixteen cores were disabled. This was perfectly explained at launch. If you bought a 970 thinking you could overclock it to 980 clocks and get the exact same performance, I'm sorry, but you just weren't paying any attention.
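To put numbers on that (a naive linear-scaling assumption that ignores memory and ROP effects): with 13 of 16 SMMs enabled, a 970 at the same clocks delivers 13/16 = ~81% of a 980's shader throughput, so it would need roughly a 16/13 - 1 = ~23% core overclock just to match a stock 980 on purely shader-bound work.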
First it's the games, now the hardware :D (Score:2)
Wondering if we have to start doing the same thing with hardware now. Let the same folks who pre-order game titles beta-test this stuff for a few months to determine whether the marketing claims are legitimate or not, then decide if you should buy it.
"Just ship the damn thing! We'll update it later..."
Re: (Score:1)
Your statement that the last .5GB is not running at 1150MHz is about as factually correct as Nvidia's statement that the card had 64 ROPs...
The issue isn't the speed of the RAM; it's the setup of the connections between the RAM and the GM204 GPU. The entire last 1GB of RAM is accessed through a single L2 interface, while the other six L2 interfaces only handle .5GB each.
Each of the seven L2 interfaces in the GPU can handle roughly 22GB/s of bandwidth to RAM, and data in RAM is interleaved between interfaces...
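A toy model of the interleaving the parent describes is below; the 256-byte stride is an assumption purely for illustration (I wouldn't vouch for GM204's real crossbar granularity), but it shows why striping across seven interfaces multiplies bandwidth while the last 0.5GB queues behind a single one:

    STRIDE = 256                    # assumed interleave granularity (illustration only)
    FAST_BYTES = int(3.5 * 2**30)   # 3.5 GiB striped across 7 L2/memory interfaces

    def l2_interface(addr):
        """Toy model of which L2 interface services a given VRAM address."""
        if addr < FAST_BYTES:
            return (addr // STRIDE) % 7   # striped: consecutive blocks hit different interfaces
        return 6                          # last 0.5 GiB funnels through one interface

    # Sequential access in the fast region spreads across all 7 interfaces...
    print([l2_interface(i * STRIDE) for i in range(8)])               # [0,1,2,3,4,5,6,0]
    # ...while every access in the slow region queues behind the same one.
    print([l2_interface(FAST_BYTES + i * STRIDE) for i in range(8)])  # [6,6,6,6,6,6,6,6]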
Re: (Score:2)
Isn't this largely a driver issue? (Score:1)