Benchmarking the Benchmarks 126
apoppin writes "HardOCP put video card benchmarking on trial and comes back with some pretty incredible verdicts. They show one video card returning benchmark scores much better than another's, compared to what you actually get when you play the game. Lies, damn lies, and benchmarks."
Erase Futuremark = instant win (Score:1, Insightful)
Re:OSS (Score:2, Insightful)
Either I misunderstood you, or I don't see how the license can be a metric of performance or accuracy.
Benchmarks (Score:5, Insightful)
You must perform the same exact test on all video cards, disclose any variables, and you must not "pick a subset of completed tests to publish". You must not compare tests performed using different procedures, no matter how slight the deviation of the procedures is.
One cannot draw conclusions about "real world" performance from a benchmark. The benchmark is merely an indicator. A "real world" test that uses the strong, formalized procedures of a benchmark IS a benchmark - and suddenly, the benchmark is not "real world" - because the "real world" doesn't have formal procedures for gameplay.
Haphazard "non-blind" gameplay on a random machine is NOT a benchmark, and it cannot provide useful, comparable numbers.
A good benchmark is one where (1) most experts agree that it has validity, and (2) one where the tester cannot change the rules of the game.
The numbers of a benchmark are meaningless, except in terms of being compared to one another using the same exact procedure.
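The "same exact procedure" rule can be sketched in code: two scores are only comparable when every variable of the run is identical. A minimal illustration (all names and settings here are hypothetical, not HardOCP's actual methodology):

```python
from statistics import mean

def mean_fps(frame_times_ms):
    """Average FPS from a list of per-frame render times in milliseconds."""
    return 1000.0 / mean(frame_times_ms)

def comparable(run_a, run_b):
    """Two benchmark runs are comparable only if every setting matches:
    same demo, same resolution, same AA level, and so on."""
    return run_a["settings"] == run_b["settings"]

run_a = {"settings": {"demo": "timedemo01", "res": "1680x1050", "aa": 4},
         "frame_times_ms": [18.2, 19.1, 17.8, 20.3]}
run_b = {"settings": {"demo": "timedemo01", "res": "1680x1050", "aa": 0},
         "frame_times_ms": [12.5, 13.0, 12.8, 13.2]}

if comparable(run_a, run_b):
    print(mean_fps(run_a["frame_times_ms"]), "vs", mean_fps(run_b["frame_times_ms"]))
else:
    print("Runs used different procedures; scores are not comparable.")
```

Here run_b looks "faster", but since it disabled AA, the numbers say nothing about the relative hardware.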
Re:HardOCP benchmarks suck ass (Score:4, Insightful)
The highest playable settings for given hardware.
They then change the video card and find the highest playable settings for that hardware.
I'd much rather compare the highest playable settings for two different cards than the timedemo benchmark numbers for two different cards.
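The "highest playable settings" approach amounts to a search over quality settings for the best-looking combination that still holds a minimum framerate. A rough sketch of the idea (the ladder, threshold, and measurement function are all hypothetical stand-ins):

```python
def highest_playable(settings_ladder, measure_min_fps, threshold=30.0):
    """Walk a quality ladder from best-looking to worst and return the
    first settings whose minimum FPS stays above the threshold.

    settings_ladder: list of settings dicts, highest quality first.
    measure_min_fps: plays through the game at the given settings and
    reports the minimum FPS observed (hypothetical stand-in here).
    """
    for settings in settings_ladder:
        if measure_min_fps(settings) >= threshold:
            return settings
    return None  # nothing playable on this card

ladder = [{"res": "1920x1200", "aa": 4, "shadows": True},
          {"res": "1680x1050", "aa": 4, "shadows": True},
          {"res": "1680x1050", "aa": 0, "shadows": False}]

# Toy measurement: pretend AA is what tanks the framerate on this card.
best = highest_playable(ladder, lambda s: 25.0 if s["aa"] else 40.0)
```

Two cards are then compared by the settings they end up with, not by a single score.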
Re:Benchmarks are a marketing tool only (Score:3, Insightful)
Not in Crysis, Call of Duty 4, UT3, etc.
When I go to plunk down $200 - $300 on a video card, and one of them performs comfortably at my LCD's native resolution and the other one doesn't, that matters. Saying all cards in a given price range are roughly equivalent is saying that you are completely, 100% blind to the reality of video cards today.
Not the same card (Score:3, Insightful)
It is a bit of a shock that ATI's latest and greatest can't seem to consistently beat nVidia's year-old GTX cards, I guess.
Re:HardOCP benchmarks suck ass (Score:3, Insightful)
For example: 1680x1050 with no AA may be considered unplayable (jaggies) by some, but for others it's perfectly fine...
Or, maybe you can turn on the AA, but deactivate shadows, changing your whole "playable" demographic again.
It's like asking someone to benchmark coffee at different restaurants to grade whether it is palatable or not.
~D
Re:back in my day... (Score:3, Insightful)
I'm pretty sure these benchmarks are invented by men.
Re:back in my day... (Score:5, Insightful)
There is indeed a bare minimum of hardware performance required to play, but sadly for many new games, especially Crysis, that bare minimum is scarily close to the market's maximum. Benchmarks are supposed to isolate this and objectively measure it, so that the consumer can make a good purchasing decision and, when the game is played, the subjective experience of enjoyment will hopefully follow. A framerate above human perception is needed for fun (as jerky frames lead to nausea and frustration), and high detail is needed for the beauty of a game, which is probably just as important (beauty has been the basis for visual art, music and poetry for millennia).
The reason we've come so far and now have computers, electricity, aeroplanes, cars, etc. is the willingness of scientifically inclined individuals to isolate, experiment and measure. Technology is one of the things in life that can be measured, and I think it is a good idea to keep doing so, provided we do it right. Experimentation and science are what got us out of the caves, no?
As for HardOCP, what have they proven? Apparently traditional timedemos run faster than real-time play by a fairly constant factor, even though it is acknowledged that real-time play renders more, including weapons, characters and effects that the canned demo does not. This would be interesting if the question were "how fast can Crysis run on different cards", but that's not what people want to know. What I want to know is which card I should buy to keep playing cutting-edge games for as long as possible, enjoying their whole beauty without the framerate dropping low enough to make me uncomfortable. It just so happens that the card with the best timedemo benchmark also has the best actual playthrough benchmark, and by roughly the same factor. The only difference is that the traditional timedemo depends only on the graphics hardware, whereas the playthrough benchmark also depends on efficiency elsewhere in the engine (AI, physics), on where the player spent the most time and, if reviewing subjectively, on the reviewer's current mindset and biases.
Somebody please think of the science!
Re:back in my day... (Score:4, Insightful)
I must be getting old, I haven't upgraded my box in almost 2 years.
Cheers.
Re:back in my day... (Score:3, Insightful)
The point is that you can't use a standard game (plus FPS meter) played by a human player to judge a graphics card's raw capabilities. To reduce subjectivity and error, you need consistency in what is being rendered.
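That consistency is exactly what a timedemo provides: the same pre-recorded frame sequence is replayed on every card, so any difference in measured frame time can only come from the hardware. A toy sketch of the idea, with a hypothetical render_frame standing in for the engine's render call:

```python
import time

def run_timedemo(render_frame, recorded_frames):
    """Replay a fixed, pre-recorded sequence of frames, timing each one.
    Because every card renders the identical workload, the resulting
    average FPS is comparable across hardware."""
    frame_times = []
    for frame in recorded_frames:
        start = time.perf_counter()
        render_frame(frame)  # identical work on every card
        frame_times.append(time.perf_counter() - start)
    return len(frame_times) / sum(frame_times)  # average FPS over the demo

# Toy stand-in: "rendering" is just a fixed delay per frame.
avg_fps = run_timedemo(lambda frame: time.sleep(0.001), range(50))
```

A human playthrough, by contrast, renders a different workload every run, which is precisely what makes its FPS numbers hard to compare.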