ATI Technologies admits to cheating in its drivers to achieve a higher 3DMark03 score, while ExtremeTech’s exhaustive testing of both nVidia and ATI 3D hardware with the newest version of 3DMark03 confirms Futuremark’s findings of unfair optimizations by both companies.
In my view, when somebody states that everything they did was intended, that is not an admission of the cheating accusation but a rejection of it…
In ATI’s case, they shuffled a few instructions without changing the way the shader works. This is quite different from what NVidia did. Their mistake is that they should have made that shuffling more generic (e.g. make it work in all shaders where such shuffling can help), which is harder to do. ATI has already said that “they would remove them from their driver”, so I guess they will actually make this kind of shuffling work in a generic way.
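Roughly the idea, sketched in plain C (a toy illustration only: the real shuffling happens at the shader-assembly level, and these names and constants are invented):

    #include <stdio.h>

    /* Two versions of the same "shader". The instructions are issued in a
       different order, but because they are independent and the final sum
       is formed the same way, the output is bit-identical. Any win comes
       purely from an order that suits the card's pipeline better. */

    float shade_original(float diffuse, float specular, float ambient)
    {
        float a = ambient  * 0.2f;   /* instruction 1 */
        float d = diffuse  * 0.7f;   /* instruction 2 */
        float s = specular * 0.1f;   /* instruction 3 */
        return a + d + s;
    }

    float shade_shuffled(float diffuse, float specular, float ambient)
    {
        float s = specular * 0.1f;   /* instruction 3 issued first */
        float a = ambient  * 0.2f;   /* instruction 1 */
        float d = diffuse  * 0.7f;   /* instruction 2 */
        return a + d + s;            /* same sum, same result */
    }

    int main(void)
    {
        printf("%f %f\n", shade_original(0.5f, 0.3f, 1.0f),
                          shade_shuffled(0.5f, 0.3f, 1.0f));
        return 0;
    }

The hard part, as I said, is doing that kind of reordering for arbitrary shaders rather than for the one shader a benchmark happens to use.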
I’m really impressed by the way that ATi handled it.
ATi are managing to come away from this in a comparatively clean manner. In fact, you could say they are smelling of roses compared to the dark stench that is emanating from nVidia.
HOWEVER – I believe this is purely down to luck.
nVidia would not be trying so hard to improve their scores if the FX5800 cards had shown even half-baked performance. As it stands, they produced a turkey and are desperately trying to pass it off as a fine fillet steak. Having been caught sticking their hands into the cooking, things are definitely NOT looking good for them.
ATi on the other hand produced a great card with the 9700 and have the chance to concentrate more on producing the NEXT card rather than making the current one look good. I fully believe that they would have been putting a LOT more effort into these “optimisations” if the performance between the FX5800 and the 9800 had been reversed.
It’s a funny old world. Perhaps Futuremark should be making it more difficult for the driver tweakers to cheat.
A certain scene in one of the games could perhaps be rendered and then compared to the “original” in its database. I know that the driver tweakers could get round pretty much anything that was thrown at them, but the more code they have to put in, the bigger the drivers would become. I think everyone would notice if the drivers had to double or triple in size just to cheat a benchmark!
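Something along these lines, purely as a sketch (dimensions, tolerance, and function names are all made up; a real check would live inside the benchmark and pull the reference frame from its database):

    #include <stdio.h>
    #include <stdlib.h>

    #define WIDTH     640
    #define HEIGHT    480
    #define CHANNELS  3
    #define TOLERANCE 4   /* allowed per-channel difference, 0..255 scale */

    /* Compare a freshly rendered frame against the stored reference,
       allowing small differences for legitimate rounding variations
       between cards. Returns 1 if the frame passes, 0 if quality was
       visibly cut. */
    int frame_matches_reference(const unsigned char *frame,
                                const unsigned char *reference)
    {
        size_t n = (size_t)WIDTH * HEIGHT * CHANNELS;
        size_t bad = 0;
        for (size_t i = 0; i < n; i++) {
            if (abs((int)frame[i] - (int)reference[i]) > TOLERANCE)
                bad++;
        }
        return bad <= n / 100;   /* fail if more than 1% of samples are off */
    }

    int main(void)
    {
        static unsigned char frame[WIDTH * HEIGHT * CHANNELS];
        static unsigned char reference[WIDTH * HEIGHT * CHANNELS];
        /* In real use these would come from the card and the database. */
        printf("match: %d\n", frame_matches_reference(frame, reference));
        return 0;
    }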
Just an idea anyway.
I wouldn’t really mind what they do to tweak the drivers so long as the tweaking is very generic. To tweak settings just for a benchmark illustrates how desperate a company is to improve its performance in the market.
Dunk
Good for them, I say. Microbenchmarking is deadly. You strive to create as great a piece of art as you can with hardware, dealing with hard tradeoffs and real-use cases, and your sales are totally dependent on some number a synthetic program spits out.
Sometimes companies “lose it” when it comes to meeting customer demand. But other times, they throw far more resources at finding out what customers want than the benchmark writers do. Competitors’ marketing departments will crucify you for insignificant score differences, when only large ones matter.
Just about all the big Silicon Valley tech companies cheat for their own benefit, too. And the Silicon Valley investment banks, venture capital firms, accounting firms, law firms, etc.
One of the reasons that the market is down is that people have begun to realize that technology is mostly a scam. There are precious few real benefits to people. Most IT money ends up going down the drain. It’s not a pretty scene.
Who cares if Nvidia cheats on their benchmarks? What’s that compared to the low-quality video signal they shipped for years and years? Many times you couldn’t even return the card you bought. Nvidia has always done everything they can to cut corners.
Nvidia had two executives who made the SJ Mercury’s “most overpaid” list. Who honestly thinks Nvidia gives a shit about cheating? It’s part of the new Valley culture that came with the dotcom era. And it looks like it’s here to stay.
Go Nvidia!
This keeps happening. ATI did it with Quake 3 a while ago, detecting the game binary and so on.
Both companies are being idiots, but I don’t give 3dMark much weight as a benchmark, either. It all depends on the real games at the end of the day.
I wonder whether other card makers like Matrox and Intel are guilty of this as well. Does anyone have any evidence of things like this occurring in the past?
Like the old “city of thieves” story…if everyone cheats, it’s the same as if they hadn’t.
“I wonder whether other card makers like Matrox and Intel are guilty of this as well. Does anyone have any evidence of things like this occurring in the past?”
If you look back to the mid-’90s, when most companies started making 3D cards, I think you’ll find a lot of this stuff. Intel has rarely (if ever) done well with 3D graphics, so I doubt you’ll find much from them. And of course, most of the companies that needed to cheat benchmarks are no longer even minor players in the 3D market.
Of course, I don’t know how many end users look at benchmarks like 3DMark as much as at scores in games they actually have (or can buy). Personally, I found ATI’s Quake 3 cheat to be a bigger problem, since it degraded quality for people actually playing the game by dropping back to 16-bit colour when they had selected 32-bit colour.
One thing is for sure — Matrox is obviously the only one NOT cheating… *lol*
The problem, IMO, is: is there ANY fair, unhacked benchmark, be it a game or a synthetic bench? Given the nature of the hack NVidia used (clipping), it’s child’s play to do the same thing to EVERY widely used demo out there. I expect many reviewers to soon start using their own, closed demos.
“I expect many reviewers to soon start using their own, closed demos.”
Even then, since the users don’t have access to it themselves, it won’t mean much to anyone.
The only reason 3dMark is remotely useful as a benchmark is that anyone using Windows can go download it and run it on their own system to make a comparison. Quake 3 would still be the primary benchmark for video cards if id hadn’t released an early version of Doom 3 for that purpose (which seems a little ridiculous since no one has the game, but I guess Quake 3’s age makes it a bit ridiculous as a benchmark, too).
It is probably getting to the point where the market for high-end video cards is going to wind down quickly. Integrated video is pretty good now for everyone except gamers, and it is improving rapidly. Quality 128 MB aftermarket cards are now less than US$100, and prices are falling fast. The upcoming S3 integrated graphics chipsets appear to offer high-end performance at a bargain price. I can imagine very high performance 512 MB integrated video chipsets being available in a couple of years on entry-level (<US$50) motherboards, so the profit margin will be very slim.
They’d better give us specs to make drivers for alternative OSes instead of wasting time on that kind of greedy stuff 🙁
Once again, this shows that real-world benchmarking should be used in place of synthetic benchmarking, or that synthetic benchmarks should be based on real games.
If, for example, ATI and nVidia had been optimizing for Doom 3 as a benchmark target, they could have positively affected the gamer’s experience with that game. Perhaps an external script could control fly-throughs of commercial games with reproducible random elements (one visible, controllable random seed). Companies would have to program their games to be controlled in such a fashion, but being used as a benchmark would be a great sales boost.
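A sketch of the reproducibility part, with everything invented for illustration (a real benchmark would ship its own PRNG rather than use rand(), since the C library’s sequence differs between platforms):

    #include <stdio.h>
    #include <stdlib.h>

    /* One published seed drives every random choice, so any reviewer or
       end user can regenerate the identical camera path. */

    static float frand(void)   /* uniform in [0, 1) */
    {
        return (float)rand() / ((float)RAND_MAX + 1.0f);
    }

    int main(void)
    {
        unsigned int seed = 12345u;   /* the one visible, controllable seed */
        srand(seed);

        float x = 0.0f, y = 2.0f, z = 0.0f;   /* camera position */
        for (int frame = 0; frame < 5; frame++) {
            /* Random-but-reproducible waypoint for this frame. */
            x += frand() * 2.0f - 1.0f;
            z += frand() * 2.0f - 1.0f;
            printf("frame %d: camera at (%.3f, %.3f, %.3f)\n", frame, x, y, z);
        }
        return 0;
    }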
Everyone has always optimized for tests, both real and synthetic. They know the real sales point isn’t the in-game performance but the spec sheets on the reviewers’ sites. Let’s make them optimize for things we’re really going to be playing.
Heh, actually no.
If anyone recalls, back in ’95 there was a big thing going on about how the Matrox cards were grabbing the PCI bus for much longer than the allowed time slice in order to get better bench numbers. This was uncovered when the Ensoniq AudioPCI card came out and the sound was extremely choppy when it was used with a Matrox card. I remember this because I actually bought a beta Ensoniq board to go in a P133 with a Matrox Millennium I.
I’m too lazy to go looking up URLs on this.
Both are guilty. We can’t trust benchmarks anymore. Better to do your own tests with the software you’re actually going to use: databases, games, compilers, app servers. Remember the Van Smith scandal vs. Intel? Take a deep breath…
http://www.vanshardware.com/articles/2001/august/010814_Intel_SysMa…
Reading all of this stuff about vendors cheating on benchmarks, it is interesting and funny to see how far they are willing to go in order to make their hardware/software appear faster. Especially with commodity PC hardware such as video cards, hard disks, etc.
Instead of being puppets of the vendors (accepting engineering samples, etc.), what the hardware sites should be doing is the following:
1. Go out and buy the hardware in question as opposed to getting it from the vendor. This way you avoid having the vendor give you a “juiced” version of the product.
2. Develop a series of real-world benchmarks, built on commercial software, that tax a system, instead of relying on synthetic benchmarks. Make these benchmarks so general that it would be next to impossible for a company to “juice” them in order to get a better score.
Most of the tests are unrealistic to begin with, and I have my doubts about whether the reviewer(s) have been influenced one way or another by vendors (amongst other things). I prefer my own testing, and have ever since I did quality control and process analysis for the Navy.
The only ones I take seriously are the Spec benchmarks because they have to be compiled for the test platform, and the reviewer has to specify what flags, etc. were used to compile the Spec benchmarks or the results cannot be used.
Ok, so you get 7,000,000 FPS in Quake 3 instead of 7,000,001. Is this really the end of the world, or is there more to it?
>> The only ones I take seriously are the Spec benchmarks because they have to be compiled for the test platform, and the reviewer has to specify what flags, etc. were used to compile the Spec benchmarks or the results cannot be used.
Even SPEC is badly written and can be easily cheated:
http://www.aceshardware.com/forum?read=95035033
“You obviously haven’t been following the Spec CPU benchmark in the last few years. Sun’s Forte’ compilers have been performing major algorithm substitution (row <-> column switches including calls to *malloc*.) And if you bother to check you can see that Intel’s compilers have “magically” gotten a lot better at the very same tests that Sun’s have. I.e., probably the two most actively developed compilers right now *both* have algorithm substitution “features”.”
Interesting thread. However, I think the poster you linked to is being a little paranoid. Intel can implement some very high-level optimizations as a result of all the infrastructure it has to support loop vectorization. It also has very powerful whole-program analysis, so it *can* often decide if two bits of code have the same effects. Changing row/column order is actually a technique used in SIMD code, where it is often much more efficient to traverse memory in a specific direction. Since the technique is in Intel’s processor optimization manual, I wouldn’t be surprised if they implemented it. Similarly, it is likely that the Intel compiler can detect that the malloc calls in the loop have no visible effects, and remove those calls or pull them out of loops.
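For what it’s worth, here is the classic form of that row <-> column switch in plain C (my own toy example, nothing to do with the actual SPEC sources):

    #include <stdio.h>

    #define N 1024

    static double a[N][N];

    /* Column-order traversal of a row-major C array: every access jumps
       N * sizeof(double) bytes, which is hard on the cache. */
    double sum_column_order(void)
    {
        double s = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                s += a[i][j];
        return s;
    }

    /* The interchanged loops visit memory sequentially. A compiler that
       can prove the iterations independent may make this row <-> column
       switch itself. */
    double sum_row_order(void)
    {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                s += a[i][j];
        return s;
    }

    int main(void)
    {
        printf("%f %f\n", sum_column_order(), sum_row_order());
        return 0;
    }

Note that with real data the two summation orders can round differently in floating point, which is part of why people get nervous when a compiler applies this kind of transformation silently.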
I’m not saying that Intel doesn’t tailor its optimizations to SPEC; they might very well do that. I’m just pointing out that, from my experience with icc, it has some surprising optimizations that are generally applicable rather than benchmark-specific.
Also, SPEC generally *does* correlate well with CPU performance, so if people are cheating, they’re all doing it to the same degree. The forum implicates the Intel and Sun compilers, which covers the SPARC, Itanium, Pentium 4, and Athlon CPUs.
“Reading all of this stuff about vendors cheating on benchmarks is interesting and funny as to how far they are willing to go to in order to make their hardware/software appear faster. Especially the commodity PC hardware such as video cards, hard disks, etc.”
Anyway, maybe we’re looking at the wrong end. Instead of berating the companies (even though they deserve it), maybe we should look at people focusing unnecessarily on benchmarks and using them to determine a “winner” (whatever that means).
Even motherboard makers like MSI are cheating these days.
http://www17.tomshardware.com/cpu/20030522/index.html
Speed means different things to different people; an example is my 13-year-old daughter. She complains that her computer is slow (a dual-processor Celeron rig with 512 MB of RAM). This is based on her observations of boot time for her machine (Windows 2000) compared to one of mine (an Ultra 30 running Solaris 9). Yes, the Ultra 30 boots faster, but does that mean I am more productive? No.
Some people believe that if they have the absolute “latest and greatest” it will be better for whatever reason. The vendors are marketing to those people as well. I for one prefer to wait until the technology becomes affordable.
There are a number of uneducated computer users out there who believe that the only way to get better performance is to buy brand-new (and expensive) hardware. Don’t worry about tweaking the OS; just buy more hardware and it will get faster.
To respond to maruiz: if you doubt the native compiler, download and install GCC. However, SPEC has very specific rules on how to use their benchmarks (see http://www.spec.org/hpc2002/docs/runrules.html for an example), which is why it is NOT widely employed. And if I am paying the big bucks (over $1,000.00 for Sun Studio), I want those optimizations.
The problem I have with benchmarks of computers in general is that the test environment is not documented well enough in most cases (OS tweaks, driver versions, etc.), which means that if I were to assemble like hardware and run the same tests, the results would be far different. The difference in the results would not necessarily be attributable to minor differences in hardware, but to configuration, power, time of day, etc. The hardware sites would have to consume page after page on configuration rather than pages of Sandra graphs. I am looking at it from a scientific level, and so should they.
That’s nothing more or less than capitalism. We cheat to get ahead. Anything for the money.
My only hope is that one day someone will cheat so badly that EVERYONE loses their nest egg and learns the hard way. Anything short of that will only prolong the inevitable: that one day we’ll manufacture robots and replace our employees/slaves with machines. What will that mean for the employees/slaves? Who cares? Profit!
God, why do people confuse cheating with Capitalism!! Capitalism is not about cheating! Capitalism is about running a fair race and winning on your own merits. Capitalism needs rules to work properly; otherwise you end up like Russia, which looks like a bad episode of the Sopranos! That is not Capitalism at all. Yes, you want to make money (aka win the race), but you can’t, and should not, do it by breaking the rules!
O.K., there are lots of ways to “cheat” or enhance one’s outcome. If you wear lipstick, are you cheating? If you get a boob job, are you cheating? How about if you’re a guy but you’re dressing like a woman? Think about where you draw the line.
“O.K., there are lots of ways to ‘cheat’ or enhance one’s outcome. If you wear lipstick, are you cheating? If you get a boob job, are you cheating? How about if you’re a guy but you’re dressing like a woman? Think about where you draw the line.”
Tack this on: are you causing harm to others by doing so? The line isn’t that hard to draw, unless one is trying to avoid the consequences.
And were 3DFX still in the market, they’d cheat too.
I’m actually not so sure the chipmakers are really to blame this once: 3DMark has been actively marketed by Futuremark and its predecessors as a benchmarking _Game_, complete with the shiny online-resultsbrowser-who-has-the-biggest-dong^Wscore-thingy. So the nVidia and ATI driver guys do what’s in their job description: optimize their software for products popular with end users.
Maybe the whole outsourcing-benchmark-suites-to-the-commercial-software-market idea wasn’t such a brilliant one after all, dear editors and publishers of tech magazines.
“And were 3DFX still in the market, they’d cheat too.”
No, you don’t want 32-bit colour, it’s too slow. What you really want, though, is motion-blur. See, like the movies, it’s cool, everyone will use it.
(/sarcasm) (and yes, I do realize some people have been using motion blur recently, especially on consoles, frankly the effect makes me sick)
Actually, 3dfx was the first company I thought of when I read (it seems like a long time ago now) that ATI had been dropping back to 16-bit colour in Quake 3 when the user specified 32-bit colour, because, obviously, it made the benchmarks look good.
From what I read, 3DMark was doing some rather odd things in order to push the cards harder than they would be pushed in normal use (purging buffers to force the card to re-render the same objects, for instance). While this is perfectly legitimate for benchmarking a card (as long as the purge wasn’t written to function only on certain cards), it should be obvious to most people that a developer looking for good performance isn’t going to purge the card’s RAM completely on every pass if they can reuse anything (whether it’s most of the scene or a handful of polygons and textures) without compromising the quality of the rendering.
I’m not trying to say that what nVidia did was OK. They should be working on optimizing their hardware, not making their drivers ignore or substitute operations to drive numbers up without boosting actual performance (and, assuming the claims are correct, what they did caused some degradation in the rendering of the scenes in the test). They can make their functions more efficient, and they can work with developers so that games use their cards more efficiently, but if you don’t like the way a program uses your card, you don’t start messing with the drivers so that one call means something different in this program than it does in another.
All of that said, synthetic benchmarks for 3D graphics have always been pretty sad tests of a card’s capabilities. It’s only gotten worse since everyone seems to have agreed to use this one synthetic benchmark rather than developing their own. When a graphics card company has to pay to work with the developers of the benchmark, it only makes things worse. At least when card companies work with the developers of popular games, it’s good for both parties (and let’s not forget the end users, who just want the game they bought to work with the card they bought). End users don’t gain anything from benchmark software unless it’s widely used and works to get what it can from each card, and even then only if they pay attention to the numbers it generates and take the time to understand what they mean (or at least think they do).