'Killer' Network Card Actually Reduces Latency
Posted by
Zonk
on Sat Dec 09, 2006 05:20 PM
from the i'll-be-a-monkey's-uncle dept.
fatduck writes "HardOCP has published a review of the KillerNIC network card from Bigfoot Networks. The piece examines benchmarks of the product in online gaming and a number of user experiences. The product features a 'Network Processing Unit' or NPU, among other acronyms, which promise to drastically reduce latency in online games. Too good to be true? The card also sports a hefty price tag of $250." From the article: "The Killer NIC does exactly what it is advertised to do. It will lower your pings and very likely give you marginally better framerates in real world gaming scenarios. The Killer NIC is not for everyone as it is extremely expensive in this day and age of "free" onboard NICs. There are very likely other upgrades you can make to your computer for the same investment that will give you more in return. Some gamers will see a benefit while others do not. Hardcore deathmatchers are likely to feel the Killer NIC advantages while the middle-of-the road player will not be fine tuned enough to benefit from the experience. Certainly though, the hardcore online gamer is exactly who this product is targeted at."
Related Stories
Technology: Killer NIC K1 and Custom BitTorrent Client Tested 106 comments
NetworkingNed writes "The new Killer NIC K1 is the successor to the much debated original Killer NIC card that offers the same features at a lower price: this time for about $170 or so. Not cheap, that's for sure. But in this review at PC Perspective, not only is the new card tested under the drastically updated Vista networking stack with improved results, but the free BitTorrent client that runs on the Killer NIC is reviewed as well; with it you should be able to download torrents without affecting online gaming performance. Enough to warrant a $175 network card?"
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
correct me if I'm wrong... (Score:4, Interesting)
A killer NIC? LOL what a phrase... Aren't there several of these Nicolas guys in jail already? right next to the killer Bobs and killer Joes.... sheesh
Re:correct me if I'm wrong... (Score:5, Funny)
Parent
Re: (Score:3, Funny)
Re:correct me if I'm wrong... (Score:4, Funny)
Parent
Re:correct me if I'm wrong... (Score:5, Informative)
A better design would be to have networked data in well-defined regions which the card can DMA directly out of and into. The PCI bus can handle a 4K transfer as an atomic operation on a single channel, so a 4 channel PCI card can simultaneously send and receive streams of jumbo packets without requiring any CPU intervention. The driver would merely need to be passed two lists of physical page pointers - one for inputs, one for outputs - for each of the open connections and to pass back signals from the board that an entire packet and/or message had been uploaded for a given connection.
(Jumbo packets can go up to 8K, and you can do 8K over two PCI channels, so that's a good unit to be working with.)
The next improvement that can be made to such a system is to improve the buffering. The network will be slower than the computer, almost always, so being able to queue up multiple packets for sending is a Good Thing. Filtering out packets that have not been sent but are no longer worth sending can be done entirely in parallel and does not require anything extra at the end of the pipeline. Not all packets are of equal value, though. You want to deliver the information in the order that will give the best possible benefit - which may not be the order in which the program generates the traffic. Hierarchical Fair Service Curve, Class Based Queueing, and a bunch of other similar techniques, have been developed to fix exactly that sort of problem.
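A minimal sketch of the queueing idea in the paragraph above, in Python: packets carry a priority and a deadline, the queue hands out the most valuable packet first, and anything whose deadline has passed is dropped instead of sent. The names `SendQueue`, `QueuedPacket`, `priority` and `deadline` are made up for illustration. This is a plain priority heap, not an implementation of Hierarchical Fair Service Curve or Class Based Queueing, which additionally share bandwidth between traffic classes.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedPacket:
    priority: int                             # lower number = more valuable, sent sooner
    seq: int                                  # tie-breaker: FIFO order within one priority
    deadline: float = field(compare=False)    # time after which sending is pointless
    payload: bytes = field(compare=False)


class SendQueue:
    """Hypothetical outgoing queue: best packet first, stale packets dropped."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()

    def push(self, payload: bytes, priority: int, deadline: float) -> None:
        heapq.heappush(
            self._heap, QueuedPacket(priority, next(self._seq), deadline, payload)
        )

    def pop(self, now: float):
        """Return the next payload still worth sending, or None if the queue is empty."""
        while self._heap:
            pkt = heapq.heappop(self._heap)
            if pkt.deadline >= now:
                return pkt.payload
            # Deadline already passed: drop the packet and look at the next one.
        return None


queue = SendQueue()
queue.push(b"chat message", priority=5, deadline=10.0)
queue.push(b"old position", priority=1, deadline=1.0)
queue.push(b"new position", priority=1, deadline=10.0)
first = queue.pop(now=2.0)    # the old position update is stale and is skipped
second = queue.pop(now=2.0)
```

With these inputs the stale position update never reaches the wire, the fresh one goes out first, and the low-priority chat message follows even though it was queued earliest.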
If the ISPs would stop being so bloody stupid, you could also enable protocols such as your basic multicasting (for your UDP stuff) and Scalable Reliable Multicast (for the stuff that needs to reliably get through). That hacks, slashes, butchers and roasts (with just a hint of parsley) problems associated with sending identical state information to multiple end-points.
Bear in mind that Myrinet, Dolphinics, and a bunch of other vendors, use essentially the above mechanisms already and are achieving latencies in the region of 2.5 - 3 microseconds. I say essentially, because I'm not convinced they've optimized quite to the degree I'm suggesting - reliable, scalable multicast RDMA isn't something you'll see a lot of even at a supercomputer fair. True, you're not getting that kind of latency over the Internet whatever you do, but if you can achieve a hard real-time guarantee of 3 microsecond delivery in a LAN party, you WILL notice a difference. At the very least, in the door price, which will now be expressed in exponential notation to fit on the door.
Parent
Re:correct me if I'm wrong... (Score:5, Interesting)
1) bunch of blah and stuff about memory. Since your explanation is memory->application->CPU->kernel memory->protocol stack->CPU memory->NIC driver->bus (basically, it was hard to follow with all the fud), you obviously have no idea how an OS works (I can't think of any modern, common OS's that have such a path). None of this happens as you describe, they are all parts, but the flow is nothing like you describe. See LKML for 2.6 on network programming if you want to see how this works on Linux, which is relatively transparent http://lkml.org/lkml/2005/5/17/78 [lkml.org] also you can look at BSD.
2) The PCI bus is irrelevant for gigabit Ethernet (about the only network controller still commonly in production; legacy 10/100 is more common in the field but almost out of production), and totally irrelevant for faster types (10GE, Myrinet or InfiniBand). The 32-bit PCI bus tops out at about gigabit speed, and that bandwidth is shared with everything else on the bus, so it is suboptimal:
http://www.codepedia.com/1/PCI+BUS [codepedia.com]
PCI-X and gigabit controllers directly off the Controller chipsets is how networking is mostly done now.
3) blah blah, network slower than computers (ridiculous; it depends entirely on the network and the computer. In consumer computers it swings like a pendulum: when 100Mb came out most of the stuff in the PC couldn't keep up, and it was faster to install over the network than from CD-ROM because the CD drive was slower. It is going through that again with gigabit: most consumer PCs' disk systems can't even approach filling gigabit). Then some conflation about what QoS and policing can do... QoS only helps if the pipe is full:
http://en.wikipedia.org/wiki/Quality_of_service [wikipedia.org]
or
http://www.cisco.com/univercd/cc/td/doc/cisintwk/
4) ISP and stupidity. ISP's may or may not be stupid. They are driven by market forces and the market force is people don't currently want to pay for a tiered service class internet. When they do, they will offer it. Technically it has been feasible for years. Read NANOG mailing list, you will see they are not stupid, but instead are in a low margin business.
5) blah blah blah, microsecond delay, distinguishable from millisecond via a consumer computer with a common OS by a person?? hahahahah. not without a measuring device. It is possible with enough training (I suppose musicians can). Since you can buy commodity off the shelf lan gear that will turn in sub millisecond delay, I don't think spending the extra money on low microsecond delay will help.
Bunch of pseudo-science modded up on Slash again...
Oh and Jumbo FRAMES are commonly 9000B in size (although the term can refer to anything bigger than 1500B):
http://sd.wareonearth.com/~phil/net/jumbo/ [wareonearth.com]
or 9K on cisco:
http://www.cisco.com/warp/public/473/148.html [cisco.com]
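As a rough check of the bus arithmetic in point 2 above, here is a small Python calculation. It assumes the classic 32-bit, 33 MHz PCI variant and compares theoretical peak figures only; real throughput is lower because of arbitration and protocol overhead, and the function and constant names are illustrative.

```python
def peak_mbit_per_s(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus: one transfer of `width_bits` per clock."""
    return width_bits * clock_mhz


PCI_32_33 = peak_mbit_per_s(32, 33.33)   # classic PCI, roughly 133 MB/s
GIGE_ONE_WAY = 1000.0                    # gigabit Ethernet, one direction
GIGE_FULL_DUPLEX = 2 * GIGE_ONE_WAY      # sending and receiving at line rate

print(f"PCI 32-bit/33 MHz peak : {PCI_32_33:7.0f} Mbit/s")
print(f"GigE, one direction    : {GIGE_ONE_WAY:7.0f} Mbit/s")
print(f"GigE, full duplex      : {GIGE_FULL_DUPLEX:7.0f} Mbit/s")
```

On paper the shared bus can just about carry one direction of gigabit traffic, but not full duplex at line rate, and that is before any other PCI device takes its share.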
Parent
ISPs aren't "bloody stupid" about multicast (Score:4, Interesting)
Network-layer, Deering-model multicast is never going to happen. It has nothing to do with ISP business models and everything to do with simple technical feasibility:
There isn't even an agreement among protocol designers about what multicast is supposed to accomplish anymore. BitTorrent is taking a lot of the steam out of it; so are unicast solutions to streaming media that prove that multicast is inessential. Multicast gets used tactically inside of some networks, but if you're on the same LAN as your other players, the network is already plenty fast for gaming even with unicast.
Forget about multicast.
Parent
Re: (Score:3, Informative)
Re:correct me if I'm wrong... (Score:4, Insightful)
Parent
Re:correct me if I'm wrong... (Score:5, Insightful)
This is an "emperor with no clothes" thing - if you can't tell the difference, you must not be an experienced gamer. Since I'm an experienced gamer, I can tell the difference. HORSE PUCKY, boy!
Naw, latency is an easily measured and quantified number and evidently this card does lower your latency somewhat.
How much that "somewhat" is noticeable is debatable. For those spending $bucks a month for high-speed internet for their $buckbucksbucks gaming rig, a crappy NIC is going to be rather bothersome. Go talk to a rabid "knife-makes-you-run-faster" CounterStrike player and ask him about the importance of latency.
But, for the rest of us, a NIC isn't really a bottleneck and onboard/generic PCI NICs do just fine. It's not "noticeable" enough.
Think of it as "online gamer viagra" - lower your ping by 5 ms!
Parent
Not exactly. (Score:5, Informative)
The $500 shoes worn by the professional will not be the same as the $500 shoes purchased by the average person. For one thing, the professional is paying for the technology and customization. The average person is paying for the marketing and endorsements.
That being said, the professional would NOT compare two shoes provided by a shoe company and "tested" on their own track.
S/He would compare them to his/her CURRENT favourite shoes on his/her current training track.
And that is where every single one of these KillerNIC "reviews" fails. It is not that difficult to swap a NIC. Yet the "testing machines" are always different. And none of the "reviewers" seem to be able to script a game. Or set up a test network with a test game server.
The "professional" in this case would set up a test network, with a test game server and a sniffer to see what is happening "on the wire" and script the game on his/her favourite machine with his/her current NIC.
Then the "professional" would swap the NICs and re-set everything and run the script to see what difference/improvements there were.
It's not that difficult and it's not that expensive and yet not a single "review" of this "KillerNIC" seems to be able to do that.
Sure, you can pay $500 for shoes that were hand stitched by virgins under the light of a full moon with thread blessed by the Pope. And they may perform better than this other pair of shoes I'll give you to run in.
But in the end, you'd still be paying for the marketing of un-tested technology.
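The scripted comparison described in this comment could start from something like the Python sketch below: a UDP echo server standing in for the test game server, and a probe script that records round-trip times. You would run the probe once per NIC on the same machine and compare the summaries. Everything here is an assumption for illustration (the function names, the 4-byte probe format, the sample count); it measures a trivial echo, not a real game protocol, and does not replace a sniffer on the wire.

```python
import socket
import statistics
import threading
import time


def run_echo_server(sock: socket.socket, stop: threading.Event) -> None:
    """Echo every datagram back to its sender until `stop` is set."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, addr = sock.recvfrom(2048)
        except socket.timeout:
            continue
        sock.sendto(data, addr)


def measure_rtts(server_addr, samples: int = 50, timeout: float = 1.0) -> list:
    """Send numbered probes to the echo server; return round-trip times in ms."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout)
        for i in range(samples):
            probe = i.to_bytes(4, "big")
            start = time.perf_counter()
            client.sendto(probe, server_addr)
            try:
                reply, _ = client.recvfrom(2048)
            except socket.timeout:
                continue  # lost probe: not counted
            if reply == probe:
                rtts.append((time.perf_counter() - start) * 1000.0)
    return rtts


def summarize(rtts: list) -> dict:
    """Reduce a list of round-trip times to a few comparable numbers."""
    if not rtts:
        return {"n": 0}
    return {"n": len(rtts), "median_ms": statistics.median(rtts), "max_ms": max(rtts)}


def loopback_demo() -> dict:
    """Run server and probe in one process over 127.0.0.1, as a smoke test only."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    stop = threading.Event()
    thread = threading.Thread(target=run_echo_server, args=(server, stop))
    thread.start()
    try:
        return summarize(measure_rtts(server.getsockname()))
    finally:
        stop.set()
        thread.join()
        server.close()
```

For an actual comparison the echo server would run on a separate box on the test network, and `measure_rtts` would be pointed at its address before and after swapping the card, with nothing else on the machine changed.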
Parent
Re:Not exactly. (Score:5, Funny)
Paragraphs are fun.
And a great way to break up thoughts.
But each sentence,
does not need its own
paragraph.
Parent
Re:Not exactly. (Score:4, Funny)
you are making
poetry
Parent
Re:Not exactly. (Score:5, Funny)
Parent
All I can say is... (Score:3, Funny)
(sorry for mangling the PA quote)
Seriously, $200? WTF?
ONLY useful for world class gamers/richie rich (Score:5, Insightful)
These kinds of "professional" gamers could use a fancy NIC with lower times. Or if you're Richie Rich and you need some extras for your already pimped out gaming rig.
Parent
Check their home machines. (Score:5, Insightful)
In all the "reviews" of this that get posted here, I notice a few recurring items.
One of the most interesting to me is that they want the "gamers" to test the NIC as part of their entire box. But the real gamers would already have a box built to their specs that they were familiar with.
Yet the "gamers" never seem to insist that they be allowed to compare the KillerNIC in their own box, against their existing NIC. And if they're serious gamers, they've already spent money replacing the on-board NIC if their motherboard came with it.
Kind of like if a tire company wants you to like new tires, but they won't let you drive them on your own car. You have to use their car. And you have to compare it to a different car that they have without the tires. And people accept that.
Under those conditions, I can show you improved ping times using nothing more than cool stickers for your case.
Parent
Been done before (Score:5, Funny)
http://www.fiftythree.org/etherkiller/ [fiftythree.org]
Game studio did their own test... (Score:5, Informative)
> killer NIC. bad.
> file transfers = 1/4 of the speed of a normal NIC
> the drivers are fucking TERRIBLE to install/uninstall/update, you have to reboot. then it'll let you do whate you need to do. then reboot AGAIN...
> and when it does start working, there is literally no difference in either framerate or ping, even on the games they say it specifically improves
This wasn't entirely scientific... (Score:5, Insightful)
If applicable, what are the settings for the onboard NICs being tested? Many have options for various CPU offload settings and optimizations for throughput or CPU usage.
Until we see these, how can we be sure if a high-end regular PCI-e NIC won't work just as well?
Mod parent up! (Score:5, Insightful)
From TFA:
That is just idiotic.
If you aren't going to do it right, then you are doing it WRONG. So it did NOT "reflect what would happen in real world gaming situations".
Again, you script it. You do not play it.
I'll give the KillerNIC people this, they certainly know how to pick their suckers.
Seriously. They didn't even bring their own PC's? They used the "testing machines" provided for them. And they think this has anything to do with "real world" performance?
A far, far better test, even under these biased conditions, would have been for them to use their own PC's. It cannot be that difficult to swap a NIC, can it?
In a blind taste test, more people preferred Coke over the Pepsi that I had previously pissed in.
For some strange reason, all I ever see in these "reviews" are the KillerNIC people insisting that the games be run on THEIR machines. And people who are "reviewing" it accepting this strange requirement. And not even scripting it so that they can compare it with their home machines.
Parent
The more interesting thing (Score:3, Interesting)
Why? (Score:5, Funny)
Hardware Engineer: Uh, yeah. We were about to offload them on eBay to make room in our closet...
Business Development Guy: No! Let's tack on some parts from China and sell them! We'll call it, hmm.. the Killer NIC! Since no one wants to buy NIC cards, we'll overprice them for no apparent reason! $250 a pop!
* Hardware Engineer bashes his forehead on the desk.
Hardware Engineer: You've got to be kidding me. Isn't that, like, fraud?
Business Development Guy: Not at all. We'll just never say how it works, only that it works. The processor will be for like, decorative purposes. Consumers love that kind of stuff!
Snake oil (Score:4, Insightful)
Latency is 99 percent due to delays over the Internet, not anything that happens on your local machine. What does this card do, sprinkle magic fairy dust over packets so they go faster through the wire?
This reminds me of gold-plated power cords for sound systems. Guaranteed to create richer, deeper sound!
Holy shit (Score:5, Insightful)
Designed to separate fools from their money (Score:3, Insightful)
Anandtech == Better Review (Score:5, Informative)
Where are the benchmarks? (Score:5, Insightful)
I don't see any benchmarks in that article. Here are some, [extremetech.com], and they don't make the thing look all that impressive.
The only benefit in this thing, apparently, is that, for games which make too many "select()" polls, there's a faster no-data return. This is really a bug in the game, which ought to be multi-threaded by now. As games are revised for multi-core systems, this problem had better go away. In fact, it probably will go away in Vista, which has a multithreaded network stack.
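For readers unfamiliar with the pattern being criticized, this is roughly what a zero-timeout `select()` poll looks like, sketched in Python. A single-threaded game would call something like the hypothetical `poll_once` every pass through its main loop, and most calls return with no data. The socket setup and names are illustrative and say nothing about how any particular game or the Killer NIC driver is actually written.

```python
import select
import socket

# A UDP socket bound to loopback stands in for the game's network socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("127.0.0.1", 0))


def poll_once(s: socket.socket):
    """Non-blocking check: return one datagram if available, else None."""
    readable, _, _ = select.select([s], [], [], 0)  # timeout 0 means "do not wait"
    if not readable:
        return None  # the common "no data" case
    data, _addr = s.recvfrom(2048)
    return data


empty_result = poll_once(sock)  # nothing has been sent yet
```

The multi-threaded alternative the comment has in mind would instead block in a dedicated network thread and hand packets to the game loop through a queue, so the cost of the empty poll disappears.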
And the question I'd have (Score:3, Insightful)
Comparing it only to cheap onboard NICs really isn't useful. I
Re: (Score:3, Informative)
There's not much to offload in UDP. The checksums are about it.
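To give a sense of how little work that is, here is the 16-bit ones' complement Internet checksum (the algorithm described in RFC 1071, which UDP uses) as a short Python function. A real UDP checksum is computed over a pseudo-header plus the UDP header and payload; this sketch only shows the arithmetic over whatever bytes it is given, and the function name is illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                    # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                     # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Worked example bytes from RFC 1071: 00 01 f2 03 f4 f5 f6 f7
example = internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))
print(hex(example))  # 0x220d
```

It is one pass of additions over the packet, which is why offloading it saves comparatively little on a desktop CPU.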
stupid suggestion... (Score:3, Insightful)
Just get an Intel NIC (Score:3, Informative)
Pfft... (Score:3, Funny)
Game Servers (Score:4, Informative)
Many games have their own interesting capabilities for performance tuning. For instance Counter Strike 1.6 has the -pingboost setting which will switch between select() and alarm() syscalls (10 ms reduction) or processing a frame for every packet. Other games have similar tuning options that will enhance performance. Then there's also tuning your network settings.
By the way, as far as I remember this Killer NIC is just some kind of offload engine. How *exactly* does this increase performance when most game specific packets are simple UDP packets that performance-wise are not as demanding as TCP packets (less checksums, no window scaling and other options easily tunable etc.)?
Different types of latency (Score:3, Interesting)
I wrote a client/server app that had to deal with a ridiculous amount of information about hundreds of entities moving around the screen. I found the most efficient way to keep messages being processed was to lock the framerate at 30fps and drop frames if that rate could not be maintained. When a frame is dropped, the only thing that doesn't happen is that a frame doesn't get rendered. Suddenly the main loop is running at thousands of iterations per second clearing out messages from the queue and processing them because it doesn't have to render a frame for a few ms. 30 ms of focused message processing will reduce lag significantly.
If I put the emphasis on rendering frames per second the message queue would back up and eventually the app would crash because the buffer was filling up faster than it could empty it.
Maybe instead of focusing on rendered frames per second, people should be putting more emphasis on iterations per second and getting those messages processed. At 100 fps that gives 10ms to render a frame, process all the waiting messages, and perform game logic. Good luck with that. 10ms is barely enough time to just render a frame.
I bet gamers would have a better on-line experience if they'd lock the rendered frame rate to free up more processing power to handle packets. However, I don't think any modern games allow that. Locking the frame rate typically means locking the entire game processing loop and that's stupid and unnecessary. It is possible to not render a frame but still do everything else.
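The loop structure described in this comment can be sketched as follows in Python: messages are drained on every iteration, while rendering is capped at 30 fps and missed frames are dropped rather than caught up. This is a reconstruction from the description, not the poster's code; `game_loop`, `handle`, `render`, `clock` and `run_for` are assumed names, and the clock is injectable so the loop can be exercised without real time passing.

```python
import time
from collections import deque

FRAME_INTERVAL = 1.0 / 30.0  # render cap of 30 fps, as in the comment


def game_loop(inbox: deque, handle, render, clock=time.perf_counter, run_for=1.0):
    """Run for `run_for` seconds; return (loop iterations, frames rendered)."""
    start = clock()
    next_frame = start
    iterations = 0
    frames = 0
    while clock() - start < run_for:
        # Network messages are processed on every pass, rendered frame or not.
        while inbox:
            handle(inbox.popleft())
        # (game logic would also run here on every pass)
        now = clock()
        if now >= next_frame:
            render()
            frames += 1
            next_frame += FRAME_INTERVAL
            if next_frame < now:
                # Fell behind: drop the missed frames instead of rendering them late.
                next_frame = now + FRAME_INTERVAL
        iterations += 1
    return iterations, frames
```

The point of the design is that `iterations` can be far larger than `frames`: the message queue is emptied many times between two rendered frames, so it cannot back up just because rendering is slow.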
no data, only tell, bug FUD story (Score:3)
The story only talks about "perceived" improvements, purely from a person's perspective while playing.
This is the worst story ever. No factual data has been given to support the writer's opinion.
A human being is unlikely to perceive the difference between a 50 and a 60ms ping. Only the human ear can distinguish events that close together in time, but I doubt that even an experienced gamer would be able to tell how high his ping is even in clean situations.
Why not do a double-blind test with multiple test subjects? That would have been at least a fair discussion of how people perceive performance in a marginal field like this.
This article is horrible, absolutely rock-bottom. What a FUD.
Lowering pings (Score:5, Interesting)
What almost no one knew was that the mod API allowed you to simply edit those values on the fly.
Re:How ... (Score:5, Insightful)
Parent
Probably not... (Score:3, Interesting)
An ICMP echo reply is totally different though. Unless you have a weird firewall setup going on, it's pretty much just safe to send out the echo response as soon as you get the echo request. So in this situation, you could peg the main CPU and th
No different from any other decent server NIC (Score:4, Insightful)
Of course they also need to be running 15,000rpm SCSI drives on a decent SCSI HBA as well as a top of the line CPU and loads of RAM and top of the range graphics card.
Parent
Re:No different from any other decent server NIC (Score:5, Informative)
Parent
Re: (Score:3, Interesting)
Here are two datapoints. A $10 PCI NIC, and a $100 mobo I bought lately (with an integrated NIC) feature checksum offloading. They are both GBit, so I guess you get that for free on any GBit NIC nowadays.
Other than that, I really don't see how a NIC can decrease latencies. The latency of that first hop off your computer is below 1ms anyways.
Re:How ... (Score:4, Informative)
Of course, all this presumes you're running windows to begin with, so it's useless to me. But for those so afflicted, perhaps a small subset are hyper sensitive to lag enough to make the thing be worth it. And it sounds interesting to me for other reasons - couldn't you stick a hard drive in directly attached to it and run bittorrent on your NIC?
Parent
Re:How ... (Score:5, Insightful)
You'd be surprised what marketers can do.
Parent
Re: (Score:3, Funny)
Re:Bashing (Score:5, Funny)
Parent
Re:Bashing (Score:5, Funny)
Clearly you didn't read the review. The card works.
Maybe you shouldn't accept everything you read on wikipedia as scientific fact.
Parent
Re:Bashing (Score:5, Funny)
Parent
Re:I waited long enough (Score:4, Funny)
Parent
Re: (Score:3, Insightful)