Hardware

TCP Equipped Ethernet Card

Josh Baugher writes "A 100 megabit Ethernet card with a TCP/IP stack built in. They claim to be able to do 9 megabytes/second with only 2% CPU load (compared to 4.5 megabytes/second at 98% receiving CPU load using Windows NT's TCP/IP). Read about this on the "geeks" mailing list."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward
    I'm using Samba between Windows 98 and Linux 2.2.6, and my speeds are significantly lower than that (~100 kb/s). I have a decent 3Com card in the Linux box and a cheapo Linksys card in the Windows box, along with a Linksys hub. Is there anything I can tweak to make it go faster?
  • by Anonymous Coward
    I never thought CPU load was an issue. If 100 Mbit network cards have that sort of CPU load, then what's the go with gigabit Ethernet?
    Maybe they have some weirdo network cards that don't use DMA.
    I think I'll stick with my 10Mbit NIC for now.
  • by Anonymous Coward
    There are at least a half dozen companies developing/offering routers on silicon. Most have highly parallel and exotic (e.g. hypercube) architectures with hardwired advanced pattern matching algorithms. The future of TCP/IP is in dedicated hardware circuits.
  • by Anonymous Coward
    Banging bits out an ISA bus isn't exactly a fast process.
  • by Anonymous Coward
    OK, I've got two computers with SMC EtherPower II 10/100 cards running under Linux 2.2.4, connected by a crossover cable in full-duplex mode, with little other traffic, i.e. ideal conditions.

    host 1: nc -l -p 5050 > /dev/null
    host 2: dd if=/dev/zero of=/proc/self/fd/1 bs=4096 count=102400 | time nc utrk 5050 -w1

    output from host 2:
    102400+0 records in
    102400+0 records out
    0.70user 7.70system 0:46.10elapsed 18%CPU (0avgtext+0avgdata 0maxresident)k
    0inputs+0outputs (140major+22minor)pagefaults 0swaps

    since netcat had a 1 sec delay, we get a total of 45 seconds for the transfer of 419430400 bytes = 9320675 bytes/sec = about what they are quoting. CPU load was higher than 2%, obviously (actually, about 25%), but the computer was still usable (it is the disk IO that would kill it, anyway)
  • by Anonymous Coward
    Yeah, this is in fact one of the "other uses" that Toshiba is planning for the chip that is going to be used in the "Playstation II"
  • by Anonymous Coward
    Um, just wanted to point out that the protocol Samba uses specifies that, for most machines out there, you can only send 65,535 bytes in one go. And then you have to make sure that the other side heard you. So, FTP will always be faster than Samba.

  • by Anonymous Coward
    Remember to keep in mind the difference between bits and bytes. While it's probably true that you don't _notice_ a performance hit at 900K/s, what you really mean is 900 kilobits/s. That's nearly 80 times less than 9 megabytes/s. So if the kernel spends 1% of CPU time on 900 kb/s, it's easy to see why it might spend a whole lot more time on 80 times more data.

    My two cents.
  • by Anonymous Coward
    Oh yeah, interested people should take a look at

    D. Clark, V. Jacobson, J. Romkey, and H. Salwen, "An Analysis of TCP Processing Overhead," IEEE Communications Magazine, Vol. 27, No. 6, pp. 23-29, June 1989.

    Classic paper! (well, kinda)
  • by Anonymous Coward
    You must be *very careful* assessing the load on a Linux system, as the measurement of the load is far from optimal or complete!
    The "load average" you see displayed in the output of the "w" or "uptime" commands is quite useless in this case, it shows the average number of user processes that are "ready to run", i.e. they would take the processor if it were available. This number in no way reflects the load by interrupt handlers. It is also badly computed, as it includes processes that are waiting for the disk.
    (this error was introduced into the kernel long, long ago and I have mailed Linus about it, but he did not want to change it as this figure better reflected the general feel of load on the system, in his opinion. In my opinion, it makes the whole "load average" useless, so I always patch this away when I compile a kernel)

    The numbers in /proc/stat are also useless for the kind of tests required in this case, because the load imposed by interrupt handling is added to the process that happens to be running when the interrupt comes in. When the interrupt comes when the idle process is active, the interrupt handling time is counted as idle time.
    This means that when a test is run that does not use much user-space processing and relies heavily on interrupt handling (like a networking test where a test process sends useless data over a TCP socket! the same goes for a serial port), the load results obtained from Linux will be far off the realistic load on the processor.
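
    For anyone who wants to see this for themselves, here is a minimal sketch that samples /proc/stat before and after a test window and prints the split. It assumes the classic first line "cpu user nice system idle" (values in jiffies), and of course it inherits exactly the misattribution problem described above:

    /* cpusplit.c - sample /proc/stat before and after a 10 s window.
     * Sketch only: interrupt time still lands in whatever bucket was
     * running when the interrupt arrived, which is the whole point. */
    #include <stdio.h>
    #include <unistd.h>

    static int read_cpu(long v[4])
    {
        FILE *f = fopen("/proc/stat", "r");
        int n;
        if (!f)
            return -1;
        n = fscanf(f, "cpu %ld %ld %ld %ld", &v[0], &v[1], &v[2], &v[3]);
        fclose(f);
        return n == 4 ? 0 : -1;
    }

    int main(void)
    {
        long a[4], b[4], d[4], total = 0;
        int i;
        if (read_cpu(a))
            return 1;
        sleep(10);              /* run the network test during this window */
        if (read_cpu(b))
            return 1;
        for (i = 0; i < 4; i++) {
            d[i] = b[i] - a[i];
            total += d[i];
        }
        printf("user %ld%%  nice %ld%%  system %ld%%  idle %ld%%\n",
               100 * d[0] / total, 100 * d[1] / total,
               100 * d[2] / total, 100 * d[3] / total);
        return 0;
    }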

    Rob
  • by Anonymous Coward on Saturday May 08, 1999 @06:45PM (#1899615)
    Although writing a driver would be fairly easy, this would break all sorts of features that have come into existence due to open source protocol stacks (not being able to do ipchains stuff to an external IP stack is a definite step backwards).

    They would either need to open source the stack and make it downloadable from an OSS driver (like some of the SCSI cards out there) or the card will never get within 10' of my boxen.

    Someone already pointed out the security implications. Personally, I don't want to be yanking a card in and out of production just because someone built the next teardrop and the vendor is slow to fix it.

    As for NT, well, this is obviously the tack that MS is pursuing to gain equal performance with other OSS operating systems, but it has certain implications that will keep NT in the second fiddle chair. The card will probably weaken NT security initially by breaking fixes that are covered by current hotfixes and service patches. Additionally, I can think of no better way to fill up someone's disks than by having improved transfer rates across the net, operating system and disks. Worm designers should rejoice as well. Now, you can design worms that can consume more resources without being noticed. If you're really tricky-trick you could design a worm that existed only within the context of the TCP/IP stack and if the board has NVRAM... well, a box could stay compromised for years. How long before a Microsoft Weapons System sees daylight?


    -- "Most decently written TCP/IP stack applications have NO buffer overrun problems" - an anonymous programmer at Fort Mead

    -- "ALL TCP/IP stack applications are a long way from being mathmatically correct." -- A mathmatician's retort.

    -- "Our job is to find the differences between one and the other and keep this information from the public as long as possible." -- A manager who successfully defused the situation..
  • by Anonymous Coward on Saturday May 08, 1999 @04:06PM (#1899616)
    Heh, the site's been slashdotted; they should use their own cards or something... :)

    Moving the stack into hardware is an interesting idea, though. Unfortunately it has some negative (and admittedly positive) implications for those concerned about security.

    First, it will be impossible to tell what operating system a computer is running by using TCP fingerprinting. This is both good and bad in that it will thwart script kiddies to some extent by not revealing the platform, thus making it more difficult to take advantage of well known exploits. On the other hand things like Netcraft and the Internet OS counter will also not be able to take surveys properly.

    Second, and entirely negative, is the possibility that their hardware implementation of TCP/IP may be sub-standard. It may have scads of DOS loopholes and other weaknesses. Unless they make the thing software upgradeable as holes are found, and make the software Open Source, I don't see it gaining much marketshare against the cheap and plentiful cards we have now.
  • I submitted the news item, and thought it to be a bit overwhelming to cite more than one source. What is the appropriate limit as to how many sources to cite?

    I think going 1 level down is fair -- I got the item from 1 source, the geeks list.

    What if an item popped up on the geeks list that came from site1 that came from site2 that came from site3...?

    Should geeks, site1, site2, site3 ALL be cited?

    Opinions?
  • I've managed to max out the ethernet card a few times with ftp transfers between my Linux box and my FreeBSD box at home. I would certainly be complaining if I was only getting 900 kbits/second. Heck, I could get more than that over a parallel port. :)

    - A.P.
    --


    "One World, One Web, One Program" - Microsoft Promotional Ad

  • Maybe NT's stack is just woefully inefficient, but I never experience undue CPU load when doing massive file transfers in Linux. At worst, the load will hit 0.05 or so. I certainly don't *notice* any performance hits, however, even when getting 900K/s on a 10 megabit card.

    - A.P.
    --


    "One World, One Web, One Program" - Microsoft Promotional Ad

  • I've got 4 drives of varying make and model from IBM, Seagate, and Fujitsu running at 7200 or 10K RPM. The IBM gets about 10 MBytes/s on average. I've seen it hit 13. The others get between 8 and 12 (usually on the low side.) Since Linux's load average is not just a measure of CPU activity, when I'm transferring files between drives on the same SCSI chain, even, the load will hit 1, but the CPU usage will remain at or close to 0.

    - A.P.
    --


    "One World, One Web, One Program" - Microsoft Promotional Ad

  • I've seen ftp reporting 1.1e+3 kbytes/s on my 10Mbit Ethernet. Of course, there wasn't really any other traffic on the network at the time.
  • by Fict ( 475 )
    If your return policy is still good, get rid of the Linksys hub. Not because it is slow, but because in a matter of months the hub will experience nice spiraling death-type symptoms.

    So far my SOHOware 8 port + 1 switch is holding up nicely.


    ------------------
  • Note that WinNT had 98% CPU load going at 4.5 MB/sec. You mention that you don't notice a performance hit at 900kB/sec. If the CPU load varies linearly with transfer speed, NT would only use around 19.5% of the CPU at the 900kB/sec you're getting, at which point you probably wouldn't *notice* any performance hits either.
  • Posted by max kreed:

    http://mainz-online.de/internet/news/news130598.html

    Notice the date: 13th of May 1998. If it's a hoax then they've been doing it for quite a while I guess.
    > since netcat had a 1 sec delay, we get a total of 45 seconds for the transfer of 419430400 bytes = 9320675 bytes/sec = about what they are quoting

    Actually, the speed he is quoting is about 9.3 MB/sec (9,320,675 bytes/sec).
    Who said anything about proprietary?

    First of all, many threads here have been talking about writing drivers for it for Linux, etc, and second, many threads have talked about making it software upgradable to fix security holes.

    Consider that BECAUSE it was posted to Slashdot, the makers of this card could be slashdotted with email and offers of help for making drivers for alternative OSs. Plus, if Slashdotters show enough interest they'll make drivers for those OSs because they can see that it will increase their sales.


    I think this kind of thing is EXACTLY what should be posted to Slashdot because we'd be able to make a difference.

  • Netcraft uses ident info returned by the web server. This would still be available to both Netcraft and script-kiddies.

    DOS attacks on the card might be a problem, but I imagine they would be upgradable. Also, the higher levels of the protocol would still need to be on the main processor. Presumably workarounds would still be possible.

    ... Ami.

  • by Ami Ganguli ( 921 ) on Saturday May 08, 1999 @04:08PM (#1899628) Homepage

    I worked with an expensive Intel NIC 9 years ago that had an i960 (I think) and an OSI protocol stack on board. Never did any benchmarks, but I'm guessing the complex OSI protocol stack plus wimpy ISA '386 boxes made putting intelligence on the NIC a good idea at the time.

    I figure there must be a good reason these things haven't gone mainstream in almost a decade. The proliferation of simple TCP/IP plus faster CPUs might be one reason.

    ... Ami.
  • Not all of your bandwidth is going to go to data. Remember that you have protocol overhead to deal with -- ethernet packets have headers, TCP packets have headers, and so forth. All of those headers take up some of your precious bandwidth, so you don't really have 1.25MB/sec to play with.
  • The PIO is killing your performance. Switch to a PCI bus mastering card, and your load will go WAY down.

  • (a full TCP stack is a pretty large thing to be putting in hardware)

    Unless they count firmware as hardware. I have considered similar stunts with the Acenic gigabit card. At gigabit speeds, the idea looks a lot more useful.

    Please correct me if I'm wrong, but couldn't you use an FPGA or something similar, thereby providing reconfigurable hardware TCP?

    Now that would almost certainly put the cost up, and there may be security issues; I'm not too hot on FPGAs. :(


    Anybody know if this has been tried using FPGAs somewhere else?

    Iggy
  • I can't wait until memory management is moved to hardware only. Maybe Linux MM would stop crashing. Looks like web servers are going to be implemented in hardware real soon. Would you believe 3D graphics were once done in software?
  • by jd ( 1658 )
    Depends on your MTU. :) 1500 is merely the maximum. :)
  • But ethernet only lets you send 1500 bytes in one go, so it's not that big of a deal.
    Winmodem people need to learn something from this. Things perform better in hardware than in software. Of course, this depends on the openness of the drivers. We may be stuck with a good card and no documentation. The only problem I see is that TCP and other pieces of this layer are intended for software, so maybe this isn't a good idea (I'm tending to agree with the "what if there is a hardware bug" comment). Winmodem people seem to have taken the opposite approach and I'm not sure who is worse, but the Winmodem people can definitely learn from these guys.
    Linux's TCP/IP is many times better than NT's; someone did a comparison but I don't remember the URL. But do your own tests...
  • Give up some more information. Is this just with Samba? Try FTPing and see if you get similar speeds. If you do, then I can't help you because I'm hopeless with Samba :)
  • by mikpos ( 2397 )
    If whoever did this is getting 900 kilobits/s, he's got SERIOUS problems :). Even 900 kilobytes/s is pretty slow for a 10Mbit (although maybe SMB is just a slow protocol?). Still, my ftpd will eat up to 5% of my CPU running at ~1100 kbytes/s, so I can see where this TCP on-a-card sort of thing might come in handy for higher bandwidth stuff.
  • by mikpos ( 2397 )
    Thanks, I never saw that before. Still there is more to consider.

    When you start offloading stuff from the CPU to a processor on an add-on card, it's going to be a pretty single-purpose processor. This really limits the lifespan of it, though. What if there's some new networking protocol that obsoletes TCP? If the processor on the netcard isn't general purpose enough, then it'll go to waste. What if people start using voxels instead of polygons (hypothetical), are we going to find a bunch of Voodoo cards in the trash?

    Of course general purpose processors on cards would be even worse. You'd be much better off with another CPU. But it would be nice to have some means of keeping abreast with changing protocols and APIs and the such.

  • 5 port 10mbps 10baseT hub, $50 with 2 PCI NE2000-compliant NICs. 2 years later, still going strong, with an average of around 650kbytes/sec transfer (acceptable for what I use it for)...
  • Western Digital Caviar 20.4 GB IDE drive gets 10.8 MBytes/sec raw reads on my box (P200, 96 megs RAM), and 33.3MBytes/sec from cache.

    But to WRITE that fast is another issue entirely.
  • I've run it on a 386DX/33 with 4MB of RAM and -NO- hard drive, a 486SX/33 with 4MB RAM and a 204MB hard drive, a 486DX2/50 with 4MB of RAM and a 400MB hard drive... No problems ever. 2.0.34, 2.0.36, 2.2.1, 2.2.5.
    Remote loopback is a term used by some Ethernet card diagnostics for a network test in which one system sends out an Ethernet packet and the receiving system immediately takes it and sends it back... sort of a link-level ping test.

    With a remote loopback you get hardware testing, but it's probably not a great test for anything having to do with TCP/IP, since TCP/IP isn't involved.

    It's a hardware story... how hard would it be to implement this on a Linux box?

    Just about any interesting new computer toy should be reported here. If Linux (and other) users don't keep up on what's new, they're going to end up in the silicon ghetto, just like Microsoft would want.

  • Putting that functionality in hardware ties you to a particular implementation. Systems have been designed that would handle page faults purely in hardware. They were very complicated and rather limiting. Experience has shown this to be an area best handled by software, which is flexible and can be easily changed or fixed.

    Yes, I believe that 3D graphics were once done in software. They still are, in many applications. 3D game rendering is a rather specific task, and very computationally expensive. Thus, it was worthwhile to develop specialized hardware for this purpose. Compared to modern workstation hardware, TCP stacks are fairly inexpensive, if written correctly. I question how useful silicon TCP would be. I expect that the added cost and reduced flexibility would more than outweigh the saved CPU load for most cases.

    Then again, perhaps NT's implementation of TCP is poor enough to where this is a concern.

    Oh, and I never have problems with Linux MM.
    --Lenny

    //"You can't prove anything about a program written in C or FORTRAN.
    It's really just Peek and Poke with some syntactic sugar."
  • it doesn't have to be *good* news....

    Ah, I didn't think of that. Well said. I fell into the "this shouldn't be on Slashdot!" trap there when what I really wanted to do was inject a note of caution. Proprietary hardware is one of the few things that can kill a free operating system, and a nerd who values her freedom has cause to be wary.

  • can't wait until memory management is moved to hardware only. Maybe Linux MM would stop crashing.

    Are you trying to spread FUD? I've been using Linux for 7 years now, never had it crash. What are you _doing_ wrong?
    One thing that really made a difference for me was switching Samba (ver 1.9.18 on Caldera) to user level security (at least I think that's what made the diff. =)

    Transfer speeds of 1 MB/sec (100Mb NIC, all IDE since it used to be my desktop system), compared to our NT server (100Mb NIC, all SCSI on a $25K dual CPU Compaq server) doing about 500K. My boss (a big NT fan) hates when I point this out. Hehe.

    Performance does suck when backed up to another NT server, though. Only 500K/sec. I have a feeling this would improve if we'd map the share instead of using the UNC. Anyone have experience with this?
    I get about a 50% CPU load on a P5-120/3Com 509/10BT when moving at full speed; it messes with my MP3 playing...
    True, but a 10BaseT card can't fill the throughput of an ISA slot.
  • Look here [marko.net] for my study on Linux vs NT performance with TCP. NT's stack is indeed inefficient, but the performance may also be related to really lousy interrupt handling as well.

    Of course, I tried to post this to /. a few times and it didn't make it. Oh well. Maybe people will find it appropriate to this thread.

    I've often wondered about this; maybe it's time for another Ask Slashdot. How can the load average be out of sync with the CPU utilisation by such a large margin?
    I find it odd that so many Linux users complain about FUD directed towards Linux, then go around and spread this kind of nonsense. I get 1 MB/s at work easily between two Win95 machines, both with 10Mb cards, with no noticeable CPU overhead at all.
    You can actually dynamically load a new TCP/IP stack if you want. Most applications ask you to use the Microsoft TCP/IP stack, and ask you to reboot after making changes because the Win32 API is so brain-dead that no one is sure which changes do what to what. So long as you're fairly confident about its reliability and ability to stay within its memory space, you can plug and chug a new stack (I'd recommend you do it in kernel land, instead of userland; the hooks ARE there, or use psxss).
  • by Detritus ( 11846 ) on Saturday May 08, 1999 @05:06PM (#1899657) Homepage
    When 3Com first started making Ethernet boards, they tried putting the protocol processing software on a smart Ethernet board. It never worked too well unless you had a slow system with a brain-dead operating system or a small address space. The processors on the "smart" boards were cheap and slow. The on-board software tended to be buggy, limited and out-of-date. The cards were expensive. You still had to have some sort of lightweight protocol for communication between the operating system and the card, adding another layer of software. With well written software, a 25 MHz 68020 host could run TCP/IP at wire speed on a "dumb" 10 Mbps Ethernet card. The "smart" cards quickly disappeared.

    Putting the protocol processing in silicon will burn you when you need new features and algorithms in your networking stack. What happens if you need large windows, SACK, IPv6, IPsec, QoS?

    The right solution is to use an operating system that doesn't suffer from MBD and use decently designed network cards on a fast bus. 100 Mbps shouldn't be a problem for a decent system. 1 Gbps is where current hardware and operating systems fall down and need improvement.
    Although, how much CPU do you need to fill that ISA bus? My understanding is that moving 10BT Ethernet to a PCI slot on an older machine (like that guy's P120) could cut CPU usage by 50% or more. In ye olden days, moving the network from ISA to even EISA was a noticeable improvement.
    --
  • If I remember well, an Ethernet above a 10% usage level is considered very close to being quite dead.

    From what I've heard, Linux has the fastest TCP/IP stack of any OS. But I've never seen any numbers to back this up.
  • This item was originally posted on memepool [memepool.com] on Friday. Hey folks, there's absolutely nothing wrong with taking items from one forum and sharing them on another. Just make sure credit is given where credit is due. In this case, the original item at the geeks list [monkey.org] referenced both memepool [memepool.com] and robotwisdom [robotwisdom.com].

    Let's get those attributions right!

    Peter

    I've been wondering how well a Linux PC would actually do on 100Mbit Ethernet. I'm a bit worried that it wouldn't be too good, as there seems to be real work for the CPU to do. I became aware of this when I found that an R5000 SGI O2 couldn't do more than 5 MB/sec max, memory-to-memory TCP, no disk involved! And this used more than 60% of the CPU; the system was completely on its knees, and with all the other work going on the TCP became a real bottleneck :-(
    IRIX 6.3 isn't the worst operating system in the world, so this got me thinking.
    That on-board TCP stack seems interesting, but only if it supports something other than NT, of course.
    But then there's the problem of embedded TCP stacks; I've yet to see one without strange bugs here and there. TCP stacks are notoriously difficult to get right, and in practice it's only a real, preferably open-source, Unix box that can be trusted to (eventually) get it Right.
    TA
  • Who's talking out of his ass here? If you haven't got better comments than that please shut up.
    Here's the .sig Dave Miller used last year:
    Yow! 11.26 MB/s remote host TCP bandwidth & ////
    199 usec remote TCP latency over 100Mb/s ////
    ethernet. Beat that! ////
  • >I always patch this away when I compile a kernel)
    Would you mind posting a patch for us lazy people? :-)
    TA
  • by TA ( 14109 )
    Thanks,
    TA
    If you think that 900KB/sec is slow for 10bT, what do you think the average is? Ethernet is not known for its ability to work well at high levels of utilization, and 900KB/sec is 7.2Mb/sec. At this level of utilization and up, the contention between different hosts trying to talk at once drives up the collision rate, which keeps the transfer rates down.
    How about... a sound card with an MP3 player easily loadable into its onboard DSP, to offload that task from the processor?

    Such a thing has already become commonplace on the Amiga. Most of the soundcards have DSPs, and most of those are capable of doing MP3 decoding to take load off CPU. It's cool, but I would expect it to only be of interest to people who aren't able to get faster CPUs (e.g. 680x0 users). With today's PPCs and x86 chips, and clock rates approaching a gigahertz in the next year or so, it seems like the processing requirements of decoding MP3s are kinda trivial.

    Don't get me wrong -- I'm always interested in custom hardware to take stuff over from the CPU. Heck, that was the whole point of the original Amiga hardware. But general-purpose CPUs are just getting so damned fast... who cares about CPU load anymore? Just get a faster processor. They're already "infinitely" fast for most people's purposes.

  • The first thing I would do is get a couple of better ethernet cards. The 3C509 series are good 10 Mbit cards, but the 3C905 has gotten bad reviews as a 100 Mbit card. The cheapo LinkSys card is probably an NE2000 clone, and NE2000 clones are notoriously poor performers. As cheap as 100 Mbit Ethernet cards are these days there is little reason not to get them. Assuming you have an available PCI slot in each machine I recommend getting the Bay Networks FA310TX card, they run between $25 and $40 retail (CompUSA and Best Buy both carry them) or mail order. These cards use the DEC Tulip style chipset which is very fast. Another Tulip based card that is good is the D-Link DFE-500TX. I believe LinkSys also has a Tulip type card, but be careful, all of these brands also make cheapo "NE2000" style cards, even in 100 Mbit versions. The older Tulip cards were easy to identify because the large chip had "DIGITAL" on it, but newer chips are manufactured by someone else as Compaq sold off the old DIGITAL chip manufacturing recently.

    In general, NE2000 style cards are at best mediocre performers, and at worst are a major nightmare (read the Ethernet HOWTO). If you can afford it, get yourself a 100 Mbit hub. If you only have two machines and can't afford a 100 Mbit hub, get a crossover cable and don't use a hub at all. If you have more than two machines and some money, you can get a switching hub to reduce collisions.

  • hmm...does this mean that some of those poor, defenseless servers are finally going to have a /. defense mechanism?
    Um, it is equally possible it doesn't suck. What is a "DOS loophole"? Open source would be nice, but it doesn't necessarily determine the success or failure of this card. I think the amount of money they can put into marketing and distribution is a bigger issue.
  • First, a couple of years is a long time in the networking industry. More impressive first products have been engineered in that time frame.

    Second, don't be misled by claims that something is done in hardware. As often as not, this does not mean that the entire implementation is burned into silicon. Often it only means that processing that may have been done on the main CPU has been shifted to a dedicated processor. Oftentimes this dedicated processor may be particularly well suited to the task at hand because it implements special instructions. It may also be on the same die as other discrete functional units dedicated to the task (like an Ethernet controller).

    What you end up with can be quite fast, but it still retains a bit of flexibility, so it can be reprogrammed to fix bugs, or meet new standards. (Or something totally unrelated: apparently the engineers at Alteon programmed the MIPS CPUs they use in their gigabit Ethernet switches (two per port) to crack RC5 keys for a laugh.)
  • Ever heard of switched ethernet?
  • In reading this last post, I realized that I have a number of questions which I'm not able to answer. Figuring that I'm not in the minority here, here are my questions for others to answer:
    • There are many hardware patents out there that don't affect us that much; the question is -- is this one of them?
    • Is the "Silicon TCP" indeed a step forward for NICs, and if so,
    • is the patent based on "prior art", and therefore unenforceable?
    I would hate to see something that would work well for us regular (e.g., don't have unlimited funds to buy the next new great hardware) folks trapped within a "can only get it from one company" patent.
  • What do you think DVD and VideoCD hardware decoders do? Video CD is MPEG-I, and DVD movies are stored using MPEG-II.

    Of course, it's not as though MP3 decoding should be rough on a reasonably fast machine. My 333 Celeron typically has loads less than 1% while playing using mpg123 or x11amp. Comparably, running WinAmp over on my Windows partition typically uses ~10% CPU.

    The problem is that adding a specialized MP3 decoder eats up yet another slot in the case (unless you make it a USB/FireWire thingy). I'm down to one PCI/ISA slot on my ATX (w/ AGP) mainboard, and once I buy a modem next month, it's full.

    A potentially bigger problem is that you've got to redo all the MP3 players out there to support the decoders, and you've got to settle on a standard hardware decoding method, so we don't have to have 5 different versions of x11amp.
    And when does this reach its critical mass? How much CAN you take off the software? Are we going to reach a point where the only bit of software we have is that which writes the (hopefully) upgradable driver?

    And then, will the standards still be there? Will they be open? We can hope so, but we have to remember that hardware companies can be extremely stuffy about giving away hardware specs. Just look at Aureal or Creative Labs who won't release specs for driver writing to the Linux community.

    Recently, more and more is moving to the hardware. And as Moore's law begins to top out we can expect to see more of it. 3D graphics, 3D sound, these tcp/ip stacks, et al. Where does that put software itself?

    Food for thought.
  • > Maybe we should try to port netperf
    > (www.netperf.org) to Windows and add raw
    > Ethernet to it.

    netperf already runs on Win32; I've tested it on both 95 and NT. The netperf ftp site has binaries for Intel and Alpha.

    Neither 95 nor NT has any problem saturating a 10Mbps ethernet, but at higher speeds NT kicks the snot out of 95 on the same h/w. I've seen NT get 90+Mbps on a 100Mbps ethernet, but don't have anything faster to test with.

    The loopback tests show that Linux and FreeBSD have comparable maximum speeds, much higher than NT (on the same h/w). One assumes that this has to do with how often stuff is moved around in memory, etc. etc.

    BTW, if you ever want to convince yourself that ISA sucks, just run some tests on a PCI NIC and then an ISA NIC, and watch the 60-80% increase in CPU use for the ISA card.
  • The systems were PII-333s with 64MB RAM running NT4SP4, and DLINK 8029 (or maybe 8019) PCI NICs.

    I believe that the NT TCP/IP stack has been improved greatly since the 3.5x days, which was what I suppose you would have been running on a P90.

    (Sorry for delay getting back to you, has been busy at work)
  • I don't find SMB that slow... between my 2 machines Dual-100 linux to celeron 450 win98, I get 900KB/sec on 10mbit and 4MB/sec on 100mbit. And yes those are bytes, not bits.
    The main reason it isn't any faster is the drive in the linux box can't go any faster, linux/ide-fireball -> win98/scsi-cheetah...
  • Surely there is more at stake here than the CPU usage?

    If the soundcard could do the decoding then you'd only be sending 3 or 4 megs over the PCI bus rather than 30 or 40 megs. Would this not be a good thing?
  • Sigh. That old myth. Look at the first paper on this page. [utexas.edu] One of the designers of Ethernet tested it. Drove it at almost 100% of 10Mbps. A bunch of workstations on the net.
  • by DeepBrain ( 28018 ) on Saturday May 08, 1999 @04:18PM (#1899683)
    They make it sound like they're using fancy servers.

    They're not; they're using two proprietary IBM Aptivas, low end machines. (I should know, I have one of them, the same model they used.) Firstly, the E56 has a 266 MHz K6, unless IBM lied to me too :) Secondly, 48 megs of RAM seems a bit low for NT. Thirdly, the hard drive systems in there are probably also low quality, to say the least.

    Couldn't they have borrowed Mindcraft's server or something? At least THEY could tune an NT box :), although not a Linux box... and it was a server. I don't call a home system that will be pumping 90 megabits/sec a server.
    Isn't this what I2O is supposed to accomplish? This is very cool, but it would be nice for the TCP/IP stack to be configurable and upgradable, i.e. to IPv6 etc. This should be great for homebrew routers and such also. I hope Linux drivers appear soon.
    I just checked the throughput between a pair of my computers. I did this with a test program that used read and write calls to transfer the data and did nothing else with it.
    The data transfer ran at 10Mbytes/second with a 12% cpu load on a PII-266. configuration info at end
    Extrapolating to a modern 500MHz system, that would be something like a 6% CPU load. I suppose there are niches where the difference might matter, but with three 10Mbyte/sec transfers the PCI bus will choke anyway, and we are looking at the difference between 18% CPU for network and 6% CPU for network.
    I guess if you are stuck with a server OS where the TCP stack is a pig and you can't fix it then maybe this is a reasonable optimization, but I sure wouldn't bother with it for linux.
    Footnote: It turns out that sending data to /dev/null is more expensive than the ethernet transfer.
    Configuration of Receiver: DellXPS PII-266, Kingston 10/100mbit card(dec21140), Linux 2.2.6, tulip driver.
    Configuration of Sender: Gateway PII-300, Kingston 10/100mbit card(dec21140), Linux 2.2.1, tulip driver.
    I'm not sure if Kingston still makes these, we buy equivalent cards for $13 now.
    Configuration of Network: Half duplex 100mbit ethernet with about 50 machines hanging on it. Mostly idle during test.
    Procedural Note: I did run the test many times and exclude the slow outliers, both of these machines have operational services that I did not wish to disable, so many test runs were ruined by other loads on the machines.
    Open Benchmark: In order to preserve my credibility I will offer to allow anyone to run my tests exactly as I have in order to verify my results. No changes which may improve the results will be permitted. ;-)
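
    For reference, here is a minimal sketch of the kind of read/write test program described above (an illustration, not the poster's actual code; the name, port and buffer size are arbitrary). The receiver reads and discards, the sender writes zeros, so only RAM and the network are exercised:

    /* tcpblast.c - minimal TCP throughput tester (sketch).
     *   receiver:  ./tcpblast recv <port>
     *   sender:    ./tcpblast send <host> <port> <megabytes>
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <netdb.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <netinet/in.h>

    static double now(void)
    {
        struct timeval tv;
        gettimeofday(&tv, NULL);
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main(int argc, char **argv)
    {
        static char buf[65536];          /* zero-filled payload buffer */
        struct sockaddr_in sa;
        memset(&sa, 0, sizeof sa);
        sa.sin_family = AF_INET;

        if (argc == 3 && strcmp(argv[1], "recv") == 0) {
            int ls = socket(AF_INET, SOCK_STREAM, 0), s;
            long total = 0;
            ssize_t n;
            double t0, dt;
            sa.sin_addr.s_addr = INADDR_ANY;
            sa.sin_port = htons(atoi(argv[2]));
            if (bind(ls, (struct sockaddr *)&sa, sizeof sa) < 0) { perror("bind"); return 1; }
            listen(ls, 1);
            s = accept(ls, NULL, NULL);
            t0 = now();
            while ((n = read(s, buf, sizeof buf)) > 0)   /* read and discard */
                total += n;
            dt = now() - t0;
            printf("%ld bytes in %.2f s = %.2f Mbyte/s\n", total, dt, total / dt / 1e6);
        } else if (argc == 5 && strcmp(argv[1], "send") == 0) {
            struct hostent *h = gethostbyname(argv[2]);
            int s = socket(AF_INET, SOCK_STREAM, 0);
            long left = atol(argv[4]) * 1024L * 1024L;
            if (!h) { fprintf(stderr, "unknown host\n"); return 1; }
            memcpy(&sa.sin_addr, h->h_addr_list[0], h->h_length);
            sa.sin_port = htons(atoi(argv[3]));
            if (connect(s, (struct sockaddr *)&sa, sizeof sa) < 0) { perror("connect"); return 1; }
            while (left > 0) {                           /* write zeros as fast as possible */
                ssize_t n = write(s, buf, left > (long)sizeof buf ? sizeof buf : (size_t)left);
                if (n <= 0) { perror("write"); break; }
                left -= n;
            }
        } else {
            fprintf(stderr, "usage: %s recv <port> | send <host> <port> <MB>\n", argv[0]);
            return 1;
        }
        return 0;
    }

    Run the receiver first, then point the sender at it. Since the program does its own timing, there's no need to fudge a netcat delay out of the numbers.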
  • Still, my ftpd will eat up to 5% of my CPU running at ~1100 kbytes/s, so I can see where this TCP on-a-card sort of thing might come in handy for higher bandwidth stuff.

    This is ESR's cycle of reincarnation in action.

    Putting TCP on the ether card should allow some serious savings in CPU time, regardless of OS, if it's bus-mastering. For ftpd, the CPU should (in theory) set up a TCP connection with the card, open up a set of blocks (or a file; I'm no expert on this particular protocol) with a bus-mastering SCSI, and just let the disk flow to the card, or the card flow to the disk. Voila, file transfers that eat only the bus and not the processor.

    I think that, in the cycle of reincarnation, we have been going from everything-on-the-processor to everything-on-separate-chips. We have separate video accelerator cards (ignoring the video cards themselves), we have SCSI to drop the I/O load off of the chip. People still haven't folded sound into the processor.

    Between all this and the dearth of processor vendors, offloading TCP onto a card makes sense. The less load one places on one's processor, the less tied one is to the processor vendor. Today, people talk about having Intel or not. Imagine a world where people describe their computers by their net card type or memory vendor...

  • Interesting, though I expect IO bottlenecks elsewhere in the architecture would keep you from seeing the full benefits in a real world situation. Still, might be fun to snag a couple and hack together some Linux drivers.

    Thad

  • The web page describing this NIC is really unimpressive. What I'd really like to know is:

    How many simultaneous connections can this thing support, and is it slower when multiple connections are used? How's the performance when one of these cards is hooked up to a machine with a standard software TCP/IP?

    Putting TCP/IP in hardware is nice, I guess, but then nasty real-world issues crop up. What happens if there's a bug in the implementation? There's also the nasty challenge of writing a driver for a card like this, but I won't claim that's a defect of this NIC design.
  • Cisco's switching layer 3 at 40mpps, and layer 2 at 256mpps. All at wire speed gigabit ethernet, OC48 ATM, etc., all non-blocking. mmm... beefy...
    The big push for gig Ethernet is campus trunks between buildings. It goes up to 10km (single mode), and Cisco's Gigabit EtherChannel can aggregate up to 16 links. Really great if your org grew strangely and you have departments in two buildings. It also helps in linking switches in an Internet server fanout. Either way, it's more of a backbone technology than anything else. In fact, PCI32's theoretical peak (132MB/s) doesn't match the throughput of full duplex 1000Base-SX. And NT's networking core prevents running the network over 400Mbit peak.

    Interestingly enough, on most tested Unices, I think they're getting around 800-900Mbit. It'd be interesting to see how fast ftp.cdrom.com would be with gig Ether to the backbone... Maybe then I'd see more than 10KBps...
    > My 333 Celeron typically has loads less than 1% while playing using mpg123 or x11amp.

    No, that is not right; Linux has got a problem with this. The CPU usage of NAD under Windows was also only 0%...

    WinAmp consumes 5.4% of the CPU power on my PII/300.
    Tested with the rc5 client, with and without WinAmp; this 5.4% is the difference between the key rates.

    Try it under Linux with this method
    (I haven't had the time yet).
  • First off, to all those engaged in the "I get more bandwidth than you" pissing contest: try beating 560Mb/s sustained application-to-application bandwidth between two P2/450GX systems (running NT, BTW, but not NT's TCP/IP). So you think you can beat that, eh? Try beating 2us application-to-application latency for zero-length messages. Can't do it, even with buzzwords like VIA and I2O, can you? OK, next topic.

    Regarding TCP/IP on the card: as another poster pointed out, this is not a new thing but a very old thing. I once worked on putting DECnet on old 3Com "smart" cards...ick. There are all sorts of problems with doing this sort of thing on the card. First is upgradability of the network stack. You immediately become dependent on the manufacturer for upgrades - don't expect open-source firmware any time soon, even if you had the tools to compile and load it. FPGAs aren't really a good choice here because they increase the component and design costs too much. You'd be much better off using a commodity embedded microcontroller with "firmware" stored in flash memory, although this may still increase the cost unacceptably and for _real_ speed you just plain have to chuck all this stuff out the window and go ASIC. As it turns out, most systems in most uses have more CPU power to burn than any other type of resource. Some guys at HP several years ago took this observation and ran with it; they designed a card that was even more stripped down than the typical Ethernet card, doing even more of the work in software, and they actually got excellent results.

    Lastly, the conversation about NT's networking code reminds me of an exchange I had with an engineer at MS a couple of years ago. He was saying that they had to sacrifice a little on TCP/IP features and error checking (e.g. not crashing if sent a source-routed frame, or something like that) to get speed. My response was that (a) not checking unusual conditions in incoming network packets is just unacceptable, and (b) NT's TCP/IP performance is piss poor, indicating that they have bigger issues to worry about than shaving a few instructions by not checking packet headers. In the time since then I have found no reason to change either observation.
    Apparently not. I can't access their site right now. /.'ed already. :o)

    Breace.
  • by Breace ( 33955 )
    I guess on a Half Duplex that's acceptable (depending on the entire network usage though), but on Full Duplex I would wonder where the other 25% went.

    Breace.
    Oh c'mon, what do you expect?
    Can we please be serious about this? 1MB/s is nothing. As if Win95 wouldn't even be able to sustain that...

    Let us know when you get >10MB/s with a 100Mb/s board.

    BTW, please get your capitalization correct: MB = megabyte, Mb = megabit, mb = I dunno ;o).

    Breace.
  • by Breace ( 33955 )
    Let's see:
    preamble & start frame = 8 bytes
    Ethernet header = 14 bytes
    IP + UDP header = 28 bytes (IP + TCP = 40 bytes)
    DATA (not counted as overhead) = 1500 bytes
    CRC = 4 bytes
    Interpacket gap = 8 bytes

    That's 62 - 74 bytes overhead for 1500 bytes of data: 4.13% - 4.93% overhead, not 25%.
    And this is realistically _achievable_ throughput on a two node network.

    Breace.
    Using another system doing the remote loopback, or simply wiring TX to RX on the system that's being tested itself.

    Sorry about the confusion.

    Breace.
  • No you don't understand.
    With Linux you don't need multiple processors to sustain decent Ethernet throughput.

    Breace.
  • I've seen NT get 90+Mbps on a 100Mbps ethernet

    I'm sure you are right, but can you please describe the system (CPU, RAM etc.?), just for us to get an idea. Because I've never been able to see such a thing. Then again, I think I gave up on NT when we still only had P90's...

    BTW, if you ever want to convince yourself that ISA sucks, just run some tests on a PCI NIC and then an ISA NIC, and watch the 60-80% increase in CPU use for the ISA card

    Right. Although we had a bad experience with the OPTi 802/832 PCI chipset. It doesn't implement DMA burst modes from RAM to PCI devices, and thus limits the max datarate in that direction to about 5MB/s. That's worse than ISA! Anyways, that's another story altogether... :(

    Thankx for letting us know about netperf. I wasn't able to connect to their ftp server yesterday.

    Breace.
  • Video CD is MPEG-I, and DVD movies are stored using MPEG-II.

    Video CD uses MPEG-1, layer I or II, not layer III which is MP3. Most MPEG-1 hardware decoders that I know of don't implement layer III decoding.

    Most DVD movies use Dolby AC-3 for audio, not MPEG, although I believe in Europe they do use MPEG audio (but again layer I or II, as I understand it).
    I also know of no DVD decoder that integrates MP3 decoding.

    Too bad really...

    Breace.
  • Looks like an excellent study to me.

    One little thing, your network card:
    DEC 21040 based AsanteFAST 10/100Mbps ethernet cards
    Should probably be DEC 21140. 21040 is only 10Mb/s.

    Breace.
    Hmm, you must be confused about the D-Link. I don't think they have an 8029. They do have a Realtek 8029 clone though, but that's a 10Mb/s only chip. It's also an NE2000 clone, which means that it's very unlikely to actually get anywhere close to 100Mb/s because of the silly I/O scheme. I imagine it was probably a DEC based board.

    Anyways, that's a pretty fast system; we don't have one in the office here. I'll certainly redo our tests though as soon as I can, because we do have NT4 now and I think you are right about us using 3.51. (Although most of our tests are raw Ethernet related, not TCP/IP.)

    Thankx for letting us know.

    Breace.
  • by Breace ( 33955 ) on Saturday May 08, 1999 @06:16PM (#1899703) Homepage
    Exactly, this card is of interest because the Microsoft network stack is terribly slow, and costs an awful lot of CPU time.

    The main reason for this is: Poor design. (What did you expect?)

    The M$ network stack is (as with most M$ device driver architectures) way more complex to deal with than, for example, Linux's, and as a result many of the network drivers are not well written. E.g. they are not optimized for performance. I'm sure the writers are happy if the damn thing works at all.

    The network stack itself also has much more involvement with each packet going out or coming in than Linux does. In some cases packets are actually copied in RAM. This is what causes the higher CPU overhead.

    We have run several tests and the results were depressing. I think we probably lost the source code, but if I can find it I will post it somewhere.

    Here are some of the figures I remember (all tests were raw network tests, RAM to RAM, no hard drives involved):

    Two P90 systems using 100Mb/s Full Duplex (DEC 21140) cards. We were unable to sustain more than 35Mb/s using UDP in one direction only.

    Linux on this configuration: close to 100Mb/s.

    Using a raw packet driver on a 200MMX notebook with an SMC 100Mb/s Full Duplex card and using remote loopback, we are unable to get much more than 15Mb/s sustained. (That's 15Mb/s going out and 15Mb/s coming in.)
    This was using raw Ethernet packets, no TCP/IP or other protocol.

    This same configuration in a normal network is unable to receive more than around 4Mb/s of UDP multicast packets, or packets will be dropped (thus not received).

    I'm surprised that people have put up with the poor raw network performance of our Redmondians. It's a disgrace, and I will NEVER use an M$ product if serious networking is to be done. Even if the Ethernet adapter handles the protocols.

    Breace.
  • by Breace ( 33955 ) on Saturday May 08, 1999 @09:05PM (#1899704) Homepage
    One thing is becoming obvious.

    We need some serious network benchmark tools that are cross-platform usable between Linux and Windows and maybe more.

    Strangely enough, not too many seem to exist. Nor does anyone seem to have hard data on network performance. It's entirely useless to say 'I get 1MB/s using FTP'.

    A good network benchmark tool will be able to test raw Ethernet performance as well as performance through protocol layers.

    Of course it would only test the network and not use the file system or hard drive. It should be very clear what sort of configuration is supposed to be used: two systems running full-duplex, one system using remote loopback or whatever. It could also be interesting to have a > 2 system test to show what collisions do.

    And most important as far as I'm concerned: Open Source. I don't believe in closed source benchmarks.

    The difference between raw Ethernet and TCP/IP protocol would show us how badly we need hardware assistance on what platform.

    Maybe we should try to port netperf ( www.netperf.org [netperf.org]) to Windows and add raw Ethernet to it.

    Well, maybe I'll be a bit more serious about this if there's an interest.
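
    Just to make the raw-Ethernet part concrete, the receive side of such a tool could start out as small as this on Linux (an AF_PACKET socket, needs root; a sketch of the idea only, not netperf and not a finished benchmark):

    /* rawcount.c - count raw Ethernet frames/bytes seen in a 10 second window.
     * Sketch only; a real benchmark would add a matching raw sender, bind to
     * one interface, and record CPU cost as well. */
    #include <stdio.h>
    #include <string.h>
    #include <signal.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>      /* htons */
    #include <linux/if_ether.h>  /* ETH_P_ALL */

    static volatile sig_atomic_t done = 0;
    static void onalrm(int sig) { (void)sig; done = 1; }

    int main(void)
    {
        char buf[2048];
        long frames = 0, bytes = 0;
        struct sigaction act;

        /* ETH_P_ALL: deliver every frame, headers and all, no protocol stack involved */
        int s = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (s < 0) { perror("socket (are you root?)"); return 1; }

        memset(&act, 0, sizeof act);
        act.sa_handler = onalrm;      /* no SA_RESTART, so recv() returns EINTR */
        sigaction(SIGALRM, &act, NULL);
        alarm(10);

        while (!done) {
            ssize_t n = recv(s, buf, sizeof buf, 0);
            if (n <= 0) break;        /* EINTR when the alarm fires */
            frames++;
            bytes += n;
        }
        printf("%ld frames, %ld bytes in 10 s = %.2f Mb/s on the wire\n",
               frames, bytes, bytes * 8.0 / 10.0 / 1e6);
        return 0;
    }

    Compare the counts against what the sending side thinks it pushed out and you see drops directly; run the same transfer through TCP or UDP sockets and the difference is the per-platform protocol overhead this thread keeps arguing about.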

    Breace.
  • by Meathook ( 36990 ) on Saturday May 08, 1999 @11:25PM (#1899705)
    Hmmm...

    When I read the slogan under the Slashdot logo at the top of this page, it says "News for Nerds. Stuff that Matters". I don't see "LINUX ONLY" stamped anywhere up there. I don't see anything wrong with the posters giving us articles concerning platforms or OSes other than Linux and the hardware it runs on. In fact, even though I'm quite a Linux supporter, I'm sick of the people who think this must be a Linux-only site. My understanding is that Linux is on the minds of the nerds at this point in history, so we get a lot of stories about Linux (if I'm wrong here, well, sorry... somebody should make that clear). That's cool, but why are the posters slammed when they post an article about something other than Linux? That's not cool, or uncool, or whatever.

    bah
    I don't *know* about NT's stack, but I would suspect it's woefully inefficient; I've run tests, Linux vs Win95, on the same box, and Linux wins by a minimum margin of four times on everything I've bothered to test, and it's usually higher. (Max Win95 rate is about 40k/s; max Linux is about 200k/s; this is the 2.0.34 kernel with Becker's WD driver.) This really sounds like a hardware solution to a problem in Microsoft's software. (Why, pray tell, should the transfer rate depend on 'getting lucky' with context switching? Did somebody at Microsoft decide that the scheduler worked fine because it didn't lose keyboard strokes?)

    -_Quinn

"Virtual" means never knowing where your next byte is coming from.

Working...