This discussion has been archived. No new comments can be posted.

Intel Develops Hardware To Enhance TCP/IP Stacks

  • Good stuff! (Score:5, Interesting)

    by kernelistic (160323) on Monday February 21, 2005 @04:04AM (#11734099)
    First checksum offloading, now this... It is nice to see that hardware vendors are realizing that 10Gbit/s+ speeds aren't currently realistic without extra forms of computation support from the underlying network interface hardware.

    This is Good News.
    • Re:Good stuff! (Score:5, Informative)

      by RatRagout (756522) on Monday February 21, 2005 @04:33AM (#11734202)
      Yes. Checksum was one of the problems. The other is the memory-to-memory copying of data due to the semantics of the TCP/UDP send() call. These semantics require that the data in the buffer at the time send() is called is the data that gets sent; if the application changes the data immediately after the send() call, this must not affect what goes out. This means the OS has to copy the data into kernel memory, and then at some later time copy it onto the NIC. This memory-to-memory copying becomes a severe problem as traffic and bandwidth increase.
      • Re:Good stuff! (Score:5, Informative)

        by kernelistic (160323) on Monday February 21, 2005 @04:39AM (#11734227)
        There have been multiple fixes to address the inefficiencies of the original design of the BSD TCP/IP stack.

        FreeBSD, for example, has a kernel option called ZERO_COPY_SOCKETS, which dramatically increases the network throughput of syscalls such as sendfile(2). With this option enabled, as the name implies, data is no longer copied from userland to kernel space and then passed onto the network card's ring buffers. It is copied in one swoop!
        • Re:Good stuff! (Score:2, Interesting)

          by RatRagout (756522)
          For sending files, I'm sure this has increased performance greatly, since when sending a file you might otherwise have to first read the file into userland, copy it into the kernel, and then onto the NIC. Reading directly from disk to a TOE would of course be the real overhead-killer. Zero-copy techniques are also used in newer APIs like uDAPL for RDMA operations (over InfiniBand or similar).
        • It is copied in one swoop!

          So shouldn't it be called ONE_COPY_SOCKETS, then?

      • Well, the kernel could just block till it is done sending, thus
        sending it straight from the userspace supplied buffer.

        Doing so may ofcurse have other affects though.
        • Doing so may ofcurse have other affects though.

          Of curse?

          d
          What do you want to drop? [a?*]
          ?
          a - a cursed -1 tcp/ip connection
          a
          Sorry, you can't drop the tcp/ip connection, it seems to be cursed.

          Hmmm ... where's my scroll of remove curse?
      • Re:Good stuff! (Score:2, Interesting)

        by acaspis (799831)

        > If the application changes the data directly after the send()-call this should not affect what is sent.

        So just don't let the application change the data (hint: single-assignment programming languages).

        > This means that the OS has to copy the data into kernel memory,

        Either that, or you could improve support for copy-on-write in the MMU (which might benefit other tasks than just networking).

        Sometimes changing the assumptions is the proper way to solve the problem.

      • Ethernet RDMA protocols solve this problem. RDMA will be ubiquitous in the next year or two.
    • 10Gbit/s isn't realistic without some faster bus technology either. Will 64-bit PCI handle it?
  • finally... (Score:5, Funny)

    by N5 (804512) on Monday February 21, 2005 @04:07AM (#11734109)
    Intel is working on something worthwhile: a cure for the common slashdotting

    and they say the drug companies are miracle workers ;)
    • Enhance your Stack!

      Have you ever wanted your TCP stack to be more secure? Has your internet ever dribbled? Sign up for intel soft tabs now!
  • White elephant? (Score:5, Interesting)

    by Toby The Economist (811138) on Monday February 21, 2005 @04:08AM (#11734112)
    I think in Tanenbaum's book there's a reference which states that offloading network processing normally isn't useful, because the CPU the work is offloaded to is always less powerful than the main CPU, and the main CPU is normally blocked in its task until the network processing has completed.

    --
    Toby
    • Re:White elephant? (Score:3, Informative)

      by Uhlek (71945)
      That all depends on how it's done. Simply offloading the processing won't work, but replacing the TCP/IP drivers with simple hooks into a hardware-based I/O system can.
      • Re:White elephant? (Score:5, Informative)

        by Toby The Economist (811138) on Monday February 21, 2005 @04:23AM (#11734168)
        You're assuming that the hardware implementation will be faster than the main CPU, which it almost certainly won't be: if you've just spent 300 USD on your P4 CPU, what are you doing spending the same amount again - or more - just on your network subsystem?

        Also remember that a well-implemented TCP/IP stack runs at about 90% of the speed of a memcpy() (Tanenbaum's book again).

        For hardware TCP/IP processing to be useful, you need to be, say, 2x the speed of the CPU's memcpy() function!

        Given that the main performance bottleneck is memory access, since you're basically copying buffers around and so caching isn't going to help you, I don't see how any sort of super-duper hardware is going to give you anything like a 2x speed up, let alone at an economic price.

        --
        Toby
        • Re:White elephant? (Score:5, Informative)

          by Uhlek (71945) on Monday February 21, 2005 @05:05AM (#11734312)
          Hardware implementation will most definitely be leaps and bounds faster than the general CPU. Can a Linux router route 720Gbps of traffic through hundreds of interfaces at once? No. But a Cisco 6500 can, because of hardware designed especially for the task.

          Simply put, software on general-purpose processors sucks at heavy computational work. Hardware tuned especially for a task has been, and always will be, where it's at. However, the costs involved in creating ICs specific to a task usually mean that ASICs are only created where there is a need. Modern graphics cards are a great example: the on-board graphics processors are designed especially to create graphics, something that, if offloaded onto the GP CPU, would crush even the highest of the high end.

          Also, offloading the TCP/IP stack on a normal workstation probably isn't going to be a huge performance boost. Where this will be useful is in situations where there is a need for high-throughput, low-latency network I/O processing.
        • For hardware TCP/IP processing to be useful, you need to be say 2x the speed of the CPUs memcpy() function!

          Not at all true. Dipping into Ricardian economics, you can conclude that the best, most valuable, purpose of the primary CPU is to process user input and to execute applications. If another CPU can be introduced into the computational economy such that it can perform a task, even if at a lower rate than the primary CPU, thus freeing up the primary CPU to perform its most valuable task more efficien

        • Using a P4 to do I/O work is like using a battleship as a landing craft. Until now, the alternatives have been to do that or let your soldiers (packets) swim to shore. Intel's smarter cards are like providing landing craft.

          This is not a new concept.

          DEPCAs made network I/O easy back in the days of ISA busses twenty odd years ago, and there have been PCI cards with their own CPUs which you can actually load a version of Linux into and use as standalone routers - so the network cards handle stuff like ICMP a
    • Re:White elephant? (Score:4, Interesting)

      by mr_zorg (259994) on Monday February 21, 2005 @04:33AM (#11734203)
      I think in Tanenbaum's book there's a reference which states that offloading network processing normally isn't useful, because the CPU the work is offloaded to is always less powerful than the main CPU, and the main CPU is normally blocked in its task until the network processing has completed.

      I think in xyz's book there's a reference which states that offloading graphics processing normally isn't useful, because the CPU the work is offloaded to is always less powerful than the main CPU, and the main CPU is normally blocked in its task until the graphics processing has completed.

      See how silly that sounds when you substitute network with graphics? We all know that offloading graphics processing is a good thing. Why? Because it's optimized for the task. Why couldn't the same be done for networking?

      • See how silly that sounds when you substitute network with graphics?

        Well, does waiting 3 milliseconds at 3 GHz outrun waiting 3 milliseconds at 300 MHz?

        The only advantage I can see to this is that it's often nice to have I/O handled in a separate process/thread running on a separate processor. But, as many have already noted, unless the I/O processor is tuned for this you've either got another expensive processor or you're running the I/O thread on a slower processor.

        If the processor _is_ tuned for

        • Re:White elephant? (Score:3, Insightful)

          by aminorex (141494)
          The IO processor can be made to do the task much faster than the CPU, because it is not a general-purpose chip. It implements in hardware what the CPU would implement in software. As a result, it costs much less to produce. These are the same considerations that apply to graphics pipelines. It would be grossly economically infeasible to implement the functions of a high-end GPU on the CPU, in part because it's on the wrong end of a bus.
      • Re:White elephant? (Score:5, Interesting)

        by Jeff DeMaagd (2015) on Monday February 21, 2005 @04:45AM (#11734248) Homepage Journal
        Graphics and networking are two very different things. Networking isn't compute intensive, it is I/O intensive. I don't think the Intel hardware network offload is for much more than basic computation.

        Besides, GPUs are more powerful than CPUs at the task of rendering polygons.

        Very often ASICs are better at a task than general purpose CPUs, just that considerations must be made as to whether the performance gain is worth the cost difference.
        • ...and this hardware will be better at checksumming than a normal CPU; you didn't think they were just going to stick a Pentium II on there and call it a NIC, did you?
        • Besides, GPUs are more powerful than CPUs at the task of rendering polygons.

          Yes, that's the whole point - they're more powerful at that task because they're specifically designed to perform that task (amongst others). Similarly, a "network processing unit" would be specifically designed to support in hardware the operations required of it. Make that chip fast enough, and it'll be faster at doing it than a general-purpose CPU. The only question is how fast it has to be, and whether or not it's cost-effecti
      • Re:White elephant? (Score:5, Informative)

        by Toby The Economist (811138) on Monday February 21, 2005 @05:04AM (#11734307)
        You can accelerate graphics to a very large degree because the problem is very subject to parallelism.

        You cannot accelerate networking very much because the problem is highly serial.

        It is improper to compare the two because they are fundamentally different problems.

        You can throw tons of hardware at 3D graphics and get good results, because just by having more and more pipelines, you go faster and faster.

        Processing a network packet is quite different; the data goes through a series of serial steps and eventually reaches the application layer. The only way you can really make it go faster is to up the clock rate, and you find it's uneconomic to try to beat the main CPU, which remember has *already* been paid for. You have all that CPU for free; to then spend the kind of money you'd need to outpace the CPU makes no sense, let alone the money you'd need to spend to outpace the CPU by a decent margin.

        --
        Toby
        • Ever heard of pipelining?

          So it's a series of steps. Ok, then make each step a part of a pipeline, with a specialized circuit for exactly that step. Then while the next circuit on the pipeline gets to do the next step on that packet, the first one can already start processing the next packet. This is how modern CPUs speed up the decoding of machine instructions, so why shouldn't the same work with TCP/IP packets as well?
        • "The only way you can really make it go faster is to up the clock rate, and you find it's uneconomic to try to beat the main CPU, which remember has *already* been paid for. You have all that CPU for free; to then spend the kind of money you'd need to outpace the CPU makes no sense, let alone the money you'd need to spend to outpace the CPU by a decent margin"

          What??? Not when you're doing multimedia decoding of compressed data... or other such tasks... offloading the networking stuff to hardware will have th

        • Re:White elephant? (Score:4, Informative)

          by sconeu (64226) on Monday February 21, 2005 @12:03PM (#11736298) Homepage Journal
          Bullshit.

          I used to work at a company that did Fibre Channel.
          One of the things we had was an ASIC that did network processing in hardware, allowing us to do all sorts of interesting stuff at wire speed (2Gbps). If we had to load into memory we would have been at least an order of magnitude slower.
    • by Anonymous Coward on Monday February 21, 2005 @04:37AM (#11734211)

      AC being Alan Cox, DM being Dave Miller.

      Read Alan's opinion here [theaimsgroup.com].

      Read Dave's opinion here [theaimsgroup.com].

      There has been discussion of this specific Intel announcement here [theaimsgroup.com].

    • Not that some of what Intel does isn't marketing-driven, but I doubt they would go about doing this if they didn't have good reason to.

      It's not like this would be an easy thing to sell in some way that people would really understand very well. But regardless they aren't going to develop a whole new piece of hardware that is worthless. Making a design decision that pushes something down a bad path like clock speed is a whole different issue. I'm pretty sure intel guys would think this one out before spending
    • What the heck. A few factoids:
      The main CPU runs multiple things.
      The costs of network traffic are cache flushes and context switches. And so on.
      A general-purpose CPU is much weaker than a special-purpose CPU, if you can parallelize at all.
      And MFG costs my ass. These things should be relatively small.

      Think of the following scenario:
      Network interrupt -> context switch -> move a lot of data around and compute somewhat -> context switch.
      To finish what I was doing, and then compute the thing that I just put in the line. (u
    • Re:White elephant? (Score:3, Insightful)

      by Trogre (513942) *
      Try telling that to Amiga fans in 1989-1992.

      Those little boxes were masters at multi-processing, and they did it right - one processor for pretty much every major peripheral task (disk, graphics, sound, something else I can't remember).

      If these Intel coprocessors were going to be an open standard (which they almost certainly won't be), I'd welcome this addition to the PC architecture.

    • by Moderation abuser (184013) on Monday February 21, 2005 @05:07AM (#11734323)
      My boxes all run tens to hundreds of processes for tens to hundreds of people. Offloading the processing to a networking subsystem isn't going to hurt, especially with gig and 10gig.

      Not that this is a new idea. It's been done for donkey's years.

      • Intuitively, people think it won't hurt, but intuition is wrong.

        Consider; you have a hundred users, all doing some sort of network based task - say, reading Usenet via an NNTP server.

        You offload their network processing from the CPU to a slower CPU on the network card.

        Every time a thread in your NNTP server blocks, waiting for a packet to arrive or be sent, the main CPU moves onto another thread...which also then needs a send/recv, and blocks, and so on.

        In the meantime, the slow CPU gets around to deali
    • that offloading network processing normally isn't useful, because the CPU that work is offloaded to is always less powerful than the main CPU

      This is a pretty ridiculous claim. Take a look at Cisco routers some time... With a slow CPU, they can transfer gigabit upon gigabit of data every second. In some cases, they are even just using PCI network cards.

      Packetizing data, and handling the incredible storm of interrupts, is something CPUs are very poor at. Servers stand to get a huge performance

    • Using the same logic, machines with two (or more) CPUs wouldn't be useful, since the second CPU is not going to be any faster than the first one.

      With all due respect to Mr. Tanenbaum, if he stated what you put in your post, his logic is severely flawed.

      Let's compare the general CPU/networking CPU combination with a manager/secretary.
      The manager has a number of tasks which needs to be done, including scheduling a number of appointments. Without a secretary, he'll be obliged to call/contact the peo

    • I think in Tanenbaum's book there's a reference which states that offloading network processing normally isn't useful, because the CPU the work is offloaded to is always less powerful than the main CPU, and the main CPU is normally blocked in its task until the network processing has completed.

      This a bit of an oversimplification. There are at least three cases in which offloading makes sense: dropping packets on the NIC (for example, during a DoS attack), reducing bus overhead by combining multiple req
    • the main CPU is normally blocked in its task until the network processing has completed.


      Good thing my scheduler has about 50 other tasks in the queue waiting for their turn.
  • by Anonymous Coward on Monday February 21, 2005 @04:08AM (#11734116)
    I was one of the lucky few who beta tested this. The plus side is that you can overclock your network card to download faster than the remote server's bandwidth. I did not try it, but I could probably have slashdotted slashdot.org just by browsing it.
  • by KiloByte (825081) on Monday February 21, 2005 @04:08AM (#11734119)
    As we know it damn well, shit happens all the time.

    So... how exactly are they going to ship patches in the case of a security issue?
  • Ethernet controllers (Score:3, Interesting)

    by Anonymous Coward on Monday February 21, 2005 @04:10AM (#11734127)
    What is needed more is a high-speed bus for network interfaces, as gigabit ethernet becomes more common. Even if a gigabit adapter had a whole 32-bit PCI bus to itself, it could still easily saturate it.

    It seems like most common denominator board manufacturers have put off 64-bit PCI support for too long. It's going to bite them in the ass if it doesn't become standard very soon.
    • by afidel (530433) on Monday February 21, 2005 @04:22AM (#11734167)
      No, a gigabit adapter can't saturate a PCI bus by itself: 32-bit 33MHz PCI is 133MB/s, gigabit is 100MB/s. Then there is 32-bit 66MHz PCI, and if you want you could run a 32-bit card at 133MHz, as the standard supports it (though I've never heard of such a card; if you need 133MHz you generally also need 64-bit, but I assume an ADC could use the faster speed without needing the wider word size). The fastest current implementation of the slot local bus is 16-lane PCI Express, which could handle 4 10-gigabit adapters. The problem would be coming up with enough data to keep those pipes full: no disk subsystem is fast enough, and any meaningful SQL transactions are going to be CPU-limited on even the biggest of servers, so why would you need a bus with more bandwidth than that? Add to this the fact that servers which actually need more throughput have long had the faster PCI slots, and you realize that it's not a problem in the real world.
      • by Anonymous Coward
        You got the PCI bandwidth correct, but your gigabit bandwidth is a hair off. Depending on how you define "giga" (base 10 or base 2), you get the following numbers:

        a) Gigabit/sec = 1000 Mbit/sec = 125MByte/sec
        b) Gigabit/sec = 1024 Mbit/sec = 128MByte/sec

        True, even these speeds don't completely saturate the PCI bus, though because of how the PCI bus is shared (each device gets a few clock cycles to do its thing before passing control off to the next device) no single device could anyway unless it's the O
        • by jpc (33615)

          gigabit is full duplex - double your figures.

          But new motherboards are already starting to come with gigabit attached to PCI Express. For the last few years any decent board has had them on fast PCI-X, at least 64 bit 66 MHz.
      • There is more than one device on the PCI bus...
      • by Matt_Bennett (79107) on Monday February 21, 2005 @08:25AM (#11734974) Homepage Journal
        The critical aspect you leave out is that Gigabit Ethernet is (inherently) full duplex. That means that a 32/33 PCI bus would be saturated at a gigabit out, but would have no bandwidth left for anything incoming.

        In truth, a gigabit ethernet card can saturate a 1X PCI-E link (2Gb/s after the 8B/10B encoding is removed) when sending small packets, basically due to packet overhead.

    • Most currently sold chipsets provide a network interface right into the chipset as its own port, bypassing the PCI bus. The same is done with on-board IDE/ATA/SATA controllers, audio, USB, Firewire and such.
  • nvidia (Score:5, Interesting)

    by Ecio (824876) on Monday February 21, 2005 @04:10AM (#11734130)
    Isn't Nvidia doing the same with its new nForce series motherboards? Lowering CPU usage by adding network management code and an SPI firewall inside the chipset?
    • Re:nvidia (Score:4, Informative)

      by Glock27 (446276) on Monday February 21, 2005 @10:08AM (#11735421)
      Isn't Nvidia doing the same with its new nForce series motherboards? Lowering CPU usage by adding network management code and an SPI firewall inside the chipset?

      Yes. The nForce4 chipsets offload most TCP/IP processing and firewall [nvidia.com] from the main CPU.

      If you go with an Athlon64 Socket 939 nForce4 board, you get PCI Express, lower power consumption, a ton of great features, good Linux support, and plug-compatible dual-core upgrades down the road. Intel's offerings just seem anemic by comparison.

      (Personally, I'd also do an NVIDIA graphics board for the excellent Linux driver support. And no, I don't work for NVIDIA, I'm just a satisfied customer.)

  • Interesting (Score:5, Insightful)

    by miyako (632510) <miyako@gma[ ]com ['il.' in gap]> on Monday February 21, 2005 @04:11AM (#11734132) Homepage Journal
    This seems interesting, though given Intel's track record I wonder if it will really be as useful as they are speculating, as the article has no real technical information.
    Granted, I've never administered a server that was under anywhere near the kinds of loads we are talking about for this to be useful, but I have a hard time imagining that dealing with the TCP/IP stack would be more intensive than running applications (as the article claims).
    So, for all you people out there much more qualified to discuss this than I am: will having some part of the processor dedicated to handling TCP/IP really speed things up, or is this primarily a marketing technology?
    • Re:Interesting (Score:3, Insightful)

      by AutumnLeaf (50333)
      I've seen extremely beefy NFS file servers go into a crash-reboot-crash cycle after the first crash, because all of the hosts trying to remount the filesystem completely crush the machine before it is fully up to speed. We've had to unplug the network cables on the server to prevent the mount storm from killing the server again.

      Note, this is enterprise-grade hardware hooked up to million-dollar disk arrays.

      Now, is that entirely from dealing with the networking stack? No. Not quite. However, consider th
  • Qlogic TOE cards (Score:5, Informative)

    by jsimon12 (207119) <tzzhc4NO@SPAMyahoo.com> on Monday February 21, 2005 @04:16AM (#11734151) Homepage
    Uh, this isn't new; Qlogic has been doing it for some time now in their TOE (TCP Offload Engine) cards [qlogic.com]. The cards are smoking, especially on Solaris, 'cause Sun's TCP stack is crappy.
  • yeah great (Score:5, Funny)

    by Anonymous Coward on Monday February 21, 2005 @04:27AM (#11734186)
    soon it will be a dedicated processor and RAM to deal with TCP, then a dedicated processor for the keyboard input, then a dedicated processor for the fans, and a special dedicated processor on a 12" PCI-X card for the extremely computationally intensive MOUSE. Actually, this will have its own special dedicated path called 'AMP', or Accelerated Mouse Port. Mice of the future will need much more bandwidth than today - about 16 GB of I/O - so they need their own data paths.

    And then there will be other enhancements like the tcp/ip one.

    For instance a special accelerator card for Word and Internet Explorer will be developed.

    Furious Linux users will demand their own technology, so one manufacturer will come up with a special card for running GNOME apps. This card will have 4 dual-core 6 GHz processors and allow GNOME to run at normal speeds.

    • I always thought components should be offloaded to their own cards (the way OS X offloads video to the video card): network offload to the NIC, sound to the sound card, etc. Why not? Given that 100MHz+ processors are becoming dirt cheap, their ability to take on processing load only makes sense, freeing up time for the system CPU to move on to better things.
    • by ceeam (39911)
      But then - imagine that - a single Z80 would suffice to act as a _C_PU commanding all those!
    • Re:yeah great (Score:2, Insightful)

      by yem (170316)
      I didn't know whether to mod you interesting or funny :-)

      Parallelism is great. Look at the way things are going: dual-CPU motherboards, dual-core CPUs, Cell...

      And gnome.. sheesh.. back when I ran a P100 and Gnome was slow, I thought "well one day I'll have a 500Mhz monster and Gnome will be fast". Here I am with a P4-2.6Ghz/1Gb and Gnome is STILL a dog. *sigh*
    • All these custom chips remind me of the Amiga. Back then, custom chips were considered unnecessary. Now PCs are full of them (custom chips, that is). It's funny how the world goes around...
    • ...the original IBM PC put a processor in the keyboard and another (dumb) processor on the motherboard to talk to it.

      This USB keyboard I'm typing on involves at least three processors, one to scan the keys, one to do the USB on the peripheral side and the third to do the USB on the motherboard side.
  • by arc.light (125142) <dbcurry.hotmail@com> on Monday February 21, 2005 @04:29AM (#11734193)
    The article doesn't say, and I'd hate to be "stuck" with a card that only does IPv4. Yeah, I know, hardly anyone uses IPv6 today, but the nations of China and Japan, as well as the US DoD, are starting to roll out IPv6 networks in a big way.
  • by ABeowulfCluster (854634) on Monday February 21, 2005 @04:44AM (#11734240)
    targeting the OS. I can see this technology being useful on servers which have multiple network cards and heavy traffic, but not for joe average pc user.
  • So finally! (Score:5, Funny)

    by Trogre (513942) * on Monday February 21, 2005 @04:47AM (#11734252) Homepage
    buying Intel really will make the internet go faster!

  • Old news (Score:5, Informative)

    by obeythefist (719316) on Monday February 21, 2005 @04:56AM (#11734281) Journal
    Intel has been wanting to do this for years! I remember reading old articles on The Register about it, and how they pulled back because Microsoft didn't like the idea of Intel taking over things that Microsoft was handling in software, including managing networking instead of having the OS do it.

    Of course it couldn't last. What with nVidia doing firewalls and NICs and all sorts of other things, Intel is a big company and they know when they need to compete. MS has also lost a bit of their clout when it comes to pressuring the bigger companies (Intel, HP, Dell).
  • by evilmousse (798341) on Monday February 21, 2005 @05:10AM (#11734336) Homepage Journal

    i'd guess the tcp/ip stack implementations available to intel are pretty solid. still, i'd hope it'd be flashable just in case. i can imagine only once in a blue moon would you find someone with libpcap and the patience to find holes in some of the most trusted code in the net.
  • by flacco (324089) on Monday February 21, 2005 @06:23AM (#11734563)
    ...when you can get AOL internet accelerator for FREE!
  • by tjlsmith (583149) * on Monday February 21, 2005 @06:38AM (#11734607)
    and how much DRM are they going to build onto the motherboard, just in passing?

    Don't think for a minute the big boys aren't trying to take the Internet away from us. They missed the opportunity once, never twice.

  • DoS Attacks (Score:3, Interesting)

    by Gary Destruction (683101) * on Monday February 21, 2005 @06:52AM (#11734652) Journal
    Will this technology make it easier for systems to withstand DoS Attacks?
  • This is ridiculous.

    We've had this for years in FPSes - I used to have to practice for ages just to compete with the young kids. Then along came some great 'acceleration' technology, and it's been so much easier. I call mine a bot.

    Ever since it hasn't been about upgrading my CPU or graphics cards to get that head-shot. I've been offloading all that work!
  • by argent (18001)
    Hasn't 3COM already implemented this, putting higher level stack elements in their firmware?
  • by Ancient_Hacker (751168) on Monday February 21, 2005 @08:33AM (#11734996)
    The nightmare continues. It goes something like this: some drooling "computer scientist" is too dumb to do anything useful, so he speculates: "Wouldn't it be nice to free up this $XXXX CPU from this humdrum task (choose: moving bits/bytes/pixels/packets)?" He finds a brain-addled silicon-stuffer to design a chip to do just that. All rejoice at the increased efficiency.

    Except:

    • The silicon-stuffer only has access to the slow processes of maybe two silicon generations back, unlike the CPU, which paid for the latest whizzy xx-picofurlong process. So the supposedly whizzy chip is still not particularly faster than the CPU.
    • The whizzy chip shows up late, just about when the associated CPU is going to take a 2x speed hike.
    • The chip is on the I/O bus, requiring many slow I/O cycles, with interrupts masked, to get its commands.
    • Said whizzy bit-banger doesn't have any software support from the main operating systems.
    • The silicon-etcher guy can't write English worth a damn, so nobody can understand the spec sheet.
    • And oh, he didn't know the bus was active-low, so all the data packets have to be inverted.
    • And sometimes byte-reversed too.
    • The chip designer doesn't know or care about the whole system, so the chip does several things that spoil the overall performance, like hogging the bus, saturating the bus snoop logic, poisoning the cache, interrupting too often, etc.
    • The droolers forgot to think about the multi-processor option, so the chip doesn't share well with multiple CPUs.
    • The chip is all hard-wired gates, so there's no way to fix the problems.
    Finally some software wizard finds a way of speeding up the code that runs in the CPU so it's now faster than the separate chip, so the chip is now useless and just an extra power waster.

    We've seen successive waves of this concept, none of them have had much success. Graphics processors are one partial exception, and it took almost a decade of mis-designs of those before they became stable enough to be usable.

  • Umm, haven't we been here before with the Intel PRO cards?

    They at one point did just the plain PRO/100 cards, then started doing PRO/100 cards that did IPSEC hand-off. If I remember correctly, the S model was for security, and they had a few other models. I was thinking back then that they would be looking at IP hand-off at some point.
  • by Hobart (32767) on Monday February 21, 2005 @12:28PM (#11736543) Homepage Journal
    A while ago I looked up what the original authors of BSD-on-the-386 (386bsd [wikipedia.org]) had been up to. I just searched again and found http://www.interprophet.com [interprophet.com] and http://www.telemuse.net [telemuse.net] ...
    Their new gig was putting the TCP/IP stack into silicon for performance; the Internet Archive version [archive.org] says they've been at it since 1989...
    I wonder if Intel licensed their patents, or if this is similar stuff...
