Data Storage

HyperSCSI Examined

An anonymous reader writes "Eugenie Larson of byteandswitch.com has published a brief article that reviews the HyperSCSI protocol, which, like iSCSI, allows for an IP based SAN. The twist of HyperSCSI is that it's open source and runs over raw ethernet, avoiding the overhead of TCP/IP. The article has some comments from early adopters of HyperSCSI, as well as some comments from top vendors in the iSCSI industry."
  • I read somewhere that it's like 5 times faster than SCSI over TCP/IP. Is it true? And how great is the sacrifice of not using TCP/IP? I mean, what doesn't support Ethernet these days?
  • by miguel_at_menino.com ( 89271 ) on Saturday September 27, 2003 @07:30PM (#7074804)
    The summary says "IP based" then "without the overhead of TCP/IP" and then "raw ethernet".

    Which one?

    Can't be both IP based and raw ethernet at the same time.

    You don't expect us to RTFA do you?
    • I R'd TFA, and it doesn't say anything about IP-based SANs.

      It does, however, call the protocol a "beer can with an engine" (or some such colorful metaphor for 'kludge').

      I'm not a bit twiddler or an electrical engineer, but it looks to me like this is reinventing the wheel.

      • Not quite reinventing, just reengineering. To keep the analogy, it's like reformulating the rubber to provide better "grip" in racing tires - great for flat, dry tracks, but not great for inclement weather. In this case, they are redesigning TCP to remove all of the stuff that is unnecessary for this particular purpose:

        • sliding window sizes - block data transfer needs to start with large windows, not slowly "wind up" to them
        • error checking - this is done already by the SCSI protocol, and the Ethernet p
    • by ProtonMotiveForce ( 267027 ) on Saturday September 27, 2003 @07:34PM (#7074826)
      TCP is a layer above IP. Hence, the two are not mutually exclusive.

      You can have "without the overhead of TCP/IP" and "IP based". All IP gets you is an address format and ARP type standards, it's not a lot of overhead.
      • Yes but "raw ethernet" is exclusive to mean you're not getting as far as IP... that's a physical first-layer protocol and that's it.
        • NO, Ethernet is also a layer 2 protocol. HyperSCSI runs as a layer 3 protocol over Ethernet's layer 2. Remember, Ethernet is both a layer 1 protocol (At the physical side) and a layer 2 protocol (Data Link).

          IP is Layer 3. HyperSCSI is Layer 3.

          • by Alien Being ( 18488 ) on Saturday September 27, 2003 @09:59PM (#7075322)
            "NO"

            #   # ######  ####
             # #  #      #
              #   #####   ####
              #   #           #
              #   #      #    #
              #   ######  ####

            "HyperSCSI runs as a layer 3 protocol over Ethernet's layer 2."

            Okay, so where's the IP layer? Wait, wait, don't tell me... it's on the bongos, right?

            HyperSCSI runs on top of a raw datalink. IP doesn't enter into it.
            • HyperSCSI can also run over IP (ie UDP presumably) according to their docs.
              • Yeah, but it does that by tunneling over TCP/IP (UDP based HyperSCSI doesn't appear to be implemented).
                It's pretty much tunneling Ethernet frames over TCP/IP.

                • ooh interesting.. so the UDP version encapsulates ethernet header et al? How does that work then with the MAC addr? Why the hell does it even want to have the MAC addr's included in the HS/IP flavour? (of no use to the remote end).

                  It seems really strange for HS/IP to include the ethernet header and I get the impression from the PDF on their site that this isn't the case. Can you provide a reference?
                • When someone says "TCP/IP" they are referring to the entire protocol suite, which involves TCP, UDP, ICMP, and other stuff on top of IP.
                  It does not at all mean that they are using TCP itself.

            • IP is Layer 3. IP Rides the raw Datalink. IP is the same layer as HyperSCSI. Raw Datalink is layer 2.

              Ethernet is both a Layer 1 topology and a Layer 2 Datalink protocol. That's why you can push ethernet frames over dissimilar topologies (Like 100baseFX and LANE over ATM).

              OSI Layer 1 is Physical (Ethernet is here)
              OSI Layer 2 is Datalink (Ethernet is also here)
              OSI Layer 3 is Network (IP and HyperSCSI live here)
              OSI Layer 4 is Transport (TCP, UDP and the SCSI side of HyperSCSI live here)
              • I think if you read again, and this time don't assume every poster above you doesn't already know this, you'll see the point they were trying to make.

                If this uses its own Layer 3 protocol (presumably, and for the sake of argument, called HyperSCSI), then it's NOT IP BASED... and the article summary indicated it was IP based, then contradicted itself.

      • When someone says "TCP/IP" they are referring not to a protocol, but a protocol suite. They are NOT referring to "TCP and UDP ONLY".

        So, if you say something is not based on TCP/IP, that indeed DOES mean it does not use IP. Furthermore, saying it uses "raw ethernet" indicates that this uses its own layer 3 protocol, other than IP.

    • How about UDP? It's IP based, but doesn't have the overhead of TCP.

      For people who wouldn't know this kind of stuff, TCP does much to ensure that every packet arrives as it was sent. This adds overhead, but it's hardly ever seen by any end user because it's pretty universal. UDP has no error recovery (a lost or damaged packet is simply gone), so it isn't fit for anything where any particular packet matters. On the plus side, overhead is severely reduced. I imagine UDP is used for streaming audio and video, but I don't know.
    • It's both apparently, i.e. it can run both inside ethernet (HS/ethernet, frame type 0x889a) and IP (HS/IP, UDP port 5674). It apparently implements its own flow-control and optional authentication/encryption framework.

      RTFA? yes, perhaps you should have but instead you're moderated as insightful. (meta-mods?)
    • Just to clarify what a bunch of people are getting mixed up about.
      • Saying "TCP/IP" does not normally mean TCP.. it means the entire protocol suite. Saying something runs "on tcp/ip" is ambiguous.
      • "Raw ethernet" usually means "another layer 3 protocol"
      • iSCSI uses TCP as a transport. iSCSI IETF Draft [ietf.org].
      • HyperSCSI is a layer 3 protocol. HyperSCSI Spec [a-star.edu.sg]
      • HyperSCSI on ethernet uses an EtherType field of 0x889a.
      • With iSCSI, you can route over an IP network to your devices.. you could have a storage subnet, for ins
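To make the "raw ethernet" point concrete, here is a minimal Python sketch of what a frame carrying the EtherType mentioned above looks like on the wire. Only the value 0x889a comes from the thread; the MAC addresses, the function names, and the opaque payload are made up for illustration, and the real HyperSCSI packet layout is not reproduced here. The thing to notice is that the header holds two MAC addresses and a type field and nothing else, so there is no IP address for a router to act on.

```python
import struct

# EtherType quoted in the thread for HyperSCSI over Ethernet.
HYPERSCSI_ETHERTYPE = 0x889A


def build_frame(dst_mac: bytes, src_mac: bytes, payload: bytes) -> bytes:
    """Prepend a 14-byte Ethernet II header to an opaque payload.

    The payload format is NOT the real HyperSCSI layout; it is treated
    as an opaque blob in this sketch.
    """
    if len(dst_mac) != 6 or len(src_mac) != 6:
        raise ValueError("MAC addresses must be exactly 6 bytes")
    # Network byte order: destination MAC, source MAC, 16-bit EtherType.
    header = struct.pack("!6s6sH", dst_mac, src_mac, HYPERSCSI_ETHERTYPE)
    return header + payload


def parse_ethertype(frame: bytes) -> int:
    """Read the 16-bit EtherType at byte offset 12 of an Ethernet II frame."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    return ethertype


# Hypothetical, locally administered MAC addresses.
dst = bytes.fromhex("020000000002")
src = bytes.fromhex("020000000001")
frame = build_frame(dst, src, b"opaque-hyperscsi-payload")
print(hex(parse_ethertype(frame)))
```

A receiver on the same segment would dispatch on that type field the same way it hands 0x0800 frames to its IP stack.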
  • Bridge Board (Score:4, Interesting)

    by Detritus ( 11846 ) on Saturday September 27, 2003 @07:32PM (#7074816) Homepage
    I like the idea. Ethernet hardware is dirt cheap and fast. What it needs is a cheap IDE bridge board. That would let you put some IDE drives in an external enclosure and plug them into the local LAN.
    • According to the homepage for HyperSCSI, it can support IDE (as well as USB and Fibre Channel) devices:


      To put this in "ordinary" terms, it can allow one to connect to and use SCSI and SCSI-based devices (like IDE, USB, Fibre Channel) over a network as if it was directly attached locally.
      • I believe his idea was to have an IDE drive with a power connector and an ethernet connector. That way you don't need another computer holding the drives running the hyperSCSI server.
    • Ethernet hardware is dirt cheap and fast.

      But no cheaper than anything else could be if it gained popularity.

      The big problem is the interrupts, and the processing power used. High speed transfers, and you need an incredibly fast computer to handle the storm of interrupts from your ethernet card. That's why (expensive) fibre channel is used, instead of (cheap) networking technologies.
  • And that's not all. Since HyperSCSI was released as open-source software last year, it's free and licensed under the GNU General Public License (GPL). "It's not only free as in beer, but also free as in speech," says Jesse Keating.

    and

    "I would describe it as a beer can with a motor," says Andre Hedrick, president and CTO of iSCSI software vendor PyX Technologies Inc. [Ed. note: I need a beer!] "It will go really fast, but just hope there's not a problem, because there's nothing there to protect you."

    Mmmm

    • Re:favorite quotes (Score:2, Interesting)

      by anti-NAT ( 709310 )
      BTW, Andre Hedrick is one of the main IDE developers for Linux.

      I certainly appreciate his IDE efforts, but of course he is going to criticise the technology - his company is an iSCSI company!

      What, do they think he is going to say, "Gee, and all this time, I've been thinking that iSCSI is the right thing to work on. I'm going to abandon iSCSI right now, and start playing with this HyperSCSI thing."
  • by ryanmoffett ( 265601 ) on Saturday September 27, 2003 @07:38PM (#7074836)
    If you go to the MCSA site and look at the HyperSCSI FAQ, it does implement reliability and flow control, just not in the same manner as TCP.

    The only technical negative side I can see (at this time) is that because the implementation isn't over IP, you can't traverse a router. This usually isn't a problem but could cause some inflexibility in larger deployments.
    • It'll cause problems in smaller environments too, if your goal is to replicate data offsite. The FAQ says that there might be a version that runs over UDP in the future. In the meantime you get to use bridging (yay!) if you want to move data to a different segment.
      • Use loose and fast HyperSCSI on your local segment where it's possible, and use a concentrator that translates into iSCSI over IPSEC for secure WAN connectivity.

        That way you only need to buy one TOE card per WAN edge. Those can get expensive!
    • Fiber Channel SANs aren't based on IP either, yet people manage to do off site replication with them.

      I don't know how far away you want to put your off-site backup, but Cisco have been selling a GBIC (Gigabit Interface Converter ? Too many FLAs for my head these days), which they've been calling 1000BaseZX, which will send a GigE signal around 90 Kilometers over single mode fibre.

      Even Full Duplex Fast Ethernet over multi-mode fibre will go 2 Kilometers.

      You can build some really big ethernet networks the
    • If you need a router, then clearly this protocol isn't for you and you should use iSCSI. The whole point of this is performance gain because in most SCSI setups, the parts are all right next to each other and can live on their own single-switch network...
    • Last time I looked at it (a while back) none of their performance figures were over gigabit. Or that recent. The question is can you run a TCP-like protocol over ethernet that is more efficient than TCP (which is optimised by lots of kernel hackers)? NFS tried to run over UDP, and on a modern OS TCP gives much better performance than anything they came up with. Although there are interesting developments like the fact that gigabit optionally has flow control which could help if the protocol was aware of it (
  • While putting SCSI on raw Ethernet may speed up performance, there are also disadvantages associated with skirting TCP/IP, Smith says. "Without TCP/IP, it has no real error-recovery mechanism or guarantee that packets get delivered. It also appears to be quite limited in scaleability."

    And this is a technology breakthrough? I wouldn't want my data travelling down a wire with no error recovery no matter how small the error rate.

    • Vanilla Ethernet can be extremely reliable, without any additional layered protocols. I would be much more concerned about the reliability of the installation's AC power supply and distribution system.
      • I see you've never had an Ethernet duplex mismatch or encountered a bad cable, have you? It happens more often than you think...
        • How is that different than all of the many things that can screw up a SCSI or IDE setup?

          My experience is that a properly installed and tested Ethernet network is very reliable.

          • My experience is that a properly installed and tested Ethernet network is very reliable.

            And therein lies your problem. You just can't get around the human error factor. People incorrectly install equipment and configuration mistakes abound in a corporate environment.

    • remember that there are other ways to do error recovery besides with tcp. this system could detect errors by sending a crc of the total packet/sector sent and the receiving end would do the same crc. read-after-write would also detect bad stuff.

      eric
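The CRC idea in the comment above can be sketched in a few lines of Python. This is only an illustration of the general technique, not how HyperSCSI actually frames its data: the 512-byte sector size, the 4-byte trailer, and the function names are all assumptions, and `zlib.crc32` stands in for whatever checksum a real implementation would use.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size for this sketch


def attach_crc(sector: bytes) -> bytes:
    """Sender side: append a 4-byte big-endian CRC-32 of the sector."""
    crc = zlib.crc32(sector) & 0xFFFFFFFF
    return sector + crc.to_bytes(4, "big")


def verify_crc(message: bytes) -> bool:
    """Receiver side: recompute the CRC and compare it with the trailer."""
    sector, trailer = message[:-4], message[-4:]
    return (zlib.crc32(sector) & 0xFFFFFFFF) == int.from_bytes(trailer, "big")


sector = bytes(range(256)) * 2          # 512 bytes of sample data
message = attach_crc(sector)

corrupted = bytearray(message)
corrupted[100] ^= 0x01                  # flip a single bit "in transit"

print(verify_crc(message))              # intact copy passes
print(verify_crc(bytes(corrupted)))     # damaged copy is rejected
```

Detection is only half of error recovery: once the receiver rejects a sector, something still has to ask for it again.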
    • tuned for SCSI commands and data transfers. This is the particularly interesting part of the protocol. It assumes you're going to be doing bulk transfers, and lets both ends negotiate windows for performance (as opposed to using a sliding scheme).

      As I see it, the real problems:

      - SMP "experimentally" supported
      - client and server can't coexist on same box
      - client model is not decoupled enough from the server (a server going down can mean the client could crash)

      It appears the driver software needs some work
      • - SMP : If you mean on the server, who cares. most 'servers' are going to be embedded things much like a DEC HSG80 with a single task processor and a fuckton of disks. If you mean on the client, yeah, this is a problem. It should be fixed.
        - No client/server on the same box: Why the HELL would you want to do this? Most clients would have small local disks, if any, and mount most of their storage from a 'server' (possibly shared with other boxes using c.f. OpenGFS)
        - Server crashing can crash a client: Most
        • There is a possibility that you would have a small number of machines in a clustered configuration that are serving a database or other disk intensive task, and who share a large, distributed volume.

          Since HyperSCSI is supposed to help you save money by enabling cheap hardware utility, I'd expect the flexibility to realize something like that to maximize the utility of connection-rich 1U/blade servers available today.

          Also, don't you think it might be useful that a member of the disk array could access the
  • It's cute. but.... (Score:2, Insightful)

    by nugatory ( 205289 )
    Two big disadvantages:

    First, Ethernet can't be routed, so hyperSCSI isn't going to be nearly as flexible as iSCSI. This is the reason that just about everything that might want to be routed is usually carried over IP (and TCP and UDP and other stuff on top of IP). Straight ethernet is for stuff like ARP that really doesn't want to leave a network segment.

    Of course, one could reasonably do something hyperSCSI-like across IP, and still save the TCP overhead. (Consider that in a low-loss short-hop environ
    • Yeah, this looks like the results of the kind of factory who lays off its QC department since they only find a problem with 1 out of every 200,000 units the place makes.

      Yeah, it's five-nines reliability, for a factory that makes 1,000,000 units a day, those 5 mistakes a day are gonna add up...
    • HyperSCSI /can/ be routed as it /can/ run over IP. Presumably you'd still need to translate HS/ethernet to/from HS/IP if you have HS/ethernet devices. (unless devices could simultaneously speak both HS/ethernet and HS/IP - quite possible).

      Lossy transport: HS implements its own flow-control.

      The biggest concern I'd have is the lack of an integrity check. Most modern link layers do have their own integrity check, but usually pretty basic - they can miss errors. (see Linus' story on how his sources
  • But I'd prefer to use it over gigabit ethernet, or at the very least a separate ethernet device from the one I use for my LAN.

    Can multiple computers access the same drives through this?

    On another note, is it possible to network over traditional SCSI, by changing the SCSI card ID's to make them co-operate on the same chain? Does an implementation of this exist in Linux, *BSD, or Solaris?
    • YES! Or something like that. I once found a site (I lost the link) that had made a little kernel driver to run IP over SCSI. It was for one of the 2.2 kernels, IIRC, and it would only work with some SCSI cards for some reason.

      All in all, a very interesting idea, even if it's not practical (low number of devices on a SCSI bus, short cable length, etc). Maybe for a little mini-cluster of PC/104 boards if they had built in SCSI or something...

      OK, after a quick googling, I've found this [sourceforge.net] site. There are othe [google.com]

    • Can multiple computers access the same drives through this?

      I work for one of the first companies that is implementing Oracle RAC (2 and 4 node systems) over iSCSI to a Network Appliance.
      I gotta tell you, it works great.

      -djbkr

  • by Anonymous Coward
    This is great for folks that want to be locked into a single vendor without any path to get out

    Didn't the article state that HyperSCSI is GPL and runs on Linux? What the fuck is this guy talking about?
    • The GPL is not specific to Linux. You can have an application built upon Win32 and released under the GPL, and it will not run on Linux. The point of the GPL is not to help any particular operating system, but to assist software to have a conditional freedom and authority backed by copyright law in recognition that if it were true public domain then it could be "hi-jacked" into another software.

      The GPL establishes penalties for people or artificial entities that don't provide the sourcecode of the softwa
  • by bourne ( 539955 ) on Saturday September 27, 2003 @08:29PM (#7075025)

    The lessons of NFS are being ignored, and I'd expect HyperSCSI to die when it hits the same limitations.

    NFS started out UDP-based, and moved toward TCP with NFSv3. Why? Because having all that error correction done at the network layer made for a better product; TCP does all the work to ensure packets aren't lost or out-of-order. UDP doesn't, and the NFS application layer had to handle it, making it slower, more painful, and a duplication of effort better spent elsewhere.

    The industry guys are almost right on this one. It isn't a beer can with a motor; it's a beer can with an M-80. Fun to watch when it works right, damn painful if you screw it up.
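For a sense of the bookkeeping an application has to take on when the transport makes no promises, here is a rough Python sketch. It is not taken from NFS or HyperSCSI; the sequence-numbered datagrams and the `reassemble` helper are invented for illustration. It drops duplicates, delivers only the contiguous in-order prefix, and reports which sequence numbers would need to be requested again.

```python
def reassemble(datagrams, expected_count):
    """Sort out unreliable datagrams at the application layer.

    datagrams: iterable of (sequence_number, payload) in arrival order.
    Returns (in_order_payloads, missing_sequence_numbers).
    """
    received = {}
    for seq, payload in datagrams:
        # Keep the first copy of each sequence number, ignore duplicates.
        received.setdefault(seq, payload)

    # Everything not seen yet would have to be retransmitted.
    missing = [s for s in range(expected_count) if s not in received]

    # Only the contiguous prefix can be handed to the consumer safely.
    in_order = []
    for s in range(expected_count):
        if s not in received:
            break
        in_order.append(received[s])
    return in_order, missing


# Datagram 1 is lost, datagram 0 arrives twice, the rest are reordered.
arrived = [(2, b"C"), (0, b"A"), (0, b"A"), (3, b"D")]
delivered, to_retransmit = reassemble(arrived, 4)
print(delivered, to_retransmit)
```

TCP does the equivalent of this, plus timers and congestion control, inside the kernel, which is the duplication of effort the comment above is pointing at.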

    • by nugatory ( 205289 ) on Saturday September 27, 2003 @09:18PM (#7075161)
      That's a good point, and needs a bit more modding up.

      It's worth adding, however, that the hyperSCSI folks are trying to make a distinction between wider-area networks (which they call SWANs for Storage WAN) and local single-segment (since they aren't routing) networks, and arguing that iSCSI is right for the former and hyperSCSI, because it's faster/cheaper, for the latter.

      This view has parallels in the history of NFS over TCP versus NFS over UDP, because NFS/UDP is still hanging on in one niche: short-haul, high-speed, low-latency, few-hops, negligible-loss environments.

      It also has parallels with the bad old days when direct-attached storage interconnects were much faster than LANs, so one set of protocols (FCP, SCSI, ESDI, IDE, SIMD...) evolved on the short fat pipes used to connect computers to peripherals, and a completely different set of protocols (ethernet, TCP/IP, SDLP, ...) evolved for the long thin pipes used to connect computers to one another.
      Similarly, hyperSCSI is an argument that the two domains are different enough to justify different protocols. That seems to be arguing against a historical trend that says that the short/fat and long/thin differences are vanishing; compare gigE and fibrechannel as _wires_ today.

      All of this just reinforces Bourne's general point about ignoring the history. It's pretty clear that NFS over TCP is where the world is going, and the only reason that there's an NFS over UDP hanging around is that's how all NFS used to be, so some still is. When we compare hyperSCSI to iSCSI over TCP, I can't find any reason not to just deploy iSCSI everywhere and be done with it.

    • While UDP clearly was the wrong choice, TCP isn't the right choice either: TCP is a reliable, connection-oriented stream protocol. But for file and device access, you want a reliable connectionless protocol.

      There are a number of choices. Plan 9 defined IL for just this purpose, there is SCTP for telephony applications, and several operating systems support RDM ("reliably delivered message"). Maybe people should just start using one of them, rather than reinventing the wheel by building on top of unrelia
  • Saying that HyperSCSI is open source or HyperSCSI is under the GPL is pretty meaningless; those concepts don't apply to protocols. A HyperSCSI implementation is under the GPL, but so what? There are open source iSCSI implementations, too.
  • HyperSCSI, huh?

    Well, let's just add that to SCSI, SCSI 2, Fast SCSI, Wide SCSI, Fast Wide SCSI, Narrow SCSI, Ultra SCSI (aka SCSI 3), Ultra-2 SCSI, Ultra-3 SCSI (Ultra-160 SCSI to some), Ultra-320 SCSI and iSCSI. (I'm sure I've missed something out.)

    So what's next for this party? UberSCSI? 1337SCSI? TheOneRingSCSI?
  • by kfg ( 145172 ) on Saturday September 27, 2003 @08:56PM (#7075103)
    Never underestimate the bandwidth of a pair of sneakers carrying a hotswappable hard drive jogging down the hallway.

    KFG
  • I met Andre Hedrick at the linuxworld show, and thought he was very sharp. Don't dismiss him out of hand.
    iSCSI has drivers for every OS you can imagine, written by Cisco, IBM, Microsoft, and released under the GPL. This is from the iSCSI sourceforge page.

    To attach to storage, you must also have an iSCSI-capable device connected to your network. The iSCSI device may be on the same LAN as your Linux host, or the iSCSI traffic may be routed using normal IP routing methods.

    The daemon and the kernel driv

  • Well, first off, "HyperSCSI" isn't such. All it is is just a correctionless protocol over ethernet hardware. Really, that's a bad idea. You'd be better off if you used ethernet hardware that sped up IPSEC and related IP protocols.

    1: Create 2 networks, one being the normal network, and one being the SAN

    2: Use GigE cable on your SAN with the inclusion of advanced Network FS'es like Coda.

    3: Provide POP's that connect the external network with the internal SAN/serverNET by way of tunnels and port forwarding.
  • by brer_rabbit ( 195413 ) on Sunday September 28, 2003 @12:43AM (#7075832) Journal
    Will you be able to boot from these devices? I don't see any mention of it in the article; I'd imagine you'd need support both in the BIOS and possibly the network card..?
  • HP's Graham Smith says:

    "Without TCP/IP, it has no real error-recovery mechanism or guarantee that packets get delivered."

    But that is wrong. There is error checking in the ethernet hardware and in the SCSI stack. It seems Smith needs to review the basic material, or should have at least read the introductory material [a-star.edu.sg]. Perhaps the takeaway here is, managers should not be allowed to comment on technical material, or if they do, they should solicit advice from a practicing engineer first.

    Smith also dumps
    • HP's Graham Smith says: "Without TCP/IP, it has no real error-recovery mechanism or guarantee that packets get delivered." But that is wrong. There is error checking in the ethernet hardware and in the SCSI stack. It seems Smith needs to review the basic material,

      Sorry, Daniel :) As Linux's network driver maintainer and author of a Serial ATA stack which goes through the Linux kernel SCSI layer... as well as being someone who reviewed ATA-over-ethernet [coraid.com] ...

      I can say that ethernet hardware and the

      • "HP's Graham Smith says: "Without TCP/IP, it has no real error-recovery mechanism or guarantee that packets get delivered." But that is wrong. There is error checking in the ethernet hardware and in the SCSI stack. It seems Smith needs to review the basic material"

        I can say that ethernet hardware and the Linux SCSI stack do not handle retransmits that are needed at the ethernet layer. HP's Graham Smith is precisely correct. As some other slashdotters pointed out, the HyperSCSI code includes logic to hand
