Data Storage Hardware

Wear Leveling, RAID Can Wipe Out SSD Advantage 168

storagedude writes "This article discusses using solid state disks in enterprise storage networks. A couple of problems noted by the author: wear leveling can eat up most of a drive's bandwidth and make write performance no faster than a hard drive, and using SSDs with RAID controllers brings up its own set of problems. 'Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs. I am not talking about a disk tray; I am talking about the whole RAID controller. If you want full performance of expensive SSDs, you need to take your $50,000 or $100,000 RAID controller and not overpopulate it with too many drives. In fact, most vendors today have between 16 and 60 drives in a disk tray and you cannot even populate a whole tray. Add to this that some RAID vendor's disk trays are only designed for the performance of disk drives and you might find that you need a disk tray per SSD drive at a huge cost.'"
  • by OS24Ever ( 245667 ) * <trekkie@nomorestars.com> on Saturday March 06, 2010 @01:39PM (#31381688) Homepage Journal

    This assumes that RAID controller manufacturers won't be making any changes though.

    RAID has relied on millisecond access times for years, so why spend a lot of money on an ASIC and subsystem that can go faster? Take a RAID card designed for (relatively) slow spinning disks, attach it to SSDs, and of course the RAID card is going to be a bottleneck.

    However, subsystems are going to be designed to work with SSDs, which have much lower access times. When that happens, this so-called 'bottleneck' is gone. You know every major disk subsystem vendor is working on these. It sounds like a disk vendor is sponsoring 'studies' to convince people not to invest in SSD technologies now, knowing that a lot of companies are looking at big purchases this year because of the age of equipment after the downturn.

    • Re: (Score:3, Insightful)

      The article is talking about stuff that's available today. They aren't saying "SSDs will never be suitable", they're saying they aren't suitable today. Why? Because none of the hardware infrastructure available is fast enough.

      • by vadim_t ( 324782 ) on Saturday March 06, 2010 @02:14PM (#31381898) Homepage

        Sure, but why do you put 60 drives in a RAID?

        Because hard disks, even the high-end ones, have quite low IOPS. You can attain the same performance level with far fewer SSDs. If what you need is IOPS rather than lots of storage, that's even a good thing: you reach the required level with far fewer drives, so you need less power, less space and less cooling.

        • by Anpheus ( 908711 ) on Saturday March 06, 2010 @02:41PM (#31382088)

          I agree. 60 drives in RAID 0 are going to see between 150 and 200 IOPS per drive, maybe more for 2.5" drives, right? So that's about 12,000 IOPS.

          The X25-E, the new Sandforce controller, and I believe some of the newer Indilinx controllers can all do that with one SSD.

          $/GB is crap, $/IOPS is amazing.
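
          To put the parent's numbers side by side, here's a quick back-of-envelope in Python. Only the per-drive IOPS figures come from this thread; the capacities and prices are made-up placeholders, so treat the output as an illustration of the $/GB vs. $/IOPS trade-off rather than a quote for any real product.

          # How many drives does it take to hit a target random-IOPS figure,
          # and what does that do to $/GB vs $/IOPS? Per-drive IOPS are the
          # thread's ballpark numbers; capacities and prices are invented.

          def drives_needed(target_iops, iops_per_drive):
              """Smallest whole number of drives whose combined IOPS meets the target."""
              return -(-target_iops // iops_per_drive)  # ceiling division

          def summarize(name, iops_per_drive, gb_per_drive, price_per_drive, target_iops=12_000):
              n = drives_needed(target_iops, iops_per_drive)
              total_price = n * price_per_drive
              total_gb = n * gb_per_drive
              print(f"{name}: {n} drives, ${total_price:,} total, "
                    f"${total_price / total_gb:.2f}/GB, "
                    f"${total_price / (n * iops_per_drive):.2f}/IOPS")

          # Hypothetical drives; only the IOPS columns echo the discussion above.
          summarize("15K RPM HDD", iops_per_drive=200, gb_per_drive=146, price_per_drive=250)
          summarize("SSD (X25-E class)", iops_per_drive=12_000, gb_per_drive=64, price_per_drive=700)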

          • The next problem is that large RAID 0 arrays will run into interface word-width limits. The reason the ATA and SCSI buses were parallel had to do with chip fanout and cable length (slew rate). When SATA and SAS arrived, they used single fast clocks to serialize the data, which conveniently also allowed longer cables.

            When you get a bunch of drives that can be accessed faster than the interface I/O rate, the interface I/O rate has to change, and that pushes RAID controller technology into the same realm of difficulty.

          • Disks are cheap. There's no reason to use the full GB (or TB) capacity, especially if you want fast response. If you just use the outside 20% of a disk, the random I/O performance increases hugely. ISTM the best mix is some sort of journalling system, where the SSDs are used for read operations and updates get written to the spinning storage (or NV RAM/cache). Then at predetermined times perform bulk updates back to the SSD. If some storage array manufacturer came up with something like that, I'd expect most perform
            • by Emnar ( 116467 )

              Yes, disks are cheap

              The power to run and cool them, and the space to hold them on your data center floor, are not.

          • $/GB is crap, $/IOPS is amazing.

            Quoted for truth. For applications where GBs are not needed but IOPS are, IOPS is the only thing that matters. Booting and applications usually need less than 40GB. We are already at the point where the total cost is more than worth it from a total system cost perspective. Games are there, or nearly there. I highly doubt that someone is going to have more than 160GB of games they play regularly, in which case SSDs are a logical choice for games. Low minimum FPS is what destro

        • Maybe you need 120TB of space; I don't see any SSDs yet where you can have that much, but it is doable with current HDD tech. I can see local SSDs on servers but not in SANs at the moment. We will probably get there sooner or later, at which time the various bottlenecks will have appeared and been solved.
          • If you need 120TB of space, you won't be doing it with only 60x 2TB drives if you have any regard for the integrity of your data (i.e., enjoy your massive data loss).
      • by TheLink ( 130905 )
        > The article is talking about stuff that's available today. They aren't saying "SSDs will never be suitable", they're saying they aren't suitable today.

        They are suitable today. You just don't RAID them using $50-100K RAID controllers.

        Anyway, the "Enterprise Storage" bunch will probably stick both SSDs and TB SATA drives in their systems for the speed and capacity (and charge $$$$$$). I think some are doing it already.

        Or you could stick a few SSDs in a decent x86 server with 10 Gbps NICs, and now you can
        • by itzdandy ( 183397 ) on Saturday March 06, 2010 @02:56PM (#31382230) Homepage

          You missed half the point. SSDs use wear leveling and other techniques that are very effective on the desktop, but in a high-IO environment the current wear leveling techniques reduce SSD performance to well below what you get on the desktop.

          I really think that this is just a result of the current trend to put high-performance SSDs on the desktop. When the market re-focuses, these problems will dissolve.

          This also goes for RAID controllers. If you have 8 ports with 3Gb SAS links, then you need to process 24Gb/s plus the IOPS of current 15k SAS drives. Let's just assume, for easy math, that this requires a 500MHz RAID processor. What would be the point of putting in a 2GHz processor? What if you increase the IOPS by 100x and double the bandwidth? Now you need to handle 48Gb/s of throughput and 100x the I/O, and that requires 2x 3GHz processors (a rough cycle-budget sketch follows below).

          It just takes time for the market players to react to each technology increase. New RAID controllers will come out that can handle these things. Maybe the current RAID CPUs have been commodity chips (PowerPC, often enough) because they were fast enough to handle these things, and the new technologies are going to require more specific processors. Maybe you need to get Cell chips or Nvidia GPUs in there, whatever it takes.

          I admit it would be pretty interesting to see the new Dell/LSI 100Gb SAS powered by Nvidia logo in Gen12 Dell servers.
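
          Putting rough numbers on that argument: the interesting quantity is how many CPU cycles the controller gets to spend per I/O. A minimal sketch in Python, assuming a 500MHz and a 2GHz controller CPU and ballpark per-drive IOPS figures; none of these are measured values for any real controller.

          # Cycle budget per I/O for a RAID controller CPU, ignoring DMA/offload.
          # Clock speeds and per-drive IOPS below are illustrative assumptions.

          def cycles_per_io(cpu_hz, total_iops):
              return cpu_hz / total_iops

          hdd_array_iops = 8 * 180        # 8 ports of 15K SAS, ~180 random IOPS each (assumed)
          ssd_array_iops = 8 * 30_000     # 8 ports of fast SSDs, ~30K IOPS each (assumed)

          for name, iops in [("15K SAS x8", hdd_array_iops), ("SSD x8", ssd_array_iops)]:
              print(f"{name}: {iops:>7,} IOPS -> "
                    f"{cycles_per_io(500e6, iops):>9,.0f} cycles/IO at 500MHz, "
                    f"{cycles_per_io(2e9, iops):>9,.0f} at 2GHz")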

          • by sirsnork ( 530512 ) on Saturday March 06, 2010 @03:33PM (#31382602)
            He may have half missed the point, but so did you.

            I clicked on this thinking this guy has done some testing... somewhere. Nope, nothing, no mention of benchmarks or what hardware he used. I'm sure some of what he said is true. But I'd really like to see the data that he gets the

            I have seen almost 4 to 1. That means that the write performance might drop to 60 MB/sec and the wear leveling could take 240 MB/sec.

            from. I'd also really like to know what controllers he's tested with, whether or not they have TRIM support (perhaps none do yet), what drives he used, if he had a BBU and write-back enabled, etc etc etc.

            Until he gives us the sources and the facts this is nothing but a FUD piece. Yes, wear levelling will eat up some bandwidth, that's hardly news... show us the data about how much and which drives are best

            • Re: (Score:3, Interesting)

              by itzdandy ( 183397 )

              I don't think I missed the point. I am just a little more patient than most, I guess. I don't think SSDs are ready from a cost/performance standpoint vs enterprise SAS 15k drives, due to the market's focus.

              The OP may not have listed the hardware and disks but each controller has info published on max throughput.

              This is very comparable to running U320 SCSI disks on a U160 card. The performance bottleneck is often NOT the U160 interface but rather that the controller was not over-engineered for its time. The

          • by TheLink ( 130905 )

            A fair number of the desktop stuff can take sustained writes for quite a long while, e.g. the entire disk or more.

            http://benchmarkreviews.com/index.php?option=com_content&task=view&id=454&Itemid=60&limit=1&limitstart=10 [benchmarkreviews.com]

            If that's not enough, some of the desktop benchmarks/tests involve writing to the entire disk first, and then seeing how far the performance drops.

            e.g.
            http://www.anandtech.com/printarticle.aspx?i=3702 [anandtech.com]

            See: "New vs. Used Performance - Hardly an Issue"

            They're not cheap, but

            • But that isn't indicative of enterprise loads. Enterprise loads such as databases do many, many seeks and tend to have long queues as many clients request data. Size and throughput are less important for these loads than seek time (though still critical).

              A desktop system can only (realistically) have a similar load in synthetic benchmarks.

              The server vs. desktop loads on disks are so different that they can't be directly compared. A great desktop drive can be a terrible server drive and vice versa.

              • Re: (Score:3, Informative)

                by TheLink ( 130905 )
                >Enterprise loads such as databases do many many seeks and tend to have long queues as many clients request the data. Size and throughput are less important for these loads than seek time (though still critical).

                Did you even read the link?

                "I saw 4KB random write speed drop from 50MB/s down to 45MB/s. Sequential write speed remained similarly untouched. But now I've gone and ruined the surprise."

                That's for random writes. 4KB random writes at 45MB/sec is 11520 writes per second.

                A 15000rpm drive doing 4KB r
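
                The conversion the parent is doing, spelled out in Python; the 15K-drive figure used for comparison is an assumed ballpark (~180 random IOPS), not a number from the article:

                # MB/s at a fixed block size -> I/Os per second.
                def iops_from_throughput(mb_per_sec, block_kb):
                    return (mb_per_sec * 1024) / block_kb

                ssd_4k_write_iops = iops_from_throughput(45, 4)   # 11,520 IOPS, as quoted above
                hdd_random_iops = 180                             # assumed typical 15,000rpm drive

                print(f"SSD at 45MB/s of 4KB random writes: {ssd_4k_write_iops:,.0f} IOPS")
                print(f"That is roughly {ssd_4k_write_iops / hdd_random_iops:.0f}x an assumed "
                      f"{hdd_random_iops}-IOPS 15,000rpm drive")
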
          • Re: (Score:3, Interesting)

            by gfody ( 514448 )
            This is why software-based RAID is the way to go for ultimate performance. The big SAN providers ought to be shaking in their boots when they look at what's possible using software like StarWind or Open-E with host-based RAID controllers and SSDs. Just for example, look at this thing [techpowerup.com] - if you added a couple of CX4 adapters and ran Open-E you'd have a 155,000 IOPS iSCSI target there in what's basically a $30k workstation. 3PAR will sell you something the size of a refrigerator for $500,000 that wouldn't even p
            • Re: (Score:3, Interesting)

              by itzdandy ( 183397 )

              How about OpenSolaris with ZFS? You get a high-performance iSCSI target and a filesystem with re-ordered writes that improves I/O performance by reducing seeks, plus optional deduplication and compression.

              Additional gains can be had from separate log and cache disks, and with 8+ core platforms already available you can blow a traditional RAID card out of the water.

              One nice thing about software RAID is that it is completely agnostic to controller failure. If you need to recover a RAID after a controller failure, y

              • Re: (Score:3, Informative)

                by gfody ( 514448 )
                ...not to mention the gobs and gobs of cheap SDRAM you could use as cache. There's a huge opportunity for an up-and-coming SAN company to be competitive with commodity hardware. Doesn't look good for the likes of 3PAR, EMC, EqualLogic, etc.
            • Re: (Score:3, Interesting)

              by petermgreen ( 876956 )

              Wow, nice motherboard there: two I/O hubs to give seven x8-electrical/x16-mechanical slots, along with an x8 link for the onboard SAS and an x4 link for the onboard dual-port gigabit.

              http://www.supermicro.com/products/motherboard/QPI/5500/X8DTH-i.cfm [supermicro.com]

          • Not to be rude, but I'm guessing there are only 20-30 engineers in the world who have any idea what the current wear-leveling state of the art is, and how it affects performance.

            There's a huge variety in the quality of controllers in the marketplace, and just because one design has a given advantage or flaw, doesn't mean others share those attributes.

            • I don't think it's about 'quality' so much as target. Desktop drives and server drives are different things, and wear leveling between the two could be significantly different. Right now, I think that everything has been targeting desktops and notebooks, and those products are adapted to the server space rather than being developed from the ground up for servers.

              Also, what indication do you have that only 20-30 people are experts in wear leveling? I would expect that many engineers at EACH flash memory vendor and

        • Welcome to the multi-tiered storage world. There are places and applications where SSDs are a perfect fit, and places where they are not. Eventually server builders will find a place where both work in tandem to give you the performance you were wanting to begin with. An SSD is a fully cached drive, and that's not necessary in all applications. For some applications, TBs of RAM is the better option. Combinations of various storage technologies will find their niche markets. SSDs are not financially practical for all

      • Re: (Score:3, Informative)

        by rodgerd ( 402 )

        ceph [newdream.net], XIV, and other distributed storage controller models are available today, and avoid controller bottlenecks.

      • by Twinbee ( 767046 )

        Why is it so hard for developers of ports and interface standards to get it super fast, first time round? It's not like there's a power issue and there's no worry about having to make things small enough (as with say the CPU).

        For example, let's take USB:
        USB 1: 12 Mbit/s
        USB 2: 480 Mbit/s
        USB 3: 4 Gbit/s

        Same goes for video and SATA etc. Perhaps I'm being naive, but it seems like they're all a bit short-sighted. They should develop for the hardware of the future, not artificially limit the speed to what current

        • Re: (Score:3, Informative)

          by amorsen ( 7485 )

          Why is it so hard for developers of ports and interface standards to get it super fast, first time round? It's not like there's a power issue and there's no worry about having to make things small enough (as with say the CPU).

          There IS a power issue, and most importantly there's a price issue. The interface electronics limit speed. Even today, 10Gbps Ethernet (10GBase-T) is quite expensive and power hungry. 40Gbps Ethernet isn't even possible with copper right now. They couldn't have made USB 3 40 Gbps instead of 4; the technology just isn't there. In 5 years maybe, in 10 years almost certainly.

          USB 1 could have been made 100Mbps, but the others were close to what was affordable at the time.

          • by Twinbee ( 767046 )

            Isn't it possible to make a wired connection where speed is determined by the amount of data sent? If true, then the power would only increase as the hardware became more sophisticated. Therefore, the USB 3 interface would only eat up minimal power in the USB 1 computer age (perhaps as little as USB 1 itself).

            • Re: (Score:3, Informative)

              by amorsen ( 7485 )

              You would still have to run a sophisticated DSP, unless you kept entirely separate chips for USB1, USB2, and USB3. The DSP would eat lots of power even when working at USB1-speed.

              Also, we're talking hundreds or thousands of dollars for a USB3-DSP in the USB1 era.

    • by Z00L00K ( 682162 )

      Even if the bottleneck moves from disk to controller, the overall performance will improve. So it's not that SSDs are bad, it's just that the controllers need to keep up with them.

      On the other hand - RAID controllers are used for reliability and not just for performance. And in many cases it's a tradeoff - large reliable storage is one thing while high performance is another. Sometimes you want both and then it gets expensive, but if you can live with just one of the alternatives you will get off relativel

  • Duh (Score:2, Interesting)

    by Anonymous Coward

    RAID means "Redundant Array of Inexpensive Disks".

    • Re:Duh (Score:4, Informative)

      by Anarke_Incarnate ( 733529 ) on Saturday March 06, 2010 @01:48PM (#31381754)
      or Independent, according to another fully acceptable version of the acronym.
      • by NNKK ( 218503 )

        Fully acceptable to illiterates, you mean.

        • Yes, because 15k RPM SAS drives are OH so inexpensive, right?

          Starting out at $1.90/GB for a 73.5 GB drive [newegg.com] is certainly inexpensive. Especially when you have to pay an insane 9.332 cents/GB for a 750 GB hard drive [newegg.com].

          By your definition, you could NEVER EVER use RAID on expensive hard drives. Which obviously means that you are an idiot.

          • by NNKK ( 218503 )

            "Inexpensive" does not have an absolute definition, and if you think it does, I think you need to reexamine who the "idiot" is.

            The idea has nothing whatsoever to do with the absolute price of any particular disk. The idea is that one can use comparatively inexpensive disks to create an array that would be prohibitively expensive were it to be replaced with a single drive of identical capacity and performance.

            Also, that link to a 750GB drive is for a 640GB drive. I know kindergarten is hard, but one would th

        • No, but it is a fully acceptable and more reasonable word than inexpensive. It is, in fact, my preferred variant of what RAID stands for, as the expense of disks is relative and the industry thinks that RAID on SAN disks is fine. Since many of those are thousands of dollars per drive, I would expect that using inexpensive would be a deprecated variant of RAID.
        • Re: (Score:3, Funny)

          Maybe the I stands for Illiterate?
    • Yes, but the word inexpensive is being used in a relative sense here - the idea being that (ignoring RAID0 which doesn't actually match the definition at all due to not offering any redundancy) a full set of drives including a couple of spares would cost less than any single device that offered the same capacity and long-term reliability. And the expense isn't just talking about the cost of the physical drive - if you ask a manufacturer to guarantee a high level of reliability they will in turn ask a higher
  • Correction: (Score:5, Informative)

    by raving griff ( 1157645 ) on Saturday March 06, 2010 @01:40PM (#31381694)

    Wear Leveling, RAID Can Wipe Out SSD Advantage for enterprise.

    While it may not be efficient to slap together an array of 16 SSDs, it is worthwhile to upgrade personal computers to use an SSD.

    • I agree. You shouldn't be using consumer grade SSDs for servers - unless it's a game server or something. (Ex: TF2)

      Do you know why RE (RAID Edition) HDDs exist? They strip out all the write recovery and stuff, which could mess up speeds, IOPS, and seek times, and instead streamline the drives for performance predictability. That makes it far easier for RAID controllers to manage dozens of them.

      SSDs have a similar thing going. You're an enterprise and need massive IOPS? Buy enterprise-level SSDs - like the

      • by amorsen ( 7485 )

        The lousy thing about PCIe SSDs is that modern servers don't have enough PCIe slots. 1U servers often have only one free slot, and blade servers often have zero. The only blade vendor with decent PCIe expandability is Sun, and their blade density isn't fantastic.

        • True, but ioDrives have an IOPS edge that is massive. If that's what you need, then find a way to make it work.

          Heh... Sun... how typical. :P

        • There are server boards out there with plenty of PCIe slots, e.g. http://www.supermicro.com/products/motherboard/QPI/5500/X8DTH-i.cfm [supermicro.com] (shamelessly grabbed from a board in a picture linked from another post here)

          Yes you will need a case tall enough to take cards without risers (which means 3U afaict) but I would guess getting the same IOPS any other way would take up way more than 3U of rackspace.

          • Re: (Score:3, Interesting)

            by amorsen ( 7485 )

            I work for an ISP; rack units are way too expensive to waste three on a server. Heck, they're too expensive to waste one on a server.

            The advantage over enterprise SATA/SAS SSDs isn't large enough for us, at least. We would have to go to 6-socket motherboards to get the same CPU density.

    • by bertok ( 226922 )

      Wear Leveling, RAID Can Wipe Out SSD Advantage for enterprise.

      While it may not be efficient to slap together an array of 16 SSDs, it is worthwhile to upgrade personal computers to use an SSD.

      If there's a benefit, why wouldn't you upgrade your enterprise servers too?

      We just built a "lab server" running ESXi 4, and instead of a SAN, we used 2x SSDs in a (stripe) RAID. The controller was some low-end LSI chip based one.

      That thing was blazing fast -- faster than any SAN I have ever seen, and we were hitting it hard. Think six users simultaneously building VMs, installing operating systems, running backups AND restores, and even running database defrags.

      It's possible that we weren't quite getting 'p

  • by WrongSizeGlass ( 838941 ) on Saturday March 06, 2010 @01:46PM (#31381740)
    Scaling works both ways. Often technology that benefits larger installations or enterprise environments gets scaled down to the desktop after being fine-tuned. It's not uncommon for technology that benefits desktop or smaller implementations to scale up to eventually benefit the 'big boys'. This is simply a case of the laptop getting the technology first, as it was the most logical place for it to get traction. Give SSDs a little time and they'll work their way into RAID as well as other server solutions.
  • Seek time (Score:5, Informative)

    by 1s44c ( 552956 ) on Saturday March 06, 2010 @01:48PM (#31381760)

    The real advantage of solid state storage is seek time, not read/write times. They don't beat conventional drives by much at sustained IO. Maybe this will change in the future. RAID just isn't meant for SSD devices. RAID is a fix for the unreliable nature of magnetic disks.

    • Re:Seek time (Score:4, Informative)

      by LBArrettAnderson ( 655246 ) on Saturday March 06, 2010 @02:34PM (#31382016)
      That hasn't been the case for at least a year now. A lot of SSDs will do much better with sustained read AND write speeds than traditional HDs (the best of which top out at around 100MB/sec). SSDs are reading at well over 250MB/sec and some are writing at 150-200MB/sec. And this is all based on the last time I checked, which was 5 or 6 months ago.
      • by Kjella ( 173770 )

        True, though if what you need is sequential read/write performance then RAID0 will do that well at less cost and much higher capacity than an SSD. Normally the reason why you want that is because you're doing video capture or something similar that takes ungodly amounts of space, so RAID0 is pretty much a slam dunk here. It's the random read/write performance that is the reason for getting an SSD. In the 4k random read/write tests - which are easier for me to understand than IOPS as reading and writing lots

      • The new 64MB cache WD Black drives have wicked sustained read speeds. Close to 140MB/sec.

        But when dealing with small files, you still notice the IOPS limit.

        The cheaper SSDs won't do as well with a sustained write situation (Ex: Recording 12 security camera feeds) as a traditional HDD will.

      • Re:Seek time (Score:5, Insightful)

        by Rockoon ( 1252108 ) on Saturday March 06, 2010 @05:56PM (#31383848)
        It seems that a lot of people are taking the price of the cheapest-per-GB HDs but using the performance of the most-expensive-per-GB HDs in order to form their conclusions about how little they get for so much extra money.

        One of the fastest platters on the market today is the Seagate 15,000 RPM Cheetah and that one runs at about $1/GB. Some of the 15K drives go for $3/GB.

        SSDs are running about $3/GB across the board at the top end, a cost not dissimilar from the top-end platters, but they perform much better.

        I understand that many people don't want to drop more than $120 on a drive, but many of the vocal ones are letting their unwillingness to do so contaminate their criticism. SSDs are actually priced competitively vs. the top-performing platter drives.
    • They don't beat conventional drives by much at sustained IO.

      umm, err?

      Which platter drive did you have in mind that performs similarly to a high-performance SSD? Even Seagate's 15K Cheetah only pushes 100 to 150MB/sec sustained read and write. The latest performance SSDs (such as the SATA2 Colossus) have sustained writes at "only" 220MB/sec and better performance (260MB/sec) literally everywhere else.

      • by scotch ( 102596 )
        ~ 2x performance for 10x the cost is the definition of "not by much"
        • Perhaps you need to be clued in on the fact that fast platter drives go for over $1 per gigabyte.

          $100 for a terabyte sounds great and all, but you can't get a fast one for that price. You won't be doing sustained 120MB/sec writes to those 7.2K drives. You will be lucky to get 80MB/sec on the fastest portions of the drive and will average around 60MB/sec.

          That SSD that's pushing 220MB/sec sustained writes is 4x the performance on that one metric, and even faster on every other metric.
          • Your benchmarks for 7.2K drives are definitely outdated. My two-drive RAID 0 is doing 400MB/sec on the fastest part and 250MB/sec at the slowest, with an average of 309MB/sec. These aren't the latest drives, nor are they high-end SCSI/SAS drives, just typical Seagate desktop drives. My single Intel 80GB SSD does 220-200MB/sec from start to finish.

            • Re: (Score:3, Insightful)

              by haruchai ( 17472 )

              Which Seagate drives would this be? Those numbers sound very high for typical desktop drives.

              Besides, sustained sequential speed is one thing, but what really gives a responsive "feel" on the desktop is random access, and any one of the post-JMicron-stutter SSDs will stomp even a small RAID of dual-ported enterprise drives into the dirt on random reads and writes, especially combined with the order-of-magnitude faster access time of an SSD.

  • by fuzzyfuzzyfungus ( 1223518 ) on Saturday March 06, 2010 @01:55PM (#31381800) Journal
    This study seems to have a very bad case of "unconsciously idealizing the status quo and working from there". For instance:

    "Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs. I am not talking about a disk tray; I am talking about the whole RAID controller. If you want full performance of expensive SSDs, you need to take your $50,000 or $100,000 RAID controller and not overpopulate it with too many drives. In fact, most vendors today have between 16 and 60 drives in a disk tray and you cannot even populate a whole tray. Add to this that some RAID vendor's disk trays are only designed for the performance of disk drives and you might find that you need a disk tray per SSD drive at a huge cost."

    That sounds pretty dire. And it does in fact mean that SSDs won't be neat drop-in replacements for some legacy infrastructures. However, step back for a minute: why did traditional systems have $50k or $100k RAID controllers connected to large numbers of HDDs? Mostly because the IOPS of an HDD, even a 15K RPM monster, sucked horribly. If 3 SSDs can swamp a RAID controller that could handle 60 drives, that is an overwhelmingly good thing. In fact, you might be able to ditch the pricey RAID controller entirely, or move to a much smaller one, if 3 SSDs can do the work of 60 HDDs.

    Now, for systems where bulk storage capacity is the point of the exercise, the ability to hang tray after tray full of disks off the RAID controller is necessary. However, that isn't where you would be buying expensive SSDs; even the SSD vendors aren't pretending that SSDs can cut it as capacity kings. For systems that are judged by their IOPS, though, the fact that the tradition involved hanging huge numbers of HDDs (often mostly empty, reading and writing only to the parts of the platter with the best access times) off extremely expensive RAID controllers shows that the past sucked, not that SSDs are bad.

    For the obligatory car analogy: shortly after the début of the automobile, manufacturers of horse-drawn carriages noted the fatal flaw of the new technology: "With a horse-drawn carriage, a single buggy whip will serve to keep you moving for months, even years with the right horses. If you try to power your car with buggy whips, though, you could end up burning several buggy whips per mile, at huge expense, just to keep the engine running..."
    • by volsung ( 378 ) <stan@mtrr.org> on Saturday March 06, 2010 @02:21PM (#31381942)

      And we don't have to use Highlander Rules when considering drive technologies. There's no reason that one has to build a storage array right now out of purely SSD or purely HDD. Sun showed in some of their storage products that by combining a few SSDs with several slower, large capacity HDDs and ZFS, they could satisfy many workloads for a lot less money. (Pretty much the only thing a hybrid storage pool like that can't do is sustain very high IOPS of random reads across a huge pool of data with no read locality at all.)

      I hope we see more filesystems support transparent hybrid storage like this...
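
      For what it's worth, here is a toy sketch (plain Python, not ZFS code) of the hybrid read path described above: serve reads from a small SSD tier when possible, fall back to the large HDD tier, and promote blocks that keep getting hit. The class and method names are invented for illustration, and real hybrid pools are far more sophisticated about eviction and write handling.

      class HybridPool:
          """Toy two-tier read cache: big slow HDD tier, small fast SSD tier."""

          def __init__(self, hdd, ssd_capacity_blocks, promote_after=2):
              self.hdd = hdd                        # dict: block id -> data (slow tier)
              self.ssd = {}                         # small fast tier
              self.ssd_capacity = ssd_capacity_blocks
              self.hits = {}                        # read counts per block
              self.promote_after = promote_after

          def read(self, block):
              if block in self.ssd:                 # fast path
                  return self.ssd[block]
              data = self.hdd[block]                # slow path
              self.hits[block] = self.hits.get(block, 0) + 1
              if self.hits[block] >= self.promote_after and len(self.ssd) < self.ssd_capacity:
                  self.ssd[block] = data            # promote a warm block
              return data

      pool = HybridPool(hdd={i: f"data-{i}" for i in range(1000)}, ssd_capacity_blocks=10)
      for _ in range(3):
          pool.read(42)                             # the third read is served from the SSD tier
      print(sorted(pool.ssd))                       # -> [42]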

      • by fuzzyfuzzyfungus ( 1223518 ) on Saturday March 06, 2010 @02:45PM (#31382122) Journal
        My understanding is that pretty much all the serious storage appliance vendors are moving in that direction, at least in the internals of their devices. I suspect that pretty much anybody who isn't already a Sun customer doesn't want to have to deal with ZFS directly; but even the "You just connect to the iSCSI LUN, our magic box takes it from there" magic boxes are increasingly likely to have a mix of drive types inside.

        I'll be interested to see, actually, how well the traditional 15K RPM SCSI/SAS enterprise screamer style HDDs hold up in the future. For applications where IOPS are supreme, SSDs (and, in extreme cases, DRAM-based devices) are rapidly making them obsolete in performance terms, and price/performance is getting increasingly ugly for them. The cost of fabricating flash chips keeps falling; the cost of building mechanical devices that can do what those drives do isn't falling nearly as fast. For applications where sheer size or cost/GB is supreme, the fact that you can put SATA drives on SAS controllers is super convenient. It allows you to build monstrous storage capacity, still pretty zippy for loads that are low on random read/write and high on sustained read or write (like backups and nearline storage), for impressively small amounts of money.

        Is there a viable niche for the very high end HDDs, or will they be murdered from above by their solid state competitors, and from below by vast arrays of their cheap, cool running, and fairly low power, consumer derived SATA counterparts?

        Also, since no punning opportunity should be left unexploited, I'll note that most enterprise devices are designed to run headless without any issues at all, so Highlander rules cannot possibly apply.
        • Re: (Score:3, Insightful)

          We haven't purchased 15k disks for years. In most cases, it is actually cheaper to buy 3x or even 4x SATA spindles to get the same IOPS. Plus you get all that capacity for free, even when you factor in extra chassis and power costs. We use all that capacity for snapshots, extra safety copies, etc. If your enterprise storage vendor is charging you the same price for a 1TB SATA spindle as a 300GB 15K spindle, you need to find a new vendor. Look at scale-out clustered solutions instead of the dinosaur "dual f
          • My (possibly incorrect?) understanding was that 3-4x 7200rpm drives aren't a drop-in replacement for a 15K drive in all situations: the slower drives still have higher rotational latency, do they not? Even if you throw 50 slower drives at the problem, there are still situations where the 15K drive will respond faster simply because of its rotational latency.

            Correct me if I am incorrect.
            • My (possibly incorrect?) understanding was that 3-4x 7200rpm drives arent a drop-in replacement for a 15k in all situations
              Not all but I would think most.

              If you are comparing a big array of 15K drives with an even bigger array of 7.2K drives I would think it likely that the application in question is one that is capable of generating large numbers of requests in parallel (most likely some kind of database server).

    • by LoRdTAW ( 99712 )

      All I want to know is who is making RAID cards that cost $50,000 to $100,000? Or is he describing a complete system and calling it a RAID card?

      • He's clearly talking about SAN controllers like EMC Clariion or IBM DS5000; if you don't look too carefully you might mistake them for RAID controllers.

    • Just based on the fact that it says "$50,000 or $100,000 RAID controller." Ummm, what? Where the hell do you spend that kind of money on a RAID controller? A RAID controller for a few disks is a couple hundred bucks at most. For high-end controllers you are talking a few thousand. Like Adaptec's 5805Z, which has a dual-core 1.2GHz chip on it for all the RAID calculations and supports up to 256 disks. Cost? About $1000 from Adaptec. Or how about the 3Ware 9690SA-8E, 8 external SAS connectors for shelves with

      • Where the hell do you spend that kind of money on a RAID controller?

        EMC Clariion CX3-80 will run you between $13k and $130k depending on how it is configured.

  • by bflong ( 107195 ) on Saturday March 06, 2010 @01:59PM (#31381812)

    ... researchers have found that putting a Formula One engine into a Mack truck wipes out the advantages of the 19,000 rpm.

  • So does anyone know if this applies to software RAID configurations?

    Just curious...

    • That was my first thought. Run standard SATA controllers, put one or two drives on each controller, and RAID-0 them. At least then you're CPU-bound. Doesn't fix the TRIM problem, though.
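
      A minimal sketch of what the striping half of that amounts to: software RAID 0 is essentially a mapping from logical block to (drive, offset), which is cheap enough that the host CPU barely notices. The drive count and chunk size below are arbitrary.

      def raid0_map(logical_block, n_drives, chunk_blocks=16):
          """Return (drive index, block offset on that drive) for a RAID 0 layout."""
          stripe, within = divmod(logical_block, chunk_blocks)
          drive = stripe % n_drives
          offset = (stripe // n_drives) * chunk_blocks + within
          return drive, offset

      # Blocks 0-15 land on drive 0, 16-31 on drive 1, then back to drive 0, and so on.
      for lb in (0, 15, 16, 31, 32):
          print(lb, raid0_map(lb, n_drives=2))
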
      • by Anpheus ( 908711 )

        This. I'm surprised no one has mentioned it. I don't think there's a RAID controller on the market that supports pass-through TRIM, which is going to be one hell of a wakeup call when an admin finds the batch job took ten times longer than usual. I had this happen with an X25-M: I had stopped paying attention to the log file's end times for various steps, and one day I woke up to it running past 9AM (from the initial times of a mere ten minutes when starting at 5AM).

        • One problem I see is that AFAICT TRIM can't just be passed through. The controller needs to take careful steps to handle TRIM in a way that keeps the RAID data consistent.

  • by haemish ( 28576 ) * on Saturday March 06, 2010 @02:25PM (#31381962)

    If you use ZFS with SSDs, it scales very nicely. There isn't a bottleneck at a raid controller. You can slam a pile of controllers into a chassis if you have bandwidth problems because you've bought 100 SSDs - by having the RAID management outside the controller, ZFS can unify the whole lot in one giant high performance array.

    • by Anpheus ( 908711 )

      That's not the problem; the problem is that a lot of the high-end controllers have 8, 16, 24, etc. SAS ports. If you were to plug SSDs into all of those ports, you'd swamp the card, whether you treat the disks as JBOD or let the controller handle it. And the storage vendors who make real nice SANs did the same thing. They have one controller managing dozens of HDDs because their performance is so abysmal.

    • Re: (Score:3, Interesting)

      by Anonymous Coward

      If you use ZFS with SSDs, it scales very nicely. There isn't a bottleneck at a raid controller. You can slam a pile of controllers into a chassis if you have bandwidth problems because you've bought 100 SSDs - by having the RAID management outside the controller, ZFS can unify the whole lot in one giant high performance array.

      If performance is that critical, you'd be foolish to use ZFS. Get a real high-performance file system. One that's also mature and can actually be recovered if it ever does fail catastrophically. (Yes, ZFS can fail catastrophically. Just Google "ZFS data loss"...)

      If you want to stay with Sun, use QFS. You can even use the same filesystems as an HSM, because SAMFS is really just QFS with tapes (don't use disk archives unless you've got more money than sense...).

      Or you can use IBM's GPFS.

      If you really wan

      • Re: (Score:3, Informative)

        by turing_m ( 1030530 )

        Get a real high-performance file system. One that's also mature and can actually be recovered if it ever does fail catastrophically. (Yes, ZFS can fail catastrophically. Just Google "ZFS data loss"...)

        I just did. On the first page, I got just one result relating to an event from January 2008 - Joyent. And they managed to recover their data. I did another search - "ZFS lost my data". One example running on FreeBSD 7.2, in which ZFS was not yet production ready. Other examples existed in wh

      • OMFG you used alot of acronyms WTF man?
  • Even the highest-performance RAID controllers today cannot support the IOPS of just three of the fastest SSDs.

    In the old days, RAID controllers were faster than doing it in software.

    Nowadays, aren't software controllers faster than hardware? So, just do software RAID? In my very unscientific tests of SSDs I have not been able to max out the server CPU when running bonnie++, so I guess software can handle it better?

    Even worse, it seems difficult to purchase "real hardware raid" cards since marketing departments have flooded the market with essentially multiport win-SATA cards that require weird drivers because th

    • by TheRaven64 ( 641858 ) on Saturday March 06, 2010 @02:41PM (#31382080) Journal

      The advantage of hardware RAID, at least with RAID 5, is the battery backup. When you write a RAID stripe, you need to write the whole thing atomically. If the writes work on some drives and fail on others, you can't recover the stripe. The parity check will fail, and you'll know that the stripe is damaged, but you won't know what it should be. With a decent RAID controller, the entire write cache will be battery-backed, so if the power goes out you just replay the stuff that's still in RAM when the array comes back online. With software RAID, you'd just lose the last few writes, (potentially) leaving your filesystem in an inconsistent state.

      This is not a problem with ZFS, because it handles transactions at a lower layer, so you either complete a transaction or lose the transaction; the disk is never in an inconsistent state.
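
      To make the write-hole point above concrete, here is a small Python sketch using RAID 5 style XOR parity on a toy stripe. After an interrupted update the parity no longer matches, and nothing in the stripe itself says which member is stale; that ambiguity is what the battery-backed cache (or ZFS's transactional approach) is there to avoid.

      def parity(blocks):
          """XOR parity across equal-length byte blocks."""
          out = bytearray(len(blocks[0]))
          for b in blocks:
              for i, byte in enumerate(b):
                  out[i] ^= byte
          return bytes(out)

      old_data = [b"AAAA", b"BBBB", b"CCCC"]
      stripe = old_data + [parity(old_data)]   # 3 data members + 1 parity member

      # Power fails mid-update: data member 0 is rewritten, parity is not.
      stripe[0] = b"ZZZZ"

      data, p = stripe[:3], stripe[3]
      print("parity consistent?", parity(data) == p)   # False: the stripe is damaged
      # Nothing here tells you whether the data or the parity is the stale part.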

    • Just my thought. Hardware RAID adds latency and limits throughput if you use SSDs. On the other hand, server CPUs often have cycles to spare and are much faster than the CPU on the RAID controller. I've yet to see the dual quad cores with hyperthreading going over 40% in our servers.
      Now all we need is a VFS layer that smartly decides where to store files and/or uses a fast disk as a cache to a slower disk. Like a unionfs with automatic migration?

  • Kernel-based software RAID or ZFS gives much better RAID performance IMHO. The only reason I use HW RAID is to make administration simpler. I think there is much more benefit to be had letting the OS govern partition boundaries, chunk size and stripe alignment. Not to mention the dismal firmware upgrades supplied by closed-source offerings.
  • If using RAID for mirroring drives, well, you must also consider the failure rate of the drives, as it is all about fault tolerance, no? It is reported that SSDs are far more durable, so the question should be: what does it take to match the fault tolerance of an HDD RAID with an SSD RAID? Only after that can we truly compare the pros and cons of their performance sacrifices.

    On a side note, you can now get a sony laptop that comes equipped with a RAID 0 quad SSD drive.
    http://www.sonystyle.com/webapp/wcs/stores/s [sonystyle.com]

    • Do SSDs really have a lower failure rate than HDDs? I mean, how many times can it be assumed that I can write to a specific sector on each? I'd be interested in a report in which this has been tested by writing various devices to destruction, rather than by quoting manufacturer predictions.

      Don't give me wear levelling arguments, as they assume that I'm not frequently changing all the data on the medium.

      • Re: (Score:3, Interesting)

        by Rockoon ( 1252108 )
        How about this for an argument.

        A 500GB SSD can be entirely overwritten ("changing all the data on the medium") over 10,000 times. No wear leveling needed here. 10K writes is the low end for modern flash.

        Let's suppose you can write 200MB/sec to this drive. That's about average for the top enders right now.

        It will take 2,500 seconds to overwrite this entire drive once. That's about 42 minutes.

        So how long to overwrite it 10,000 times?

        That's 25,000,000 seconds.
        That's 416,667 minutes.
        That's 6,944 hours.
        That's 289 days.
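
        The grandparent's arithmetic, written out in Python; the capacity, write speed and 10,000 program/erase cycles are the assumptions stated above, not measured values for any particular drive.

        capacity_gb = 500
        write_mb_per_sec = 200
        pe_cycles = 10_000            # assumed low-end endurance for the flash

        seconds_per_overwrite = (capacity_gb * 1000) / write_mb_per_sec   # 2,500 s (~42 min)
        total_seconds = seconds_per_overwrite * pe_cycles                 # 25,000,000 s

        print(f"{seconds_per_overwrite:,.0f} s per full overwrite "
              f"(~{seconds_per_overwrite / 60:.0f} minutes)")
        print(f"{total_seconds / 3600:,.0f} hours, or about "
              f"{total_seconds / 86400:,.0f} days of non-stop sequential writing")
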
        • Re: (Score:3, Insightful)

          289 *days* of constant 24/7 writing to use up the flash.

          This assumes the case of repeated sequential write to blocks 1 to n, where no wear levelling occurs. Consider that I first write once to 100% of the disk, then repeatedly: write sequentially to the first 25% of the disk n times, then write to the remaining 75% of the disk once. Dynamic wear levelling is out. How is a typical static wear levelling algorithm likely to kick in in a way which prevents an unacceptable slowdown during one pass, while at the same time squeezing out max writes to all physical bloc
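
          As a rough illustration of that access pattern, here is a toy Python model that just counts per-block erases with no wear levelling at all (logical block = physical block). The block count, number of rounds and hot-region multiplier are arbitrary; the point is only the skew that a static levelling algorithm has to smooth out, and every block it migrates to do so is an extra write the host never asked for.

          N_BLOCKS = 1000
          HOT = N_BLOCKS // 4          # first 25% of the logical space
          erases = [1] * N_BLOCKS      # one full initial pass over the whole disk

          def one_round(n):
              """One round: hot region rewritten n times, the rest once."""
              for blk in range(N_BLOCKS):
                  erases[blk] += n if blk < HOT else 1

          for _ in range(50):          # 50 rounds, hot region rewritten 10x per round
              one_round(10)

          print("hot-region erases: ", erases[0])    # 1 + 50*10 = 501
          print("cold-region erases:", erases[-1])   # 1 + 50*1  = 51
          print("skew: %.1fx" % (erases[0] / erases[-1]))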
