Building a 10 TB Array For Around $1,000
As storage hardware costs continue to plummet, the folks over at Tom's Hardware have decided to throw together their version of the "Über RAID Array." While the array still doesn't stack up against SSDs for access time, a large array is capable of higher throughput via striping. Unfortunately, the amount of work required to assemble a setup like this seems to make it too much trouble for anything but a fun experiment. "Most people probably don't want to install more than a few hard drives into their PC, as it requires a massive case with sufficient ventilation as well as a solid power supply. We don't consider this project to be something enthusiasts should necessarily reproduce. Instead, we set out to analyze what level of storage performance you'd get if you were to spend the same money as on an enthusiast processor, such as a $1,000 Core i7-975 Extreme. For the same cost, you could assemble 12 1 TB Samsung Spinpoint F1 hard drives. Of course, you still need a suitable multi-port controller, which is why we selected Areca's ARC-1680iX-20."
...How is this news? (Score:3, Informative)
*gag* (Score:5, Informative)
Misleading headline (Score:5, Informative)
No controller? No failover? No interconnect? (Score:3, Informative)
What good are 12 hard drives without anything else? Absolutely nothing. An enclosure alone to correctly power and cool these drives costs at least $800 and that's only with (e)SATA connections. No SAS, no FibreChannel, no Failovers, no cache or backup batteries, no controllers, no hardware that can connect your clients over eg. NFS or SMB to it.
Currently I can do professional storage in ~$1000/TB if you get 10TB, including backups, cooling and power that would probably run you $1600/TB over the lifetime of the hard drives (5 years).
And even cheaper (Score:3, Informative)
I did something like this some years ago with 200 GB (and later 500 GB) drives:
10 drives in a Chieftec big tower: 6 drives go into the two internal drive cages, 4 go into a 4-in-3 mounting with a 120 mm fan. Controllers: 2 SATA ports on board plus 2 × Promise 4-port SATA300 TX4 controllers (a lot cheaper than Areca, and with native kernel support). Put Linux software RAID 6 on the drives, sparing 1 GB or so per drive for an n-way RAID 1 system partition. Done.
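The layout above can be sketched with mdadm; the device names, partition numbers, and member count below are assumptions for illustration, not from the original post:

```shell
# Assumed layout: 10 drives /dev/sda..sdj, each with a small first
# partition (~1 GB) and a large second partition (the rest).

# n-way RAID 1 across the small partitions for the system, so the box
# still boots with most drives dead:
mdadm --create /dev/md0 --level=1 --raid-devices=10 /dev/sd[a-j]1

# RAID 6 across the large partitions for data (survives any 2 failures):
mdadm --create /dev/md1 --level=6 --raid-devices=10 /dev/sd[a-j]2

# Filesystem on the data array:
mkfs.ext4 /dev/md1
```

Creating arrays obviously requires root and will destroy existing data on those partitions, so treat this strictly as a sketch.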
Re:Another selling point for double parity (Score:3, Informative)
Another thing with RAID arrays that have quite a few drives: you have no way of correcting a flipped bit. You need at least RAID 6 to correct these errors, and with such vast amounts of data, a flipped bit isn't that unlikely.
If the bit flips a bit earlier, i.e. on the bus, RAID 6 doesn't help either, and that isn't the job of RAID in the first place.
If you want to be sure your data made it to disk correctly, do checksums or compares; they are really non-optional once you enter the TB range. Once the data is on disk, the checksumming done by the drives themselves makes flipped bits unlikely. However, I do advise keeping the checksums (e.g. from md5sum) with the data.
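A minimal md5sum workflow along those lines (the file names and paths are just examples):

```shell
# Create some data and store its checksums next to it.
mkdir -p /tmp/ck_demo
echo "important data" > /tmp/ck_demo/file1.dat
echo "more data"      > /tmp/ck_demo/file2.dat
( cd /tmp/ck_demo && md5sum *.dat > MD5SUMS )   # keep MD5SUMS with the data

# Later (or after copying everything elsewhere), verify:
( cd /tmp/ck_demo && md5sum -c MD5SUMS )        # prints "file1.dat: OK" etc.
```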
Re:*gag* (Score:4, Informative)
What about the electricity? (Score:5, Informative)
Such a RAID is for an always-on server. Expect about 8 watts per drive after power-supply inefficiencies, so 12 drives draw around 100 watts, or about 870 kWh in a year.
At California Tier 3 pricing of 31 cents/kWh, 12 drives cost about $270 in electricity per year, or around $800 over the 3-year lifetime of the drives.
In other words, about the same price as the drives themselves. Do the 2 TB drives draw more power than the 1 TB models? I haven't looked. If they are similar, then 6 × 2 TB plus 3 years of 50 watts actually costs the same as 12 × 1 TB plus 3 years of 100 watts, though I doubt the power draw is exactly the same.
My real point is that when pricing out a RAID like this, you do need to consider the electricity. In many areas, add 30% to the electricity cost for cooling if the setup will be air-conditioned, plus the power drawn by the RAID controller and so on. The same factors would apply when comparing against an SSD, though of course 10 TB of SSD is still far too expensive.
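The arithmetic behind those figures, using the poster's own numbers (100 W total, $0.31/kWh), can be checked in one awk call:

```shell
awk 'BEGIN {
    watts = 100                        # ~8 W/drive x 12, plus PSU losses
    rate  = 0.31                       # California Tier 3, $/kWh
    kwh   = watts * 24 * 365 / 1000    # annual consumption
    printf "kWh/year:    %.0f\n", kwh
    printf "cost/year:   $%.0f\n", kwh * rate
    printf "3-year cost: $%.0f\n", 3 * kwh * rate
}'
# -> 876 kWh/year, ~$272/year, ~$815 over 3 years
```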
Re:Bad Journalism (Score:4, Informative)
Also notice that they decided to stick desktop drives in a RAID array, a big no-no if you want your array to last more than a few weeks.
Not my experience. I did this with Maxtor 120 GB and 200 GB drives some years ago, and in over 3 years of 24/7 operation (before the systems were replaced) I had one failure in 50 drives. (OK, there were a few more, but those drives had been dropped in shipping.) I think the "big no-no" is just the drive vendors wanting to earn more per drive by rebranding the same hardware as a "RAID edition" with a few firmware changes. At least with Linux software RAID you do not need any "RAID edition" drives.
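For what it's worth, one concrete firmware difference in "RAID edition" drives is the error-recovery timeout (TLER/ERC). On drives that expose SCT Error Recovery Control you can often set it yourself with smartctl; the device name below is an example, and plenty of desktop drives simply reject the command:

```shell
# Show current SCT error-recovery-control settings (example device):
smartctl -l scterc /dev/sda

# Cap read/write error recovery at 7.0 seconds (the value is in units
# of 100 ms), so a bad sector doesn't stall the drive long enough for
# it to get kicked out of the array:
smartctl -l scterc,70,70 /dev/sda
```

This needs root and real hardware; the setting also resets on power cycle on many drives, so it is typically reapplied at boot.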
Re:Sigh... (Score:3, Informative)
Re:Why This Article Is Stupid (Score:1, Informative)
You don't need a hardware raid controller, redundant power, or battery backup to have perfectly reliable storage. You just need ZFS.
Re:*gag* (Score:3, Informative)
Luckily, most of our controllers are made by either QLogic or Emulex (for FibreChannel). The onboard stuff is usually HP, and we have yet to have much of a problem.
misleading title, indeed (Score:2, Informative)
From building two or three of these at home myself, my practical experience for someone wanting a monster file server for home, on the cheap, consists of these high/low points:
1. The other poster(s) above are 100% correct about the RAID card. To get it all on one card you'll pay as much as for 4-5 more HDDs, and that's on the low end for the card. Decent dedicated PCI-E RAID cards are still in the $300+ range for anything with 8 ports or more.
2. Be careful about buying older RAID cards. I have two 16-port and two 8-port Adaptec PCI-X SATA RAID cards that are useless. Why? They only support RAID arrays up to 2 TB in size. "Update the firmware," you say. Sure, let me just grab the latest, from 2005; I'm sure that fixes it. Oh wait, my cards already have it, and it doesn't remove the limitation. 8 drives, even 16 drives, and they hard-code a limit of 2 TB? Lame.
3. I've seen nothing in a home-budget price range that performs as well as Linux software RAID. My 1.5-year-old $500 Tyan workstation motherboard (S5397, in another computer) has onboard SAS RAID that can't seem to do better than 10 MB/s throughput, reading from drives that individually bench at 50-60 MB/s.
4. Which leads me to: use Linux software RAID. It's much more configurable than any hardware RAID card, both in supported RAID levels and in monitoring capabilities. RAID disks and arrays can easily be moved from one machine to another, one controller to another, etc. I've moved most of my disks between machines and controllers at least once.
5. I've come to believe over time that what you're really looking for is X SATA ports, not "a controller capable of doing RAID over X disks." Use SATA "mass storage" cards, or RAID cards that will let you run them in pass-through mode so the OS can access the individual disks directly. Here you have to be careful not to get bitten by #1, #2, or #3 again, since some RAID cards don't behave well when not actually doing RAID (I'm still looking at you, Adaptec). This makes things easier and much cheaper: you can mix and match lower-capacity cards to get 8-20+ SATA ports for your RAID.
5.1. Hardware-vs-software-RAID tangent: what happens on a dedicated RAID card when you run out of ports? You usually can't span RAID cards, unless you buy multiple identical fancy (a.k.a. expensive) RAID controllers from the same manufacturer. All Linux needs is hard drives the system can recognize.
6. When using software RAID, buy a decent CPU. You don't need some quad-core beast, but you don't want to be waiting on the CPU to finish your RAID calculations. Any 2-2.5 GHz Core 2 Duo is probably more than adequate; I draw the line at anything under 2 GHz.
7. Kiss backups good-bye. The price of any decent backup system capable of covering this much storage is WAY over the price of this whole setup. Anything I really don't want to lose gets saved in multiple places outside the RAID array; otherwise I accept the potential for data loss as a risk of operating this way. Personally, I don't see how you could do otherwise in a setup like this.
8. Be prepared for bottlenecks. You're doing this on a home budget, so you probably won't get 300 MB/s reads off your array, no matter how many drives at whatever RAID level. I can only get 10-20 MB/s across my GigE network going to or from my RAID 5 array, probably due to the cheap PCI SATA cards I'm using. I willingly make this trade-off to get the capacity I have for the price I spent.
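The portability claimed in point 4 (and the port-spanning point in 5.1) comes down to mdadm storing array metadata on the disks themselves, so reassembly after moving drives to a new machine or controller might look like this (the array name is an example):

```shell
# Scan all attached disks for md superblocks and report which arrays
# they belong to, regardless of which controller/port they sit on:
mdadm --examine --scan

# Assemble every array mdadm can recognize from that on-disk metadata:
mdadm --assemble --scan

# Check array state and any rebuild/resync progress:
cat /proc/mdstat
mdadm --detail /dev/md0
```

Again a sketch: these commands need root and actual md member disks attached.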
If any of these points is an overriding concern for your intended use, then you'd have to re-evaluate the importance of all the other considerations.
For me, stability, capacity, and price are the top three, which leads me to research Linux-stable cheap SATA expansion cards (a nice way of saying I buy and try probably twice as many controllers as I actually use, to find ones that won't corrupt data, time out on random drive accesses, or simply fail to present the real drives to the OS), and to compromise by waiting a bit longer for network transfers.