RAID Vs. JBOD Vs. Standard HDDs
Ravengbc writes "I am in the process of planning and buying some hardware to build a media center/media server. While there are still quite a few things on it that I haven't decided on, such as motherboard/processor, and Windows XP vs. Linux, right now my debate is about storage. I'm wanting to have as much storage as possible, but redundancy seems to be important too." Read on for this reader's questions about the tradeoffs among straight HDDs, RAID 5, and JBOD.
At first I was thinking about just putting in a bunch of HDDs. Then I started thinking about doing a RAID array, looking at RAID 5. However, some of the stuff I was initially told about RAID 5, I am now learning is not true. Some of the limitations I'm learning about: RAID 5 drives are limited to the size of the smallest drive in the array. And the way things are looking, even if I gradually replace all of the drives with larger ones, the array will still read the original size. For example, say I have 3x500gb drives in RAID 5 and over time replace all of them with 1TB drives. Instead of reading one big 3tb drive, it will still read 1.5tb. Is this true? I also considered using JBOD simply because I can use different size HDDs and have them all appear to be one large one, but there is no redundancy with this, which has me leaning away from it. If y'all were building a system for this purpose, how many drives and what size drives would you use and would you do some form of RAID, or what?
Duh (Score:5, Insightful)
That said, RAID is not a replacement for proper backup. RAID is just a first line of defense to avoid downtime.
Re:Duh (Score:3, Insightful)
A good point. Consider, though, that most people don't run terabyte-size tape backup at home. It's not like it's business critical data, so RAID-5 is probably sufficient.
Is Google broken today? (Score:2, Insightful)
Yes... Duh....
And the way things are looking, even if I gradually replace all of the drives with larger ones, the array will still read the original size. For example, say I have 3x500gb drives in RAID 5 and over time replace all of them with 1TB drives. Instead of reading one big 3tb drive, it will still read 1.5tb. Is this true?
Yes... Fucking duh.... Have you even read the RAID 5 Wiki article? [wikipedia.org]
I also considered using JBOD simply because I can use different size HDDs and have them all appear to be one large one, but there is no redundancy with this, which has me leaning away from it. If y'all were building a system for this purpose, how many drives and what size drives would you use and would you do some form of RAID, or what?
We've been through this a million times before and the answer is always the same. You're a cheap bastard who wants gobs of space with an acceptable amount of redundancy but aren't willing to buy two sets of drives. Buy 4 of the biggest drives you can afford and RAID 5 them. Don't expect stellar write speeds. You won't have a backup if something happens and all 4 drives blow but you'll at least have protection when one drive gives up the ghost which is mainly what most people want to protect against.
Why does stupid shit like this keep getting posted to the front page?
Get what you need for *NOW* not for later (Score:5, Insightful)
With computers, the stupidest thing you can do is spend extra money to prepare for your needs for tomorrow. Buy for what you need now, and by the time you outgrow it, things will be cheaper, faster and larger.
By the way RAID 5 is a pain in the ass unless you have physical hotswap capability, which I highly doubt.
How the hell did this make the front page? (Score:4, Insightful)
This place has really gone downhill. I thought Firehose was supposed to stop stuff like this, not increase it!
Anyways, just to be slightly on topic: there's no one answer to this question. It depends on your budget, your motherboard, your OS, and, most importantly, your actual redundancy needs. This kind of thing is addressed by large articles/essays, not brief comments.
Re:Duh (Score:3, Insightful)
Planned obsolescence (Score:4, Insightful)
Here's the way I do it (for a home storage server, not a solution for business-critical stuff):
Examine current storage needs, and forecast about two years into the future.
Build new server with reliable midrange motherboard, and a midrange RAID card. These days you could do with a $100-$300 four-port SATA card, or two.
Add four hard disks in capacities calculated to last you for two years of predicted usage, in RAID 5 mode. Don't worry about brand unless you know for a fact that a particular drive model is a lemon.
Since manufacturer's warranties are about one year, and you may have difficulty finding an unused drive of the same type for replacement, buy two more identical drives. These will be your spares in the event of a drive failure.
When the two years are up, you should be using 80 to 90 percent of your total storage.
At this point, you build an entirely new server, using whatever technology has advanced to at that time.
Transfer all your files to the new server.
Sell your entire old storage server along with any unused spare drives. A completely prebuilt hot-to-trot RAID 5 system, with new matching spare disk, only two years old, will still be very useful to someone else and you can recoup maybe 30 to 40 percent of the cost of building a new server.
Lather, rinse, repeat until storage space is irrelevant or you die.
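The sizing step in the plan above can be put into rough numbers. This is only a sketch: it assumes storage grows by a fixed amount per year (the comment does not say how to forecast), and the function name, parameters, and sample figures are made up for illustration.

```python
def drive_size_needed_gb(current_gb, yearly_growth_gb, years=2,
                         target_fill=0.85, drives=4):
    """Per-drive size so a RAID 5 array is about target_fill full after `years`.

    Assumes linear growth, which is a simplification.
    """
    projected = current_gb + yearly_growth_gb * years
    # Leave headroom so the array ends up 80 to 90 percent full, not 100.
    usable_needed = projected / target_fill
    # RAID 5 gives (drives - 1) drives' worth of usable space.
    return usable_needed / (drives - 1)


# Hypothetical numbers: 400 GB today, growing 300 GB per year.
print(round(drive_size_needed_gb(400, 300)))
```

With those made-up inputs the result lands a little under 400 GB per drive, so in practice you would round up to the next size actually sold.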
Re:Get what you need for *NOW* not for later (Score:3, Insightful)
500 GB isn't that much space any more. If he's thinking of making an HDTV MythTV box, for instance, full-res HDTV streams will require a lot of space to store in real-time. It would probably be too computationally intensive to recode them into MPEG4 on the fly.
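To put a rough number on that, here is a small sketch. It assumes the stream is stored untouched at the full ATSC broadcast rate of about 19.39 Mbit/s; actual programs are often lower, and the function names are made up for illustration.

```python
# Assumption: full-rate ATSC transport stream, stored without recoding.
ATSC_BITRATE_BPS = 19.39e6


def gb_per_hour(bitrate_bps=ATSC_BITRATE_BPS):
    """Decimal gigabytes needed to store one hour at the given bitrate."""
    bytes_per_second = bitrate_bps / 8
    return bytes_per_second * 3600 / 1e9


def hours_on_disk(disk_gb, bitrate_bps=ATSC_BITRATE_BPS):
    """How many hours of video fit on a disk of disk_gb decimal gigabytes."""
    return disk_gb / gb_per_hour(bitrate_bps)


print(f"{gb_per_hour():.2f} GB per hour")
print(f"{hours_on_disk(500):.1f} hours on a 500 GB drive")
```

Under that assumption a 500 GB drive holds somewhere under 60 hours of full-rate HD recordings.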
Go RAID 5 BUT with real hardware.... (Score:3, Insightful)
Again, doing it correctly up front takes care of upgrade options down the line. It also gives you room to do a monster-sized volume if you ever need that much space (8 disk array). Most of these RAID solutions are also OS independent, so if you want dual boot, the volume would be recognized by Windows, Linux, Unix, BSD, etc., and you are also not dependent on using the exact same motherboard if your motherboard dies or needs to be upgraded (you would lose all your data if you use the built-in RAID on the motherboard when changing to a new motherboard other than the exact same model).
These better cards can also be linked together (i.e. you can always get a second card, assuming your motherboard has a slot for it, and add more disks to the array that way as well).
Re:Don't worry about losing your media files (Score:3, Insightful)
SCRUB your arrays! (Score:3, Insightful)
This is what you do: buy 2 drives exactly the same size and mirror them. End of story.
NO! That's NOT the end of the story. You need to do what is called "scrubbing" the array periodically, because drives "silently" fail, where areas become unreadable for various reasons. Guess when one usually discovers the bad data? When one drive screeches to a halt, and you confidently slap in another and hit "rebuild". Surpriiiiiiiiise.
You can do it a variety of ways. The most harmless is probably to run a read-only bad-block test via cron, while monitoring each drive's SMART parameters long-term and having your cron job let you know if badblocks finds anything. An alternative is to instruct md to verify the array, if you're doing software raid.
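For the "instruct md to verify the array" option, a minimal sketch follows. It assumes Linux software RAID and the md sysfs files `sync_action` and `mismatch_cnt`; the array name `md0` and the function names are placeholders, and writing to sysfs needs root.

```python
from pathlib import Path


def md_dir(array="md0", sysfs_root="/sys"):
    """Directory holding the md control files for one array."""
    return Path(sysfs_root) / "block" / array / "md"


def start_check(array="md0", sysfs_root="/sys"):
    """Ask md to read and verify the whole array (a scrub)."""
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")


def mismatch_count(array="md0", sysfs_root="/sys"):
    """Read the mismatch counter md reports after a check."""
    return int((md_dir(array, sysfs_root) / "mismatch_cnt").read_text().strip())
```

Something like this could be called from a cron job, with a second job reading `mismatch_count()` once the check has finished and mailing you if it is not zero. It does not replace watching SMART data on the individual drives.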
You cannot, cannot, CANNOT just drop a bunch of drives into raid 5 and expect it to be peachy for the rest of time.
By the way, regarding controllers- skip ANYTHING made by 3ware, especially their PCI controllers. They're barely able to push 20-25MB/sec and have a couple of bad compatibility problems with certain drives. Areca units are blazing fast (especially the PCI-E cards) but priced for businesses, not home users looking for "cheap as possible."
Software raid comes in #1 for price/performance, but I strongly, strongly recommend you play around with the mdadm tool quite a bit before you put actual data on an md array. The stuff is very half-baked.
KISS it (Score:2, Insightful)
Each drive is mapped by its UNC path, i.e., \\movieserver\movies1, so the media centers have four drives mapped on each one.
If I lose a hard drive, oh well, some of the movies won't be available until they are re-ripped from the DVDs.
Had I used RAID5, I would have 1,500 GB and it would not have been easy to upgrade. I have run out of room and I am adding a couple of 750 GB drives.
If you use a Linux server and LVM, losing one drive loses everything.
Performance requirements (Score:5, Insightful)
The advantages of RAID 0 versus RAID 1 versus RAID 5 have already been covered in detail, here, and in many books and websites.
However, allow me to address the issue of how they relate to a media center:
Firstly, when you say "media center/media server", do you mean "I just want to build myself a kickass Tivo?", or do you mean "I want to serve video for everyone in my frat house, simultaneously?"
If the former, consider that Tivos ship with 5400 RPM drives for several reasons:
1) They're cheaper than faster drives
2) They run cooler than faster drives
3) They run quieter than faster drives
4) They use less power than faster drives
5) They're more than fast enough for streaming a single video to your TV while recording another
Long story short, if you're just building a "free" Tivo with a kickass drive array, performance is *not* an issue. Keep in mind that if you're building a set-top box of sorts, the low heat and low noise features are *very* big benefits. You probably want RAID 5, and/or JBOD.
If, however, you're planning on serving video to more than a handful of stations simultaneously, you may need to consider performance. This is a vote for RAID 0 and/or RAID 10.
Now, the second axis: How important to you is this data? Really?
I've got over 300 gigs of drive space on my Tivo. Most of it is the last two weeks of television reruns (Scrubs, 6 copies of last Thursday's Daily Show, etc.), movies I recorded but won't watch, etc. There are about 10 gigs (3%) of video on there that's been saved for a few months, and frankly, I couldn't tell you a single thing on there that I'd miss if my drives went belly up tomorrow. So: do you *really* need to save all those Seinfeld reruns on a highly-redundant storage array? How *much* of the stuff on the server do you really need to keep?
Assuming it's less than 50% (in the Tivo scenario, it probably is), consider using JBOD for most of your storage, and maintaining a single backup drive, or small backup drive array. Or just backing up the good stuff to DVD.
In summary: If you're just building a Tivo, you probably don't really need the performance, or redundancy that RAID offers.
Re:Duh (Score:3, Insightful)
Re:How the hell did this make the front page? (Score:2, Insightful)
You mean like digg? Nah. There's no wisdom in crowds. Just lowest common denominator.
Re:Linux, RAID 5, md (Score:1, Insightful)
Benchmarks of hardware vs software RAID (results: mostly software > hardware raid):
http://www.chemistry.wustl.edu/~gelb/castle_raid.
http://milek.blogspot.com/2006/08/hw-raid-vs-zfs-
http://milek.blogspot.com/2006/08/hw-raid-vs-zfs-
http://milek.blogspot.com/2007/04/hw-raid-vs-zfs-
http://stoilis.blogspot.com/2005/09/linux-softwar
Benchmarks/info of Linux IP Routing (more than capable of gigabit routing):
http://hardware.slashdot.org/article.pl?sid=06/09
http://freedomhec.pbwiki.com/f/linux_ip_routers.p
http://docs.rodecker.nl/10-GE_Routing_on_Linux.pd
Of course a Linux machine isn't going to be all that much help to you if you are doing supercomputing work with 10 gigabit routing (but as we start seeing more dual quad core machines with 4 PCI Express x16 slots, this is bound to change).
So if you're not working with high-end ("giga" prefix) storage/networking for large systems, you're wasting your money on hardware appliances. Cheap hardware firewalls are a scaled down PC in a fancy box. Cheap RAID cards don't have their own ASIC offload engines. Cheap hardware routers are a joke compared to Linux PC routers.
Unless it is a 10 gigabit router with everything done in specially designed high performance ASIC chips, you will see better performance on a PC than in a hardware appliance. The same goes for hardware RAID, where we're mostly only talking about 5 gigabit read/write speeds to/from the array.
Re:KISS it (Score:5, Insightful)
But the minute or so of uptime you get by not having to power down the computer is more than made up when the controller chip on your beautiful RAID controller sizzles. Using Linux software RAID lets you plug the drive(s) into another computer of a completely different chipset, boot up, and continue operations as though nothing had ever gone wrong. IMHO, this is far preferable to the effective lock-in presented to you by hardware controllers.
For me, it's all 100% software RAID 1.
Re:Two words: RAID 0 (Score:5, Insightful)
I.e., three 500 GB drives in RAID 5 don't give you 1.5 TB (RAID 0 does that). With RAID 5 you only get 1 TB.
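The arithmetic behind that correction can be sketched as below. The function name and level labels are made up for illustration, and the model ignores filesystem overhead and controller-specific behavior.

```python
def usable_capacity_gb(level, drive_sizes_gb):
    """Usable capacity in GB for a simple array of the given level."""
    n = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if level == "jbod":
        # Concatenation: every drive contributes its full size, no redundancy.
        return sum(drive_sizes_gb)
    if level == "raid0":
        # Striping: each member contributes only as much as the smallest drive.
        return n * smallest
    if level == "raid1":
        # Mirroring: every member holds the same data.
        return smallest
    if level == "raid5":
        if n < 3:
            raise ValueError("RAID 5 needs at least three drives")
        # One drive's worth of space is spent on parity.
        return (n - 1) * smallest
    raise ValueError(f"unknown level: {level}")


print(usable_capacity_gb("raid5", [500, 500, 500]))     # 1000
print(usable_capacity_gb("raid0", [500, 500, 500]))     # 1500
print(usable_capacity_gb("raid5", [1000, 1000, 1000]))  # 2000
```

So three 1 TB drives in RAID 5 top out at 2 TB, not 3 TB. Whether an existing array actually grows to that size after the drives are swapped depends on the controller or software in use.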
Re:Duh (Score:4, Insightful)
RAID is just a first line of defense to reduce downtime.
Re:KISS it (Score:3, Insightful)
For info, I use an Adaptec 2420SA - a very nice card as it turns out. I must commend Adaptec on pulling their fingers out on their hardware RAID and Linux support, as aacraid is right in the kernel now. Their previous cards were terrible. The bottleneck is in I/O, and not on the CPU. Hotswap is quite important. If you don't have the convenience of hotswap then a lot of the reasons you have RAID for (easy replacement of downed drives) are gone. Yes it is, and good hardware RAID cards are actually quite cheap. The convenience of the OS seeing an array as one big disk and having the hardware handle it is great, because it takes a layer of management out of the system. Just make sure you have good management tools, which Adaptec certainly seem to have, along with 3ware. Remember that the OS will simply not know if there is a problem, because all it sees is a disk.
Re:I would use (and do use) linux software raid (Score:2, Insightful)
Forget RAID (Score:3, Insightful)
I find the best approach is to have another computer or possibly external drives sitting somewhere, and just make weekly/daily/monthly/whatever rsync copies between them. This lets you recover from user error like accidental deletions, and if the entire system goes down you're covered. Want more space? Add a drive and presto, more space. No special configuration required. No expensive controller cards (or cheap and slow controller cards) required.
And if you're like me, you have another set of drives stored offsite... but I'm pretty paranoid about such things. =P
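The periodic rsync copy described above might look something like this when driven from a script. It is a sketch only: the paths and host name are hypothetical, and the choice of flags is an assumption rather than what the poster actually runs.

```python
import subprocess


def build_rsync_command(source, destination):
    """Build an rsync command that copies source into destination.

    -a preserves permissions, timestamps and symlinks. --delete is left out
    on purpose, so a file deleted from the source by accident stays on the copy.
    """
    # The trailing slash makes rsync copy the directory's contents.
    return ["rsync", "-a", source.rstrip("/") + "/", destination]


def run_backup(source, destination):
    """Run the copy and raise if rsync reports an error."""
    subprocess.run(build_rsync_command(source, destination), check=True)


# Hypothetical paths; "backupbox" stands in for the second machine.
print(build_rsync_command("/srv/media", "backupbox:/srv/media-copy"))
```

Scheduling `run_backup` from cron gives the weekly/daily/monthly rhythm. Note that a file overwritten with bad contents will still be copied over the good one on the next run.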
Re:Is Google broken today? (Score:3, Insightful)
X-RAID and an Infrant ReadyNAS NV+ = gold. These NAS were built by people who actually thought about the home or prosumer's needs and built something that addressed it, instead of "here's what we offer, take it or leave it".
(However, we should note that Netgear has bought Infrant, so it's the Netgear ReadyNAS now.)