Ask Slashdot: Simple Way To Backup 24TB of Data Onto USB HDDs?
An anonymous reader writes "Hi there! I'm looking for a simple solution to back up a big data set consisting of files between 3MB and 20GB, for a total of 24TB, onto multiple hard drives (USB, FireWire, whatever). I am aware of many backup tools which split the backup onto multiple DVDs with the infamous 'insert disc N and press continue', but I haven't come across one that can do it with external hard drives ('insert next USB device...'). OS is not relevant, but Linux (console) or Mac OS (GUI) preferred. Did I miss something, or is there no such thing already done, and am I doomed to code it myself?"
USB and disk Speed (Score:5, Insightful)
Re:USB and disk Speed (Score:5, Informative)
Re:USB and disk Speed (Score:4, Insightful)
I'd be willing to bet his change rate isn't 24TB/day.
Re:USB and disk Speed (Score:4, Funny)
Maybe he's personally backing up CERN?
Re: (Score:3, Informative)
The LHC generates a petabyte per second [slashdot.org].
Re: (Score:3, Informative)
Comment removed (Score:5, Funny)
Re:USB and disk Speed (Score:5, Funny)
If the OP's porn collection can be logically broken up at some level, e.g. one top-level directory per category,
then the backup software could create one job for each directory, and multiple USB disks could be attached at once giving increased throughput. USB3 also increases speed to the point where the 7200RPM disk itself will become the bottleneck.
So at a 100MB/second per-disk write speed with 4 disks going at once (assuming the source disks are capable of supplying this volume of data and there are no other throughput limitations), you could do it in about 16 hours, or 24 hours with more realistic margins.
If it turns out that the source data is not porn (unlikely) and is highly compressible, then it could be done in far less time.
Bacula can do all of this.
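If you'd rather not set up Bacula just for this, here's a minimal sketch of the same idea with plain rsync: one stream per backup disk, with the top-level directories dealt out round-robin. The /data and /mnt/usb1..4 paths are made up, and nothing here checks that each disk has room for its share.

#!/bin/bash
# One sequential rsync stream per backup disk; the top-level directories
# are dealt out round-robin so all four disks are written in parallel.
disks=(/mnt/usb1 /mnt/usb2 /mnt/usb3 /mnt/usb4)
dirs=(/data/*/)
for d in "${!disks[@]}"; do
  (
    for ((i = d; i < ${#dirs[@]}; i += ${#disks[@]})); do
      rsync -a "${dirs[$i]}" "${disks[$d]}/"
    done
  ) &
done
wait   # block until all streams have finished

This only helps if the source array can actually feed several streams at once.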
Re: (Score:2, Funny)
Or, he could watch the content as it is copied. At 600 Mbytes/hour (assuming standard mpeg compression), it would be a month of 24/7 nonstop action!
"- Hey boss, I need to, uhh, work from home for the next four weeks to handle the backup..."
Re:USB and disk Speed (Score:5, Funny)
then the backup software could create one job for each directory,
Is that what we call a blow job?
Re:USB and disk Speed (Score:5, Funny)
No. No it is not.
Re:USB and disk Speed (Score:5, Funny)
Re: (Score:2, Informative)
It's "nudge-nudge", not "notch-notch".
Also, you left out "wink-wink".
Yes, I know, I should get a life..
Re: (Score:3, Funny)
Bacula can do all of this
So he quantum leaps into you, and isn't allowed to leave until he performs the backup? Oh wait! Bacula, not Bakula.
Re:USB and disk Speed (Score:5, Funny)
Re:USB and disk Speed (Score:5, Informative)
I have a setup here where the server's video media is about 8TB in size. That backs up via rsync to the backup server, which is in another room. It contains a large number of internal and external drives, none of them over 2TB in capacity. The main drive has data separated into subfolders, and the rsync jobs back up specific folders to specific drives.
A few times I've had to do some rearranging of data on the main and backup drives when a volume filled up. So it helps to plan ahead to save time down the road. But it works well for me here.
The only thing with rsync you need to worry about is users moving large trees or renaming root folders in large trees. This tends to cause rsync to want to delete a few TB of data and then turn around and copy it all over again on the backup drive. It doesn't track files and folders by inode; it just goes by exact location and name.
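For what it's worth, a bare-bones sketch of that folder-to-drive mapping (the mount points and folder names below are made up, not from the parent post); the --delete flag is exactly what makes the rename problem above so expensive:

# one rsync job per source-subfolder -> backup-drive mapping
rsync -a --delete /media/main/movies/    /mnt/backup1/movies/
rsync -a --delete /media/main/tv/        /mnt/backup2/tv/
rsync -a --delete /media/main/homevideo/ /mnt/backup3/homevideo/
# renaming /media/main/movies on the source would make the first job delete
# and re-copy the whole tree, since rsync matches by path and name, not inode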
I help mitigate this by hiding the root folders from the users. The share points are a couple levels deeper so they can't cause TOO big of a problem if someone decides to "tidy up". If they REALLY need something at a lower level moved or renamed, I do it myself, on both the source and the backup drives at the same time.
Another alternative is to get something like a Drobo where you can have a fairly inexpensive large pool of backup storage space that can match your primary storage. This prevents the problem of smaller backup volumes filling up and requiring data shuffling, but does nothing for the issue of users mucking with the lower levels of the tree.
Re:USB and disk Speed (Score:5, Interesting)
Agreed. Best thing I ever did was get a computer case with a SATA sled bay, like one of these [newegg.com]. It won't help with breaking up the files, but a plain SATA connection will be many times faster and many times cheaper than getting external USB drives (because you don't have to keep paying for external case + power supply). After you copy it over, you just store the bare drives in a nice safe place.
This assumes it's a one-time or rare thing. If you do want regular access or the backup process is a recurring thing, then a NAS or RAID setup is probably more convenient so that you don't have to keep swapping drives in and out.
Re: (Score:3)
If you're just using RAID to make a bunch of disks look like a single logical unit, consider mhddfs [mindlesstechie.net]. It's a FUSE filesystem which makes a bunch of disks look like a single unit. I've used it for storing backups - it works as advertised.
IIRC there were one or two caveats like a lack of hard link support so make sure you try all your use cases before relying on it.
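For reference, basic usage is a one-liner; the mount points below are hypothetical:

# pool three backup disks into a single logical mount point
mhddfs /mnt/disk1,/mnt/disk2,/mnt/disk3 /mnt/pool -o allow_other
# new files land on the first member disk with enough free space;
# unmount again with: fusermount -u /mnt/pool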
Re: (Score:3)
Actually, they are roughly the same price.
SATA is more widespread, though, and avoids any reduction in performance you might get from putting an intermediate layer in front of the drive's native interface. A large external drive is also going to require a wall wart, and all of those will need to be looked after.
The problem with case + power supply is not the cost but the fact that it is something else to lose. This goes for the extra cabling too.
Plus with a bare drive you can buy with performance in mind since the driv
Re: (Score:2)
If he's looking for reliability in a backup, then his choice of disks is going to be a factor. A drive with consumer-grade URE rates is likely to hit an unrecoverable error within a handful of full writes and reads. USB-grade drives (Caviar Green, anyone?) aren't known for their reliability. Something like a Hitachi Ultrastar has a very, very low chance of encountering a URE, so it will be much more reliable.
Bacula is your friend (Score:5, Informative)
http://www.bacula.org/en/ [bacula.org]
There's even a howto here:
http://wiki.bacula.org/doku.php?id=removable_disk [bacula.org]
Re: (Score:2)
was going to suggest bacula as well, but came a bit late :)
Re: (Score:3, Informative)
Yes, Bacula is the only real solution out there that isn't going to cost you an arm and a leg, and that allows you to switch easily between backup media. As long as your MySQL catalog is intact, restoration is a cinch...
Did I mention it supports archiving as well, if you want duplicate copies on tapes to ship off-site...
Re:Bacula is your friend (Score:5, Informative)
Yes, Bacula is the only real solution out there that isn't going to cost you an arm and a leg, and that allows you to switch easily between any backup medium.
Except for good old tar, which is present on all systems.
Most people are probably not aware that tar has the ability to create split tar archives. Add the following options to tar:
-L <max-size-in-KiB-per-volume> -M -F myscript.sh
...where myscript.sh echoes out the name to use for the next tar file in the series. It can be as simple as a loop that checks whether the tar file already exists and returns the next hooked-up volume where it doesn't.
Or it could even unmount the current volume and automount the next volume for you. Or display a dialogue telling you to replace the drive.
One advantage is that you can easily extract from just one of the tar files; you don't need all of them, or the first-and-last like with most backup systems. Each tar file is a valid archive, and at most you need two tar files to extract any given file, usually just one.
Note that GNU tar will not combine multi-volume mode with its built-in compression options, so if you want compression, compress the files beforehand or pipe each volume through a compressor.
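Putting that together, a minimal sketch (the /mnt/backup path, the 1900G volume size and the script name are assumptions, not anything magic):

#!/bin/bash
# myscript.sh -- run by GNU tar at the end of each volume (via -F).
# Pause so the operator can swap drives, then hand tar the next archive
# name through the $TAR_FD descriptor that tar opens for the script.
echo "Volume $TAR_VOLUME is full." > /dev/tty
read -r -p "Mount the next backup drive on /mnt/backup and press Enter... " < /dev/tty
echo "/mnt/backup/backup.tar.$TAR_VOLUME" >&"$TAR_FD"

Then create the archive with something like:

# multi-volume archive (no -z/-j here: multi-volume mode refuses built-in
# compression); older tar versions want -L as a plain number of KiB
tar -c -M -L 1900G -F ./myscript.sh -f /mnt/backup/backup.tar.1 /data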
Re: (Score:3)
I know you tried to make an asshat joke, but I'll respond anyhow:
Yes, Microsoft provides tar (and many other useful apps primarily associated with Unix and Linux).
Quoting Wikipedia:
"Interix versions 5.2 and 6.0 are respective components of Microsoft Windows Server 2003 R2, Windows Vista Enterprise, Windows Vista Ultimate, and Windows Server 2008 as Subsystem for Unix-based Applications[1] (SUA[2]). Version 6.1 is included in Windows 7 (Enterprise and Ultimate editions), and in Windows Server 2008 R2 (all ed
Re: (Score:2)
Another thumbs up for Bacula if you need more than a single backup of your data (as opposed to copying it to drives only once).
Split into multiple tar files? (Score:5, Informative)
I'm guessing you don't have enough space to split a backup on the original storage medium and then mirror the splits onto each drive?
Given the size requirements, it seems that might be prohibitive, but it would make things easier for you:
How to Create a Multi Part Tar File with Linux [simplehelp.net]
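If you do have the scratch space, the split itself is a one-liner (assuming GNU split, which accepts the G size suffix; the 1900G piece size and the names are placeholders):

# cut one big tar stream into ~1.9TB pieces, then copy one piece per drive
tar -cf - /data | split -b 1900G - backup.tar.part-
# to restore, concatenate the pieces back into a single tar stream:
cat backup.tar.part-* | tar -xf -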
A Full 24TB using only 2 USB ports (Score:2)
RAID (Score:5, Informative)
For that much data you want a RAID, since drives tend to fail if left sitting on the shelf, and they also tend to fail (for different reasons) if they are kept spinning.
Basically: buy a RAID enclosure, insert drives so it looks like one giant drive, then copy files.
For 24TB you can use eight 4TB drives for a 6+2 RAID-6 setup. Then if any two of the drives fail you can still recover the data.
Re: (Score:2)
Re: (Score:3)
For that much data you want a RAID, since drives tend to fail if left sitting on the shelf, and they also tend to fail (for different reasons) if they are kept spinning. Basically: buy a RAID enclosure, insert drives so it looks like one giant drive, then copy files. For 24TB you can use eight 4TB drives for a 6+2 RAID-6 setup. Then if any two of the drives fail you can still recover the data.
Yeah... though I suspect with the price premium for 4TB drives - they're huge - and the cost of an 8-port RAID6 capable RAID card you're considerably above the budget he was going for. If this is like "projects" or something I'd probably suggest the human archiving method - split your live disk into three areas, "work in progress" and "to archive" and "archive". Your WIP you back up completely every time, your "to archive" you add to the latest archive disk (plain, no RAID), and make an index of it so you c
Re: (Score:2)
More to the point - do what the parent post said, but use something like FreeBSD or Solaris and ZFS with a raidz3 setup (triple parity, one step beyond RAID 6), which gives you block-level dedup, snapshotting, compression, encryption, etc.
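A rough sketch of what that looks like (FreeBSD-style device names, and the pool name and property choices are assumptions; dedup in particular wants a lot of RAM):

# 8 disks, triple parity: any three can die before data is at risk
zpool create backup raidz3 da0 da1 da2 da3 da4 da5 da6 da7
zfs set compression=on backup
zfs set dedup=on backup              # optional, very RAM-hungry
zfs snapshot backup@$(date +%F)      # cheap point-in-time snapshot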
Re:RAID (Score:4, Informative)
As mentioned already, RAID is not a backup solution. While it will likely work fine for a while, the risk [datamation.com] of a catastrophic failure rises as drive capacity increases. From the linked article:
With a twelve-terabyte array the chances of complete data loss during a resilver operation begin to approach one hundred percent - meaning that RAID 5 has no functionality whatsoever in that case. There is always a chance of survival, but it is very low.
Granted, this is talking about RAID 5, so let's naively assume that doubling the parity disks for RAID 6 will halve the risk... but then since we're trying to duplicate 24 terabytes instead of twelve, we can also assume the risk doubles again, and we're back to being practically guaranteed a failure.
Bottom line is that 24 terabytes is still a huge amount of data. There is no reliable solution I can think of for backing it all up that will be cheap. At that point, you're looking at file-level redundancy managed by a backup manager like Backup Exec (or whatever you prefer) with the data split across a dozen drives. As also mentioned already, the problem becomes much easier if you're able to reduce that volume of data somewhat.
Re: (Score:3)
You should check out ZFS. These issues go away. And with RAID-Z3, up to 3 drives can die before you have a problem.
Re:RAID (Score:4, Informative)
As mentioned already, RAID is not a backup solution.
Nevertheless, there is nothing wrong with using disks that happen to be in a RAID configuration as backup disks. In fact, it is probably a pretty good idea for large files and large amounts of data.
Re: (Score:3)
Quite the contrary, and that's my point. The errors here aren't just "let's try again" failures. They're unrecoverable, final, data-is-gone-forever errors, and the chances of encountering one are very high with so much data. Resilvering such a large array is practically impossible (as described in the article I linked to). Without resilvering and having blocks spread among disks, losing one disk means you've lost a little bit of everything, so all your data is corrupt, rather than just the fraction that was
Re: (Score:2)
Well, actually you're not helping someone do anything. You're just vomiting up speculative accusations.
See above. Then ignore children, CD rot, and every other legitimate reason for backing up the optical media that you've spent your hard-earned money on.
Now run along and sue everyone who's provided actual, helpful advice. However, you may want to look up the standard for "contributory infringeme
Re: (Score:3)
ZFS + snapshots: problem solved. Though you do need more drives than 8x3TB.
Re: (Score:3)
You didn't read what I said. Yes, ZFS + snapshots, but you also need at least Sun Cluster replication and tape backup. ZFS + snapshots doesn't save you from fires, floods, software bugs and ill will. It does save you from idiots and disk failure, though.
Re: (Score:3)
Sure it does, when you have a second set of them. Where you store tapes, store the drives instead. What do you think all those virtual tape vaults are made of?
Julian? (Score:5, Funny)
Out on bail mate?
git-annex (Score:4, Informative)
You might want to look into git-annex:
http://git-annex.branchable.com/ [branchable.com]
I've not tried it, but it sounds like an ideal solution for your request, especially if your data is already compressed.
Tape? (Score:5, Insightful)
Why not tape, a backup RAID, a SAN or some other dedicated backup hardware solution?
24TB is well within the range where a professional solution is warranted.
Given a hard disk size of ~1TB, making a single backup onto 24 disks isn't a backup; it's throwing data in a garbage can.
More than likely at least one of those disks will die before its time.
Re:Tape? (Score:5, Insightful)
Yup, spool to tape. Get an SDLT600 tape cabinet and call it done. If you get a 52-tape robot cabinet you will have space to hold not only a complete backup but a second full backup plus incrementals, all running automatically. Plus it has the highest reliability.
And to anyone whining about the cost: if your 24TB of data is not worth that much, then why are you bothering to back it up?
Re:Tape? (Score:5, Informative)
No kidding. For $2400, you get 24x 1TB HDs and a bookkeeping nightmare if you ever actually resort to the "backup." For $3k, you get a network-ready tape autoloader with 50-100TB capacity and easy access through any number of highly refined backup and recovery systems.
Now, if the USB requirement is because that's the only way to access the files you want to steal from an employer or government agency, then the time required to transfer across the USB will almost guarantee you get caught. Even over the weekend. You should come up with a different method for extracting the data.
Re: (Score:2)
Assuming the 24TB is worth backing up as a single backup.
Your movie collection is probably (A) not worthy of backup and (B) far more easily backed up as individual movies.
tar --multi-volume (Score:5, Interesting)
Tar already does this (Score:3, Informative)
Re:Tar already does this (Score:5, Informative)
Multi-volume tar [gnu.org]. Just mount a new USB disk whenever the current one is full.
However, to get a reasonable retrieval rate (going through 24TB of data will take some days over USB2), you had better split the dataset into multiple smaller sets. That also has the advantage that if one disk crashes (and consumer-grade USB disks will crash!), you don't lose your entire dataset.
For that reason (disk failure), do not use some Linux disk-spanning feature. File systems are lost when one of the disks they are written on is lost, unless you use a feature that can handle lost disks (RAID/raidz).
And last but not least: test your backup. I have personally seen cheap USB interfaces fail to write data to disk without any useful error message. Everything looks OK until you retrieve the data and find some files are corrupted.
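One cheap way to do that test is to build a checksum manifest on the source and verify it against the freshly written backup disk. The paths below are made up, and if a disk only holds part of the set you'll get "No such file" noise for the rest:

# on the source
cd /data && find . -type f -print0 | xargs -0 md5sum > /tmp/manifest.md5
# on the backup disk, after copying
cd /mnt/usb1 && md5sum -c /tmp/manifest.md5 2>&1 | grep -v ': OK$'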
Linuxquestions thread on multi-disk backups (Score:2, Informative)
Here's a Linuxquestions thread [linuxquestions.org] outlining multi-disk backup strategies.
The gist of the discussion is to use DAR [linux.free.fr].
No. (Score:2, Insightful)
I'm not sure if you posed the question out of naivety, or if it's just daftness. You don't want to be moving 24TB over the USB bus. End of discussion really - at least in terms of USB.
However you ended up looking at USB for this, it was the wrong way to go.
You have lots of choice in terms of boxes, servers, NAS boxes, locally attached storage. 24TB is in the range of midrange NAS boxes.
Once you have this, you can start to make choices on the many backup, replication, and duplication bits of software
Re: (Score:2, Informative)
USB 2.0 provides 480Mbps of (theoretical) bandwidth. So unless you go Gigabit all over your network (not unreasonable), you won't beat it with a NAS. Even then, it's only 1-and-a-bit times as fast as USB working flat-out (and the difference being if you have multiple USB busses, you can get multiple drives working at once). And USB 3.0 would beat it again. And 10Gb between the client and a server is an expensive network to deploy still.
Granted, eSATA would probably be faster but there's nothing wrong wi
Re: (Score:2)
You don't need a Gigabit connection everywhere, just on your computer and the NAS directly connected to your computer.
USB2 is not a very good option. For some reason, I've been getting poor performance from Linux with storage mounted via USB. Your best bet is eSATA. If you can't install eSATA but have a Gigabit Ethernet connection, then go that route. USB2 is the connection of last resort when talking about backing up 24TB.
Re: (Score:3)
USB 2.0 provides 480Mbps of (theoretical) bandwidth. So unless you go Gigabit all over your network (not unreasonable), you won't beat it with a NAS. Even then, it's only 1-and-a-bit times as fast as USB working flat-out (and the difference being if you have multiple USB busses, you can get multiple drives working at once).
The 480Mbps is nowhere near what you will see in practice, unlike network speeds, which are far closer to the rated maximum. Most USB drives I've seen top out somewhere between 25 and 30MByte/sec, and if there are no other bottlenecks it isn't unusual to see 100MByte/sec from a gigabit switched network. My main desktop pulls things from the fileserver at around 80MByte/sec, which is as fast as local reads tend to be on that array. So you are right about 100Mbit networks: that'll be the bottleneck, not USB, bu
You know... (Score:5, Funny)
Seriously: Build your own homebrew NAS. (Score:5, Interesting)
What you're attempting isn't easy; it's actually difficult.
Buy a cheap and big refurbished workstation or rackmount server, install a few extra SATA controllers and maybe a new power supply, hook up 12 2TB drives, install Debian, check out LVM and you're all set.
Messing around with 12-24 external HDDs and their power supplies is a big hassle and asking for trouble. Don't do it. Do seriously consider building your own NAS. You'll be thankful in the end, and it won't take much longer; it might even go faster and be cheaper if you can get the parts quickly.
My 2 cents.
Re:Seriously: Build your own homebrew NAS. (Score:5, Interesting)
Way to redefine the problem instead of working within the specifications.
Perhaps:
1. The poster ALREADY has a NAS and wants to have airgapped or even offsite/offline backup.
2. External HDDs are fast, common, reasonably cheap, and do not have a single point of failure (e.g., the tape backup drive in many suggested alternatives)
I'm interested in this question. I use this general setup, but on a smaller scale. I cannot put a NAS in a safety deposit box. I cannot ensure that my "backup" NAS would not be drowned in a flood, burned in a fire, fried by a lightning strike...
Let's pretend the poster is not an idiot, and answer the actual question. If he has 24TB of data, IT'S ALREADY ON DAS/NAS. Geesh.
Re: (Score:2)
Don't assume he was the one that created his current storage solution. It could be a turnkey solution that he purchased, like one of those movie storage devices we read about on slashdot earlier this year.
If he installed his current storage configuration himself then why did he need to ask this question on Slashdot? I don't see any particular bad answers, and no one is insulting
Bash.... (Score:5, Informative)
First, a bash script to grab the size of the "current" storage;
compress files up to that size;
move the compressed file onto the storage;
request new storage, and start again.
----------
Or, if you've got all the storage already connected: for x in $(seq 0 $n); do cp "${archive}${x}" "/mount/${x}/"; done :D
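A slightly fleshed-out sketch of the same idea, copying file by file and prompting for a fresh drive when the current one is nearly full (the paths and the ~50GB safety margin are assumptions; GNU df/du are assumed for the options used):

#!/bin/bash
# SRC is walked file by file; when DEST gets close to full, ask the
# operator to swap in the next drive before continuing.
SRC=/data
DEST=/mnt/usb
MARGIN=$((50 * 1024 * 1024))            # keep ~50GB free (value in KiB)
find "$SRC" -type f | while read -r f; do
  need=$(du -k "$f" | cut -f1)
  free=$(df -k --output=avail "$DEST" | tail -1)
  if [ $((free - need)) -lt "$MARGIN" ]; then
    read -r -p "Disk full. Mount the next drive on $DEST and press Enter..." < /dev/tty
  fi
  rel=${f#$SRC/}                        # path relative to SRC
  mkdir -p "$DEST/$(dirname "$rel")"
  cp -a "$f" "$DEST/$rel"
done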
we generate a lot of data (3 GB/min)... (Score:2)
... by employing a detector with a size of 2463 x 2527 pixels (6M) at 12 Hz (12 times / sec). When run continuously for a set of data (roughly 900 degrees) ...
we collect 900 frames in roughly 2 minutes including hardware limitations for starting/stopping.
In proper format for processing, this works out to about 6MB/image and roughly 3GB/min for 2 minutes.
With an experienced crew of 3-4 people ... one handling the samples, one handling the liquid nitrogen, one running the software and one taking notes (ove
Re: (Score:2)
I'm not the OP but:
Because downloading 3.6TB to restore from a backup for just one day is pretty ridiculous for someone on home broadband?
Backup to external servers is ridiculous for anyone without university-sized access to the net. Hell, the school I work for tries to back up 10GB to a remote server each night and it often fails because it takes too long (and we're only allowed to do that because we're a school - the limits for even business use on the same connection are about 100GB a month).
Absent a stu
Re: (Score:2)
Two quick things:
1. Why do a complete restore of the 3.6TB? Just take the files that you want to use again or that have been lost.
2. Why work at home? It's home, not work.
USB is not for backup (Score:2)
USB is for a second working copy.
Backups should also ensure the durability of the copy. USB HDDs have a shorter lifespan than a normal HDD, which in turn has a shorter lifespan than tape, the usual medium for durable backups.
Re: (Score:2)
Use DAR or KDAR (Score:3, Informative)
If you don't want to invest in new hardware, you could use DAR [ubuntugeek.com] or KDAR [sourceforge.net] (KDE front-end for DAR).
With KDAR, what you want is the slicing settings [sourceforge.net].
There's an option to pause between slices, which gives you time to mount a new disk.
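On the command line, the same thing looks roughly like this (the slice size, paths and archive basename are placeholders):

# write 1.9TB slices, pausing between slices so a new drive can be mounted
dar -c /mnt/usb/backup -R /data -s 1900G -p
# later, restore from the slice set:
dar -x /mnt/usb/backup -R /restore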
Use purpose designed backup media. (Score:2)
Backup tapes were designed precisely for the problem you have. LTO-5 tapes are about 1.5TB, if I remember right. Stored correctly they shouldn't give any problems when you come to retrieve whatever is backed up. Most archiving efforts use backup tape, and they can't all be wrong :)
Re: (Score:2)
Actually, handling all those tapes and recovering data from them is very expensive in manpower and time, and can be very awkward for recovering data. Those tapes, and tape drives, are also _expensive_. They're useful for sites that require secure off-site storage, or encrypted off-site storage, but for most environments today they are pointless. Easily detachable physical storage has become very inexpensive, far more economical, and is far less vulnerable to the hazards of mishandling SCSI connection
Re: (Score:2)
Actually, for a data set this large it will probably work out only very slightly more expensive - and the benefit to be gained is worth it IMHO (in speed if nothing else - USB disks are *slow* and eat a lot of CPU). I live in the UK so I'll work in GBP. I think US prices are likely to be cheaper but the relative sizes will be similar.
I'd figure around ~£1100 for drive and SAS interface plus £500-700 for 24TB worth of media. Throw in an extra 2TB drive to spool to before you write to tape as well
Re: (Score:2)
The slow disk is why you use rsync or other such efficient mirroring technologies. The tapes have a limited lifespan, they require significant maintenance, and they have been prone to far too many mechanical failures and too much expensive downtime in my experience. The disks can actually be simultaneously connected for casual "read" access with a reasonable USB hub and possibly an additional USB card.
You've also left out the cost of recovery time for users. Swapping tapes to get recovery of arbitrary files is rather awkw
Re: (Score:3)
Whether tape or disk is appropriate really depends what you are intending to use the backup for and how important your data is. You might even choose to use a mixture of the two.
If it's your only backup, I would suggest that it's not wise to leave it permanently online in the way you suggest; that leaves you open to any number of potential issues which your backup is supposed to protect you from (OS bug, misconfiguration, lightning strike, power failure, overheating, ...). Tape libraries have the same issue
PAR (Score:4, Informative)
I have just seen "PAR" mentioned a couple of times here on Slashdot; I haven't used it, but it seems great for this: http://en.wikipedia.org/wiki/Parchive [wikipedia.org]. You need enough redundancy to allow one USB drive to fail. And I would rather get a SATA bay and use "internal" drives than have to deal with external USB drives. Get "green" drives; they are slow but cheap.
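For the record, the command-line tool is par2; a sketch with ~10% recovery data, which roughly covers losing one volume in ten (the file names are placeholders):

# create recovery blocks alongside the backup pieces
par2 create -r10 backup.par2 backup.tar.part-*
# later: check integrity, and rebuild damaged/missing pieces if possible
par2 verify backup.par2
par2 repair backup.par2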
NAS (Score:3)
I know (Score:2, Funny)
The iCloud! ;-)
Going about it all wrong (Score:2)
Re: (Score:2)
Build a box out of 2x4s and drywall scraps from Home Depot. Make it 5 sheets thick and it'll withstand any house fire you could possibly have.
I find that statement suspect. I am not saying you are wrong but extraordinary claims require extraordinary evidence.
I have seen some pretty nasty house fires, the kind where the fire department sprays water on the neighbors' houses to keep them from catching rather than trying to do anything about the one that is actually burning. With all the modern synthetic materials in furniture, carpeting, and other flooring, a house fire can hit 600 degrees and stay that way for hours.
If five sheets of dry wall ( .5" or
Read it from Torvald's lips (Score:4, Funny)
"Only wimps use tape[*] backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)"
Linus Torvalds (1996) http://en.wikiquote.org/wiki/Linus_Torvalds [wikiquote.org]
(Isn't that prescience of "The Cloud"?)
–––––––––– ;-)
* replace this with your favorite backup media of today
Re: (Score:3)
"Only wimps use tape[*] backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)"
Linus Torvalds (1996) http://en.wikiquote.org/wiki/Linus_Torvalds [wikiquote.org]
(Isn't that prescience of "The Cloud"?)
––––––––––
* replace this with your favorite backup media of today ;-)
"Only wimps use ftp[*] backup: real men just upload their important stuff to the iCloud, and let the rest of the world mirror it ;)"
An Amazon support employee (2012)
Re: (Score:2)
Linus should have patented his idea! He could have become a rich man.
Kim Dotcom (Score:2)
It's a little late to be asking that now.
Count Bacula (Score:3)
Count Bacula as your friend ;) -> http://www.bacula.org/ [bacula.org]
What do you have now? (Score:2)
Backup advice (Score:2, Insightful)
I do things like this all the time with a data set about half that size, ~12TB. You didn't say anything about what the data is, but from the request and the fact you mentioned USB I would gather this is your typical hoarded warez: mp3/flac, mkv, apps, and also a personal picture and video collection of family.
Here is a checklist I would execute, similar to mine. I find the most reliable way to keep your data over the years is by following a checklist or procedure and choosing when to move to the next storage pl
Keep it simple (Score:2)
Good luck
man tar (Score:2)
Fishy (Score:2)
Private Manning, is that you?
External RAID enclosure (Score:3)
You buy one of these:
http://www.newegg.com/Product/Product.aspx?Item=N82E16816322007 [newegg.com]
populate it with 4TB drives and create two RAID5 arrays (or one RAID6), and you've got 24 or 28 TB of backup space without having to change drives or break up your backup into smaller chunks.
But really, your backup methodology is broken; you need to organize the data into manageable chunks, because aside from a large dedicated backup server/SAN, there is no reliable (don't tell me tape is reliable) backup solution for such a large quantity of data in a single chunk.
What I do for backups: in my 24-bay server I have eight large drives in a (HARDWARE) RAID5 array (were 4TB drives available at the time I'd have gone RAID6) and rsync the virtualized server contents to that, then archive them into tarballs, and send copies of them across the LAN to another server that is running (HARDWARE) RAID5 as well. Every once in a while I back up the critical data (source, scripts, financial data, production web sites, /etc, and so forth but not the program binaries nor system binaries which are easily recreated or reinstalled, respectively) to optical media and external hard drives.
So what I have in summary is:
* Massive server with a backup array separate from the production array
* Separate backup server running another array (again, using a quality HARDWARE RAID controller. Safeguard your data and don't bother with Intel, Adaptec, Promise, or Highpoint "hybrid" RAID)
* Periodic backups of non-recreatable data to USB drives and optical media that are moved off site.
Re: (Score:3)
The drobo won't allow that, the file system is spread across all the drives.
I guess it kind of depends on what the author needs to do with the drives when he's finished writing to them.
Re: (Score:2)
An 8 drive DroboPro with 3 TB disks might just about do it.
Check out:
http://www.drobo.com/products/professionals/drobo-pro/index.php [drobo.com]
Re: (Score:3)
Actually 8x4 TB disks will do it, with the overhead etc, giving you 24.96 TB usable space.
Re:DaisyChain (Score:5, Informative)
We did this exact thing using WD Green drives for our 18TB backup problem. Got two of them, planning on using their built-in rsync for keeping an on-site/off-site copy of the data. Unfortunately, the units never broke 1MB/s transfer, and no amount of work with Drobo yielded reliably faster performance. Both of our units are now sitting unused ($2500 each!), and we put the drives into a RAID-50 8-bay USB3 enclosure. The new unit runs about 150x faster, and ended up costing $400 (prices are for enclosures only; drives were additional).
Most disappointing was Drobo's support - they just seemed to shrug a lot, and were hyper-aggressive about closing trouble tickets.
Re: (Score:2)
I have a Synology NAS and I'm very pleased with it. I don't have anywhere near the volume of data the OP has, though. One thing with a NAS is that you'll be subject to the network's available bandwidth and, depending on your setup, this could make backing up lots of data pretty darn tedious. And it might annoy the admin (and other users). So while a decent portable RAID might be the better option, it might be better to find one that just plugs in rather than uses the network. Might find one that can be set up to use
Re: (Score:3)
The 200GB range drives in my main server have been trundling along for many years while I have a pile of 0.5-2TB hard drives I need to go through and get warrantied (three of them Caviar blacks). Not impressed with the big drives.
Re: (Score:2)
Are you REALLY sure that you want to use USB HDDs? The cost savings of using a box of HDDs may well be offset by the hassle in finding the backup software, the manual labor of swapping them, finding the correct drive to retrieve a certain file, etc.
How about a pair of Synology DS1512+ NASes? In addition to getting all of the storage online at all times, you get RAID support, etc.
No reason why they can't all be attached at once. With 3TB disks and 8 USB3 ports, you'd only need to plug them all in to do the backup, then remove them all to take them offsite when the backup is done.
A few portable NASes holding 4 disks each might be a better option, but don't exclude USB for its simplicity.
Re:solution (Score:5, Informative)
3.samba
Uh? Why?
cp -a is all you need once you put the HDD inside the target machine.
And if you put it into another machine on the same network, then rsync is the answer.
Forget about the buggy and slow SAMBA.
Re: (Score:2)
Agreed. Samba should be at the very bottom of the list. It is the best solution only when there's no other solution.
Re:solution (Score:4, Insightful)
Is it that much faster for 3MB to 20GB files?
Re:solution (Score:5, Informative)
No. It's slower. Informative, my ass.
Re: (Score:3)
Yes. The above tar command is really from a time when cp did not have r and p options (and still likely doesn't on some systems so it's worth knowing). OTOH, you can add in the z option (compress) if you're doing something networky (though you'll probably want to throw in netcat or ssh too in that case). Of course, if you're doing that, rsync is probably the better option if available and leads to some interesting backup options going forward.
Re: (Score:3)
How transportable is that, though?
I mean, if I copied 200 gigs across 3 drives in a JBOD array, could I plug just one drive in to access the information on another machine? Suppose my laptop only has 2 USB ports and I do not have a hub, plus I'm running a different OS - does this mean I can't look for information on the set?
I have never used JBOD for RAID. I have, however, used regular mirrored and striped RAIDs with and without fault tolerance (RAID 5 and 10, or a mirrored stripe, for instance) and know this can b
Re: (Score:3)
To my mind, Bacula would be a good choice as you can set up virtual tapes that will correspond to the drives and you can set the backup to wait for the operator to swap over the drive a