Data Storage

Ask Slashdot: Simple Way To Backup 24TB of Data Onto USB HDDs?

An anonymous reader writes "Hi there! I'm looking for a simple solution to back up a big data set consisting of files between 3MB and 20GB, 24TB in total, onto multiple hard drives (USB, FireWire, whatever). I am aware of many backup tools that split the backup across multiple DVDs with the infamous 'insert disc N and press continue', but I haven't come across one that can do the same with external hard drives ('insert next USB device...'). The OS doesn't matter much, but Linux (console) or Mac OS (GUI) is preferred. Did I miss something, or is there no such thing and am I doomed to code it myself?"
  • USB and disk Speed (Score:5, Insightful)

    by gagol ( 583737 ) on Friday August 10, 2012 @04:27AM (#40943507)
    May be your limiting factor here.
    • by gagol ( 583737 ) on Friday August 10, 2012 @04:30AM (#40943529)
      If you can achieve a sustained write speed of 50 megabytes per second, you are in for 140 hours of data transfer. I hope it is not a daily backup!
    • by jamesh ( 87723 ) on Friday August 10, 2012 @05:20AM (#40943779)

      If the OP's porn collection can be logically broken up at some level, eg:

      /porn/blonde
      /porn/brunette
      /porn/redhead

      then the backup software could create one job for each directory, and multiple USB disks could be attached at once giving increased throughput. USB3 also increases speed to the point where the 7200RPM disk itself will become the bottleneck.

      So at 100MB/second write speed per disk with 4 disks going at once (assuming the source disks are capable of supplying this volume of data and there are no other throughput limitations), you could do it in 16 hours, or 24 hours with more realistic margins.

      If it turns out that the source data is not porn (unlikely) and is highly compressible, then it could be done in far less time.

      Bacula can do all of this.
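
      If you'd rather not set up Bacula, a rough shell sketch of the same one-job-per-directory idea, run in parallel (the /mnt/usb1..3 mount points are hypothetical, assuming one disk per directory):

      # start one copy job per directory, each writing to its own USB disk
      rsync -a /porn/blonde/ /mnt/usb1/blonde/ &
      rsync -a /porn/brunette/ /mnt/usb2/brunette/ &
      rsync -a /porn/redhead/ /mnt/usb3/redhead/ &
      wait # returns once all three transfers have finished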

      • Re: (Score:2, Funny)

        by Anonymous Coward

        Or, he could watch the content as it is copied. At 600 Mbytes/hour (assuming standard mpeg compression), it would be a month of 24/7 nonstop action!

        "- Hey boss, I need to, uhh, work from home for the next four weeks to handle the backup..."

      • by Pieroxy ( 222434 ) on Friday August 10, 2012 @05:42AM (#40943883) Homepage

        then the backup software could create one job for each directory,

        Is that what we call a blow job?

      • Re: (Score:3, Funny)

        by Anonymous Coward

        Bacula can do all of this

        So he quantum leaps into you, and isn't allowed to leave until he performs the backup? Oh wait! Bacula, not Bakula.

      • by v1 ( 525388 ) on Friday August 10, 2012 @07:31AM (#40944455) Homepage Journal

        I have a setup here where the server's video media is about 8TB in size. That backs up via rsync to a backup server in another room, which contains a large number of internal and external drives, none over 2TB in capacity. The main drive has its data separated into subfolders, and the rsync jobs back up specific folders to specific drives.

        A few times I've had to do some rearranging of data on the main and backup drives when a volume filled up. So it helps to plan ahead to save time down the road. But it works well for me here.

        The only thing with rsync you need to worry about is users moving large trees or renaming root folders in large trees. This tends to cause rsync to want to delete a few TB of data and then turn around and copy it all over again on the backup drive. It doesn't follow files and folders by inode, it just goes by exact location and name.

        I help mitigate this by hiding the root folders from the users. The share points are a couple levels deeper so they can't cause TOO big of a problem if someone decides to "tidy up". If they REALLY need something at a lower level moved or renamed, I do it myself, on both the source and the backup drives at the same time.
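
        For illustration, the folder-to-drive mapping described above might look like this (paths hypothetical; note that --delete is what mirrors removals, and hence what hurts when a user renames a large tree):

        # one rsync job per subfolder, each pinned to its own backup drive
        rsync -a --delete /srv/media/movies/ /mnt/backup1/movies/
        rsync -a --delete /srv/media/tv/ /mnt/backup2/tv/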

        Another alternative is to get something like a Drobo where you can have a fairly inexpensive large pool of backup storage space that can match your primary storage. This prevents the problem of smaller backup volumes filling up and requiring data shuffling, but does nothing for the issue of users mucking with the lower levels of the tree.

    • by Anonymous Coward on Friday August 10, 2012 @05:30AM (#40943837)

      Agreed. Best thing I ever did was get a computer case with a SATA sled bay, like one of these [newegg.com]. It won't help with breaking up the files, but a plain SATA connection will be many times faster and many times cheaper than getting external USB drives (because you don't have to keep paying for external case + power supply). After you copy it over, you just store the bare drives in a nice safe place.

      This assumes it's a one-time or rare thing. If you want ongoing access, or the backup is a regular process, then a NAS or RAID setup is probably more convenient so that you don't have to keep swapping drives in and out.

      • If you're just using RAID to make a bunch of disks look like a single logical unit, consider mhddfs [mindlesstechie.net], a FUSE filesystem that pools several disks into one mount point. I've used it for storing backups - it works as advertised.

        IIRC there were one or two caveats like a lack of hard link support so make sure you try all your use cases before relying on it.
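
        For reference, pooling disks with mhddfs is a one-liner (mount points hypothetical):

        # present three already-mounted disks as a single filesystem
        mhddfs /mnt/disk1,/mnt/disk2,/mnt/disk3 /mnt/pool -o allow_other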

    • by shokk ( 187512 )

      If he's looking for reliability in a backup, then his choice of disks is going to be a factor. At consumer-grade URE rates, a drive is likely to hit an unrecoverable read error within a handful of full-capacity writes and reads. USB-grade drives (Caviar Green, anyone?) aren't known for their reliability. Something like a Hitachi Ultrastar has a very, very low chance of encountering a URE, so it will be much more reliable.

  • by bernywork ( 57298 ) <bstapletonNO@SPAMgmail.com> on Friday August 10, 2012 @04:32AM (#40943543) Journal

    http://www.bacula.org/en/ [bacula.org]

    There's even a howto here:

    http://wiki.bacula.org/doku.php?id=removable_disk [bacula.org]

    • by richlv ( 778496 )

      was going to suggest bacula as well, but came a bit late :)

    • Re: (Score:3, Informative)

      by Anonymous Coward

      Yes, Bacula is the only real solution out there that isn't going to cost you an arm and a leg, and it lets you switch easily between backup media. As long as your MySQL catalog is intact, restoration is a cinch...

      Did I mention it also supports archiving, if you want duplicate copies on tapes shipped off-site...

      • by arth1 ( 260657 ) on Friday August 10, 2012 @06:17AM (#40944049) Homepage Journal

        Yes, Bacula is the only real solution out there that isn't going to cost you an arm and a leg, and that allows you to switch easily between any backup medium.

        Except for good old tar, which is present on all systems.

        Most people are probably not aware that tar has the ability to create split (multi-volume) archives. Add the following options to tar:
        -M -L <max-size-in-KiB-per-volume> -F myscript.sh ... where -M enables multi-volume mode and myscript.sh (the volume-change script, passed with -F) echoes out the name to use for the next tar file in the series. It can be as simple as a loop that checks whether a tar file already exists and returns the next hooked-up volume where it doesn't.
        Or it could even unmount the current volume and automount the next one for you, or display a dialogue telling you to replace the drive.

        One advantage is that you can easily extract from just one of the tar files; you don't need all of them or the first-and-last like with most backup systems. Each tar file is a valid one, and at most you need two tar files to extract any file, and most of them just one.

        One caveat: GNU tar's multi-volume mode cannot be combined with its built-in compression options (-z/-j), so compress the individual files beforehand if space is a concern.
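
        A minimal volume-change script, assuming GNU tar (which runs the -F script between volumes and exports TAR_VOLUME and TAR_FD to it; the path is hypothetical):

        #!/bin/bash
        # myscript.sh: hand tar the name of the next volume in the series.
        # TAR_VOLUME is the number of the volume about to be written;
        # TAR_FD is the file descriptor tar reads the new name from.
        echo "/mnt/usb/backup-vol$TAR_VOLUME.tar" >&$TAR_FD

        Invocation would then be something like: tar -c -M -L 1953125000 -F ./myscript.sh -f /mnt/usb/backup-vol1.tar /data (about 2TB per volume, since -L counts units of 1024 bytes).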

    • by hoover ( 3292 )

      Another thumbs up for Bacula if you need more than a single backup of your data (as opposed to copying it onto drives just once).

  • by Anonymous Coward on Friday August 10, 2012 @04:34AM (#40943549)

    I'm guessing you don't have enough space to split a backup on the original storage medium and then mirror the splits onto each drive?

    Given the size requirements, it seems that might be prohibitive, but it would make things easier for you:

    How to Create a Multi Part Tar File with Linux [simplehelp.net]
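
    For what it's worth, the linked approach boils down to piping tar through split, which is exactly why scratch space matters: split writes its pieces to the local filesystem before you can move them off. A sketch (sizes and names hypothetical):

    # write the archive in 2TB pieces, moving each piece to a drive as it fills
    tar -czf - /data | split -b 2000G - backup.tar.gz.part-
    # restore by concatenating the pieces in order
    cat backup.tar.gz.part-* | tar -xzf -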

  • Assuming you're not worried about backup speed, you could use a four-bay external hard-drive enclosure in combination with RSYNC and LVM on any linux variety. I don't know if they all do, but the MediaSonic HF2-SU3S2 supports 3TB hard drives per bay, which means that two of them could be used in conjunction to provide 24TB of backup storage. Since you can make a large volume out of the full 24TB using LVM, you could even use something like dd to write to the disk (RSYNC with the archive option would be a be
  • RAID (Score:5, Informative)

    by Anonymous Coward on Friday August 10, 2012 @04:34AM (#40943553)

    For that much data you want a RAID, since drives tend to fail if left sitting on the shelf, and they also tend to fail (for different reasons) if they are kept spinning.

    Basically: buy a RAID enclosure, insert drives so it looks like one giant drive, then copy files.

    For 24TB you can use eight 4TB drives for a 6+2 RAID-6 setup. Then if any two of the drives fail you can still recover the data.
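
    With Linux software RAID instead of a dedicated enclosure, the equivalent 6+2 layout would be roughly (device names hypothetical):

    # create an 8-drive RAID-6 array: 6 data + 2 parity, any 2 drives may fail
    mdadm --create /dev/md0 --level=6 --raid-devices=8 /dev/sd[b-i]
    mkfs.ext4 /dev/md0
    mount /dev/md0 /mnt/backup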

    • If not a RAID (those tend to fail just as hard), keep at least two, possibly three copies of each file, on separate drives. The last thing you want is to wait for a RAID to rebuild and watch it fail during recovery, with your only copy of a file on it.
    • by Kjella ( 173770 )

      For that much data you want a RAID since drives tend to fail if left sitting on the shelf, and they also tend (for different reasons) if they are spinning. Basically: buy a RAID enclosure, insert drives so it looks like one giant drive, then copy files. For 24TB you can use eight 4TB drives for a 6+2 RAID-6 setup. Then if any two of the drives fail you can still recover the data.

      Yeah... though I suspect with the price premium for 4TB drives - they're huge - and the cost of an 8-port RAID6 capable RAID card you're considerably above the budget he was going for. If this is like "projects" or something I'd probably suggest the human archiving method - split your live disk into three areas, "work in progress" and "to archive" and "archive". Your WIP you back up completely every time, your "to archive" you add to the latest archive disk (plain, no RAID), and make an index of it so you c

    • More to the point: do what the parent post said, but use something like FreeBSD or Solaris and ZFS with a raidz3 setup (like RAID-6, but with a third parity disk), which gives you block-level dedup, snapshotting, compression, encryption, etc.
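
      A hypothetical FreeBSD sketch (disk names assumed):

      # triple-parity pool: any three disks can fail before data is at risk
      zpool create backup raidz3 da0 da1 da2 da3 da4 da5 da6 da7
      zfs set dedup=on backup # block-level deduplication
      zfs set compression=on backup # transparent compression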

    • Re:RAID (Score:4, Informative)

      by Sarten-X ( 1102295 ) on Friday August 10, 2012 @07:58AM (#40944647) Homepage

      As mentioned already, RAID is not a backup solution. While it will likely work fine for a while, the risk [datamation.com] of a catastrophic failure rises as drive capacity increases. From the linked article:

      With a twelve-terabyte array the chances of complete data loss during a resilver operation begin to approach one hundred percent - meaning that RAID 5 has no functionality whatsoever in that case. There is always a chance of survival, but it is very low.

      Granted, this is talking about RAID 5, so let's naively assume that doubling the parity disks for RAID 6 will halve the risk... but then since we're trying to duplicate 24 terabytes instead of twelve, we can also assume the risk doubles again, and we're back to being practically guaranteed a failure.

      Bottom line is that 24 terabytes is still a huge amount of data. There is no reliable solution I can think of for backing it all up that will be cheap. At that point, you're looking at file-level redundancy managed by a backup manager like Backup Exec (or whatever you prefer) with the data split across a dozen drives. As also mentioned already, the problem becomes much easier if you're able to reduce that volume of data somewhat.

      • You should check out ZFS. These issues go away. And with RAID-Z3, up to 3 drives can die before you have a problem.

      • Re:RAID (Score:4, Informative)

        by louic ( 1841824 ) on Friday August 10, 2012 @09:54AM (#40946019)

        As mentioned already, RAID is not a backup solution.

        Nevertheless, there is nothing wrong with using disks that happen to be in a RAID configuration as backup disks. In fact, it is probably a pretty good idea for large files and large amounts of data.

        • Quite the contrary, and that's my point. The errors here aren't just "let's try again" failures. They're unrecoverable, final, data-is-gone-forever errors, and the chances of encountering one are very high with so much data. Resilvering such a large array is practically impossible (as described in the article I linked to). Without resilvering and having blocks spread among disks, losing one disk means you've lost a little bit of everything, so all your data is corrupt, rather than just the fraction that was

  • Julian? (Score:5, Funny)

    by WinstonWolfIT ( 1550079 ) on Friday August 10, 2012 @04:35AM (#40943569)

    Out on bail mate?

  • git-annex (Score:4, Informative)

    by Anonymous Coward on Friday August 10, 2012 @04:40AM (#40943585)

    You might want to look into git-annex:
    http://git-annex.branchable.com/ [branchable.com]

    I've not tried it, but it sounds like an ideal solution for your request, especially if your data is already compressed.

  • Tape? (Score:5, Insightful)

    by mwvdlee ( 775178 ) on Friday August 10, 2012 @04:48AM (#40943625) Homepage

    Why not tape, a backup RAID, a SAN, or some other dedicated backup hardware solution?
    24TB is well within the range where a professional solution is warranted.
    Given a hard disk size of ~1TB, making a single backup onto 24 disks isn't a backup; it's throwing data in a garbage can.
    More than likely at least one of those disks will die before its time.

    • Re:Tape? (Score:5, Insightful)

      by Lumpy ( 12016 ) on Friday August 10, 2012 @05:21AM (#40943783) Homepage

      Yup, spool to tape. Get an SDLT600 tape cabinet and call it done. If you get a 52-tape robot cabinet you will have space to hold not only a complete backup but also a second full backup plus incrementals, all running automatically. Plus it has the highest reliability.

      And to anyone whining about the cost: if your 24TB of data is not worth that much, then why are you bothering to back it up?

    • Re:Tape? (Score:5, Informative)

      by Anonymous Coward on Friday August 10, 2012 @05:57AM (#40943955)

      No kidding. For $2400, you get 24 1TB HDDs and a bookkeeping nightmare if you ever actually resort to the "backup." For $3k, you get a network-ready tape autoloader with 50-100TB capacity and easy access through any number of highly refined backup and recovery systems.

      Now, if the USB requirement is because that's the only way to access the files you want to steal from an employer or government agency, then the time required to transfer across the USB will almost guarantee you get caught. Even over the weekend. You should come up with a different method for extracting the data.

  • tar --multi-volume (Score:5, Interesting)

    by jegerjensen ( 1273616 ) on Friday August 10, 2012 @04:51AM (#40943653)
    Evidently, our UNIX founding fathers had similar challenges...
  • by cyocum ( 793488 ) on Friday August 10, 2012 @04:56AM (#40943669) Homepage
    Have a look at tar and its "multi-volume" [gnu.org] option.
    • by leuk_he ( 194174 ) on Friday August 10, 2012 @05:09AM (#40943739) Homepage Journal

      Multi-volume tar [gnu.org]: just mount a new USB disk whenever the current one is full.

      However, to get a reasonable retrieval rate (going through 24TB of data will take days over USB2), you'd better split the dataset into multiple smaller sets. That also has the advantage that if one disk crashes (and consumer-grade USB disks will crash!), you don't lose the entire dataset.

      For the same reason (disk failure), do not use a Linux disk-spanning feature: a file system is lost when any one of the disks it spans is lost, unless you use something that can handle lost disks (RAID/RAID-Z).

      And last but not least: test your backup. I have seen cheap USB interfaces fail to write data to disk without any error message. Everything looks fine until you retrieve the data and find some files corrupted.
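
      A simple way to do that last step, assuming GNU coreutils on both ends (paths hypothetical):

      # record checksums on the source, with paths relative to the data root
      cd /data && find . -type f -exec md5sum {} + > /tmp/manifest.md5
      # after the copy, verify every file on the mounted backup disk
      cd /mnt/usbdisk && md5sum -c --quiet /tmp/manifest.md5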

  • by Anonymous Coward

    Here's a Linuxquestions thread [linuxquestions.org] outlining multi-disk backup strategies.

    The gist of the discussion is to use DAR [linux.free.fr].

  • No. (Score:2, Insightful)

    by AdmV0rl0n ( 98366 )

    I'm not sure if you posed the question out of naivety, or if it's just daft. You don't want to be moving 24TB over the USB bus. End of discussion, really, at least in terms of USB.

    However you ended up looking at USB for this, it was the wrong way to go.

    You have lots of choice in terms of boxes, servers, NAS boxes, locally attached storage. 24TB is in the range of midrange NAS boxes.

    Once you have this, you can start to make choices on the many backup, replication, and duplication bits of software

    • Re: (Score:2, Informative)

      by ledow ( 319597 )

      USB 2.0 provides 480Mbps of (theoretical) bandwidth. So unless you go Gigabit all over your network (not unreasonable), you won't beat it with a NAS. Even then, it's only 1-and-a-bit times as fast as USB working flat-out (and the difference being if you have multiple USB busses, you can get multiple drives working at once). And USB 3.0 would beat it again. And 10Gb between the client and a server is an expensive network to deploy still.

      Granted, eSATA would probably be faster but there's nothing wrong wi

        You don't need a Gigabit connection everywhere, just between your computer and the NAS directly connected to it.

        USB2 is not a very good option. For some reason, I've been getting poor performance from Linux with storage mounted via USB. Your best bet is eSATA. If you can't install eSATA but have a Gigabit Ethernet connection, then go that route. USB2 is the connection of last resort when talking about backing up 24TB.

      • USB 2.0 provides 480Mbps of (theoretical) bandwidth. So unless you go Gigabit all over your network (not unreasonable), you won't beat it with a NAS. Even then, it's only 1-and-a-bit times as fast as USB working flat-out (and the difference being if you have multiple USB busses, you can get multiple drives working at once).

        The 480Mbps is nowhere near what you will see in practice, unlike network speeds, which come far closer to the rated maximum. Most USB drives I've seen top out somewhere between 25 and 30MByte/sec, and if there are no other bottlenecks it isn't unusual to see 100MByte/sec from a Gbit switched network. My main desktop pulls things from the fileserver at around 80MByte/sec, which is as fast as local reads tend to be on that array. So you are right about 100Mbit networks: that'll be the bottleneck, not USB, bu

  • You know... (Score:5, Funny)

    by marsu_k ( 701360 ) on Friday August 10, 2012 @04:59AM (#40943683)
    Porn is a renewable resource, there's no need to store so much of it.
  • by Qbertino ( 265505 ) <moiraNO@SPAMmodparlor.com> on Friday August 10, 2012 @05:05AM (#40943719)

    What you're attempting isn't easy; it's actually difficult.
    Buy a cheap and big refurbished workstation or rackmount server, install a few extra SATA controllers and maybe a new power supply, hook up 12 2TB drives, install Debian, check out LVM, and you're all set.

    Messing around with 12-24 external HDDs and their power supplies is a big hassle and asking for trouble. Don't do it. Do seriously consider building your own NAS. You'll be thankful in the end, and it won't take much longer; it might even be faster and cheaper if you can get the parts quickly.

    My 2 cents.

    • by DRJlaw ( 946416 ) on Friday August 10, 2012 @07:01AM (#40944241)

      What you're attempting isn't easy; it's actually difficult.
      Buy a cheap and big refurbished workstation or rackmount server, install a few extra SATA controllers and maybe a new power supply, hook up 12 2TB drives, install Debian, check out LVM, and you're all set.

      Messing around with 12-24 external HDDs and their power supplies is a big hassle and asking for trouble. Don't do it. Do seriously consider building your own NAS. You'll be thankful in the end, and it won't take much longer; it might even be faster and cheaper if you can get the parts quickly.

      Way to redefine the problem instead of working within the specifications.

      Perhaps:
      1. The poster ALREADY has a NAS and wants to have airgapped or even offsite/offline backup.

      2. External HDDs are fast, common, reasonably cheap, and do not have a single point of failure (e.g., the tape backup drive in many suggested alternatives)

      I'm interested in this question. I use this general setup, but on a smaller scale. I cannot put a NAS in a safety deposit box. I cannot ensure that my "backup" NAS would not be drowned in a flood, burned in a fire, fried by a lightning strike...

      Let's pretend the poster is not an idiot, and answer the actual question. If he has 24TB of data, IT'S ALREADY ON DAS/NAS. Geesh.

      • Let's pretend the poster is not an idiot, and answer the actual question. If he has 24TB of data, IT'S ALREADY ON DAS/NAS. Geesh.

        Don't assume he was the one that created his current storage solution. It could be a turnkey solution that he purchased, like one of those movie storage devices we read about on slashdot earlier this year.

        If he installed his current storage configuration himself then why did he need to ask this question on Slashdot? I don't see any particular bad answers, and no one is insulting

  • Bash.... (Score:5, Informative)

    by djsmiley ( 752149 ) <djsmiley2k@gmail.com> on Friday August 10, 2012 @05:08AM (#40943733) Homepage Journal

    First, a bash script to grab the free space of the "current" storage;

    compress files up until that size;

    move the compressed files onto the storage;

    request new storage, start again.

    ----------

    Or, if you've got all the storage already connected: for i in $(seq 0 $x); do cp "$archive$i" "/mount/$i/"; done :D
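
    A sketch of the compress-until-full loop described above (paths hypothetical; note that du measures uncompressed size, so it overestimates what each disk will actually hold):

    #!/bin/bash
    # Fill the current backup disk, then prompt for the next one.
    SRC=/data
    MNT=/mnt/backup
    for dir in "$SRC"/*/; do
        name=$(basename "$dir")
        need=$(du -sk "$dir" | cut -f1) # size of the next job, in KB
        free=$(df -Pk "$MNT" | awk 'NR==2 {print $4}') # free KB on the current disk
        if [ "$need" -gt "$free" ]; then
            read -p "Disk full. Mount the next drive at $MNT and press Enter... "
        fi
        tar -czf "$MNT/$name.tar.gz" -C "$SRC" "$name"
    done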

  • ... by employing a detector with a size of 2463 x 2527 pixels (6M) at 12 Hz (12 times / sec). When run continuously for a set of data (roughly 900 degrees) ...

    we collect 900 frames in roughly 2 minutes including hardware limitations for starting/stopping.

    In proper format for processing, this works out to about 6MB/image and roughly 3GB/min for 2 minutes.

    With an experienced crew of 3-4 people ... one handling the samples, one handling the liquid nitrogen, one running the software and one taking notes (ove

    • by ledow ( 319597 )

      I'm not the OP but:

      Because downloading 3.6TB to restore from a backup for just one day is pretty ridiculous for someone on home broadband?

      Backup to external servers is ridiculous for anyone without university-sized access to the net. Hell, the school I work for tries to back up 10GB to a remote server each night, and it often fails because it takes too long (and we're only allowed to do that because we're a school; the limits for even business use on the same connection are about 100GB a month).

      Absent a stu

      • Two quick things:

        1. Why do a complete restore of the 3.6TB? Just take the files you want to use again or have lost.

        2. Why work at home? It's home, not work.

  • USB is for a second working copy.
    A backup should also ensure the durability of the copy, and USB HDDs have a shorter lifespan than normal HDDs, which in turn have a shorter lifespan than tape, the usual medium for durable backups.

  • Use DAR or KDAR (Score:3, Informative)

    by pegasustonans ( 589396 ) on Friday August 10, 2012 @05:18AM (#40943771)

    If you don't want to invest in new hardware, you could use DAR [ubuntugeek.com] or KDAR [sourceforge.net] (KDE front-end for DAR).

    With KDAR, what you want is the slicing settings [sourceforge.net].

    There's an option to pause between slices, which gives you time to mount a new disk.
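
    The command-line equivalent with plain dar looks something like this (slice size and paths hypothetical):

    # back up /data in 1900GB slices, pausing for a disk swap between slices
    dar -c /mnt/usb/backup -R /data -s 1900G -p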

  • Backup tapes were designed precisely for the problem you have. LTO-5 tapes are about 1.5TB, if I remember right. Stored correctly they shouldn't give any problems when you come to retrieve whatever is backed up. Most archiving efforts use backup tape, and they can't all be wrong :)

    • Actually handling all those tapes and recovering data from them is very expensive in manpower and time, and can be very awkward for recovering data. Those tapes, and tape drives, are also _expensive_. They're useful for sites that require secure off-site storage, or encrypted off-site storage, but for most environments today they are pointless. Easily detachable physical storage has become very inexpensive, far more economical, and is far less vulnerable to the vulnerabilities of mishandling SCSI connection

      • Actually, for a data set this large it will probably work out only very slightly more expensive - and the benefit to be gained is worth it IMHO (in speed if nothing else - USB disks are *slow* and eat a lot of CPU). I live in the UK so I'll work in GBP. I think US prices are likely to be cheaper but the relative sizes will be similar.

        I'd figure around ~£1100 for drive and SAS interface plus £500-700 for 24TB worth of media. Throw in an extra 2TB drive to spool to before you write to tape as well

          • The slow disk is why you use rsync or other such efficient mirroring technologies. The tapes have a limited lifespan, require significant maintenance, and in my experience have been prone to far too many mechanical failures and expensive downtime. The disks can actually be simultaneously connected for casual "read" access with a reasonable USB hub and possibly an additional USB card.

          You've also left out the cost of recovery time for users. Swapping tapes to get recovery of arbitrary files is rather awkw

          • Whether tape or disk is appropriate really depends what you are intending to use the backup for and how important your data is. You might even choose to use a mixture of the two.

            If it's your only backup, I would suggest that it's not wise to leave it permanently online in the way you suggest; that leaves you open to any number of potential issues which your backup is supposed to protect you from (OS bug, misconfiguration, lightning strike, power failure, overheating, ...). Tape libraries have the same issue

  • PAR (Score:4, Informative)

    by fa2k ( 881632 ) <pmbjornstad.gmail@com> on Friday August 10, 2012 @05:59AM (#40943967)

    I have just seen "PAR" a couple of times here on slashdot, haven't used it, but it seems great for this: http://en.wikipedia.org/wiki/Parchive [wikipedia.org] . You need enough redundancy to allow one USB drive to fail. And I would rather get a SATA bay and use "internal" drives than having to deal with external USB drives. Get "green" drives, they are slow but cheap.

  • by Wolfling1 ( 1808594 ) on Friday August 10, 2012 @06:04AM (#40943995) Journal
    A 24TB NAS is not very hard to assemble. Relatively cheap, and basically transfers data at Gb speed - assuming that you populate it with fast disks. Set one up with RAID and you're away. Personally, I would do it with a low end server and a big-ass RAID array. That way, you can really control its behaviour via the OS. Linux is ferpect for this kind of thing.
  • I know (Score:2, Funny)

    by Anonymous Coward

    The iCloud! ;-)

  • Get an old computer... anything will work, really. You must know someone who has one lying in their basement. Plug your drives into that, share them on your network, and use any general backup software to sequentially back up what you need over the network. Now it will do it overnight and you really don't care how long it takes. It can even do it every night. If you want it safe from fire and such... build a box out of 2x4s and drywall scraps from Home Depot. Make it 5 sheets thick and it'll withstand any housefire you could possibly have.
    • by DarkOx ( 621550 )

      build a box out of 2x4s and drywall scraps from Home Depot. Make it 5 sheets thick and it'll withstand any housefire you could possibly have

      I find that statement suspect. I am not saying you are wrong but extraordinary claims require extraordinary evidence.

      I have seen some pretty nasty house fires, the kind where the fire department sprays water on the neighbors' houses to keep them from catching rather than try to do anything about the one that is actually burning. With all the modern synthetic materials in furniture, carpeting, and other flooring, a house fire can hit 600 degrees and stay that way for hours.

      If five sheets of dry wall ( .5" or

  • by zapyon ( 575974 ) on Friday August 10, 2012 @06:28AM (#40944083)

    "Only wimps use tape[*] backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)"
    Linus Torvalds (1996) http://en.wikiquote.org/wiki/Linus_Torvalds [wikiquote.org]

    (Isn't that prescience of "The Cloud"?)

    ––––––––––
    * replace this with your favorite backup media of today ;-)

    • by rvw ( 755107 )

      "Only wimps use tape[*] backup: real men just upload their important stuff on ftp, and let the rest of the world mirror it ;)"

      Linus Torvalds (1996) http://en.wikiquote.org/wiki/Linus_Torvalds [wikiquote.org]

      (Isn't that prescience of "The Cloud"?)

      ––––––––––

      * replace this with your favorite backup media of today ;-)

      "Only wimps use ftp[*] backup: real men just upload their important stuff to the iCloud, and let the rest of the world mirror it ;)"

      An Amazon support employee (2012)

    • Linus should have patented his idea! He could have become a rich man.

  • It's a little late to be asking that now.

  • by freaker_TuC ( 7632 ) on Friday August 10, 2012 @06:42AM (#40944153) Homepage Journal

    Count Bacula as your friend ;) -> http://www.bacula.org/ [bacula.org]

  • Sometimes the easiest way to duplicate (back up) data is to simply duplicate the hardware it's already on. If it's on a 16-disk (x 2TB) NAS system, build another one. If it's on tape, buy more tapes. If it's on random HDDs scattered all over the place, then you have bigger problems to deal with first (like building a NAS box)!
  • Backup advice (Score:2, Insightful)

    by Anonymous Coward

    I do things like this all the time with a data set about half that size, ~12TB. You didn't say anything about what the data is, but from the request and the fact you mentioned USB I would guess it's your typical warez hoarding (mp3/flac, mkv, apps) plus a personal collection of family pictures and videos.

    Here is a checklist similar to the one I follow. I find the most reliable way to keep your data over the years is by following a checklist or procedure and choosing when to move to the next storage pl

  • # rsync -avz /this /that

    Split your directories to match the sizes of your drives. If on Linux, run smartctl -H /dev/sdX to check your disks' health and, if possible, take the HDDs out of their USB enclosures and connect them directly to SATA for faster transfer speeds. Nine times out of ten these will mount just like a normal drive, since they are usually just a normal drive housed in an enclosure.

    Good luck :)
  • Plug all the disks into a USB hub. Ensure that each one has a unique volume name, e.g. bak1, bak2... The old skool way is to make a little tar script and use volume spanning. Otherwise, configure all the disks as a single JBOD and run DejaDup.
  • "Hi there ! I'm looking for a simple solution to backup a big data set consisting of files between 3MB and 20GB, for a total of 24TB, onto multiple hard drives (usb, firewire, whatever)

    Private Manning, is that you?

  • by kimvette ( 919543 ) on Friday August 10, 2012 @10:22AM (#40946447) Homepage Journal

    You buy one of these:

    http://www.newegg.com/Product/Product.aspx?Item=N82E16816322007 [newegg.com]

    populate it with 4TB drives and create two RAID5 arrays (or one RAID6), and you've got 24 or 28TB of backup space, without having to change drives or break up your backup into smaller chunks.

    But really, your backup methodology is broken; you need to organize the data into manageable chunks because aside from a large dedicated backup server/SAN, there is no reliable (don't tell me tape is reliable) backup solution for a such a large quantity of data in a single chunk.

    What I do for backups: in my 24-bay server I have eight large drives in a (HARDWARE) RAID5 array (were 4TB drives available at the time I'd have gone RAID6) and rsync the virtualized server contents to that, then archive them into tarballs, and send copies of them across the LAN to another server that is running (HARDWARE) RAID5 as well. Every once in a while I back up the critical data (source, scripts, financial data, production web sites, /etc, and so forth but not the program binaries nor system binaries which are easily recreated or reinstalled, respectively) to optical media and external hard drives.

    So what I have in summary is:
    * Massive server with a backup array separate from the production array
    * Separate backup server running another array (again, using a quality HARDWARE RAID controller. Safeguard your data and don't bother with Intel, Adaptec, Promise, or Highpoint "hybrid" RAID)
    * Periodic backups of non-recreatable data to USB drives and optical media that are moved off site.
