Amazon Wants To Replace Tape With Slow But Cheap Off-Site "Glacier" Storage 187

Nerval's Lobster writes with a piece at SlashCloud that says "Amazon is expanding its reach into the low-cost, high-durability archival storage market with the newly announced Glacier. While Glacier allows companies to transfer their data-archiving duties to the cloud — a potentially money-saving boon for many a budget-squeezed organization—the service comes with some caveats. Its cost structure and slow speed of data retrieval make it best suited for data that needs to be accessed infrequently, such as years-old legal records and research data. If that sounds quite a bit like Amazon Simple Storage Service, otherwise known as Amazon S3, you'd be correct. Both Amazon S3 and Glacier have been designed to store and retrieve data from anywhere with a Web connection. However, Amazon S3 — 'designed to make Web-scale computing easier for developers,' according to the company — is meant for rapid data retrieval; contrast that with a Glacier data-retrieval request (referred to as a 'job'), where it can take between 3 and 5 hours before it's ready for downloading."

  • by retep ( 108840 ) on Tuesday August 21, 2012 @11:24AM (#41068521)

    > Whenever I need to restore data from an archive backup, I need it RIGHT FUCKING NOW.

    I don't. It'll be at least a few hours until FedEx arrives with the new server hardware in the best case, and a few weeks before we get a new building and our clothes stop smelling of smoke (and zombies) in the worst case.

    Interesting question though: if I submit a retrieval job, how soon do I have to actually download the associated data? Can I wait a few hours or days?

  • I think this opens the possibility for a middle-man company to provide long term archival tools for end users. This firm would spend its energy focused on front end tools for the end user and make use of Amazon's back end long term storage for the actual infrastructure.

    There are many amateur and even professional photographers, for example, with almost no alternatives for very long term storage. Home writable media is nearly all flawed in terms of true long term storage. I'm sure there are many use cases in this space.

    In terms of mid-size and larger companies, I think a critical feature will need to be a simple interface that encrypts at the client side prior to sending the data, using a private key only available on the client side. I cannot imagine that a responsible I.T. professional would store company-critical or customer data on a third-party site like that without such protections in place.

  • by hawguy ( 1600213 ) on Tuesday August 21, 2012 @12:21PM (#41069217)

    my company pays for offsite storage of our tapes and i did some quick math

    $2000 a month to store over 1000 tapes for us. I think the minimum bill is like $1500 if you only have a few tapes

    $.01/GB is $10 to $20 per LTO-4 tape per month. i know the specs are less but ive seen LTO-4 tapes hold close to 4GB of data.
    i send out one tape per month for storage and keep a bunch more locally. so even on the cheap end that's $240 per month for the first year.

    Compress your data before you send it to Amazon and you'll have a more fair comparison. An LTO-4 tape holds 800GB native, so your thousand tapes is 800TB of data, which would cost you $8000/month on Amazon Glacier.

    If you store multiple copies of your data (to protect against tape failure) and could get by with only 200TB of Glacier space, then it might be cost effective. Lower labor costs from no longer loading tapes and shipping them offsite, plus dropping maintenance on your tape library (or libraries), may also sway the decision.

    The numbers change for LTO-5 (1.5TB native), but then you're looking at a large capital cost to swap out your tapes and upgrade your tape drives.

    I'm in a little different situation - I have my data replicated to a colocated storage array with less than 100TB of data. Amazon Glacier storage would cost about the same as I pay in maintenance on the array (ignoring colocation fees). Glacier is not a drop-in replacement for the array, since the storage array also runs my DR VMware cluster, but it may be more cost effective to get rid of the colocated array cabinet and VMware cluster hardware and rent some VMs with a small amount of storage for the critical servers I need for disaster recovery, using Glacier to store the rest of my data.
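
The tape-versus-Glacier arithmetic in the comment above can be checked with a short Python sketch. It uses only figures quoted in the thread ($0.01 per GB per month, 800GB native per LTO-4 tape, 1000 tapes, a $2000/month offsite bill). The function names are made up for illustration, decimal units (1TB = 1000GB) are assumed, and retrieval fees, transfer fees, and labor are ignored.

```python
# Figures quoted in the thread; everything else here is an illustrative assumption.
GLACIER_USD_PER_GB_MONTH = 0.01   # Glacier storage price cited by the commenters
LTO4_NATIVE_GB = 800              # native (uncompressed) LTO-4 capacity
OFFSITE_TAPE_BILL_USD = 2000      # monthly offsite tape storage bill from the quoted post


def tapes_to_gb(tape_count, gb_per_tape=LTO4_NATIVE_GB):
    """Total capacity of a set of tapes, assuming every tape is full."""
    return tape_count * gb_per_tape


def glacier_monthly_cost(stored_gb, usd_per_gb_month=GLACIER_USD_PER_GB_MONTH):
    """Monthly storage charge only; retrieval and transfer fees are not modeled."""
    return stored_gb * usd_per_gb_month


all_tapes = glacier_monthly_cost(tapes_to_gb(1000))   # 1000 full tapes
single_copy = glacier_monthly_cost(200 * 1000)        # 200TB, taking 1TB = 1000GB

print(f"1000 full LTO-4 tapes on Glacier: ${all_tapes:,.2f}/month")
print(f"200TB on Glacier:                 ${single_copy:,.2f}/month")
print(f"Offsite tape bill:                ${OFFSITE_TAPE_BILL_USD:,.2f}/month")
```

Under these assumptions, 200TB is roughly where Glacier storage fees match the quoted tape bill, which is why the comparison depends so heavily on how many of the thousand tapes hold duplicate copies.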

  • by mdfst13 ( 664665 ) on Tuesday August 21, 2012 @12:30PM (#41069341)

    This could be used either way. If you are using it as an archival medium, it is less of a hassle than finding three facilities of your own (the promise is that there are at least three copies of the data at all times). To get the equivalent from tape, you'd have to buy three tapes. Plus, you need places to store them.

    If you are using it as the offsite part of your backup procedure, then it only needs to match the latency of other offsite backups. If you are restoring from a tape that you have stored in a safe deposit box, that also takes three to five hours to restore (it takes time to get to the bank and retrieve the tape, then it takes more time to read from the tape). And truly, that time will rarely matter. If you really lost

    1. Your primary data store.
    2. Your backup data store.
    3. Your local archive copy.

    all at the same time, you likely lost your physical hardware as well. Or you are experiencing a security problem that you need to fix before restoring from backup. You could promote your archived data from Glacier to S3 while you were replacing that hardware or fixing your security.

    It also may be worth thinking about how this works if you are doing everything in AWS. In that case, Multi-AZ RDS provides your primary and backup data stores. It also provides the ability to rebuild your data store from real-time backups. Next, you use snapshots to take regular backups (the equivalent of a local archive copy). Weekly makes sense as RDS can store up to eight days of real-time backups. You keep a few of the most recent snapshots, but you archive most that are older than a month to Glacier. You can still keep the one month, three month, and six month snapshots in the quicker, more expensive storage.

    Now, you face a major data problem. Amazon loses two facilities. These happen to be the two facilities with your RDS stores. However, you still have the snapshots (which are stored in more than two facilities). You restore quickly. You only need to go to Glacier if you have data corruption that you don't notice for a month (so that the archive copy that you need has dropped out of the snapshots).

    If you are not using AWS for everything, then you are responsible for creating your own primary and backup data stores as well as local archive copies. Other than that, the same issues apply.
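
The snapshot rotation described in the comment above (keep recent snapshots, keep the one-, three-, and six-month ones, archive the rest) might be sketched in Python as follows. This is only a planning function over dates and calls no AWS API. The name `plan_snapshots`, the 30-day window, and the 30/90/180-day milestone ages are assumptions made for illustration, not values stated by the commenter.

```python
from datetime import date, timedelta

RECENT_WINDOW_DAYS = 30               # "older than a month", approximated as 30 days
MILESTONE_AGES_DAYS = (30, 90, 180)   # one-, three-, six-month marks, approximated


def plan_snapshots(snapshot_dates, today):
    """Return (keep, archive): dates left in fast storage vs. moved to Glacier."""
    snapshots = sorted(set(snapshot_dates))
    if not snapshots:
        return [], []

    # Keep every snapshot taken within the recent window.
    keep = {d for d in snapshots if (today - d).days <= RECENT_WINDOW_DAYS}

    # Also keep the snapshot whose age is closest to each milestone.
    for milestone in MILESTONE_AGES_DAYS:
        keep.add(min(snapshots, key=lambda d: abs((today - d).days - milestone)))

    # Everything else is a candidate for the slower, cheaper archive tier.
    archive = [d for d in snapshots if d not in keep]
    return sorted(keep), archive


# Example: one year of weekly snapshots, evaluated on the day of the thread.
today = date(2012, 8, 21)
weekly = [today - timedelta(weeks=w) for w in range(52)]
keep, archive = plan_snapshots(weekly, today)
print(len(keep), "kept in fast storage;", len(archive), "archived")
```

A real policy would also need to decide when archived snapshots expire, which the comment does not address.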

  • by hawguy ( 1600213 ) on Tuesday August 21, 2012 @12:36PM (#41069417)

    Centon DataStick Pro 64gb is about 35$ each. I bet if you buy 50 of them, they are cheaper. Get a good fire safe, and store one on site, one off site.

    You forgot to include labor costs to pay someone to plug them into the backup server, swap them out, ship them offsite, and keep track of them.

    But even if you exclude labor costs:

    50 of those memory sticks cost $1750. If you split them between offsite and onsite and keep 2 copies of the data on each set, that gives you 768GB of storage (50 / 2 / 2 = 12 whole sticks per copy, times 64GB), which would cost about $8/month on Glacier, so you could store that data for more than 15 years for what it costs you to buy the memory sticks.
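
The break-even estimate above can be reproduced with a small Python sketch. The stick price, stick capacity, and Glacier price come from the thread. The helper names are hypothetical, and rounding down to whole sticks per copy is an assumption chosen because it reproduces the 768GB figure. Labor, stick failures, and Glacier retrieval fees are left out.

```python
STICK_PRICE_USD = 35              # per-stick price quoted in the thread
STICK_CAPACITY_GB = 64            # per-stick capacity quoted in the thread
GLACIER_USD_PER_GB_MONTH = 0.01   # Glacier storage price cited by the commenters


def usable_gb(stick_count, sites=2, copies_per_site=2):
    """Unique data that fits when sticks are split across sites and copies."""
    sticks_per_copy = stick_count // (sites * copies_per_site)  # whole sticks only
    return sticks_per_copy * STICK_CAPACITY_GB


def breakeven_years(stick_count):
    """Years of Glacier storage fees that equal the purchase price of the sticks."""
    purchase_usd = stick_count * STICK_PRICE_USD
    glacier_monthly_usd = usable_gb(stick_count) * GLACIER_USD_PER_GB_MONTH
    return purchase_usd / glacier_monthly_usd / 12


print(f"Usable capacity: {usable_gb(50)}GB")
print(f"Break-even: {breakeven_years(50):.1f} years")
```

The result is consistent with the "more than 15 years" claim. The comment rounds the monthly cost up to $8, so its implied figure is slightly lower than the one computed here.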
