Data Storage Hardware

The Ultimate All-In-One Storage Solution (387 comments)

karnifex writes "Filled up your LaCie Bigger Disk already, and looking for a little more storage space? Good news! The Petabox is ready! 'The petabox by the Internet Archive is a machine designed to safely store and process one petabyte of information (a petabyte is a million gigabytes).' And luckily, as the Internet Archive notes, it's shipping-container friendly (20' x 8' x 8'). So save on delivery costs and order two!"
This discussion has been archived. No new comments can be posted.

  • In 10 years ... (Score:5, Insightful)

    by Bob Loblaw ( 545027 ) on Tuesday May 11, 2004 @07:06PM (#9121840)
    Will we find one of these things on eBay in 10 years, selling for $10, and feel all nostalgic about the days when that amount of storage media was the size of a room?
  • Re:Price? (Score:3, Insightful)

    by Anonymous Coward on Tuesday May 11, 2004 @07:07PM (#9121863)
    If you have to ask, you can't afford it. Just remember that. It might come in handy again someday. :)
  • by Berylium ( 588468 ) * on Tuesday May 11, 2004 @07:11PM (#9121927)
    From the site:

    PILOT STATUS 5/2004
    * The first 100TB Rack is up and running!
    * The second 100TB Rack will be up by the end of May
    * Thermal Targets have been met
    * Systems Booted from USB Dongle
    * Reiser FS running
    * PC-based Router running


    Maybe I'm missing something, but it looks to me like they don't really have a petabyte of storage working; they plan to reach a petabyte, with only 100 TB up and running now. Not that 100 TB is anything to brush off.
  • by gremlins ( 588904 ) on Tuesday May 11, 2004 @07:13PM (#9121945)
    I know the pull is to get these things as big as you can, but I would love to see hard drives that work forever. I know everything breaks, but in 400 years how is anyone going to know what we were like if all the data on us slowly goes away because the hard drives and CDs don't last very long?
  • Re:two words (Score:5, Insightful)

    by pbox ( 146337 ) on Tuesday May 11, 2004 @07:32PM (#9122133) Homepage Journal
    Assuming 2 layered disks that is 10 GB per disk (feeling generous).
    100 disk -> 1 TB
    15000 disks -> 150 TB.

    Netflix has a "mere" collection of 15000 disks. Your petabyte disk is only about 1/6th full.

    You upload all music CDs: 1 GB per disk (feeling generous).

    How many CDs can be in print? Maybe 500,000?

    That is only 500 TB. Now your disk is 2/3rd full.

    Let's upload all printed material. May or may not fit in the rest.

    Then again, if you want to archive the internet: ~6G pages at 10 kB each is 60 TB per crawl. Store the last 16 versions -> ~1 PB.
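The back-of-envelope figures in this comment can be checked with a few lines of Python. This is only a sketch: decimal units (1 GB = 10^9 bytes) are an assumption, and the variable names are made up for illustration. The per-disc and per-page sizes are the commenter's own guesses.

```python
# Decimal storage units (an assumption; the comment does not say which it uses).
KB = 10**3
GB = 10**9
TB = 10**12
PB = 10**15

# Commenter's estimates: 15,000 DVDs at 10 GB, 500,000 CDs at 1 GB,
# and one web crawl of ~6 billion pages at 10 kB each.
dvd_bytes = 15_000 * 10 * GB
cd_bytes = 500_000 * 1 * GB
crawl_bytes = 6 * 10**9 * 10 * KB
sixteen_crawls = 16 * crawl_bytes

for label, size in [
    ("DVD collection", dvd_bytes),
    ("CDs in print", cd_bytes),
    ("DVDs + CDs", dvd_bytes + cd_bytes),
    ("one crawl", crawl_bytes),
    ("16 crawls", sixteen_crawls),
]:
    print(f"{label:>15}: {size / TB:6.0f} TB = {size / PB:.2f} PB")
```

Under these assumptions the DVDs and CDs together take 650 TB, about two thirds of a petabyte, and sixteen crawls come to 960 TB, which is close to a full petabyte on their own.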
  • Comment removed (Score:5, Insightful)

    by account_deleted ( 4530225 ) on Tuesday May 11, 2004 @07:43PM (#9122232)
    Comment removed based on user account deletion
  • by exp(pi*sqrt(163)) ( 613870 ) on Tuesday May 11, 2004 @07:50PM (#9122292) Journal
    ...just mount /dev/random as a petabyte drive. Admittedly it might be hard to find your data in there - but chances are it is in there somewhere [westnet.com].
  • Re:Price? (Score:3, Insightful)

    by theLOUDroom ( 556455 ) on Tuesday May 11, 2004 @09:04PM (#9122920)
    So, about $1.3M (10 racks)

    What would be interesting is to know the estimated maintenance costs as well. With that many drives, I imagine you'd be changing them like light bulbs, especially as time passes and the probability of each drive failing gets higher and higher.

    If one was really clever, they could use the failure rate of a typical hard disk and Moore's Law to estimate monthly replacement costs for the next 100 years or so. I would expect them to rise in the short term as the drives age, but fall in the long term as Moore's Law catches up.
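A rough sketch of that estimate in Python, under loudly stated assumptions: every number below is hypothetical (4000 drives, one failure per drive every three years, $100 per replacement today, price halving every 18 months), and the failure rate is held constant, so this shows only the long-term decline, not the short-term rise from aging drives.

```python
def monthly_replacement_cost(
    month: float,
    n_drives: int = 4000,               # hypothetical drive count
    annual_failure_rate: float = 1 / 3, # hypothetical: one failure per drive per 3 years
    price_now: float = 100.0,           # hypothetical cost of one replacement drive today
    halving_months: float = 18.0,       # hypothetical price-halving period
) -> float:
    """Expected replacement spend in the given month (0 = today)."""
    failures_per_month = n_drives * annual_failure_rate / 12
    # Price of an equivalent drive decays exponentially over time.
    price = price_now * 0.5 ** (month / halving_months)
    return failures_per_month * price


for year in (0, 1, 5, 10):
    cost = monthly_replacement_cost(12 * year)
    print(f"year {year:2d}: ${cost:,.2f} per month")
```

Modelling the short-term rise would need an age-dependent failure rate in place of the constant one used here.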
  • Re:Business idea (Score:3, Insightful)

    by timeOday ( 582209 ) on Tuesday May 11, 2004 @10:35PM (#9123549)
    I suspect Google is more interested in building a platform that would be a competitor to this product. For the device in this article they estimate 1 FTE (full-time employee) for each petabyte of storage. That doesn't sound so good. Google's system will apparently replicate and migrate data between units as necessary so you never need to replace drives at all; the maximum capacity just degrades slowly with time. Perhaps when it gets to 80% of original capacity you just roll in a newer unit (which is probably much bigger in capacity as well), hook them together for a day, then throw out the old one.

    The power requirements are also quite hefty. It shouldn't be necessary to run all those drives (and the computers behind them) unless the unit is near capacity and access is random (which I'm sure would rarely be the case). Instead, they should be dynamically powering drives and computers up and down, and migrating data to a reasonably small 'working set' of drives.

    On the hardware front, the device in this article also incorporates 800 "low-end PCs." IOW it's a big cluster that happens to be heavy on storage. If all you want is the storage, surely there is some way to get rid of all those motherboards and CPUs with their fault-prone, power-hungry fans. They need to develop a controller that can directly handle, say, 64 hard drives, analogous to a big network switch.

    Anyways, it sounds like a fun project!

  • Ozymandias (Score:4, Insightful)

    by Boglin ( 517490 ) on Tuesday May 11, 2004 @11:46PM (#9123845) Journal
    Okay, I've heard this too many times and I'm just starting to get sick of it. It's not the computers that are killing your quest for digital immortality; it's just the way that history works.

    You're complaining that these hard drives won't run forever and you're right. Neither will CDs. However, I would also like to point out that the vast majority of ancient Egyptian papyrus isn't around today. Also, don't start going off on using clay or stone tablets, because they break (even the Rosetta stone is broken).

    Honestly, computers are still far superior to what we were using before. It's not like we've got Homer's original version of the Iliad sitting in a museum somewhere; we just have many duplicated copies that have been reproduced over the years. You're right that hard drives fail and CDs break, but we can keep updating onto new media. Besides, when a monk drops an iota when transcribing the Bible, Jesus goes from being God to godlike. When a computer adds an iota, the checkbit fails and the data is resent.
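The "checkbit" idea can be sketched in Python as a single even-parity bit. This is a minimal illustration, not how any particular storage system does it; the function names and the sample message are made up.

```python
def parity_bit(data: bytes) -> int:
    """Return 0 or 1: the XOR of every bit in data (even parity)."""
    return bin(int.from_bytes(data, "big")).count("1") % 2


def is_intact(data: bytes, expected_parity: int) -> bool:
    """True if the data still matches the parity recorded when it was written."""
    return parity_bit(data) == expected_parity


message = b"copy me faithfully"
check = parity_bit(message)

# Simulate a one-bit copying error by flipping the lowest bit of the first byte.
corrupted = bytes([message[0] ^ 0x01]) + message[1:]

print(is_intact(message, check))    # the clean copy passes
print(is_intact(corrupted, check))  # the damaged copy is rejected and would be resent
```

A single parity bit catches any odd number of flipped bits but misses an even number, which is why real systems use stronger checks such as CRCs or cryptographic hashes.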

    Somebody is also going to point out that, as systems change, data can become unreadable. Heck, I had a professor who couldn't update his lab instructions because the software that read the lab printouts wouldn't run on new machines and the file format wasn't understood by any other software. So, want to stop our data from becoming unreadable? Well, let's just do what the Etruscans did! Of course, we don't have a clue what they did because nobody can read Etruscan. For a more familiar example, think of hieroglyphics before the Rosetta stone. It's pretty common for data to become lost and unreadable. This brings us back to the solution. Along with the data, include the source code for the software that can read it. If you really want to be anal, you could even include the source to an emulator for the machine it was designed to run on.

    Still, you might point out, 400 years from now, we'll still lose 99% of that due to failures of whatever nature. Once again, you would be right. However, do you honestly believe that we have 1% of all the data that was collected in 1604? Hell, most of the people couldn't even write, so we don't know ANYTHING about their lives. I'm sorry that we can't digitally preserve our wondrous society for all of eternity, but it's completely blind to believe that this makes us in ANY way different to any other culture. Read Percy Shelley's Ozymandias before complaining about how people in the future won't know what our lives were like.

  • by gumbi west ( 610122 ) on Wednesday May 12, 2004 @12:52AM (#9124087) Journal
    If you expect a hard drive to fail after three years (I'm guessing) and these occurrences are randomly distributed (an assumption that will be true after running this thing for a year or two), you can expect that the 4000 hard drives in this array would have about 3 failures per day. This thing would never be at full speed! It would be constantly restructuring its RAID. Also, it would cost about $300 a day just in hard drives (not to mention controllers, power supplies, et cetera).
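The failure-rate arithmetic here can be reproduced in a couple of lines of Python. This is a sketch using the commenter's own guesses (4000 drives, a three-year mean lifetime, failures spread evenly over time); the $100 replacement price is a hypothetical figure chosen only because it is consistent with the $300 quoted.

```python
def expected_failures_per_day(n_drives: int, mean_lifetime_years: float) -> float:
    """Steady-state failures per day if failures are spread evenly over time."""
    return n_drives / (mean_lifetime_years * 365)


failures = expected_failures_per_day(n_drives=4000, mean_lifetime_years=3)
drive_price = 100.0  # hypothetical cost per replacement drive, in dollars

print(f"{failures:.2f} failures per day")
print(f"${failures * drive_price:.0f} per day in replacement drives")
```

With these inputs the rate works out to a little over 3.6 failures per day, so "about 3" is on the low side and the daily drive cost would be somewhat above $300 at that price.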
  • Re:In 10 years ... (Score:3, Insightful)

    by blancolioni ( 147353 ) on Wednesday May 12, 2004 @03:24AM (#9124531) Homepage
    Every hard drive I've ever bought has been larger than all my previous hard drives combined. And this is without even trying.

    The storage problems I have these days are almost entirely organisational.
  • by vidarh ( 309115 ) <vidar@hokstad.com> on Wednesday May 12, 2004 @07:00AM (#9125137) Homepage Journal
    Laugh all you want, but what they are doing makes sense: Recovering a node in case of a crash or messed-up filesystem is easy - you replace the dongle and hit the reset button. No need to have space-wasting CD drives or floppy drives, and the rest of the OS can be pulled down over the network.

    The last thing you want with a setup like this is having to haul hardware around or disconnect stuff if you for any reason can't boot off the disks anymore. And you certainly don't want to reduce density by filling space that could hold disks with other stuff.

  • by vidarh ( 309115 ) <vidar@hokstad.com> on Wednesday May 12, 2004 @07:09AM (#9125161) Homepage Journal
    Yes, but you aren't seriously suggesting it would be one RAID over all the disks, are you?

    So assuming 3 failures a day, at most 3 RAIDs would be running slower each day. Assuming 4 disks per RAID, that's at most 12 disks at reduced performance, or 0.3% of the total data set that isn't available at full speed. If that is an issue, you duplicate any data that MUST be available on multiple nodes.
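The 0.3% figure follows directly from the numbers in this comment. A minimal Python sketch, assuming (as the comment does) that each failure lands in a different 4-disk array and that the system holds the 4000 drives mentioned earlier in the thread; the function name is illustrative.

```python
def degraded_fraction(failures: int, disks_per_array: int, total_disks: int) -> float:
    """Worst-case share of disks sitting in a degraded array.

    Assumes every failure hits a different array, so no two overlap.
    """
    return failures * disks_per_array / total_disks


fraction = degraded_fraction(failures=3, disks_per_array=4, total_disks=4000)
print(f"{fraction:.1%} of disks are in a degraded array")
```

Larger arrays raise this share proportionally, which is one reason to keep the arrays small.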
