Internet Archive Gets 4.5PB Data Center Upgrade
Lucas123 writes "The Internet Archive, the non-profit organization that scrapes the Web every two months in order to archive web page images, just cut the ribbon on a new 4.5 petabyte data center housed in a metal shipping container that sits outside. The data center supports the Wayback Machine, the Web site that offers the public a view of the 151 billion Web page images collected since 1997. The new data center houses 63 Sun Fire servers, each with 48 1TB hard drives running in parallel to support both the web crawling application and the 200,000 visitors to the site each day."
Never underestimate the bandwidth ... (Score:5, Insightful)
... of a 4.5 petabyte datacenter in a shipping container in transit.
63 x 48 = 3024 TB (Score:3, Insightful)
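A quick check of the 63 x 48 product in the subject line, as a minimal sketch. It assumes decimal units (1 PB = 1000 TB) and takes the server and drive counts straight from the summary; the constant and function names are made up for illustration.

```python
# Figures from the story summary: 63 Sun Fire servers, 48 drives each, 1 TB per drive.
SERVERS = 63
DRIVES_PER_SERVER = 48
TB_PER_DRIVE = 1
ADVERTISED_PB = 4.5  # capacity quoted in the headline


def raw_capacity_tb(servers, drives_per_server, tb_per_drive):
    """Raw capacity in TB, before any RAID or filesystem overhead."""
    return servers * drives_per_server * tb_per_drive


raw_tb = raw_capacity_tb(SERVERS, DRIVES_PER_SERVER, TB_PER_DRIVE)
raw_pb = raw_tb / 1000  # assumption: decimal petabytes
shortfall_pb = ADVERTISED_PB - raw_pb

print(f"raw capacity: {raw_tb} TB ({raw_pb:.3f} PB)")
print(f"gap to the advertised 4.5 PB: {shortfall_pb:.3f} PB")
```

On these numbers the 63 servers account for roughly 3 PB of raw disk, so about a third of the headline 4.5 PB is not explained by the drive count given in the summary.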
Re:Where do they store 4.5TB off site (Score:2, Insightful)
4.5TB isn't that bad. Heck, we have 1TB tapes right now. 5 of them can be carried in a small bag.
It's the 4.5PB that the Internet Archive could use that's hard to store offsite. 4500 1TB tapes can be pretty unruly.
Re:Story is meaningless without LOC measurement (Score:2, Insightful)
From http://www.lesk.com/mlesk/ksg97/ksg.html [lesk.com]: The 20-terabyte size of the Library of Congress is widely quoted and as far as I know is derived by assuming that LC has 20 million books and each requires 1 MB. Of course, LC has much other stuff besides printed text, and this other stuff would take much more space.
1. Thirteen million photographs, even if compressed to a 1 MB JPG each, would be 13 terabytes.
2. The 4 million maps in the Geography Division might scan to 200 TB.
3. LC has over five hundred thousand movies; at 1 GB each they would be 500 terabytes (most are not full-length color features).
4. Bulkiest might be the 3.5 million sound recordings, which at one audio CD each, would be almost 2,000 TB.
This makes the total size of the Library perhaps about 3 petabytes (3,000 terabytes).
So that's about 230 Libraries of Congress by the old standard, or 1.5 by the new standard.
Compress each audio file to a 5 MB MP3. That's 17.5 TB. Total size would be 750 terabytes.
So the data would be 6 LOC.
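For anyone who wants to redo the LOC math, here is a rough sketch using only the figures quoted from the Lesk page plus the 5 MB MP3 assumption above. It assumes decimal units (4.5 PB = 4500 TB); with binary units (4608 TB) the text-only figure comes out near 230 instead of 225. All names in the code are illustrative.

```python
ARCHIVE_TB = 4500  # 4.5 PB, assuming 1 PB = 1000 TB

# Lesk's per-category estimates, in terabytes.
LESK_TB = {
    "books": 20,
    "photographs": 13,
    "maps": 200,
    "movies": 500,
    "sound_recordings_as_cds": 2000,
}

# Alternative from the comment above: 3.5 million recordings at 5 MB each.
MP3_TB = 3.5e6 * 5 / 1e6  # MB -> TB, gives 17.5


def loc_units(archive_tb, library_tb):
    """How many Libraries of Congress fit in the archive."""
    return archive_tb / library_tb


text_only_tb = LESK_TB["books"]
full_tb = sum(LESK_TB.values())  # 2733 TB, which Lesk rounds to about 3 PB
mp3_tb = full_tb - LESK_TB["sound_recordings_as_cds"] + MP3_TB  # 750.5 TB

print(f"text only:       {loc_units(ARCHIVE_TB, text_only_tb):.1f} LOC")
print(f"all media (CDs): {loc_units(ARCHIVE_TB, full_tb):.2f} LOC")
print(f"all media (MP3): {loc_units(ARCHIVE_TB, mp3_tb):.2f} LOC")
```

The sum of the itemized figures is 2733 TB, so the "1.5 by the new standard" number depends on rounding that up to 3 PB; using the unrounded sum gives closer to 1.65.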
Re:Where do they store 4.5TB off site (Score:4, Insightful)
I find it impressive that they have all that hardware for a mere 200k users a day.
Re:Story is meaningless without LOC measurement (Score:3, Insightful)
You compressed the video, and the photographs, but not the audio? And why do you need a full CD for every sound recording? Surely many of them are far shorter than a full CD?
Re:You can ship it over OC-192... (Score:5, Insightful)
Or, you can ship the 40' containers in just under two weeks!
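A back-of-the-envelope comparison of the two options, as a sketch only. Assumptions: the full 4.5 PB is moved, decimal units, "just under two weeks" is taken as 14 days, and the OC-192 link runs at its nominal 9.95328 Gbit/s line rate with no protocol overhead (real throughput would be lower). The function names are hypothetical.

```python
ARCHIVE_BYTES = 4.5e15          # 4.5 PB, decimal
OC192_BITS_PER_SEC = 9.95328e9  # nominal OC-192 line rate
SHIPPING_DAYS = 14              # assumption for "just under two weeks"
SECONDS_PER_DAY = 86_400


def transfer_days(num_bytes, bits_per_sec):
    """Days needed to push num_bytes over a link of the given bit rate."""
    return num_bytes * 8 / bits_per_sec / SECONDS_PER_DAY


def effective_gbps(num_bytes, days):
    """Equivalent sustained bit rate of physically moving num_bytes in `days`."""
    return num_bytes * 8 / (days * SECONDS_PER_DAY) / 1e9


oc192_days = transfer_days(ARCHIVE_BYTES, OC192_BITS_PER_SEC)
container_gbps = effective_gbps(ARCHIVE_BYTES, SHIPPING_DAYS)

print(f"OC-192 transfer:  {oc192_days:.1f} days")
print(f"container in {SHIPPING_DAYS} days: {container_gbps:.1f} Gbit/s effective")
```

Under these assumptions the link needs about six weeks, while the container works out to roughly three OC-192s' worth of sustained bandwidth, albeit with two weeks of latency.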
Re:Where do they store 4.5TB off site (Score:2, Insightful)
I can't take anyone seriously who puts "truth" and a link to Fox news in the same signature.
Re:Where do they store 4.5TB off site (Score:3, Insightful)
>>>I can't take anyone seriously who puts "truth" and a link to Fox news in the same signature.
Neither can I take seriously anyone who believes MSNBC or CNN are unbiased and/or better alternatives, or who is prejudiced (prejudges a report without ever watching it). For example, I may think Rachel Maddow is a joke, but at least I listen to what she has to say before I laugh. And sometimes she says something worth hearing... it's good to keep an open mind and listen to the opposition.