World's Largest Databases Ranked
prostoalex writes "Winter Corp. has summarized its findings of the annual TopTen competition, where the world's largest and most hard-working (in terms of load) databases are ranked. The results are in, and this year the contestants were ranked on size, data volume, number of rows and peak workload. I wrote up a brief summary of the top three winners in each category for those too lazy to browse the interactive WinterCorp chart."
Google (Score:5, Interesting)
SQL Server? (Score:5, Interesting)
I would have liked to see SQL vs non-SQL ranking too.
No IMS? (Score:5, Interesting)
Hmmm (Score:2, Interesting)
OK so this is obviously only vendors of databases and RDBMS systems.
In a broader sense, isn't something like the Wayback Machine [archive.org] a database? What about the truly massive amounts of data gathered at research labs, e.g. CERN [web.cern.ch]? Who's the daddy of these guys?
What surprised me... (Score:5, Interesting)
29 TB is the biggest? (Score:4, Interesting)
I recognize Oracle and DB2, but could someone give a brief synopsis of what the other database systems are? And what is an MPP archetype?
94.3TB!?!?! (Score:5, Interesting)
It takes a truly amazing staff to maintain the servers (backup, administration, maintenance, sitting and staring at screens) and the integrity of the data but, good lord...
A 94.3TB database? My utmost and highest kudos to the DBMAs and admins there. That is one gigantic task to operate. Since it's AT&T, and assuming a great deal of it is billing and maintenance functions, these have to be up, I'm sure, a good three nines if not greater.
Regardless of the result of the study (and without actually reading the entire study, the end results are simply a short read of a geek pissing contest), I find it truly amazing how much work, man-hours, and midnight pager calls go into maintaining these databases. I know I don't want our DBMAs' jobs and certainly wouldn't want to be a DBMA on a 94.3TB farm, but I know those that do and love doing it. It's a speciality skill and apparently these guys do it right...
Kudos...
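To put the "three nines" guess above in perspective, here is a rough Python sketch of the yearly downtime budget for a given number of nines. The function name and the 365-day year are my own assumptions for illustration, not anything from the WinterCorp survey or AT&T.

```python
# Assumption: a non-leap year of 365 days; function name is hypothetical.
HOURS_PER_YEAR = 365 * 24  # 8760 hours


def allowed_downtime_hours(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines (3 means 99.9%)."""
    unavailability = 10.0 ** -nines  # e.g. 3 nines leaves 0.001 of the year
    return HOURS_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {allowed_downtime_hours(n):.2f} hours of downtime per year")
```

Under those assumptions, three nines leaves under nine hours a year for everything that can go wrong.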
Archive.org not on the list? (Score:4, Interesting)
Quote:
"The Internet Archive Wayback Machine contains over 300 terabytes of data and is currently growing at a rate of 12 terabytes per month." Taken from here [archive.org]
But it doesn't say what OS? (Score:2, Interesting)
Re:29 TB is the biggest? (Score:5, Interesting)
Personally, I think it has its drawbacks, but if the indexing is right, you can join hundred-million-row tables at amazing speed. Based on my experience in data warehousing, it's performance Oracle can't touch (no, I'm not paid by NCR... just a user).
http://www.teradata.com
Overview:
http://www.teradata.com/t/go.aspx/?i
Doesn't have to be relational (Score:4, Interesting)
Genomic databases (Score:2, Interesting)
Frightening (Score:3, Interesting)
Re:wintercorp climbing up the ratings now.. (Score:2, Interesting)
Re:29 TB is the biggest? (Score:3, Interesting)
Obviously data collected from places like Arecibo wouldn't lend itself to this kind of survey, even though it must be vastly larger, but what about storage of particle vectors from nuclear event simulations? I'm guessing that they were either not nominated or declined to be listed on security grounds, rather than not rating high enough. Does anyone have any figures?
Re:Archive.org not on the list? (Score:4, Interesting)
Re:29 TB is the biggest? (Score:2, Interesting)
Re:My porn database (Score:2, Interesting)
Databases not ranked (Score:2, Interesting)
MasterCard (Score:4, Interesting)
I agree that there are many companies who would not want to be in that list. There's a small competitive advantage if you keep what technology you use secret.
Re:SQL Server? (Score:3, Interesting)
bah, meaningless (Score:4, Interesting)
Without system descriptions (like in TPC) it merely shows that such a top end is feasible.
What about total cost?
annual cost?
time to build?
software versions?
hardware?
staffing composition?
I mean really, a 500 GB database on a modest single-CPU server is far more challenging than a 2 TB database on a 64-CPU E10k.
Re:My porn database (Score:1, Interesting)
I keep MD5 hashes of all the porn pictures I have (about 25000 images so far), along with the galleries they relate to, so I don't get duplicates when I add new ones... I also have another db table for galleries which keeps track of the number of images included in each gallery and the traits of the chick/chicks in those pictures (young, hot, shaved, redhead, cartoon), although only about 60% of the galleries have been categorized.
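For anyone curious how hash-based duplicate detection like that can work, here is a minimal sketch in Python with SQLite. This is not the poster's setup (that one is MySQL plus PHP); the table name, column names, and function names are all made up for illustration.

```python
import hashlib
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create a hypothetical `images` table keyed by MD5."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "md5 TEXT PRIMARY KEY, "   # the hash is the identity of a file
        "gallery TEXT NOT NULL)"   # where the file came from
    )
    return conn


def add_image(conn: sqlite3.Connection, data: bytes, gallery: str) -> bool:
    """Record a file by its MD5; return False if the same bytes were seen before."""
    digest = hashlib.md5(data).hexdigest()
    try:
        conn.execute(
            "INSERT INTO images (md5, gallery) VALUES (?, ?)", (digest, gallery)
        )
    except sqlite3.IntegrityError:
        # The primary key rejected the row, so this content is a duplicate.
        return False
    conn.commit()
    return True
```

Letting the primary key reject repeats means the check and the insert happen in one step. Note that MD5 only catches byte-identical files; a resized or re-encoded copy gets a different hash.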
It runs on MySQL and is controlled by a couple of legacy PHP scripts (yes, I taught myself PHP so I could create a cool database for my porn).
The pictures themselves are held in separate Blowfish-encrypted files (so my parents won't find 'em) and I also keep smaller (also encrypted) thumbnails of all images.
The funny thing is that I rarely see any of these pictures after I enter them into my database because I'm always looking for new ones (free6.com, thehun.com, ampland.com and spidering various newsgroups)... Oh well... maybe I'll just give 'em to my grandchildren some day.
Re:Walmart (Score:1, Interesting)
I read an article in CRN (could be wrong here) about Visa's North American systems. They have two sites, one for the eastern half and one for the western, dividing the continent at the Mississippi River. The eastern site generates about 240 TB of data a month and could take over the whole continent if the western site went down. All this with just 4-5 IBM mainframes, probably running IMS.
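As a back-of-the-envelope check on that 240 TB a month figure, here is a small Python sketch that converts it to an average sustained rate. It assumes decimal terabytes (10^12 bytes) and a 30-day month, and the function name is made up; the real load would of course be much peakier than an average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_mb_per_s(tb_per_month: float, days_in_month: int = 30) -> float:
    """Average data rate in MB/s, assuming decimal units and an even spread."""
    bytes_per_month = tb_per_month * 1e12        # assumption: 1 TB = 10^12 bytes
    seconds = days_in_month * SECONDS_PER_DAY    # assumption: 30-day month
    return bytes_per_month / seconds / 1e6       # bytes/s converted to MB/s


if __name__ == "__main__":
    print(f"{average_rate_mb_per_s(240):.1f} MB/s on average")
```

Under those assumptions the eastern site averages somewhere in the low 90s of MB/s around the clock.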
How often does your Visa transaction not get processed? On the shopping day after Thanksgiving? All this with just 4-5 boxes? I would like to see Sun, HP, Oracle, Microsoft try that. Even if they could do it, it would be far more expensive to build and maintain.
Open Source DBs? (Score:2, Interesting)
I would guess that PostgreSQL maxes out larger than MySQL. </fuel-on-the-fire>
Re:SQL Server? (Score:2, Interesting)
Now MS has overwhelmed Sybase with a derivation of its own technology that has MS's special additional bugs included for a nominal price, largely because they know how to market and Sybase regularly fails to market its products effectively.
We are larger: 500TB (Score:3, Interesting)
Press release:
http://www.slac.stanford.edu/slac/media-info/20
Cheers