IBM Sets Supercomputer Speed Record 308
T.Hobbes writes "IBM's BlueGene/L has set a new speed record at 36.01 TFlops, beating the Earth Simulator's 35.86 TFlops, according to internal IBM testing. 'This is notable because of the fixation everyone has had on the Earth Simulator,' said Dave Turek, I.B.M.'s vice president for the high-performance computing division. The AP story is here; the NY Times' story is here."
Damn, (Score:3, Funny)
Maybe
Re:Damn, (Score:4, Funny)
Re:Damn, (Score:3, Funny)
Due to the slashdot bug in Firefox, the second line of the summary reads
eating the Earth Simulator
The b was hidden under the dark green of the sections field on the upper left.
Now that's an impressive feat of computing power.
Re:Damn, (Score:4, Funny)
Speed Test Accuracy (Score:2)
I realize a spell checker ought to stop as soon as it finishes scanning, while a task manager never stops.
A supercomputer-class spell checker would undoubtedly be full-blown, industrial-strength software. Such software wouldn't just do the ga
KDE spell checker? (Score:2, Funny)
Re:Damn, (Score:2)
Tecord? (Score:5, Funny)
Aliasing (Score:2, Informative)
It's a mew tecord! (Score:5, Funny)
Re:Actually... (Score:2)
Can't Compare to my Windows (Score:5, Funny)
But could it keep up with /.? (Score:2, Funny)
Re:But could it keep up with /.? (Score:2)
Tecord? (Score:5, Interesting)
Re:Tecord? (Score:5, Informative)
The top500 [top500.org] tecords are submitted on an honor system. Most of the systems are thrown together with known processors and interconnects where the tesults should "make sense". Also, the systems teport their theoretical max performance and a measured tesult. It would be pretty hard to fudge a score for the top500 by much without many people questioning it. From this page [top500.org] the top500 people say: Its kinda like any tesearch field. Most people are honest, but anomalies can and do happen, and they are usually found out by others in the field. Two of the most tecent scientific scandals involved Hendrik Schön of Bell Labs, who was caught falsifying data; he was fired, and I believe he also lost his PhD. The other is the US government-funded tesearch on MDMA by George Ricaurte. Although I believe that nothing really happened in the Ricaurte case.
Uh, you misspelt (Score:2)
Full Text of Atticle (Score:5, Funny)
IBM says Blue Gene bteaks speed tecotd
9/29/2004, 7:27 a.m. ET
By ELLEN SIMON
The Associated Ptess
NEW YOtK (AP) - IBM Cotp. claimed unofficial btagging tights Tuesday as ownet of the wotld's fastest supetcomputet.
Fot thtee yeats tunning, the fastest supetcomputet has been NEC's Eatth Simulatot in Japan.
"The fact that non-U.S. vendot like NEC had the fastest computet was seen as a big challenge fot U.S. computet industty," said Hotst Simon, ditectot of the supetcomputing centet at Lawtence Betkeley National Lab in Califotnia.
"That an Ametican vendot and an Ametican application has won back the No. 1 spot -- that's the main significance of this."
Eatth Simulatot can sustain speeds of 35.86 tetaflops.
IBM said its still-unfinished BlueGene/L System, named fot its ability to model the folding of human ptoteins, can sustain speeds of 36 tetaflops. A tetaflop is 1 ttillion calculations pet second.
Lawtence Livetmote National Labotatoty plans to install the Blue Gene/L system next yeat with 130,000 ptocessots and 64 tacks, half a tennis coutt in size. The labs will use it fot modeling the behaviot and aging of high explosives, asttophysics, cosmology and basic science, lab spokesman Bob Hitschfeld said.
The ptototype fot which IBM claimed the speed tecotd is located in tochestet, Minn., has 16,250 ptocessots and takes up eight tacks of space.
While IBM's speed sets a new benchmatk, the official list of the wotld's fastest supetcomputets will not be teleased until Novembet. A handful of scientists who audit the computets' tepotted speeds publish them on Top500.otg.
Supetcomputing is significant because of its implications fot national secutity as well as such fields as global climate modeling, asttophysics and genetic teseatch.
Supetcomputing technology IBM inttoduced a decade ago has evolved into a $3 billion to $4 billion business fot the company, said Simon.
Unlike the mote specialized atchitectute of the Japanese supetcomputet, IBM's BlueGene/L uses a detivative of commetcially available off-the-shelf ptocessots. It also uses an unusually latge numbet of them.
The tesulting computet is smallet and coolet than othet supetcomputets, teducing its tunning costs, said Hitschfeld. He did not have a dollat figute fot how much lowet Blue Gene's costs will be than othet supetcomputets.
Howevet, othet supetcomputets can do things Blue Gene cannot, such as ptoduce 3-D simulations of nucleat explosions, Hitschfeld said.
When I first heard of Blue Gene (Score:2)
A few years ago, the fastest supercomputers were being built to simulate atomic explosions, including the first computer to break the teraflops barrier [sciencenews.org].
The Earth Simulator was built for peaceful purposes. Blue Gene is in name motiv
Place your bets (Score:5, Insightful)
What percentage of posts in the first 15 minutes will be about the spelling of the last word in the title, and what percentage about the content?
Re:Place your bets (Score:2)
-N
BlueGene? Deep Blue? (Score:3, Funny)
"I want to play chess against that one" - Kasparov
I read all three articles but couldn't find... (Score:3, Interesting)
Re:I read all three articles but couldn't find... (Score:5, Informative)
It's a sort of two-layer system. The compute nodes (2 CPUs per compute node) run an IBM proprietary, very small and simple kernel. Every 64 compute nodes are managed by an I/O node running Linux.
Re:I read all three articles but couldn't find... (Score:3, Interesting)
I wonder if they let normal people see this thing? I'll ask.
Re:I read all three articles but couldn't find... (Score:2)
It's also been on local TV news, and in some of the newspapers.
You can understand that if there is something bigger floating around, we're not exactly allowing photographers in.
Re:I read all three articles but couldn't find... (Score:5, Informative)
Re:I read all three articles but couldn't find... (Score:5, Informative)
Here's a bit more:
- each node has 2 CPUs and 4 FPUs
- custom non-preemptive kernel
- application program has full control of all timing issues
- kernel and application share the same address space
- kernel is memory protected
- kernel provides: program load / start / debug / termination; file access, all via message passing to I/O nodes
I could go on and on but it's all on Blue Gene's site http://www.research.ibm.com/bluegene/index.html [ibm.com]
I can't resist adding that GCC won't use the second FPU on each die...
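The "file access via message passing to I/O nodes" item describes a function-shipping design: the compute kernel forwards I/O requests as messages to its I/O node rather than touching disk itself. Here's a toy sketch of the idea, with in-process queues standing in for the real interconnect; nothing here reflects IBM's actual protocol or message format.

```python
import queue
import threading

# Toy function-shipping model: a compute node forwards write()
# requests to an I/O-node daemon instead of doing any I/O itself.
# Purely illustrative; not IBM's actual protocol.

requests = queue.Queue()
results = {}

def io_node_daemon(storage):
    """Runs on the I/O node: services shipped I/O requests."""
    while True:
        req = requests.get()
        if req is None:          # shutdown sentinel
            break
        req_id, fname, data = req
        storage.setdefault(fname, b"")
        storage[fname] += data
        results[req_id] = len(data)   # "bytes written" reply

def compute_node_write(req_id, fname, data):
    """Runs on a compute node: ships the write, no local I/O."""
    requests.put((req_id, fname, data))

storage = {}
daemon = threading.Thread(target=io_node_daemon, args=(storage,))
daemon.start()
compute_node_write(1, "out.dat", b"hello ")
compute_node_write(2, "out.dat", b"world")
requests.put(None)
daemon.join()
print(storage["out.dat"])
```

The appeal of this split is that the compute kernel stays tiny and deterministic, while all the messy, stateful I/O lives on the Linux I/O nodes.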
Re:I read all three articles but couldn't find... (Score:5, Insightful)
Each node is running an embedded linux kernel.
No.
each node has 2 cpus and 4 fpus, custom non-preemptive kernel
I see a contradiction with your previous statement here...
As I said in my comment above, the compute nodes run an IBM proprietary kernel (apparently the kernel you're describing), and every 64 compute nodes are managed by an i/o node running Linux.
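A quick sketch of what that two-layer hierarchy implies for the full machine, assuming 2 CPUs per compute node and a 64:1 compute-to-I/O ratio throughout (these are assumptions extrapolated from this thread, not published figures):

```python
# Hypothetical node counts for the full Blue Gene/L, assuming
# 2 CPUs per compute node and 64 compute nodes per I/O node.

total_cpus = 130_000               # from the article
compute_nodes = total_cpus // 2    # lightweight-kernel nodes
io_nodes = compute_nodes // 64     # Linux I/O nodes

print(compute_nodes, io_nodes)
```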
I can't resist adding that GCC won't use the second FPU on each die...
So what's the problem? It's not like anybody who could afford a highly specialized and expensive machine like this one couldn't afford to shell out some $$$ for xlf.
Anyways, I'm sure that if this modified PPC core gets popular outside multi-million dollar supercomputers, the gcc team will figure out how to utilize the second FPU.
Re:I read all three articles but couldn't find... (Score:2)
Actually, given that I lifted every single fact from IBM's Blue Gene website (which I linked to), I feel comfortable saying there is no contradiction, and that IBM is running a custom non-preemptive kernel, just like they said they did.
Re:I read all three articles but couldn't find... (Score:2)
I don't see anything wrong with embedded & non-preemptive; it's not like the entire embedded world runs hard real-time kernels. I don't on the PPC I use.
I agree, but I didn't say anything about that topic.
Actually being that lifted every single fact from IBM's blue gene website (which I linked to)
What a coincidence, I also have read a lot of the material on that site.
I feel comfortable saying there is no contradiction
The contradiction I pointed out was that first you said that each no
Re:I read all three articles but couldn't find... (Score:3)
The example Blue Gene/L implementation begins with dual PPC440 chips (each die with dual FPUs). A Compute Card contains 2 of these chips, some number of Link Nodes, and 512 MB of DDR RAM. There are 16 Cards to a Compute Node. Each Compute Node contains an IO Node. There are 32 Compute Nodes to a cabinet.
Each Compute Card runs "CNK", an IBM in house kernel written in C++ and is connec
Re:I read all three articles but couldn't find... (Score:2)
GCC has never been considered an excellent optimising compiler. It's good enough for basic system tools, the kernel, etc. But when performance matters, you use a compiler that comes from your CPU vendor. Typically an architecture-specific compiler can yield a 100% speedup and beyond over GCC. It's too bad AMD does not ship a compiler.
Re:I read all three articles but couldn't find... (Score:2)
Re:I read all three articles but couldn't find... (Score:2, Funny)
36 TFlops ? (Score:5, Interesting)
I know that when the Mac G5 cluster was developed they claimed tremendous speed, but when the sustained rate was calculated, it turned out to be much lower.
Re:36 TFlops ? (Score:2)
I have no idea what you mean by sustained.
Re:36 TFlops ? (Score:2, Informative)
He is referring to the fact that horsepower has a time component. It's only in rare conditions that you're interested in the instantaneous force a horse can apply. What you want to know is how much work you can get out of it per day.
A cheetah may be able to sprint at 100 kph, but I'll outdistance it in 10 minutes driving my car at only 80 kph.
Human hunters on foot can only run about 15 kph, but can chase down large prey that can run 65 kph, because the human
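The peak-versus-sustained distinction in this analogy can be made concrete with a small sketch. The clock speed (~700 MHz) and flops-per-cycle figure (4, i.e. 2 FPUs each doing a fused multiply-add) are assumptions, not numbers from the article; only the processor count and the 36.01 TFlops result come from the story.

```python
# Illustrative peak-vs-sustained calculation for the BlueGene/L
# prototype. Clock and flops/cycle are assumed, not official.

def peak_tflops(cpus, flops_per_cycle, clock_ghz):
    """Theoretical peak: every FPU busy every cycle."""
    return cpus * flops_per_cycle * clock_ghz / 1000.0

rpeak = peak_tflops(cpus=16_250, flops_per_cycle=4, clock_ghz=0.7)
rmax = 36.01   # sustained Linpack result, TFlops (from the article)

efficiency = rmax / rpeak
print(f"Rpeak = {rpeak:.1f} TFlops, efficiency = {efficiency:.0%}")
```

Under these assumptions the sustained Linpack number is a healthy fraction of peak; on less regular workloads the gap between the two is usually much wider, which is the poster's point.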
Re:36 TFlops ? (Score:5, Informative)
Va. Tech cluster not on current Top 500 list (Score:4, Interesting)
Except that it's not on the most recent Top 500 list [top500.org] anywhere.
Remember how Va. Tech replaced all 1100 G5 nodes with G5 XServes a few months ago? Well, when you do something like that, you have to rerun and resubmit the benchmark. Va. Tech were not able to get the machine back together soon enough to rerun the benchmark in time to make the last list; there's even a big caveat about it on the Top 500 home page [top500.org].
(It's also not clear that the original version of the Va. Tech machine ever did anything other than run that benchmark, but that's another matter.)
Re:Va. Tech cluster not on current Top 500 list (Score:2)
If anything, we have the opposite problem where I work. Our clusters are so heavily utilized (averaging around 80-90% for the last couple months) that we've had a hard time freeing up enough nodes to run benchmarks and do preventative maintenance.
Re:Va. Tech cluster not on current Top 500 list (Score:2, Insightful)
Except during the first month or so of operation there was always a queue of jobs waiting to run. In fact, applying to use that computer is almost exactly like writing a grant proposal, except that the good proposals are awarded CPU hours instead of money.
In scientific computation, researchers can always use faster computers. The accuracy and scope of the models will expand to fill available computational resources.
Re:36 TFlops ? (Score:2)
Speaking of the Big Mac (lame name), where is it now? I don't see it in the list [top500.org], and the news page on their site doesn't list it as coming back on the 24th edition in November.
Re:36 TFlops ? (Score:2)
Re:36 TFlops ? (Score:2)
From what I know when Virginia Tech's G5 cluster's results were submitted they looked OK. You can see the results here [top500.org]. The measured result was about 58% of the theoretical peak which is on par with other similarly configured systems. Now, why Tech spent $5 mil and rushed to get this system put together for the November 2003 top500 list, and then
Re:36 TFlops ? (Score:2)
Anyway, I know they were in the process of "upgrading" the G5 system. It's not too common for people to build and completely disassemble a $5 million computer within a couple of months. Yes, I realise that the new systems will have ECC memory. This was the #1 question when the first system was built, and the Tech people said "Oh, we have validation routines in our applications, we don't need ECC memory". Now they are putting in machines with ECC memory.
Re:36 TFlops ? (Score:2)
and the Tech people said "Oh, we have validation routines in our applications, we don't need ECC memory"
Actually I think the Tech people were well aware of the fact that their cluster was damn near useless without ECC. They knew that they had to run everything twice at the very least, and they *DEFINITELY* knew that their validation routines had ABSOLUTELY nothing to do with correcting memory errors.
Unfortunately many posters on /. are not nearly as intelligent. I saw LOTS of people here yelling
Which one is it - using it or testing it? (Score:2)
"upgraded", "live", "testing it" - which one is it? What the fuck does that mean - is it ready or not? It's live but it's still under testing? How can it be? It's the tests first, then going live.
And if they're done with it, why don't they publish another (higher, if the upgrade worked) benchmark?
If they
Re:Which one is it - using it or testing it? (Score:2)
(tig)
Re:36 TFlops ? (Score:2)
With a 95% confidence level (Score:3, Funny)
0. "fist pr0st!!!!!111~"
1. "Imagine a beowulf cluster of these!"
2. "But does it run Linux?"
3. "In Soviet Russia, SPEED RECORD SETS YUO!"
4. "1. Earth Simulator: 38.56 TFlops. 2. BlueGene/L36.01 TFlops. 3.
5. "I for one, welcome our supercomputer overlords."
6. "Do either of the supercomputers run BSD? BSD is dying."
7. "I didn't have enough time to read the article, but..."
Huh? (Score:3, Interesting)
the Blue Gene/L system next year with 130,000 processors and 64 racks, half a tennis court in size.
The prototype for which IBM claimed the speed record, located in Rochester, Minn., has 16,250 processors and takes up eight racks of space.
So does this mean the finished product, with almost 10x as many procs will be much faster still? Or am I reading this wrong?
Re:Huh? (Score:5, Informative)
"About IBM's Blue Gene Supercomputing Project Blue Gene is an IBM supercomputing project dedicated to building a new family of supercomputers optimized for bandwidth, scalability and the ability to handle large amounts of data while consuming a fraction of the power and floor space required by today's fastest systems. The full Blue Gene/L machine is being built for the Lawrence Livermore National Laboratory in California, and will have a peak speed of 360 teraflops. When completed in 2005, IBM expects Blue Gene/L to lead the Top500 supercomputer list. A second Blue Gene/L machine is planned for ASTRON, a leading astronomy organization in the Netherlands. IBM and its partners are currently exploring a growing list of applications including hydrodynamics, quantum chemistry, molecular dynamics, climate modeling and financial modeling."
-from the IBM website
Re:Huh? (Score:3, Funny)
*** NIB * BlueGene prototype ***
Re:Huh? (Score:2)
So does this mean the finished product, with almost 10x as many procs will be much faster still?
Yes. Assuming the machine scales linearly (which might be possible with Linpack), the real thing will have a Linpack score 8x that of the Earth Simulator.
Impressive, yes, but keep in mind that the Earth Simulator was 5x faster than the next fastest machine when it was introduced in 2002.
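The linear-scaling estimate works out like this, using only figures quoted in the article. Real Linpack runs scale somewhat sublinearly, so treat this as an upper bound:

```python
# Back-of-the-envelope linear-scaling projection for the full
# Blue Gene/L, from numbers quoted in the article.

proto_procs = 16_250     # prototype processors
proto_tflops = 36.01     # prototype sustained Linpack, TFlops
full_procs = 130_000     # planned full system

scale = full_procs / proto_procs          # exactly 8x
projected_tflops = proto_tflops * scale   # naive linear projection

earth_simulator = 35.86
print(projected_tflops, projected_tflops / earth_simulator)
```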
Off the shelf configuration (Score:3, Interesting)
Unlike the more specialized architecture of the Japanese supercomputer, IBM's BlueGene/L uses a derivative of commercially available off-the-shelf processors. It also uses an unusually large number of them. The resulting computer is smaller and cooler than other supercomputers, reducing its running costs, said Hirschfeld. He did not have a dollar figure for how much lower Blue Gene's costs will be than other supercomputers.
This is the most interesting part of the article to me. Makers of supercomputers are going to go back and forth for the speed record. However, holding the speed record with off the shelf components seems like a separate achievement in and of itself. The article did mention, however, that the IBM system is not as capable as other supercomputers.
Re:Off the shelf configuration (Score:3, Informative)
TZ
Re:Off the shelf configuration (Score:2)
The most interesting part: (Score:5, Interesting)
"The new system is notable because it packs its computing power much more densely than other large-scale computing systems. BlueGene/L is one-hundredth the physical size of the Earth Simulator and consumes one twenty-eighth the power per computation, the company said."
1/100th the size and 1/28th the power. Now if that isn't a beautiful thing, I don't know what is.
Re:The most interesting part: (Score:2)
1/100th the size and 1/28th the power. Now if that isn't a beautiful thing, I don't know what is.
1/100th the cost?
Look out world .. (Score:2)
More detail (Score:3, Informative)
Thank you! (Score:2)
Re:More detail (Score:2)
With 65k processors it makes sense that the MTBF would be small, but wow. Of course, IBM has accounted for this in the design: the system is able to automatically recover from a failed node etc.
What for? (Score:3, Funny)
"IBM's new system nudges past a nearly three-year-old computer speed record of 35.86 "teraflops," or trillions of calculations per second, with a working speed of 36.01 teraflops....The current record-holder, known as the Earth Simulator, is a supercomputer in Yokohama, Japan, designed to simulate earthquakes."
Won't it be great when IBM announces that they built Blue Gene to simulate Japanese earthquakes? Neener neener.
Smart machines (Score:5, Interesting)
Re:Smart machines (Score:5, Interesting)
Timeout-- my head hurts.
Which brings me to my next point. If a computer could ever think, it would eventually start to think about how it thinks... And then it would overheat or explode.
Re:Smart machines (Score:2)
way to catch up guys. (Score:5, Informative)
Blue Gene is a very interesting design in that it uses IBM's 32-bit PowerPC cores, normally used for embedded applications. They put 2 cores on a die, and integrated a memory controller as well as the 4 different interconnect networks. The cores are only clocked at about 800 MHz, and are thus pretty wimpy individually. However, that can be good. Since the processor cores are quite modest, the ratio of memory bandwidth to CPU flops is quite high. Similarly, the ratio of interconnect bandwidth to CPU flops is also very high. Thus the CPUs should run very efficiently on problems that will parallelize to thousands of CPUs. Some problems, on the other hand, will perform terribly. I expect a lot of this system's performance depends on the scalability of the system software, and the compilers / libraries.
That said, the Earth Simulator is also really good at some applications, and not so good at others. Instead of 16,000 small CPUs, it uses 5000 massive vector CPUs. Each is clocked at only 500 MHz, but has 8 parallel execution pipes and about 50 GB/sec of memory bandwidth. Problems that don't vectorize run through the very modest 500 MHz scalar unit.
The Earth Simulator has realized a large percentage of its theoretical peak performance on real-world simulations (often up to 50%), while most large systems approach 10%. I'm looking forward to seeing how well utilized Blue Gene is. The Earth Simulator was a direct descendant of NEC's SX-series supercomputers, which have a 20-year lineage. Blue Gene is a radical departure from IBM's regular HPC product offerings, and uses a new microkernel OS rather than clustered AIX nodes. I imagine there will be some stutter-steps in the early days of this new product, which will undoubtedly work themselves out over time.
Great work IBM.
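The "ratio of memory bandwidth to CPU flops" argument is easiest to see in bytes-per-flop terms: at a fixed memory bandwidth, a modest core leaves more bytes available per flop than a fast one. The numbers below are purely hypothetical, chosen only to illustrate the arithmetic:

```python
# Bytes-per-flop comparison at a fixed memory bandwidth.
# All figures are hypothetical, not datasheet values.

def bytes_per_flop(mem_bw_gb_s, gflops):
    return mem_bw_gb_s / gflops

BANDWIDTH = 5.5  # GB/s, assumed per-node memory bandwidth

modest = bytes_per_flop(BANDWIDTH, 2.8)   # ~700 MHz core, 4 flops/cycle
fast = bytes_per_flop(BANDWIDTH, 12.0)    # hypothetical 3 GHz commodity core

print(modest, fast)
```

The modest core gets roughly four times as many bytes of memory traffic per flop, so bandwidth-bound codes waste fewer of its cycles waiting on memory, which is the poster's point about wimpy cores running efficiently.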
Re:way to catch up guys. (Score:3, Interesting)
I expect a lot of this system's performance depends on the scalability of the system software, and the compilers / libraries.
Blue Gene is an all-out MPI machine. System software scalability is not that crucial, since every compute node kernel only controls 2 CPUs. With this modest number of CPUs per node, I'd guess it doesn't require any extreme trickery from the scalability point of view to achieve near-hardware performance.
Software-wise, all the scalability problems lie in the design of the app
Re:way to catch up guys. (Score:3, Interesting)
Re:way to catch up guys. (Score:2)
I thought your comment about scalability of the system software meant the traditional scalability woes of SMP systems, complicated lock hierarchies and the like. Clearly the problems faced by MPP systems are different, and not as limiting, since we can build MPP systems with about 2 orders of magnitude more CPUs than shared-memory systems.
But then the application does something like write(file, offset, &buffer). That can'
36.01 What ? (Score:3, Informative)
What interconnect technology are they using? (Score:2)
Chip H.
Re:What interconnect technology are they using? (Score:5, Informative)
Did they use infiniband? Or a proprietary interconnect, perhaps?
Proprietary. Actually, it has 3 networks: one mesh network for point-to-point communication, one tree network for collective communication, and a service network for disk I/O, control, health monitoring, etc. The service network is Ethernet IIRC; the other two are custom.
Virginia Tech Supercomputer (Score:2, Funny)
Also, I'll bet big money it's already been used for gaming. What college student could resist?
Seen partial towers (Score:2, Informative)
Sadly, all 64 racks will never be in Roch, just not enough space.
Actually, StarTribune [startribune.com] has one (crappy) pic of some towers.
OK, it sounds fast, but (Score:2, Funny)
Re: (Score:2)
Re:think they could install distcc? (Score:2)
Nice...logo... (Score:2)
blakespot
Re:Nice...logo... (Score:2)
You know, before they redirected its usage to designing nuclear weapons, blue gene was supposed to do life-science calculation stuff...
Earthlings (Score:2)
Re:Earthlings (Score:2)
how many watts? (Score:2)
Re:how many watts? (Score:2)
Thats nice for IBM but real computing power.. (Score:5, Informative)
Re:Thats nice for IBM but real computing power.. (Score:3, Interesting)
Re:Thats nice for IBM but real computing power.. (Score:3, Informative)
Re:Thats nice for IBM but real computing power.. (Score:2)
Mac Rumor Sites (Score:3, Funny)
-ch
SOME SENSE PLEASE! (Score:2)
IBM vs. SGI (Score:2, Interesting)
yeah, but... (Score:2)
Re:Tecord == Record? (Score:5, Funny)
Re:Can't let this one slip by... (Score:2)
Re:Someone please explain this to me (Score:2)
Re:Someone please explain this to me (Score:2, Informative)
Re:The question is... (Score:2, Funny)
Re:Processor Failure. (Score:2, Informative)
You deserve some credit for using "when", not "if" (IMHO)
The system is designed to work around failure. In the original protein folding simulation, the plan was (among other things) to checkpoint the system every hour in order to be able to restart if a failure occurred. In fact, the original expectation was that a processor would fail every few days (that presentation has since been taken down by IBM... was originally named "BlueGenePublic.
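The hourly-checkpoint scheme can be sketched as a simple checkpoint/restart loop. This is purely illustrative: a random draw stands in for node failure, and in-memory dict copies stand in for real snapshots written to stable storage.

```python
import random

# Toy checkpoint/restart loop: save state periodically, and on a
# simulated node failure, roll back to the last checkpoint instead
# of restarting the whole run. Illustrative only.

random.seed(1)
CHECKPOINT_EVERY = 10    # steps between checkpoints
TOTAL_STEPS = 50

state = {"step": 0, "value": 0.0}
checkpoint = dict(state)

while state["step"] < TOTAL_STEPS:
    if random.random() < 0.05:        # simulated node failure
        state = dict(checkpoint)      # roll back: lose < 10 steps
        continue
    state["step"] += 1
    state["value"] += 1.0             # stand-in for real work
    if state["step"] % CHECKPOINT_EVERY == 0:
        checkpoint = dict(state)      # persist a snapshot

print(state["step"], state["value"])
```

The trade-off is checkpoint frequency: checkpoint too often and the I/O overhead dominates; too rarely and each failure throws away more work, which matters when a failure is expected every few days across tens of thousands of processors.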