
SGI & NASA Build World's Fastest Supercomputer

Posted by timothy
from the but-does-it-run-windows dept.
GarethSwan writes "SGI and NASA have just rolled out the new world number one fastest supercomputer. Its performance test (LINPACK) result of 42.7 teraflops easily outclasses the previous mark set by Japan's Earth Simulator of 35.86 teraflops AND that set by IBM's new BlueGene/L experiment of 36.01 teraflops. What's even more awesome is that each of the 20 512-processor systems runs a single Linux image, AND Columbia was installed in only 15 weeks. Imagine having your own 20-machine cluster?"

  • hmmmm...... (Score:4, Funny)

    by commo1 (709770) on Tuesday October 26, 2004 @10:46PM (#10638139)
    Let's see them predict the weather.....
    • by Anonymous Coward on Tuesday October 26, 2004 @10:58PM (#10638243)
      Today we predict a high of +3 Funny, with localised Trolling.

      Tomorrow looks like developing a slight rise in Insightful posts, but a drop in overall Informative. "First Post" will remain as a constant pattern.

    • Re:hmmmm...... (Score:5, Informative)

      by OblongPlatypus (233746) on Wednesday October 27, 2004 @12:17AM (#10638749)
      You asked for it: "...with Columbia, scientists are discovering they can potentially predict hurricane paths a full five days before the storms reach landfall."

      In other words: RTFA, that's exactly what they're using it for.
      • Re:hmmmm...... (Score:4, Insightful)

        by Shag (3737) on Wednesday October 27, 2004 @12:52AM (#10638925) Homepage
        "...with Columbia, scientists are discovering they can potentially predict hurricane paths a full five days before the storms reach landfall."

        You don't live somewhere that gets hurricanes, do you? 'Cause scientists can already "potentially predict hurricane paths a full five days before the storms reach landfall." Hell, I can do that. A freakin' Magic 8 Ball can potentially do that.

        Maybe they're trying to say something about doing it with a better degree of accuracy, or being right more of the time, or something like that, but it doesn't sound like it from that quote.

        "Hey, guys, look at this life-sized computer-generated stripper I'm rendering in real-ti... oh, what? Um, tell the reporter we think it'd be good for hurricane prediction."

    • by cheekyboy (598084) on Wednesday October 27, 2004 @01:41AM (#10639189) Homepage Journal
      1. I bet the CIA has something on the order of 10-100x more powerful. I mean, if you can afford to wire up 5 full office floors of computers, say 20*512 * 5 per floor * 5, that's a hell of a lot more. The CIA can afford to spend 200m on it and have 10 super clusters of 1000 TF each.

      2. I bet the CIA can also change the weather, go read HARP etc... if the Russians could do it in the 80s then the CIA can do anything.

  • by Anonymous Coward on Tuesday October 26, 2004 @10:47PM (#10638145)
    ...when they hit the "TURBO" button on the front of the boxes they'll really scream.
    • Re:That's nothing... (Score:5, Informative)

      by jm92956n (758515) on Tuesday October 26, 2004 @11:01PM (#10638258) Journal
      when they hit the "TURBO" button on the front of the boxes they'll really scream.

      They did! According to a CNET article [com.com], they "quietly submitted another, faster result: 51.9 trillion calculations per second" (equivalent to 51.9 teraflops).
      • by jd (1658) <imipak&yahoo,com> on Tuesday October 26, 2004 @11:21PM (#10638393) Homepage Journal
        The article also talks of a third run, at 61 teraflops, slightly over the predicted 60 teraflops.


        Ok, so we have Linux doing tens of teraflops in processing, FreeBSD doing tens of petabits in networking, ... What other records can Open Source smash wide open?

        • From the article -

          NASA Secures Approval in 30 Days
          To accelerate NASA's primary science missions in a timely manner, high-end computing experts from NASA centers around the country collaborated to build a business case that Brooks and his team could present to NASA headquarters, the U.S. Congress, the Office of Management and Budget, and the White House. "We completed the process end to end in only 30 days," Brooks said.


          Wow. That's incredibly fast, IMHO.

          As the article mentions, I suppose NASA owes thi
          • by luvirini (753157) on Wednesday October 27, 2004 @12:00AM (#10638644)
            "NASA Secures Approval in 30 Days" Knowing how governmental processes normally go, this part really seems incredible. Normally even the "fluffy" pre-study would take that long (or way more) before anyone actually sits down to discuss actual details and such, especially given the way almost everything with NASA seems to be over budget and way late. It is indeed good to see that there is still some hope, so let's hope they get their procurement processes in general working better.
          • by RageEX (624517) on Wednesday October 27, 2004 @01:28AM (#10639127)
            Good job NASA? Yeah I'd agree. But what about good job SGI? Why does SGI always seem to have bad marketing and not get the press/praise they deserve?

            This is an SGI system. SGI laid out plans for terascale computing (stupid marketing speak for huge ccNUMA systems) a while ago. I'm sure NASA and SGI worked together, but this is essentially an 'Extreme' version of an off-the-shelf SGI system.
            • by AndyChrist (161262) <andy_christ@yah[ ]com ['oo.' in gap]> on Wednesday October 27, 2004 @03:32AM (#10639540) Homepage
              The reason SGI isn't getting the kind of credit it should is probably because of how they resisted Linux and clustering for so long (before apparently caving and deciding to go the direction the wind was blowing and put their expertise to doing the fashionable thing BETTER.)

              Slashdot carries grudges.
              • Yes, there's some truth to that. One thing SGI has been guilty of is bad management and wishy-washiness. But it should be pointed out that SGI has been a supporter of OSS for a very, very long time and has been an important contributor not only to the Linux kernel but has also open sourced a lot of their own software. Heck, they gave the world XFS for free!
              • how they resisted ... clustering

                Their new machines still aren't clustered. Clusters don't generally run single system images on shared memory computers. SGI's Altix systems use a NUMA link to enable them to efficiently access memory on remote computers, making them a kind of distributed shared memory machine. And SGI's Origin systems are your traditional SMP machine. The Altix or Origin systems are neither cheap, nor off the shelf.

                Regarding your comment about them ignoring Linux, what was fundamentally
              • Commodity Linux clusters are not the only kind of cluster out there. SGI has been building clusters since the late 80s. Their first supercomputer product, the Power Challenge clusters, were 16- and 36-way SMP boxes clustered together with HIPPI. Remember Terminator 2 and Jurassic Park? Those were rendered on clusters of Crimson and Indigo workstations. They may have called them NOW (network of workstations) instead of Beowulf, but it was the same thing.

                As for linux, they stepped towards linux about the same
            • What about good job Intel? I see nowhere in this entire set of postings a nod to Intel? Sure, Intel has had the suck of suck lately in PR, but the brains behind this whole SGI monster are IA64 Itanium 2 processors. I certainly concede the SGI interconnects for the cluster are absolutely awesome, but as others have pointed out, if your cluster has killer software with crappy hardware, or killer hardware and crappy software, then your cluster sucks. 2+2=4 here.
        • "What other records can Open Source smash wide open?"

          Mmmm, home consumer usage, maybe?? HA! What was I thinking!?
    • by Dink Paisy (823325) on Tuesday October 26, 2004 @11:06PM (#10638297) Homepage
      This result was from the partially completed cluster, at the beginning of October. At that time only 16 of the 20 machines were online. When the result is taken again with all 20 of the machines there will be a sizeable increase in that lead.

      There's also a dark horse in the supercomputer race; a cluster of low-end IBM servers using PPC970 chips that is in between the BlueGene/L prototype and the Earth Simulator. That pushes the last Alpha machine off the top 5 list, and gives Itanium and PowerPC each two spots in the top 5. It's amazing to see the Earth Simulator's dominance broken so thoroughly. After so long on top, in one list it goes from first to fourth, and it will drop at least two more spots in 2005.

  • by Emugamer (143719) * on Tuesday October 26, 2004 @10:47PM (#10638146) Homepage Journal
    I have one of those... in a spare room!

    Who cares about a 20-system cluster, I want one 512-processor machine!

    or 20, I'm not that picky
  • by Dzimas (547818) on Tuesday October 26, 2004 @10:47PM (#10638147)
    Just what I need to model my next H-bom... uhh... umm.... I mean render my next feature film. I call it "Kaboom."
  • Wow---- (Score:5, Funny)

    by ZennouRyuu (808657) on Tuesday October 26, 2004 @10:48PM (#10638159)
    I bet gentoo wouldn't be such a b**ch to get running with all of that compiling power behind it :)
  • by m00j (801234) on Tuesday October 26, 2004 @10:48PM (#10638161)
    According to the article it got 42.7 teraflops using only 16 of the 20 nodes, so the performance is going to be even better.
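As a rough back-of-the-envelope check on that claim, the sketch below scales the 16-node result linearly to all 20 nodes. Perfect linear scaling is an assumption (benchmark efficiency usually slips somewhat as a system grows), and the function name is made up for illustration.

```python
def extrapolate_linear(measured_tflops: float, nodes_used: int, nodes_total: int) -> float:
    """Scale a measured result to more nodes, assuming perfectly linear scaling."""
    return measured_tflops * nodes_total / nodes_used


# 42.7 teraflops on 16 of the 20 nodes, as quoted in this thread
estimate = extrapolate_linear(42.7, 16, 20)
print(f"Linear estimate for all 20 nodes: {estimate:.1f} teraflops")
```

Other comments in this thread cite later results of 51.9 and 61 teraflops, so the linear figure is only a ballpark.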
  • by ferrellcat (691126) on Tuesday October 26, 2004 @10:49PM (#10638165)
    ...they were *almost* able to get Longhorn to boot.
  • by fender_rock (824741) on Tuesday October 26, 2004 @10:49PM (#10638168) Homepage
    If the same software is used, it's not going to make weather predictions more accurate. It's just going to give them the wrong answer, faster.
    • Well, maybe what makes the weather models inaccurate is the grid size of the simulations. If you try to model a physical system with a finite-element type of approach and set the gridsize so large that it glosses over important dynamical processes, it won't be accurate.

      But if you can decrease the grid size by throwing more teraflops at the problem, maybe we'll find that our models are accurate after all?
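To get a feel for how many teraflops a finer grid might eat, here is a minimal sketch of the scaling. It assumes a uniform 3D grid and a time step that has to shrink in proportion to the grid spacing (a CFL-style stability condition); both are assumptions for illustration, not details from the article, and `relative_cost` is a hypothetical helper.

```python
def relative_cost(refinement: float, dims: int = 3, scale_time_step: bool = True) -> float:
    """Work multiplier when the grid spacing is divided by `refinement`."""
    cells = refinement ** dims  # more cells in every spatial dimension
    steps = refinement if scale_time_step else 1.0  # more time steps if dt must shrink too
    return cells * steps


for r in (2, 4, 10):
    print(f"grid spacing / {r}: about {relative_cost(r):,.0f}x the work")
```

Under these assumptions, halving the grid spacing costs about 16 times the work, which is why resolution gains tend to lag well behind raw teraflops.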

    • I'm not an expert on this, but your statement is in my opinion not completely true. Weather forecasting is a little bit like playing chess. One has a lot of different paths to take to find the best solution. Increased computing power allows for "deeper" searches and increases accuracy. My guess is that more accuracy requires exponentially more computing power. Comparing Earth Simulator to Columbia makes me wonder how much accuracy has increased in this particular case.
      • The question is whether the limiting factor is the amount of data we have on the system, or how much we can do with that data. Weather is a fundamentally chaotic system, with sensitive dependence on initial conditions. So eventually any inaccuracy in the data will be amplified and throw off our predictions. But then again, we have a lot of data already, with satellites and weather balloons and airplanes flying through hurricanes, and so forth. Maybe we can squeeze a little more knowledge out of this dat
      • it's the wetware (Score:5, Insightful)

        by Doc Ruby (173196) on Wednesday October 27, 2004 @12:08AM (#10638692) Homepage Journal
        Weather prediction, it turns out, is *not at all* like playing chess. Chess is a deterministic linear process operating on rigid, unchanging rules. There is always a "best move" for every board state, which a sufficiently fast and capacious database could search for. Weather is chaotic, a nonlinear process. It feeds back its state into its rules, in that some processes increase the sensitivity to change of other simultaneous processes. Chaos cannot be merely "solved", like a linear equation; it must be simulated and iterated through its successive states to identify more states.

        Of course, we're just getting started with chaos dynamics. We might find chaotic mathematical shortcuts, just like we found algebra to master counting. And studying weather simulation is a great way to do so. Lorenz first formally specified chaos math by modeling weather. While we're improving our modeling techniques to better cope with the weather on which we depend, we'll be sharpening our math tools. Weather applications are therefore some of the most productive apps for these new machines, now that they're fast enough to model real systems, giving results predicting not only weather, but also the future of mathematics.
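To make "sensitive dependence on initial conditions" concrete, here is a minimal sketch. It uses the logistic map as a stand-in, not a weather model and not Lorenz's equations; the map, the parameter r = 4, and the starting values are illustrative choices only.

```python
def logistic_trajectory(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate x -> r * x * (1 - x) and return every state, starting with x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ by one part in ten billion
a = logistic_trajectory(0.2, 60)
b = logistic_trajectory(0.2 + 1e-10, 60)
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The gap starts out far below anything measurable and ends up as large as the values themselves, which is the reason the system has to be iterated state by state instead of solved in one shot.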
  • SGI & NASA now have developed a computer that will be able to run Longhorn.
  • Photos of System (Score:5, Informative)

    by erick99 (743982) <homerun@gmail.com> on Tuesday October 26, 2004 @10:50PM (#10638178)
    This page [sgi.com] contains images of the NASA Altix system. After reading the article I was curious as to how much room 10K or so processors take up.
  • Interesting Facts (Score:5, Informative)

    by OverlordQ (264228) * on Tuesday October 26, 2004 @10:50PM (#10638181) Journal
    1) This was fully deployed in only 15 weeks.
    (Link [sgi.com])

    2) This number was using only 16 of the 20 systems, so a full benchmark should be larger too.
    (link [sgi.com])

    3) The storage attached holds 44 LoC's (link [sgi.com])
    • More on the Storage (Score:3, Informative)

      by Necroman (61604)
      Check out http://www.sgi.com/products/storage/ [sgi.com] for some more info about the storage they are using. For those that don't want to wander around the site, there is a link under the picture of the storage array that says "Watch a Video" and it gives an overview of the technology that SGI uses in their storage solution.

      They use tape storage from Storage Tek like this one [storagetek.com]
      And hard drive storage from Engenio (formerly LSI Logic Storage Systems) like this [engenio.com].
  • by Anonymous Coward
    ...single node of these...

    oh wait, sorry, Cray deja-vu :-)
  • by daveschroeder (516195) * on Tuesday October 26, 2004 @10:53PM (#10638208)
    Prof. Jack Dongarra of UTK is the keeper of the official list in the interim between the twice-yearly Top 500 lists:

    http://www.netlib.org/benchmark/performance.pdf [netlib.org] See page 54.

    And here's the current top 20 [wisc.edu] as of 10/26/04...
  • NASA.org? (Score:5, Funny)

    by lnoble (471291) on Tuesday October 26, 2004 @10:58PM (#10638244)
    Wow, I didn't know the NewAdvancedSearchAgent had such an interest or budget for super computing. I'd think they'd be able to afford their own web server though instead of being parked at domainspa.com and having to fill their entire page with advertisements.

    Try NASA.GOV.

  • by Dancin_Santa (265275) <DancinSanta@gmail.com> on Tuesday October 26, 2004 @10:58PM (#10638245) Journal
    Why does it take so long to build a super computer and why do they seem to be redesigned each time a new one is desired?

    It's a little like how Canada's and France's nuclear power plant systems are built around standardized power stations, cookie cutter if you will. The cost to reproduce a power plant is negligible compared to the initial design and implementation, so the reuse of designs makes the whole system really cheap. The drawback is that it stagnates the technology and the newest plants may not get the newest and best technology. Contrast this with the American system of designing each power plant with the latest and greatest technology. You get really great plants each time, of course, but the cost is astronomical and uneconomical.

    So too, it seems, with supercomputers. We never hear about how these things are thrown into mass production, only about how the latest one gets 10 more teraflops than the last and all the slashbots wonder how well Doom 3 runs on it or whether Longhorn will run at all on such an underpowered machine.

    But each design of a supercomputer is a massive success of engineering skill. How much cheaper would it become if, instead of redesigning the machines each time someone wants to feel more manly than the current speed champion, the current design were rebuilt for a generation (in computer years)?
    • But then what are the engineers supposed to do? Bored engineers like making new supercomputers.

      Although I joke, I do see your point. Perhaps it would be wiser if we left our current supercomputer designs alone for a while until we really need an upgrade. Maybe they could spend some of their time fixing Windows instead?
    • Here's one reason it takes so long: you have to construct a special building to put a large super computer in. That could take several months to years to complete. You can't just set up computers in any old warehouse, you need the proper power, air conditioning systems, cable conduits, etc...

      Bringing pre manufactured super computers into the building is probably the easiest step.
    • Why does it take so long to build a super computer ...
      It doesn't. [rocksclusters.org]
    • by anon mouse-cow-aard (443646) on Tuesday October 26, 2004 @11:34PM (#10638489) Journal
      Thought experiment: Order 10000 PCs. Time how long it takes to get them installed, with power, network cabling, and cooling, in racks, and installed with the same OS.

      Second thought experiment: Imagine the systems are built out of modular bricks that are identical to deskside servers, so that they can sell exactly the same hardware in anywhere from 2 to 512 processors by just plugging the same standard bricks together, and they all get the same shared memory, and run one OS. Rack after rack after rack. That is SGI's architecture. It is absolutely gorgeous.

      So they install twenty of the biggest boxes they have, and network those together.

      Bang per buck? I dunno. Is shared memory really a good idea? Probably not. But it is absolutely gorgeous, and no one can touch them in that shared memory niche that they have.

    • by Geoff-with-a-G (762688) on Tuesday October 26, 2004 @11:44PM (#10638543)
      Why does it take so long to build a super computer and why do they seem to be redesigned each time a new one is desired?

      Well, are we talking about actual supercomputers, not just clusters? 'Cause if you're just trying to break these Teraflops records, you can just cram a ton of existing computers together into a cluster, and voila! lots of operations per second.

      But it's rare that someone foots the bill for all those machines just to break a record. Los Alamos, IBM, NASA, etc. want the computer to do serious work when it's done, and a real supercomputer will beat the crap out of a commodity cluster at most of that real work. Which is why they spend so much time designing new ones. Because supercomputers aren't just regular computers with more power. With an Intel/AMD/PowerPC CPU, jamming four of them together doesn't do four times as much work, because there's overhead and latency involved in dividing up the work and exchanging the data between the CPUs. That's where the supercomputers shine: in the coordination and communication between the multiple procs.

      So the reason so much time and effort goes into designing new supercomputers is that if you need something twice as powerful as today's supercomputer, you can't just take two and put them together. You have to make a new architecture that is even better at handling vast numbers of procs first.
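One standard way to put numbers on "four CPUs don't do four times the work" is Amdahl's law, which the comment above does not name. The sketch below uses it with an assumed parallel fraction of 95%, picked purely for illustration; it also ignores communication latency, so real machines can do worse than this.

```python
def amdahl_speedup(parallel_fraction: float, processors: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can be spread out."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# 0.95 is an assumed parallel fraction, not a figure from the article
for n in (4, 512, 10240):
    print(f"{n:>6} processors: {amdahl_speedup(0.95, n):.1f}x speedup")
```

With 5% of the work stuck in serial, even 10,240 processors stay under a 20x speedup, which is why so much design effort goes into shrinking the serial and communication share instead of just adding boxes.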
    • I don't build supercomputers, but I do build systems that look a lot like them in very similar infrastructures. I'm not sure why it took them 120 days (okay, "under" 120 days), but when we build out a datacenter with 70 to 100 machines, it usually takes a bit of time:
      a) obtain space. Usually, raised floors, rack systems, with adequate HVAC for the huge thermal load you're about to throw into a few racks. For collocation, it'll take some time for your provider to wire together a cage for your installation, e
  • Ermm, which NASA are we talking about again?

    National Aeronautics and Space Administration [nasa.gov]
    New Advanced Search Agent [nasa.org]
  • by Doppler00 (534739) on Tuesday October 26, 2004 @11:05PM (#10638283) Homepage Journal
    by a computer currently being set up at Lawrence Livermore National Lab: 360 teraflops [zdnet.com]

    The amazing thing about it is that it's built at a fraction of the cost/space/size of the Earth Simulator. If I remember correctly, I think they already have some of the systems in place for 36 teraflops. It's the same Blue Gene/L technology from IBM, just on a larger scale.

  • Cost (Score:5, Interesting)

    by MrMartini (824959) on Tuesday October 26, 2004 @11:07PM (#10638302)
    Does anyone know how much this system cost? It would be interesting to see how good of a teraflop per million dollar ratio they achieved.

    For example, I know the Virginia Tech cluster (1,100 Apple Xserve G5 dual 2.3Ghz boxes) cost just under $6 million, runs at a bit over 12 teraflops, so it gets a bit over 2 teraflops per million dollars.

    Other high-ranking clusters would be interesting to evaluate in terms of teraflops per million dollars, if anyone knows any.
    • Re:Cost (Score:4, Informative)

      by MrMartini (824959) on Wednesday October 27, 2004 @12:27AM (#10638802)
      Since no one else has answered my question, I'll post the results of searching on my own:

      http://news.com.com/Space+agency+taps+SGI,+Intel+for+supercomputer/2100-1010_3-5286156.html [com.com]

      The cost is quoted in the article at $45 million over a three year period, which indicates that the "Columbia" super cluster gets a bit more than 1 teraflop per million dollars. That seems impressive to me, considering the overall performance.

      It would be interesting to see how well the Xserve-based architecture held its performance per dollar when scaled up to higher teraflop levels...
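For anyone who wants to redo the arithmetic, here is a quick sketch using only figures quoted in this thread. The Virginia Tech numbers are rounded ("just under $6 million", "a bit over 12 teraflops"), the $45 million is spread over three years, and it is not clear which Columbia result the estimate above was based on, so all three reported runs are shown.

```python
def tflops_per_million(tflops: float, cost_millions: float) -> float:
    """Teraflops delivered per million dollars spent."""
    return tflops / cost_millions


# (label, teraflops, cost in millions of dollars), all taken from comments in this thread
systems = [
    ("Virginia Tech Xserve cluster (rounded)", 12.0, 6.0),
    ("Columbia, 42.7 TF run", 42.7, 45.0),
    ("Columbia, 51.9 TF run", 51.9, 45.0),
    ("Columbia, 61 TF run", 61.0, 45.0),
]

for label, tflops, cost in systems:
    print(f"{label}: {tflops_per_million(tflops, cost):.2f} TF per $1M")
```

Only the two later runs clear 1 teraflop per million dollars; the original 42.7-teraflop result comes in just under.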
      • Re:Cost (Score:3, Informative)

        by Junta (36770)
        Actually, a lot of the top500 supercomputers achieve or beat $1 million/TFLOP. Even if the price points weren't that good on the component parts, marketing departments are inclined to give huge discounts for the press coverage. You can bet SGI and Intel both gave exorbitant discounts here; SGI's market presence has been dwindling, and overall the Itanium line has been a commercial failure. Being #1 on the top 500 for 6 months (the length between list compilations, and BlueGene isn't even close to finish
  • 65 > 43

    It was here on slashdot last week [slashdot.org], IIRC. :)

  • Not fully true (Score:2, Informative)

    by ValiantSoul (801152)
    They were only using 16 of those 20 servers. With all 20 they were able to peak 61 teraflops. Check the article [com.com] at CNET.
  • Ya know... (Score:5, Funny)

    by Al Al Cool J (234559) on Tuesday October 26, 2004 @11:13PM (#10638343)
    It's getting to the point where I'm going to have to call shenanigans on the whole freakin' planet. Am I really supposed to believe that an OS started by a Finnish university student a decade ago and designed to run on a 386, is now running the most powerful computer ever built? I mean, come on!

    Seriously, am I on candid camera?

  • by chicagozer (585086) on Tuesday October 26, 2004 @11:15PM (#10638353)
    Emulating a Centris 650 running Mac OS X at 2.5 Ghz.
  • Yes, but does it run linux?
  • 70.93 TeraFLOPs (Score:5, Interesting)

    by chessnotation (601394) on Tuesday October 26, 2004 @11:20PM (#10638382)
    Seti@home is currently reporting 70.93 TeraFLOPs/sec. It would be Number One if the list were a bit more inclusive.
  • And with all that power they feel the need to .ZIP their .JPG images which actually shaved off an entire 4K on this single 848K file. Wow. I should have thought of that.

    Columbia [sgi.com]
  • by Trogre (513942) on Tuesday October 26, 2004 @11:44PM (#10638545) Homepage
    ...rubbing his hands whilst sitting in a dark corner amongst an ever-dwindling pile of Microsoft-donated cash, salivating at this.

    "512 processors, 20 machines, $699 per processor. All that intellectual property, yes! No free lunch no, Linux mine, MIIIINE, BWAAAAHAHAHAHA!!!"

    *dials*

    "Hello, NASA? About that $7,157,760 you owe me...
    I'm sorry, where do you want me to jump?"

  • Linux #1 (Score:5, Interesting)

    by Doc Ruby (173196) on Tuesday October 26, 2004 @11:57PM (#10638631) Homepage Journal
    The most amazing part of this development is that the fastest computer in the world runs Linux . All these TFLOPS increases are really evolutionary, incremental. That the OS is the popular, yet largely underground open source kernel is very encouraging for NASA, SGI, Linux, Linux developers and users, OSS, and nerds in general. Congratulations, team!
    • But it would take too long to:

      Run Windows Update for each box.
      Remove Windows Messenger.
      Cancel the window telling you to take a tour of XP.
      Cancel the window telling you to get a passport.
      Run the net connection wizard.
      Reboot after installing updates.
      etc....

      (I'm not being totally serious, I know you can deploy ghost images etc..)
  • by swordgeek (112599) on Wednesday October 27, 2004 @12:00AM (#10638645) Journal
    Curiously enough, we were talking about the future of computing at lunch today.

    There was a time when different computers ran on different processors, and supported different OSes. Now what's happening? Itanic and Opteron running Linux seem to be the only growth players in the market; and the supercomputer world is completely dominated by throwing more processors together. Is there no room for substantial architectural changes? Have we hit the merging point of different designs?

    Just some questions. Although it's not easy, I'm less excited by a supercomputer with 10k processors than I would be by one containing as few as 64.
  • by Sabalon (1684) on Wednesday October 27, 2004 @12:10AM (#10638703)
    It was great. I needed to build the kernel so I typed
    # make -j 10534 bzImag
    and even before I could hit the e and enter, it was done.

    I was gonna build X but on this box the possible outcomes of "build World" scared me!
  • Units? (Score:3, Funny)

    by Guppy06 (410832) on Wednesday October 27, 2004 @12:19AM (#10638758)
    "Its performance test (LINPACK) result of 42.7 teraflops easily outclasses the previous mark set by Japan's Earth Simulator of 35.86 teraflops"

    Yes, but what is that in bogomips?
