
Installation of Blue Waters Petaflop Supercomputer Begins

Posted by Unknown Lamer
from the too-many-flops-to-fail dept.
An anonymous reader writes "The National Center for Supercomputing Applications at the University of Illinois is finally getting the troubled Blue Waters supercomputer installed. After IBM walked away from the project following three years of planning, Cray stepped in to pick up the $188 million contract. Now, in around nine months' time, Blue Waters should be fully operational and achieve performance of 1 petaflop or more. As for the hardware... who wouldn't want access to 235 Cray XE6 cabinets using 16-core AMD Opteron 2600 processors, with access to 1.5 petabytes of memory (4GB per chip) and 500 petabytes of local storage?"
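A quick back-of-the-envelope check of the summary's numbers (a sketch in Python; the per-core memory split and per-core peak figure are assumptions for illustration, not claims from the article):

```python
# Sanity-check the Blue Waters figures from the summary.
# Assumptions (mine, not the article's): "4GB per chip" really means
# 4 GB per core, and an Opteron 6200-class core peaks near
# 2.3 GHz x 4 flops/cycle.
total_memory_gb = 1.5 * 1024 ** 2        # 1.5 PB expressed in (binary) GB
gb_per_core = 4
cores = total_memory_gb / gb_per_core
peak_gflops_per_core = 2.3 * 4           # assumed clock x flops per cycle
peak_pflops = cores * peak_gflops_per_core / 1e6
print(f"~{cores:,.0f} cores, ~{peak_pflops:.1f} PFLOPS peak")
```

Under those assumptions the machine lands around 393,000 cores and a nominal peak of a few petaflops, comfortably above the 1-petaflop sustained target — the usual gap between peak and sustained performance.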

  • That's the real question.

    • About seven years. That's the almost clockwork lag from supercomputer to laptop; desktop should be a bit sooner.

      • by gentryx (759438) * on Monday January 30, 2012 @07:50PM (#38872023) Homepage Journal

        Let's look this up. Seven years ago the #1 on the Top500 was an IBM BlueGene/L at 70 TFLOPS. I don't see performance anywhere close to that on the desktop, or even in the notebook market.

        Assuming you're running a good SLI system and that your GPUs actually deliver the performance the manufacturer claims, you'd get at best something around 1.5 TFLOPS, which corresponds roughly to the 1998 ASCI Red [top500.org].

        • SLI is absolutely useless for CUDA-based GPGPU (Cray uses NVIDIA GPUs).

          • by gentryx (759438) *
            Sure, I was using SLI as shorthand for a multi-GPU system. And since I was referring to a hypothetical desktop, it might even run AMD GPUs, not just Nvidia chips. But yeah, I know: AMD GPUs generally suck at scientific computing. Sadly.
        • by afabbro (33948)

          Let's look this up. Seven years ago the #1 on the Top500 was an IBM BlueGene/L at 70 TFLOPS. I don't see performance anywhere close to that on the desktop, or even in the notebook market.

          Whoosh...

      • by itamblyn (867415)
        I've found it's actually closer to about 15 years.
        • by rubycodez (864176)
          That would depend on the type of problem. If I had a problem suitable for a vector supercomputer, maybe I would only need one or four processors. An NEC SX-6 processor had 8 vector GFLOPS in 2001; by 2005, the SX-8 had 16 vector GFLOPS per CPU. Then in 2008 the SX-9 was up to 102 GFLOPS, about where our Core i7 desktops are. So maybe there is a four- or five-year lag.
    • Depends... (Score:5, Funny)

      by Junta (36770) on Monday January 30, 2012 @08:01PM (#38872119)

      How big is your desk?

  • It rings a bell for two things: Atari and the hacker magazine.

    I wonder if there's a connection somewhere ;)

    • by Zakabog (603757)

      Very likely there is a subconscious connection, as it's really an Opteron 6200; the 2600 is a typo.

  • Hang on... is there an app for it?
    • Re: (Score:1, Troll)

      by Forbman (794277)

      well, it can run Crysis at about 42 fps...

  • Um, me (Score:5, Interesting)

    by jimhill (7277) on Monday January 30, 2012 @08:00PM (#38872107) Homepage

    If the Cray architecture selected for Blue Waters is akin to that of Cielo, then UIUC is going to rue -- RUE! -- the day they got in bed with these Cray con-men. The uptime and filesystem stability of Cielo are an absolute dog (as in, at least two FS rebuilds per week, with data loss accompanying two in five).

    • You raise an interesting point. The usual level of Slashdot "commentary" on supercomputers isn't much above jokes about Crysis and pissing matches between AMD, ARM, and Intel fanboys. Slashdot generally misses those little trivial details like... does it actually work doing something other than a meaningless Top500 benchmark?

      • Re:Um, me (Score:4, Funny)

        by gmhowell (26755) <gmhowell@gmail.com> on Tuesday January 31, 2012 @04:02AM (#38874945) Homepage Journal

        You raise an interesting point. The usual level of Slashdot "commentary" on supercomputers isn't much above jokes about Crysis and pissing matches between AMD, ARM, and Intel fanboys.

        Back in my day, the Slashdot "commentary" on Supercomputers was about Beowulf clusters and Natalie Portman and hot grits.

        Damned kids running around on my lawn again...

    • Re: (Score:2, Informative)

      by Anonymous Coward

      Different file systems - Cielo is running Panasas (pfs) and Blue Waters will be running Lustre...

      Ironic CAPTCHA: painless

      • by jimhill (7277)

        We were notified last week that Those Who Run The Machine are throwing in the towel on Panasas and are securing a Lustre-based farm for Cielo.

    • by gentryx (759438) *
      Me neither. AMD's Interlagos (a.k.a. Bulldozer) chips have proven to absolutely suck at floating point performance. And in supercomputing floating point means everything. As much as I love the eternal underdog AMD, I can only hope Cray will soon start selling Intel systems, too. Sandy Bridge's AVX implementation is much better as the internal datapaths (L2->L1, L1->registers) are more elaborate.
      • According to Google, Cray and Intel are working together on future supercomputers.
      • Re:Um, me (Score:5, Insightful)

        by TheSunborn (68004) <tiller&daimi,au,dk> on Monday January 30, 2012 @09:43PM (#38873083)

        No they have not. Take a look at the multicore SPEC test at http://electronicsnexus.com/articles/Opteron-Xeon-Benchmarks-2012-01.aspx [electronicsnexus.com], where the 4x6282SE Opteron is the fastest 4-processor system tested. Or to quote:

        "For example, note that the top-end 16-core 6282SE Opteron is a match for the top-end 10-core Xeon on floating point, and is not far behind it on integer either"

        Oh, and the Opteron costs less than half the price of the 10-core Xeon chip. So I think that slightly better floating-point performance, for less than half the price, makes the Opteron the obvious choice, assuming you can split the workload so you can really use all the cores. Something I assume they master, since they are going to run their code on more than 1000 cores at a time.

        • by Junta (36770)

          In all truthfulness, the 10-core Xeons (still Westmere architecture) aren't Intel's shining star of FP performance. Intel's strength is in their 8-core Xeons (Sandy Bridge) that are only recently coming onto the market (not lagging Interlagos much at all). HPC has rarely been about the expensive high-end Xeons (massively expensive and generally 'last-gen' compared to the middle-tier Xeons, with the main historical benefit of getting you to 4 sockets in one 'system', which is largely a moot point in HPC wh

          • by afidel (530433)
            You STILL can't buy an E5 Xeon from anyone unless you were one of the few shops to order a datacenter full of them and are working with your OEM on the errata fixes. To say that the E5 Xeon isn't trailing Interlagos by much is a huge stretch since they're still essentially vaporware.

            On an unrelated note, WTF are they using 4GB DIMMs for? 8GB DIMMs have been the sweet spot for servers for the last ~18 months. The only thing I can think of is that they don't have the internode bandwidth to effectively use a glo
    • by magarity (164372)

      No kidding - Seymour may be rolling in his grave over having his name attached to anything massively parallel. His entire design philosophy was to have just a few uber processors cranked up as fast as possible, although I wonder if by now he'd have changed his mind. Multiple processor servers were expensive when he passed away and the multiple core race we have going on now wasn't even fantasy.

      • by rubycodez (864176)
        Some problems still need vector supercomputers; funny that Cray has even slapped their label on NEC SX vector supercomputers. Seymour must be doing 10,000 rpm.
      • by rbmyers (587296)

        No kidding - Seymour may be rolling in his grave over having his name attached to anything massively parallel. His entire design philosophy was to have just a few uber processors cranked up as fast as possible, although I wonder if by now he'd have changed his mind. Multiple processor servers were expensive when he passed away and the multiple core race we have going on now wasn't even fantasy.

        The number of processors isn't the issue. The degree of connectivity is the issue, and IBM, Cray, and Seymour would all get it, even if the current "Cray" and UIUC aren't going to admit it. This version of Blue Waters is just another in a long line of massively parallel jokes. The version of Blue Waters proposed and abandoned by IBM would have been worth talking about.

        Flops are nearly free. Connectivity is expensive. That's why flops, irrelevant though they may be, are advertised.
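The flops-versus-connectivity point above can be made concrete with quick arithmetic (a sketch in Python; every number here is an assumption chosen for illustration, not a vendor spec):

```python
# Illustrative compute-to-communication balance for one hypothetical
# dual-socket node. Assumptions (mine, for the sketch): two 16-core
# sockets at ~9.2 GFLOPS/core, and ~9.6 GB/s of network link bandwidth.
node_peak_gflops = 2 * 16 * 9.2
link_gb_per_s = 9.6
bytes_per_flop = link_gb_per_s / node_peak_gflops
print(f"~{bytes_per_flop:.3f} bytes of link bandwidth per peak flop")
```

Even with a generous interconnect, the node gets only a few hundredths of a byte of network bandwidth per peak flop, so any algorithm that must move data as fast as it computes will starve — which is exactly why cheap flops get advertised and expensive connectivity does not.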

  • Typo (Score:5, Informative)

    by reking2 (813728) on Monday January 30, 2012 @08:08PM (#38872191)
    Please correct "Opteron 2600" to "Opteron 6200". There are no 2600 series chips from AMD.
  • Whoa. Now that's a name I haven't heard in a while.... I'm glad to hear that they're still in the game!
  • Yes, but does it run .... oh forget it

  • Ahem, NVIDIA? (Score:4, Informative)

    by Mike_K (138858) on Tuesday January 31, 2012 @12:24AM (#38874137)

    It is very nice that AMD Opterons are mentioned and petaflops are celebrated, but aren't those petaflops mostly delivered by NVIDIA's Kepler Tesla cards?

    From the TFA:

    Cray XK6 blades with NVIDIA(R) Tesla(TM) GPUs, based on NVIDIA
    (NASDAQ: NVDA) next-generation 'Kepler' architecture, which is
    expected to more than double the performance of the Fermi GPU on
    double-precision arithmetic.

    • by gentryx (759438) *
      Actually most of the cabinets will be XE6, not XK6. Most codes at U of I aren't GPU ready.
  • Hardware compared to, say, 1970? Mammoth progress. Room-sized state of the art then is dwarfed by a low-end laptop now.

    Software compared to, say, 1970? We've moved a little, but really it isn't all that much different. Things are more GUI, some fads have come and gone, but as Robert Martin puts it [youtube.com], it's still just sequence, selection, and iteration.

    • by LeDopore (898286) on Tuesday January 31, 2012 @09:02AM (#38876529) Homepage Journal

      Dear afabbro,

      You are largely correct. Most software has not sped up much since the 1970s, and it could even be argued that developers write such sloppy code these days that even our improved compilers can't compensate, especially in applications where performance is no longer critical.

      On the other hand, since about 2006 there have been some tremendous advances in algorithms. One optimization problem I work on, Basis Pursuit Denoising http://en.wikipedia.org/wiki/Basis_pursuit_denoising [wikipedia.org], has had on the order of a 10-fold increase in real-world speed on constant hardware every year for the past 5 years (see http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5940245 [ieee.org] for my contribution).

      These advances are not just academic games; they are actually worth doing. They could eventually lead to computers with sensory processing routines that have a mote of common sense to them, able to perform some real-world tasks we currently need humans for.

      While I agree that by and large, most software is getting fat and lazy, there are a few problems where today's algorithms on 2002 hardware mop the floor with 2002 algorithms on today's hardware.

      Best,

      LeDopore

      • One optimization problem I work on, Basis Pursuit Denoising, has had on the order of a 10-fold increase in real-world speed on constant hardware every year for the past 5 years

        Great, so how about making OCR on noisy scans work next? Archive.org desperately needs something that works....

        • by LeDopore (898286)

          There's been some really promising work in the direction of OCR-like problems lately. Here's an algorithm that can efficiently learn a small dictionary of symbols (like letters) and decompose a signal into elements that fit within this "low-rank" dictionary plus sparse noise (bugs squashed on the text?) plus Gaussian noise: https://sites.google.com/site/godecomposition/ [google.com].

          It's not literally magical, but it's super-duper awesome (and no, I'm not an author of this one) and it should contribute to the minor revo

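For readers curious what the Basis Pursuit Denoising problem discussed above looks like in practice, here is a minimal sketch (my own illustration in Python/NumPy, using plain ISTA rather than any of the fast solvers the thread alludes to; the `ista` helper and all problem sizes are made up for the example):

```python
import numpy as np

def ista(A, b, lam, iters=1000):
    """Iterative shrinkage-thresholding for min 0.5*||Ax-b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)              # gradient of the smooth term
        z = x - g / L                      # gradient descent step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold
    return x

# Recover a 3-sparse vector from 50 noisy measurements in 200 dimensions.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200)) / np.sqrt(50)
x_true = np.zeros(200)
x_true[[3, 77, 150]] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = ista(A, b, lam=0.05)
```

Plain ISTA like this is exactly the kind of baseline that the post-2006 algorithmic advances (FISTA, SPGL1, and friends) sped up by orders of magnitude on the same hardware.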