
Cray Replaces IBM To Build $188M Supercomputer 99

wiredmikey writes "Supercomputer maker Cray today said that the University of Illinois' National Center for Supercomputing Applications (NCSA) awarded the company a contract to build a supercomputer for the National Science Foundation's Blue Waters project. The supercomputer will be powered by new 16-core AMD Opteron 6200 Series processors (formerly code-named 'Interlagos'), a next-generation GPU from NVIDIA called 'Kepler,' and a new integrated storage solution from Cray. IBM was originally selected to build the supercomputer in 2007, but terminated the contract in August 2011, saying the project was more complex and required significantly increased financial and technical support beyond its original expectations. Once fully deployed, the system is expected to have a sustained performance of more than one petaflops on demanding scientific applications."
This discussion has been archived. No new comments can be posted.

  • by unity100 ( 970058 ) on Monday November 14, 2011 @09:48AM (#38048128) Homepage Journal
    Along with the Cray they are upgrading (#3 in the world now, will be #1 when complete) and the one Lockheed Martin ordered three days ago, this is the third supercomputer ordered in the last three weeks to use Opterons (16-core Bulldozer).

    The CPU sucks so much that it is exclusively dominating the SUPERcomputer market.
    • by dutchwhizzman ( 817898 ) on Monday November 14, 2011 @09:55AM (#38048178)
      Designing supercomputers involves a lot of investment in inter-CPU messaging and memory sharing. Once a supercomputer vendor has committed itself to a platform, it's not easy to migrate to another. Given the volumes they sell, design costs have to be spread over just a few actual installations. Maybe AMD was the best platform to use when these computers were originally designed, but they are outdated now. The fact that these new AMD CPUs will work in "ancient" sockets and use the same interconnects will make the development cost of a performance upgrade lower.

      Obligatory car metaphor: most car manufacturers put old technology in the cars they bring out today as well, just because the cost of developing new technology and building production lines is commercially prohibitive.
      • by LWATCDR ( 28044 )

        AMD has held an advantage in systems with more than 4 sockets for a while. You just don't see that many Intel x86-based systems with a dozen or more sockets. Itanium tends to be used in those systems.

        • by Kjella ( 173770 )

          Not sure if they count as systems, but Intel has about 75% of the TOP500 list with AMD about 13%. And that's coming from a period where AMD has had really strong Opterons. But then I don't think each node has a dozen or more sockets...

          • by LWATCDR ( 28044 )

            The Xeon has gotten much better, but I will bet that a lot of those systems use smaller SMP nodes coupled with InfiniBand or some custom interconnect, where the AMD systems use larger NUMA clusters linked with InfiniBand or a custom interconnect. That is just my guess of course.

      • by trum4n ( 982031 )
        Exactly. Why are cars still gas-powered? Because building a real electric car takes real design work. They'd rather just strap a new engine and some uglier panels onto the body of an existing car.
        • Trolling? OK I'll bite. "So you don't think it has anything to do with the fact that the energy density of gasoline is 12,200 Wh/kg while lithium ion batteries have an energy density of 439 Wh/kg plus a limited charge cycle of ~1000?"
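          A quick sanity check of those figures in plain C (a sketch that just restates the poster's numbers, which I have not verified independently):

              #include <stdio.h>

              int main(void) {
                  const double gasoline_wh_per_kg = 12200.0; /* quoted energy density of gasoline */
                  const double liion_wh_per_kg    = 439.0;   /* quoted energy density of Li-ion cells */

                  /* Mass of Li-ion cells needed to store the same energy as 1 kg of gasoline. */
                  double mass_ratio = gasoline_wh_per_kg / liion_wh_per_kg;
                  printf("~%.0f kg of Li-ion cells per kg of gasoline (by stored energy)\n", mass_ratio);
                  return 0;
              }

          By those numbers, roughly 28 kg of cells hold the energy of 1 kg of gasoline, though electric drivetrains convert stored energy to motion far more efficiently, which narrows the practical gap.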
        • Cars are still gas-powered because of massive collusion. They are now finally starting to bring out production EVs because China was going to do it sooner or later -- China is producing electric vehicles of all types as fast as it can; most of them suck, but it's only a matter of time, since there's a lot of problem-solving to do and EVs are conceptually simple. But the big three automakers can agree on one thing, and that is that EVs are bad for their bottom line. The dealers for the Big 3 depend on

          • by trum4n ( 982031 )
            The one big problem I have is that GM already did it. The EV1 is still the mark we homebuilt-EV guys compare to. All they had to do for today's market was extend it slightly, add a back seat, and drop in some lithium batteries.
            • The one big problem I have is that GM already did it.

              Sigh. The one big problem you have with WHAT?

              All they had to do for today's market was extend it slightly, add a back seat, and drop in some lithium batteries.

              And make it meet the federal crash test standards that they helped write in order to keep cars like that off the market, in order to step on the people of California's attempt to improve our air quality through vehicle emissions reductions.

      • Re: (Score:2, Informative)

        by flaming-opus ( 8186 )

        It is true that it costs a lot to switch processors, but let's remember that HPC systems are also very price sensitive. Blue Waters will have more than 50,000 processor sockets in it. Xeon processors may be better than Opterons, but they also cost a LOT more. Multiply that by 50,000. In the benchmarks I've seen, 16-core Opterons almost compete with 8-core Xeons in raw performance, but blow the Xeons away in price/performance.
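        To put the "multiply that by 50,000" point in perspective, a toy calculation; the per-socket premium below is purely hypothetical, not a real Xeon or Opteron price:

            #include <stdio.h>

            int main(void) {
                const int    sockets         = 50000;  /* approximate socket count cited for Blue Waters */
                const double price_delta_usd = 1000.0; /* HYPOTHETICAL extra cost per socket for the pricier CPU */

                /* Total premium across the whole machine, in millions of dollars. */
                printf("Total premium: $%.1f million\n", sockets * price_delta_usd / 1e6);
                return 0;
            }

        Even a modest per-socket premium becomes tens of millions of dollars at this scale, which is why price/performance tends to dominate CPU selection for machines like Blue Waters.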

      • Re: (Score:3, Insightful)

        by drinkypoo ( 153816 )

        Because of AMD's design, glue logic is cheaper. You can see this reflected in the cost of motherboards at all levels for AMD platforms vs. their Intel competition. This is especially important in a supercomputer. AMD has been easier to build into massively parallel systems for longer. The Intel processors are slightly snazzier dollar for dollar, but not that much more amazing. Therefore there are only two reasons you would use Intel over AMD to build a supercomputer (cluster size, maximum power limitations)

      • by JBMcB ( 73720 )

        Obligatory car metaphor: most car manufacturers put old technology in the cars they bring out today as well, just because the cost of developing new technology and building production lines is commercially prohibitive.

        Not quite - car technology lags behind the marketplace because type acceptance on electronics takes years. A new engine can take six or seven. Especially on low-margin cars, like compacts. A single warranty recall can blow the profit margin on an entire production run. They want the latest tech in their products, but they aren't going to throw profit out the window if it breaks.

    • It sucks as a desktop processor, though, which is what most of the people on Slashdot care about. It's still up in the air whether OS optimisations will solve that.

      Still, hopefully a good foothold in the supercomputing market will give AMD the oomph it needs to compete with Intel in every field.

      • It sucks for certain kinds of desktop work. For the things I do, core count and size of RAM beat MIPS, FLOPS, and GHz. A modest six-core AMD is better for me than a highly priced dual- or quad-core Intel.

    • So...two ordered...DOMINATION! Are you actually astroturfing for AMD's marketing department and getting paid or are you really this much of a desperate fanboy?
      • Three weeks, three supercomputer orders, each worth a minimum of $200 million. At least 2,000 CPUs go to just one of them. None Intel, none anything other than AMD.

        Yes, it's pretty much domination with this picture.
        • So new CPU, orders for the CPU come in, and that's domination? Come back when the SB Xeons hit and we'll see what happens.
  • by Dyinobal ( 1427207 ) on Monday November 14, 2011 @09:50AM (#38048158)
    Ever since reading Jurassic Park, I've always wanted a Cray supercomputer. No other supercomputer company had a hand in bringing dinosaurs back to life. Once you've resurrected dinosaurs, I don't think that can be topped. I wonder if U of I is planning on doing any dinosaur resurrections with their new supercomputer.
  • Cray is still alive? Wasn't it gobbled up by Silicon Graphics? And then SGI too went belly up? Well, it is a blast from the past.
    • Cray has several of the Top 10 supercomputers on earth, especially in the US. They're pretty nice to work with, too.
    • Re: (Score:3, Informative)

      by wezelboy ( 521844 )
      The Cray name was bought from SGI by Tera. SGI was later bought by Rackable.
      • It's worth noting that the "new Cray", while they obviously don't make the old vector processor systems that they did originally, makes a really nifty hybrid cluster/SSI (single system image) supercomputer that is notably different than most of what's on the market. Man, seeing articles like this makes me want to get back into HPC stuff. I'm making a bit more doing this corporate crap, but I really miss getting to play with the cutting edge stuff.

      • They didn't just buy the name. They also bought all of the people who designed and built those earlier Cray machines. There are still people at Cray who had a hand in the original Cray-1. It's actually a rather nice mix of expertise: multithreading experience from the Tera side, and scalable MPP and vector experience from the Cray Research side.
        • Good to know. I'm kinda curious to hear what they've got planned for their integrated storage solution. :-)
    • by Yvan256 ( 722131 )

      What year is this? Is it still time to buy AAPL shares at $8 USD?

    • by Niomosy ( 1503 )

      SGI split Cray off quite a while ago.

  • Last time I was at the Air and Space Museum in Washington DC I saw a Cray supercomputer http://www.nasm.si.edu/collections/artifact.cfm?id=A19880565000 [si.edu]. I was extremely excited and tried to show my kids, who only saw a very weird big computer thing. A new supercomputer built by Cray sounds like a great idea :)

    • Re:Wow (Score:5, Interesting)

      by DrgnDancer ( 137700 ) on Monday November 14, 2011 @10:38AM (#38048630) Homepage

      Cray has had supercomputers on the top ten list (and even in the number one spot) again for years now. Ever since they spun off from SGI they've had one of the more interesting architectures in HPC. I was interviewing at ORNL when they were installing Jaguar [ornl.gov], and I got a pretty in-depth description of the hows and the whys. It's no longer the most powerful computer in the world, but it's still a very impressive piece of machinery. Sigh. I really need to get back into HPC.

    • Names like Cray and Silicon Graphics are associated with a time when most of us could only imagine what incredible technology existed behind closed doors, inaccessible to mere mortals. Now all the excitement is behind commodity items that sell for $500 or less. It's fantastic. Yet, where's the mystique? I miss it.
      • The mystique is in scaling. It's very hard to run codes on hundreds of thousands of cores and get decent performance. Communication is a huge problem which is why you still see custom interconnect on the high-end systems. Memory architectures on these machines are pretty exotic. It's not just about having a fast processor. It's more about making sure that you can feed that fast processor.
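        A rough illustration of why feeding the processors is the hard part, using Amdahl's law (a sketch; the serial/communication fraction below is a made-up parameter, not a measurement from any real machine):

            #include <stdio.h>

            /* Amdahl's law: speedup on p cores when a fraction s of the work cannot be parallelized. */
            static double amdahl(double s, double p) {
                return 1.0 / (s + (1.0 - s) / p);
            }

            int main(void) {
                const double serial_fraction = 0.001; /* HYPOTHETICAL 0.1% serial/communication share */
                const double cores[] = { 1e2, 1e4, 1e5, 3e5 };

                for (int i = 0; i < 4; i++)
                    printf("%8.0f cores -> speedup %.0fx\n", cores[i], amdahl(serial_fraction, cores[i]));
                return 0;
            }

        With even 0.1% of the time spent serialized or waiting on the network, the speedup saturates near 1,000x (the limit is 1/serial_fraction) no matter how many more cores you add, which is why the interconnect and memory system matter as much as the processors.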
  • by ackthpt ( 218170 ) on Monday November 14, 2011 @10:45AM (#38048702) Homepage Journal

    I've been working with an agency that contracted a large project to IBM a few years ago. The results have been... unimpressive. The training was largely a waste of time; I don't believe they even understood their audience.

    Better to see Cray, I think, as IBM is shopping out a bit too much of its work to people who aren't up to it... unless IBM has seen the light.

  • imagine a Beowulf cluster of these . . .

    • Okay, hang on.

      hhhnnnnnnnnnggggg...
      hhhhhhhhhNNNNNNnnnngg..
      hrrrrrrrhrhrhrrrgg..

      I think I got it. It looks a lot like a small city. But programming the whole mess is exactly like programming one of them, since each is already a cluster.

  • by gentryx ( 759438 ) * on Monday November 14, 2011 @12:18PM (#38049724) Homepage Journal

    As covered earlier here [slashdot.org], IBM backed out of the contract because they thought they wouldn't be able to meet the performance requirements for existing codes. They were concerned about clock speeds (POWER7 [wikipedia.org] runs at 4 GHz). POWER7 excels at single thread performance, but also in fat SMP nodes.

    What NCSA ordered now is a system that is pretty much the antipode of the original Blue Waters: the Bulldozer cores are sub-par at floating-point performance, so they'll have to rely on the Kepler GPUs. Those GPUs are great, but to make them perform well, NCSA and U of I will have to rewrite ALL of their codes. Moving data from host RAM to GPU RAM over slow PCIe links can be a major PITA, especially if your code isn't prepared for that (the rough numbers below give a feel for why).

    Given the fact that codes in HPC tend to live much longer than the supercomputers they run on, I think it would have been cheaper for them to give IBM another load of cash and keep the POWER7 approach.
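    To give a rough feel for the PCIe point above, a back-of-the-envelope sketch; the bandwidth figures are ballpark numbers for hardware of that era, not measurements of Blue Waters:

        #include <stdio.h>

        int main(void) {
            const double bytes       = 1e9;    /* 1 GB working set */
            const double pcie_bps    = 8e9;    /* nominal PCIe 2.0 x16 bandwidth, bytes/s (ballpark) */
            const double gpumem_bps  = 180e9;  /* GPU memory bandwidth of that era, bytes/s (ballpark) */

            printf("Host->device copy over PCIe:            %.1f ms\n", 1e3 * bytes / pcie_bps);
            printf("One pass over the same data in GPU RAM:  %.1f ms\n", 1e3 * bytes / gpumem_bps);
            return 0;
        }

    Moving 1 GB over PCIe takes on the order of 100 ms versus a few milliseconds to stream the same data once from GPU memory, so a code that ships its working set across the bus every time step spends most of its time on transfers. That is the restructuring burden described above.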

    • Well, they won't have to completely rewrite all of their codes thanks to OpenACC [cray.com]. They will probably still have to do a bit of restructuring (and that's not a small task) but the nitty-gritty low-level stuff like memory transfers should be handled and optimized by the compiler.
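      For readers unfamiliar with the directive approach, a minimal OpenACC sketch in C (illustrative only; real codes need far more restructuring, as noted above). The data clauses are what let the compiler generate the host-to-device transfers that would otherwise be written by hand:

          #include <stdlib.h>

          /* y = a*x + y, offloaded via OpenACC directives; a directive-aware compiler
             generates the GPU kernel and the transfers described by the data clauses. */
          void saxpy(int n, float a, const float *x, float *y) {
          #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
              for (int i = 0; i < n; i++)
                  y[i] = a * x[i] + y[i];
          }

          int main(void) {
              int n = 1 << 20;
              float *x = malloc(n * sizeof *x), *y = malloc(n * sizeof *y);
              for (int i = 0; i < n; i++) { x[i] = 1.0f; y[i] = 2.0f; }
              saxpy(n, 3.0f, x, y);   /* every y[i] is now 5.0f */
              free(x); free(y);
              return 0;
          }

      Built with a directive-aware compiler (such as Cray's or PGI's) the loop is offloaded to the GPU; built with any other C compiler the pragma is ignored and the loop runs on the host, which is the performance-portability argument made in the replies below.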
      • by gentryx ( 759438 ) *

        That's similar to what PGI [pgroup.com] is doing. And you know what? It's not that simple. You seldom achieve competitive performance with this annotation-type parallelization, simply because the codes were written with different architectures in mind.

        This is also the reason why the original design emphasized single-thread performance so much. The alternative to having POWER7 cores running at 5 GHz would have been to buy a BlueGene/Q with many more, but slower, cores. They didn't go into that avenue because they knew that their codes wouldn't scale to the number of cores well.

        • That's similar to what PGI is doing.

          Yes. In fact they've been working together on it.

          It's not that simple.

          You are absolutely right. That's why I wrote, "not a small task."

          You seldom achieve competitive performance with this annotation-type parallelization, simply because the codes were written with different architectures in mind.

          The nice thing about this is the restructuring one does for GPUs generally also translates into better CPU performance on the same code. So one can enhance the code in a performance-portable way. That isn't possible to do without compiler directives. With directives, one can avoid littering the code with GPU-specific stuff.

          They didn't go into that avenue because they knew that their codes wouldn't scale to the number of cores well.

          This article [hpcwire.com] explains that five years ago when NCSA made the bid, accelerators were very exotic technology. The move toward GPUs was actually at the behest of scientists who now see a way forward to speed up their codes with accelerators. Technology shifts and we adapt.

          • by gentryx ( 759438 ) *

            This article [hpcwire.com] explains that five years ago when NCSA made the bid, accelerators were very exotic technology. The move toward GPUs was actually at the behest of scientists who now see a way forward to speed up their codes with accelerators. Technology shifts and we adapt.

            If they are so willing to adapt, why weren't they willing to accommodate IBM's change requests? It's not like IBM was totally unwilling to build a $200 million machine.

            None? I know of several. It's all still in its infancy of course, but I'm convinced it's possible to get good speedup from GPUs on real science codes. It's not applicable to everything, but then that's why they aren't CPUs.

            I was referring to annotations for GPU offloading. Codes that run on GPUs are in fact so common nowadays that you'll be asked at conferences why you didn't try CUDA if you present any performance measurement sans GPU benchmarks. :-)

            • If they are so willing to adapt, why weren't they willing to accommodate IBM's change requests?

              I don't have any knowledge of what those change requests were, so I don't know the answer. Everything I have read indicates that IBM wanted too much money.

              It's not like IBM was totally unwilling to build a $200 million machine.

              From what I have read, it seems that they were. They couldn't keep their costs low enough to justify the expense.

              I was referring to annotations for GPU offloading. Codes that run on GPUs are in fact so common nowadays that you'll be asked at conferences why you didn't try CUDA if you present any performance measurement sans GPU benchmarks.

              Ah, I misunderstood. I don't think directives have been around all that long (PGI's earlier directives and CAPS's directives come to mind) and they certainly weren't standardized. OpenACC, like OpenMP, should allow scientists to write more portable accelerator-enabled code. In fact the OpenACC stuff came out of the OpenMP accelerator committee as explained here [cray.com]. I think it's highly likely some version of it will be incorporated into OpenMP.

              • by gentryx ( 759438 ) *

                I don't have any knowledge of what those change requests were, so I don't know the answer. Everything I have read indicates that IBM wanted too much money.

                From what I have read, it seems that they were. They couldn't keep their costs low enough to justify the expense.

                True, but only because of the strict requirements of NCSA. If they had been willing to change them, a BlueGene/Q would have been viable.

                Ah, I misunderstood. I don't think directives have been around all that long (PGI's earlier directives and CAPS's directives come to mind) and they certainly weren't standardized. OpenACC, like OpenMP, should allow scientists to write more portable accelerator-enabled code. In fact the OpenACC stuff came out of the OpenMP accelerator committee as explained here [cray.com]. I think it's highly likely some version of it will be incorporated into OpenMP.

                The reason why I'm so allergic to annotation-based parallelization is the experience folks had with OpenMP. The common fallacy about OpenMP is that it is sufficient to place a "#pragma omp parallel for" in front of your inner loops and *poof* your performance goes up. But in reality your performance may very well go down, unless your code is embarrassingly parallel. In
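                A concrete instance of that fallacy, as a toy sketch (not taken from any real code): the bare pragma would be wrong, and even the corrected version can lose to the serial loop when the work per iteration is this cheap:

                    #include <stdio.h>

                    /* Dot product parallelized with OpenMP. Without the reduction clause,
                       "#pragma omp parallel for" alone would let every thread update sum
                       concurrently -- a data race. And even with the fix, for small n the
                       parallel version can be slower than the serial loop, because thread
                       startup and synchronization cost more than the arithmetic. */
                    double dot(const double *x, const double *y, int n) {
                        double sum = 0.0;
                    #pragma omp parallel for reduction(+:sum)
                        for (int i = 0; i < n; i++)
                            sum += x[i] * y[i];
                        return sum;
                    }

                    int main(void) {
                        double x[4] = {1, 2, 3, 4}, y[4] = {4, 3, 2, 1};
                        printf("%f\n", dot(x, y, 4));   /* prints 20.000000 */
                        return 0;
                    }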

    • The bulk of the new Blue Waters will be pure Opteron nodes, with only 35 of the 270-ish cabinets using GPUs. They obviously are assuming that most users will be running with only x86 cores. They ordered a few dozen cabs of GPUs, probably with an eye to where the industry will be heading over the lifetime of the machine, not where the users are today.

      It's true that Interlagos cores are a poor competitor to POWER7 core for core. However, they fare much better if you use the entire module. Think of Interlagos as

  • IBM unleashes the evil twin of Watson, for some real jeopardy.

    • Pfft. The evil twin Watson will be easily recognizable and easy to kill. First it will be the one with the moustache. And to kill it, just shoot the monitor.
  • Intel is better than AMD.

    Sandy Bridge would crush anything AMD has.
