Supercomputing Hardware

Jaguar Supercomputer Being Upgraded To Regain Fastest Cluster Crown

MrSeb writes with an article in Extreme Tech about the Titan supercomputer. From the article: "Cray, AMD, Nvidia, and the Department of Energy have announced that the Oak Ridge National Laboratory's Jaguar supercomputer will soon be upgraded to yet again become the fastest HPC installation in the world. The new, mighty-morphing computer will feature thousands of Cray XK6 blades, each one accommodating up to four 16-core AMD Opteron 6200 (Interlagos) chips and four Nvidia Tesla 20-series GPGPU coprocessors. The Jaguar name will be suitably inflated, too: the new behemoth will be called Titan. The exact specs of Titan haven't been revealed, but the Jaguar supercomputer currently sports 200 cabinets of Cray XT5 blades, and each cabinet, in theory, can be upgraded to hold 24 XK6 blades. That's a total of 4,800 servers, or 38,400 processors in total; 19,200 Opteron 6200s, and 19,200 Tesla GPUs. ... that's 307,200 CPU cores, and with 512 shaders in each Tesla chip that's 9,830,400 compute units. In other words, Titan should be capable of massive parallelism of more than one million concurrent operations. When the server is complete, towards the end of 2012, Titan will be capable of between 10 and 20 petaflops, and should recapture the crown of Fastest Supercomputer in the World from the Japanese 'K' computer."
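For anyone who wants to check the arithmetic in the summary, here is a quick sketch. The constant names are made up for illustration; the numbers are the ones quoted above, and they assume every one of the 200 cabinets is fully populated, which the article itself calls theoretical.

```python
# Back-of-the-envelope check of the figures quoted in the summary.
# Assumes all 200 cabinets are fully populated with XK6 blades.
CABINETS = 200
BLADES_PER_CABINET = 24
OPTERONS_PER_BLADE = 4
TESLAS_PER_BLADE = 4
CORES_PER_OPTERON = 16
SHADERS_PER_TESLA = 512

blades = CABINETS * BLADES_PER_CABINET        # "servers" in the summary
opterons = blades * OPTERONS_PER_BLADE
teslas = blades * TESLAS_PER_BLADE
processors = opterons + teslas                # CPUs plus GPUs
cpu_cores = opterons * CORES_PER_OPTERON
gpu_shaders = teslas * SHADERS_PER_TESLA

print(blades, processors, cpu_cores, gpu_shaders)
```

The totals agree with the article's 4,800 servers, 38,400 processors, 307,200 CPU cores and 9,830,400 shaders.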
This discussion has been archived. No new comments can be posted.

  • Makes me wonder what the DOE will run on that computer? Kind of makes another game of (insert your favorite FPS here) seem quaint.

    • by Surt ( 22457 )

      They'll do nuclear weapon simulations as usual.

    • by jd ( 1658 )

      No idea, but that kind of monster is getting very close to the compute power needed to do full-blown raytracing (not shading sold as raytracing) for 3D movies in real time. As in, a cinema with one of these would only need the basic RenderMan files plus the soundtrack.

      If the US had shown any interest whatsoever in ITER (fat chance, after they lost the bidding war on who was going to house it), I'd say that a supercomputer on this kind of scale would be adequate for simulating the dynamics inside the fusion reac

      • "Hey guys, what do you want to do today.. analyze the deep mysteries of the universe, pushing science and technology 50 years further beyond what we were expecting to be able to do anytime soon, or watch Kung Fu Panda?"

        "Quit joking around Tim - run Skynet.sh to hack into Pixar again and download Toy Story!"

      • U.S. is funding efforts that might actually produce sustainable fusion, such as Bussard Polywell; not the money sewer that is ITER, which will not produce anything useful for decades even considering follow-on machine designs. Spend smarter, not bigger.

    • Re:DOE projects (Score:4, Informative)

      by Raul654 ( 453029 ) on Tuesday October 11, 2011 @02:14PM (#37682356) Homepage

      The DOE categorizes their supercomputers into capacity machines and capability machines.

      The capacity machines are the workhorses: time-shared between lots of users running a variety of applications (including material science, life science, nuclear simulation, etc). They spend pretty much their entire lives near maximum utilization.

      The capability machines are the really big ones (Jaguar, Roadrunner, etc) that are big enough to permit applications that are too large (require too much RAM or have absurdly long running times) to run on most systems. (Capability machines are also quite difficult to administer because none of the software they run has ever been tested at those scales.)

  • Would be more interesting if they named it "Joshua".
    • WOPR would be just as good. The only thing that concerns me about the name Titan is what happens when they need another name some years from now to indicate it's even faster. Maybe they can go on to Gargantuan and then Colossus, but I think after that they're out of names to get bigger with!
      • Super Titan
      • The only thing that concerns me about the name Titan is what happens when they need another name some years from now to indicate it's even faster.

        Well, Olympian would be the obvious choice for a successor to Titan.

      • Just like the missile, you name it "Titan II" (aka "The Quickening" and/or "Electric Boogaloo")

      • by afidel ( 530433 )
        I don't think they'll be using Colossus since the British WWII cryptographers used it first =)
  • Will it run Crysis...on Flash?

  • LLNL will receive their 20 PF machine dubbed Sequoia [wikipedia.org] later this year. IBM's Blue Genes are known for their good ratio of CPU performance/network performance. This allows the MPI codes to scale well. The same is true for vanilla Cray XT5 and XE6 machines, but if upgraded with GPUs then each node receives a significant boost in computational power without increasing the network performance. This leaves the individual nodes bandwidth starved and makes it next to impossible to achieve peak performance in produc

    • No real application gets close to peak performance on such a supercomputer.

      And if you're limited by I/O, it just means you don't have a big enough workload. I/O and computation time should be overlapped so that you only pay for latency.

      • by gentryx ( 759438 ) *
        Who's talking about disk I/O? I'm talking about network bandwidth, which is required for synchronization, e.g. to update ghost zones in stencil codes [wikipedia.org]. The required bandwidth is proportional to the computational power of the nodes. Latency can actually be hidden, too, by overlapping computation and communication -- at least for the aforementioned stencil codes, which represent the largest fraction of simulation codes out there. That said, disk I/O is still vital [lbl.gov], at least if you want to actually see what
        • I wasn't talking about disk I/O, but network I/O.
          Stencils are computed by replicating the borders during tiling. Doing a communication every time a node needs to access that zone is doing it wrong.
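To make the ghost zone idea concrete, here is a minimal single-process sketch. It is not any poster's actual code: the function names, the 1D three-point averaging stencil and the two-tile split are all assumptions chosen for illustration. Each tile keeps a copy of its neighbour's border cell and refreshes it once per time step, rather than asking for it on every access.

```python
def stencil_step(cells):
    """One three-point averaging step; the two end cells are left unchanged
    (they act as fixed boundaries or as ghost cells)."""
    inner = [(cells[i - 1] + cells[i] + cells[i + 1]) / 3.0
             for i in range(1, len(cells) - 1)]
    return [cells[0]] + inner + [cells[-1]]


def run_global(grid, steps):
    """Reference run on the undivided grid."""
    for _ in range(steps):
        grid = stencil_step(grid)
    return grid


def run_tiled(grid, steps, split):
    """Same computation on two tiles, each with one replicated ghost cell."""
    left = grid[:split + 1]    # owns grid[1:split]; last cell is a ghost
    right = grid[split - 1:]   # owns grid[split:-1]; first cell is a ghost
    for _ in range(steps):
        # Halo exchange: one bulk update of the ghost cells per time step.
        left[-1] = right[1]
        right[0] = left[-2]
        left = stencil_step(left)
        right = stencil_step(right)
    # Drop the ghost cells when stitching the tiles back together.
    return left[:-1] + right[1:]


grid = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
print(run_tiled(grid, 5, 4) == run_global(grid, 5))
```

On a real cluster the two assignments in the exchange would be messages over the interconnect (for example via MPI), which is where the bandwidth cost comes from.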

          • by gentryx ( 759438 ) *

            Yes. And in every time step you'll have to sync the ghost zones (or halos). And if the time step is computed faster (because your CPU/GPU has been upgraded), and your network hasn't been equally accelerated, then, at some point, you'll be bandwidth limited. Overlapping of calculation and communication only hides communication time if t_send = (t_latency + size / bandwidth) is smaller than t_compute. Speeding up the CPU/GPU will reduce t_compute. t_send will remain constant.

            That said, of course there are al
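The overlap argument above can be put into a toy cost model. This is only a sketch using the formula from the post, t_send = t_latency + size / bandwidth. The function names and the example figures (2 µs latency, an 8 MB halo, a 5 GB/s link, a 10x faster node) are hypothetical and not measurements of Jaguar or Titan.

```python
def send_time(latency_s, message_bytes, bandwidth_bytes_per_s):
    """t_send = t_latency + size / bandwidth, as in the post above."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def step_time(t_compute, t_send):
    """Time per step when communication is perfectly overlapped with
    computation: the slower of the two determines the step time."""
    return max(t_compute, t_send)


def is_bandwidth_limited(t_compute, t_send):
    """True once the halo exchange can no longer hide behind the compute."""
    return t_send > t_compute


# Hypothetical numbers, for illustration only.
t_send = send_time(2e-6, 8e6, 5e9)
t_compute_old = 4e-3                 # before the node upgrade
t_compute_new = t_compute_old / 10   # after a 10x faster CPU/GPU

print(is_bandwidth_limited(t_compute_old, t_send))   # communication hidden?
print(is_bandwidth_limited(t_compute_new, t_send))   # network now the limit?
print(step_time(t_compute_old, t_send) / step_time(t_compute_new, t_send))
```

With these made-up figures the node gets ten times faster but the time per step improves far less, because t_send stays constant while t_compute shrinks.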

  • "...being upgraded to regain fastest cluster crown"

    "...a total of 4,800 servers, or 38,400 processors in total; 19,200 Opterons 6200s, and 19,200 Tesla GPUs..."

    Gee, that sounds cheap. And all for a "crown"...Hell of a financial justification there...especially considering by the time they upgrade the last rack of blades, someone else will have plans for a bigger, faster one.

    • Gee, that sounds cheap. And all for a "crown"...Hell of a financial justification there...especially considering by the time they upgrade the last rack of blades, someone else will have plans for a bigger, faster one.

      Welcome to the wonderful world of technological progress, brought to you today by the Law of Moore!

    • by gentryx ( 759438 ) *
      Regaining the #1 spot on the Top500 is merely a convenient side effect. The real reason for building these machines is that they are a key enabler for numerous science projects ranging from astrophysics to climate modeling to atomic phasefield simulation of crystal growth [supercomputing.org]. This type of research can only be done on machines which offer Petabytes of RAM and Petaflops of performance. They cost hundreds of millions to build and operate. And if you can cut this cost down to a fraction by reusing Jaguar's existin
      • Regaining the #1 spot on the Top500 is merely a convenient side effect. The real reason for building these machines is that they are a key enabler for numerous science projects ranging from astrophysics to climate modeling to atomic phasefield simulation of crystal growth [supercomputing.org]. This type of research can only be done on machines which offer Petabytes of RAM and Petaflops of performance. They cost hundreds of millions to build and operate. And if you can cut this cost down to a fraction by reusing Jaguar's existing housing, cooling and networking facilities, then this is financially a very clever move.

        I would be willing to agree with you 110% here on your justification model here, save for one tiny little thing. We also happen to hold the #1 spot in the world regarding debt, which tends to question the overall benefit of building systems that cost "hundreds of millions to build and operate", regardless of the use.

        I DO understand and respect the advances we've made and will continue to make in science due to these types of systems, but the title of this article tends to scream a childish "king of the hil

        • I would be willing to agree with you 110% here on your justification model here, save for one tiny little thing. We also happen to hold the #1 spot in the world regarding debt, which tends to question the overall benefit of building systems that cost "hundreds of millions to build and operate", regardless of the use.

          Yeah! I'm hugely in debt because I over-bought on my summer home, so I'd better stop paying for the bus tokens I use to get to work! That'll really help my situation!

          Hundreds of millions is nothing compared to the debt problem at issue. Supercomputer purchasing is not a significant contributor to that issue; however, scientific advances that can only be achieved via such computers can provide returns far above what was spent and do orders of magnitude more to resolve our debt issues than it is contributin

          • ...Being in debt doesn't mean you stop spending money. It means you have to spend money wiser -- on things that can provide dividends. This is a perfect example of the kind of spending that should not be stopped exactly because of the debt problem.

            And as I said, I agree completely on the purpose and validation of these systems. Now, how about taking a look at some statistical analysis both before and after the top500.org website was established, and validate that overall spending on these massive supercomputers, as well as average upgrade timelines, is NOT somehow tied to the "king of the hill" theory.

            Let's not be ignorant and think that greed and corruption somehow cannot infiltrate organizations building these systems. I agree that their dividend

            • And as I said, I agree completely on the purpose and validation of these systems. Now, how about taking a look at some statistical analysis both before and after the top500.org website was established, and validate that overall spending on these massive supercomputers, as well as average upgrade timelines, is NOT somehow tied to the "king of the hill" theory.

              So what if the keeping of a Top 500 list of supercomputers spurs governments and other organizations to compete to produce the most powerful computers? The problems these supercomputers are solving are problems where you can always throw more computational power at them to get better answers. You agree that this purpose is worthwhile.

              I go back to what I said before. This is a tiny portion of the budget, a tiny portion of our debt problem. It's something that pays dividends. It's part of our spending

  • by flaming-opus ( 8186 ) on Tuesday October 11, 2011 @02:20PM (#37682440)

    Titan will be a hugely powerful computer. However, the title of fastest supercomputer might be just out of reach. 2012 is also the year that Lawrence Livermore National Laboratory, also part of the Department of Energy, is planning to unveil its 20 petaflop BlueGene/Q computer named Sequoia. [http://www.hpcwire.com/hpcwire/2009-02-03/lawrence_livermore_prepares_for_20_petaflop_blue_gene_q.html]

    That said, Sequoia will be a classified system for nuclear stockpile simulations. Titan will be a comparatively open system for wide-ranging scientific discovery: government, academic, and industrial.

    • by gentryx ( 759438 ) *
      Not just Sequoia, but also Mira [hpcwire.com] (10 PFLOPS BG/Q @Argonne), Hermit [inside.hlrs.de] (4-5 PFLOPS Cray XE6 @HLRS, Germany) and... well Blue Waters seems to have silted up [slashdot.org].
      • by Junta ( 36770 )

        Not to mention the likelihood of the arrival of Sandy Bridge systems, Kepler and Southern Islands GPUs providing competitors that aren't big in the news at the moment.

        By the end of 2012, even the optimistic end of the scale at 20 petaflops probably won't be #1. If #1 is a BlueGene, it will probably use less electricity than Titan while still getting higher numbers, though either way programmers will have a bit of a struggle vs. big traditional CPU based systems.

  • Can I play Iron Soldier 2 on it?
  • by Yvan256 ( 722131 ) on Tuesday October 11, 2011 @02:23PM (#37682486) Homepage Journal

    Go Atari!

    • by ogdenk ( 712300 )

      You jest but Atari really was working on a deskside "supercomputer"....

      http://en.wikipedia.org/wiki/Atari_Abaq [wikipedia.org]

      AKA The Atari ATW800.... based on INMOS Transputer RISC CPUs. Very cool, really wanted one when I was a kid. Figured they were vapor until years later when a few started popping up here and there on eBay.

      Never had a Jaguar but always wanted one. I grew up with the Atari XL/XE-series and the ST though.

  • inb4 "S" Jokes next up the launch of the Japanese "K S" computer.
  • Jaguar (Score:4, Funny)

    by Ukab the Great ( 87152 ) on Tuesday October 11, 2011 @02:26PM (#37682528)

    I would have thought that the world's fastest cluster would be something a little more modern than OS X 10.2.

  • So the system called Jaguar was powered by multiple XK6 engines. If they're upgrading it to take the title of world's fastest, they should have stuck with that pattern and called the new system the XJ220 [wikipedia.org].
  • What kind of gaming server WOULD this kind of processing power allow? Imagine the AIs, physics, and real-time geographical updates something like this could support. I know that EVE has a rather massive server and database for their universe but this should kick that server's behind.

    Communications with clients would likely be the main bottleneck to keep up with all this but imagine some nifty in-house gaming consoles.

  • I was expecting a supercomputer made out of a beowulf cluster of Atari Jaguars.

  • That's a total of 4,800 servers, or 38,400 processors in total; 19,200 Opterons 6200s, and 19,200 Tesla GPUs. ... that's 307,200 CPU cores — and with 512 shaders in each Tesla chip that's 9,830,400 compute units.

    Fail. No HDMI. :)

  • by drwho ( 4190 )

    Mandatory LFTR comment here. Yes, it would be nice if this fancy tool were to be used in the effort towards solving problems with Liquid Fluoride Thorium Reactors, such as cutting down on the tritium production, but ORNL doesn't seem to be interested in its child any more. But, perhaps some funding will be found in the future for creation of modeling software and purchase of CPU time on this Jaguar or the next. Is Steven Chu the bottleneck?

  • Seems to me this supposed supercomputer is just a cluster of quad processor motherboards with coprocessors.

    And why isn't it liquid cooled, immersed in refined mineral oil in cabinets laid on their backs? It cuts the power requirements for cooling and makes the boards as a whole more reliable. You don't have to worry about cooling fans going down, and you can have redundant heat-exchanging pumps.
