Jaguar Supercomputer Being Upgraded To Regain Fastest Cluster Crown
MrSeb writes with an article in ExtremeTech about the Titan supercomputer. From the article: "Cray, AMD, Nvidia, and the Department of Energy have announced that the Oak Ridge National Laboratory's Jaguar supercomputer will soon be upgraded to yet again become the fastest HPC installation in the world. The new, mighty-morphing computer will feature thousands of Cray XK6 blades, each one accommodating up to four 16-core AMD Opteron 6200 (Interlagos) chips and four Nvidia Tesla 20-series GPGPU coprocessors. The Jaguar name will be suitably inflated, too: the new behemoth will be called Titan. The exact specs of Titan haven't been revealed, but the Jaguar supercomputer currently sports 200 cabinets of Cray XT5 blades — and each cabinet, in theory, can be upgraded to hold 24 XK6 blades. That's a total of 4,800 servers, or 38,400 processors in total: 19,200 Opteron 6200s and 19,200 Tesla GPUs. ... That's 307,200 CPU cores — and with 512 shaders in each Tesla chip, that's 9,830,400 compute units. In other words, Titan should be capable of massive parallelism of more than one million concurrent operations. When the upgrade is complete, towards the end of 2012, Titan will be capable of between 10 and 20 petaflops, and should recapture the crown of Fastest Supercomputer in the World from the Japanese 'K' computer."
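The summary's arithmetic checks out; here's a quick sketch (all figures are taken from the article's "in theory" numbers, not official Cray specs):

```python
# Back-of-the-envelope check of the figures quoted in the summary above.
cabinets = 200
blades_per_cabinet = 24   # XK6 blades per cabinet, per the article
cpus_per_blade = 4        # Opteron 6200 (Interlagos) chips
gpus_per_blade = 4        # Tesla 20-series GPGPUs
cores_per_cpu = 16
shaders_per_gpu = 512

blades = cabinets * blades_per_cabinet
cpus = blades * cpus_per_blade
gpus = blades * gpus_per_blade
cpu_cores = cpus * cores_per_cpu
compute_units = gpus * shaders_per_gpu

print(blades)          # 4800 servers
print(cpus + gpus)     # 38400 processors
print(cpu_cores)       # 307200 CPU cores
print(compute_units)   # 9830400 GPU shaders
```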
DOE projects (Score:1)
Makes me wonder what the DOE will run on that computer? Kind of makes another game of (insert your favorite FPS here) seem quaint.
Re: (Score:2)
They'll do nuclear weapon simulations as usual.
Re: (Score:2)
More like "Hello Multiverse"
Re: (Score:2)
No idea, but that kind of monster is getting very close to the compute power needed to do full-blown raytracing (not shading sold as raytracing) for 3D movies in real time. As in, a cinema with one of these would only need the basic RenderMan files plus a soundtrack.
If the US had shown any interest whatsoever in ITER (fat chance, after they lost the bidding war on who was going to house it), I'd say that a supercomputer on this kind of scale would be adequate for simulating the dynamics inside the fusion reactor.
Re: (Score:2)
"Hey guys, what do you want to do today.. analyze the deep mysteries of the universe, pushing science and technology 50 years further beyond what we were expecting to be able to do anytime soon, or watch Kung Fu Panda?"
"Quit joking around Tim - run Skynet.sh to hack into Pixar again and download Toy Story!"
Re: (Score:2)
The U.S. is funding efforts that might actually produce sustainable fusion, such as the Bussard Polywell; not the money sewer that is ITER, which will not produce anything useful for decades even considering follow-on machine designs. Spend smarter, not bigger.
Re:DOE projects (Score:4, Informative)
The DOE categorizes their supercomputers into capacity machines and capability machines.
The capacity machines are the workhorses - time-shared between lots of users running a variety of applications (including material science, life science, nuclear simulation, etc.). They spend pretty much their entire lives near maximum utilization.
The capability machines are the really big ones (Jaguar, Roadrunner, etc.) that are large enough to permit applications that are too big (require too much RAM or have absurdly long running times) to run on most systems. (Capability machines are also quite difficult to administer, because none of the software they run has ever been tested at those scales.)
Yeah, but how fast does it (Score:2, Funny)
Re: (Score:2)
What, no "Beowulf cluster" joke yet?
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
It'll be done compiling any moment now... Just making sure the funny is fully optimized.
Re: (Score:1)
Re: (Score:2)
play Quake?
A good question. I ran the numbers, and the answer is FUCKING quickly.
Re: (Score:1)
"Titan"? (Score:1)
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
The only thing that concerns me about the name Titan is what happens when they need another name, some years from now, to indicate it's even faster.
Well, Olympian would be the obvious choice for a successor to Titan.
Re: (Score:2)
Just like the missile, you name it "Titan II" (aka "The Quickening" and/or "Electric Boogaloo")
Re: (Score:3)
Will it run Crysis? (Score:1)
Will it run Crysis...on Flash?
Re: (Score:2)
Crysis on Flash on Safari on Wine on Linux on Javascript on Internet Explorer on Vista
Now we've got a proper benchmark!
Re: (Score:2)
No, but I think it will be able to run Vista.
Re: (Score:2)
Yes, but all people are rendered as instances of Kim Kardashian to save time. She funds a few more processors if it uses her instead of stick figures.
It also replaces billboards with ads for her latest perfume, Ass ("It's What I Use(TM)") and runs at half frame rate with a two-hour delay on the US West Coast (because it costs more to send the frames to Oregon for some reason).
If it doesn't get outrun by Blue Gene/Q... (Score:2)
LLNL will receive their 20 PF machine, dubbed Sequoia [wikipedia.org], later this year. IBM's Blue Genes are known for their good ratio of CPU performance to network performance, which allows MPI codes to scale well. The same is true for vanilla Cray XT5 and XE6 machines, but if upgraded with GPUs, each node receives a significant boost in computational power without a matching increase in network performance. This leaves the individual nodes bandwidth-starved and makes it next to impossible to achieve peak performance in production.
Re: (Score:2)
No real application gets close to peak performance on such a supercomputer.
And if you're limited by I/O, it just means you don't have a big enough workload. I/O and computation time should be overlapped so that you only pay for latency.
Re: (Score:1)
Re: (Score:2)
I wasn't talking about disk I/O, but network I/O.
Stencils are computed by replicating the borders during tiling. Doing a communication every time a node needs to access that zone is doing it wrong.
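The border-replication ("ghost cell") idea the parent describes can be sketched with a toy 1-D stencil; the tile sizes, data, and averaging kernel here are made up for illustration:

```python
import numpy as np

# Toy 1-D three-point stencil over two tiles, one ghost cell per side.
field = np.arange(16, dtype=float)
tile_a, tile_b = field[:8], field[8:]

# Replicate the borders ("ghost"/"halo" cells) so each tile can be
# updated independently; only these cells need to cross the network.
padded_a = np.concatenate(([field[0]], tile_a, [field[8]]))   # left edge clamped
padded_b = np.concatenate(([field[7]], tile_b, [field[15]]))  # right edge clamped

def step(p):
    # Simple averaging stencil applied to the interior points.
    return (p[:-2] + p[1:-1] + p[2:]) / 3.0

new = np.concatenate((step(padded_a), step(padded_b)))

# Same result as running the stencil over the whole array at once:
whole = np.concatenate(([field[0]], field, [field[15]]))
assert np.allclose(new, step(whole))
```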
Re: (Score:1)
Yes. And in every time step you'll have to sync the ghost zones (or halos). If the time step is computed faster (because your CPU/GPU has been upgraded) and your network hasn't been equally accelerated, then at some point you'll be bandwidth-limited. Overlapping calculation and communication only hides communication time if t_send = t_latency + size / bandwidth is smaller than t_compute. Speeding up the CPU/GPU reduces t_compute; t_send remains constant.
That said, of course there are al...
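Putting rough numbers on the t_send vs. t_compute argument above (the latencies, message sizes, and flop rates below are invented for illustration, not Titan's actual figures):

```python
def t_send(latency_s, message_bytes, bandwidth_Bps):
    # Time to exchange halos: latency plus size over bandwidth.
    return latency_s + message_bytes / bandwidth_Bps

def t_compute(flops_per_step, flop_rate):
    # Time to compute one time step.
    return flops_per_step / flop_rate

# Hypothetical node, before and after a GPU upgrade.
latency = 2e-6        # 2 microsecond network latency
halo = 8 * 1024**2    # 8 MB of ghost-zone data per step
bandwidth = 5e9       # 5 GB/s interconnect (unchanged by the upgrade)

send = t_send(latency, halo, bandwidth)
cpu_step = t_compute(1e9, 1e11)   # 1 Gflop step on a 100 Gflop/s CPU node
gpu_step = t_compute(1e9, 1e12)   # same step on a 1 Tflop/s GPU node

# Overlap hides communication only while t_send <= t_compute:
print(send <= cpu_step)   # True: the CPU node stays compute-bound
print(send <= gpu_step)   # False: the GPU node is now bandwidth-bound
```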
Justification. (Score:2)
"...being upgraded to regain fastest cluster crown"
"...a total of 4,800 servers, or 38,400 processors in total; 19,200 Opteron 6200s, and 19,200 Tesla GPUs..."
Gee, that sounds cheap. And all for a "crown"...Hell of a financial justification there...especially considering by the time they upgrade the last rack of blades, someone else will have plans for a bigger, faster one.
Re: (Score:2)
Gee, that sounds cheap. And all for a "crown"...Hell of a financial justification there...especially considering by the time they upgrade the last rack of blades, someone else will have plans for a bigger, faster one.
Welcome to the wonderful world of technological progress, brought to you today by the Law of Moore!
Re: (Score:2)
Re: (Score:2)
Regaining the #1 spot on the Top500 is merely a convenient side effect. The real reason for building these machines is that they are a key enabler for numerous science projects, ranging from astrophysics to climate modeling to atomic phasefield simulation of crystal growth [supercomputing.org]. This type of research can only be done on machines offering petabytes of RAM and petaflops of performance. They cost hundreds of millions to build and operate. And if you can cut that cost to a fraction by reusing Jaguar's existing housing, cooling, and networking facilities, then this is financially a very clever move.
I would be willing to agree with you 110% on your justification model here, save for one tiny little thing. We also happen to hold the #1 spot in the world regarding debt, which tends to call into question the overall benefit of building systems that cost "hundreds of millions to build and operate", regardless of the use.
I DO understand and respect the advances we've made and will continue to make in science due to these types of systems, but the title of this article tends to scream a childish "king of the hill" mentality.
Re: (Score:2)
I would be willing to agree with you 110% on your justification model here, save for one tiny little thing. We also happen to hold the #1 spot in the world regarding debt, which tends to call into question the overall benefit of building systems that cost "hundreds of millions to build and operate", regardless of the use.
Yeah! I'm hugely in debt because I over-bought on my summer home, so I'd better stop paying for the bus tokens I use to get to work! That'll really help my situation!
Hundreds of millions is nothing compared to the debt problem at issue; supercomputer purchasing is not a significant contributor to that issue. Meanwhile, scientific advances that can only be achieved via such computers can provide returns far above what was spent, and do orders of magnitude more to resolve our debt issues than they contribute to them.
Re: (Score:2)
...Being in debt doesn't mean you stop spending money. It means you have to spend money wiser -- on things that can provide dividends. This is a perfect example of the kind of spending that should not be stopped exactly because of the debt problem.
And as I said, I agree completely on the purpose and value of these systems. Now, how about taking a look at some statistical analysis from both before and after the top500.org website was established, and validating that overall spending on these massive supercomputers, as well as average upgrade timelines, is NOT somehow tied to the "king of the hill" theory.
Let's not be ignorant and think that greed and corruption somehow cannot infiltrate the organizations building these systems. I agree that their dividends are real.
Re: (Score:2)
And as I said, I agree completely on the purpose and value of these systems. Now, how about taking a look at some statistical analysis from both before and after the top500.org website was established, and validating that overall spending on these massive supercomputers, as well as average upgrade timelines, is NOT somehow tied to the "king of the hill" theory.
So what if the keeping of a Top500 list of supercomputers spurs governments and other organizations to compete to produce the most powerful computers? The problems these supercomputers solve are ones where you can always throw more computational power at them to get better answers. You agree that this purpose is worthwhile.
I go back to what I said before. This is a tiny portion of the budget, a tiny portion of our debt problem. It's something that pays dividends. It's part of our spending that should continue.
2012 will be a big year for supercomputers. (Score:3)
Titan will be a hugely powerful computer. However, the title of fastest supercomputer might be just out of reach. 2012 is also the year that Lawrence Livermore National Laboratory, also part of the Department of Energy, plans to unveil its 20-petaflop BlueGene/Q computer, named Sequoia. [http://www.hpcwire.com/hpcwire/2009-02-03/lawrence_livermore_prepares_for_20_petaflop_blue_gene_q.html]
That said, Sequoia will be a classified system for nuclear stockpile simulations. Titan will be a comparatively open system for wide-ranging scientific discovery: government, academic, and industrial.
Re: (Score:1)
Re: (Score:2)
Not to mention the likely arrival of Sandy Bridge systems and Kepler and Southern Islands GPUs, providing competitors that aren't big in the news at the moment.
By the end of 2012, even the optimistic end of the scale at 20 petaflops probably won't be #1. If #1 is a BlueGene, it will probably use less electricity than Titan while still getting higher numbers, though either way programmers will have a bit of a struggle compared with big traditional CPU-based systems.
Re: (Score:2)
Damnit, I just realized I wasn't logged in... oh well.
Re: (Score:2)
"Within 5 to 10 years", of course.
Re: (Score:1)
Re: (Score:2)
1M instances...
Great, but... (Score:1)
Yeah! (Score:3)
Go Atari!
Re: (Score:2)
You jest, but Atari really was working on a deskside "supercomputer"...
http://en.wikipedia.org/wiki/Atari_Abaq [wikipedia.org]
AKA the Atari ATW800, based on INMOS Transputer RISC CPUs. Very cool; I really wanted one when I was a kid. I figured they were vapor until, years later, a few started popping up here and there on eBay.
Never had a Jaguar, but I always wanted one. I grew up with the Atari XL/XE series and the ST, though.
Wait.... (Score:1)
Jaguar (Score:4, Funny)
I would have thought that the world's fastest cluster would run something a little more modern than OS X 10.2.
Naming... (Score:2)
All kidding aside for Crysis and Quake (Score:1)
What kind of gaming server WOULD this kind of processing power allow? Imagine the AIs, physics, and real-time geographical updates something like this could support. I know that EVE has a rather massive server and database for their universe but this should kick that server's behind.
Communication with clients would likely be the main bottleneck in keeping up with all this, but imagine some nifty in-house gaming consoles.
I was mistaken... (Score:2)
I was expecting a supercomputer made out of a beowulf cluster of Atari Jaguars.
Slickdeal? (Score:1)
That's a total of 4,800 servers, or 38,400 processors in total; 19,200 Opteron 6200s, and 19,200 Tesla GPUs. ... that's 307,200 CPU cores — and with 512 shaders in each Tesla chip that's 9,830,400 compute units.
Fail. No HDMI. :)
Re: (Score:1)
http://www.nccs.gov/computing-resources/jaguar/
Cray Linux, which as I understand it is a derivative of SUSE
LFTR (Score:2)
Mandatory LFTR comment here. Yes, it would be nice if this fancy tool were used in the effort to solve problems with Liquid Fluoride Thorium Reactors, such as cutting down on tritium production, but ORNL doesn't seem to be interested in its brainchild any more. Perhaps some funding will be found in the future for the creation of modeling software and the purchase of CPU time on this Jaguar or the next. Is Steven Chu the bottleneck?
Difference between a supercomputer and a cluster? (Score:2)
Seems to me this supposed supercomputer is just a cluster of quad-processor motherboards with coprocessors.
And why isn't it liquid-cooled in refined mineral oil, with the cabinets on their backs? That cuts the power requirements for cooling and makes the boards as a whole more reliable. You don't have to worry about cooling fans going down, and you can have redundant heat-exchanging pumps.