Cray Supercomputers to be Based on AMD Opterons
PsychicX writes "AMD and Cray have announced an agreement to base Cray supercomputers on AMD's Opteron line until the end of the decade, and to collaborate on Cray's 2006 proposal for Phase 3 of the federal government's DARPA HPCS (High Productivity Computing Systems) program. Cray already offers the XT3 and XD1 supercomputers based on Opteron."
excellent (Score:5, Interesting)
Re:excellent (Score:3, Interesting)
Re:excellent (Score:5, Informative)
Re:excellent (Score:2)
Tera bought far more than a name when they bought us. They also bought a bunch of software and hardware people, many of whom (myself not included) have been with Cray Research (the original Cray) for many years. So, while it's certainly not the Cray of the mid-1980's, the tradition still goes back there, especially with the vector machines like the Cray X1/X1E and its impending follow-on.
You work for Cray. Cool.
Please tell me that this deal implies that AMD is going to add some proper vector instructions to
Re:excellent (Score:5, Informative)
I think you're much more likely to see the cray vector processor retooled with lots of hypertransport connections, so it can use an opteron as its scalar unit, and use the same seastar routers that the xt3 uses. On the X1, the scalar unit already runs ahead of the vector unit, so I bet it's not all that important for the scalar unit to be on-die.
Re:excellent (Score:2)
i dont know.
Re:excellent (Score:4, Informative)
No they won't! They have no reason to.
Yes, you're probably right that it doesn't make sense for AMD economically. But I want to run numerical codes at more than 5 % peak performance on my cheap Opterons, so I want to believe.
The vector units that a cray uses aren't like altivec, sse, or other "bolt-on" vector units. The vector unit on a cray (or NEC) is a latency hiding mechanism. It's a method for forcing the programmer/compiler to structure the code such that the data loaded from memory is used a significant period of time after the load is initiated.
Yes, I know. And that's precisely the reason why I'd like to see real vectors instead of the sse/altivec toy ones. Main memory latency is hundreds of cycles, and it's getting worse all the time.
Additionally, from a microarchitecture perspective, vectors have quite a few advantages there too.
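To put a rough number on the latency-hiding argument: by Little's law, the amount of data that has to be in flight to keep a memory pipe busy is latency times bandwidth. Below is a minimal Python sketch. The 100 ns latency, 8 GB/s bandwidth and 64-element vector length are illustrative assumptions, not figures for any particular Opteron or Cray part, and the function names are made up for this example.

```python
import math

def bytes_in_flight(latency_ns, bandwidth_gb_s):
    """Little's law: bytes that must be outstanding to saturate the memory pipe."""
    # 1 GB/s is 1 byte per nanosecond, so the product is already in bytes.
    return latency_ns * bandwidth_gb_s

def requests_needed(latency_ns, bandwidth_gb_s, bytes_per_request):
    """Number of independent memory requests needed to cover the latency."""
    return math.ceil(bytes_in_flight(latency_ns, bandwidth_gb_s) / bytes_per_request)

# Assumed, illustrative values (not measurements of real hardware).
LATENCY_NS = 100
BANDWIDTH_GB_S = 8
SCALAR_LOAD_BYTES = 8          # one 64-bit word per load
VECTOR_LOAD_BYTES = 64 * 8     # one hypothetical 64-element vector of 64-bit words

scalar_loads = requests_needed(LATENCY_NS, BANDWIDTH_GB_S, SCALAR_LOAD_BYTES)
vector_loads = requests_needed(LATENCY_NS, BANDWIDTH_GB_S, VECTOR_LOAD_BYTES)
print(scalar_loads, vector_loads)
```

With these assumed figures, a scalar core has to keep on the order of a hundred independent 8-byte loads outstanding, while a couple of long vector loads cover the same latency. That is the sense in which a long-vector ISA works as a latency-hiding mechanism.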
This works pretty well on the HPC code that is used on crays, but not at all for the everyday server/workstation code that opterons run.
I'm not sure about that. I guess technical apps vectorize just as well as HPC codes (well perhaps not the UI, but the code that runs the actual simulation or whatever). Heck, even some database code vectorizes nicely (sorting and hash joins).
Furthermore, to support that sort of vector unit, you need to have about eight times as much memory bandwidth as an opteron, which means many more pins on the socket, which are very expensive.
Yes, as I said some Alpha Tarantula like design is probably overkill for the vast majority of the market. My point was that a vector ISA extension with modest execution resources wouldn't need that much die area, and could help make better use of the available bandwidth, whatever that bandwidth is. As you said yourself, the expensive thing is IO. Transistors are cheap by comparison. So not having instructions that allow one to effectively use the available IO resources is a real shame.
I think you're much more likely to see the cray vector processor retooled with lots of hypertransport connections, so it can use an opteron as its scalar unit, and use the same seastar routers that the xt3 uses. On the X1, the scalar unit already runs ahead of the vector unit, so I bet it's not all that important for the scalar unit to be on-die.
Yes, that sounds feasible. IIRC it is something like this that Cray has cooked up for the Cascade project, i.e. a node consists of 8 (or was it 4) scalar processors connected to memory (I guess these could be Opterons or, further in the future, some kind of processor-in-memory (PIM) stuff), and a vector unit with its own cache and fast access to the main memory via the scalar CPUs.
As for the seastar thing, I think you're right that that's what they'll use for inter-node communication. Currently X1(E) uses Numalink licenced from SGI, so they're certainly looking at replacing that with existing in-house tech. BTW, 2H2006 will see the XT4, with the new Opteron sockets with DDR2 memory and the Seastar2 router that provides twice the BW compared to the existing Seastar.
Re:excellent (Score:2)
so its all up to you now to make it happen, you have the
company to yourself. the success or failure of which will
be entirely of your own making.
Re:excellent (Score:2, Interesting)
I know chemists who claim that there are still algorithms that don't run as well on modern MPars as they did on mid-90s vector Crays. I know we're not a huge market, but I bet there are some other fields that would rather have a deskside T90 than a multi-proc Opteron box.
Re:excellent (Score:5, Interesting)
Only the name Cray remains, not the old-time reputation.
That's not quite true; they still sell Cray-specific technology. One of my colleagues has just bought a small 24-core Opteron system. Each node contains two dual-core processors, and the 6 nodes are linked together by 80 Gb/s Craylink cables. I think this interlink technology is also licensed to SGI for use in their Origin computers.
Re:excellent (Score:2, Funny)
Small? Small? You've lost your sense of scale with all the weather modeling and nuclear RSA NP-hard protein folding simulations you've been running.
Re:excellent (Score:2, Informative)
Which they call NUMALink (for Non-Uniform Memory Access --a node obviously accesses its memory faster than through the interconnection), and they're used in their Altix (Linux on Itanium2) computers too; I don't know how much they have evolved the technology, I just know that it's 3.2GiB/s each direction.
NUMALink [sgi.com]
Have you got any links to some page describing some configuration simi
Re:excellent (Score:5, Informative)
Craylink was designed at SGI, and renamed to craylink after they bought Cray. They introduced craylink in the origin2000, which they started selling half a year after buying cray, so I'm sure they couldn't have integrated any cray-designs into their product in that span.
After they sold Cray to Tera, SGI started calling the technology Numalink, and currently use it in their origin3, altix3, and altix4 product lines. They are on the 4th generation of the technology, which is 3.2GB/s per direction. The cray that was sold to Tera included the half-finished X1 system, which also uses numalink. It uses the older 1.6GBps/dir links, but uses 32 networks in parallel for a total of ~50GB/s/dir per node.
The Cray XT3 uses a newer network interconnect called seastar, which offers 3.8GBps/direction. This is probably what will be used in the X1's successor.
The Cray XD1, which your colleague bought, is a product cray acquired when they bought OctigaBay. It uses an interconnect called the RapidArray switch, which provides 4GBps/direction of interconnect.
All of these interconnects are high-bandwidth and low latency. The XD1 is also very inexpensive for a cray, which is always nice.
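For anyone who wants to check the arithmetic, here is a small Python sketch that collects the per-direction figures quoted above and recomputes the X1's aggregate. The numbers are taken from this comment only; the dictionary and function names are just for illustration.

```python
# Per-direction link bandwidth in GB/s, as quoted in the comment above.
LINK_GB_S = {
    "NUMAlink (4th generation)": 3.2,
    "SeaStar (XT3)": 3.8,
    "RapidArray (XD1)": 4.0,
}

# The X1 uses the older 1.6 GB/s/direction links, 32 networks in parallel.
X1_LINK_GB_S = 1.6
X1_PARALLEL_NETWORKS = 32

def aggregate_bandwidth(per_link_gb_s, links):
    """Total per-direction bandwidth when several links run in parallel."""
    return per_link_gb_s * links

x1_total = aggregate_bandwidth(X1_LINK_GB_S, X1_PARALLEL_NETWORKS)
print(f"X1 per node: {x1_total:.1f} GB/s per direction")  # 1.6 * 32 = 51.2

for name, gb_s in sorted(LINK_GB_S.items(), key=lambda kv: kv[1]):
    print(f"{name}: {gb_s:.1f} GB/s per direction")
```

The 51.2 GB/s result matches the "~50GB/s/dir per node" figure given for the X1.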
Re:excellent (Score:5, Insightful)
The difficult problems in building computers have changed, and the financial climate around supercomputers has changed quite a lot. Among other things, CMOS finally became fast enough to put bipolar in its grave, single microprocessor workstations became powerful enough to do all but the hardest of scientific tasks, and the average price of high performance (not top 10 on the list, but still fast) computers has plummeted. To ask the new cray to be like the old cray would be foolish.
That said, New Cray still offers impressive products. All three of Cray's product lines have much lower entry prices than similar crays of the 90's. They all have more manageable power/cooling/physical size characteristics. They make much greater use of industry-standard disks and networks, and can also be administered and programmed much more like any other unix computer. You program a New Cray more or less the same as other contemporary HPC systems.
When cdc introduced the 6600, the president of IBM complained to his staff, asking (paraphrase) 'how has cdc managed to best IBM's fastest computer with a staff of just 14 engineers and 4 programmers?' Seymour Cray responded, "It seems like Mr. Watson has answered his own question." That new Cray is tiny does not mean that it is not capable of making impressive innovations. Old Cray's gorilla days were very wasteful, and not necessarily full of the best moments of innovation.
Now, if only they could put four X1e CPUs into an air-cooled, rack-mount server and charge a reasonable amount for it. I'd much rather have a handful of vector processors than a few dozen opterons, any day.
Re:excellent (Score:2)
Now, if only they could put four X1e CPUs into an air-cooled, rack-mount server and charge a reasonable amount for it. I'd much rather have a handful of vector processors than a few dozen opterons, any day.
NEC sells an entry-level deskside SX-8, called IIRC SX-8i, with one processor. Unfortunately "entry-level" in this case means $100000+.
Re:excellent (Score:2)
And NEC has done a great job of doing this for the last several generations of their vector machines. I have never programmed for an SX, and don't know much about them. The really nice thing about the X1 is that under the covers it's running Irix, which is a pretty reasonable Unix variant. Anyone know anything about super/UX?
Re:excellent (Score:2, Insightful)
"The SAME CPU used in CRAY ***SUPERCOMPUTERS***, now available for your desktop!"
And some rube will buy one based on that statement.
Re:excellent (Score:2)
Ha! (Score:2)
This is really just a marketing play on AMD's part.
Sounds like sour grapes from an itanic customer (or an intel or SGI staffer) :-)
Re:Ha! (Score:2)
Re:excellent (Score:2)
A well-scrubbed, hustling rube with some taste and DOE/DOD dollars to spend.
Re:excellent (Score:2)
The answer is quite simple (Score:4, Insightful)
Re:excellent (Score:4, Interesting)
What bullshit. Provide some sources for this info, or shut up. AMD is opening a new fab, and has a contract with a 3rd party to produce cores if AMD can't keep up... They're doing fine producing Opterons.
Incidentally, I can provide real, actual sources that show Intel is the one who is having problems producing enough chips to meet demand.
Re:excellent (Score:2)
Say, does one often open new factories when one's production is above one's sales?
The very fact that they need to open new factories and strike deals with other manufacturers to help them if they can't manage enough production DOES mean that their fabs are cramped and working full steam.
Re:excellent (Score:2)
No, it usually means they are planning ahead. Intel, in fact, is also opening new fabs, and that news, oddly enough, gets massive coverage on
Re:excellent (Score:2)
That makes no sense at all. Anyone with even a passing understanding of the rules of logic would surely disagree with you.
In your own trollish way, you did sort-of ask for some sources, so here's a handful:
http://www.eweek.com/article2/0,,1857009,00.asp [eweek.com]
http://www.computerworld.com/hardwaretopics/hardware/desktops/story/0,10801,104807,00.html [computerworld.com]
http:/ [theregister.co.uk]
Re:excellent (Score:2)
Correct, but it would be perfectly reasonable to project that a company that is pushing the processor to the limits of its performance, like somebody manufacturing supercomputers, would find areas for overall improvement of the chip.
AMD makes more than chips! (Score:4, Insightful)
Re:excellent (Score:2)
Perhaps they could license the result to Intel. They don't seem to be doing much lately.
Re:excellent (Score:2)
Re:my experience (Score:2)
It only makes sense (Score:5, Insightful)
The Chipset is the key. (Score:4, Interesting)
Specialized computing hardware for supercomputers has always seemed like a fiscally bad choice. It'll be good to see what kinds of improvements we can see in research possibilities as supercomputing costs come down from using mass-marketed parts.
Cray likes to build classical vector-driven machines. In that space, you can't rely on some external kludge like Myrinet for your communications; instead, your value-add is in the chipsets that get all those CPUs talking to one another [and to the memory subsystem].
In one of Cray's previous incarnations, they once possessed a chipset/backplane tech for the Sparc processor that Sun purchased off of Silicon Graphics for a song and a dance, and immediately turned into the insanely profitable Sunfire series. The big question here is whether this new agreement requires Cray to share their chipset/backplane tech with AMD [in which case some of it might filter its way back down to the level where mere plebeians like us would be able to afford it].
Re:The Chipset is the key. (Score:2)
Re:It only makes sense (Score:5, Informative)
That doesn't mean that there isn't a place for commodity hardware in supercomputing, but to say that there's no room for custom hardware either misses the point. The only thing "off the shelf" about an AMD based Cray is the AMD. The logic board, and, most importantly, the network that interconnects the processors is entirely custom. Not to mention the fact that Cray will still build some entirely custom processors...
By the way - this is hardly the first Cray based on a commodity processor. The T3E and T3D were both built on Alpha processors, yet nobody calls those machines "commodity".
Re:It only makes sense (Score:2)
I really don't consider Alpha to be a commodity chip. While Opteron and Athlon64 share a lot of the same designs and might even have the same masks, that doesn't necessarily make Opteron a commodity chip. I suppose either chip was produced in much greater volume than Cray's custom processors.
Re:It only makes sense (Score:2)
Looks like it's happening. The Superdome can run either PA-RISC or Itanium2. Opterons on a Cray system?
And it's not just limited to processors either. I have a bag of sticks of 72-pin SIMMs in the closet (yes, the kind PCs used to use) that came out of a supercomputer. Technology advances in supercomputing will make their way back into the genera
Re:It only makes sense (Score:2)
The whole point of a supercomputer is that it performs well beyond what commodity systems are capable of. When supercomputers are made entirely of commodity parts, there will be no supercomputers.
Re:It only makes sense (Score:2, Insightful)
And moderating is voting.
Sign You Invested In The Wrong Supercomputer, #342 (Score:5, Funny)
From the press release... Sooooo... if I scrape together a few million bucks and buy a computer from these guys, will I still be able to contact my Cray rep once his 500 FREE TRY AOL NOW HOURS have expired?
Re:Sign You Invested In The Wrong Supercomputer, # (Score:2)
Sign You Invested In The Wrong Supercomputer, #342
#34. Your "supercomputing" vendor has an AOL email address.
Re:Sign You Invested In The Wrong Supercomputer, # (Score:2)
They probably hire an outside communications firm to do public relations.
Re:Sign You Invested In The Wrong Supercomputer, # (Score:2)
nVidia motherboards (Score:5, Funny)
Re:nVidia motherboards (Score:2)
The irony of it all. (Score:3, Interesting)
More ironic is the fact that the compiler that will be used for those supercomputers is probably the PathScale variant of Open64 - SGI's compiler that was released as open source after it was retargeted to the Itanic architecture.
I might have some misconceptions; careful readers, please fill in the blanks.
Re:The irony of it all. (Score:3, Insightful)
The research / development arm of the organization I work for just got a 4000+ CPU XT3. Last I checked, they planned on using the PGI compiler for most stuff.
Re:The irony of it all. (Score:4, Interesting)
its not ironic at all. its a question of resources and volume.
cray has a few very bright people (still, sort of). they are
essentially a us government lab. they do a bad job, but its insane
to think that 100 people can build and maintain several different
supercomputer architectures.
a $300 opteron is almost always more effective than a $60000 X1
processor. they have a lot of bright people too, and a lot more of
them.
the only reason that cray still exists is support for parallelism
and the provision of high memory bandwidth systems. but even that
niche is being eroded pretty severely. the xt3 communications chip
runs at 3.5GB/s in each direction. it costs about $250 for cray to
have each of them made. for the same $250 i can buy a mellanox nic
that runs at half the speed
its no surprise that cray is using opterons. they actually got lucky
by committing to amd early and having it turn out so well.
the real question is whether there is any more room for a cray at
all. the commodity world moves so quickly. the xd machines (which
they purchased) are really their best asset, but it's still hard to
justify that kind of margin for what is essentially a well
constructed cluster.
The Irony of Marketing (Score:2, Informative)
Re:The Irony of Marketing (Score:2)
position of marketing excellence?
from a business point of view its completely whacked. cray spends
4-5 years of time to build a machine, just to sell a very small
few of them, throw almost all of the technology away and start
over again.
and are you really trying to claim that any cray machine built in
the last 10 years has particularly good mtti rates? the sv2
really was basically unusable for the first two years after it
was shipped. its still kind of a dubi
Finally.... (Score:4, Funny)
It had to be said... (Score:4, Funny)
Re:It had to be said... (Score:2)
Re:It had to be said... (Score:2)
Assuming you mean on Earth, so we take G to be 9.8N and a copy of Windows Vista (boxed of course) would weigh 0.5KG, dropping it from a height of 1M: M*G*H would tell us that 0.5*9.8*1= 4.9ms^-1
Re:Get your formula's right!! (Score:2)
sqrt(19.6) = 19.6 then x^y then
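For what it's worth, the two quantities being mixed up in this sub-thread can be separated with a few lines of Python. This is a sketch using the figures given above (0.5 kg, 1 m drop, g = 9.8 m/s^2) and assuming no air resistance; the function names are just for illustration. m*g*h gives an energy in joules, while the impact speed comes from sqrt(2*g*h).

```python
import math

G = 9.8  # gravitational acceleration in m/s^2 (an acceleration, not a force in N)

def potential_energy_joules(mass_kg, height_m):
    """m * g * h: the energy released by the fall, in joules."""
    return mass_kg * G * height_m

def impact_speed_m_s(height_m):
    """sqrt(2 * g * h): speed at impact, ignoring air resistance."""
    return math.sqrt(2 * G * height_m)

energy = potential_energy_joules(0.5, 1.0)
speed = impact_speed_m_s(1.0)
print(f"energy = {energy:.1f} J, impact speed = {speed:.2f} m/s")
```

So 4.9 is the energy in joules, and the speed is sqrt(19.6), roughly 4.43 m/s.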
It should have been said that ... (Score:2)
unless you did mean it goes down real fast !
Behold! (Score:5, Funny)
Re:Behold! (Score:2, Funny)
Does this mean... (Score:3, Funny)
Re:Does this mean... (Score:2)
Not to be a jerk...but this is old news.... (Score:3, Informative)
Check out the article here...
http://www.hypertransport.org/consortium/cons_pre
Sun.. (Score:2, Funny)
Re:Sun.. (Score:3, Informative)
Re:Sun.. (Score:2)
Cray already offers the Opteron Based XD1. The smallest 12 processor model is estimated to cost $100k USD.
Makes me wonder ... (Score:5, Insightful)
Re:Makes me wonder ... (Score:2)
Apple as far as I can see has a bit of a disconnect with "reality". They're not customer-oriented [e.g. shitty dead pixel policies, really costly vendor-locked in gear, etc]. It was always "oh that's apple gear only because the quality is higher" yet all the fol
Re:Makes me wonder ... (Score:2)
Maybe Apple's computer "just work" and look cool to boot [daringfireball.net] (and not
Re:Makes me wonder ... (Score:2)
The 32-bit comment was about PPC not Turion....But yes I did word that wrong and it was misleading. Sorry.
The biggest flaw I see with most incarnations of PPC [like the G4] is they shoot themselves in the foot with bandwidth. 133MHz FSB? What is this, the mid 90s? Put a good ol' dual-channel DDR400 [or DDR-II-533] on the front of it and be done with it.
I'm sure with a slightly longer pipeline [iir
Re:Makes me wonder ... (Score:4, Insightful)
As for why not AMD, though, Intel has placed a much higher focus than AMD has on very low-voltage chips, and from what I've heard, that's what ultimately gave them the Apple nod. Arguments about production capacity aside, AMD doesn't have the R&D resources of Intel, and they have to pick their battlegrounds carefully. They've picked them wisely, but as of right now, they don't have anything competing with chips like Intel's low-power, dual-core lineup for 2006. If that's remedied in 2007, I'm sure Apple would have no problem revisiting it, but right now, AMD just doesn't have the chips they want.
(As for the production capacity arguments, I've seen people here point out that Intel has had problems meeting their demand recently and AMD hasn't. While this is true, it's important to keep in mind that Intel's overall demand is still over five times that of AMD's, and the gap in notebooks -- the segment where Intel's production capacity fell behind demand temporarily -- favors Intel by an even greater margin.)
When does this translate to bank? (Score:2)
Shouldn't AMD stock be doing better? If you bought 5 years ago, AMD is flat, but Intel is up like 50%.
Re:When does this translate to bank? (Score:2)
And besides you're supposed to use retarded circular and self-defeating logic like
"AMD doesn't have the big customers because they can't afford to upgrade their supply chain to have the big people as customers." -- Average retarded suit.
Tom
Re:When does this translate to bank? (Score:3, Insightful)
Re:When does this translate to bank? (Score:2)
Why? (Score:2)
I guess being a commodity chip helps for supply issues, but when you are building machines that are this expensive, is that really a deciding factor?
Finally :) (Score:2)
Re:Had to be done (Score:4, Insightful)
No, it didn't.
Can we have a "-1, Catch phrase" option, please? The old jokes are not even remotely funny anymore..srsly.
Re:Had to be done (Score:4, Funny)
While we're on the subject of burnt karma.... (Score:2)
Re:But (Score:3, Insightful)
Solaris? (Score:2)
Solaris 10 works really well on large NUMA boxes (better than Linux), and it has an Opteron port...
Re:Put on your tinfoil hats... (Score:5, Funny)
You are using the Internet.
You are part of the conspiracy.
Re:Put on your tinfoil hats... (Score:2)
-nB
Re:Put on your tinfoil hats... (Score:2)
Re:Put on your tinfoil hats... (Score:2)
Re:Put on your tinfoil hats... (Score:2, Funny)
Re:...and now some real numbers (Score:5, Interesting)
Let's extrapolate for a moment, shall we? I'll even do Intel a favor and clamp down on the AMD increases each time. Basically, AMD more than doubled their share of this elite group in six months' time.
Six months from now, they've almost doubled to 100 systems.
Twelve months from now, slowing down and growing only 75%, they've got 175 systems.
Two years in the future, with even more slowing down of their growth, 300 systems on the list are AMD. I wonder whether the preponderance of that growth comes from the current 400-odd Intel machines or from the 73 IBM setups...
Likely? Maybe not. Possible? Yeah, it just might be.
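The extrapolation above is easy to redo in a few lines of Python. This sketch starts from the 100-system figure, since the current count is not stated in this comment, and the helper names are made up for illustration. It applies the stated 75% growth and then works out what growth the two-year figure of 300 implies.

```python
def grow(count, rate):
    """Apply a fractional growth rate (0.75 means +75%) to a system count."""
    return round(count * (1 + rate))

def implied_growth(start, end):
    """Fractional growth needed to get from start to end."""
    return end / start - 1

six_months = 100                         # figure from the comment
twelve_months = grow(six_months, 0.75)   # the "growing only 75%" step
two_years = 300                          # figure from the comment

rate_second_year = implied_growth(twelve_months, two_years)
print(twelve_months)                     # 175
print(f"{rate_second_year:.1%}")         # growth over the whole second year
```

The 175 matches the comment, and reaching 300 a year later needs only about 71% growth over that entire year, which is indeed a further slowdown compared with 75% in six months.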
Re:...and now some real numbers (Score:5, Interesting)
That's quite a collapse. Intel is propping up their high-end systems with volcano-simulator Xeons?
A near doubling in a year. And that's with AMD's first real server standard processor. HORUS comes out today, that'll put AMD into the 32 and 64 core marketplace. Not bad for a company with 0 server marketshare, never mind Top500 systems, two years ago.
As for the rest of your troll, I think most of the people here are clever enough to see it for what it is.
Opteron is not NexGen's tech (Score:3, Informative)
Re:Opteron is not NexGen's tech (Score:4, Interesting)
Their bus arch and chipset tech is the most interesting. (If someone has proof that AMD didn't design this, it had better be solid.) This attention from Cray, and the supercomputer people in general, is due more to this success. AMD has the best design and it shows. It is one thing to buy sketches of something and another to make it fly this well.
More to the point regarding Cray is their XD1. THAT is a cool machine! I was looking around at different FPGA stuff and almost shorted my keyboard with drool. Damn, I wish I was rich. -sniffle-
Re:...and now some real numbers (Score:5, Insightful)
Wow, seems AMD *doubled* its share of spots in the Top 500 list in *six months*. I bet Intel is ticked, and worried...this is very good PR for AMD.
Go AMD! Milk that NexGen core for all it's worth, too bad you didn't invent it, you just bought it.
LOL! Intel fanboys don't have anything real to say these days, they have to resort to cheap ad-hominems. Don't worry, I'm sure someday Intel will come out with competitive chips again. Pretty sure, anyhow.
And as to AMD "just buying it", how would that relate to Intel getting so much Alpha technology and talent from its deals with HP/Compaq/DEC?
It would be nice if you would start innovating one of these days.
Yeah, if AMD can produce better processors than Intel without innovating, just imagine what'll happen when it does innovate...! =)
Re:Hmmm (Score:4, Funny)
I wonder what the government will do with these cheaper, powerful supercomputers?
Why, decrypt your 66,000ft high stack of SSH traffic in case you're a would-be terrorist [guardian.co.uk], of course! :-)
Re:Hmmm (Score:2)
The last thing we want is seti@home team 'USGovt' overtaking us!
Re:Hmmm (Score:2)
steve
Re:Compilers !! ??? (Score:3, Informative)
Re:Compilers !! ??? (Score:2)
I haven't used their compiler, but this looks fscking sick: http://www.pathscale.com/infinipath.php [pathscale.com]
AMD has come a very long way the past few years. If they had Intel's fabrication skills they would be almost unstoppable. This HTX stuff simply looks like the bomb.
Re:why not use the ultrasparc T1? (Score:3, Interesting)
So? How much memory bandwidth do they have? Not I/O bandwidth, but memory bandwidth. I highly doubt that they have as much bandwidth PER CORE as the Opterons do, and in big applications, memory bandwidth can be a very important factor.
You can't build an enterprise machine without Ultrasparc (or Power4 or PA-RISC) CPUs.
I guess that Cray thinks differently.
Re:why not use the ultrasparc T1? (Score:2)
The Niagara has four 144-bit interfaces.
The Opterons (both single- and dual-core) have two 72-bit interfaces.
The Crays are not just about memory bandwidth, but also a lot about an efficient interconnect.
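To compare those interface widths on a per-core basis, here is a short Python sketch. It uses only the channel counts and widths quoted above plus the core counts mentioned in this thread (8 for Niagara, 1 or 2 for Opteron). Width alone is not bandwidth: clock rate and memory type are not given here, so treat this as a partial comparison. The class and field names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class MemoryInterface:
    name: str
    channels: int          # number of memory interfaces on the chip
    bits_per_channel: int  # width of each interface as quoted (72/144 includes ECC bits)
    cores: int

    @property
    def total_bits(self):
        return self.channels * self.bits_per_channel

    @property
    def bits_per_core(self):
        return self.total_bits / self.cores

chips = [
    MemoryInterface("Niagara (8 cores)", channels=4, bits_per_channel=144, cores=8),
    MemoryInterface("Opteron (single-core)", channels=2, bits_per_channel=72, cores=1),
    MemoryInterface("Opteron (dual-core)", channels=2, bits_per_channel=72, cores=2),
]

for chip in chips:
    print(f"{chip.name}: {chip.total_bits} bits total, {chip.bits_per_core:.0f} bits per core")
```

By width alone, the 8-core Niagara and a dual-core Opteron come out the same per core (72 bits), and a single-core Opteron has twice that.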
Re:why not use the ultrasparc T1? (Score:3, Informative)
Because they don't do floating-point in hardware, or at least not to any useful level of performance.
The 8-core Niagara (T1) has 1 floating-point execution unit on only 1 of the 8 cores. Buy a 6- or 4-core Niagara, and do you get a floating-point execution unit at all?
On Niagara (aka UltraSPARC T1) floating-point will mostly be accomplished with software emulation of the SPARC V9 FP instructions.
That's why you wouldn't use Niagara for supercomputing. Web serving, yes, computational fluid dynamics or nume
Re:The new juncture will be called....CRAPTERONS (Score:2)