Cray XD1 Now Available
cyngus writes "Cray announced the availability of their XD1 systems. Each XD1 chassis has up to 12 AMD Operton processors. Up to 12 chassis can be clustered together in a rack. The XD1 uses Cray RapidArray Interconnect technology, based on HyperTransport, for high bandwidth and low latency communications between processors and chassises. The XD1 also has a handful of other technologies aimed at the HPC market, including Xilinx FPGAs, communications accelerators, etc."
Not the top end (Score:5, Funny)
Still, if they need someone to, uh, test one...
Re:Not the top end (Score:3, Interesting)
Still, if they need someone to, uh, test one...
Interesting numbers. Also to note, NEC's Earth Simulator is now nearly three years old - the Cray XD1 is made with modern AMDs.
I guess there's no getting around it. For the time being our really fast computers will just be fucking huge. Oh, and call NEC if you want a big computer. (:
Re:Not the top end (Score:5, Insightful)
(commence snide comments about the next windows release... now :) )
Re:Not the top end (Score:2)
I for one welcome our new fucking-huge-windows overlords.
Re:Not the top end (Score:2)
Hmmm
Re:Not the top end (Score:2)
Blue Gene [ibm.com]
Let's see what happens come Supercomputing '04... good things can come in small sizes.
From the linked site
with a sustained speed of 11.68 teraflops and a peak speed of 16 teraflops, uses more than 8,000 PowerPC processors packed into just four refrigerator-sized racks.
So let's say their "refrigerator sized racks" are about the size of 1.5 standard racks, and you'd need about 3x the amount of flops they're getting currently, so that's 36 racks? Not too shabby. Almost half t
Re:Not the top end (Score:2)
This seems as good a place as any to ask that question (see my sig for details.)
the difference (Score:5, Informative)
This Cray is more like building MPPs off of scalar units (Opterons) and doing some real innovation around the MPP interconnect. It's sort of off the shelf, yet not at the same time.
The big thing here that kicks ass is the 6 FPGAs per chassis. If you can write a highly tuned software algorithm, there's a chance you can write a highly tuned piece of hardware, deploy that to the FPGA, and you've got an application-specific hardware accelerator. 6 per chassis, in fact. That's pretty cool, and it's in some ways a HUGE innovation over having a dedicated vector unit (as was the Cray-1 design).
The really interesting thing here is that these are essentially Opterons running Linux, with custom interconnect goo. The interconnect bypasses the PCI bus - it's closer to the PEs than that. Their claim is that it attaches to the AMD HyperTransport bus (the Proc -> Proc -> Mem bus for SMP AMD machines).
Hell, yeah! (Score:5, Interesting)
Thus, the part that impacts runtimes the most is the on-disc lookup, which is still faster than direct calculation, which we've also had to do.
I looked into FPGAs a while back. Some back-of-envelope calculations show that a single FPGA should be able to calculate the data table on demand, and it'll be faster than reading from disc.
(Turns out that actually getting a usable solution for a basic PC would mean hacking up the whole tool chain. FPGA cards for a PC are all designed for DSP, rather than numerics.)
So, with an FPGA and a CPU, I could eliminate the slowest part of the job, and scale up to, what, a 1GB working matrix, which is about 8 times larger than the biggest job I've ever run, which hogged a T3E1200 for 6 hours.
So, in short, gimme an FPGA and some reasonable tool chain, and I will be able to about halve runtimes, and, more importantly, scale up to 10 times larger calculations. 5 times larger calculations is the most I've ever been asked about.
Time to brush up on my VHDL, I think.
Re:the difference (Score:2)
It would be nice if it were available on low-cost motherboards so anyone could build a similar system.
Re:Not the top end (Score:3, Interesting)
Earth Simulator uses vector processors. If you want a comparable Cray system, you should be looking at the X1, which is also a vector processor. Incidentally, the X1's silicon runs so hot they use evaporative Fluorinert cooling instead of a straight liquid - the Fluorinert is heated to just under the evaporation point and sprayed onto the processor so that the phase change will remove more heat.
Re:Not the top end (Score:3, Interesting)
I imagine that this is extremely expensive stuff to do. Since Cray can charge $40,000 per processor for the X1, they can get away with a $700 cooler. Not so easy on a PC.
Re:Not the top end (Score:3, Interesting)
Just circulate the stuff and run it through a radiator.
BTW, HP was recently researching cooling chips with inkjet nozzles spraying a coolant which evaporates easily onto the CPU.
long time no news... (Score:4, Insightful)
Re:long time no news... (Score:5, Interesting)
right.. but what happened to Tera?! (Score:4, Informative)
If you look at cray.com today it's pretty sad. 3 product lines - the XD1 (Opteron+magic), the X1, which is traditional Cray vector (SMP vector nodes, and MPPs of those nodes), and their 3rd product line is the NEC SX-6... they're reselling it in the States for NEC.
If you hit tera.com, you get a 404.
Whither MTA? (Score:3, Informative)
The MTA idea is neat, but nobody's ever been able to find a problem that runs all that well on them. The original MTA didn't have enough memory bandwidth to make it competitive with a vector machine, and the small number of them in the field (fewer than 10 IIRC) are notoriously cantankerous. When Tera bought Cray, one of the main things they were buying, aside from name recognition, was Cray's CMOS design experience; they were
Re:right.. but what happened to Tera?! (Score:2, Interesting)
Re:right.. but what happened to Tera?! (Score:3, Informative)
Re:long time no news... (Score:2, Insightful)
I too was worried that Cray would completely disappear if they continued to pursue the expensive and anachronistic
Re:long time no news... (Score:2)
Re:long time no news... (Score:2, Informative)
I forget where I snipped this from, but here goes:
Re:long time no news... (Score:2)
But does it run Linux? (Score:2)
Re:But does it run Linux? (Score:2, Informative)
Re:But does it run Linux? (Score:3, Informative)
Having Linux or any other OS (or even CPU type functions) on the FPGA would be a waste of gates. The gates would be better spent for specialized vector operations, such as an FFT or crypto engine.
Re:But does it run Linux? (Score:3, Insightful)
Sheesh...what happened to Cray? (Score:5, Funny)
Re:Sheesh...what happened to Cray? (Score:4, Funny)
Re:Sheesh...what happened to Cray? (Score:3, Informative)
Re:Sheesh...what happened to Cray? (Score:2, Informative)
Operton? (Score:4, Funny)
Also, mandatory: imagine a Beowulf cluster of these.
Re:Operton? (Score:3, Funny)
Also, mandatory: imagine a Beowulf cluster of these.
Don't you mean Bewoulf?
Re:Operton? (Score:2, Funny)
I don't know, but.. (Score:2)
Does the XD1 give the illusion of shared memory? (Score:3, Interesting)
"Tightly coupled to the AMD Opterons and switching fabric, [the RapidArray Communications Processors] handle memory to memory copies, global memory management, and system wide process synchronization, freeing..."
(Emphasis mine)
Does this mean the HT links give the OS the view of a single-system for each chassis? (Or rack, even?) Ie, can I utilize a single processor out of those 12 in a chassis, and access 96GB of RAM with that one process WITHOUT using MPI or rDMA?
Re:Does the XD1 give the illusion of shared memory (Score:3, Informative)
Re:Does the XD1 give the illusion of shared memory (Score:2)
Re:Does the XD1 give the illusion of shared memory (Score:2)
For that you need to buy Cray's MPP system "Strider", which is a productized version of Red Storm.
The XD1 hardware is probably capable of shared memory; the software is not. The nodes (each a 2-CPU blade) run off-the-shelf Linux, and use MPI to share data.
Cray doesn't do Clusters? (Score:3, Interesting)
And it's running Linux, if that matters to you
Re:Cray doesn't do Clusters? (Score:4, Interesting)
Given the financial status of Cray, embracing clusters is just a common sense move, not necessarily an ideological one.
Re:Cray doesn't do Clusters? (Score:2)
In the world of business, ideology often needs to take a back seat to good common sense. Personally, I feel it's to their credit. This way they can focus less on their last innovation, and more on the next one.
~D
Cray is giving customers what they want (Score:2)
If you read Cray's 10-K, you will see that they believe that some computer problems can be resolved optimally on clusters, but others are better suited for vector-based systems.
Re:Cray doesn't do Clusters? (Score:2)
Spend a few million on something else.
Re:Cray doesn't do Clusters? (Score:2)
Re:Cray doesn't do Clusters? (Score:2)
Seriously though, some problems are well solved with clusters and some with vector processors. So I suppose Cray is trying to find out just how many fall into each camp... either that or how much their name is worth.
Just thinking about that, I guess I would buy a mini Cray deskside thing if I had to replace my Onyx 2 but not a cluster as my application doesn't do well with them. Shame really, I like the Blue Gene/L.
Re:Cray doesn't do Clusters? (Score:2, Insightful)
Must.. get.. ridiculously.. powerful.. device.. (Score:5, Funny)
Dilbert: I can compute many values of pi. Some people discuss areas of circles, but I'm doing something about it!
Interesting specs and density (Score:5, Informative)
From the linked page:
Highly modular, the Cray XD1 base unit is a chassis. Up to 12 chassis can be installed in a rack. Multirack configurations integrate hundreds of processors into a single system.
Farther down the same page:
The Cray XD1 compute subsystem is composed of 12 AMD Opteron(TM) 64-bit processors that run Linux and are organized as six 2-way SMPs to deliver 58 GFLOPs* per chassis. Finely tuned memory and I/O performance removes bottlenecks and maximizes processor performance.
Wow - do the math: 696 GFLOPs per chassis. That's rather impressive.
However, part of me is a bit saddened by seeing the Cray name attached to X86s. Yes, I felt the same thing with SGI, DEC, and Sun. Yes, I need to get over it and move on.
Re:Interesting specs and density (Score:2)
However, part of me is a bit saddened by seeing the Cray name attached to X86s. Yes, I felt the same thing with SGI, DEC, and Sun. Yes, I need to get over it and move on. :-)
No need to worry -- if you want a Cray vector processor-based supercomputer, you can still buy one [cray.com]. You can also source the parts for your own Earth Simulator [cray.com] from Cray, as well.
Re:Interesting specs and density (Score:3, Interesting)
Actually, in the year between the crash of Cray Computer (in March 1995) and his death in an auto accident, Seymour Cray started a new company, SRC Computers [srccomp.com], which still exists, and makes a parallel Pentium-based computer (which also incorporates custom hardware processing elements). I believe that this product is the same thing he was working on from the start of that company in 1996.
Re:Interesting specs and density (Score:2)
Re:Interesting specs and density (Score:2)
Re:Interesting specs and density (Score:2)
The Cray XD1 compute subsystem is composed of 12 AMD Opteron(TM) 64-bit processors that run Linux and are organized as six 2-way SMPs to deliver 58 GFLOPs* per chassis.
Your comment:
Wow - do the math: 696 GFLOPs per chassis. That's rather impressive.
To get the 696 GFLOPS you need to have 12 chassis (a fully loaded system), so it isn't 696 GFLOPS/chassis, but
Ooops... :-) (Score:2)
I meant "696 GFLOPs per rack", not "per chassis", where a fully-loaded rack contains 12 chassis (58 * 12 = 696). D'oh! I appreciate the correction.
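The corrected figure is easy to check. A minimal sketch in Python, using only the numbers quoted from the Cray page above (58 GFLOPS and 12 Opterons per chassis, 12 chassis per rack); the function and constant names are made up for illustration:

```python
# Figures quoted from the Cray XD1 page earlier in the thread.
GFLOPS_PER_CHASSIS = 58       # peak GFLOPS per chassis
CHASSIS_PER_RACK = 12         # fully loaded rack
PROCESSORS_PER_CHASSIS = 12   # AMD Opterons per chassis


def rack_peak_gflops(chassis=CHASSIS_PER_RACK, per_chassis=GFLOPS_PER_CHASSIS):
    """Peak GFLOPS for a rack holding `chassis` chassis."""
    return chassis * per_chassis


def per_processor_gflops(per_chassis=GFLOPS_PER_CHASSIS,
                         processors=PROCESSORS_PER_CHASSIS):
    """Peak GFLOPS attributable to one processor in a chassis."""
    return per_chassis / processors


print(rack_peak_gflops())                  # 696 (per rack, not per chassis)
print(round(per_processor_gflops(), 2))    # 4.83
```

So 696 GFLOPS is the per-rack number, and each Opteron accounts for a bit under 5 GFLOPS of peak.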
Re:obsesive compulsive correction! (Score:2)
The EPIC on the other hand does have a completely different instruction set with x86 emulation being a fairly separ
XD1 announced sales (Score:3, Informative)
Where's the source code (Score:3, Insightful)
I wonder what that means - Red Hat EL 3.0 with enhancements, or their own thing..
Interconnect - I wonder how their proprietary interconnect compares to IB..
File system - ext3? No cluster file system?
Re:Where's the source code (Score:3, Interesting)
Oh God... (Score:2)
Why do I have the sneaking suspicion that you're not pronouncing it correctly, either?
Still waiting for... (Score:5, Funny)
Crayola!
Re:Still waiting for... (Score:3, Funny)
Re:Still waiting for... (Score:3, Interesting)
Coincidence? (Score:5, Funny)
Dare we say, we've finally actually found the hardware that can run this game?
Unfortunately, no. (Score:2)
I wonder what John has been thinking... a few years ago he was on
In other news... (Score:4, Funny)
"We welcome the Cray XD1 as the first platform on which Gentoo installs in less than 12 hours. Looking forward to renaming Gentoo to 'One-Click-Linux'. Stay tuned !"
A good use for a bunch of these systems (Score:2)
Hmmm.... (Score:3, Funny)
~D
Emoticons (Score:2)
Re:Emoticons (Score:2)
Pah! (Score:3, Funny)
It runs open source codes! (Score:2)
That's the power of Cray's parallel processing: each machine runs its own "open source code", therefore a cluster is more powerful because the entire cluster runs "open source codes". At least that's their sales rep's understanding of it.
If you can't afford this Cray... (Score:3, Informative)
http://www.monarchcomputer.com/ [monarchcomputer.com]
A friend of mine and I were talking the other night about local Atlanta, GA computer stores, and he mentioned that Monarch Computer is one of the only vendors from whom you can purchase the 4-way Opteron 800 series processors ($1200 apiece -- damn!).
He's been in grad school out of state for a few years and was surprised to learn that Monarch Computer is, in fact, in his hometown backyard. Kind of kewl to walk in a store in your own town and walk out with a $1200 4-way processor.
Until the wife finds out and sends you back to said store with the receipt in hand for a refund.
IronChefMorimoto
P.S. - I don't work for these guys or advocate their store. I just thought it was cool to have such a vendor nearby. Too bad they don't sell Shuttle XPCs.
Re:If you can't afford this Cray... (Score:2)
Re:If you can't afford this Cray... (Score:2)
Re:If you can't afford this Cray... (Score:2)
What is needed is the Infiniband bridge.
It's nice to see Cray out there (Score:3, Insightful)
It's nice to see our old friend Cray continue to keep a foot in the market -- if nothing else, it makes everyone else stay on their toes.
Let's See... is my G5 math right? (Score:2, Interesting)
Re:Let's See... is my G5 math right? (Score:2)
The Opteron's theoretical peak double precision floating point numbers are pretty mediocre; it gets beaten soundly by Xeons and Itaniums too. It's just not what the processor was designed for. It only has one FPU iirc, where the G5 and most of the other competition have 2.
It is worth noting that these are theoretical numbers, not what you can actually achieve in reality on any given algorithm. The Opteron's Rmax is slightly more competitive (2.9GF/proc at 2 GHz in Los Alamos's Lightni
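To make the peak-versus-Rmax distinction concrete, here is a rough sketch in Python. Theoretical peak is just clock rate times floating-point results per cycle. The 2.9 GF at 2 GHz figure is the one quoted above; the 2 flops per cycle for the Opteron is an assumption on my part, not something stated in the thread, and the names are illustrative only:

```python
def peak_gflops(clock_ghz, flops_per_cycle):
    """Theoretical peak: every FP pipe busy on every cycle."""
    return clock_ghz * flops_per_cycle


def efficiency(rmax_gflops, peak):
    """Fraction of theoretical peak actually sustained on a benchmark."""
    return rmax_gflops / peak


# ASSUMPTION: 2 double-precision flops per cycle for the Opteron.
OPTERON_FLOPS_PER_CYCLE = 2

peak = peak_gflops(2.0, OPTERON_FLOPS_PER_CYCLE)   # 4.0 GF at 2 GHz
eff = efficiency(2.9, peak)                        # Rmax figure quoted above

print(f"{peak:.1f} GF peak, {eff:.1%} of peak sustained")
```

Under that assumption the quoted Rmax works out to roughly 70% of peak, which is why the sustained number looks better than the theoretical one suggests.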
What happened to RedStorm? (Score:3, Interesting)
Yea check this out:
Cray Unicos/mp [cray.com]
Actually that references the X1, which is not based on PeeCee stuff, but actually an 8-core MPM.
Sad thing is, even with Red Storm I think IBM will remain on top, as their contract calls for 130,000 of their PowerPCs on one system?
It would be nice to see Cray on top, with something other than commodity processors. I realize the T3D and T3E were both Alpha-based systems.
PS, I still have a J932se 32 proc Vector Cray ( for sale [ebay.com] ) if anyone wants a Cray for home. $4500, real deal 3 cabinet Cray from '97, most likely used for gov't nuclear energy something-or-other. Located in Southeastern Virginia.
Q404 (Score:2)
This is a quintessential kernel architecture without support for threads, VM, IPC, etc. The interconnect is also asynchronous (5ns end-end speculated), point-to-point, and uses Portals.
Re:Q404 (Score:2)
Re:What happened to RedStorm? (Score:3, Informative)
2 complementary product lines. You could run the same application on both, though Red Storm provides real shared memory, which might allow better optimizations.
I love Cray (Score:4, Interesting)
*cue calls to Cray*
Re:I love Cray (Score:2)
Re:I love Cray (Score:3, Interesting)
That was a unique experience... I had a security pass to the machine room, full of Cray C-90, Y-MP, X-MP, Cray-1 &
Dual-Core opterons (Score:2)
There are quad-Opteron 1U boxes... So currently 6U of space can hold twice as many Opterons as these Cray units (24 Opterons with normal servers, 12 Opterons with the 6U Cray). The rapidly approaching introduction of dual-core Opterons would allow 48 Opteron cores in the space of this 12-Opteron Cray.
Yes, the Cray has many extras, (The FPGAs for example?), but for pure power, you might be better off with normal servers.
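The density comparison above can be sketched in a few lines of Python. All the inputs are the figures from this comment (quad-Opteron 1U servers, a 6U chassis holding 12 Opterons); the 6U chassis height in particular is the commenter's number and is not checked against Cray's specs. Names are illustrative:

```python
# Figures as stated in the comment; not verified against vendor specs.
SPACE_U = 6                       # rack units attributed to one XD1 chassis
OPTERONS_PER_1U_SERVER = 4        # quad-Opteron 1U box
OPTERONS_PER_XD1_CHASSIS = 12


def sockets_in_space(rack_units, sockets_per_1u):
    """Sockets that fit when the space is filled with 1U servers."""
    return rack_units * sockets_per_1u


def total_cores(sockets, cores_per_socket=1):
    """Core count for a given number of sockets."""
    return sockets * cores_per_socket


commodity = sockets_in_space(SPACE_U, OPTERONS_PER_1U_SERVER)

print(commodity)                              # 24
print(commodity / OPTERONS_PER_XD1_CHASSIS)   # 2.0
print(total_cores(commodity, 2))              # 48
```

This only counts sockets and cores; it says nothing about the interconnect or the FPGAs, which is the part the Cray is actually selling.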
Re:Dual-Core opterons (Score:2)
Since when are racks 72U high? It's more like 72 inches, or 42U, for the big ones.
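For reference, one rack unit is 1.75 inches, so the inches-versus-U mix-up is easy to sanity-check. A minimal sketch in Python; the function name is made up for illustration:

```python
INCHES_PER_U = 1.75  # height of one rack unit


def rack_height_inches(units):
    """Usable mounting height, in inches, of a rack with `units` rack units."""
    return units * INCHES_PER_U


print(rack_height_inches(42))   # 73.5
print(rack_height_inches(72))   # 126.0
```

A 42U rack is about 73.5 inches of mounting space, which matches the "72 inches" ballpark; a 72U rack would be over ten feet tall.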
The end of custom CPUs (Score:3, Insightful)
Re:Banning Fun (Score:3, Funny)
Re:Cray (Score:3, Informative)
I only know this from a Sci-Am article on using supercomputers to predict crash situation.
Re:Cray (Score:3, Informative)
http://en.wikipedia.org/wiki/Seymour_Cray
Seymour Roger Cray (September 28, 1925 - October 5, 1996) was a supercomputer architect who founded the company Cray Research. For about 30 years, the short answer to the question "What company makes the fastest computer?" was "Wherever Seymour Cray is working now."
Re:Just imagine... (Score:2, Flamebait)
Find the non ad? (Score:2)
Re:Find the Fake Ad? (Score:3, Funny)
Re:If it were a girl robot. (Score:2)
Re:Hmmm... (Score:5, Informative)
This is some of the stupidest piles of drivel I have read on slashdot. SGI and Cray both do ALL of the glue logic chips themselves; that's the whole point of buying from them. They don't use the off-the-shelf chipset, they design their own with the design goal of large scalable systems. Besides, Intel uses a shared bus where AMD uses the point-to-point bus they bought from Compaq, which was originally designed for the Alpha. So if anyone has a scalability lead it's AMD.
Re:Hmmm... (Score:2)
One should note, however, that the Altix does not use the shared bus features of the Itanium. Or at least, that bus is only shared by one CPU, the memory, and the bridge chip. The interconnect architecture of the Altix is identical to the interconnect used on the old SGI Origin systems, which were based on MIPS processors. From an architecture point of view, the Altix and the XD1 are very, very similar. One uses Itanium, one uses Opteron.
Altix tries to run a single
Re:Hmmm... (Score:5, Informative)
They have the X1, which is a massively parallel vector system for the very high end. (For those who need 30+ GB/second of memory bandwidth for EACH CPU.) These things are huge, expensive, and used by a limited number of users, mostly governments.
They are getting ready to productize Red Storm, which is also a bunch of Opterons, but strung together in a shared-memory system like the T3E. Also a high-end solution.
This system, the XD1, is a low-end system designed to be a half-step better than a cluster of off-the-shelf Opterons. It's a multi-kernel cluster using MPI for all the data sharing. However, the interconnect basically sits where the south bridge sits on most Opteron boxes.
So Cray still has the absolute cutting-edge systems, but has now expanded down-market. (Rather, they acquired OctigaBay, who did the early design work.)
This is also not the first time this has happened. In the early 90s, Cray purchased a small start-up that was developing a NUMA-style mini-super based on SPARC processors. They turned it into a product and sold a few, though not as many as they would have liked. During the SGI acquisition they sold the product to Sun, who branded it the E10000, and made about a billion dollars off of it. It's now the foundation for all of Sun's high-end Unix servers.
Cray also bought a small company (I forget the name) that made a CMOS implementation of the Y-MP. This became the Y-MP EL, the J90, which pioneered technology for the SV1.
Cray has often built mid-range systems. Nothing new.
Re:XD1 == OctigaBay (Score:2)