Virginia Tech to Build Top 5 Supercomputer?
hype7 writes "ThinkSecret is running a story which might explain exactly why the Dual 2GHz G5 machines have been delayed to the customers that ordered them minutes after the keynote was delivered. Apparently, Virginia Tech has plans to build a G5 cluster of 1100 units. If it manages to complete the cluster before the cut-off date, it will score a Top 5 rank in the Linpack Top 500 Supercomputer List. Both Apple and the University are playing mum on the issue, but there's talk of it all over the campus."
Just imagine,... (Score:3, Funny)
Oh wait, it is a cluster. DAMN!!!!!
Re:Just imagine,... (Score:2)
Re:I, for one... (Score:5, Funny)
Damn, ethernet controllers must really piss you off then, huh?
is that so? (Score:2, Funny)
Re:is that so? (Score:5, Funny)
Wow! Imagine a Beowulf cluster of...
Er, ah, forget I said anything.
Re:is that so? (Score:4, Interesting)
Stephen
Re:is that so? (Score:2)
Re:is that so? (Score:2, Insightful)
What about latency? (Score:2, Interesting)
Re:What about latency? (Score:3, Interesting)
You win the moron of the article award. Congrats.
Re:What about latency? (Score:4, Insightful)
Now you are one optimistic AC. The day is still young. I am giving 30:1 odds that there are going to be way better morons than Thinkit3 before this thread is archived.
Re:What about latency? (Score:4, Interesting)
Re:What about latency? (Score:5, Informative)
I suggest you look at the list of the top supercomputers [top500.org] in the world. Most are clusters, i.e. separate, distinct machines (just a quick glance shows the top 25 all are). It's just too darn hard to make a shared memory computer with thousands of processors. So the common architecture is to make a cluster of smaller shared memory machines.
Besides, most clusters built utilize special interconnects like Myrinet that offer low latency connections. They're more expensive than ethernet, but it's a supercomputer so you spend it.
>> All this "the internet is one giant distributed computer" doesn't acknowledge this.
On the contrary... people know this very well. That's why we see rendering and SETI processing as distributed. They don't really need to communicate with others often.
SGI Origin 3000, 1024 processors... (Score:5, Interesting)
It's hard, but not too hard or impossible. The Silicon Graphics Origin 3000 supports 512 processors in a single image system with the stock IRIX kernel and 1024 processors with the "XXL" kernel.
Rumor has it Origin 4000 will support 2048 processors, as will Altix once SGI has done some major work with their kernel patches. (Altix is currently limited to 64 processors per system image).
Re:What about latency? (Score:5, Informative)
For example, you could use Myrinet to get 2 Gigabit, super-low-latency connectivity, or Quadrics, or InfiniBand, or just a well laid out Gigabit Ethernet with high-end switches.
With multiple processors in a box, the processors have to fight for the resources that box has to offer. NUMA alleviates demand on the memory, but IO operations (when writing to disk or to network) in a multiprocessor box block a good deal as the processor count in a node rises.
The idea with clusters is that inter-node communication in most cases can be kept low. Each system can work on a HUGE chunk of a problem on its own, with its own dedicated hard drive, memory subsystem, and without having too much competition for the network card. A lot of problems are really hard to solve computation-wise, but are *very* well suited to distributed computing. A prime example of this is rendering 3D movies. Perhaps oversimplifying things, but for the most part, a central node divides up discrete parts (a segment of video), and each node works without talking to others until done, so the negative impact is minimal. Certain problems (e.g. nuclear explosion simulations, where time and spatial chunks interact more with one another) are much more sensitive to latency/throughput. Seti@Home and distributed.net are *extremely* insensitive to throughput/latency issues (not much traffic and very infrequent communication).
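To make the rendering example concrete, here is a minimal sketch of the "central node divides up discrete parts" pattern. Threads stand in for cluster nodes, and `split_frames`, `render_chunk` and `render_movie` are hypothetical names made up for illustration, not any real render farm's API.

```python
from concurrent.futures import ThreadPoolExecutor

def split_frames(total_frames, nodes):
    """Divide frame indices 0..total_frames-1 into one contiguous chunk per node."""
    base, extra = divmod(total_frames, nodes)
    chunks, start = [], 0
    for n in range(nodes):
        # The first `extra` nodes take one additional frame each.
        size = base + (1 if n < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks

def render_chunk(frames):
    """Stand-in for real rendering: each 'node' works alone on its own frames."""
    return [f"frame-{i:04d}" for i in frames]

def render_movie(total_frames, nodes):
    chunks = split_frames(total_frames, nodes)
    # Threads play the role of nodes; no chunk needs data from another chunk.
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        results = list(pool.map(render_chunk, chunks))
    # The only communication is handing out work and collecting the results.
    return [frame for chunk in results for frame in chunk]
```

The point of the sketch is that the workers never talk to each other, which is why latency between nodes barely matters for this class of problem.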
Re:What about latency? (Score:5, Interesting)
Then if it got popular, and they were really clever, they could sell off a part of the computational power they amassed to solve other people's problems, providing funding for new versions and new supercomputing clusters.
Re:What about latency? (Score:3, Interesting)
Or how about this: your bandwidth is dependent upon the amount you contribute to the distributed processing.
Hopefully there would be some sort of minimum service level, maybe 64kbps; presumably people dropping tens of thousands expect at least a modicum of return on their investment. People who didn't want to install the client could trudge along at those speeds.
Eventually there would be a market system, whereby people would trade their completed blocks for other commodities, like food vouchers, prints,
Re:What about latency? (Score:5, Insightful)
Universities (and big business) often work together and exchange resources. Virginia Tech gets a large amount of bargaining power by having control over a large amount of processing power. They can easily trade CPU time on their cluster for CPU time on a low-latency supercomputer.
Re:What about latency? (Score:4, Interesting)
Now, since today's supercomputers are *all* massively parallel constructions, the difference between a commercial design and an off-the-shelf cluster is in the quality and speed of the interconnects. NEC's Earth Simulator, the prime example of 'custom' supercomputer architecture, puts many processor units on *ridiculously* fast 'local' buses, and its racks are all interconnected with still_pretty_insanely_fast (and rather expensive) custom links.
Meanwhile, more 'commercial' designs use various interconnects. IIRC, NEC's 'regular' supercomputers, which formed the design basis for the Earth Simulator architecture, use Fibre Channel 'mesh' networks between racks. The Opteron - sure to be an up-and-coming player in this market - offers HyperTransport, which it looks like Cray will be stretching to its limits on Red Storm; I'm not sure *how* long an HT bus can be, but one gets the impression they'll be stretching it as far as possible, and it's certainly high throughput/low-latency versus the technologies you'd usually find in use for 'networking.'
Anyhow, point is, those designs pack a lot of CPUs together with *very* fast interconnects (equivalent to 16, 32, 64+-way SMP), and have lots and lots of racks of those. (The Opteron/Red Storm approach sounds sexy to me, because I think HyperTransport should let them pack 'lots and lots' of CPUs together versus existing designs. I've yet to read anything about what they're actually doing with it, though.)
Now.. In contrast, an 'off the shelf' cluster is usually going to stick with Ethernet, and will only have 1 to perhaps 4 processors per [node-unit-where-the-CPUs-are-connected-on-a-fast-local-bus], depending on how affordable 'cheap' multiprocessor systems are at the time. But *everyone* building supercomputers bumps up against the latency/routing problem; it's just a question of whether it's a problem for, say, 50 Earth Simulator racks (aren't there quite a few more?) vs. 1100 PowerMacs. Experimenting with 'lots of little nodes' has led us to better understand the problem, and learn how to produce tuned topologies that can compete favorably with 'purpose-built' hardware. See: http://aggregate.org/KASY0/ [aggregate.org]
Now, the question *is* one of cost-benefit. Large supercomputers tend to be built with maintenance features and power efficiency in mind. In turn, a totally 'off the shelf' cluster like KASY0 has some advantages because each machine is a cheap, practically disposable 'module' unto itself, and can doubtless be downed off the cluster, pulled out and replaced with another while being easily bench-repaired (since, after all, it's a self-contained PC, rather than a CPU blade or some other random card that would require an expensive test rack to troubleshoot). Meanwhile, if you absolutely demand low-latency, you want one sort of design (Red Storm seems to be achieving it 'on the cheap,' by combining off-the-shelf - and thus cheap - chips and buses with smart 'custom-design' engineering) while if you can sacrifice some for throughput (jobs with few conditionals), you want another... (like 1100 G5 Macs on a shelf, wired with 'boring' gigabit ethernet, especially if Apple is giving you a bulk discount on the hardware).
So what I'm trying to say is... this is a *combination* of PR stunt and intelligent planning, and there's certainly a lot of 'good science' they could do with the beast - both in number-crunching and 'computer science' a-la cluster topologies. Whether they'll actually *use* it for such, or if it'll be solely a topology toy is anyone's guess.
I think there's some hope that it'll be the "Real Thing," though, since this would explain some of the weird rumors about FC-on-the-mainboard Macs. So they get a Real Monster, made of what will be revealed as "the new G5 Xserves" at the unveiling. The best of COTS *and* fresh d
Hmmm... (Score:5, Funny)
Must be a pretty boring campus...
Re:Hmmm... (Score:2)
Run Photoshop ! (Score:4, Funny)
Re:Run Photoshop ! (Score:3)
1100 G5s still can't... (Score:3, Funny)
As a VT student... (Score:5, Informative)
Funny, I haven't heard anything about it prior to today. Guess I'm just out of the loop then...
Re:As a VT student... (Score:2)
Well, since it's not official yet, it could just be that someone is imagining a Beowulf Cluster. (Wouldn't be the first time.) Hopefully it turns out to be true. The more supercomputers, the better the world.
Re:As a VT student... (Score:5, Informative)
Re:As a VT student... (Score:5, Interesting)
That's just Hokie (Score:5, Funny)
And it'll get skunked by 40 teraflops by Duke's supercomputer every year!
Re:That's just Hokie (Score:3, Informative)
I'm at Duke...let's just say I'm "in" on a lot of computing stuff...and I don't know of any supercomputer on campus of any significant magnitude. There's a couple clusters....
Maybe you were just making a joke....I had no idea. :)
I Think He's Talking About "The Blue Devil" (Score:2)
Bonus Quiz: How many Coach K-coached players have gone on to win NBA titles?
Problems with my supercomputer. (Score:5, Funny)
My Sun Enterprise 5000 is faster than this machine at times. Super computer addicts, flame me if you want, but I'd rather hear some intelligent reasons why I should use the G5 supercomputer over cheaper, faster clusters.
Re:Problems with my supercomputer. (Score:2, Funny)
Doom III isn't out yet.
flame me if you want
'kay. Wanker.
but I'd rather hear some intelligent reasons why I should use the G5 supercomputer over cheaper, faster clusters.
Well, you see, 5 is a bigger number than 4, and intel is stuck at 4, so ours is better. And AMD doesn't even *do* numbers!
Re:Problems with my supercomputer. (Score:3, Funny)
Re:Problems with my supercomputer. (Score:3, Funny)
Re:Problems with my supercomputer. (Score:4, Funny)
May I suggest you stop emacs to free resources for your FFT?
Re:Problems with my supercomputer. (Score:5, Funny)
And what's with having only 1100 mouse buttons? At the price they're charging, why didn't Apple provide 2200 mouse buttons and 1100 scroll wheels? And why did they use a dead operating system like BSD anyway?
Macs ? (Score:5, Interesting)
Wow, that'll make Apple's quarter for sure
Seriously though, why PowerMacs ? I've always been under the impression that intelloid machines are the cheapest commodity hardware around for equivalent processing power, if not the most exciting. Would anybody know why Powermac G5s are a better choice here?
(Note to computer zealots: it's not a flamebait, it's a genuine question, from someone who is rigorously ignorant of the Mac world. And just in case, the first sentence is a joke, too
Re:Macs ? (Score:5, Informative)
I can see it making a lot of sense. NASA and lots of bio companies use the G4s this way.
AltiVec (Score:3, Interesting)
What I am wondering is, what OS is this cluster going to run? I mean, have the BSD folks figured out how to scale? No chance it will be OS X...maybe AIX?
Re:AltiVec (Score:5, Informative)
Real world numbers don't bear this out. Check out the Photoshop and other application performance numbers for this. The gcc version used by the SPEC benchmarks used by Apple didn't even take advantage of AltiVec. When accounted for, and any institution making such a purchase would definitely have considered this, the AltiVec-enabled PowerPC chips totally spank x86 and others in number crunching tasks.
What I am wondering is, what OS is this cluster going to run? I mean, have the BSD folks figured out how to scale? No chance it will be OS X...maybe AIX?
An OS doesn't need to 'scale' to be a member of a cluster. It just needs to run the code locally and send the result back to the cluster master node.
Re:AltiVec (Score:5, Insightful)
Uh, no. 2 years ago, my roommate and I were both running the distributed.net client. I have a 500 MHz Powerbook G4 (100 MHz bus). He had a 1.4 GHz P4 with Rambus RAM. I got 4 million keys/sec. He got 2 million keys/sec.
So clock for clock, my machine was more than 5 times faster.
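Taking the figures quoted above at face value, the per-clock comparison can be checked with a few lines. This is only a sketch of the arithmetic; real distributed.net rates vary with client version and core, which is not accounted for here.

```python
def keys_per_mhz(keys_per_sec, clock_mhz):
    """Throughput normalised by clock speed."""
    return keys_per_sec / clock_mhz

g4 = keys_per_mhz(4_000_000, 500)    # 500 MHz Powerbook G4
p4 = keys_per_mhz(2_000_000, 1400)   # 1.4 GHz P4

raw_ratio = 4_000_000 / 2_000_000    # G4 is 2x faster in absolute terms
per_clock_ratio = g4 / p4            # about 5.6x faster per clock
```

So the raw advantage is 2x, and the per-clock advantage is the raw advantage multiplied by the clock ratio (1400 / 500 = 2.8).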
Re:AltiVec (Score:5, Informative)
Re:MOD PARENT DOWN (Score:4, Insightful)
http://n0cgi.distributed.net/speed/query.php?cput
Power PC 7450/7455 G4 1000 MacOS X 10.2 2.9005 RC5-72 10,594,666.00
Re:MOD PARENT DOWN (Score:2, Funny)
Re:Macs ? (Score:4, Informative)
The dual floating point units on the G5 will help, but it's nothing extraordinary. P4s and Athlons both have multiple floating point units. P4s are relatively orthogonal, Athlons less so. However, SSE2 allows for vectorized double precision operations. It is likely that for the Linpack benchmark, best-in-class P4 or Athlon architecture-based machines would outperform best-in-class G5 machines.
AltiVec is extremely powerful. However it is only useful for applications that don't require their floating point to be double precision. SSE2 is less powerful, but allows for double precision SIMD processing.
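A small sketch of what the single vs. double precision trade-off means in practice. Both AltiVec and SSE2 use 128-bit vector registers; AltiVec's floating point lanes are single precision only, while SSE2 can also treat a register as two doubles. The helper names below are made up for illustration, and `struct` is used only to emulate rounding to single precision.

```python
import struct

VECTOR_BITS = 128  # register width shared by AltiVec and SSE2

def lanes(element_bits, vector_bits=VECTOR_BITS):
    """How many values of a given width fit in one vector register."""
    return vector_bits // element_bits

def to_single(x):
    """Round a Python float (double precision) to single precision and back."""
    return struct.unpack("f", struct.pack("f", x))[0]

single_lanes = lanes(32)   # 4 single-precision values per vector operation
double_lanes = lanes(64)   # 2 double-precision values per vector operation

# Rounding error introduced by storing 0.1 in single precision.
error = abs(to_single(0.1) - 0.1)
```

Linpack is run in double precision, so the four-wide single precision lanes do not help there; that is the crux of the argument above.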
Comment removed (Score:4, Informative)
Re: (Score:3, Informative)
Re:Macs ? (Score:3, Insightful)
Without more details it's hard to tell
Re:Macs ? (Score:5, Informative)
The G5's floating point hardware is the most advanced to be found right now, either in standard double-precision or vector double precision.
(FYI: yes, this cluster exists, or will exist. Unfortunately I believe they will be using MPICH which might put a dent into their numbers.)
Re:Macs ? (Score:2)
Re:Macs ? (Score:5, Interesting)
A couple of things make them suitable for clustering:
* There's heaps of processor-processor bandwidth and memory bandwidth.
* On board gigabit ethernet.
* Monster fast execution of properly written vector code.
* Well designed cooling.
Of course, the bang/buck ratio could be an issue for some debate but there's little doubt that in comparison to other commercial unices it's an absolute bargain.
Dave
Re:Macs ? (Score:5, Insightful)
Re:Macs ? (Score:2)
Re:Macs ? (Score:2)
Re:Macs ? (Score:2)
Re:Macs ? (Score:5, Interesting)
2) You would be hard pressed to configure a dual-Opteron or dual-Xeon which could trounce the G5 in terms of speed and cost significantly less. Mac OS X Server also costs less than any version of Windows (pure capital cost here for an 1100-seat license), which may also have factored in.
3) My guess is that they have struck a fairly significant deal with Apple (perhaps even as low as Apple providing them at cost, though I doubt it's quite that low) in exchange for some degree of publicity when this thing is built.
Re:Macs ? (Score:3, Informative)
so no G5 Xserves soon? (Score:5, Interesting)
unless there is some reason the desktops are better for this project that i did not pick up on?
as for the above question about Macs.... depending on what they want to really do with this, Altivec is really efficient for some computations. all flame wars aside there have always been people clustering Macs for certain uses. i do not know how much of it was user preference or the software they wanted to run or the simplicity of getting the cluster running.
it is supposedly VERY simple to cluster Macs. there was a story on
Re:so no G5 Xserves soon? (Score:4, Interesting)
The heatsink is a large oblong about 5"x4"x6" with a thin grille like construction. It's just too big to go in the 1U Xserve. Give them some time to work on designing it to fit though. The G5 is an ideal CPU for the Xserve as you say.
Why G5's? I helped set up the racks (Score:3, Informative)
They are having someone write InfiniBand drivers for OS X just for this cluster.
I look forward to helping install 4GB of RAM + the InfiniBand cards in each of these bad boys.
It is great having connections!
Re:Macs ? (Score:3, Interesting)
Actually they're not that mum (Score:5, Informative)
Virginia Tech is in the process of building a Terascale Computing Cluster which will be housed in the Andrews Information Systems Building (AISB). For those who are interested in learning more about this project, we will host an information session on Thursday, September 4th from 11 a.m. to noon in the Donaldson Brown Hotel and Conference Center auditorium.
We look forward to seeing you there
Terry Herdman Director of Research Computing.
I'll try to remember to take notes on this and let you all know if there's anything interesting...
Not fast enough (Score:3, Informative)
Re:Not fast enough (Score:5, Informative)
Talk of it all over campus? (Score:5, Funny)
Yeah, chicks dig massive...computers.
No wait, no they don't!
Re:Talk of it all over campus? (Score:2)
Memory (Score:3, Informative)
Rus
Sounds familiar... (Score:2, Interesting)
Right after the Sony Playstation 2 launch, there was a big shortage. Several media stories blamed it on some "unnamed" Middle East country buying them all up to power their missiles and supercomputers (because, the rumor claimed, the PS2 was just so powerful).
Wonder if Apple is trying to "pull a Sony" here...
What operating system will they be using? (Score:4, Interesting)
The article makes no mention of the operating system that will be running on this supercomputer. I for one would like to see them get this done w/ OS X rather than use GNU/Linux.
I get to help build it =D (Score:2, Informative)
Getting a bit ahead of ourselves (Score:3, Insightful)
Err... I think somebody's getting a bit ahead of themselves here. =) Building parallel computing systems is complicated, and it may end up being quite a bit harder to realize the predicted performance than thought (not an uncommon occurrence). I'll believe it when they have the actual Linpack numbers.
Top 5? I don't think so (Score:3, Interesting)
There is only one machine in the top 5 that this cluster could beat. The rest of the world has had 6 months to build machines too.
This should be a top 10 machine for sure. Good to see more fast machines being built every day.
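For a rough sense of scale, the theoretical peak of the rumored cluster can be estimated from the numbers in the story (1100 dual 2 GHz machines). The 4 flops per cycle per CPU is an assumption not stated in the thread (two floating point units, each retiring one fused multiply-add per cycle), and the efficiency value is purely hypothetical, since measured Linpack results always come in below peak.

```python
def peak_gflops(machines, cpus_per_machine, clock_ghz, flops_per_cycle):
    """Theoretical peak: every FPU busy on every cycle."""
    return machines * cpus_per_machine * clock_ghz * flops_per_cycle

def sustained_gflops(peak, efficiency):
    """Scale peak by an assumed Linpack efficiency between 0 and 1."""
    return peak * efficiency

# 1100 units, 2 CPUs each, 2.0 GHz, assumed 4 flops/cycle/CPU.
peak = peak_gflops(1100, 2, 2.0, 4)

# Hypothetical 50% efficiency, chosen only to illustrate the calculation.
example_sustained = sustained_gflops(peak, 0.5)
```

Whether that lands in the top 5 or the top 10 depends on the efficiency the interconnect allows and on what everyone else submits for the same list.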
The Altivec stuff is the key, I'll bet. (Score:5, Interesting)
Fucking A (Score:3, Funny)
-Waldo Jaquith
Hoax (Score:2, Insightful)
XServe? (Score:2, Funny)
http://www.apple.com/server/clustering.html
Another brilliant idea, inspired by Apple (Score:5, Interesting)
Who cares?
APPLE G5'S DO NOT SUPPORT ECC.
The random bit error rate for 2200 DIMMs with 0.13u cells is roughly one '1' bit dropped to '0' every 9 hours. In other words: good luck getting any reliable, large-scale computation done with this cluster. (And I do mean "good luck" - they might get a run of two or three days without any problems once in a while.)
Now if only Apple would support PC3200 ECC DIMMS, which certainly do exist:
http://www.intel.com/technology/memory/ddr/vali
this cluster might be a bit more useful for real work.
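Taking the quoted rate of one flipped bit every 9 hours as given (it is the poster's figure, not verified here), and assuming errors arrive independently so that a Poisson model applies, the chance of a run finishing without any error looks like this:

```python
import math

def p_error_free(run_hours, mean_hours_between_errors=9.0):
    """Poisson model: probability of zero bit errors during a run."""
    return math.exp(-run_hours / mean_hours_between_errors)

nine_hours = p_error_free(9)    # about 0.37
two_days = p_error_free(48)     # under 1%
three_days = p_error_free(72)   # well under 0.1%
```

Under those assumptions a clean multi-day run is possible but rare, which matches the "once in a while" above.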
Re:Another brilliant idea, inspired by Apple (Score:5, Insightful)
But... a cluster should be redundant enough to withstand that sort of minor inconvenience and go on functioning without the errant node while it gets fixed, reboots or whatever.
I'll admit that building something smart enough to say "Node 206, you have a memory error. Bad G5, no donut!" is beyond the scope of my understanding.
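One generic way to catch an errant node without ECC is to run the same work unit on several nodes and take a majority vote. This is only a sketch of that idea, not something the thread says Virginia Tech is doing; the `vote` function and the node names are hypothetical.

```python
from collections import Counter

def vote(results_by_node):
    """Return (agreed_result, suspect_nodes) for one replicated work unit."""
    counts = Counter(results_by_node.values())
    winner, votes = counts.most_common(1)[0]
    # Require a strict majority, otherwise nothing can be trusted.
    if votes <= len(results_by_node) // 2:
        raise ValueError("no majority; rerun the work unit")
    suspects = sorted(n for n, r in results_by_node.items() if r != winner)
    return winner, suspects

# Node 206 disagrees with its two peers on the same work unit.
result, bad_nodes = vote({"node205": 42, "node206": 41, "node207": 42})
```

The obvious cost is that replicating every work unit three times cuts the useful throughput to a third, so it is a trade-off rather than a free fix.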
Won't someone please think of the social sciences? (Score:3, Funny)
DecafJedi
Bragging Rights (Score:5, Funny)
I've been tempted to order a dual G5. I've resisted the temptation by realizing that my only real reason for wanting it would be to awe friends and co-workers. Pretty shallow. I was ashamed.
What a surprise to find that the folks who buy multi-million dollar supercomputers seek some of the same shallow satisfaction that moves me--bragging rights.
Still, if a single order for 1100 units causes significant delays filling orders for other customers, Apple must not have been expecting to sell many of these things. Maybe I should place an order just to help out.
It's true. (Score:2, Funny)
Here's an official word [vt.edu] (search for Teraflop).
Also, here's the original e-mail that went out (a month ago). They never mentioned Apple though:
> Date: Mon, 28 Jul 2003 17:36:46 -0400
> From: Jason Lockhart <multimedia@vt.edu>
> Subject: Terascale Assembly Assistance
>
> Hello all,
>
> As you may know the College of Engineering in conjunction with the
> university Information Systems and Computing organization are
>
Talk about a ton of desktops in a server room (Score:5, Informative)
The stated objective was to be on the next 500 list. Dell and HP were considered, but they couldn't fill the order in time (possibly as they have made announcements of other large clusters recently). Someone leaked the story of the cluster meeting with Dell and HP to Apple, and Apple jumped at the chance and promised delivery.
Basically, the story is not a rumor from the point of view of the geeks on campus who have been affected by the preparations. I'll probably post the
I'm disappointed about this being only on the Apple section of
Re:Talk about a ton of desktops in a server room (Score:4, Informative)
Do the math - the new generator is rated at 600 kVA and is already carrying several hundred machines (including a very power-hungry Sun E10K and a number of E6K-class machines). There's not enough capacity on that generator for 1,100 more systems.
(And I just wanted to say "This is All Kevin's Fault" - except for a few unrelated parts we blame on Randy
Re:Talk about a ton of desktops in a server room (Score:3, Insightful)
I said there wasn't capacity on that generator. I didn't say anything about the existence of other generators - and there's a distinction between pulling copper to get power grid capacity to the cluster and having emergency power backup for same. That diesel is for emergency bac
1% of G5 orders (Score:5, Insightful)
And 8 hours@12.4GFlops...damn you Virginia Tech, you owe me a third of a quadrillion floating point multiplies!
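The arithmetic behind that complaint, using only the two numbers quoted (8 hours at 12.4 GFlops), as a quick check:

```python
def total_flop(hours, gflops):
    """Floating point operations performed at a steady rate over some hours."""
    return hours * 3600 * gflops * 1e9

lost = total_flop(8, 12.4)
quadrillions = lost / 1e15   # roughly 0.357, i.e. about a third
```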
G5 Vs. Itanium2 and Opteron: Some perspectives (Score:5, Informative)
Re:What? (Score:5, Informative)
I take it you don't look at Think Secret on a regular basis. It is, easily, the most accurate Mac rumors site out there. In fact, they have posted info on numerous occasions that has caught the attention of Apple's lawyers, and have been forced to pull it down and issue their standard disclaimer. Say what you will about other rumors sites (most of them simply feed off each other) but there are some startlingly reliable sources informing Think Secret. Frankly, I don't recall the last time they were wrong about anything they've posted.
Think Secret's Record (Score:5, Informative)
Bottom line? Like any other news organization, Think Secret has occasional misses. But those misses don't appear to include any of the items mentioned here. I think our record speaks for itself.
Nick dePlume
Publisher and Editor in Chief
Think Secret
Not! (Score:2, Interesting)
Re:Do they have a need for it? (Score:3, Informative)
Re:Do they have a need for it? (Score:5, Interesting)
But I would bet this will be not too dissimilar in use from the HP Itanium2 referenced earlier on slashdot. I would bet one of the paramount concerns this cluster would look at is the effect of farm runoff, and probably climatology too among other things.
Re:Do they have a need for it? (Score:2)
Re:Do they have a need for it? (Score:5, Informative)
Re:Do they have a need for it? (Score:3, Interesting)
Is there something particularly about building an
Re:Do they have a need for it? (Score:4, Informative)
Re:Poor choice on Apple's part (Score:5, Funny)
You mean you are about to order another in a day or two
Sincerely,
Steve.
sjobs@apple.com
PS - Seven Macs in ten years isn't hardcore. 1100 units ordered on one purchase order before they even ship (August, 2003)
Re:Poor choice on Apple's part (Score:5, Insightful)
(seriously - I can take out a loan
Re:Poor choice on Apple's part (Score:3, Insightful)
Making... oh... ONE good thing about the Empo. (Score:2)
Kinda curious why the Empo needs them, especially in the budget crunch. But I shouldn't complain (especially since I'm now an alumnus), I'll just watch University Surplus [vt.edu] and score a used G4 from there for cheap once they retire them in favor of the G5s. (WOOHOO!)