Apple Businesses Hardware

Virginia Tech to Build Top 5 Supercomputer?

Posted by michael
from the computer-hogs dept.
hype7 writes "ThinkSecret is running a story which might explain exactly why the Dual 2GHz G5 machines have been delayed to the customers that ordered them minutes after the keynote was delivered. Apparently, Virginia Tech has plans to build a G5 cluster of 1100 units. If it manages to complete the cluster before the cut-off date, it will score a Top 5 rank in the Linpack Top 500 Supercomputer List. Both Apple and the University are playing mum on the issue, but there's talk of it all over the campus."
This discussion has been archived. No new comments can be posted.

  • As a VT student... (Score:5, Informative)

    by Julius X (14690) on Saturday August 30, 2003 @06:25PM (#6835450) Homepage
    but there's talk of it all over campus.

    Funny, I haven't heard anything about it prior to today. Guess I'm just out of the loop then...
  • Re:What? (Score:5, Informative)

    by inkswamp (233692) on Saturday August 30, 2003 @06:26PM (#6835457)
    Slashdot posting vaporware news from an unreliable (Thinksecret et al) source?

    I take it you don't look at Think Secret on a regular basis. It is, easily, the most accurate Mac rumors site out there. In fact, they have posted info on numerous occasions that has caught the attention of Apple's lawyers, and have been forced to pull down and issue their standard disclaimer. Say what you will about other rumors sites (most of them simply feed off each other) but there are some startlingly reliable sources informing Think Secret. Frankly, I don't recall the last time they were wrong about anything they've posted.

  • by Nefrayu (601593) on Saturday August 30, 2003 @06:31PM (#6835481) Homepage
    I got the following email the other day:
    Virginia Tech is in the process of building a Terascale Computing Cluster which will be housed in the Andrews Information Systems Building (AISB). For those who are interested in learning more about this project, we will host an information session on Thursday, September 4th from 11 a.m. to noon in the Donaldson Brown Hotel and Conference Center auditorium.
    We look forward to seeing you there
    Terry Herdman Director of Research Computing.


    I'll try to remember to take notes on this and let you all know if there's anything interesting...
  • Not fast enough (Score:3, Informative)

    by cperciva (102828) on Saturday August 30, 2003 @06:33PM (#6835491) Homepage
    By my count, they'll have an R_peak of 8800 GFLOPS; unless they've got more efficient linpack code than anyone else, that will put them around 7th or 8th place.
  • Re:Macs ? (Score:5, Informative)

    by jbolden (176878) on Saturday August 30, 2003 @06:34PM (#6835498) Homepage
    Altivec. Certain types of vector code, when compiled to run only on a G4, outperform a Pentium even at 3x or more the clock speed (e.g. an 800 MHz G4 beating a 3 GHz P4). Assuming similar numbers for the G5, plus the across-the-board increase on all the non-vector operations, plus the fact that the 970s work together so much better....

    I can see it making a lot of sense. NASA and lots of bio companies use the G4s this way.
  • Memory (Score:3, Informative)

    by rf0 (159958) <rghf@fsck.me.uk> on Saturday August 30, 2003 @06:37PM (#6835513) Homepage
    One thing against clusters, compared with machines designed from the ground up, is memory access. On a Single System Image (SSI) system, any node can access the memory of another over fast interconnects. With a cluster, the memory has to be transferred over Ethernet, which even with 10 Gigabit Ethernet is still orders of magnitude slower than local memory.

    Rus
  • by Anonymous Coward on Saturday August 30, 2003 @06:46PM (#6835555)
    In my email the other day, I received this letter:
    Hello all,


    This email is to serve as invitation and notice of impending Terascale Facility assembly assistance. For those receiving this info for the first time know that Virginia Tech is building a top 10 supercomputer from scratch and we need your assistance. We do have one stipulation to volunteerism and that is you must not be a wage employee of the university. Grad students on GTA/GRA are fine as well as others outside the university that may wish to volunteer.

    We are expecting to receive machines next week!!! Yikes! In preparation for the assembly process, we need to get volunteers together at the AISB (Andrews Information Systems Building), 1700 Pratt Dr., this weekend. We are planning to have a process orientation session start at 10:00 AM on Saturday, August 30, and last no longer than an hour. We can give you a few more details about the project if you show up and have not been before.

    There are many things that need to be covered and many new volunteers needed. We have posted an electronic sign-up sheet for proposed shifts at (link deleted) We will need folks to sign-up as either a primary volunteer or on-call/backup person that we can call and bring in if we are short people. We know this is a very busy time for everyone and we want to get this done and over with quickly so it will not affect other work that needs to be done across campus. Once we have a definite date for the deliveries we will send out notification to those folks that said they were available on that day. We will have 48 hours notice of shipment, 72 hours notice of delivery. The machines will arrive on a staggered, every other day, schedule. Three shipments are expected for the total number of machines.

    Orientation today was postponed, however, so I won't have more details until Wednesday =/ I'm looking forward to helping out, though.
  • by harks (534599) on Saturday August 30, 2003 @06:49PM (#6835563)
    Schools have many different accounts set up to fund many different things. This is due to how donors donate money and specify that they want it to go toward a certain project or department. One department, say the CS department, might have received donations from CS alumni. Also, having large projects like this can generate lots of revenue through grants.
  • by absoluthokie (694231) on Saturday August 30, 2003 @06:51PM (#6835568)
    Exactly. Virginia Tech has a goal to become a top 30 research university. Having known about the plan for some time, I think this makes perfect sense. The departments that are building the cluster have gotten very large grants and donations from our great alumni to build this and become a better university for it. This construction can be compared to the stadium expansions: the stadium expansion is paid out of a different set of funds, as is research. The academic fund is hurting because alumni rarely give money for academic reasons, but more for football or research.
  • by purdue_thor (260386) on Saturday August 30, 2003 @07:01PM (#6835606)
    >> A box designed to be separate just will not have the latency advantage of a supercomputer designed from the ground up.
    I suggest you look at the list of the top supercomputers [top500.org] in the world. Most are clusters, ie. separate, distinct machines (just a quick glance shows the top 25 all are). It's just too darn hard to make a shared memory computer with 1000's of processors. So the common architecture is to make a cluster of smaller shared memory machines.

    Besides, most clusters built utilize special interconnects like Myrinet that offer low latency connections. They're more expensive than ethernet, but it's a supercomputer so you spend it.

    >> All this "the internet is one giant distributed computer" doesn't acknowledge this.
    On the contrary... people know this very well. That's why we see rendering and SETI processing as distributed. They don't really need to communicate with others often.
  • by Junta (36770) on Saturday August 30, 2003 @07:06PM (#6835622)
    There are tradeoffs, actually. This isn't like distributed.net or seti@home; this is a controlled network. They have complete control over the network switches, technology, and topology used, and can design the network to accommodate the problems the cluster will be designed to solve.
    For example, you could use Myrinet to get 2 Gigabit, super low latency connectivity, or Quadrics, or Infiniband, or just a well laid out Gigabit Ethernet with high end switches.

    In multiple processors in a box, the processors have to fight for the resources that box has to offer. NUMA alleviates demand on the memory, but IO operations (when writing to disk or to network) in a multiprocessor box block a good deal as the processor count in a node rises.

    The idea with clusters is that inter-node communication in most cases can be kept low. Each system can work on a HUGE chunk of a problem on its own, with its own dedicated hard drive and memory subsystem, and without having too much competition for the network card. A lot of problems are really hard to solve computation-wise, but are *very* well suited to distributed computing. A prime example of this is rendering 3D movies. Perhaps oversimplifying things, but for the most part, a central node divides up discrete parts (a segment of video), and each node works without talking to others until done, so the negative impact is minimal. Certain problems (e.g. nuclear explosion simulations, where temporal and spatial chunks interact more with one another) are much more sensitive to latency/throughput. Seti@Home and distributed.net are *extremely* insensitive to throughput/latency issues (not much traffic and very infrequent communication).
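    The rendering example above can be sketched in a few lines. This is only an illustration, not how any real render farm is built: threads stand in for cluster nodes, and the names `split_frames`, `render_chunk`, and `render_movie` are made up for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def split_frames(total_frames, chunk_size):
    """Central node: divide frame numbers into independent chunks."""
    return [range(start, min(start + chunk_size, total_frames))
            for start in range(0, total_frames, chunk_size)]


def render_chunk(frames):
    """Worker node: process one chunk without talking to any other worker."""
    return [f"frame-{n:04d}" for n in frames]  # placeholder for real rendering


def render_movie(total_frames, chunk_size, workers):
    chunks = split_frames(total_frames, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so reassembly is a simple concat.
        results = pool.map(render_chunk, chunks)
        return [frame for chunk in results for frame in chunk]


frames = render_movie(total_frames=10, chunk_size=4, workers=3)
print(len(frames), frames[0], frames[-1])
```

    The only communication is handing out the chunks and collecting the results, which is why this kind of job tolerates a slow interconnect.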
  • Re:AltiVec (Score:5, Informative)

    by Space Coyote (413320) on Saturday August 30, 2003 @07:07PM (#6835630) Homepage
    While the AltiVec unit is very impressive, The SSE2 unit on the P4 or the Opteron would have nearly the same performance and cost a whole heck of a lot less (I am betting if this rumor is true at all, then Apple has given the units to the school).

    Real world numbers don't bear this out. Check out the Photoshop and other application performance numbers for this. The gcc version used by the SPEC benchmarks used by Apple didn't even take advantage of AltiVec. When accounted for, and any institution making such a purchase would definitely have considered this, the AltiVec-enabled PowerPC chips totally spank x86 and others in number crunching tasks.

    What I am wondering is, what OS is this cluster going to run? I mean, have the BSD folks figured out how to scale? No chance it will be OS X...maybe AIX?

    An OS doesn't need to 'scale' to be a member of a cluster. It just needs to run the code locally and send the result back to the cluster master node.
  • Re:Not fast enough (Score:1, Informative)

    by Anonymous Coward on Saturday August 30, 2003 @07:11PM (#6835645)
    You left out a factor of 2. 2 processors * 1100 * 4 FLOP * 2 GHz = 17600 GFLOPS. Remember, the 970 has 2 FPUs each capable of 2 FLOPs/ cycle.
  • by reporter (666905) on Saturday August 30, 2003 @07:28PM (#6835699) Homepage
    After the introduction of the supercomputer called "Earth Simulator [com.com]" by NEC, many Americans went into paranoid mode. They feared that the Japanese "once again" had taken the lead in a crucial technology.

    American fears are unfounded. Numerous universities like Virginia Tech have trained a generation of American (not foreign) students in building the finest supercomputers. MIT, Carnegie Mellon University (CMU), and Virginia Tech (to name just a few) have launched large-scale research projects staffed by top American graduate students. Their work became the foundation of several generations of multiprocessors.

    By contrast, very few (if any) Japanese universities conduct large-scale research projects to build high-performance supercomputers. The Japanese government has tended to avoid funding this kind of research. Worse, there is little collaboration between industry and academia in Japan. Yet, precisely this kind of collaboration is needed for such large-scale projects: e.g. Virginia Tech is enlisting the help of Apple computer.

    American companies led by scientists trained at MIT and CMU could easily design a computer that outperforms the Earth Simulator. These companies simply have chosen not to do so because there are far more profits to be garnered by building commercial supercomputers geared for database transactions. In fact, the highest-performance commercial supercomputers nearly all come from the United States of America (IBM).

    The 21st century remains Pax Americana, not Pax Asia. The hordes of immigrants trying to get the hell out of Asia and into the USA underscore this fact.

    ... from the desk of the reporter [geocities.com]

  • Re:That's just Hokie (Score:3, Informative)

    by TiMac (621390) on Saturday August 30, 2003 @07:41PM (#6835748)
    And which supercomputer might that be?

    I'm at Duke...let's just say I'm "in" on a lot of computing stuff...and I don't know of any supercomputer on campus of any significant magnitude. There's a couple clusters....

    Maybe you were just making a joke....I had no idea. :)

  • Re:Not fast enough (Score:5, Informative)

    by bspath1 (703088) on Saturday August 30, 2003 @07:57PM (#6835797)
    Reposting my AC post:
    You left out a factor of 2. 2 processors * 1100 * 4 FLOP * 2 GHz = 17600 GFLOPS. Remember, the 970 has 2 FPUs each capable of 2 FLOPs/ cycle.
    Leaving out a factor of 2 in this case significantly alters the R_peak value. -Bruce.
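    For anyone following the arithmetic, here is a minimal sketch of the peak calculation being argued about. The function name is made up for illustration, and the 2 FLOPs/cycle figure behind the lower estimate is an assumption inferred from the "factor of 2" remark, not something the original poster stated.

```python
def r_peak_gflops(nodes, cpus_per_node, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: every floating-point unit busy on every cycle."""
    return nodes * cpus_per_node * clock_ghz * flops_per_cycle


# 1100 dual-processor machines at 2 GHz, as described in the thread.
low = r_peak_gflops(1100, 2, 2.0, 2)   # assumed: 2 FLOPs per cycle per CPU
high = r_peak_gflops(1100, 2, 2.0, 4)  # 2 FPUs x 2 FLOPs per cycle each

print(low, high)
```

    R_peak is only an upper bound; the Linpack result actually used for ranking depends on how efficiently the code and interconnect reach it.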
  • by Nick dePlume (164783) on Saturday August 30, 2003 @08:04PM (#6835825)
    As Zack pointed out, iWalk was not a Think Secret report; in fact, we debunked [thinksecret.com] it. For WWDC, we reported that Apple would announce 64-bit Power Macs as well as a videoconferencing camera that we said would be called "iSight," -- I think we're in the clear there. iWorks? I maintain that it is still a future Apple release. As for 12-inch and 17-inch PowerBooks, while we raised the possibility of a release that week, we specifically said we couldn't confirm the delivery date: "It's unclear when Apple plans to announce the upgrades..."

    Bottom line? Like any other news organization, Think Secret has occasional misses. But those misses don't appear to include any of the items mentioned here. I think our record speaks for itself.

    Nick dePlume
    Publisher and Editor in Chief
    Think Secret
  • by Kalak (260968) on Saturday August 30, 2003 @08:07PM (#6835839) Homepage Journal
    For those questioning whether this exists: the order is shipping, and the racks (a ton of them) are in the main Computing Center server room. First they required all servers to be moved into racks. Then they started moving servers around, including removing the Petaplex [vt.edu]. The power has been upgraded in the server room (the UPS backup generator, actually). This caused a morning of basically all the important servers on campus having to go down for one day in the summer - I hated waking up to go switch off machines for that one. The AC has been upgraded to accommodate the huge amount of heat to be put out. It wasn't until I heard about the cluster that all the changes in the Machine Room made sense. Now they're recruiting help to do the grunt work of putting all the machines in the racks.

    The stated objective was to be on the next Top 500 list. Dell and HP were considered, but they couldn't fill the order in time (possibly because they have recently announced other large clusters). Apple promised delivery after someone leaked the story of the cluster meeting with Dell and HP to Apple, and Apple jumped at the chance.

    Basically, the story is not a rumor from the point of view of the geeks on campus who have been affected by the preparations. I'll probably post the /. link to the campus geek list (if someone hasn't beaten me to it).

    I'm disappointed that this is only in the Apple section of /., since a cluster this size is worthy of the front page. (Rumor - and this is rumor since I haven't gone to direct sources on this - is that it will not be running OS X, and will probably run BlackLab or YellowDog [terrasoftsolutions.com] or SuSE.)
  • by Kirby-meister (574952) on Saturday August 30, 2003 @08:21PM (#6835879)
    As a CS student at VT, I received word of it days ago - Hello all, This email is to serve as invitation and notice of impending Terascale Facility assembly assistance. For those receiving this info for the first time know that Virginia Tech is building a top 10 supercomputer from scratch and we need your assistance. We do have one stipulation to volunteerism and that is you must not be a wage employee of the university. Grad students on GTA/GRA are fine as well as others outside the university that may wish to volunteer. We are expecting to receive machines next week!!! Yikes! In preparation for the assembly process, we need to get volunteers together at the AISB (Andrews Information Systems Building), 1700 Pratt Dr., this weekend. We are planning to have a process orientation session start at 10:00 AM on Saturday, August 30, and last no longer than an hour. We can give you a few more details about the project if you show up and have not been before. :-)
  • by KiahZero (610862) on Saturday August 30, 2003 @08:40PM (#6835940)
    Obviously you haven't looked at VT recently. Tuition and fees are only $7,500 [vt.edu], out of state. I can only wish that my tuition were that low. Hell, for in-state students, the room and board is the same price as tuition (around $2,000). But of course, you're modded up insightful, because you pulled a random idea out of your ass and presented it as fact.
  • Re:Macs ? (Score:5, Informative)

    by 11223 (201561) on Saturday August 30, 2003 @08:51PM (#6835961)
    Oh, and I suppose you were asleep during the discussion of the G5's dual, independent double precision floating-point units? And the out-of-order-execution engine that makes them usable? And the memory architecture that will make it scream on large data sets?

    The G5's floating point hardware is the most advanced to be found right now, either in standard double-precision or vector double precision.

    (FYI: yes, this cluster exists, or will exist. Unfortunately I believe they will be using MPICH which might put a dent into their numbers.)
  • Re:Macs ? (Score:4, Informative)

    by Erich (151) on Saturday August 30, 2003 @09:09PM (#6836004) Homepage Journal
    Altivec
    Doubtful. Altivec can only do single-precision floating point. It's pretty good at it, it can do operations 4 wide, but only single precision. Linpack needs double precision (at least, for the benchmark).

    The dual floating point units on the G5 will help, but it's nothing extraordinary. P4s and Athlons both have multiple floating point units. P4's are relatively orthogonal, Athlons less so. However, SSE2 allows for vectorized double precision operations. It is likely that for the linpack benchmark, best-in-class P4 or Athlon architecture-based machines would outperform best-in-class G5 machines.

    Altivec is extremely powerful. However it is only useful for applications that don't require their floating point to be double precision. SSE2 is less powerful, but allows for double precision SIMD processing.
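    To see why the single versus double precision distinction matters for numerical work, here is a small sketch that rounds a value to single precision using only the standard library. The helper name `to_single` is made up for illustration; it has nothing to do with AltiVec or SSE2 themselves.

```python
import struct


def to_single(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


tiny = 2.0 ** -30
as_double = 1.0 + tiny             # double: 53-bit significand keeps the small term
as_single = to_single(1.0 + tiny)  # single: 24-bit significand rounds it away

print(as_double == 1.0)  # False
print(as_single == 1.0)  # True
```

    Small terms that vanish like this add up over the many operations in a large linear solve.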

  • Re:AltiVec (Score:5, Informative)

    by JDWTopGuy (209256) on Saturday August 30, 2003 @09:26PM (#6836040) Homepage Journal
    No kidding. My 667Mhz Powerbook G4 gets 6.8 MKps, my friend tried it on his 2.53Ghz P4 and got 3.4 MKps, and my 1.4Ghz Athlon XP 1600+ gets around 4.5 MKps. I calculate a Dual 2Ghz G5 around 40 MKps, probably more. Imagine them running distributed.net on this bad*** cluster!!!! YOW baby!
  • by valdis (160799) on Saturday August 30, 2003 @09:34PM (#6836056)
    Actually, this summer's outage for the new diesel backup generator was something else entirely - that was merely replacing two older natural-gas fired generators that were no longer sufficient to fully back up all the existing hardware. That install has been in the planning stages for a long time, and was needed for current operations.

    Do the math - the new generator is rated at 600kva and is already carrying several hundred machines (including a very power-hungry Sun E10K and a number of E6K-class machines). There's not enough capacity on that generator for 1,100 more systems.

    (And I just wanted to say "This is All Kevin's Fault" - except for a few unrelated parts we blame on Randy ;)
  • by Anonymous Coward on Saturday August 30, 2003 @10:33PM (#6836259)
    1. The PPC970 draws from the Power4 lineage, which I have used for a long time. The PPC970 has 2 double precision FPUs, each capable of fused multiply-add instructions, leading to 4 flops/cycle/processor (2 units * 2 flops/cycle). This is identical to the Itanium2 FPU microarchitecture. The Opteron, on the other hand, can only do 2 double precision flops/cycle, which makes it only half as powerful on matrix-heavy scientific computations when compared to the PPC970 or the Itanium 2. The PPC970 should really be compared in FP terms to the Itanium2 at 1/10th of the cost, and at 2GHz it is clocked higher than the top-end 1.5GHz Itanium2 Madison. Moral of the story: read thy arstechnica.

    2. The standard benchmarking process (LINPACK) only uses double precision FP. If this rumor is true, then this machine is capable of an Rpeak (LINPACK) of 17.6 teraflops, which those of you who follow top500 will realize is quite substantial.

    3. If they are really using Infiniband, this should be a nice machine. Infiniband provides 10 Gbps (20 Gbps full duplex) of bandwidth, which is much faster than either Myrinet or Quadrics. Also, Infiniband latency is 10us, and the benchmarking process is bandwidth sensitive, not latency sensitive. On the other hand, this stuff is really expensive.

    If all of this is true, this would be a major engineering endeavor. Also, it is probably cheap. However, all in all, this could well just be a rumor (come on, it is thinksecret - remember iWorks). If not, this should be a fairly substantial machine.
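    A rough way to see the bandwidth versus latency point is the usual "latency plus size over bandwidth" estimate. This is a simplified model with a made-up function name, and the 2 Gbps comparison link reuses the 10us latency purely for illustration; real interconnects have protocol overheads this ignores.

```python
def transfer_time_us(message_bytes, latency_us, bandwidth_gbps):
    """Estimated one-way transfer time in microseconds: latency + size / bandwidth."""
    bits = message_bytes * 8
    return latency_us + bits / (bandwidth_gbps * 1000.0)  # 1 Gbps = 1000 bits per us


small = transfer_time_us(64, latency_us=10, bandwidth_gbps=10)           # latency dominates
large = transfer_time_us(10_000_000, latency_us=10, bandwidth_gbps=10)   # bandwidth dominates
slower_link = transfer_time_us(10_000_000, latency_us=10, bandwidth_gbps=2)

print(small, large, slower_link)
```

    For the 10 MB message the 10us of latency is noise, while dropping from 10 Gbps to 2 Gbps makes the transfer roughly five times slower.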
  • by Anonymous Coward on Saturday August 30, 2003 @10:43PM (#6836292)
    In multiple processors in a box, the processors have to fight for the resources that box has to offer. NUMA alleviates demand on the memory, but IO operations (when writing to disk or to network) in a multiprocessor box block a good deal as the processor count in a node rises.

    Seymour Cray once said that a supercomputer is a device for turning compute-bound problems into I/O-bound problems. In other words, if your job is I/O bound, great! That means you're making the best possible use of your compute resources.

    In the real world, most jobs require far more compute resources than they do I/O resources. So scaling to a thousand processors or more makes sense, because we can already scale I/O up to gigabytes per second, either disk or network, very easily.

    The idea with clusters is that inter-node communication in most cases can be kept low.

    Alas, the idea with supercomputers is that inter-node communication cannot be kept low. Consider terabyte-scale data set visualization, for example. There's simply no way to do that job without distributing a copy of the entire data set to every node. That makes it a really bad job for a cluster, but a perfect job for a supercomputer.

    Do not be fooled into thinking that clusters are superior to supercomputers, or vice versa. There are tasks that clusters can do cost effectively that supercomputers cannot do as cost effectively. However, there are tasks that supercomputers can do that clusters simply cannot do, no matter the cost. So a one-to-one comparison between the two is inevitably going to be incomplete and misleading.
  • by Anonymous Coward on Saturday August 30, 2003 @10:58PM (#6836343)
    They went with the G5's because they were the cheapest 64 bit solution and because they would use less power and generate less heat than alternative systems. That is the whole of it.

    They are having someone write InfiniBand drivers for OS X just for this cluster.

    I look forward to helping install 4GB of RAM + the InfiniBand cards in each of these bad boys.

    It is great having connections!
  • by stingerman101 (702479) on Sunday August 31, 2003 @12:20AM (#6836590)
    If the money granted was earmarked for a specific project, there is nothing the university can do to transfer it to another need. They either use it or lose it. Fortunately, these grants usually include monies for maintenance of the systems and the staff required, which means that personnel can be put under this new grant area and alleviate costs from another area. So in the end, grants like this help other areas of the school and will attract and create other sources of revenue.
  • Re:Macs ? (Score:1, Informative)

    by Anonymous Coward on Monday September 01, 2003 @12:27AM (#6842781)
    Your answer seems a bit confused. Face it, without SSE2 the P4 would stink if it weren't for the fast clock. Heck, you wouldn't even have good FP performance, or why else would Intel tell everyone to use SSE for it instead of the old FP unit?

    Speaking as somebody who has written a lot of both Altivec and SSE code, it is a bit more complex than that. Even the intel compiler doesn't generate SIMD code in most circumstances. However, the SSE/SSE2 instruction sets include instructions that are essentially normal floating-point operations operating only on the first element of the XMMn registers.

    The reason why Intel tells you to use SSE instead of x87 is that they have focused on optimizing the SSE unit and only provide the x87 registers for compatibility. This way all code will be fast (even if it is "normal" FP code and only uses the first element), and in the very few cases where the compiler or programmer can optimize and use all elements it will be very fast, without having to move stuff between the normal FP registers and the XMM ones.

    Go ahead and compile with -S to see the assembly output - it's quite interesting.

    The fact that AltiVec actually scales pretty well for many tasks doesn't prove that the G4 is "unbalanced", it shows that SSE2 isn't particularly good.

    As a theoretical pissing contest about the cleanest vector implementation I agree that Altivec is much better. The problem is that most problems aren't perfectly vectorizable, and as soon as you need to access unevenly spaced (not only unaligned) data Altivec sucks.

    SSE, on the other hand, is an ugly hack just as a lot of other things from Intel. But just as most of the other ugly things from Intel the performance often wins over a theoretically nicer architecture in practice. For general algorithms SSE can often be better than Altivec. Intel isn't stupid - they've got some of the best engineers in the world.

    It isn't a religion: OS X and Apple aren't bad just because SSE can be pretty good in practice. However, I do agree with the statement that the G4 is a bit "unbalanced" in that it is very fast for a small set of programs that have been manually tuned with Altivec, but it is very slow (especially on double precision FP) on the huge majority of compiled code.

    In 20 years I think we're all going to be using the type of explicitly parallel instructions as present in ia64. It is horribly difficult to program, but it allows you to get much closer to the theoretical peak, and you can increase performance with more integer/fp units instead of only higher frequency. (Normal compilers can't really schedule more than 2 fp units).

    Again - don't underestimate Intel. They've got WAY too much money, patents, & prestige invested in ia64 to let it fail. With Madison they have just shown us they can produce the fastest CPU in the world - all that remains now is getting the price down, and that's economies of scale (read: a matter of time).

  • Re:Macs ? (Score:4, Informative)

    by jcr (53032) <jcr.mac@com> on Monday September 01, 2003 @04:54AM (#6843458) Journal
    Altivec can only do single-precision floating point.

    Not quite correct. Apple has extended-precision libraries available for Altivec.

    -jcr
  • Re:AMD x86/64 (Score:2, Informative)

    by stingerman101 (702479) on Monday September 01, 2003 @02:33PM (#6845639)
    Well, the only AMD alternative is the 2GHz Opteron, whose processor alone costs about 900/processor. You won't be able to get a dual 2GHz Opteron for less than Apple's retail dual 2GHz G5. Also, you have to make sure PCI-X slots are there for the interconnect cards they are using. But as was stated elsewhere, the G5s can process twice the scalar FP instructions of the Opteron or the Itanium per cycle. The University also relies on vector math, and the G5s have a significantly better SIMD unit.
  • Re:Macs ? (Score:3, Informative)

    by jcr (53032) <jcr.mac@com> on Tuesday September 02, 2003 @05:53PM (#6854477) Journal
    Go look up the altivec instruction set and tell me which instructions work on packed double-precision values.

    As I said, Apple's published extended-precision libraries that use Altivec. You can indeed use altivec for double precision operations, you just have to use multiple passes.

    You can do double precision on PPC, but you don't get anything from the vector unit. Only the FPUs.

    You are mistaken.

    -jcr
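    One generic way to get extra precision out of single-precision arithmetic in "multiple passes" is to carry each value as a pair of floats, using an error-free addition such as Knuth's TwoSum. The sketch below emulates single precision in plain Python; it illustrates the general idea only and is not a description of how Apple's libraries work. The helper names `f32` and `two_sum` are made up for this example.

```python
import struct


def f32(x):
    """Round to single precision, emulating a single-precision operation result."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_sum(a, b):
    """Add two single-precision values; return (rounded sum, exact rounding error)."""
    s = f32(a + b)
    b_virtual = f32(s - a)
    a_virtual = f32(s - b_virtual)
    err = f32(f32(a - a_virtual) + f32(b - b_virtual))
    return s, err


hi, lo = two_sum(1.0, 2.0 ** -30)
print(hi, lo)  # the pair (hi, lo) together holds what a single float would drop
```

    Each extended-precision addition costs several single-precision operations, which is the price of the extra passes.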

