More Details Of IBM's Blue Gene/L

Bob Plankers writes "By now we've all heard about IBM's Blue Gene/L, LLNL's remarkable new supercomputer, which is intended to be the fastest supercomputer on Earth when done (360 TeraFLOPS). IBM has released some new photos of the prototype, and renderings of the final cluster. Note that the racks are angled in order to permit hot air to escape vertically and reduce the need for powered cooling. The machine uses custom CPUs with dual PowerPC 440 processing cores, four FPUs (two per core), five network controllers, 4 MB of DRAM, and a memory controller onboard. The prototype has 512 CPUs running at 700 MHz, and when finished the entire machine will have 65536 dual-core CPUs running at 1 GHz or more. Stephen Shankland's ZDNet article also mentions that the system runs Linux, but not on everything: 'Linux actually resides on only a comparatively small number of processors; the bulk of the chips run a stripped-down operating system that lets it carry out the instructions of the Linux nodes.'"
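
A quick back-of-the-envelope check of those numbers (a sketch only; the assumption that each FPU retires one fused multiply-add, i.e. two floating-point operations, per cycle is mine, not stated in the article):

    # Rough peak-FLOPS estimate for the finished machine.
    chips = 65536                 # dual-core CPUs in the full system
    cores_per_chip = 2
    fpus_per_core = 2
    flops_per_fpu_per_cycle = 2   # assumed: one fused multiply-add per cycle

    def peak_tflops(clock_ghz):
        """Theoretical peak, in teraflops, at a given clock speed."""
        flops = (chips * cores_per_chip * fpus_per_core
                 * flops_per_fpu_per_cycle * clock_ghz * 1e9)
        return flops / 1e12

    print(peak_tflops(0.7))   # ~367 TFLOPS -- in line with the quoted 360 TeraFLOPS target
    print(peak_tflops(1.0))   # ~524 TFLOPS if the chips really reach 1 GHz

At the prototype's 700 MHz, the arithmetic lands almost exactly on the quoted 360 TeraFLOPS, so the peak rating presumably assumes a clock near that speed.
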
This discussion has been archived. No new comments can be posted.

  • Doom3? (Score:5, Funny)

    by arcanumas ( 646807 ) on Sunday November 30, 2003 @07:19AM (#7591966) Homepage
    Well, it may be able to play Doom3 when it is released.
  • by incal ( 728144 ) on Sunday November 30, 2003 @07:27AM (#7591977)
    no matter how many CPUs it will get. Maybe it's better to invest time and resources in David Deutsch's research on quantum computers? http://www.qubit.org/people/david/David.html
  • Infinite (Score:5, Funny)

    by Raynach ( 713366 ) on Sunday November 30, 2003 @07:28AM (#7591978) Homepage
    I'm really impressed with this computer. I think it's going to be the first computer that can finish an infinite loop in under an hour.
  • by Manywele ( 679470 ) on Sunday November 30, 2003 @07:33AM (#7591986)
    This will be sure to boost the efficiency of travelling salesmen everywhere.
    • by Vegard ( 11855 ) on Sunday November 30, 2003 @07:56AM (#7592032)
      Actually, we'd better let the salesmen travel. It's a little-known secret that the reason computers are so bad at solving the travelling salesman problem is that those who design computers are technicians, and everyone knows that tech people hate salesmen, so the longer they spend travelling, the better for the techs.
    • The traveling salesman problem should really be renamed the airline scheduling problem. Airlines spend a LOT of money on large computers to wring more efficiency out of their huge capital investments.
  • by vogon jeltz ( 257131 ) on Sunday November 30, 2003 @07:34AM (#7591990)
    ... those were the times. Ahhh, memories!
  • Only a PPC 440? (Score:1, Interesting)

    by Anonymous Coward
    Why not more powerful CPUs? A 440 is hardly any kind of workhorse. A G4 at that speed would be too hot, but since PIII machines can run with just a small passive heatsink now, wouldn't that have been a much better choice?
    • Re:Only a PPC 440? (Score:5, Informative)

      by shaitand ( 626655 ) on Sunday November 30, 2003 @08:01AM (#7592039) Journal
      No, it's a supercomputer. Those are RISC processors; a PPC 440 in reality gives better performance than a CISC processor like the PIII.
      • Re:Only a PPC 440? (Score:1, Informative)

        by Anonymous Coward
        RISC vs. CISC is a moot point nowadays.

        A PIII is as much a RISC processor under the hood as the PPC, but neither are pure RISC.

        Pure RISC sucks, pure CISC sucks.
      • Re:Only a PPC 440? (Score:2, Insightful)

        by vicotnik ( 556724 )
        Why did this get such a high score? Why not compare an old Sparc 4 against an Athlon 64 or a Pentium 4? The Sparc has a RISC processor, so it should be faster, right? The PPC 440 might be faster for a number of reasons, and being RISC instead of CISC is hardly even among the most significant. That x86 has a crappy ISA doesn't mean that CISC has to be slower than RISC in general.
      • Re:Only a PPC 440? (Score:5, Insightful)

        by be-fan ( 61476 ) on Sunday November 30, 2003 @11:56AM (#7592775)
        -1: Disinformative

        RISC vs CISC means very little these days. Most current CPUs have a core even more minimal than RISC chips, but present a CISC (in the case of x86) or RISC (in the case of the G5) interface to the outside. They used the PPC 440 for different reasons:

        1) IBM had to do significant custom engineering for it, and they own the PPC 440 core. That allowed them to use it to design an SoC.

        2) They needed to add FPU hardware, which is easier to do on a design they own. The PIII only has one FPU, while this chip has two per core. IBM had to add this to the design, because the regular PPC-440 has no FPUs.

        3) The PPC-440 was designed from the beginning to be an embedded CPU. At 1GHz, a stock PPC-440 consumes about 2.5W. Even a low voltage PIII consumes more than that.
        • No, it's not THAT disinformative. Although RISC/CISC doesn't say everything about processors, there are a few things to consider:

          - RISC needs smaller die sizes. The free space can be used for more pipelining etc. => more speed!
          - Compilers perform way better on RISC. If you do not code in assembler => more speed!
          • Cache efficiency goes WAY up with a CISC design, and since cache size and memory latency are the number one bottlenecks in modern CPU designs, CISC wins. Sorry, but even the G5 is a lot more CISC-like than a traditional RISC processor as far as the ISA goes. At the core, almost all CPUs are a minimalistic RISC design, as the grandparent poster stated.
            • IMHO, cache efficiency only goes way up if your code size shrinks through the use of a more expressive instruction set. But it does not follow that CISC instruction sets are more expressive than RISC ones, despite their name. Look for example (not high-perf. computing, I know, I know, but anyway) at ARM Thumb. That's 16 bits per instruction. Clearly RISC.
  • 4 MB DRAM (Score:3, Funny)

    by Sensei_knight ( 177557 ) <.sensei_knight. .at. .yahoo.com.> on Sunday November 30, 2003 @07:41AM (#7591999) Homepage
    Holy shit, where do I buy one of those!
    • Re:4 MB DRAM (Score:5, Interesting)

      by otis wildflower ( 4889 ) on Sunday November 30, 2003 @07:50AM (#7592019) Homepage
      4MB per CPU, each with 2 processing cores, and an onboard memory controller.

      Final version to have 65536 CPUs.

      Smells like 256GB to me, which is pretty decent in _any_ book, especially if it lives on the same silicon as the CPU... (quick arithmetic sketch after this thread)
      • Yes, but why DRAM? Has supercomputing stopped using faster static memory in favor of cheap but slow dynamic RAM, with its CPU-hogging refresh (those refreshes are a big part of why a P1 will run reasonably well side by side with a P4 for 90% of computing operations when you stick a fast drive controller card in it), as well?
        • Re:4 MB DRAM (Score:5, Informative)

          by javiercero ( 518708 ) on Sunday November 30, 2003 @08:24AM (#7592085)
          Because they used embedded DRAM, which, although not as fast as SRAM, gives more storage in fewer transistors. This leads to a smaller die and lower power/heat dissipation.

          If your P1 runs at the same speed as your P4 for 90% of operations, then there is something wrong with your computer! The HDD is not the bottleneck for most modern computers, as they have enough memory to minimize page faults for most common home computing tasks.... Startup times, however, may be equal, since both machines have to get the data/program from the HDD... once stuff is in memory, bye-bye P1....
          • Let me correct myself. When running WINDOWS, both machines are around equal speed for 90% of operations. On Windows the drive plays a much bigger part, because regardless of how much memory you have, the drive swaps. Also, a P1 uses EDO RAM, so refresh does not sap the CPU.

            Get two systems together, both with the same amount of memory and the same Windows OS. Then install an IDE controller in the P1 and put fast IDE drives on both systems (a faster SCSI controller will do as well). Put something
          • Well, "editors" and such ...
            On a more serious note, I was probably too quick with an obvious joke (4 MB of RAM, very funny).
            What I'd actually like to know:
            Wouldn't these babies, considering *4* FPUs per dual core, be *screaminggggggggg* on typical tasks like Fluent, Ansys, Abaqus, and in general any FPU-intensive task?
            Wouldn't these computers be a revolution (in the sense of the word) for companies looking for "the bang for the buck"?
            Just wondering...
        • Why DRAM? eDRAM has a 4x density advantage over SRAM, so they're getting 4MB on-die instead of 1MB. You can read the paper by IBM's eDRAM developers [ibm.com] for 15 pages of detail, if you wish. Mind your megabits versus megabytes... array designers always talk megabits. DRAM's not as slow as you think.
          • With DRAM, the higher the clock, the more wasted CPU clock cycles; it scales that way.

            Definitely higher density at a lower price, but I thought the idea of supercomputers was to build the most powerful machine possible at ANY expense?

            Not to say that just thinking of having even the prototype in my basement doesn't make me want to cream my jeans. But this is not news: if you want lots of memory real cheap, you go with dynamic RAM; if you want the fastest RAM, you go with static. Dynamic RAM bears a double pena
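
    For anyone checking the 256 GB figure in this thread, a minimal sketch (it treats the 4 MB as per-chip embedded DRAM, as the replies above suggest; that reading is an assumption):

      chips = 65536            # dual-core CPUs in the finished machine
      edram_per_chip_mb = 4    # assumed: 4 MB of on-die embedded DRAM per chip
      total_gb = chips * edram_per_chip_mb / 1024
      print(total_gb)          # 256.0 -- i.e. 256 GB of on-die memory across the whole machine
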
  • by Decameron81 ( 628548 ) on Sunday November 30, 2003 @07:46AM (#7592013)
    "The prototype has 512 CPUs running at 700 MHz, and when finished the entire machine will have 65536 dual-core CPUs running at 1 GHz or more."


    Whoa, this is the first time I've thought that a box with 512 CPUs at 700 MHz each is crap.

    Diego Rey
  • It's gonna be 512 MB for BlueGene/L(ite) and 1 GB for proper BlueGene
    • It's gonna be 512 MB for BlueGene/L(ite) and 1 GB for proper BlueGene

      I mean, per node :-)

      AFAIK, 512 MB is just too little for proper protein-folding calculations, while 1 GB provides enough capacity... And, of course, no swap is possible in these types of systems
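
    A quick sense of scale for those per-node figures (a sketch; it assumes they really are per-node main memory across all 65536 nodes, which the article doesn't confirm):

      nodes = 65536
      for per_node_mb in (512, 1024):
          total_tb = nodes * per_node_mb / (1024 * 1024)
          print(per_node_mb, "MB per node ->", total_tb, "TB aggregate")   # 32.0 TB and 64.0 TB
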

  • by mOoZik ( 698544 ) on Sunday November 30, 2003 @07:49AM (#7592018) Homepage
    I think I wet my pants.

  • What's new? (Score:4, Interesting)

    by LeoDV ( 653216 ) on Sunday November 30, 2003 @08:02AM (#7592043) Journal
    So, you mean they're going to build a computer that's going to be bigger, faster and with higher number stats than the current #1? Shocking!

    Sorry about the sarcasm, I'm only asking to be proven wrong, but isn't Blue Gene just more of the same, only bigger? Big Mac was interesting because of how cheap it was and because it was the first of its kind to use Macs; the Earth Simulator was interesting because it brought back custom chips for supercomputing as opposed to off-the-shelf components; we've been reading about IBM's dishwasher-sized supercomputer and articles about efficient supercomputing; so what's new about Blue Gene, besides being newer and bigger?

    Once again I'm not bashing, I haven't read much of anything but the /. blurbs, so I'm asking, is it just a bigger supercomputer, or does it have any "real" innovations?
    • Re:What's new? (Score:5, Interesting)

      by javiercero ( 518708 ) on Sunday November 30, 2003 @08:30AM (#7592097)
      Well... the Earth Simulator was bigger and faster... as was Big Mac. Every advance in computer design and fabrication has been about "bigger and faster." It may sound trivial to you, but that is because you have no idea what is involved in making things "bigger and faster."

      What is significant about Blue Gene is that it is some sort of compromise between off-the-shelf parts (PPC-based processing elements vs. the Earth Simulator's SX-based custom vector PEs) and efficient interconnection (unlike a plain crappy cluster such as Big Mac, it has a better interconnect at multiple layers, starting with dual cores per die).

      In the end it all leads to the same goal: tackling bigger problems faster. So it may sound trivial but there is a lot of research going into this baby.
    • Re:What's new? (Score:5, Interesting)

      by tigertiger ( 580064 ) on Sunday November 30, 2003 @08:54AM (#7592141) Homepage
      Beowulf-style clusters are a big waste in terms of additional circuitry and hardware that you do not need in a supercomputer, from I/O busses to the power supply. Of course, since off-the-shelf components are so cheap, it is still cheaper to buy the stuff than to design your own tailor-made circuitry - up to a certain scale.

      That is where IBM tries to go: BlueGene's design is based on a system-on-a-chip - everything (except memory) is integrated on a single chip. In the long run, this allows them to build systems much larger than you could with a Beowulf. They are basically aiming for a system where you can easily add computing power by simply putting in a few more chips, and the thing will scale. They are doing the same thing for storage with this brick [slashdot.org]

      BlueGene is also the first supercomputer marketed to the life sciences. It's interesting to see that it developed from a project at Columbia University called QCDOC [columbia.edu], for "Quantum Chromodynamics on-a-chip", which did research in computational high-energy physics, and QCDSP before that, which used DSP processors to build a supercomputer about ten years ago. Both are an instructive example of how academic research in the long run becomes industrially relevant, and of how science changes.

      • That is where IBM tries to go: BlueGene's design is based on a system-on-a-chip - everything (except memory) is integrated on a single chip.
        Actually, even the 4MB of eDRAM is on the chip.
    • by RalphBNumbers ( 655475 ) on Sunday November 30, 2003 @09:23AM (#7592222)
      What's big about BlueGene/L is that it's small. That 512-processor prototype they mention in this article is the dishwasher-sized computer you heard about.

      BlueGene/L is about driving down the cost of supercomputing, not only in terms of money spent on hardware, but in terms of space, cooling, and maintenance, while at the same time improving scalability.

      BlueGene/L is going to put 65,000+ processors in less space, using less power, and costing less, than many of today's >10,000 processor systems.
      They do this with a minimalist approach: each processor is an SoC (system on a chip), with everything from the memory controller to the internode networking to the two cores and 4 FPUs on the die, and the only other thing in a node besides the processor is a bit of RAM. This allows them to use much less power per node and gives them less heat per node to dissipate, which lets them pack the nodes much closer, which cuts down on internode latency, which increases scalability.
      • Dammit, my less than symbol should be pointing the other way.
      • I don't even know if it's so important that the computer is small. What is more important is what it allows us to simulate and what it is a step toward simulating. It allows us to simulate protein folding - which is a major step in developing drugs. It is also a major step toward simulating an entire cell or the organelles of a cell. We need this computing power so we can see what effect a drug has without making it, running tests on humans/animals, etc.
    • I thought the point of Blue Gene was to be faster and smaller, and that it follows the efficient computing idea where you get significantly more TF per watt, much more TF per cubic volume, much less cooling needed per TF, etc.
    • Where I come from, we do something called "lattice gauge theory", which basically simulates QCD, the force that e.g. binds quarks into protons and neutrons. This is an amazingly hard problem, which is not solvable or even tractable with pencil and paper (theorists' favorite tool).

      Now, Ken Wilson (= cool Nobel Prize guy), who basically started this field, gave a famous estimate in a talk at a conference some time back, stating that to do a serious project in lattice QCD, one would need some (listen up n

      • I'm starting to admire their strategy
        I concur - I used to be in lattice gauge theory and moved into computational biology :-). You have to follow the money, and IBM is very good at this. Also, biology is more fun; they are just starting to solve problems seriously on a computer.
  • by Epsillon ( 608775 ) on Sunday November 30, 2003 @08:49AM (#7592129) Journal
    The standard of trolling has certainly fallen recently. Where's the SCO licence fee estimate for the finished 65536 processor SMP unit? You got a better class of idiot in those days... ;o)
  • by The Other White Meat ( 59114 ) on Sunday November 30, 2003 @08:52AM (#7592138)
    If you actually look at the picture closely, you will see that the racks themselves are NOT angled to reduce active cooling.

    At the left side of the row of racks, there is an angled cover, which is either decorative or being used to force cold air down the row of racks. Likely it's just decorative, and the cold air is being forced up from the raised flooring below.

    Just like it is in every other enterprise-grade computer room...

  • There's a box in the background of this [ibm.com] picture which has "IBM Confidential trash" written on it.

    I guess corporate espionage is quite real for these guys.
  • by Morgaine ( 4316 ) on Sunday November 30, 2003 @09:39AM (#7592259)
    The part of the article that I found most interesting was:

    Linux actually resides on only a comparatively small number of processors; the bulk of the chips run a stripped-down operating system that lets it carry out the instructions of the Linux nodes.

    The "stripped-down operating system" must be the distribution nucleus on the compute-only subnodes, presumably something that allows the Linux nodes to distribute the code and I/O of computations to them, to query or control their state during debugging, and to reacquire lost processor control.

    It's only a matter of time before those of us who already have sizeable LANs at home will have embedded compute-only clusters within them too. Those would differ substantially from the typical Linux clustering for high availability. Instead of a non-Linux nucleus on those subnodes though, I'd prefer to see a pretty ordinary Linux kernel running slaved to remote masters.

    Is anyone already playing with something like this in their Linux clusters?
  • Looking at the photographs, the entire beast resides in 64 rack cases. With 42 units per case and 65536 CPUs total, that's about 24 CPUs per unit. Not bad :) I can't imagine the overall heat of the thing.
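
    A quick check of that arithmetic (a sketch; the 42-units-per-rack figure is the parent's assumption of standard 42U racks, not something stated by IBM):

      total_chips = 65536
      racks = 64
      units_per_rack = 42                                # assumed standard 42U racks
      chips_per_rack = total_chips / racks               # 1024.0 chips per rack
      chips_per_unit = chips_per_rack / units_per_rack
      print(chips_per_rack, round(chips_per_unit, 1))    # 1024.0 24.4 -- roughly the 24 per unit above
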
  • Huh. The guy who initiated and led the Blue Gene project in the beginning was Marc Snir [uiuc.edu]. But then I believe some major fallout happened and it underwent major technical and managerial changes. Oh well.
  • Besides the fact that their Nikon D100 has a stuck (hot) pixel, the pictures of people (first "set" on the page) are really bad quality-wise, and there is not much creativity - e.g. two shots of the same geek (Hall) taking heatsink temperatures from slightly different angles aren't exciting even to fellow geeks.

    Other than that, keep up the great work IBM!
  • I hear:

    Oh wow, another technical marvel

    Oh Gee, another super computer...

    Morons...

    The whole point here is that it makes the simulation of folding a complete gene possible in about a year's time.

    If THAT doesn't bowl you over, don't post.

    p.s. I can hear the rest of you "umm... so?" people and I can't help you. Sorry. :)
    • The whole point here is that it makes the simulation of folding a complete gene possible in about a year's time.


      Isn't that folding a protein?

      But how big a protein?

      A chromosome would take a lot longer to simulate - but it's essentially a double helix when you get down to the molecular level. That would be a good test, though - if you take DNA and can't simulate it going into a double helix, how can you trust the computer?
  • ...If they ported VMware over to run on this bad boy? Imagine the number of guest OSes you could run. This thing could be the data center of all data centers.
    But otherwise, for all intents and purposes, it's extremely proprietary and will ultimately run just a few specialized applications.
    Nevertheless, with virtualized computing and behemoth systems like these, the future of data centers is sure to change.

  • Maybe it's in part because I am currently studying these molecules, but the movie is cool. It's got nice cello background music, and the resolution is well above average. I never thought watching a molecular simulation would make me feel that way.
  • Er... with this power I want to crack some MD5s!!!!! :)
  • ...decided to finally answer the great question of Life, The Universe and Everything?

    If they need help with that, they can read Douglas Adams's "The Hitchhiker's Guide to the Galaxy".
