Hardware

Linux Cluster attains 125.2 GFLOPS 66

akey writes "CPlant98, a Linux cluster composed of 400 Digital Personal Workstation 500as, achieved 125.2 GFLOPS, which would place it at #53 on the Top 500 list. And this was only a 350-node run... " I'm hearing rumors of 1000+ Linux Clusters. I'm itchin' for it to come out of the closet so we can see some real benchmarks.
  • Such a machine exists. See the subject for the name.

    It's linked to from the Beowulf site somewhere.
  • Maybe we've worked with different style cases, but the ones I've used don't have power supply issues.

    Probably, there are so many different cases out there. Many of the 'slim' cases I have seen only have 120-175 watt power supplies, which may be adequate for most uses, but may not be the best for sustained heavy duty use. Typical mid tower cases usually have 200-250 watt supplies.

    As for floppy drives - simple - don't put one in the box.

    The problem there is that it may increase the time necessary to service a machine if it dies.

    For bulk serving you don't need it.

    Very true under normal circumstances, but my concern is when the proverbial excrement hits the rotary oscillator. I want to be able to fix any problem ASAP. Having to find and install a floppy drive in a machine to bring it back up is time I may not want to take.

    Nor a CD ROM drive.

    That is normally something that you can usually do without, provided one is available on a network reachable machine.

    If the box is at a colocation, you're going to get to it via ssh, not standing in front of it.

    Provided you aren't dealing with a crashed hard drive or some other issue that can't be solved remotely.

    Another complaint I have about a lot of 'slim' cases I've seen is that many of them have limited numbers of front-panel accessible drive bays. While that isn't a big deal for most things, one useful thing when you are dealing with a large number of servers is the inexpensive IDE 'lock and load' trays, which make swapping hard drives in and out much faster and easier. They can make large scale upgrades or dealing with crashed drives a lot faster, since you can do the work on another machine and then only take down the server box to do the actual swap.

  • I honestly don't know for certain, but aren't the big Cray machines and other microprocessor-based supercomputers effectively clusters of SMP nodes? Could the disparity here be the fairly weak performance of Intel's SMP scheme?
  • Or get a bunch of SBCs - you can put at least four 2-way SBC cards in a standard backplane in a 19" rackmount and stack them 6 or 10 high, giving you something like 80 CPUs in a standard network rack. Five racks across is about 8 linear feet for 400 CPUs.
  • Provided you aren't dealing with a crashed hard drive or some other issue that can't be solved remotely.

    Of course, on this machine the compute nodes aren't equipped with hard drives... :) Seriously, in a huge cluster like this, if a node fails, they will take it out, and may not even bother trying to fix it, I imagine.

  • No, the big Cray machines are not, in fact, clusters. I have a supercomputer expert friend who laments that there are no new supercomputers since Cray is dead. All the new "supercomputers" are just big clusters. The Cray T3E (my favorite supercomputer) runs a single operating system which controls all the nodes.

    A cluster runs a separate operating system on each node. This generally (again, this is hearsay) makes it much harder to maintain a cluster than a supercomputer (meaning one with one operating system). We purchased a small 8 node IBM SP2 computer six months ago, and still haven't figured out how to make it act like a single computer. :( Oh well.

  • ok, so a cluster has a separate copy of the os for each node, whereas the conventional supercomputer has a single os controlling all its nodes. that being the case, is it possible to take a supercomputer (perhaps the above mentioned t3e) and run it as a cluster, with a separate instance of the os for each node? i'm guessing that you wouldn't really want to do this, but is it possible? an anti-beowulf setup, if you will.
  • Actually, quite a few cluster environments run SMP for one very good reason: communication overhead. Communication is much faster in a shared memory environment.

    NERSC [nersc.gov], for example, has recently purchased an IBM SP system which has two processors (or was it 4?) per node, with plans to upgrade to 16 processors per node.

    The problem with SMP and clusters is that the message passing software has to be smart enough to take advantage of the shared memory situation, and this can also complicate things when you try to optimize your code.
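
    A toy sketch of that idea, assuming a hypothetical rank-to-node map and invented function names (this just illustrates the shared-memory shortcut, not any real MPI/PVM implementation):

      # Toy message-passing layer that notices when the destination rank lives on
      # the same SMP node and copies through memory instead of hitting the network.
      RANK_HOST = {0: "node0", 1: "node0", 2: "node1"}   # hypothetical rank -> node map

      def send(src, dst, payload, local_queues, network_send):
          if RANK_HOST[src] == RANK_HOST[dst]:
              local_queues.setdefault(dst, []).append(payload)   # shared-memory path
          else:
              network_send(dst, payload)                         # slower network path

      queues = {}
      send(0, 1, "halo data", queues, lambda d, p: None)   # same node: stays in memory
      send(0, 2, "halo data", queues, lambda d, p: None)   # different node: hits the wire
      print(queues)                                        # {1: ['halo data']}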

  • by Suydam ( 881 )
    This has to be good news. Linux is attacking the desktop (lowest end) AND the supercomputer (highest end) at the same time.

    How many Windows NT machines rank in the top 53 of the world's fastest machines?

  • Every time I see a photo of one of these clusters (assuming the Sandia photo is of the cluster they are using), it seems everyone has opted for full-sized boxes. It would seem they could cut down on rack space by 50% or more by going with a slim chassis. Go by any co-location and you can tell the newbies from the vets by who maximizes their shelf space.
  • we've [jhuapl.edu] got an 8 node, 1 master test cluster that is in a 43" [ i think ] high x 19" rack. compared to an irix challenge [ think fridge ] or any of the other racks we have, our beowulf is just a tiny little beast.
  • re: Big-Ass Clusters

    Fermilab has plans to build a 2000-node cluster in the near future but is putting off purchasing all the nodes until the last second to maximize their value.

    re: Rackmounts

    They're more expensive, and typically the machine rooms at large Beowulf installations have enough space for whatever they choose to use. It's not like Los Alamos has to pay for space at the colo when they add a new pile of Alphas.

  • Well, doesn't that all depend on exactly _who_ you get to do the benchmarking?
  • In a talk with someone from VA Linux, they said that they *possibly* have a client who is looking at setting up a 2600 node cluster..

    Umm.. really fast Quake .. umm... :)
  • IIRC, they're uniprocessor nodes
    Christopher A. Bohn
  • Sometimes it's less expensive to buy nodes pre-assembled. Sometimes it's necessary because of the particular fund the money's coming from (a system I built was in this boat -- about a fourth of our money came from the "desktop computer" fund). Sometimes it's a question of effort -- they'd rather not spend time physically assembling the nodes.
    Christopher A. Bohn

  • how much can this type of computer scale up?

    At least 2000, it seems (if somebody is trying to do it, then it must theoretically scale to that extent), but is there a theoretical limit or something like that???

    And are these computers mono-processor or SMP?
    If Linux were to get big SMP enhancements for 4+ CPUs, would it be worth creating a cluster of SMP boxes, given the current price difference between SMP and non-SMP boxes? (I suppose if you do a 2000 node SMP cluster then you must get a special price.)
  • I thought that benchmarks were worthless tests that proved nothing. I believe the words were that "bench-marketing," as Penguin's Mark Willey derisively referred to it, is fundamentally flawed: benchmarks don't reflect real-world situations, and are too easy to manipulate.


    By that logic this cluster is no faster than my 386 with 2mb of RAM.

  • Nonononono. Short words confuse managers. Do a Dilbert.

    So it wouldn't be "distributed.net", but "joint research into highly parallelised, highly distributed encryption validation", and "SETI@Home" is actually "joint research into vastly parallelised radio interferometry, using test data from Arecibo".

  • Because space is cheap in the USA. If this cluster were in Japan you can guarantee that space would be maximized. I work for a global company and in the far east especially, we will pay a premium for smaller machines because space is such an issue.

    Once vendors start selling 1U machines with one disk, one processor, and one slot, this kind of thing will be more accessible to those outside the USA. I know VA has some 2 or 3U machines, that's the right direction to be going.

    -Rich
  • There's a place on the webpage to get a user account if you have a suitable project to run. Hmmm...wonder if they'd consider distributed.net or SETI@home suitable...
  • Excuse my ignorance, but is there a conceivable limit to the number of nodes you can have on a cluster?

    Just how big of a room does it take to house 2600 nodes?

  • Actually, it depends on your requirements. For a lot of nontrivial applications (matrix inversion and 3D Fourier transforms, for example) the entire dataset needs to be transmitted at each iteration.

    I know that for what I do (pseudopotential plane wave calculations) one 100bT switch would be too slow to connect an 8 node cluster. As in, you'd be better off with all the memory in one computer, and forget parallelization.

    Also, I think that typically you don't want more than two hops between nodes. Of course, it all depends what you're doing. If you're doing monte carlo stuff, you could probably get by with 9600 baud modems if it were cheaper than ethernet.
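
    To put rough numbers on that, here's a back-of-the-envelope sketch; the dataset size and bandwidth figures are assumptions for illustration, not measurements from any particular cluster:

      dataset_mb = 512.0            # hypothetical per-iteration working set, in MB
      fast_ethernet_mb_s = 12.5     # 100 Mbit/s Fast Ethernet is ~12.5 MB/s peak
      local_memory_mb_s = 500.0     # rough guess at late-90s memory bandwidth

      t_network = dataset_mb / fast_ethernet_mb_s   # seconds to ship it between nodes
      t_memory = dataset_mb / local_memory_mb_s     # seconds to stream it from local RAM

      print("network: %.1f s, memory: %.1f s, ratio: %.0fx"
            % (t_network, t_memory, t_network / t_memory))
      # If the whole dataset moves every iteration, the ~40x slower interconnect
      # dominates, which is why one big shared-memory box can beat a small 100bT
      # cluster on tightly-coupled codes.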

  • The first Cray 2 cost about $30 million. Running a well-optimized code, it churned about 1 gigaflop. Supercomputers seem to hang at about the same price ($20-30M) but increase performance an order of magnitude or more per generation. This would put us in the range of teraflop machines now - building toward the petaflop for the same $30M. A single 450 MHz Pentium II can do about 70-100 MFLOPS depending on the exact operation, so ten to fifteen could do the same gigaflop as the Cray 2 (assuming embarrassingly parallel code). These computers are easily available for $1K in quantity. 150 should yield 10 GFLOPS at under $200K including network, and 1500 should yield 100 GFLOPS at under $2M. I think this should still look 10X more cost effective than the massively parallel and vector/MPP machines. If you used dual processor systems, the cost/performance would be even better.
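
    Redoing that arithmetic in a few lines; the per-node MFLOPS figure and the $1K node price are the poster's estimates, and the 30% overhead factor is an added assumption, not a benchmark:

      node_mflops = 85.0   # midpoint of the 70-100 MFLOPS estimate for a 450 MHz PII
      node_cost = 1000.0   # dollars per node, in quantity
      overhead = 1.3       # assumed fudge factor for network gear, racks, etc.

      def cluster_for(target_gflops):
          nodes = int(round(target_gflops * 1000.0 / node_mflops))
          return nodes, nodes * node_cost * overhead

      for gflops in (1, 10, 100):
          nodes, cost = cluster_for(gflops)
          print("%3d GFLOPS -> ~%4d nodes, ~$%.0fK" % (gflops, nodes, cost / 1000.0))
      # About 12 nodes for the Cray 2's gigaflop, ~120 for 10 GFLOPS, ~1200 for
      # 100 GFLOPS -- the same ballpark as the 150 / 1500 node figures above
      # (assuming embarrassingly parallel code, as noted).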
  • There are some cases where benchmarks are more valid than others.

    The Linpack benchmark is one that's been around a long time, and is pretty much an agreed-upon benchmark in the computing industry.

    In addition, the numbers are submitted by the USERS of the machines, not the vendors, and this makes a LOT of difference in the trustworthiness of the benchmark.
  • my understanding is that the big sun servers (E10000, etc) do essentially that. i think there was talk about it with the articles on the ebay fiasco. you might want to look back at some of that. i think i read that unisys has a system like that, too, but i don't recall off the top of my head. i'm sure there are probably a lot of instances of this sort of system.
  • There will be 1U chassis for a variation of the Compaq DS10 computer. Pricing hasn't been determined, but it is a huge bang for buck. The main drawback of such size is lack of expandability - only 1GB RAM, one drive and one open PCI slot. The time is this summer, I think.
  • Hey, Samsung makes Alphas in S. Korea, AMD will make K7's in Dresden, Germany, Fujitsu makes Sparcs in Japan, Intel has loads of plants in places like Israel, Malaysia, Singapore, etc.

    I wonder if this will make weenies go for more treaties. Ugh.
  • Telenet Systems is currently offering a 1U system starting at $1000US

    http://www.tesys.com/servers/rackmount.shtml

    neal
  • Actually, the Linpack benchmark pretty much sucks. If you read the webpages of the Top 500 people, they mention that they only use it because it's the only benchmark that will run on such a variety of hardware and software.
  • Well, if you're running ethernet, the more nodes you have, the more collisions you have. Eventually the network slows to a crawl. If you're using token ring...well nobody does anymore =)
  • At Usenix we were showing a 1U EV6 based system
    (based on the DS10, I believe).

    40 of these will fit in one rack.

    Is that Compaq(t) enough for you? (sorry for the
    very poor pun).
    - Jim Gettys
  • Y'know...I was thinking...

    A brilliant epitaph for a Slashdotter would be:



    HERE LIES JOE BLOW


    LAST COMMENT!

    --
    Get your fresh, hot kernels right here [kernel.org]!

  • Would seem they could cut down on rack space by 50% or more by going with a slim chassis

    Personally I hate most 'slim' style cases for a few reasons:

    • There isn't enough space to work in them, so SIMM/DIMM slots are often obscured by power supplies, etc. I want something that is easy to work in when I need to get things back up and running quickly.
    • Most of them use riser cards for the slots, which are a pain in the butt.
    • Most of them use non-standard motherboard layouts, and part of the idea of building clusters is to have cheap commodity priced, easily upgradeable hardware. Some of them even require goofy special floppy drives (due to custom faceplate bezels).
    • Slim cases often include wimpy power supplies which may not be as durable and reliable as higher output supplies used in larger cases.

    Personally, I like the 'mid tower' type of case. In today's world with huge capacity hard drives, the big full tower cases are often overkill, but the mini-towers are too cramped to work in.

  • by Anonymous Coward
    Personally, I think that the biggest problem people are going to face when creating super huge clusters (200+ machines) is not one of floor space or heat dissipation. The problem is going to be with networking them. Sure, you can go to ATM, or gigabit ethernet, or ... but it is absolutely critical that data gets to the next machine as fast as possible, and that the packet doesn't get lost somewhere along the way. And the whole reason (ok... one of the main reasons, anyway) for doing clustered supercomputers is that it is cheaper. When you start rack mounting them, putting gigabit ethernet in them, and so on, you are really starting to jack up the price of each node in your cluster.

    As far as the question about how large you can go with them, if you use an int to determine which machine you are addressing, that puts a theoretical limit of more than 60,000 nodes.
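
    For what it's worth, the "more than 60,000" figure matches a 16-bit node ID; a full 32-bit int gives a far larger ceiling (illustration only):

      for bits in (16, 32):
          print("%2d-bit node IDs -> %d addressable nodes" % (bits, 2 ** bits))
      # 16-bit -> 65536 (the "more than 60,000" above); 32-bit -> 4294967296.
      # In practice the interconnect, not the ID width, is the real limit.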
  • How large can a cluster be?

    Short answer: it depends.

    Long answer: it depends on the applications and the usage patterns.

    (I'm assuming we're talking about practical limits here, not theoretical ones -- the theoretical limit is probably the address space of a cluster's message-passing interface (i.e. 4 billion nodes).)

    Some applications -- the so-called "embarrassingly parallel" ones -- will scale with nearly no deviation from linear to any number of nodes, because they are loosely-coupled problems. (Which means the result of one part of the parallel computation does not depend on a result from some other parallel computation. The Mandelbrot set is a good example of this; see the sketch after this comment.)

    In general, the more tightly-coupled the problem is, the harder it is to scale, as the amount of data that has to be exchanged pushes the limit of the interconnects. A 32-node cluster constructed on a hub will be faster for loosely-coupled programs than a 24-node cluster on a switch, which could beat the 32-node cluster on a tightly-coupled problem because of communications overhead in the 32-node cluster.

    Usage patterns also determine the maximum useful size. If you're at a large lab like Sandia, you can reasonably expect a large number of jobs to be running concurrently, which essentially parallelizes the cluster -- running 6 tightly-coupled programs, each on their own hypercube interconnect, will complete faster than running the six in series, each with the whole cluster.

    -_Quinn
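
    A minimal sketch of an embarrassingly parallel job, as mentioned above: each row of a Mandelbrot image is computed independently, so workers never exchange data and adding processors scales almost linearly. (Illustration only; a real cluster would distribute rows with MPI/PVM rather than a single-machine process pool.)

      from multiprocessing import Pool

      WIDTH, HEIGHT, MAX_ITER = 200, 100, 100

      def mandelbrot_row(y):
          # Compute one row of iteration counts; needs no data from other rows.
          row = []
          for x in range(WIDTH):
              c = complex(-2.0 + 3.0 * x / WIDTH, -1.0 + 2.0 * y / HEIGHT)
              z, i = 0j, 0
              while abs(z) <= 2.0 and i < MAX_ITER:
                  z = z * z + c
                  i += 1
              row.append(i)
          return row

      if __name__ == "__main__":
          with Pool() as pool:                  # one worker per local CPU
              image = pool.map(mandelbrot_row, range(HEIGHT))
          print("computed %d rows x %d columns" % (len(image), len(image[0])))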
  • Well, for our little "proof of concept" cluster (as if it needs to be proved?) we just went with the "shelves in a rack" concept. All components are mounted on shelves using standoffs.

    When we build our approximately 100 node cluster (hmm, where can I get the money for the other 500 nodes????) we're considering using rack-mount boxes, just because they would be easier to handle and simpler to install.

    I'd tell you how many nodes we can stuff into a standard size 19" rack, but we're still building it!

  • This sounds really promising, but how does the price/performance of a Linux cluster compare to other "real" supercomputers? For example, what is the price/performance of a VERY high end SGI, and what would it take (price wise) for a Linux cluster to match that. I've heard that Linux clusters cost considerably less, but I've never seen any hard statistics.
  • Easy. much as a GB is a gigabyte (1,073,741,824 or 1,000,000,000 bytes depending on who you ask), a GFLOP is a gigaFLOP, where a FLOP is a FLoating point OPeration (or so I remember).
  • In cluster environments it's better to avoid SMP machines.

    The reason is simple: access to resources is better with mono-processor systems, since there is no competition among the processors in the same machine.

  • Forget about slim chassis; how about no chassis? Take a look at Beowulf on StrongARM boards for $2000 [dnaco.net]. These folks are looking at building 6 StrongARM processors with RAM and the necessary "glue" onto a single PCI card. Since easily obtainable PCs have 3 PCI slots in them, you should be able to set up an 18 node beowulf cluster inside one box (the PC itself acts as the controller). Can you usefully cluster a bunch of these (a cluster of clusters)? I don't know, but it's interesting to think about.

    Doug Loss

  • Ethernet collisions are not a problem. No one uses hubs for Beowulf interconnects. Switched ethernet allows the network performance to scale with the number of nodes. Unfortunately, that is probably the real limiting factor... designing a large enough network. A typical two-tier approach would use 36- or 48-port Fast Ethernet switches with gigabit uplink ports connected to gigabit switches. I think you could assemble several hundred to a thousand systems this way. This is the approach LANL used for Avalon. Also, some of the big iron Cisco backplanes can take 24-port ethernet cards. I think some of them can support several hundred switched ports in a single tier.

    Unfortunately, I don't know of an ethernet solution that could scale beyond several hundred systems and still provide uniform bandwidth and latency among all of the nodes. Beyond that point, you will need to come up with a scheme to partition the network, or just accept some performance penalty for crossing subdivision boundaries of the cluster.
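
    Rough port-count arithmetic for that two-tier layout; the switch sizes and uplink counts are assumptions for illustration:

      fe_ports_per_edge = 48   # Fast Ethernet ports per edge switch
      uplinks_per_edge = 2     # gigabit uplinks from each edge switch to the core
      gig_ports_core = 24      # ports on the core gigabit switch

      edge_switches = gig_ports_core // uplinks_per_edge
      nodes = edge_switches * fe_ports_per_edge
      print("%d edge switches x %d nodes each = %d nodes"
            % (edge_switches, fe_ports_per_edge, nodes))
      # 12 x 48 = 576 nodes off a single 24-port gigabit core, which is consistent
      # with "several hundred to a thousand" before the network has to be partitioned.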
  • And they are not adequate to measure performance.
  • kewl, but no FPU.
  • where did you get 60k nodes?

    if you use ip on a closed network, surely you get the basic math of 254*254*254*254 give or take a few addr's ?? [ that's approx 4,162,314,256 nodes; quick check at the end of this post ].

    am I wrong?

    I run a 25 node mpi/pvm cluster of 486-66 DX2's over a NuSwitch fast switching ethernet hub, at just 10mb/s full duplex, and I have a load of fun with pvm-povray rendering my nut off!

    I'm now working on a multi node crawler to feed my search engine ( www.websearch.com.au ).

    ++dez;
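
    A quick check of the arithmetic above (illustration only):

      usable_per_octet = 254                               # the per-octet figure used above
      print("254^4            =", usable_per_octet ** 4)   # 4,162,314,256
      print("full 32-bit IPv4 =", 2 ** 32)                 # 4,294,967,296
      print("16-bit node ID   =", 2 ** 16)                 # the ~60,000 figure earlier
      # So with IP addressing the ceiling is in the billions of nodes; the ~60k
      # number only applies if node IDs are packed into a 16-bit integer.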

  • Actually, GFLOPS are a better measure of performance than any other that I am aware of, simply because floating point operations are what we deal with in the real world. So it doesn't matter how many millions of instructions per second a processor can execute: if it can't handle floating point operations quickly, it's no good for real world applications.
  • And generally, it's GFLOPS, for Giga FLoating point OPerations per Second.

  • Too bad. This would have been cool if it were possible to combine the performance of clusters with the performance of SMP. But it's pretty rare to get the best of both worlds ;)

    thanks
  • And also, I know the State Department keeps the Cray 2's under control (i.e., you need some clearance to get one), so would Pakistan or Iraq just opt for 1000 Linux machines ("the mother of all beowulfs"), probably coming in around a cool million?
    Would such a cluster be close to such supercomputers? Would the State Department start stepping in?
    Food for thought.
  • Maybe we've worked with different style cases, but the ones I've used don't have power supply issues.

    As for floppy drives - simple - don't put one in the box. For bulk serving you don't need it. Nor a CD ROM drive. If the box is at a colocation, you're going to get to it via ssh, not standing in front of it.
  • How can there really be a low end and a high end? Wouldn't that vary by what the machine was designed to do? ASCI Red is #1; but can it run DOOM? My PC can run DOOM, but can it outperform ASCI Red? Embedded machines are only meant to deal with the device that they're placed in; there's not really any point in comparing them to supercomputers...
