Hardware

Ask Slashdot: Building a Cheap Computing Cluster?

New submitter jackdotwa writes "Machines in our computer lab are periodically retired, and we have decided to recycle them and put them to work on combinatorial problems. I've spent some time trawling the web (this Beowulf cluster link proved very instructive) but have a few reservations regarding the basic design and air-flow. Our goal is to do this cheaply but also to do it in a space-conserving fashion. We have 14 E8000 Core2 Duo machines that we wish to remove from their cases and place side-by-side, along with their power supply units, on rackmount trays within a 42U (19", 1000mm deep) cabinet." Read on for more details on the project, including some helpful pictures and specific questions.
jackdotwa continues: "Removing them means we can fit two machines into 4U (as opposed to 5U). The cabinet has extractor fans at the top, and the PSUs and motherboard fans (which pull air off the CPU and expel it laterally; see images) face in the same direction. Would it be best to orient the shelves (and thus the fans) in the same direction throughout the cabinet, or to alternate the fan orientations on a shelf-by-shelf basis? Would there be electrical interference with the motherboards and CPUs exposed in this manner? We have a 2 ton (24000 BTU) air-conditioner which will be able to maintain a cool room temperature (the lab is quite small), judging by the guide in the first link. However, I've been asked to place UPSs in the bottom of the cabinet (they will likely be non-rackmount UPSs as they are considerably cheaper). Would this be, in anyone's experience, a realistic request (I'm concerned about the additional heating in the cabinet itself)? The nodes in the cabinet will be diskless and connected via a rack-mountable gigabit Ethernet switch to a master server. We are looking to purchase rack-mountable power distribution units to clean up the wiring a little. If anyone has any experience in this regard, suggestions would be most appreciated."
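
As a rough sanity check on the cooling and UPS questions above, here is a back-of-envelope sketch in Python; the per-node wattage, switch draw, and UPS loss figures are assumptions for illustration rather than measurements of this hardware.

    # Back-of-envelope heat-load check for the cabinet (all figures are assumptions).
    WATTS_PER_NODE = 150        # assumed draw for one caseless board + PSU under load
    NODES = 14
    SWITCH_AND_MISC_W = 50      # assumed share for the gigabit switch, etc.
    UPS_LOSS_FRACTION = 0.08    # assumed ~8% UPS conversion loss while on mains

    it_load_w = WATTS_PER_NODE * NODES + SWITCH_AND_MISC_W
    ups_heat_w = it_load_w * UPS_LOSS_FRACTION
    total_w = it_load_w + ups_heat_w
    btu_per_hour = total_w * 3.412              # 1 W = 3.412 BTU/h

    print(f"IT load:  {it_load_w:.0f} W")
    print(f"UPS heat: {ups_heat_w:.0f} W (standby losses only)")
    print(f"Total:    {total_w:.0f} W ~ {btu_per_hour:.0f} BTU/h")
    print(f"Headroom vs the 24000 BTU/h unit: {24000 - btu_per_hour:.0f} BTU/h")

On these numbers the cabinet itself sits well within the air-conditioner's capacity; the occupants, the remaining lab machines, and any solar load still have to come out of the leftover headroom.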
  • by MetricT ( 128876 ) on Tuesday March 12, 2013 @02:09PM (#43151129)

    I've been working in academic HPC for over a decade. Unless you are building a simple 2-3 node cluster to learn how a cluster works (scheduler, resource broker and such things), it's not worth your time. What you save in hardware, you'll lose in lost time, electricity, cooling, etc.

    If you're interested in actual research, take one computer, install an AMD 7950 for $300, and you will almost certainly blow the doors off a cluster cobbled from old Core 2 Duos, and you'll save more than $300 in electricity.

  • Re:don't rule out (Score:5, Interesting)

    by eyegor ( 148503 ) on Tuesday March 12, 2013 @02:18PM (#43151227)

    Totally agree. We had a bunch of dual dual-core server blades that were freed up, and after looking at the power requirements per core for the old systems, we decided it would be cheaper in the long run to retire the old servers and buy a smaller number of higher-density servers.

    The old blades drew 80 watts/core (320 watts), while the new ones, with dual sixteen-core Opterons, drew 10 watts/core for the same overall power. That's a no-brainer when you consider that these systems run 24/7 with all CPUs pegged. More cores in production means your jobs finish faster, you can support more users and more concurrent jobs, and you use much less power in the long run.
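
    A quick sketch of the arithmetic behind that trade-off; the electricity price and the 24/7 duty cycle are assumptions for illustration.

        # Power-per-core comparison (price and uptime are assumptions for illustration).
        OLD_W_PER_CORE, OLD_CORES = 80, 4       # dual dual-core blade: 320 W
        NEW_W_PER_CORE, NEW_CORES = 10, 32      # dual sixteen-core Opteron: 320 W
        PRICE_PER_KWH = 0.12                    # assumed electricity price, USD
        HOURS_PER_YEAR = 24 * 365

        def annual_cost(watts):
            return watts / 1000 * HOURS_PER_YEAR * PRICE_PER_KWH

        old_w = OLD_W_PER_CORE * OLD_CORES
        new_w = NEW_W_PER_CORE * NEW_CORES
        print(f"Old blade: {old_w} W / {OLD_CORES} cores -> ${annual_cost(old_w):.0f}/yr")
        print(f"New node:  {new_w} W / {NEW_CORES} cores -> ${annual_cost(new_w):.0f}/yr")
        print(f"Same power budget, {NEW_CORES // OLD_CORES}x the cores per box")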

  • Re:don't rule out (Score:5, Interesting)

    by ILongForDarkness ( 1134931 ) on Tuesday March 12, 2013 @02:29PM (#43151373)

    Great point. Back in the day I worked on an SGI Origin mini/supercomputer (not sure if it qualifies as a supercomputer, but a 32-way symmetric multiprocessor is still kind of impressive nowadays; even a 16-way Opteron isn't truly symmetric, I don't think). Anyway, at the time (~2000) there were much faster cores out there. Sure, we could use the machine for free for serial workloads (yeah, that's a waste), but jobs took 3-4x as long as they would on a modern core. You ended up having to ssh in and start new jobs in the middle of the night so you didn't waste an evening of runs, versus getting 2-3 done during the day and firing off a fourth before bed. Add to that that the IT guys had to keep a relatively obscure system around and provide space and cooling for the monster, and they would have been better off just buying us ten dual-socket workstations (~1 GHz at the time, I guess).

  • Re:Imagine (Score:5, Interesting)

    by Ogi_UnixNut ( 916982 ) on Tuesday March 12, 2013 @02:31PM (#43151407) Homepage

    Yeah, except back in the 2000s people would have thought it was a cool idea, and there would have been at least four other people who had recently done it and could give tips.

    Now it's just people saying "Meh, throw it away and buy newer, more powerful boxes." True, and the rational choice, but still rather bland...

    I remember when nerds here were willing to do all kinds of crazy things, even if they were not a good long term solution. Maybe we all just grew old and crotchety or something :P

    (Spoken as someone who had a lot of fun building an openmosix cluster from old AMD 1.2GHz machines my uni threw out.)

  • by serviscope_minor ( 664417 ) on Tuesday March 12, 2013 @03:38PM (#43151991) Journal

    I've been working in academic HPC for over a decade. Unless you are building a simple 2-3 node cluster to learn how a cluster works (scheduler, resource broker and such things), it's not worth your time. What you save in hardware, you'll lose in lost time, electricity, cooling, etc.

    I strongly disagree. I actually had a very similar Ask Slashdot a while back.

    The cluster got built, and has been running happily since.

    If you're interested in actual research, take one computer, install an AMD 7950 for $300, and you will almost certainly blow the doors off a cluster cobbled from old Core 2 Duos, and you'll save more than $300 in electricity.

    Oh yuck!

    But what you save in electricity, you'll lose in postdoc/developer time.

    Sometimes you need results. Developing for a GPU is slow and difficult compared to (e.g.) writing prototype code in MATLAB/Octave. You save heaps of vastly more expensive person-hours and development time by being able to run those codes on a cluster. And not every task out there is even easy to put on a GPU.
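
    As a minimal sketch of what "run the prototype code on the cluster" can look like, assuming passwordless SSH, a shared home directory, Octave installed on the nodes, and hypothetical hostnames and a hypothetical my_prototype script:

        # Farm independent Octave runs out to the nodes over SSH (a sketch under the
        # assumptions above, not anyone's actual setup).
        import queue
        import subprocess
        import threading

        NODES = [f"node{i:02d}" for i in range(1, 15)]   # hypothetical hostnames
        work = queue.Queue()
        for p in range(100):                             # independent parameter sweep
            work.put(p)

        def worker(node):
            # Each node pulls the next parameter off the shared queue until it is empty.
            while True:
                try:
                    p = work.get_nowait()
                except queue.Empty:
                    return
                cmd = ["ssh", node, "octave", "--eval", f"my_prototype({p})"]
                rc = subprocess.run(cmd).returncode
                print(f"{node}: param {p} {'ok' if rc == 0 else 'FAILED'}")

        threads = [threading.Thread(target=worker, args=(n,)) for n in NODES]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

    A real setup would more likely hand this kind of sweep to a scheduler (Torque/PBS, Slurm, or similar), but the point stands: serial prototype code needs no porting at all to benefit from extra nodes.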

  • by Anonymous Coward on Tuesday March 12, 2013 @04:17PM (#43152379)

    We have a cluster at my lab that's pretty similar to what the submitter describes. Over the years, we've upgraded it (by replacing old scavenged hardware with slightly less old scavenged hardware) and it is now a very useful, reasonably reliable, but rather power-hungry tool.

    Thoughts:

    - 1GbE is just fine for our kind of inherently parallel problems (Monte Carlo simulations of radiation interactions). It will NOT cut it for things like CFD that require fast node-to-node communication. (A rough back-of-envelope on this follows at the end of this comment.)

    - We are running a Windows environment, using Altair PBS to distribute jobs. If you have Unix/Linux skills, use that instead. (In our case, training grad students on a new OS would just be an unnecessary hurdle, so we stick with what they already know.)

    - Think through the airflow. Really. For a while, ours was in a hot room with only an exhaust fan. We added a portable chiller to stop things from crashing due to overheating; a summer student had to empty its drip bucket twice a day. Moving it to a properly ventilated rack with plenty of power circuits made a HUGE improvement in reliability.

    - If you pay for the electricity yourself, just pony up the cash for modern hardware; it'll pay for itself in power savings. If power doesn't show up on your own department's budget (but capital expenses do), then by all means keep the old stuff running. We've taken both approaches, and while we love our Opteron 6xxx (24 cores in a single box!) we're not about to throw out the old PowerEdges, or turn down less-old ones that show up on our doorstep.

    - You can't use GPUs for everything. We'd love to, but a lot of our most critical code has only been validated on CPUs and is proving very difficult to port to GPU architectures.

    (Posting AC because I'm here so rarely that I've never bothered to register.)
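
    A rough back-of-envelope behind the 1GbE point above; the result sizes, job times, and halo-exchange figures are invented for illustration.

        # Why 1 GbE is plenty for embarrassingly parallel Monte Carlo work but not
        # for tightly coupled codes (all figures are illustrative assumptions).
        GBE_BYTES_PER_S = 125e6          # ~1 Gbit/s in bytes/s

        # Monte Carlo: each job runs independently and ships back a small result.
        mc_result_bytes = 10e6           # assumed 10 MB of tallies per job
        mc_job_seconds = 3600            # assumed one hour per job
        mc_bw = mc_result_bytes / mc_job_seconds
        print(f"Monte Carlo: ~{mc_bw / 1e3:.1f} kB/s per node "
              f"({100 * mc_bw / GBE_BYTES_PER_S:.3f}% of the link)")

        # CFD-style halo exchange with neighbours on every time step.
        halo_bytes = 2e6                 # assumed 2 MB exchanged per step
        steps_per_second = 50            # assumed
        cfd_bw = halo_bytes * steps_per_second
        print(f"CFD-style:   ~{cfd_bw / 1e6:.0f} MB/s per node "
              f"({100 * cfd_bw / GBE_BYTES_PER_S:.0f}% of the link), plus latency on every step")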

  • Actual experience (Score:3, Interesting)

    by Anonymous Coward on Tuesday March 12, 2013 @08:58PM (#43155063)

    I've done this, starting with a couple of racks full of PS/2 55sx machines in the late '90s and continuing through various iterations, some with and some without budgets. I currently run an 8-member heterogeneous cluster at home (plus a file server, an atomic clock, and a few other things), in the only closet in the house that has its own AC unit. It's possible I know something about what you're doing.

    Some of what I'll mention may involve more (wood) shop or electrical engineering than you want to undertake.

    My read of your text is that there is a computer lab, occupied by people, that will also contain this rack of dismounted Optiplex boards and P/Ss. The lab has an A/C unit that you believe can dissipate the heat generated by the new lab computers, the occupants, these old machines in the rack, and the UPSs. I'll take your word for it, but be sure to include all the sources of heat in your calculation, including solar thermal loading if, like me, you live in "the hot part of the country".

    Unfortunately, this setup eliminates the cheapest/easiest way of moving heat away from your boards -- 20" box fans (e.g. http://www.walmart.com/ip/Galaxy-20-Box-Fan-B20100/19861411 ) mounted to an assembly of four "inward pointing" boards, which can move considerably more air than 80 mm case fans, especially for the amount of noise they make.

    One of the smartest thermal solutions I've ever seen tilted the boards so that the "upward slope" ran along the airflow direction: the little bit of thermal buoyancy helped air arriving at the underside of components flow uphill and out with the rest of the heated air. That is, it avoided a common problem of unmodeled airflow systems, namely horizontal surfaces that trap heated air and let it just get hotter and hotter.
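
    A rough way to size that airflow, using the common rule of thumb CFM ~ 3.16 * watts / delta-T(F); the per-shelf wattage and allowable temperature rise below are assumptions.

        # Airflow needed for a given heat load and allowable temperature rise
        # (sea-level air; CFM ~ 3.16 * watts / delta_T_F is a standard rule of thumb).
        def cfm_needed(watts, delta_t_f):
            return 3.16 * watts / delta_t_f

        shelf_watts = 300      # assumed: two caseless boards plus their PSUs per shelf
        delta_t_f = 20         # assumed allowable rise across the shelf, in deg F

        print(f"Per shelf:     {cfm_needed(shelf_watts, delta_t_f):.0f} CFM")
        print(f"Whole cabinet: {cfm_needed(shelf_watts * 7, delta_t_f):.0f} CFM")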

    Nevertheless, the best idea is to move the air from "this side" to "that side" on every shelf. Don't alternate directions on successive shelves. If you're actually worried about EMI, then you must have an open-sided rack (or you shouldn't be worried). One option is to put metal walls around it, which will also control your airflow. Another option, costing about $10, is to make your own Faraday-cage panels however you see fit. (I've done chicken wire, and I've done cardboard/Al-foil/cardboard sandwiches. Both worked.)

    You should probably consider dual-mounting boards to the upper *and* lower sides of your shelves. Another layout I've been very happy with is vertical board mounts (like blades) with a column of P/Ss on the left or right.

    A *really* good idea for power distribution is to throw out the multiple discrete P/Ss and replace them with a DC distribution system; there's very little reason to have all those switching power supplies running to provide the same voltages over 6 feet. The UPSs are the heaviest thing in your setup, so putting them at the bottom of the rack is probably a good idea. They generate some heat on standby (not much) and a lot more when running. Of course, when they're running, the air conditioning is (worst case) also off, and at least one machine should have gotten the "out of power" message and be arranging for all the machines to "shutdown -h now".
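
    A minimal sketch of that "out of power" step, assuming the UPSs are monitored with NUT (upsc) on the master and that the UPS name and node hostnames are hypothetical. NUT's upsmon can do all of this natively; the loop just makes the logic explicit.

        # Poll the UPS via NUT's upsc and halt the diskless nodes when it reports
        # "on battery" and "low battery". Assumes NUT on the master and passwordless
        # root SSH to the nodes; UPS name and hostnames are hypothetical.
        import subprocess
        import time

        UPS = "rackups@localhost"                       # hypothetical NUT UPS name
        NODES = [f"node{i:02d}" for i in range(1, 15)]  # hypothetical hostnames

        def ups_status():
            out = subprocess.run(["upsc", UPS, "ups.status"],
                                 capture_output=True, text=True)
            return out.stdout.strip()                   # e.g. "OL", "OB", "OB LB"

        while True:
            status = ups_status()
            if "OB" in status and "LB" in status:       # on battery and low battery
                for node in NODES:
                    subprocess.run(["ssh", f"root@{node}", "shutdown", "-h", "now"])
                subprocess.run(["shutdown", "-h", "now"])   # master goes down last
                break
            time.sleep(30)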

    You only plan on having two cables per machine (since your setup seems KVM-less and headless), so wire organization may not be that important. (Yes, there are wiring nazis. I'm not one.) Pick Ethernet cables that are the right length (or get a crimper, a spool, and a bag of plugs and make them to the exact length). Two-sided Velcro strips will probably give you everything you need to make retaining bands on the left and right columns of the rack. Label both ends of all cables. Really. Not kidding. While you're at it, label the front and back of every motherboard with its MAC(s) and whatever identifiers you're using for machines.
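
    For the MAC labelling, a small sketch that turns the master's DHCP lease file into label text; the dnsmasq lease path and field order are common defaults, assumed rather than known for this setup.

        # Print label text (hostname, MAC, IP) from the master's dnsmasq lease file.
        # Path and field order are the usual defaults; adjust for your own setup.
        LEASES = "/var/lib/misc/dnsmasq.leases"

        with open(LEASES) as f:
            for line in f:
                fields = line.split()
                if len(fields) >= 4:
                    _expiry, mac, ip, hostname = fields[:4]
                    print(f"{hostname:>10}  {mac}  {ip}")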
