Where to Spend $1M on a Cluster?
Natchswing asks: "My university has been given a $757,825 NSF grant to build, 'A 256 node (128 pair) Beowulf parallel computing cluster ... to improve the realism of gravity-wave modeling by permitting treatment of the three dimensional problem and multiple wave interactions.' They want to pay a company to just show up and drop off a functional cluster rather than build it themselves. Since word has leaked out regarding the purchase intent, every computer manufacturer under the sun (including Apollo himself) has called up trying to sell their cluster. Since I'm no cluster expert, I'm writing Slashdot. If you had $0.7 mil to buy a pre-built cluster who would you go with and why?"
Competitive Bidding (Score:5, Interesting)
This isn't rocket science.
Re:Competitive Bidding (Score:3, Insightful)
Re:Competitive Bidding (Score:3, Insightful)
Yeah, so if you know how to write a contract, the lowest bidder is always the best choice. Think of terms like this: contract price is $700,000 if the following conditions are met: (a,b,c) by date x and $600,000 if met by date y. System must be free of manufacturer defects until date z. Manufacturer defects are defined with high specificity here...
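The two-tier pricing terms above can be written down as a small function. This is only a sketch: the dates are hypothetical, and the comment does not say what happens if the conditions are missed entirely, so that case returns None and is left to the contract.

```python
from datetime import date


def contract_price(conditions_met_on, date_x, date_y,
                   early_price=700_000, late_price=600_000):
    """Return the price owed under the two-tier terms sketched above.

    conditions_met_on: the date conditions (a, b, c) were all met, or None.
    Returns None if they were not met by date y.
    """
    if conditions_met_on is None:
        return None
    if conditions_met_on <= date_x:
        return early_price
    if conditions_met_on <= date_y:
        return late_price
    return None


# Hypothetical dates, purely for illustration.
DATE_X = date(2005, 3, 1)
DATE_Y = date(2005, 6, 1)

print(contract_price(date(2005, 2, 15), DATE_X, DATE_Y))
print(contract_price(date(2005, 4, 20), DATE_X, DATE_Y))
```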
But in any case, if you tell the companies what to bid to, they will all bid to that. Then you can pick the company tha
Re:Competitive Bidding (Score:2)
The poster's question implies he doesn't know enough about clusters to write a good RFP.
Re:Competitive Bidding (Score:2)
Re:Competitive Bidding (Score:2)
Here's how it works. You get 5 or 6 technical staff and managers, at least 3 of whom are not involved with the proposals.
Then you Request Proposals via a sealed bid.
You then come up with a scoring worksheet; you weigh cost, implementation track record, hardware or whatever other factors are important to you.
Then each person scores the proposals and you meet to go over them and come up with an overall ranking.
It may seem drawn out, but it's a system that works well AND controls costs.
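The scoring worksheet described above can be sketched in a few lines of Python. The criteria names, weights, vendor names, and scores are all made up for illustration; the comment only says to weigh cost, track record, hardware, and whatever else matters to you.

```python
from statistics import mean

# Hypothetical weights; they should sum to 1.0.
WEIGHTS = {"cost": 0.4, "track_record": 0.3, "hardware": 0.3}


def weighted_score(scores, weights=WEIGHTS):
    """Combine one reviewer's per-criterion scores (0-10) into one number."""
    return sum(weights[name] * scores[name] for name in weights)


def rank_proposals(worksheets):
    """Average each vendor's weighted score across reviewers, best first.

    worksheets maps vendor -> list of per-reviewer score dicts.
    """
    totals = {
        vendor: mean(weighted_score(s) for s in reviews)
        for vendor, reviews in worksheets.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up example: two vendors, two reviewers each.
worksheets = {
    "Vendor A": [
        {"cost": 8, "track_record": 6, "hardware": 7},
        {"cost": 7, "track_record": 6, "hardware": 8},
    ],
    "Vendor B": [
        {"cost": 5, "track_record": 9, "hardware": 9},
        {"cost": 6, "track_record": 9, "hardware": 8},
    ],
}

ranking = rank_proposals(worksheets)
for vendor, score in ranking:
    print(f"{vendor}: {score:.2f}")
```

Fixing the weights before the sealed bids are opened keeps reviewers from tuning the worksheet toward a favorite vendor after the fact.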
Re:Competitive Bidding (Score:2)
Re:Competitive Bidding (Score:2)
it "worked" for the space shuttle :(
Re:Competitive Bidding (Score:1)
Re:Competitive Bidding (Score:1)
The players were the same ones from Gemini and Apollo. They submitted proposals and NASA decided on one, then it went through years of changes and redesign, but in the end almost everyone who had a piece of Apollo and Gemini was in on Shuttle in some form or another.
The failures of Shuttle weren't from the bidding process; they were from engineering tradeoffs.
Actually, it *is* rocket science (Score:2)
But there's more here than figuring out who can plunk down the best system for the specified price. There's the maintenance/support costs. And picking a particular hardware platform kind of defines your choices for software -- so whose compiler do you like best? And any serious school needs to ask: can we maybe do a better job, more cheaply, cobbling together a cluster from cheap (abandoned,
Re:Actually, it *is* rocket science (Score:2)
High-dollar Federal grants generally require that you adhere to some sort of standardized purchasing practice.
Competitive bidding isn't simply "Ok, this guy said he can do it for $50, he wins."
When you issue an RFP for others to come in and do work, you have to weigh various factors in your scoring.
Price is one factor. Experience and hardware features are another. You might assign bonus points to companies that allowed for a few students to take par
Re:Actually, it *is* rocket science (Score:2)
Capitalism (Score:2)
Capitalism would be to invest in the opportunities offering the best return.
Re:Do it with an apple (Score:5, Informative)
They even have a page on clusters [apple.com].
Re:Do it with an apple (Score:2)
What, are you kidding? I suppose you think these guys [apple.com] sit on their hands all day long.
Re:Do it with an apple (Score:2)
"The Advanced Computation Group (ACG) researches algorithms and high-performance issues relevant to Apple technology. Apple's state-of-the-art processors coupled with ACG programs such as Apple/Genentech BLAST propel the Power Mac and Xserve to performance several times faster than other platforms running standard BLAST. This supercomputer performance dire
Me! (Score:4, Funny)
Re:Me! (Score:2)
Re:Me! (Score:1)
Penguin Computing (Score:4, Informative)
Re:Penguin Computing (Score:4, Informative)
As a Systems Engineer who has worked with a number of vendors, I would say that Penguin is the bottom of the barrel in service and quality control.
We have five clusters at our facility, the slowest of which is on the top500 in the 150 range. We've tried big and small vendors.
Penguin is the absolute worst. No two SCSI hard disks had the same firmware version, the RAID controller was DOA, etc. We buy/borrow a node from each vendor and evaluate them before buying clusters, and out of all the vendors, the Penguin node is the one that would crash or hang all the time. After months of trying, they were never able to get it going properly, even though we shipped it back twice and were told each time that we'd get back a whole new machine (it wasn't).
I would personally recommend Appro, IBM or Western Scientific in that order. Service and quality hardware are their game.
In this case... (Score:2)
Once you've done that, look through the proposals and pick which one sounds the best.
Submission Title (Score:2)
Re:Submission Title (Score:2)
Re:Submission Title (Score:2)
Only on ask slashdot... (Score:4, Funny)
If my name were Natchswing I would go with... (Score:2)
You got a grant with NO PLAN? (Score:3, Funny)
Re:You got a grant with NO PLAN? (Score:3, Informative)
Re:You got a grant with NO PLAN? (Score:2)
Re:You got a grant with NO PLAN? (Score:2)
how is Embry-Riddle relevant to (Score:2)
I thought that E-R was an aviation school.
Are they developing an anti-gravity levitation vehicle?
Re:how is Embry-Riddle relevant to (Score:2)
Re:how is Embry-Riddle relevant to (Score:2)
RFP is the answer (Score:5, Interesting)
- How much disk space are you going to need in total?
- How much disk space are you going to need per node?
- How much RAM is each node going to need?
- Is your application going to benefit from a low-latency or a high-bandwidth connection between nodes?
- What about CPU? Which CPU family will provide the best bang/$ for your calculations? PPC or x86? x86-64 maybe?
Once you know what you need, put it together in an RFP and send it out to every company that shows up under a Google search for "beowulf cluster".
Review the responses and pick the best.
Since you are asking this question here, I'm going to refrain from suggesting the better option which is to build your own.
Hector
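The questions in that checklist can be captured as a small requirements record before the RFP is written. This is a minimal sketch: the class name, field names, and every number except the 256-node count from the original question are placeholders, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class ClusterSpec:
    """Hypothetical requirements record to turn into RFP line items."""
    nodes: int
    ram_gb_per_node: float
    disk_gb_per_node: float
    shared_disk_gb: float   # central storage, outside the nodes
    interconnect: str       # e.g. "low-latency" or "high-bandwidth"
    cpu_family: str         # e.g. "x86", "x86-64", "PPC"

    def total_ram_gb(self):
        return self.nodes * self.ram_gb_per_node

    def total_disk_gb(self):
        return self.nodes * self.disk_gb_per_node + self.shared_disk_gb


# Placeholder numbers, not a recommendation.
spec = ClusterSpec(nodes=256, ram_gb_per_node=2, disk_gb_per_node=80,
                   shared_disk_gb=4000, interconnect="low-latency",
                   cpu_family="x86-64")
print(spec.total_ram_gb(), spec.total_disk_gb())
```

Writing the totals down this way makes it obvious when a per-node answer and a whole-cluster answer disagree.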
Re:RFP is the answer (Score:2, Informative)
Firstly, the more you put into the process the more you'll get out of it so be prepared to come up with a good RFP. If you're not an expert in clusters then you might well not know the answers to some of these questions so be prepared to take advice from suppliers. Sure, some of them may try and rip you off but most will be honest and helpful which will make the dodgy ones pretty easy to spot. Alternatively, look for some external, independent hel
Re:RFP is the answer (Score:3, Informative)
Corporate background questions are fine, but please stick to general stuff that can be answered with boilerplate. No one at the vendor knows or cares where our executive team went to college, and it's going to be a huge PITA to track that sort of BS down.
Ask what you want to know, but please re-read the RFP when you're done writing it. If you've asked the same questio
Microway (Score:5, Informative)
These are important questions you must ask your researchers and yourself before you purchase this cluster. But, to answer your question, I believe Microway is the best choice and I plan on having them build our next cluster in the next fiscal year.
-brian
Myrinet? (Score:2)
Re:Myrinet? (Score:2)
Re:Myrinet? (Score:2)
Re:Myrinet? (Score:3, Funny)
Re:Myrinet? (Score:2)
There is a lot of unnecessary ego on
Re:Myrinet? (Score:2)
Re:Microway (Score:2)
Re:Microway NOOOOOOO! (Score:2)
When we purchased ours, we had two nodes that had bad power supplies within the first two months... Replacements arrived within 2 days. The cluster withstood a severe AC outage, where the ambient temperature rocketed to 105 degrees and failsafes had not yet been implemented. We've had no further problems with the system since the initial hiccup, with a consistent load (added up) of 105.0 and an uptime of 104 days for every node.
How you could allow your cluster to run at
Negotiate a success story (Score:5, Interesting)
All these vendors want to be able to talk about their work. Letting them use you for marketing may help you get more for your money.
Not Angstrom (Score:4, Informative)
Sadly, I was not around when the proposal was made, otherwise I would have rejected this cluster outright. There is no way to hook external storage up to this beast. There are no USB, Firewire, SCSI, external SATA, or fibre channel options. You can't even run an ATA cable out of the thing without drilling holes into the blade walls.
Personally? I'm looking at an XServe or an IBM Bladecenter.. but maybe it's just because I'd like some real support.
Re:Not Angstrom (Score:2)
cluster experience (Score:3, Interesting)
We've got a 128 node (1 cpu per node) cluster from Atipa http://www.atipa.com/ that cost CDN$ 0.25M.
128 P4 Xeon, 1GB RAM, 120GB IDE, Gigabit Ethernet.
I'd expect you to get a lot more for your USD$.
The only thing I don't like about it is Atipa's configuration of Red Hat 8 (they didn't offer anything newer at the time). Look for something newer there.
Atipa is one of the suppliers for SGI-branded clusters as well.
I'd really like a cluster from http://adelielinux.com/en/, but I wasn't aware of them at the time we did our RFP and cluster purchase.
Re:cluster experience (Score:2, Informative)
Re:cluster experience (Score:1)
t
Apollo? (Score:2)
Peter.
Re:Apollo? (Score:1)
as apparent from his words:
Anyone else think this is.... (Score:2)
You got the grant *then* went shopping? Does all US academia work like this? Aren't you supposed to work out what you want to do, how to do it, how much and only then apply for the grant?
Dave
Re:Anyone else think this is.... (Score:2)
Since it can take over a year to get a grant in some cases, picking out the vendor before the grant arrives is usually stupid. By the time it arrives, t
cluster problem set (Score:2, Informative)
Most people can do you up a 256 node cluster for under half a million, but doing up one with high speed and low latency netw
go blade (Score:2)
2-way Xeon blades each from IBM off the shelf.
They have blades with dual gigabit NICs. A pair of
3-Com 16-ways NICs give you 2 parallel networks,
which makes it flexible. Run OpenMOSIX.
I'm pegging the whole shooting match at roughly
$420k. Spend the rest on NAS, pack out the RAM,
get a nice visualization wall, etc.
You're gonna need help.... *shakes head* (Score:2)
Second: you're almost certainly going to have to put it out to bid. For example, at UIUC [uiuc.edu], the bid limit is $28,100. Anything over that *must* go to bid unless you can provide a really good reason why you have to "sole source" it.
Now, you need to start thinking about stuff. First off, forget the number of nodes. You need to start by thinking about how they'll b
Re:You're gonna need help.... *shakes head* (Score:2)
You did give some useful information, but there was no need to start it off by calling him clueless.
I'd lend you mine... (Score:1)
contact other universities (Score:3, Insightful)
interconnect is the thing (Score:1)
Infiniband or gigabit ethernet are your main options. IB is low
Wanna Buy a Cluster? (Score:3, Funny)
You'll be back, believe me. You'll be back in no time.
ring.. (Score:3, Insightful)
..Dell
...IBM
and *talk* to a sales rep. I know how hard this is (not!) but asking Slashdot is kinda silly. Sure you might want some impartial advice but
ciao
Warning ! (Score:3, Interesting)
When my research group decided to build one [gdargaud.net], I was in charge and opted for OpenMosix [sourceforge.net], which after a tweaking period worked really well. Now with the various bootable CDs with OpenMosix (PlumpOS, BCCD, Quantian, ClusterKNOPPIX...), tests and upgrades are done by just pressing reset!
Of course with clusters your mileage may vary.
Don't ask slashdot (Score:2)
Building or buying a cluster is serious business. Talk to supercomputing experts. Issues involved are numerous. Just a short list:
on eBay (Score:1)
Re:on eBay (Score:2)
Low Cost Cluster Computing (Score:2)
You can also see one of
do you have a good relationship with one vendor? (Score:2)
They are less likely to take advantage, since they want to continue doing business with you. Your existing relationship will give you a little leverage.
Pay someone to do it for you? (Score:1)
Perhaps consider using a team member from a free software clustering project as your consultant (check credentials though)? That way you hopefully get someone who is an expert an
http://aggregate.org/ (Score:1)
I would have the research group that I work with at the University of Kentucky build it. Maybe you should contact my professor, Dr. Hank Dietz.
KASY0 [aggregate.org]
University Of Kentucky Supercomputer Breaks The $100 Per GFLOPS Barrier [aggregate.org]
They built the supercomputer for under $40,000 with 128 nodes + 4 spare nodes, just think how many nodes and how powerful it could be with $700,000!
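The "just think how many nodes" arithmetic can be made explicit. This sketch assumes purely linear scaling from the figures quoted above, counts the 4 spares as nodes, and treats $40,000 as the actual price even though the comment only says "under"; it ignores the network, power, cooling, and space costs that grow with cluster size.

```python
def nodes_for_budget(budget, ref_cost, ref_nodes):
    """Naively scale a reference cluster's cost per node to a new budget.

    Returns (affordable_nodes, cost_per_node). Integer arithmetic is used
    for the node count to avoid floating-point rounding surprises.
    """
    cost_per_node = ref_cost / ref_nodes
    affordable = budget * ref_nodes // ref_cost
    return affordable, cost_per_node


# Figures from the comment: under $40,000 for 128 nodes + 4 spares.
nodes, per_node = nodes_for_budget(budget=700_000, ref_cost=40_000,
                                   ref_nodes=128 + 4)
print(f"about ${per_node:.2f} per node, roughly {nodes} nodes for $700,000")
```

In practice the interconnect alone would eat a growing share of the budget at that scale, so the result is best read as an upper bound.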
Re:http://aggregate.org/ (Score:2)
PeTS [aggregate.org] may be applicable here, especially his research into Flat Neighborhood Networks (FNNs). However, I think that AMD/Intel systems use too much power (70 watts or so each). A computationally-equivalent cluster of VIA EPIA motherboards (maybe 10 watts each) would be both physically smaller and much easier on the electric bill. At $100 each for a VI [axiontech.com]
IBM of course (Score:2)
There's a reason that they say you never get fired for going with IBM. IBM has more super-computing experience than anyone. We've got an amazing turn-around capability when it comes to building clusters. But perhaps the best thing about going with IBM is the fact that it builds the relationship.
IBM is very involved with universities especially in the areas of high performance computing. W
Performance bid (Score:1)
"People asked me why we chose Compaq," says Marshall Peterson, Celera's vice president of infrastructure technology. "The answer is simple. We took a benchmark and gave it to all the vendors. Only two could run it. One ran it in 87 hours.
Compaq ran it in seven." Peterson didn't disclose the name of
Wow (Score:1)
Re:I'll do it (Score:3, Funny)
3. Pocket the leftover $499.5K