Mini-ITX Clustering 348
NormalVisual writes "Add this cluster to the list of fun stuff you can do with those tiny little Mini-ITX motherboards. I especially like the bit about the peak 200W power dissipation. Look Ma, no fans!! You may now begin with the obligatory Beowulf comments...."
Imagine.. (Score:3, Funny)
Re:Imagine.. (Score:5, Funny)
Evidently they didn't cluster enough...
Re:Imagine.. (Score:5, Informative)
The original links went to NASA/GSFC [nasa.gov], but the Beowulf project central site has moved.
Beowulf. The REAL deal. (Score:4, Informative)
of spear-armed Danes, in days long sped,
we have heard, and what honor the athelings won!
Oft Scyld the Scefing from squadroned foes,
from many a tribe, the mead-bench tore,
awing the earls. Since erst he lay
friendless, a foundling, fate repaid him:
for he waxed under welkin, in wealth he throve,
till before him the folk, both far and near,
who house by the whale-path, heard his mandate,
gave him gifts: a good king he!
To him an heir was afterward born,
a son in his halls, whom heaven sent
to favor the folk, feeling their woe
that erst they had lacked an earl for leader
so long a while; the Lord endowed him,
the Wielder of Wonder, with world's renown.
Famed was this Beowulf
Sample from the Project Gutenberg Text of Beowulf.
Why not do yourself a favour and download it. Classic stuff.
Re:Imagine.. (Score:3, Insightful)
Imagine... (Score:5, Funny)
Floating point performance (Score:5, Interesting)
I decided against a mini-ITX cluster because the floating point performance (why else would you build a cluster?) of VIA CPUs is just abysmal.
Is there any reason why there are no P4 or AMD mini-ITX mobos around?
Re:Floating point performance (Score:4, Interesting)
Re:Floating point performance (Score:5, Insightful)
And VIA markets their own line of CPUs for use in that scenario.
However, I wouldn't mind seeing Pentium-M or mobile Athlons placed on mini-ITX boards.
Re:Floating point performance (Score:4, Informative)
I'm not aware of any Athlon-based boards, but mostly because I'm satisfied with my Via-based M10000 board.
Re:Floating point performance (Score:5, Informative)
Re:Floating point performance (Score:2)
This [logisysus.com] was found after 2 seconds with Google.
Re:Floating point performance (Score:5, Informative)
Re:Floating point performance (Score:2, Interesting)
Btw, you're wrong - there ARE P4-based mini-ITX mobos.
Re:Floating point performance (Score:2)
Re:Floating point performance (Score:2)
Re:Floating point performance (Score:5, Informative)
But, most mini-itx systems are very small in size, and strive for quiet or silent operation. So, there are obvious problems with the P4's heat/power requirements. Perhaps a better solution is the Pentium-M in a mini-itx form factor. It has pretty good performance, at a low power/heat level: Pentium M [commell.com.tw]. But, most of the Pentium-M boards are intended for industrial or OEM use, so they are hard to find in retail, and are pretty expensive.
Re:Floating point performance (Score:5, Informative)
Couldn't find a link though, sorry.
Re:Floating point performance (Score:2, Interesting)
In fact, a Pentium M platform would be a perfect choice as long as the mobile Athlon mobos are impossible to find.
Does anyone have a link?
Re:Floating point performance (Score:4, Insightful)
Here's [mini-itx.com] your link, by the way.
Re:Floating point performance (Score:2)
Re:Floating point performance (Score:2, Informative)
Re:Floating point performance (Score:5, Informative)
Floating point is just a convenience. Almost any algorithm can be modified to work with fixed-point precision -- and without loss of performance.
Of course, many people will insist they need FP to be able to count dollars and cents -- they don't even think of counting cents (or any other fraction of a dollar) with integers, for example.
These are, usually, the same people who have trouble defining a bit...
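To make the parent's claim concrete, here is a minimal fixed-point sketch in Python. The Q16.16 format and all the function names are my own choices for illustration, not anything from the thread: fractional values are carried in plain integers with a fixed binary scale, so no FPU is needed.

```python
# Minimal Q16.16 fixed-point sketch: 16 integer bits, 16 fractional bits.
# The format and names here are illustrative, not from the parent comment.
SCALE = 1 << 16  # 2^16 units per 1.0

def to_fix(x: float) -> int:
    """Convert a float to Q16.16 (round to nearest unit)."""
    return int(round(x * SCALE))

def fix_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values; shift right to restore the format."""
    return (a * b) >> 16

def from_fix(a: int) -> float:
    """Convert back to float for display."""
    return a / SCALE

# 1.5 * 2.25 done entirely in integer arithmetic:
product = from_fix(fix_mul(to_fix(1.5), to_fix(2.25)))  # 3.375
```

The multiply is one integer multiplication plus a shift, which is why fixed point can be competitive on a chip with a weak FPU.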
Don't use FP for money (Score:3, Insightful)
Perhaps many people would insist on using FP dollars and cents, but those people are fools, and it is very easy to part them with their money. Just make sure all the rounding errors work out in your favor, which isn't hard if you have access to their accounts.
Yeah I know that for small numbers FP has no rounding errors, but that doesn't last long.
Re:Don't use FP for money (Score:3, Funny)
Peter Gibbons: Um, the 7-Eleven, right? You take a penny from the tray.
Joanna: From the crippled children?
Peter Gibbons: No, that's the jar. I'm talking about the tray, the pennies for everybody.
Re:Floating point performance (Score:3, Insightful)
* To crack encryption?
* To compile big projects?
* To compress huge files?
How about scientific computing? That's really the big thing that keeps cluster computing alive. Cracking encryption is the only thing on that list that makes sense; thinking the other tasks are computationally expensive just shows a lack of knowledge of other disciplines.
Re:Floating point performance (Score:3, Interesting)
Mars is not made any closer to Earth by the revelation that Alpha Centauri is really far away...
This is why you might need the FP performance. I was answering a totally different question -- what would you do without the good floating point performance.
Thank you, thank you.
Would you, please, demonstrate, how I can rebuild a project of 300
Re:Floating point performance (Score:5, Interesting)
But at a significantly higher development and debugging cost. Why go for an integer adaptation, if a P4 can do four FP operations in one clock using SSE2? I have tested my 2.4 GHz P4 at 6 gigaflops, in a practical application doing matrix inversion. The theoretical maximum for my machine would be 9.6 Gflops. If you RTFA, you'll see they mention 3.6 Gflops performance for their cluster, about 60% of my single-processor system. I see no point at all in building that cluster.
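The arithmetic behind those figures can be checked in a few lines (the measured numbers are taken from the comment above; nothing else is assumed):

```python
# Back-of-the-envelope check of the parent's P4 numbers:
# 4 FP ops per clock via SSE2 at 2.4 GHz.
clock_hz = 2.4e9
fp_ops_per_clock = 4

peak_gflops = clock_hz * fp_ops_per_clock / 1e9   # theoretical maximum: 9.6
measured_gflops = 6.0                             # parent's matrix-inversion result
efficiency = measured_gflops / peak_gflops        # fraction of peak achieved

cluster_gflops = 3.6                              # figure quoted from the article
ratio = cluster_gflops / measured_gflops          # cluster vs. one P4: ~60%
```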
Re:Floating point performance (Score:4, Insightful)
Power. (Score:3, Interesting)
Re:Power. (Score:3, Informative)
Well, I just applied my admittedly imprecise clamp ammeter to the power cable, and got ~2 amps @ 120 V = 240 W. Which means 240 W / 6 Gflops = 40 W/Gflop. That cluster has 200 W / 3.6 Gflops ≈ 55.6 W/Gflop. Slightly worse...
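The watts-per-Gflop comparison above, spelled out (both pairs of numbers come from this subthread):

```python
# Watts per Gflop for the two systems described above.
p4_watts, p4_gflops = 240.0, 6.0            # clamp-ammeter reading, measured flops
cluster_watts, cluster_gflops = 200.0, 3.6  # figures from the article

p4_w_per_gflop = p4_watts / p4_gflops                 # 40.0
cluster_w_per_gflop = cluster_watts / cluster_gflops  # ~55.6
```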
I admit that hardware interfacing is getting to be a problem for us hobbyists, since the demise of the ISA bus, but I have been able to get along with the parallel interface. I just hope the USB interface doesn't get to
Re:Floating point performance (Score:3, Interesting)
No, I tested it with a random matrix, as in
Re:Floating point performance (Score:5, Insightful)
Apparently you've never done any numerical computing, especially of the scientific variety. In an astrophysics simulation, for instance, the density of a field may span over 20 orders of magnitude, hardly reasonable to do with fixed point arithmetic.
Not to mention that many iterative algorithms can oscillate wildly in the presence of numerical error.
It is true that there are many other uses for a cluster besides numerical computing, however the idea that any floating point algorithm can be converted to fixed point could not be more wrong.
Disclaimer: My research at Cornell University is high performance clustered numerical computing.
Cheers,
Justin
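The dynamic-range point is easy to demonstrate. With any single fixed-point scaling (the 32 fractional bits below are my arbitrary choice), a value far below the scale's resolution simply underflows to zero, while a double carries its own exponent and keeps both extremes:

```python
# Dynamic range: a single fixed-point scale vs. IEEE 754 double.
# 32 fractional bits is an arbitrary, illustrative choice.
SCALE = 1 << 32
fixed_tiny = int(1e-12 * SCALE)  # underflows to 0 -- the density value is lost

float_tiny = 1e-12               # a double keeps it, exponent and all...
float_huge = 1e8                 # ...alongside a value 20 orders of magnitude larger
```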
Re:Floating point performance (Score:3, Interesting)
Point by point:
You did not get it. You are looking at the bit (and byte, and word) as a number. I suggest you look at it as a unit of information. With 64 bits you can only have 2^64 distinct possibilities. If you choose to treat them as numbers -- fine, you only have 2^64 distinct numbers.
Okay FYI I do remember first year discrete math, I have experience with inner details of compu
Re:Floating point performance (Score:3, Interesting)
This makes sense. With integers the density is uniform, which is an impediment in some cases, but a help in others. [Any attempt to quantify the number of cases in each group is silly and will reveal nothing but the attempter's personal bias. With my bias, I'll insist you are underestimating the n
Re:Floating point performance (Score:3, Informative)
Think of FP as a lossy compression algorithm. It allows the use of orders of magnitude fewer bits because it alters the density distribution of the representable numbers to meet the above specifications.
This makes sense. With integers the density is uniform, which is an impediment in some cases, but a help in others. [Any attempt to quantify the number of cases in each group is sill
Re:Floating point performance (Score:5, Interesting)
Older C3 cores run the FPU at half the clock rate. If you get the fanless 600 MHz EPIA motherboard, the FPU will be running at 300 MHz.
The newer, Nehemiah core C3 chips run the FPU at full clock speed. Any C3 newer than Nehemiah should run the FPU at full speed.
He used the VIA EPIA V8000A motherboard with an Eden core CPU. From what I found on Google (here [hardwareirc.com]), the Eden core does run the FPU at full clock speed.
In any event, he said the cluster has more processing power than a four-P4 SMP system, while taking less electricity to run. And it will be quieter and more reliable. I'd like to see actual benchmarks, but it seems like it makes enough sense.
I read about a cluster of PocketPCs, and that didn't make practical sense. It was just a fun project.
steveha
Re:Floating point performance (Score:3, Informative)
Whoops, I made a mistake. He actually said his 12-node VIA cluster has more power than "four 2.4 GHz Pentium 4 machines used in parallel". Not SMP!
Sorry about the mistake.
steveha
Re:Floating point performance (Score:4, Informative)
I have the VIA EPIA 8000 (not sure what the V and A modifiers mean), with an Ezra core. FYI, Eden isn't a core, it's an initiative. The VIA Eden is aka VIA EPIA 5000, and was the first fanless Mini-ITX. Eden was the development product moniker, and came to refer to the motherboard that was first produced from that initiative. It can also refer to any C3 CPU made to run fanless.
Back onto the original topic; my EPIA 8000 with an Ezra core runs the FPU at half clock. This document [via.com.tw] on the differences between the Ezra/Ezra-T and Nehemiah cores indicates that one of the fundamental differences between the two is the full speed FPU. So I doubt that the article you quoted is accurate...
Just some more info... Nehemiah was manufactured at 933 MHz, 1 GHz, and speeds up to 2 GHz are planned. The Ezra was manufactured at 533 MHz and 800 MHz in its first run; the 533 is also known as the Eden. The Ezra-T (the second run of the Ezra) was made at 600 MHz (aka Eden), 800 MHz, 933 MHz, and 1 GHz.
Re:Floating point performance (Score:4, Interesting)
Check out Theo de Raadt's little benchmark:
http://marc.theaimsgroup.com/?l=openbsd-misc&m=10
FPUs of the future? Re:Floating point performance (Score:3, Insightful)
Informative:
If you're looking for a small form factor for high-end processors, you will likely find future products using the picoBTX form factor. The motherboard layout provides better cooling for hot processors that mini-ITX can't address. Here's a summary of the BTX form factors from Anandtech [anandtech.com].
Interesting:
Has anyone figured out how to use the floating point power in their graphics cards for non-video applications? Those things are becoming so powerful that they need their own heat sinks. Ju
Re:Floating point performance (Score:2, Interesting)
Re:Floating point performance (Score:3, Informative)
Here's an old review [tech-report.com]. The VIA processors aren't built for speed; they're built for low power consumption. In that department, they're great. They're also relatively cool, temperature-wise.
I've got a machine based on
Pointy-Haired Boss (Score:3, Funny)
Kind of like that strip where he (the boss) wanted to have a SQL database in lime.
Re:Pointy-Haired Boss (Score:2, Funny)
Inexpensive for testing purposes, (Score:4, Insightful)
Still, you can get some stats on how the clustering works, what's the best algorithm for dispersing problems, and these boards are cheap, but that's about the only advantage I can see...
Simon
Re:Inexpensive for testing purposes, (Score:5, Interesting)
Re:Inexpensive for testing purposes, (Score:3, Informative)
Re:Inexpensive for testing purposes, (Score:5, Informative)
A Beowulf of mini-ITX boards is probably the cheapest way to get bragging rights. As a practical route to fast and cheap parallel computation, they are not.
However, I have purchased three (V10000 boards) thus far and intend to add more to my network as low power (as in Watts) servers.
I worked out that given the power of 10.78W (source: mini-itx.com's power comparison tool) for the V series (probably the one with the slowest CPU in the series, board only), I could save a fortune on electricity compared to a more regular computer.
The electricity company sells electricity at the rate of 0.63 ($1.18) per watt per year. Compared with a standard PC of 100W, I can regain the purchase costs (in savings) of the board and memory within two to three years.
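Jonathan's payback claim can be sketched in a few lines. The electricity rate and the board's power draw are his figures; the purchase price of a board plus memory is my assumption, chosen only to show that the "two to three years" conclusion is plausible:

```python
# Payback sketch for the parent's figures.
# purchase_price is an ASSUMED board+RAM cost, not a number from the comment.
rate_per_watt_year = 0.63   # electricity cost, per watt per year (parent's rate)
pc_watts = 100.0            # standard PC, per the parent
epia_watts = 10.78          # V-series board (mini-itx.com power comparison tool)
purchase_price = 130.0      # assumed cost of board + memory

annual_saving = (pc_watts - epia_watts) * rate_per_watt_year  # ~56.2 per year
payback_years = purchase_price / annual_saving                # ~2.3 years
```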
Also, I found rack mount chassis [icp-epia.co.uk] available cheaper than one for a regular sized case. This influenced my decision a little - who doesn't want a network of rack mounted computers?
Overall, because of the low price and low power the mini-itx boards are a no brainer if and only if the CPU power of each computer isn't important.
Jonathan
Re:Inexpensive for testing purposes, (Score:2, Informative)
One use I thought of right away... (Score:3, Insightful)
It would be quite useful for a university with an undergraduate course in high performance computing to have their own little NoRMA cluster to play with without the space, heat, and power consumption of a supercomputer.
Let the researchers use the real supercomputer, but the undergraduates can still play with message-passing parallel algorithms to their hearts' content.
Re:Inexpensive for testing purposes, (Score:3, Informative)
RTFA... he compares performance to 4-6 P4s. He does clustering for a living so I'm assuming he knows how to measure and compare performance at this scale...
Re:Inexpensive for testing purposes, (Score:3, Interesting)
Samba throws open a hell of a lot of threads. (At least on my network of 200 people.) A cluster with each node possessing an external network port would be able to split the threads across dedicated processors. Not too useful for me, but if someone was trying to serve a few thousand clients at a time, that would be useful.
TMYK
Re:Inexpensive for testing purposes, (Score:3, Interesting)
That said, the only time a cluster of servers will do better than a fast single node is when the task divides well over the cluster. Great for clustered webservers, even distributed data
Seriously, though... (Score:5, Interesting)
Has anyone tried stuffing several into a single 1U chassis? For a sort of cluster of clusters?
Re:Seriously, though... (Score:5, Interesting)
Re:Seriously, though... (Score:4, Interesting)
Re:Seriously, though... (Score:4, Funny)
Re:Seriously, though... (Score:4, Informative)
The answer to this is...
Yes! (2) [linitx.com] and Yes! (4) [linitx.com]
shuttle (Score:3, Interesting)
The only problem I've found so far is they only come with nvidia onboard graphics, but that's what the AGP slot is for.
Imagine... (Score:5, Funny)
In fact, maybe you just aren't that funny. Except in Soviet Russia.
Shit, now I'm doing it.
Re:Imagine... (Score:2, Funny)
Re:Imagine... (Score:2, Funny)
Sorry. I'll go kill myself now.
This with Chess (Score:3, Interesting)
Some preliminary performance results (Score:5, Informative)
Oh my goodness! (Score:5, Funny)
Around here, that must make you a god!
Re:Some preliminary performance results (Score:3, Funny)
Why 12 nodes? (Score:2)
Cool stuff ... (Score:5, Interesting)
Here's a picture [amd.co.at] of our first 4 boxes. The USB stick seen sticking out from one of the boxes is bootable and an excellent replacement for floppy disks...
more information ... (Score:4, Informative)
Re:Cool stuff ... (Score:2)
Re:Cool stuff ... (Score:5, Informative)
Yes, I guess that most current BIOSes of the newer boards do, especially the consumer-ish stuff. We just used the stock Shuttle XPC with its FlexATX-board.
Re:Cool stuff ... (Score:2)
Re:Cool stuff ... (Score:3, Informative)
Hmmm (Score:5, Funny)
So what was it? No cutting, or cutting?
FLASH... (Score:2, Interesting)
Maybe he should consider PXE instead.
Re:FLASH... (Score:5, Interesting)
Actually, he's not. IBM Micro Drives are not CF, they just have a CF form factor/interface to be compatible with hand held devices. They are hard drives.
Re:FLASH... (Score:3, Informative)
"CompactFlash(R) is a small, removable mass storage device."
So you are correct in noting that he is actually using HDDs, not flash; but at the same time, he is using CompactFlash. (BTW, the CF pinout is IDE-compatible, so to hook a CF card up to your IDE bus, all you have to do is connect the wires of the IDE cable and the power cable to the card.)
Whilst not clustering... (Score:5, Interesting)
Enough on the Web Server !!! (Score:2)
Silly question, I know, but... (Score:5, Funny)
I wonder too (Score:3, Insightful)
Video encoding? (Now, where'd I put that parallel-processing version of AVISynth?)
Rent it out to a university?
Program it to solve chess and leave it going till it does?
Get a decent frame rate in any FPS, once and for all? (Note to self: develop parallel-processing graphics card.)
Re:I wonder too (Score:4, Informative)
dyne:bolic [dynebolic.org] is a Live CD distribution, very small, that can be PXE-booted, with full audio/video capture/editing/processing/streaming capabilities plus the usual suite of tools, a few games and whatnot... and is auto-clustering on a private network.
Test Text. (Score:4, Informative)
The construction is simple and inexpensive. The motherboards were stacked using threaded aluminum standoffs and then mounted on aluminum plates. Two stacks of three motherboards were assembled into each rack. Diagonal stiffeners were fabricated from aluminum angle stock to reduce flexing of the rack assembly.
The controlling node has a 160 GB ATA-133 HDD, and the computational nodes use 340 MB IBM microdrives in compact flash to IDE adapters. For file I/O, the computational nodes mount a partition on the controlling node's hard drive by means of a network file system mount point.
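A conventional way to set up that shared file I/O is a plain NFS export. The hostnames, paths, and subnet below are illustrative guesses; the article doesn't give them:

```
# On the controlling node (hypothetical name: node1), export a partition.
# /etc/exports:
/scratch  192.168.1.0/24(rw,sync,no_subtree_check)
# ...then reload the export table with:  exportfs -ra

# On each computational node, mount it at boot.
# /etc/fstab:
node1:/scratch  /scratch  nfs  defaults  0  0
```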
Each motherboard is powered by a Morex DC-DC converter, and the entire cluster is powered by a rather large 12V DC switching power supply.
With the exception of the metalwork, power wiring, and power/reset switching, everything is off the shelf.
At present, the idle power consumption is about 140 Watts (for 12 nodes) with peaks estimated at around 200 Watts. The machine runs cool and quiet. The controlling node has 256 MB RAM and a 160 GB ATA-133 IDE hard disk drive. The computational nodes have 256 MB RAM each and boot from 340 MB IBM microdrives by means of compact flash to IDE adapters. The computational nodes mount a partition on the controlling node's hard drive by means of a network file system mount point.
Power and Cooling
Mini-ITX boards have very low power dissipation compared to most motherboard/CPU combinations in popular use today. This means that a Mini-ITX cluster with as many as 16 nodes won't need special air conditioning. Low power dissipation also means low power use, so you can use a single inexpensive UPS to provide clean AC power for the nodes.
In contrast, a 12-16 node cluster built with Intel or AMD processors will generate enough heat that you will likely need heavy duty air conditioning. Additionally, you will need adequate electrical power to deliver the 2-3 kilowatts peak load that your 12 node PC cluster will require. Plan on having higher than average utility bills if you use PC's...
Hardware Construction
The cluster is built in two nearly identical racks. Each rack has two stacks of three motherboards and dc-dc converters mounted on aluminum standoffs.
The compact flash adapters used to mount the microdrives are also in stacks of three. Each stack of boards is mounted on a 7 inch by 10 inch, 0.0625 inch thick 6061-T6 aluminum plate, as are the microdrive stacks. There are seven metal plates in all in each rack.
The top cover plate has the mounting bracket for the 6 on/off/reset switches.
The plate below it is home to the power distribution terminal block. The power delivery cable for each rack is heavy duty 14 gauge stranded wire with pvc insulation. The power cabling from the terminal strip to each of the dc-dc converters is 18 gauge stranded pvc insulated hookup wire. The wiring for the power/reset switches is 24 gauge stranded, pvc insulated wire.
The top rack houses nodes one through six (node one is the controlling node). The bottom plate of the top rack also houses the 160 GB ATA-133 hard disk drive used by the controlling node. All other nodes make use of the IBM microdrives. Node number three has a spare compact flash adapter which can be used to duplicate microdrives for easy node maintenance.
The disk drive and power cabling to the motherboards was dressed as sanely as possible on the back panel. The liberal use of nylon cable ties helps reduce the ten
Just because you can... (Score:3, Insightful)
Re:Just because you can... (Score:2)
Re:Just because you can... (Score:4, Insightful)
Flops/$$$ = free (Score:4, Insightful)
slashdotted already? (Score:5, Informative)
I managed to get it mirrored here:
page 1:
http://www.phule.net/mirrors/mini-itx-cluster.htm
page 2:
http://www.phule.net/mirrors/mini-itx-cluster2.ht
page 3:
http://www.phule.net/mirrors/mini-itx-cluster3.ht
heh I got obligatory for ya (Score:3, Funny)
*ba dum ch*
Mini-ITX? Bah! Nano-ITX!!! (Score:4, Informative)
Motherboard, slim DVD, and laptop HD in a case the size of a large paperback book!
It will make my "K-Mart Toolbox Mini-ITX PVR" look like a full tower in comparison!
Sounds Fun (Score:5, Interesting)
I think all of these could be solved at once. What if someone built a low-power, low-noise, low-cost computer, good enough for running light office applications? I don't mean OpenOffice, but rather lightweight programs that implement the functionality people use _without_ the bloat. My 486 handles email just fine, and the WYSIWYG word processors were once satisfied with a first-generation Pentium (and even those were already bloated).
Current PDAs have more than enough processing power to handle those tasks, and I've noticed that companies like gumstix [gumstix.org] build and sell devices almost like what I have in mind (the gumstix don't seem to have display connectors, though). Hey, these machines could actually be portable and have a really decent battery life (more than a full working day); that would be a killer!
Am I just daydreaming here or are others with me? Maybe you know of devices that do this job? Someone recommended Sharp's Zaurus, which is excellent, but still rather more expensive than what I have in mind.
Massively Parallel (Score:5, Insightful)
I'd just like to point out that 12 nodes is not "massively parallel."
Re:Massively Parallel (Score:3, Insightful)
12 nodes is about as small as a cluster gets, so "a small cluster" would be a better description than "a massively parallel cluster".
(But it really looks cool, and 12 V DC via a lab PSU is cool, too.)
I want one! (Score:2)
I'd really like to build and use my own cluster, as I do have some MPI experience from college. The only question is: What are they good for at home? I just can't justify the expense to myself without figuring out what I could really do with a cluster if I built one.
Ideas sought!
~D
Why Microdrives? (Score:2)
Re:Why Microdrives? (Score:3, Informative)
Why this particular set of software / booting? (Score:5, Interesting)
I've always wondered; why not PXE boot something like this? Set your node controller to also do DHCP and you're set.
While you're at it, use the CL version for the controller, which has two network cards, and build a NATting firewall into the node controller too. Then you have a plug-in appliance that doesn't interfere with your network topology at all. PXE boot it and the motherboards will only need RAM.
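The DHCP side of that setup amounts to a short dhcpd.conf stanza. All addresses and filenames below are illustrative, not from the comment; they assume the node controller also runs the TFTP server:

```
# /etc/dhcpd.conf on the node controller (all values illustrative).
# Compute nodes PXE-boot; "pxelinux.0" is fetched over TFTP.
subnet 192.168.1.0 netmask 255.255.255.0 {
  range 192.168.1.100 192.168.1.200;   # address pool for the compute nodes
  next-server 192.168.1.1;             # TFTP server = the node controller
  filename "pxelinux.0";               # PXE bootloader image
}
```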
The board he used is available for $99 with proc. A stick of 256 is probably around $20.
The best price Froogle would give me on the drives he's using is $60, and they're prone to wear and tear.
Add in the $10 CF-IDE adapter and the drive is 60% of the cost of the motherboard itself...
Hell if you don't want the network bogged down with a bunch of PXE booting nodes all the time, just get cheap CD drives and put dyne:bolic [dynebolic.org] on it, which does automagic clustering...
Personally, if I were to do it, I'd set dynebolic to PXE boot, get a huge stack of motherboards and RAM, and do it that way. Then adding/changing nodes is relatively simple... IIRC, they're even factory set to try PXE booting if no IDE devices are found...
The only other change I would make would be to ditch the 16-port switch... move to 4-ports, connect those to a 4-port with gigabit uplink, and connect that to a gigabit switch. Of course at this point I'm talking about really scaling the cluster up, to a few hundred nodes or so. At that point I'd stop using a mini-ITX board for my node controller and go with a motherboard with a bit more juice behind it, dual procs, RAID 0/1, the whole shebang...
Now if only I had a couple grand burning a hole in my pocket... speaking of which:
motherboard: $100
RAM: $20
DC-DC converter: $30
CF adapter: $10
Microdrive: $60
Total: $220
Total PXE booter: $150
Savings: 30%
So, not counting the costs of cabinets, power rectifier/UPS, wiring, network gear, and labor, you can increase the size of your cluster by 30% for the same cost, just by setting up PXE boot...
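The per-node tally above checks out (all figures are the parent's own estimates):

```python
# Per-node cost comparison from the parent's list.
with_microdrive = 100 + 20 + 30 + 10 + 60  # board, RAM, DC-DC, CF adapter, drive
pxe_only = 100 + 20 + 30                   # board, RAM, DC-DC converter

saving_fraction = 1 - pxe_only / with_microdrive  # ~0.32, i.e. roughly 30%
```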
Re:I built a fanless ITX system... (Score:4, Informative)
Re:I built a fanless ITX system... (Score:4, Informative)
My requirements were essentially (1) no moving parts, (2) affordable if not cheap, and (3) small. I settled on one of these [transcend.com.tw]. Debian is fine on 128 MB, with 512 MB of RAM and no swap. Performance, it should be said, sucks. The next step up, for slightly more performance, much more capacity, and a whole lot more cost, is here [m-systems.com]; but I wanted to avoid using a case that needed drive bays, plus I haven't pockets that deep.
Neither of those is likely to be what you want for a database system, though. You're probably more in the market for a bunch of ram and a battery, unless your primary concern is reliability. If speed is the goal, you want this [cenatek.com], or, for more capacity and more money, this [soliddata.com]. Note that I haven't used either extensively, and in playing around with the rocket a little, I was surprised just how much of a bottleneck PCI became. Also, the rocket doesn't have a battery... so really, unless you have a board with 8GB of memory, and you just need another 8GB of low latency space, it's not such a great deal today.
If you fit into any of the niches above, solid state is wonderful. It's always more expensive than you think, though. And for any database systems I've dealt with, a disk is without question the way to go, perhaps with more memory on board. But if you want any further tips, I'm glad to help.