Hardware

Multiprocessor G3/G4 Boards

giminy writes: "These boards from TotalImpact look pretty nifty. Each one can take 4 G3's or 4 G4's and go in a regular PCI slot -- and get this, they can run in Intel machines. They work by having a program dumped to them, so each card acts like a second computer inside your box. Still kinda pricey for the cards, but you can put as many of these cards in your server as you want for something super-scalable. Linux support is there, and datasheets are available." We mentioned these back in '98, but a lot has changed since then. I'm sure there are clever uses for a couple of spare CPUs in a box ;)
  • They're more like "Computing peripherals" - They're PCI cards that fit in any PCI slot. (Well, one conforming to the right specs - One of their 66 MHz/64-bit PCI cards won't work in your average box.)

    The cards will not run standalone or as a primary processor; they're slave processors. You still need a host processor, which can be whatever you want (Intel, SPARC, Alpha, PPC, probably even StrongARM).
  • If you need to constantly go over the PCI bus for everything (memory, disk, etc) then yes, you'll run out of bandwidth real quick.

    However, the board appears to have a lot onboard, which keeps the bandwidth requirements low and leaves you with something like a "black box" scenario: you have an image you need manipulated, so you send it to the G4 board with the manipulation instructions. The board gnaws on it for a while without touching the PCI bus, then returns the modified image.
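    To make that concrete, here's a rough host-side sketch of the offload pattern in plain C. The device node (/dev/mpower0) and the job header are purely hypothetical stand-ins; the real TotalImpact driver interface may look nothing like this:

    /* Hypothetical "black box" offload: ship a job to the card, read the result.
       /dev/mpower0 and the job header are invented for illustration only. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    struct job_header {          /* describes what the card should do */
        unsigned int opcode;     /* e.g. 1 = sharpen image */
        unsigned int data_len;   /* bytes of payload that follow */
    };

    int main(void)
    {
        int fd = open("/dev/mpower0", O_RDWR);   /* hypothetical device node */
        if (fd < 0) { perror("open"); return 1; }

        unsigned char image[64 * 64];            /* stand-in for real image data */
        struct job_header hdr = { 1, sizeof image };

        write(fd, &hdr, sizeof hdr);             /* send the instructions... */
        write(fd, image, sizeof image);          /* ...and the data */

        read(fd, image, sizeof image);           /* block until the card returns the result */
        close(fd);
        puts("card finished the job");
        return 0;
    }

    The point is that the PCI bus is only touched for the transfer in and out; the actual crunching happens on the card's own memory bus.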
  • but not your favorite closed-source variety

    why the hell would you say something like that? it's just plain stupid. In case you've been absent the past 15 years, hardware support is almost always more complete with 'closed source' OS's than with OSS OSes. man, please don't be such an idiot. Windows supports more pieces of hardware than there are competent linux operators!

    :)Fudboy
  • Yeah, I wonder what kind of RC5-64 rate you could crank out with a couple of those monsters in your machine? You'd probably put a sizable dent in the time needed to finish the project.

    Looks like you beat me to the obligatory distributed.net comment.

  • If you look at the RC5 Benchmarks [totalimpact.com] page, it lists benchmarks for their 604e boards. So it can do it. It is just a matter of how fast the G3/G4's are going to be on it. Then again, who is going to spend this much money for just RC5? If you do have money like this to burn, send it my way.
  • the low FPU based needs of a server,

    How true is this? If you're a disk-only server, or even a SQL box, then you need disk bandwidth, memory, disk space and network bandwidth, in roughly that order. OTOH, if you're bothering to put extra processor boards into your server at all, you're presumably needing to crunch serious numbers. How much of a margin is there between needing an essential FPU and not needing extra processors at all?

    I'm getting into packaging media for streaming servers. A bucketload of G4s in a box sounds like a fine idea to me.

  • whoah, that came off kinda harsh! Sorry, I just get so excited by motherboards...

    I should mention that, yeah, I would like more IRQs as well. Another thing - multiplexing at that scale is probably feasible, a function of the MB chipset... the chipset manufacturers could provide for this if there were a profitable reason.



    :)Fudboy
  • I think it's only available for the *500 series PowerMacs at this point. I'll sell you my 8500 if you're really interested :)

  • I may be mistaken, but the marketing info makes this seem like Asymmetric Multi-Processing. I thought that had pretty much fallen into disfavor. It just can't match SMP in most cases.
  • I was even more surprised to see that the 8600, 9500 and 9600 were listed.

    from the 9600/350 spec sheet [apple.com]:
    Data Path: 64-bit, 100 MHz
    Slots: 6 PCI
    Notes: One PCI slot occupied by video card. System supports 100 MHz cache bus, and 50 MHz system bus speeds.

    from the 9600/200MP spec sheet [apple.com]:
    64-bit, 50 MHz
    6 PCI

    from somewhere else on the Apple site, re the 9600 (I forgot to copy the link):
    Six PCI expansion slots compatible with PCI 2.0-compliant cards

    does that provide any info? I don't know what PCI 2.0 implies, exactly...

  • naw, I bet if I carefully align the teeth, I could just tape a video card onto the processor card and presto! I could even hook up the power feeds with alligator clips for that extra geek-ass oomph. I bet the thing would just boot right up, too, spontaneously writing its own OS. that's a whole lot of transistors you know.

    :)Fudboy
  • Supposedly quad g3/400's are $4500, quad g4/400's are ~$6500. Yeah, it's kind of pricey, but if you stick 8 of these things in a box, you have some serious computational power.....
  • well, with the G4's on there, it would be a personal quadruple supercomputer.

    drool...

  • I'm pretty sure the design of the AGP bus allows only one AGP interface per system bus (eg CPU/Memory/AGP shared bus)
  • Ok, this is kinda cool: you can put lots of processor power in one box. Of course you'll probably have a bottleneck at the bus, so it won't actually be that fast for a lot of stuff. The real question is, what the hell am I going to run on it?

    I mean, it's Mac chips which will most likely go into PCs. No software that's straight off the shelf will run on this thing because it's too freaking weird. Definitely not Windows (but so what) and most likely not MacOS either (ditto). However, I'm betting you can't just throw Mandrake on this either and get it to work. This company is going to have to put out a custom Linux distro just so people can get some practical use out of the concept.

    I mean, if you're not going to be using Open Source software with this thing you may very well be up the proverbial creek. That's not a problem for many slashdotters, but if I want to run a commercial analysis package that's available in binary only, this architecture is probably right out.

  • I'm not sure everyone would have use for this, but imagine a PCI card that used the bus for little (if anything) more than power and a fast LAN connection. Plug a P3+RAM+video+?sound card into a PCI slot, plug in a keyboard, mouse, monitor, maybe speakers, and you've got a dual (or more) mobo case. You could either mount a filesystem from the "host" mobo, or toss an IDE connection on the board as well. Sound too small to be true? The EspressoPC did it, and while it would obviously have some nasty power requirements, and while it wouldn't be for everyone, it would have a wide number of potential uses:

    A few years back, this company called Ross Technologies (recently defunct) sold a product called the SparcPlug, which fit a Sparc/Solaris workstation in a PC's spare 5.25" drive bay, allowing you to run NT and Solaris simultaneously. It had its own Ethernet, but used the PC's keyboard, mouse, monitor, drives, etc., and was sold bundled with a Dell Pentium for around $10k. I don't know if anybody made an actual x86-compatible PC-on-a-card for a PC, though...
  • I even heard about developments of PCI boards consisting of 32 StrongARM CPUs.
    Given 1 MIPS/MHz of performance, just imagine yourself with a virtual 5-6GHz per board!
    ARMs are also known for having the best puissance/consummation ratio.
    They indeed hardly burn even a single Watt each.
    --
  • Well, it is supposed to run in Mac 8600, 9500, and 9600 boxes. (Those are the ones immediately before the first G3's) So you could pick up one of those and use LinuxPPC.

    If you also got a G3/G4 upgrade card, you ought to even be able to custom compile the programs to take full advantage of the newer processors...

  • As was mentioned before, DayStar had dual and quad 604 machines out a while ago. Apple produced the 8500/180 and 200 with dual 604s. Also, IIRC, the original BeBoxen were dual 603 machines...

    However, I'm not sure that any of these were truly "on the board" and not on processor daughtercards.

  • What about the larger x86 servers from Unisys and Sequent which have DOZENS of PCI slots?

    G4 500's are rated at around a GFLOPS. So about 4 GFLOPS per PCI slot? Some servers have like 32, 96, 100+ 64-bit 66MHz PCI slots... there's a thought. Heh.

    --
  • PCI beats Ethernet any day, but most scientific clusters use Myrinet or SCI, which are about 1 Gbps full duplex, while PCI is about 4 Gbps half-duplex with a 64-bit/66MHz bus. This means that as soon as you have two of those babies in your computer, your PCI bus is actually slower than if you had a switched Myrinet or SCI interconnect.

    The other problem is that the bus on the card is too slow to handle four CPUs. Our experience is that anything over two CPUs in a single machine will cause bottlenecks. Except on SGIs with ccNUMA, of course, which can handle eight CPUs per machine easily.

    Memory is also a bit tight - we usually need about 512MB per CPU; this thing has 512MB for all 4 CPUs.

    Well, that's my NSHO and experience.
  • by peter ( 3389 ) on Tuesday June 27, 2000 @05:05AM (#973510) Homepage
    Here's [mot.com] Motorola's G4 fact sheet. The real lowdown on the G4 is here [mot.com]. Especially check out the hardware spec. (The link seems to be broken or something, though. I looked at it a few weeks ago.) :(

    The TotalImpact page doesn't say what speed they run the L2 cache at (the PDF spec sheet link is broken :( ). G4s support a range of clock divisors for the external L2 cache SRAMs, from 1:1 to 4:1. Apple uses 2:1 in their towers. (BTW, the cache RAM is external, but the control logic and stuff is all on chip.)
    #define X(x,y) x##y
  • A fair number of i840 boards have them too. Shame you need RIMMs, though - but if you can afford one of these boards you can probably afford the RIMMs.
  • If a G4 dissipates 10W, then a 4-CPU board will dissipate 40W, and 8 of them will dissipate 320W. Okay, only a problem for people with more CPUs than sense.
  • The more I think about it, the more I like it. It sounds exactly as powerful as an SMP machine BUT you can add more CPUs more easily. Up to 8 cards each with 4 processors plus my main processor is 33 CPUs! In one machine! And you can have more than 8 cards if you don't "map all the memory" (according to the page).

    How much??
    --
  • I think the software would have to be very specialised to use the board: by the sounds of things, a program needs to be written specifically for this board, not just be threaded.
  • The Tyan board I mentioned is indeed i840, but looking at the spec, it uses PC100 DIMMs. Glad to see we're not entirely forced into Rambus. You're right though, for a $600 board, RIMMs would be much less of an issue.

  • by jedi@radio ( 100551 ) on Tuesday June 27, 2000 @05:11AM (#973516)
    Looks like they are working on a Linux-specific product [totalimpact.com] too...
  • From the company's web page it looks like they are powered with MPI - not PVM, not a thread model.

    I'd be very interested in how much of MPI and MPI-2 they support...

    What would be interesting is having a number of these cards connected together in the same machine, using MPI for on-card communication, and then some sort of IMPI or some other protocol for communication to other cards (or other machines).
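    For the curious, the MPI side of that is pretty simple. Here's a minimal C sketch using only standard MPI calls (MPI_Init, MPI_Comm_rank, MPI_Reduce) - nothing TotalImpact-specific, just the generic pattern of one rank per CPU splitting up the work:

    /* Minimal MPI sketch: each rank (e.g. one per CPU on the cards) sums a
       slice of the work and rank 0 collects the total. Standard MPI calls
       only; nothing here is specific to TotalImpact's implementation. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* trivially parallel work: each rank handles every size-th element */
        double local = 0.0, total = 0.0;
        for (int i = rank; i < 1000000; i += size)
            local += (double)i;

        MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum = %.0f (computed by %d ranks)\n", total, size);

        MPI_Finalize();
        return 0;
    }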
  • So, as Linux/Apache webservers aren't all that uncommon, how about running the SQL server on one card and the application server on another? All the machine is left to handle on the main CPU(s) is Apache, which queues processes for the cards and tosses results back to the HTTP pipe.
  • Heh, my Lisa MB has a CPU slot. Talk about running ahead of the pack.
  • One of their 66MHz/64-bit PCI cards won't work in your average box.

    Actually, if it follows the spec it will work fine in any PCI slot. If you plug a 64-bit card into a 32-bit slot the card runs with half the bandwidth. A 32-bit card in a 64-bit slot just works like normal. A 66MHz card in a 33MHz slot runs at 33MHz, and a 33MHz card in a 66MHz slot downgrades the whole bus to 33MHz (ouch!).
  • How's this for a configuration? If you look at the specs on these boards, they allow daughterboards too. The daughterboards include SCSI, 10/100BaseT Ethernet, and IEEE1394. Mainframe architectures offload I/O from the CPU to a large extent. Suppose we use these in a system like that... Use the PPC board with Ethernet, one on either end, and you have the beginnings of a highly secure network card (can you say 1024+ bit keys?). Offload the SCSI, letting the PPC board pick up the I/O control, manipulating the data. Using the Ethernet daughtercard, you can monitor the network traffic, basically building the firewall directly into the NIC at that point. The possibilities for these critters go on.

    Then there's the Beowulf - Talk about HUGE... 4 way processor host machine, with 8 of these PPC cards fully loaded, then put these into the Beowulf cluster. Multi-level parallel systems - now you're talking super-computing at a level never seen before!

    Then there's the ultimate use... Take a system like this, 4 way processor with these cards, using the cards as multimedia, network, and other I/O sub-processors, and you're talking an incredible gaming box...
  • Uhhh. the AGP bus is meant for VIDEO cards, whereas PCI is more general purpose..


    Uhh, no. AGP is another bus on the system. It's basically a high-speed memory bus, and while it is optimised for one-way, processor -> AGP transfers, that by no means limits its use to video cards. I could see where it'd be useful to, say, dump a chunk of encrypted data straight into a piece of shared memory for the processor card to sit there and chew on, or maybe dump a raw stream of audio through for the processor card to encode to a format such as RealAudio, leaving the PCI bus free for the task of actually shipping that stream out to clients.

    Just because it's mostly used for video cards now, doesn't mean that video cards are the only thing AGP is good for.
  • I was just about to reply about the addition error when I actually read the contents of the last percent. Almost as funny as an average scene on Frasier.

    Speaking of TV shows, do you Ameri-Co users get to see "The Games", a hilarious insight into the fiasco that is the Sydney Olympics management. The show's premise is a cameraman who follows the head management team around, looking into meetings, etc etc.

    http://www.abc.net.au/thegames/

    One scene, the head guy is pep-talking in a staff meeting, and couldn't read his own handwriting.

    "... and our love of... what's that word?"

    "Sport."

    "Aaah thanks. Heh, I can't read my own... what's that word?"

    "Writing."

    "Can't read my own writing. That's it. Anyway..."

    --

  • At the recent SMP meeting about FreeBSD held at Yahoo, there were 3 people there from Apple Computer. Feel free to speculate.
  • I used to use a 386 with an array of 16 T800 transputers to render. Each board of 4 transputers had 4 megs of RAM, as well as one meg for each transputer as cache. They communicated along a dedicated back bus.

    This was used for RenderMan rendering with the old Digital Arts DGS system. The main processor would split the job into 16 x 16 pixel "buckets" and send the pre-clipped scene data (geometry, lighting, surface information) as well as a portion of the textures used in the scene. As each transputer finished the contents of its bucket, it would dump it back along the ISA bus to the Targa framebuffer.

    That's the sort of process these are useful for. Not SMP, but assisted special-purpose processing.

  • I can see some potential problems with using these in normal systems. Even if you have a 66MHz 64-bit PCI slot, I think you'll have problems with power consumption, especially with 2 or more processors. A PCI card can only draw about 25W according to the spec, yet one of these cards with two processors will need to draw at least 14-20W for the processors alone. I didn't see any information about a separate power supply like the Voodoo 6000 has, so I assume that you'll need a special PCI slot that can supply the power, or need to attach the card to a power cable.

  • A Beowulf Cluster of these?
  • Ask the people at Be. The original BeBox [emedia.com.au] used dual 603s and SMPd like nobody's business.

  • Mostly used for video cards?

    AFAIK, AGP is only used for video cards at this point. I'm not sure why, but if they don't use it for anything else, there must be a reason that you and I haven't considered. Yes, I agree AGP sounds like it would be a good idea to use here, but who knows. Further discussion is next to useless because we do not know the relevant information.

    PS: AGP was meant for video. Accelerated Graphics Port.
  • ...but if you're going to design something utterly centralised, then to make it replace a reliable server farm you're going to end up reinventing the mainframe

    ...which might not be that bad of an idea for server-only uses. IMHO, of course.

  • Yes, this is new. This is not the main CPU of a system that just happens to be connected in a package resembling a "card" as you put it. This is a PCI expansion device ("card") for use in an x86 system (or PPC), not as a main CPU or even as a second CPU. It is a "PCI based multiprocessing solution for Intel compatible PC's [or Macs]".
  • I'm well aware of your faq in which you try to justify your waste of bandwidth.

    Temper, temper. Calm down. Witness the "Slot A" slot on my athlon. It looks physically identical to a connector for a PII mobo. Do 'ya think they'll work in each, however? No, which is why you read the tech specs..

    Temper temper? To you I say foolish, foolish. I did read the specs (you did not, because obviously you did not read the site before you posted; I did). It could be PC66, 5V DIMMs, it doesn't matter: it is a DIMM slot, and as you can see (that is, if you have even read it at this point, something I am beginning to doubt you ever will do) these DIMM slots are CLEARLY occupied by cards with memory chips on them. Do you plan to argue that they might not be RAM chips in order to justify your original post?

    Furthermore, I hardly consider a visit to ONE vendor site a reasonable view of the market. Just recently I have read several reviews of 400W and 450W consumer power supplies. This board is clearly not designed for consumer use; it is a 66MHz, 64-bit PCI device, a slot not found on consumer-level motherboards, but you would have known this had you visited their page by now, wouldn't you?


    This may be considered a flame, but the root post of this thread is obviously a troll. If you are going to make statements about a product, please inform yourself about the product in question. Don't be the hardware equivalent of one of those foolish people who protest against movies they haven't seen.

    NightHawk

    Tyranny =Gov. choosing how much power to give the People.

  • Great use for the new bus. Too bad most traditional PCs don't support either yet. The only machines I know of that have 64/66 are the UltraSPARC-based machines.

    Or are there x86 boxes out there that have it?
  • And for those who don't speak French (or haven't studied Latin):

    puissance/consummation = power (as in MHz) / (energy) consumption
  • So if you can put a bunch of these in a rackmount with a gig or two of RAM, wouldn't it be a cheaper alternative than a Beowulf cluster?
  • The PPC chips can have their endianness set either way, so no problems in that area.
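    Even so, if you share raw buffers between a little-endian x86 host and a card running big-endian, anything wider than a byte still needs a byte-order convention. A plain C sketch, nothing card-specific:

    /* Byte-order check and a 32-bit swap, for sharing binary data between
       a little-endian x86 host and a big-endian PPC card. Plain C. */
    #include <stdint.h>
    #include <stdio.h>

    static int host_is_little_endian(void)
    {
        uint32_t one = 1;
        return *(unsigned char *)&one == 1;
    }

    static uint32_t swap32(uint32_t x)
    {
        return (x >> 24) | ((x >> 8) & 0x0000FF00u)
             | ((x << 8) & 0x00FF0000u) | (x << 24);
    }

    int main(void)
    {
        uint32_t value = 0x12345678u;
        if (host_is_little_endian())
            value = swap32(value);   /* convert to big-endian before handing to the card */
        printf("wire value: 0x%08X\n", value);
        return 0;
    }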
  • dcypher.net or SETI@home is not a _clever_ use. It is an obvious use. So let's not waste space yacking about it.

    OTOH, clusters are better when they have faster interconnections, so what if you got a mobo with a lot of PCI slots, and put a bunch of cards in it? PCI beats ethernet any day :)
    #define X(x,y) x##y
  • by michael.creasy ( 101034 ) on Tuesday June 27, 2000 @05:12AM (#973539) Homepage
    Sound is handled by a sound card,
    Graphics is handled by a graphics card and now...
    processing is handled by a processing card.
    Cool.
  • The bus between the memory and the CPU is usually the fastest one in the system for a reason... how does this overcome the inherent limitation in getting data from main memory? It seems to me like just sticking 4 CPUs on a PCI bus is a recipe for disaster - your hard drive's controller, be it SCSI or IDE, is almost always on the PCI bus... wouldn't this create a huge contention for bandwidth and slow I/O down by about a factor of, oh, a million? :(

    Maybe they have a huge cache on the board? Also.. as another poster mentioned.. what are the power requirements? I have a 300W power supply, and 250 is already sucked by *just* the CPU + mobo. I know the G4 has low power requirements.. but can the mobo supply much more than it is now??

  • These cards are not new, but the idea of embedding Linux on a PowerPC MP system in a PC or Mac would be cool. All this needs is a dedicated 10/100 Enet port and away it goes as a high-powered server for web or network gaming. The only problem is that it will be tied to the base system for power, configuration and control. Hence when the main system hangs and needs rebooting, this card gets rebooted/reloaded. I would hate to see Windoze on the base system; this card would be rebooted every hour or so. But if we were to take a passive backplane and stick a ton of these in there, with a few SCSI RAID controllers and more than a few disks, you could have yourself a nice compact cluster of high-performance nodes to run Linux on.
  • The new Compaq ML-series servers have 64-bit slots. You can get one of these servers for around $3k, which isn't too bad if you're considering implementing this technology.
  • I wouldn't expect Mac OS 9 to use it, but I'm sure somebody will hack Darwin to get it to work (and Mac OS X should run on top of the hacked-up Darwin core).

    Gotta love open-source. :-)

    --

  • I wonder if these are similar to an idea I've been toying with for WIGTTPWSH (When I Get The Time To Play With Some Hardware), that is to use a MPC106 chip to put four MPC603e (okay, not as kewl as a G4, but respectable, I think), on a PCI bus. The '106 includes a DRAM controller, CPU-PCI bridge, L2 cache controller, and interfacing for up to four 60x microprocessors.

    Now, take several of these cards, each with four processors and their own memory, and bus them together on a PCI backplane. Lotsa horsepower, no?

    These would be all main processors, not peripherals like the cards mentioned here. No, they wouldn't run Windows or Linux, but I've been hankering for a chance to play with OS design anyway. :)

  • On the topic of OS support, I'm also wondering how one of these babies would work with Mac OS X. Presumably, someone would be able to write a virtual machine which could run Mac OS X on one of these boards while still having x86 as your main architecture. Hopefully, you wouldn't have to wait while the developers port Mac OS X over to x86 and iron out the bugs.

    Hypothetically, this new setup could enable one to run multiple OSes simultaneously in the one box and do away with slow emulation altogether. Yet, IMHO, this would probably raise concerns as to how these boards with their respective OSes share resources, most notably RAM, and other peripherals.

    But then again, with all that extra processing power on an architecture (x86) that wasn't meant to support several different types of processors at once, why would you want to run several different OSes at the same time instead of being dedicated to one? Well, at least this idea would allow one to have a "universal" interface card. For example, say I have a 400MHz G3 card and I configure it to run as a graphics card. First, I hope that the manufacturers will think of making the interfaces to these cards scalable/versatile. A year down the track, I'll get an 800MHz G4 card to replace the graphics processor, and instead of throwing out the 400MHz card, wouldn't it be great if I could reconfigure it to become a high-speed FireWire controller or even a dedicated "software" (because you must load a program into the board's RAM) RAID controller?

    One more quick question: does the board work with PPC variants of Linux, x86 variants or both?

    mPOWER has certainly implemented an interesting concept, giving the typical "nerd" what seems to be an affordable platform that could easily compete with SMP systems, with the addition of sheer scalability.

  • Well, if you examine the product sheet, there's on-board (on-card?) RAM.. this would lead me to believe the typical application would involve writing a program and then uploading it to the card's memory space - and then having the card only send back the results from the process that was sent.

    Their drivers seem to tie the card's memory - apparently up to 512MB - into the Linux memory space, so that you don't need to jump through lots of hoops to make it work. Again, I assume there is logic on-card to make sure that the system memory bus isn't being used, just the local memory bus on the card (which would be no big deal).

    This is much like the much-proclaimed clustering technology, except that the bus is PCI, and it's mackin' in comparison to Ethernet :). It would be nice to see an RC5 client written for this bad boy... it would also be a real bonus for those of us that play around with neural network simulation and 3D graphics work. I'd love to have a card like this for a render slave... hehe

    Kudos!
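    If the driver really does map the card's local RAM into the host's address space, the host side could be as simple as an mmap(). A rough sketch - the device node, offset and size here are hypothetical stand-ins, not the real driver interface:

    /* Rough sketch: map the card's on-board RAM into the host process and
       stage a work buffer there. The device node, offset and layout are
       hypothetical -- check the real driver docs. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define CARD_RAM_SIZE (4 * 1024 * 1024)   /* map 4MB of the card's SDRAM */

    int main(void)
    {
        int fd = open("/dev/mpower0", O_RDWR);          /* hypothetical node */
        if (fd < 0) { perror("open"); return 1; }

        unsigned char *card = mmap(NULL, CARD_RAM_SIZE, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, fd, 0);
        if (card == MAP_FAILED) { perror("mmap"); return 1; }

        memset(card, 0, CARD_RAM_SIZE);                 /* stage input data on-card */
        /* ... tell the card to start, poll a completion flag, read results back ... */

        munmap(card, CARD_RAM_SIZE);
        close(fd);
        return 0;
    }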

  • Priority is determined by which slot # the board is in.

    Slot 1, bus 1 has a higher priority than slot 1 bus 2, etc. Most larger servers have more than 1 PCI bus. Most PC's have 1 PCI bus, but have priority on which slot # a card is in. Eg: Slot 1 is usually the slot closest to the AGP slot on most ATX systems.

  • Methinks you are incorrect on this one. Please see this article [maccentral.com], which talks about a quad processor g3 that was shown at LinuxWorld.

  • Umm... what about the PM 9500MP or the Umax S900 (or all those Daystar machines)? They ship with two 604e processors, and from my experience with my S900, they are really great machines. In fact, my S900 has a 603e and a 604e running together without any problems whatsoever.
  • Signal 11, please.

    You have crossed the line from foolishness to fraud. You have realized that you are wrong, and will now spew out any lie to save face.

    You want links?
    Here is one 450W consumer power supply [direct411.com] and This place also sells one [computer123.com]. 450 watts too much? How about 400 [monarchcomputer.com], which you can also buy HERE [etoyou.com] or HERE [logical-source.com]. That last one has a really great description/specifications page, but you have proven you aren't interested in reading things like that. You didn't really believe your 300W power supply was the best on the market, did you?

    Sure, I guess you'd know the difference between a DIMM slot made for cpu cache vs. main memory. I don't know anything about macintosh hardware, for all I know those dimms were there for caching accesses to main memory.

    Your whole point is completely irrelevant, because had you actually READ the page [totalimpact.com] (which is the issue of this thread) you would CLEARLY see in the specifications that the board has "Two 168 DIMM sites, support for up to 512Mb of SDRAM, 3.3V, unbuffered PC-100 DIMMS." It is right there in plain English. Those two 'slots' that you cannot seem to identify ARE PC100 168-pin DIMM slots. You may as well argue that they might be new slots for a secret Mac chip not yet revealed; you would be just as wrong either way.

    Practice what you preach.
    I DO practice what I preach, I READ articles before I post about them.

    I suggest you stop now before you totally disgrace and completely discredit yourself.

    NightHawk

    Tyranny =Gov. choosing how much power to give the People.

  • Are these things like those Evergreen Celeron PCI (system-on-a-card) upgrade cards (www.evertech.com/accelerapci/) that are designed to upgrade old PCI systems? Or like those Cyrix 686 200L PCI cards for Macs (to give 'em Windows compatibility through hardware) that used to be for sale in Mac shops?
  • Apparently you missed the source of that quote. Right below his sig it says "George Moore on his law".
    --
  • What a waste! Why spend all that money on one or more G4s and then stick them in a PCI slot? That's like putting a Ferrari engine in a VW Beetle. Sure the Bug goes fast, but don't you think you could have found a better use for that engine? PCI slots are fast, but not nearly fast enough for this to be a worthwhile venture. You wanna build a cheap supercomputer? Wulf it instead! At least that way you'll get the full potential of the processor and not choke off a great chip.
  • I don't know what mobo is in it, but the IBM Netfinity 5600 series does have 64-bit PCI. The board in these machines is a dual-processor one, though they normally only come with one chip.
  • by ArsSineArtificio ( 150115 ) on Tuesday June 27, 2000 @07:59AM (#973558) Homepage
    PowerPC processors are not well known for their sobriety.

    No kidding. My G3 gets tanked at least twice a week, and cleaning up after it is becoming a freakin' nuisance. Jose Cuervo and coolant paste make a horrible reek, and don't even get me started on the effects of black coffee on a PowerBook's keyboard...



    -------------------------------------------------------------------

  • 1) Heavy processing, obviously. Crypto, graphics...
    2) Bunch of servers in 1 box.
    3) Gaming ^_^
  • The boards run Linux. I can't figure out what type of parallel processing they use. On the one hand, they refer to mapping all of the memory on up to 8 boards into a single address space (like SMP), but on the other, they also make a product to use MPI (like Beowulfs) on MacOS.

    If they are less than about $2500 for a quad G4 board, this may be even cheaper than the KLAT-2 cluster's $650 / GFLOPS discussed here a while back.
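    Rough arithmetic behind that: taking the ~1 GFLOPS per G4 quoted elsewhere in this thread at face value, a quad board is about 4 GFLOPS, so the break-even price against $650/GFLOPS is roughly 4 x $650 = $2600 per board. At the ~$6500 street price mentioned elsewhere in this discussion, it works out to more like $1600/GFLOPS, so as priced today the KLAT-2 numbers still win.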
  • General-purpose asymmetrical shared-memory multiprocessors are reinvented every few years. It's one of those things you can do, but probably don't want to. The problem is that you need a special OS and special applications to run on those weird machines, and nobody has yet found a killer app that justifies the hassle. They're no cheaper than symmetrical shared-memory multiprocessors. SMP machines are more useful and accelerate existing multithreaded apps.

    There are a large number of ways to hook multiple CPUs together, many of which have been tried, but only two have been successful: symmetrical shared-memory multiprocessors (SMP), and networked clusters. Many millions of dollars of government money have gone into R&D on nifty ways to hook lots of CPUs together to build a supercomputer, starting with the Illiac IV (1970s), the Connection Machine (1980s), and the BBN Monarch (1990s). None of these led to anything people wanted to buy, even people with big problems and budgets. Vanilla architecture wins again.

  • http://www.totalimpact.com/G3_MP.html

    Notice that
    1) the PCI card is taller than standard height; this limits the number of desktops which can use the card. Hence: PC/*AT* & older PowerMacs.

    2) "possible" interface cards... translates to: a PMC site is available there, but the software drivers need work.

    3) now ask about parallel abstraction layers & tools...

    Large parallel systems are quite useful here, but my Total mPOWER boards have (so far) been less useful than the packing material they came in. :-(

  • Oh, no doubt--mainframes still make good sense in a variety of situations. However, reinventing such a niche item from the ground up seems a pretty poor idea. (just ask SGI :-)

  • Sure, I guess you'd know the difference between a DIMM slot made for cpu cache vs. main memory. I don't know anything about macintosh hardware, for all I know those dimms were there for caching accesses to main memory.

    From the article:

    Memory:
    Two 168 DIMM sites, support for up to 512Mb of SDRAM, 3.3V, unbuffered PC-100 DIMMs.


    As distinct from:

    Level 2 Cache:
    1Mb of L2 "Backside Cache" per processor.


    The board design would be far simpler with the DIMMs as dedicated memory instead of cache, the DIMMs are described as "memory", and the article makes no mention of direct access to system memory; the only reasonable conclusion is that the DIMMs are standalone memory, as the previous poster pointed out to you.
  • >Too bad most traditional PCs don't support either yet.

    No, traditional PCs don't, but servers do. Like the Dell PowerEdge 4400 and almost all of Dell's enterprise servers.

    Yeah, they are pricey, but doesn't this much computing power usually require money anyway?

    ---
  • 10% of comments - lameass dumb trolls.

    20% of comments - can you imagine... a beowulf cluster of these?

    30% of comments - actually, I'm a really really smart bloke and I know everything about everything so moderate this comment up!

    35% of comments - karma whorin' - come on siggy, you _know_ you're gonna post simply to collect yet more karma. What was it at last time I checked? 750? I thought so...

    5% of comments - I love Microsoft, please flame me. LOOK! Here's my private "business correspondence only" email address, why don't you hit my corporate email server with a nasty DDOS just because I'm obviously a secret MSFT lover and must be stopped at all costs.

    1% of comments - Really really fscking irritating statisticians who just _have_ to tell me that I can't add up...

    --
    Jon.
  • by RedFang ( 178492 ) on Tuesday June 27, 2000 @06:05AM (#973587)
    Grumble, stupid fat-finger sending blank message.

    For starters, as others have pointed out, these are slave processors, so by definition, putting this in does not make an SMP box. The S in SMP stands for symmetric, and while the CPUs on the card are symmetric with each other, the card is not symmetric with the main CPU(s).

    The way this works is much closer to a mainframe running VM with partitioned systems underneath it. You submit a job by tossing it over the wall to the VM partition (in this case one of these cards) and wait for it to toss the results back. You can probably watch the job some way with a properly written VM subsystem. You probably can't run interactive programs on these cards, and if you can, you really wouldn't want to, since you would clobber the PCI bus sending keystrokes and screens back and forth. And don't even think about trying to run a GUI on one of these cards.

    What these cards are perfect for is batch processing. You write up a queuing mechanism to accept jobs and farm them out to the cards as they become available. The main CPU would manage the UI and the queue. The cards have their own memory (max 512MB, which is not a lot for this type of work), so you can get reasonable performance as long as the data sets are small enough to be loaded into memory on the card.

    What this means is that the type of processing you can do with these is limited by the PCI bandwidth and the memory on the card. I don't think this is as great and wonderful as it looks. It's really cool, and if you need to run lots of compute-intensive programs with smallish data sets then this is ideal, but it will choke on high transaction rates and large data environments. Databases are an absolute no-no unless you really hate your PCI bus and want to try and burn it out.
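    A sketch of what that queuing layer might look like on the host side, in plain C with pthreads. The send_job_to_card() function is a stub standing in for whatever the real driver call turns out to be - purely illustrative:

    /* Toy batch dispatcher: one host thread per card pulls jobs off a shared
       queue. send_job_to_card() is a stub standing in for the real driver
       call, whatever that ends up being. */
    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NUM_CARDS 8
    #define NUM_JOBS  32

    static int next_job = 0;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void send_job_to_card(int card, int job)
    {
        /* stub: in reality, push code+data over PCI and wait for the result */
        printf("card %d running job %d\n", card, job);
        usleep(10000);
    }

    static void *card_worker(void *arg)
    {
        int card = *(int *)arg;
        for (;;) {
            pthread_mutex_lock(&lock);
            int job = next_job < NUM_JOBS ? next_job++ : -1;
            pthread_mutex_unlock(&lock);
            if (job < 0)
                break;                     /* queue drained */
            send_job_to_card(card, job);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[NUM_CARDS];
        int ids[NUM_CARDS];
        for (int i = 0; i < NUM_CARDS; i++) {
            ids[i] = i;
            pthread_create(&threads[i], NULL, card_worker, &ids[i]);
        }
        for (int i = 0; i < NUM_CARDS; i++)
            pthread_join(threads[i], NULL);
        return 0;
    }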

  • You could tear the ass off some RC5 cracking [distributed.net] with this beast and the PPC client [distributed.net].

    Also, SETI@Home [berkeley.edu] offers a PPC client [cdrom.com] that would benefit from this.

    I may have to get me a couple of these.

    --
  • http://www.compute-aid.com/atx350w.html
    350 watt for $55
    http://www.overclockers.com.au/techstuff/r_lm400psu/
    400 watt supply
    http://www.axiontech.com/cgi-local/manufacture.asp?category=power%5Fmanagement&mfg=enlight
    another 400W supply
    Normally I think you raise some valid points, but do your homework.
  • They're asking ~$4500 for the 4 G3's and ~$6500 for the 4 G4's. Each board comes with 128MB of RAM. This courtesy of http://www.xlr8yourmac.com [xlr8yourmac.com]
  • Brad Nizdil of Total Impact [totalimpact.com] has lurked on the OpenPPC Project [openppc.org]'s mailing list for months, and just posted a message about the company's plans regarding the PowerPC Open Platform [phys.sfu.ca]. Interesting stuff.

    POP is IBM's PPC-based reference platform [ibm.com], which will (we hope!) allow OEMs to build inexpensive and clever PPC-based applications. Design files for the first version of POP never came out due to a bad part (the Northbridge, from Winbond); according to Brad, a "POP2" is on its way.

    As always, further info is at http://www.openppc.org [openppc.org].

    --Tom Geller
    Co-founder, The OpenPPC Project

  • by nellardo ( 68657 ) on Tuesday June 27, 2000 @09:07AM (#973608) Homepage Journal
    So if you can put a bunch of these in a rackmount with a gig or two of RAM, wouldn't it be a cheaper alternative than a Beowulf cluster?

    As usual, the answer is most likely "It depends." (ObDisc - I don't have one of these cards to play with)

    No matter what API you're using (SMP/threads or Beowulf/PVM) these are most likely best used for SIMD (single-instruction, multiple-data) kinds of problems (of which SETI is one). Communication between boards will be a major performance bottleneck, since they all share the same bus.

    Since they do have local RAM (and not just cache), you load the card's RAM with one set of code and four sets of data. Do that for all the cards you have. Now wait, and get your answers back off the local RAM. Did you use threads or processes? Threads and it's closer to SMP; processes and it's closer to PVM or Beowulf.

    But will it outperform a comparable Beowulf cluster? If it is compute-constrained, then the PCI cards will do better, especially as the problem scales, because the PCI cards share hardware costs for disks, network cards, fast bus, large RAM, etc. If it is disk or network limited, though, the Beowulf will eventually win out. The PCI cards will do well on a price/performance basis while the problem is small, because it will still be sharing hardware. But once the PCI bus fills up, those processors will start waiting on the bus. The bigger the problem gets, the more the processors wait. The Beowulf cluster, on the other hand, can distribute all that hardware - instead of one 100Mbps network card, it may have dozens (you start worrying more about what your ethernet switch's backplane looks like).

    So these cards are best for compute-intensive simulation-style stuff (image filters would also scream - mostly - FFTs require lots of communication). Simulated wind tunnels or weather phenomena, finite-element analysis, etc.

    Note, though, that these cards have their own slower PCI bus, including support for an add-on card (!), so conceivably you could get a lot of server oomph by giving every four processors their own network card. But you'd better make sure your data (i.e., your web site) can fit in the local RAM, or you'll bog down in bus contention again.
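    To make the "one set of code, four sets of data" picture concrete, here's the threads flavour in plain C - nothing card-specific, the same code would run on any 4-way SMP box:

    /* "One set of code, four sets of data": four threads, one per CPU on a
       card, each crunching its own quarter of an array. Plain pthreads. */
    #include <pthread.h>
    #include <stdio.h>

    #define NCPUS 4
    #define N     (1 << 20)

    static double data[N];
    static double partial[NCPUS];

    static void *crunch(void *arg)
    {
        int id = *(int *)arg;
        int chunk = N / NCPUS;
        double sum = 0.0;
        for (int i = id * chunk; i < (id + 1) * chunk; i++)
            sum += data[i];
        partial[id] = sum;
        return NULL;
    }

    int main(void)
    {
        for (int i = 0; i < N; i++)
            data[i] = 1.0;                       /* dummy data set */

        pthread_t t[NCPUS];
        int ids[NCPUS];
        for (int i = 0; i < NCPUS; i++) {
            ids[i] = i;
            pthread_create(&t[i], NULL, crunch, &ids[i]);
        }

        double total = 0.0;
        for (int i = 0; i < NCPUS; i++) {
            pthread_join(t[i], NULL);
            total += partial[i];
        }
        printf("total = %.0f\n", total);         /* expect 1048576 */
        return 0;
    }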

  • And with all those servers in a single box, you'd be reduced to a single power cord, a single UPS, a single point of failure...

    Yeah, I know that you can (should, would) have multiple redundant power supplies, but if you're going to design something utterly centralised, then to make it replace a reliable server farm you're going to end up reinventing the mainframe.

  • "If Al Gore 'invented' the Internet, I 'invented' the exponential."


    Of course Al Gore didn't invent the Internet, but he did introduce the legislation that forced universities to allow the public to tap into the network. Now, expecting Al Gore to understand the difference, that is another matter.


    But I seriously doubt you helped the exponential gain popularity in any way.

  • My first reaction was "Wow, more CPU power!". And then I actually thought about what we could do with that beast.
    I'm sure those would be useful in niche markets, like imaging/multimedia, where special custom software could offload some huge operation to the card while the main CPU deals with the user interface. But that's not my field, so I have no idea of the feasibility.

    Many people mentioned 'Beowulf'! Now, Beowulf is a scientific cluster, and I happen to know a fair bit on the subject, since I work for a research center.
    Most scientific applications need lots of CPU power, but also lots of memory bandwidth: for example, simulating the flow of air around an airplane wing with a dataset of 5 GB...

    So from the start, the data caches of the CPUs are nearly useless since we cycle through huge amounts of data; the CPUs constantly read and write memory. The net result is that a standard PC isn't able to keep more than two CPUs fed with data before the system bus becomes a bottleneck. Since the mPOWER card has a standard PC bus, only two of the four CPUs would actually be used.

    Next, the memory. 512MB isn't actually a lot for scientific clusters. That's what you usually have for each CPU. It's a bit tight, but let's live with it.

    Finally, the benefit of this kind of card would be to cram a PC box with a number of them, to actually save money by not needing additional hard drives, cases, keyboards, cheap graphics adapters, etc.

    The typical PCI bus (64-bit, 66MHz) has a bandwidth of just under 4 Gbps. It is a bus, so only one device can use it at a time (half-duplex). The usual clustering interconnect (Myrinet or SCI) offers 1 Gbps full-duplex, so let's say 2 Gbps to compare with the PCI bus.
    Let's also say that the host CPU in a multi-mPOWER card situation isn't doing any actual work to let the bus free for the mPOWER.
    That means you can put two mPOWER cards in a single system before each card gets less interconnect bandwidth than if you had a standard dual-CPU machine with an SCI or Myrinet adapter. And that's even before the need to access any disk or network device, which would cause additional traffic on the PCI bus, reducing the overall available bandwidth. That's not much of a win.

    Of course, not all applications need gobs of memory. distributed.net-like applications, where the dataset is tiny, could make use of all 8 cards in one system. I just think that those applications are the minority in scientific computing.
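    For what it's worth, the raw numbers: 64 bits x 66 MHz = 4,224 Mbit/s, or about 4.2 Gbps (roughly 528 MB/s), shared half-duplex by everything on the PCI bus, versus roughly 1 Gbps each way per node on Myrinet or SCI (call it 2 Gbps effective). So with two cards in the box, each card's share of the PCI bus is already down around what a dedicated interconnect would give it, and that's before counting any disk or network traffic.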
  • by hawk ( 1151 ) <hawk@eyry.org> on Tuesday June 27, 2000 @06:12AM (#973623) Journal
    Yikes, it just occurred to me that much of the readership wasn't born at this point, but . . .

    the bus and processor used to run at the same rate. There were many systems in which the processor plugged into the backplane just like any other card. S-100, PDP-11 (and others) behaved this way, as well as other lesser-known formats. Others took an approach that was similar: the Apple II exposed everything to the bus, and a processor card could flat-out take over. There were a few hybrid systems that used S-100 for expansion, but had a motherboard with a processor and possibly memory.

    Then processors started running faster than 4MHz . . . :)
  • If these were modified to use AGP (or an AGP-like bus) that would give a tremendous advantage. PCI DMA was fast a few years ago, but AGP allows almost direct access to memory windows, which would allow these processors far more bandwidth and system interaction, as well as reducing contention for a narrow PCI bus.
    That said, I think this would also be good for distributed.net and SETI, or whatever other data-cruncher you happen to favor.
  • by chainsaw1 ( 89967 ) on Tuesday June 27, 2000 @06:16AM (#973627)
    If you read the post, it says that these cards will work inside an x86 box on binaries that contain PPC code from a cross-compiler...

    What would seem sweet... and maybe not too hard to do... would be to have something capable of running 99.9% of the binaries in existence. While we can run many progs under x86 (WINE, vmware), the PPC will allow us to run LinuxPPC-native and even (if you so desire, but maybe not) MacOS binaries. Now, we won't have the ROM (maybe the new-world ROM files will solve this), but we WILL have Darwin to work from for something more WINE-like in compatibility. If the new-world ROM can be used, it may be possible to get something as complex as mol up on your x86 workstation. Imagine having one workstation where you, the HellDesk employee, could run *NIX (Lin/BSD, natively), vmware (WinXX), and mol (MacOS 9+) from the same workstation... simultaneously (ignoring the 512M RAM you'd probably need). In environments that have great OS diversity, this would be great (universities come to mind).

    It would be more beneficial to Mac owners to have the reverse for compatibility (putting a PIII or K7 on a PCI card in your Mac). There are several companies that do this (and probably have patents), such as OrangeMicro, which are anally retaining the hardware specs last I heard. And they only develop drivers for MacOS. Plus I think they require special versions of the OSes that run under the hardware anyway.

    You also have the possibility to now section off hardware to a virtual environment (similar to IBM's 390's) because you can easily quantize the resources allocated to each environment by PCI card...

  • Tyan has at least one motherboard with two 64-bit slots on it, but not being familiar with the spec, I don't know what Windows can do with it. Nice motherboard, though; their newest SCSI-on-board mobo.

  • Yes, but will they release it to the public so Jeff Goldblum will have something to do?
  • I'm not entirely sure I understand what this would mean. Would this increase the speed of the machine when running everyday apps, or (practically speaking) would this be limited to very specialized programs that like to hog processor (buying a processor for your program rather than the other way around for a change..) ?
  • by mirko ( 198274 ) on Tuesday June 27, 2000 @04:48AM (#973637) Journal
    PowerPC processors are not well known for their sobriety. Most people willing to add these boards to their servers should seriously think about upgrading their power supplies too, especially if they also use RAID disks or whatever.
    BTW, multi-processor, (Strong)ARM-based boards are also being worked upon by companies such as Simtec [simtec.co.uk] ; given the average power needs of an ARM processor and the low FPU based needs of a server, this is an interesting alternative (though I am not sure these are out yet).
    --
  • My goodness. Will you just go read the link to the story, and then come back and admit that you're wrong. And you wonder why people think you're an idiot. Idiot.
  • Hey, even back in 1988, the good ol' Amiga 2000 had a processor slot. I remember there were 286, 386 and 486 cards available, and PPC cards as well for the A4000. And it was *very* cool.

    Man, running DOS or Windows in a window *without* emulation was über cool.

  • DEC Alphas have it. :P
  • AFAIK, the Power Macintosh G3 (Blue and White) and the Power Macintosh G4 series both had 64-bit 33MHz PCI slots, with the Blue G3 having one 32-bit 66MHz slot. Funny that those models aren't listed as compatible with these cards. I wonder why?
  • Yeah, but imagine a Beowulf cluster of these!
  • by NetCurl ( 54699 ) on Tuesday June 27, 2000 @04:58AM (#973681)
    If you are concerned about power consumption, the G3 and G4 chips dissipate under 10W. The Pentium III Xeon dissipates an ungodly 30W or more; your AMD is higher still. Worrying about adding power supplies is hardly worth it.
  • That free PCI card slot in my G3 suddenly became much more valuable. I wonder how many G4 chips I can afford?

    I wonder if this can be integrated well with the existing chips in a Macintosh computer via the Multiprocessing extension in Mac OS 9. Perhaps Mac OS X will make use of this great resource. Mmm... superserver!

  • The latter. This is for very CPU-intensive processor-hog applications. The same basic stuff that Beowulf clusters are built for. (If anything, it's Beowulf but with higher bandwidth and lower latency - Direct PCI is probably even better than SCI for smaller systems.)

"Look! There! Evil!.. pure and simple, total evil from the Eighth Dimension!" -- Buckaroo Banzai

Working...