Silicon Graphics Hardware

Japan's Newest Linux Supercluster: 13TB RAM 163

Posted by timothy
from the hey-is-that-DDR-you've-got-there dept.
green pizza writes "Following its sale of a 10240 processor cluster to NASA, Silicon Graphics Inc has announced that it's supplying a 2048 processor Altix 3700 Bx2 to the Japan Atomic Energy Research Institute. Aside from running Linux on Itanium2 processors, the beast also features 13 TB of RAM!"
This discussion has been archived. No new comments can be posted.

  • by tgv (254536) on Wednesday November 03, 2004 @12:19PM (#10710886) Journal
    I guess that'll be enough to run Longhorn then.
  • oh my... (Score:5, Interesting)

    by Quasar1999 (520073) on Wednesday November 03, 2004 @12:21PM (#10710910) Journal
    I remember back in my electronics course when we had to design the flip-flop grid for memory... the teacher said he'd give 100% to anyone that could draw out 64K of memory... 13TB just makes me cringe...
    • pah... when I were a lad building my first machine... I had to gang nine 1Kbit chips together to make 1 Kbyte + parity... aye, they were the days... and you could cram a full chess playing program into that 1 Kbyte as well. A 4K ram expansion cost an arm and a leg... well it felt like that having to give up beer and ciggies for ages to scrape up the wonga...
    • that was sram though (and probably not very efficient sram)

      Dram is a whole lot simpler and scales better
    • "we had to design the flip-flop grid" I hear Bush is really good at that. :-) okay... back to work
    • Nowadays I still have to deal with 256Kbit EPROMs (32KByte) when programming embedded systems. The funniest thing about these boards is that the ROM contents, including RS-232, RTC process scheduling, plus the application code, usually fit in just 8KBytes (a bit of V25 assembly and the big chunk in C), leaving 24KBytes unused.

      By the way, in other areas, I dealt with 512KByte applications able to load and handle up to 2GB of data in RAM. It is not very realistic to solve today's real-world problems, still programming in a weird p
    • Would that have been a 6 cell fully static
      CMOS memory bit?

      Your teacher must have been a sadist!
  • bottleneck (Score:5, Insightful)

    by igny (716218) on Wednesday November 03, 2004 @12:23PM (#10710922) Homepage Journal
    Do all processors share 13TB? Because if they don't, the bottleneck is that subprocesses have only 13TB/1024 available (a mere 13GB each), and still have to communicate a lot.
    • Yes, that's what SGI does. Large single image systems with ccNUMA memory.
    • Re:bottleneck (Score:5, Informative)

      by amorsen (7485) <benny+slashdot@amorsen.dk> on Wednesday November 03, 2004 @12:39PM (#10711078)
      The whole point of Altix is that it's a single system image, not a cluster. Every processor can access all 13TB. That doesn't mean communication is free, of course, but it's vastly faster than your favourite Beowulf cluster.
    • Luckily (Score:5, Informative)

      by bmajik (96670) <matt@mattevans.org> on Wednesday November 03, 2004 @12:45PM (#10711306) Homepage Journal
      SGI has been working through this in hardware for over 10 years.

      The distributed shared memory concept of the Altix (first seen on Origin 200 / Origin 2000 in the commercial space, and previously based on the Stanford DASH/FLASH projects) uses a hardware based memory router.

      Each PE has local RAM and local CPUs and a "MAGIC" chip that routes cache invalidations, memory block "ownership", etc. messages to other PEs as necessary. Unlike SMP designs, cache coherency doesn't destroy the whole shebang because it's not a shared bus, it's a hierarchical directory system. I.e. PE0 knows it only needs to contact PE3, PE6, and PE13 to invalidate a cache block. Turns out that that's much more efficient than broadcasting a message to PE0-PE63 saying "invalidate this block!"

      Now, as far as _all_ processors sharing the full 13TB - I am not sure.

      The memory density / system image equation is sort of a tradeoff, as more PEs require more router hops in the topology. More router hops increase latency. SGI has sold 256 and 512p single-image systems, and may have gone up to 1024 or 2048p / system.

      To be perfectly honest, the system-system latency is different than the intra-system latency, but nothing like it would be on an x86-with-ethernet shared nothing cluster.

      SGI's big installations are cool as they have the advantages of both SMP and MPP designs: each autonomous machine gives you single-image benefits but with really high proc counts... and then you link a bunch of those together to get this outrageously sized machine.
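To make the directory idea above concrete, here is a toy Python sketch that counts invalidation messages for a single write under a broadcast scheme versus a directory scheme. The `Directory` class, the block number, and the 64-PE machine size are illustrative assumptions taken from the PE0-PE63 example, not SGI's actual hardware design.

```python
# Toy model: count invalidation messages for one cache-block write.
# All names and numbers here are illustrative assumptions, not SGI's design.

def broadcast_invalidations(num_pes, writer):
    """Broadcast style: every PE other than the writer gets the message."""
    return num_pes - 1


class Directory:
    """Tracks which PEs currently hold a copy of each memory block."""

    def __init__(self):
        self.sharers = {}  # block number -> set of PE ids holding a copy

    def read(self, block, pe):
        """Record that `pe` now caches `block`."""
        self.sharers.setdefault(block, set()).add(pe)

    def write(self, block, pe):
        """Invalidate the other copies; return how many messages were sent."""
        holders = self.sharers.get(block, set())
        sent = len(holders - {pe})   # only PEs known to hold the block
        self.sharers[block] = {pe}   # the writer is now the sole holder
        return sent


directory = Directory()
for pe in (0, 3, 6, 13):
    directory.read(block=42, pe=pe)

directory_msgs = directory.write(block=42, pe=0)   # contacts PE3, PE6, PE13
broadcast_msgs = broadcast_invalidations(num_pes=64, writer=0)
print(directory_msgs, broadcast_msgs)
```

The directory only pays for the PEs that actually share the block, which is why the message count stays small as the machine grows, at the cost of storing the sharer lists somewhere.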
      • Re:Luckily (Score:3, Informative)

        by jon3k (691256)
        http://www.sgi.com/products/servers/altix/

        "Scaling to 256 Itanium 2 processors in a single node, Altix 3700 leverages the powerful SGI® NUMAflex(TM) global shared-memory architecture to derive maximum application performance from new high-density CPU bricks." So I'm guessing it's still 256 CPUs per node.
        • SGI has a layered approach to the max number of CPUs in a supercomputer.
          I guess 256 is what they call "ultrastable" - kinda like the Linux kernel 2.2.
          But the NASA monster already has 512 CPU machines, and who knows what the Japanese system has.

          Apparently, SGI sells bigger systems to customers who "know what they are doing" and who work closer with SGI. If you want something that's 100% no-frills, then probably the 256 CPU is the current absolutely stable limit.
      • This is discrimination. If you keep talking over my head, I'll feel short. That violates my human rights.
      • Re:Luckily (Score:4, Informative)

        by flaming-opus (8186) on Wednesday November 03, 2004 @03:02PM (#10713973)
        At NASA, SGI has been experimenting with a 2048 proc single system image. Since the Japan system has yet to be deployed, it will likely be a single system.

        The SGI magic memory controller incorporates the NUMAlink (originally called CrayLink) router they leveraged from the T3E work. This router uses wormhole routing, which starts forwarding a packet as soon as the address bytes are read. This means that the added latency of going through several routers is often much less than packaging up the packet in the first place. On the hardware side of things it's not the number of router hops that limits the scalability of the system. Rather, the greater the size of the memory, the coarser the size of the directory blocks. With 13TB of memory you are probably invalidating dozens or hundreds of pages at a time. SUCK.

        The cache coherency of SGI's cc-NUMA machines makes them incredibly easy to program. However, there is a big overhead. Since most supercomputing software is written with MPI, rather than with POSIX threads, you don't really benefit from it anyway. I think you can disable the hardware coherency on a per-process basis, which would greatly speed up MPI software.
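The wormhole routing point above can be illustrated with a back-of-the-envelope latency model in Python. This is only a sketch: the packet size, header size, link speed, and per-router delay are made-up assumptions chosen for round numbers, not NUMAlink specifications.

```python
# Rough latency model comparing two routing styles.
# Every number below is a hypothetical assumption, not a NUMAlink figure.

def store_and_forward_ns(packet_bytes, hops, bytes_per_ns, router_delay_ns):
    """Each router waits for the whole packet before sending it on."""
    per_hop = packet_bytes / bytes_per_ns + router_delay_ns
    return hops * per_hop


def wormhole_ns(packet_bytes, header_bytes, hops, bytes_per_ns, router_delay_ns):
    """Each router forwards as soon as the header (address) has been read."""
    header_pipeline = hops * (header_bytes / bytes_per_ns + router_delay_ns)
    body_stream = (packet_bytes - header_bytes) / bytes_per_ns  # paid once
    return header_pipeline + body_stream


PACKET, HEADER = 128, 8      # bytes (assumed)
LINK, ROUTER = 2.0, 10.0     # bytes per ns and ns per router (assumed)

saf_1 = store_and_forward_ns(PACKET, 1, LINK, ROUTER)
saf_5 = store_and_forward_ns(PACKET, 5, LINK, ROUTER)
worm_1 = wormhole_ns(PACKET, HEADER, 1, LINK, ROUTER)
worm_5 = wormhole_ns(PACKET, HEADER, 5, LINK, ROUTER)

print(f"store-and-forward: 1 hop {saf_1} ns, 5 hops {saf_5} ns")
print(f"wormhole:          1 hop {worm_1} ns, 5 hops {worm_5} ns")
```

Under these assumptions the extra hops cost only the header time per router with wormhole routing, while store-and-forward pays for the whole packet at every hop.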
        • Re:Luckily (Score:3, Informative)

          by joib (70841)

          At NASA, SGI has been experimenting with a 2048 proc single system image.


          The Columbia system still consists of 20 512-cpu systems, so I would assume this consists of 4 such 512-cpu systems.


          The cache coherency of SGI's cc-NUMA machines makes them incredibly easy to program. However, there is a big overhead.


          Well yes, the basic problem is that OpenMP/pthreads assumes a flat memory, whereas a NUMA box is anything but flat. So the kernel better be real smart about how to map the memory onto the hardware to mi
    • Actually, that'd be 13TB/20480, not 13TB/1024.

      so, 0.000634765625 TB's per machine... too lazy to do it properly right now :)
      • 0.000634765625 TB's per machine

        Then you're assuming each machine has just one CPU. That is not correct, and it's the biggest difference between SGI supercomputers and commodity clusters.
        An SGI system has hundreds, if not thousands of CPUs per machine.
        • How can one actually have many Central Processing Units?
          I mean, I know there are multiprocessor computers nowadays, but what is then central there?!
      • The Japanese machine is 'only' 2,048 processors. This would give an average of about 6.3GB per processor -- but SGI uses NUMA, so it's not quite that straightforward.
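The per-processor figures thrown around in this thread are easy to check. A small Python sketch, assuming decimal prefixes (1 TB = 1000 GB) and treating the 13 TB as spread evenly, which, as noted above, is not how a NUMA machine actually behaves:

```python
def gb_per_unit(total_tb, units):
    """Average memory per unit in GB, using decimal prefixes (1 TB = 1000 GB)."""
    return total_tb * 1000 / units


# Divisors quoted in the thread: 1024, 2048 (the actual CPU count), 20480.
for units in (1024, 2048, 20480):
    print(f"13 TB / {units:>5} = {gb_per_unit(13, units):.2f} GB each")
```

With binary prefixes instead (1 TiB = 1024 GiB), the 2048-processor case comes out to exactly 6.5 GiB per processor.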
  • 640kb (Score:1, Funny)

    I hear this is the recommended base configuration for Windows Longhorn...
  • Nuclear research (Score:5, Informative)

    by Big Nothing (229456) <big.nothing@bigger.com> on Wednesday November 03, 2004 @12:26PM (#10710954)
    The puter will be used for nuclear research (bushspeak: nucjular reesatch) by the Japan Atomic Energy Research Institute. More info about the organisation, their projects, etc. can be found at: http://www.jaeri.go.jp/english/index.cgi [jaeri.go.jp].
    • Yep, Colin tells me that's nucjular reesatch just off the coast of North Korea, a bad omen for the Free World
      I call for a US export ban on Memory to protect the Homeland's national security.

      Ow! Dr. Condoleezza just informed me they make Memory all by themselves, let's pre-emptively nuke 'em!

    • bushspeak: nucjular reesatch

      "Scandinavian doing a bad Bush impression"speak, you mean?

  • got us beat (Score:5, Funny)

    by theMerovingian (722983) on Wednesday November 03, 2004 @12:26PM (#10710956) Journal


    2048 processors, 13 terabytes of ram, AND it comes with a smaller, more ergonomic controller.

  • Nuke Simulator? (Score:2, Interesting)

    by k0de (619918)
    So an American company is selling a computer to a Japanese organization that is ideal for simulating nuclear explosions. Interesting.
    • surprised to not see a bunch of
      "Global Thermonuclear warfare" jokes.....
      Seems ripe for the picking.
    • About JAERI

      Devoted to comprehensive research on nuclear energy since 1956, JAERI challenges research and development in the realm of frontier science and engineering with focus on the realm of nuclear research and developments. Projects include the establishment of light-water reactor power generation technology in Japan through its endeavors including the success in Japan's first nuclear power generation and achievement of the leading and systematic research on nuclear safety. JAERI has also attained the w
    • So an American company is selling a computer to a Japanese organization that is ideal for simulating nuclear explosions. Interesting.

      Interesting indeed, considering that the Japanese have the necessary materials, infrastructure, and means of delivery, I suspect they will be revealing a deployed nuclear capability soon. This announcement serves as a veiled warning to North Korea.

    • ...have no influence in Japanese big business or government...or so Jon Lovitz told me.

      Yes, I find it interesting also.

  • running Linux on Itanium2 processors

    But isn't Itanium kinda evil (as opposed to slashdot darlings PPC/Power and Opteron)?
    While Linux is super cool? So should I like it?
    • Re:undecided (Score:3, Informative)

      by hype7 (239530)
      But isn't Itanium kinda evil (as opposed to slashdot darlings PPC/Power and Opteron)?

      While Linux is super cool? So should I like it?


      I know you're trying to be humorous, but it raises an interesting question: is this thing faster than the Big Mac [apple.com]?

      -- james

      • is this thing faster than the Big Mac?

        Interesting question. Especially considering they have roughly the same number of processors...
        But from the article I get the idea the SGI is kind of... less clustered. It seems to share its memory while on the Big Mac, each G5 computer has its "private" 4 GB of memory.
      • Re:undecided (Score:4, Informative)

        by RalphBNumbers (655475) on Wednesday November 03, 2004 @01:41PM (#10712718)
        is this thing faster than the Big Mac?

        And the answer is: it depends on what you're doing with it.
        This thing is significantly more tightly coupled than VT's cluster, and uses shared memory as opposed to clustering, so for a lot of tightly coupled problems it will be *far* more efficient.

        As for raw processing power, the Itanium2 has the same theoretical peak floating point performance as a PPC970 at the same clock. In reality the Itanium is likely to come closer to achieving its peak than the PPC970 due to its massive cache (9MB compared to the 970's 512KB). However the Itaniums in an Altix3000 are only running at 1.6GHz according to SGI's page, while the 970s in VT's cluster are now at 2.3GHz. So the BigMac would have some advantage on loosely coupled problems that it can fit in its smaller cache and memory.

        So while the BigMac might beat this system at Linpack, the benchmark used to determine the top500, in the domain this system is to be used for (3d modeling of nuclear blasts) its tighter coupling and greater RAM will make it much faster.
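The clock-for-clock comparison above can be turned into rough numbers. The Python sketch below assumes 4 floating-point operations per cycle for both chips (two fused multiply-add units each); that figure is an assumption added here to match the "same theoretical peak at the same clock" remark, and the clock speeds are the ones quoted in the comment. These are theoretical peaks only and say nothing about sustained performance.

```python
# Theoretical peak only; FLOPS_PER_CYCLE is an assumption (two FMA units,
# each counted as 2 floating-point operations per cycle) applied to both chips.
FLOPS_PER_CYCLE = 4


def peak_gflops(clock_ghz, cpus=1):
    """Theoretical peak in GFLOPS for `cpus` processors at `clock_ghz`."""
    return clock_ghz * FLOPS_PER_CYCLE * cpus


itanium2 = peak_gflops(1.6)          # per CPU, clock from SGI's page
ppc970 = peak_gflops(2.3)            # per CPU, clock quoted for VT's cluster
altix_total = peak_gflops(1.6, cpus=2048)

print(f"Itanium2 @ 1.6GHz: {itanium2:.1f} GFLOPS per CPU")
print(f"PPC970   @ 2.3GHz: {ppc970:.1f} GFLOPS per CPU")
print(f"2048 Itanium2s:    {altix_total / 1000:.1f} TFLOPS peak")
```

On paper the higher-clocked chip wins per processor, which is why the cache size, memory size, and interconnect coupling discussed above end up deciding which machine is faster on a given problem.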
      • by halfelven (207781)
        is this thing faster than the Big Mac?

        Yes it is.

        The Big Mac is just like any other commodity cluster. It's just a bunch of machines tied together in a closed network.
        The SGI supercomputers keep all CPUs in a single machine, sharing all the memory over extremely fast, proprietary interconnects. In such a system, the CPUs talk to each other as fast as (if not faster than) the CPUs in a dual-CPU server.

        Assuming the total CPU power is the same, the SGI supercomputer is faster than any cluster (Big Mac included)
    • Re:undecided (Score:3, Informative)

      by networkBoy (774728)
      serious response to funny comment:
      The deal is that the Itanium2s are better (relative) processors when everything is compiled for them. The hitch is that in terms of price for performance Itaniums are near the bottom of the pile (highest performance != best value).
      Finally, in this situation (price be damned), there is not any reason to worry about value, just performance. Thus Itanium wins.
      -nB
    • I do wonder why they went with the Itaniums. Perhaps Intel is having an "All 64-bit chips must GO!!" clearance or something.
      • What's wrong with you people? "Itanium is bad" is an urban myth. Yes, it's expensive. Yes, Itanium 1 was indeed bad. But this is Itanium2. It might still be expensive, but it's currently the best CPU for large supercomputers - machines which run hundreds of CPUs in parallel, which is exactly what SGI does.

        SGI is using the best tool for the job. When (or rather IF) AMD comes up with a better CPU for this kind of workload, they'll probably migrate to that.

        Don't get me wrong, I'm using AMD on all my PCs, but a
        • Obviously you didn't read one of my other posts. I agree Itanium2 is a fine enough chip. I just think it is expensive. I don't think it is a good value & I am used to seeing government-sponsored research pick a lower cost product over the top-of-the-line every time. In the US, many of the research institutes I know of are consciously choosing to make AMD or Apple clusters because of how far they can stretch their research dollar.

        • It might still be expensive, but it's currently the best CPU for large supercomputers


          Except for, say, POWER5, and vector processors (NEC SX-6, SX-8, Cray X1). If by "best" you mean raw performance and bandwidth, cost and power consumption be damned.


          SGI is using the best tool for the job.


          Perhaps they are, perhaps not. That's not the issue. The thing is that a number of years ago (when AMD64 was barely a blip on the radar) they made a strategic commitment to IA-64. Spending vast amounts of money to
    • But isn't Itanium kinda evil (as opposed to slashdot darlings PPC/Power and Opteron)?
      While Linux is super cool? So should I like it?


      Nonono, on Slashdot Itanium is the best thing since sliced bread. And if it isn't... I'LL MAKE IT!!
    • But isn't Itanium kinda evil (as opposed to slashdot darlings PPC/Power and Opteron)?

      But isn't saying that offensive to turkeys [wikipedia.org]?

    • Re:undecided (Score:2, Informative)

      by drw (4614)
      The Itanium2 is a fast processor, especially when it comes to optimized floating-point calculations. Yes, it is expensive and so the price/performance ratio is not as good as common desktop processors mostly for two reasons:

      1. Large die area (mostly due to huge amounts of on-die cache) - chip price is directly related to how many cores fit on a silicon wafer.
      2. The Itanium2 is a low volume product, so R&D and verification costs are a higher percentage of chip costs.

      The biggest problem with the
  • Linux is a good choice for a supercomputing cluster? No shit sherlocks. This isn't front page news, it's barely news at all. No wonder readership figures are declining [alexa.com].
    • mhhh, I looked up some other (German) domains.

      Not sure about the results. There are some huge fluctuations (~20x) in the data without obvious explanations. Maybe someone can explain how they generate those numbers:

      1. referers posted
      2. ad-downloads
      3. click-through monitoring
      4. ...?
  • Honest curiosity (Score:3, Interesting)

    by erroneus (253617) on Wednesday November 03, 2004 @12:41PM (#10711134) Homepage
    I hear gobs about huge clusters and Linux as the OS that makes it all happen but I don't think I've ever heard of other OSes used like this.

    Could someone make an "off the top of their head" list of SuperComputing clusters and the OSes that are used in them?

    I *am* a Linux user and I'm actually kinda curious if Microsoft has an answer to this area of computing?
    • Re:Honest curiosity (Score:5, Informative)

      by Wesley Felter (138342) <wesley@felter.org> on Wednesday November 03, 2004 @12:57PM (#10711758) Homepage
      Most clusters run the vendor Unix. IBMs run AIX or Linux, SGIs run IRIX or Linux, Alphas run Tru64, x86 clusters run Linux. The ultra-high-end custom machines run obscure custom Unix ports. Microsoft is trying to break into the HPC market, but so far only Cornell and Rice are buying.
    • Linux has been EXTREMELY successful at breaking into the supercomputer market. Particularly in the 40-200 dual SMP node size. Some really large systems run Linux, but not as large a majority as is the case with the smaller systems.

      Probably the two most successful supercomputing product lines are the IBM p-series clusters and HP's alpha-server SC line. Sun sells clusters of sunfire 6800's and up.

      The IBM uses 1-32 way SMPs running on POWER[3-5] processors and running AIX. They use a proprietary interconnect
  • Aye, but... (Score:3, Funny)

    by Lardmonster (302990) on Wednesday November 03, 2004 @12:48PM (#10711397)
    13TB of RAM, but how much swap?
    • 13 TB * 2 (Score:3, Funny)

      by BlurredWeasel (723480)
      isn't it recommended you have 2x ram as your swap? so that'd be *does difficult calculations in head* 26TB of swap. You really don't want the kernel killing off processes because you run out of ram....that'd be bad.
      • Swap on an HPC is less necessary because if your nodes EVER have to swap then you might as well stop what you're doing and go and buy more memory. If you can't fit your problem in memory you might as well not even start
    • Typically it is highly recommended that on Irix (dunno if they're running Irix or not, but it would make sense) you have a swap space at least as large as main memory (for crash dumps). That said, back in my SGI days we had a lot of machines with 4GB of RAM (this was several years ago) and 4GB disks (the vendor was stingy with the system disks). Since Irix required about a GB of disk back then (early 6.5 days) we only had room for 3GB of swap. This wasn't a huge problem because the OS was rarely taking u
  • by Anonymous Coward on Wednesday November 03, 2004 @12:51PM (#10711489)
    Sorry to spoil the excitement for everybody but actually, Columbia far exceeds the Japanese system's memory capacity at 20 TByte. See this description [nasa.gov] for details of Columbia's config.
  • You don't suppose they ever do any weapons research, do you? Hmmm, what to do...

    1. Make sure GWB is really, really truly reelected
    2. Hint to Japanese you'll tell White House that Japan has WMD program - "Okinawa has such pretty beaches - it'd be a shame if they got all shot up now, wouldn't it?"
    3. ...
    4. Profit!
  • a Japanese Atomic Energy Research foundation would need that kind of computing power...Godzilla!
  • The Japan Nuclear research team who just acquired a 13TB RAM supercluster also gained a nuclear power plant to power this bad boy. Projections speak of a 2.5 hour battery life, although Limerick Power Plant has offered their Nuclear facility which will generate a whopping 5.5 hour battery life span.
  • by masouds (451077)
    And I thought 640k memory was enough for everyone. Wait a minute, was it me or...?
  • Man this shit makes me feel old.

    I worked on a machine that had 24k (that's 24,576) bytes of wire-wrapped, core memory. At the time though I knew where RAM was trending. I had an Apple][ with 32 K of semi-conductor memory.

    I wrote a Pascal-like HLL compiler and a payroll system for the damn thing. In 24k bytes of memory.

    What the [expletive deleted] do you DO with all those terabytes of high-speed RAM? Let's pretend something goes KABOOM!

    I don't know whether to be wow-ed or depressed.
    • What the [expletive deleted] do you DO with all those terabytes of high-speed RAM?

      Simulations. Everything from nuclear processes, to space shuttles re-entering atmosphere, to cars crashing into walls, to oil drilling, to... whatever. That's what the SGI systems are used for.
  • I can't believe any of you didn't do a single doom 3 joke yet!
    • I saw two of them already (okay, one of them was actually about doom4 :-)...

      I myself was more looking for "Imagine a Beowulf cluster of these" posts, BTW.
  • ..they should just run everything from a RAM disk.
  • by obsid1an (665888) <.obsidian. .at. .mchsi.com.> on Wednesday November 03, 2004 @02:53PM (#10713834)
    Hope they didn't forget the $6.4 million startup cost and $2.2 million annual fee for linux licenses [caldera.com] (assuming 8 CPU systems).
  • ...but not quite enough to hold my...friends'...entire pr0n collection in memory.
  • if this will help safety out at the power plants. Japan has a (comparatively) horrible safety record when it comes to nuclear power compared to Western Europe and the US....
  • ... enough to run Longhorn ... ... compile Gentoo ... ... Beowulf cluster ...
  • I hear they will be using it to test the new release of After Dark. It will be running it 24 hours a day.
  • it sounds good, but can it run the Duke Nukem 3D Atomic Edition in 800x600 VESA mode?
  • http://christiancarling.com/snoopy.html [christiancarling.com] This is what the British government needs
