Top 500 Fastest Computers 97
epaulson writes "The Top500 list has been released for the first half of 1999. The number one machine remains ASCI Red. The biggest Linux machine is cplant at 129, and Avalon is number 160. The list is a ranking of results from the LINPACK benchmark, which is a Linear Algebra code, so things like distributed.net and SETI@home don't count. "
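For reference, here is a minimal sketch of the kind of computation LINPACK times: solving a dense system Ax = b by Gaussian elimination and reporting Mflop/s. It's in Java, since a Java port of the benchmark comes up in the comments below; the class name, problem size, and timing code are illustrative, not the actual netlib source.

  // Illustrative sketch only, not the netlib LINPACK code: solve Ax = b by
  // Gaussian elimination with partial pivoting and report Mflop/s, using the
  // standard LINPACK operation count of 2/3*n^3 + 2*n^2.
  import java.util.Random;

  public class TinyLinpack {
      public static void main(String[] args) {
          int n = 1000;
          double[][] a = new double[n][n];
          double[] b = new double[n];
          Random rng = new Random(1);
          for (int i = 0; i < n; i++) {
              for (int j = 0; j < n; j++) a[i][j] = rng.nextDouble() - 0.5;
              b[i] = rng.nextDouble() - 0.5;
          }
          long t0 = System.nanoTime();
          solve(a, b, n);
          double secs = (System.nanoTime() - t0) / 1e9;
          double flops = 2.0 / 3.0 * n * n * n + 2.0 * n * n;
          System.out.printf("Mflop/s: %.2f%n", flops / secs / 1e6);
      }

      // In-place Gaussian elimination with partial pivoting; overwrites b with x.
      static void solve(double[][] a, double[] b, int n) {
          for (int k = 0; k < n - 1; k++) {
              int p = k;                              // find the pivot row
              for (int i = k + 1; i < n; i++)
                  if (Math.abs(a[i][k]) > Math.abs(a[p][k])) p = i;
              double[] tr = a[k]; a[k] = a[p]; a[p] = tr;
              double tb = b[k]; b[k] = b[p]; b[p] = tb;
              for (int i = k + 1; i < n; i++) {       // eliminate below the pivot
                  double m = a[i][k] / a[k][k];
                  for (int j = k; j < n; j++) a[i][j] -= m * a[k][j];
                  b[i] -= m * b[k];
              }
          }
          for (int i = n - 1; i >= 0; i--) {          // back substitution
              double s = b[i];
              for (int j = i + 1; j < n; j++) s -= a[i][j] * b[j];
              b[i] = s / a[i][i];
          }
      }
  }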
Re:This list isn't even close to accurate. (Score:1)
Thing is, people have to go and register themselves on the list & probably no-one knows it exists.
Re:Woo-Hoo Buffalo made the list!!! (Score:1)
(sorry, this won't make sense unless you go to SUNY Buffalo)
Dumping (Score:1)
for less than it costs to produce.
Of course, this is exactly what practically every Internet software company's strategy is--give away the product to gain market share.
China? (Score:1)
Re:Ooohh, let me at it. (Score:1)
run Win NT on the Teraflops machine. My manual describes the details of how to do so. The good news is, we did get Linux 2.0 running, although it never could access the MRC mesh interconnect.
Re:actual numbers for machines for mortals (Score:1)
I assume the benchmark would run on Kaffe and the new IBM Java, if someone wanted to see what difference a JIT compiler makes under Linux.
Re:FPGAs: Where? (Score:1)
You'll find the kits at:
http://www.associatedpro.com/apsprice.html
Re:actual numbers for machines for mortals (Score:1)
mflops in java with my p200MMX..
(it could be the Java interpreter though, I'm using Netscape's on Win98)
---------------
Chad Okere
Re:China? (Score:1)
Re:actual numbers for machines for mortals (Score:1)
ask: java Linpack
Mflops/s: 8.374 Time: 0.08 secs Norm Res: 1.43 Precision: 2.220446049250313E-16
Re:Hitachi supercomputers (Score:1)
Intel, SGI, and other comments (Score:1)
Back in November, one vendor (don't remember if it was SGI or Sun) earned itself some animosity when its marketing department counted the number of entries it had in the Top500 and declared itself the leader in the supercomputing industry.
BTW, try using a table to get your columns to line up.
Christopher A. Bohn
Re:other way around (Score:1)
And you're absolutely right about the scaling of applications. But the use of the LINPACK benchmark was a necessary decision to make the list accessible to most anybody who had a supercomputer, despite the benchmark being obsolete.
Christopher A. Bohn
Re:Woo-Hoo Buffalo made the list!!! (Score:1)
Re:actual numbers for machines for mortals (Score:1)
Netlib and the Top500 (Score:1)
Re:Because: (Score:1)
"The market for such CPUs would be very small, maybe a few hudred per year"
Actually, there would be a heck of a lot more demand for such a CPU if the production cost could be brought down enough for, let's say, CG or games.
"Needless to say, there would also be many technical difficulties. Feeding thousands of registers would require a very wide memory architecture, a few thousand bits might be a good start. "
Sony's next generation "Play Station" has a 2000+ bit bus. So the technical difficulties aren't really a stumbling block.
"SIMD can be used effectively only when there is one operation that is done to a big array of data. eg you have an array of 1024 bytes, and you want to increase the value of each byte by one. However, not all code is like this. "
True, but you could use execution units that each control several registers; the register-to-unit ratio could be something like 2:1, 3:1, 4:1, etc. The optimization of wafer real estate versus execution cycles becomes the issue. Wafer size can be increased to accommodate more processing power, which, BTW, is more power efficient than these clusters of PCs (sarcasm) being used today.
The technology that comes to mind for at least experimenting with such architectures would be FPGAs. I think I read somewhere where FPGAs were being used by a company to produce a massively parallel processing system.
Something to think about: the R&D for such systems could come from talented engineering students working on MS or PhD degrees. They could use electronic CAD systems on campus to produce the lithography masks for CPUs and have Intel or Motorola grow the chips. This would reduce the cost of R&D by billions; giving access to Intel's or Motorola's facilities, which are manufacturing chips anyway, wouldn't cost much. Remember, these companies would only grow the chip. The companies that support such a program would get right of first refusal on any new CPU architecture produced by a student. Almost like open source for hardware. Just a thought.
Re:Those Hitachi processors.. (Score:1)
use for them, but they sure do run optimized linear algebra codes fast. Most other machines on the list use commodity CPUs. Cray used to have a bunch of vector CPUs in MPP configuration, but now they just use Alphas. Doing a separate CPU line for supercomputers is just too expensive. Let's hope the vector stuff makes a comeback in next-generation "multimedia" instructions or something like that.
Re:USA vs. Luxembourg? (Score:1)
I believe Scandinavia and the Benelux region are in fact ahead of Germany when it comes to dealing with the Y2K problem, on par with the USA and the UK.
When it comes to IT usage, Sweden and Finland are in fact #1 and #2 in the world per capita. In Sweden, over 50% of the population used the Internet last month. Mobile telephone usage is WAY ahead. There is in fact a lot of interesting research done in northern Europe on wireless communication, for instance Bluetooth, and in other high-tech areas. But American media are a bit bad at reporting non-American news....
Do I have an inferiority complex? Hell yeah.
Two comments: (Score:2)
Secondly: is Blue Mountain completely up to speed yet? I seem to remember reading that it was going to be the fastest (albeit not by much) when all the processors were finally added. I dunno, maybe I was just smoking something or reading SGI press releases....
----
I work on #13 :) (Score:2)
neutrino
Wow !!! (Score:1)
The results count, not only the potential.
Then the top 500 is useless, since it doesn't describe the task.
We should search for the most useful computer in the world. That would lead to a great debate.
Re:Two comments: (Score:1)
You're likely one step away from eunuch-hood. I doubt if the NSA computers you're thinking of are on this list--or even run linear algebra software, for that matter. Those "classified" machines are in all likelihood simulating nuclear reactions and other defense-related tasks.
Numbers for machines for mortals (Score:1)
(i.e. my cyrix 166)
Re:Numbers for machines for mortals (Score:1)
--
Woo-Hoo Buffalo made the list!!! (Score:1)
Re:Numbers for machines for mortals (Score:1)
Re:This list isn't even close to accurate. (Score:2)
BTW, I work on two of the machines on this list.
neutrino
This is a really easy one (Score:2)
The most useful computer in the world is the one you use.
----
Surprised (Score:1)
"There is no spoon" - Neo, The Matrix
"SPOOOOOOOOON!" - The Tick, The Tick
actual numbers for machines for mortals (Score:4)
On my 400MHz K6-2, I get 16 Mflops without optimization, 20 with -O3. Not quite what was listed in the performance document [netlib.org], but that might have been with a hand-tuned library.
For comparison, my home machine (a 300 MHz K6-2) gets 13 Mflops unoptimized, 20 with. It's running Debian 2.2pre/potato, which uses egcs, so the optimization is probably better. Both machines have a 100 MHz FSB and 1 MB L2 cache.
There's a fun java version [netlib.org] of the LINPACK benchmark as well. I get 1.4 Mflops.
Re:Wow !!! (Score:1)
Because the fastest is not the most useful.
The computer is not a single isolated piece of hardware, it is connected with devices and used by PEOPLE, and you can't move them around that easily.
Where do I get one?? (Score:1)
-----
Ooohh, let me at it. (Score:1)
how many teraflops would we get, then?
Why (Score:1)
Why use thousands of antiquated CPUs? Wouldn't it be better to build a multi-register CPU similar to MMX but on a scale in the thousands? So what if the chip density isn't
Integrated number-crunchers not server nests ? (Score:2)
But when you look up Fujitsu on the Top-500 database, it turns out that only the vector supercomputer (VPP) series make the list, and none of their AP-xxxx systems.
For the VPP series, the entry level for the top 500 is a twelve-processor system rather than IC's single vector processor.
On the other hand, the AP-3000 would have enough total throughput at 45.6 Gflops to get on the list at number 172. But my guess is that it can only achieve that for problems that split into relatively big independent chunks.
That might be OK for, say, servers and big CFD models, but I suspect that the LINPACK test suite needs much more fine-grained parallelisation and would be hit much harder by communication latencies between nodes.
That's just a guess: perhaps any real supercomputer experts out there could say whether this sounds right ?
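For what it's worth, here's a toy model of that guess, with made-up numbers for latency, total work, and synchronization counts. The point is just that a job which splits into big independent chunks pays inter-node latency a handful of times, while an LU-style solve synchronizes at every elimination step.

  // Toy model only; every number below is made up for illustration.
  public class ScalingToy {
      public static void main(String[] args) {
          double flops = 45.6e9;     // aggregate 45.6 Gflop/s, from the post above
          double latency = 50e-6;    // assumed 50 us node-to-node latency
          double work = 1e12;        // assumed 1 Tflop of total work
          int coarseSyncs = 100;     // big independent chunks: few synchronizations
          int fineSyncs = 100000;    // LU-style: one synchronization per pivot step
          System.out.printf("coarse-grained: %.2f s%n", work / flops + coarseSyncs * latency);
          System.out.printf("fine-grained:   %.2f s%n", work / flops + fineSyncs * latency);
      }
  }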
Re:I work on #13 :) (Score:1)
The compilers are great. Unicos/MK isn't bad, but it can be killed by user processes doing too much IPC; OS IPC should have been given higher priority. That was 2 years ago, so it may be fixed by now.
Good question! Microsoft...? (Score:3)
Even though I believe they modified the source of Linux to run on all those processors, that is one of the advantages of Linux.
I am awaiting a press release from Redmond.
--
ATARI beats MAC, no sign of Amiga so far ... (Score:1)
I knew it
Re:Numbers for machines for mortals (Score:1)
In truth, ten years from now we're not going to be counting MFLOPS to assess machine performance, or op counts to assess algorithm performance. We're going to be looking at the number and pattern of memory accesses. Some of the IBM Power3 processors can churn out something like 4 or 6 double precision flops per clock. The question is getting a main memory bus that will *read* 96 bytes and *write* 48 bytes per CPU clock, in scatter-gather, with no latency. Such a beast doesn't really exist, so Things Get Interesting.
How fast you can do something these days has a lot to do with how well you can parallelize it.
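A quick check of the 96/48 figure above: it works out if you assume six independent double-precision operations per clock, each doing two 8-byte reads and one 8-byte write (my reading of the post, not an IBM spec).

  // Back-of-envelope check, assuming six independent double precision ops
  // per clock, each with two 8-byte operand reads and one 8-byte result write.
  public class BytesPerClock {
      public static void main(String[] args) {
          int opsPerClock = 6;                   // optimistic Power3-style figure
          int readBytes = opsPerClock * 2 * 8;   // two double operands per op
          int writeBytes = opsPerClock * 8;      // one double result per op
          System.out.println("reads:  " + readBytes + " bytes/clock");   // 96
          System.out.println("writes: " + writeBytes + " bytes/clock");  // 48
      }
  }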
Re:actual numbers for machines for mortals (Score:1)
Re:Two comments: (Score:1)
Re:Stats by Country (Score:1)
1) By number of computers on the list, the top 6 countries are G7 members;
2) The G7 all have a computer in the top 47, while no non-G7 state has a computer in the top 52.
Re:Two comments: (Score:1)
well, i can't specifically comment on the machine that holds third place, but i work for a sometimes defense contractor, and we have some very fast machines here that the government owns (ie the government paid for them, so they belong to the government) that are used for simulations, like another poster said. the nature of the simulations is classified, so i won't go into that (though it's hardly as nefarious as you might think). i can tell you, however, that these computers do not spend their time trying to crack encryption keys or other things to "violate our privacy".
Re:Deep Blue (Score:1)
FPGAs (Score:1)
Just saw an FPGA kit that slips into a PCI slot in a PC for some US $300. I hear some FPGA chips have up to 1,000,000 gates and operate at 200 MHz! Given that it would take two gates to produce one register for one bit, I figure the possibility of a 32-bit processor with 200 or maybe even 500 active registers, with a humble instruction set of 17: XOR, OR, AND, conditional jump, jump, shift right, shift left, integer math (add, sub, multiply, divide), floating point math (add, sub, multiply, divide), load from memory, and block transfer. Pretty basic, but it should get the job done. Each register will have its own execution unit. Now, the number of cycles to complete an operation should range from one to four, with the mean being about 2. Let's see: 200 MHz at an average of 2 cycles per op, that's 100 million times five hundred, that's 50 billion instructions per second! Give me a few months and my supercomputer will be on this list, and it will only take a thousandth of the space of those other monsters!
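The arithmetic in that post does check out; here it is spelled out, using the poster's own assumptions (one execution unit per register, 500 of them, 200 MHz, an average of 2 cycles per operation).

  // Reproduces the poster's estimate; all inputs are their assumptions.
  public class FpgaEstimate {
      public static void main(String[] args) {
          double clockHz = 200e6;    // claimed FPGA clock
          double cyclesPerOp = 2.0;  // assumed average cycles per operation
          int units = 500;           // one execution unit per register
          double opsPerSec = clockHz / cyclesPerOp * units;
          System.out.printf("%.0f billion ops/s%n", opsPerSec / 1e9);  // 50
      }
  }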
Do they really brew that much beer? :) (Score:1)
Re:FPGAs: Where? (Score:1)
Just curious . . .
If you don't want to clutter the thread with nifty info on FPGA stuff, mail me at
johng@NOSPAM.eng.auburn.edu
And, of course, s/NOSPAM//
(Isn't it sad we have to mangle?)
Re:Ooohh, let me at it. (Score:1)
I think it uses some kind of Unix-based system. Besides, Win98 would make it take an hour just to start up.
Besides, what about that chess-playing computer by IBM? Deep Blue, the one that beat the Russian by 1 game. How fast was that?
Deep Blue (Score:1)
Christopher A. Bohn
Re:This list isn't even close to accurate. (Score:3)
Christopher A. Bohn
Oh, well... (Score:1)
Christopher A. Bohn
Re:Numbers for machines for mortals (Score:1)
http://www.netlib.org/linpack/
I once did some optimization of the BLAS on some old SPARCs. We were able to more than double the performance by unrolling the loops into blocks that would fit the cache (hand-tweaking the assembler). Which makes me wonder: most optimizing compilers do a fair amount of this sort of thing themselves... could this list have as much to do with compiler tricks as it does with the raw speed of the machine?
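For the curious, here is a sketch of the blocking trick described above, applied to matrix multiply in plain Java: the arithmetic is identical, but the blocked version works on sub-blocks small enough to stay in cache. The block size of 32 is a guess and would be tuned to the target machine, which is exactly the sort of thing an optimizing compiler tries to do automatically.

  // Sketch of cache blocking: same flops, better memory reuse. N and the
  // block size B are illustrative; N must be a multiple of B here.
  public class BlockedGemm {
      static final int N = 512, B = 32;

      // Naive triple loop: strides through b with little reuse of cached data.
      static void naive(double[][] a, double[][] b, double[][] c) {
          for (int i = 0; i < N; i++)
              for (int j = 0; j < N; j++)
                  for (int k = 0; k < N; k++)
                      c[i][j] += a[i][k] * b[k][j];
      }

      // Blocked version: each BxB tile of a, b, and c fits in cache.
      static void blocked(double[][] a, double[][] b, double[][] c) {
          for (int ii = 0; ii < N; ii += B)
              for (int kk = 0; kk < N; kk += B)
                  for (int jj = 0; jj < N; jj += B)
                      for (int i = ii; i < ii + B; i++)
                          for (int k = kk; k < kk + B; k++) {
                              double aik = a[i][k];  // hoist the invariant load
                              for (int j = jj; j < jj + B; j++)
                                  c[i][j] += aik * b[k][j];
                          }
      }

      public static void main(String[] args) {
          double[][] a = new double[N][N], b = new double[N][N];
          double[][] c1 = new double[N][N], c2 = new double[N][N];
          for (int i = 0; i < N; i++)
              for (int j = 0; j < N; j++) { a[i][j] = i + j; b[i][j] = i - j; }
          long t0 = System.nanoTime(); naive(a, b, c1);
          long t1 = System.nanoTime(); blocked(a, b, c2);
          long t2 = System.nanoTime();
          System.out.println("naive:   " + (t1 - t0) / 1000000 + " ms");
          System.out.println("blocked: " + (t2 - t1) / 1000000 + " ms");
      }
  }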
Might I just add... (Score:2)
Re:This list isn't even close to accurate. (Score:2)
A bit of statistical analysis (Score:5)
The numbers won't add up correctly because several of the machines were credited to two co-builders. Or I could have made a mistake.
Company: total, # out of the top 10, highest rank
(I tried to make this line up, but apparently the leading spaces got eaten.)
SGI: 182/500, 7/10, #2
IBM: 118/500, 1/10, #8
Sun: 95/500, 0/10, #54
H/P: 39/500, 0/10, #150
Fujitsu: 23/500, 0/10, #26
NEC: 18/500, 0/10, #29
Hitachi: 12/500, 1/10, #4
Compaq: 5/500, 0/10, #49
Intel: 4/500, 1/10, #1
Self-made: 3/500, 0/10, #129
SNI: 2/500, 0/10, #66
Tsukuba: 1/500, 0/10, #18
Siemens: 1/500, 0/10, #355
This ranking looks very different from the ranking of the top five computers. For example, Intel, which holds the #1 spot, is basically a non-factor in the supercomputer market, with a mere three other computers on the list. H/P and Sun, which don't even make the top 50, seem to have the mid-level supercomputer market locked up, with 134 computers between them. SGI, however, is still the undisputed leader, from the high end (7/10) to the mid and low ends of this list.
Re:Stats by Country (Score:1)
Because: (Score:2)
Wouldn't it be better to build a multi-register CPU similar to MMX but on a scale in the thousands?
Interesting idea, but it does have its flaws. For one, designing a new CPU is _really_ expensive. And as you add more parallelism, it gets even more complicated and expensive (look at Merced). The market for such CPUs would be very small, maybe a few hundred per year. As you may have noticed, even supercomputers are made as cheap as possible these days (eg. Beowulf).
Needless to say, there would also be many technical difficulties. Feeding thousands of registers would require a very wide memory architecture, a few thousand bits might be a good start. I sure wouldn't want to be the engineer responsible for designing a mobo for those CPUs..
There are a few architectural problems too. SIMD can be used effectively only when there is one operation that is done to a big array of data. eg you have an array of 1024 bytes, and you want to increase the value of each byte by one. However, not all code is like this. You might want to increase the value of the first element by one, the second element by two and so on. MMX just became useless, there is no parallelism here. Now we have a CPU that is working at a fraction of its full potential: of the 2000 or so registers, only two are used. There is other stuff too, but I'm lazy so..
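To make the contrast concrete, here are the two loop shapes from that post in scalar Java. The first applies one operation uniformly across an array, which is exactly what SIMD wants; the second needs a different operand for each element.

  // The two loop shapes described above, in scalar form for illustration.
  public class SimdShapes {
      public static void main(String[] args) {
          byte[] data = new byte[1024];

          // SIMD-friendly: one operation applied to the whole array.
          for (int i = 0; i < data.length; i++)
              data[i] += 1;

          // SIMD-hostile (as described): each element gets a different
          // operand, so an MMX-style unit has nothing to batch.
          for (int i = 0; i < data.length; i++)
              data[i] += (i + 1);

          System.out.println("data[0] is now " + data[0]);
      }
  }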
Linpack on Java (Score:1)
It's not the language that's the problem, it's the fact that you're running the program via an interpreter (aka your Java VM). Compile the Java to native code and you should get better results.
Re:Two comments: (Score:1)
Or exist on any lists, anywhere, ever. Their big iron's existence is classified, let alone specs or performance benchmarks. And you can bet your ass that NSA's top five machines would take spots 1 through 5 on this list, if they existed. ;-)
Well, ok, maybe not on this list, due to the software differences, as edhall noted. But you know what I mean.
----------------------
Fastest BLAS/LAPACK under i686 linux? (Score:1)
the BLAS under pgcc/pg77 breaks the test cases? Are there binaries in rpm format?
Re:Deep Blue (Score:1)
Stats by Country (Score:5)
USA: 292/500, 7/10, #1
Japan: 56/500, 1/10, #4
Germany: 47/500, 0/10, #15
UK: 29/500, 2/10, #7
France: 18/500, 0/10, #47
Canada: 8/500, 0/10, #29
Sweden: 7/500, 0/10, #71
Netherlands: 6/500, 0/10, #146
Switzerland: 6/500, 0/10, #339
Italy: 5/500, 0/10, #36
Australia: 5/500, 0/10, #102
Korea: 3/500, 0/10, #78
Denmark: 3/500, 0/10, #275
Belgium: 3/500, 0/10, #286
Spain: 3/500, 0/10, #314
Finland: 2/500, 0/10, #53
Norway: 2/500, 0/10, #193
Austria: 2/500, 0/10, #392
New Zealand: 1/500, 0/10, #64
Luxembourg: 1/500, 0/10, #247
Mexico: 1/500, 0/10, #436
Summary: United States 292 vs. Everybody Else 208.
In the top ten, it's United States 7 vs. Everybody Else 3.
If you compile the stats by the country in which the corporation that made the computer is based, American companies are responsible for over 400 of the top 500 supercomputers (just about everything except the Japanese stuff).
Hitachi supercomputers (Score:1)
Anybody know what kind of processors the Hitachi supercomputers at #4 (128 processors) and #12 (64 processors) use? They seem to be the fastest per-processor on LINPACK...
Re:Stats by Country (Score:1)
The most surprising part of the list, IMO, is the Hitachi showing. They made supercomputers with only a tenth the number of processors used by the machines near their 4th and 12th place positions. And their machines are the only ones in the top 500 with a single-digit processor count (4). 11 of the 12 are in Japan, though! Shouldn't a lower processor count reduce the price tag in a major way? Why aren't they in the US?
other way around (Score:1)