ARM In Supercomputers — 'Get Ready For the Change'
An anonymous reader writes "Commodity ARM CPUs are poised to replace x86 CPUs in modern supercomputers just as commodity x86 CPUs replaced vector CPUs in early supercomputers. An analysis by the EU Mont-Blanc Project (PDF) (using Nvidia Tegra 2/3, Samsung Exynos 5 & Intel Core i7 CPUs) highlights the suitability and energy efficiency of ARM-based solutions. They finish off by saying, 'Current limitations [are] due to target market condition — not real technological challenges. ... A whole set of ARM server chips is coming — solving most of the limitations identified.'"
IMHO - No thanks. (Score:2, Insightful)
PC user, hardcore gamer and programmer here; for me, energy efficiency is a lesser priority than speed in a CPU. Make an ARM CPU compete with an Intel Core i7 2600K, and show me it's overclockable with few issues, and you've got my attention.
Re:IMHO - No thanks. (Score:5, Insightful)
No doubt your CPU would win. But when looking at power/price as well, you'd have to pit your CPU against 50 or so ARM chips in parallel. For some solutions, it may be a far better choice. One size doesn't fit all.
One Size Doesn't Fit All -- Same in Supercomputing (Score:5, Informative)
There is already one line of supercomputers built from embedded hardware: the IBM Blue Gene. Their CPUs are embedded PowerPC [wikipedia.org] cores. That's the reason why those systems typically have an order of magnitude more cores than their x86-based competition.
Now, the problem with BG is that not all codes scale well with the number of cores. Especially when you're doing strong scaling (i.e. you fix the problem size but throw more and more cores at the problem), Amdahl's law [wikipedia.org] tells you that it's beneficial to have fewer, faster cores.
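To put a rough number on that, here is a back-of-the-envelope Amdahl's law sketch in Python; the 5% serial fraction is an arbitrary illustration, not a figure from the study or from any real code:

    # Amdahl's law: speedup(N) = 1 / (s + (1 - s) / N),
    # where s is the serial fraction and N the number of cores.
    def amdahl_speedup(serial_fraction, cores):
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

    s = 0.05  # assumed 5% serial code, purely illustrative
    for n in (16, 256, 4096):
        print(n, round(amdahl_speedup(s, n), 1))
    # 16 cores   -> ~9.1x
    # 256 cores  -> ~18.6x
    # 4096 cores -> ~19.9x, already near the 1/s = 20x ceiling

Past a few hundred of these cores, extra sockets buy almost nothing, which is exactly why fewer, faster cores can be the better trade for strong scaling.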
Finally, I consider the study to be fundamentally flawed, as it compares the OEM prices of consumer-grade embedded chips with retail prices of high-end server chips. This is wrong for so many reasons... you might then throw in the 947 GFLOPS, $500 AMD Radeon 7970 [wikipedia.org], which beats even the ARM SoCs by a margin of 2x (ARM: ~1 GFLOPS/$, AMD Radeon: ~2 GFLOPS/$).
Power Efficiency - MIPS vs ARM (Score:3)
I may be wrong here, but I get the impression that the MIPS architecture is much more power efficient than the ARM architecture.
If they are going to talk about building big iron using highly power-efficient CPUs, I reckon a MIPS CPU might be more suitable for this task than one from the ARM camp.
Re:Power Efficiency - MIPS vs ARM (Score:5, Insightful)
I may be wrong here, but I get the impression that the MIPS architecture is much more power efficient than the ARM architecture.
If they are going to talk about building big iron using highly power-efficient CPUs, I reckon a MIPS CPU might be more suitable for this task than one from the ARM camp.
I don't think it is. The best figures (albeit somewhat out of date) I can find for a MIPS-based system are 2 GFLOPS/W for a complete 6-core node including memory. ARM Cortex-A15 power consumption is a little hard to track down, although it's suggested that a 4-core 1.8 GHz configuration (e.g. Samsung Exynos 5) could run at full speed on 8 W (if the power manager let it; the Exynos 5 throttles down when it consumes more than 4 W). Performance per core is about 4 GFLOPS per GHz, so this system should be able to pull in about 28.8 GFLOPS (or twice that if using ARM's "NEON" SIMD system to full advantage). Add in ~2 W for 1 GB DDR3 SDRAM, and that's 2.9 GFLOPS/W. Assuming that the MIPS system I found is not the best available (as the data was from 2009, it certainly seems likely better is available now), the two appear to be roughly comparable.
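For anyone who wants to check that arithmetic, here it is as a few lines of Python; the per-core FLOP rate, package power and DRAM power are the rough assumptions quoted above, not measured values:

    # Rough Cortex-A15 node estimate from the figures above (all assumed, not measured).
    cores = 4
    clock_ghz = 1.8
    gflops_per_ghz_per_core = 4.0   # assumed scalar FP rate, no NEON
    cpu_watts = 8.0                 # assumed full-speed package power
    dram_watts = 2.0                # assumed for 1 GB DDR3

    gflops = cores * clock_ghz * gflops_per_ghz_per_core        # 28.8 GFLOPS
    print(gflops, round(gflops / (cpu_watts + dram_watts), 2))  # ~2.88 GFLOPS/W
    # Exploiting NEON fully would double gflops -> ~5.8 GFLOPS/W on the same assumptions.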
Re: (Score:2)
I may be wrong here, but I get the impression that the MIPS architecture is much more power efficient than the ARM architecture.
If they are going to talk about building big iron using highly power-efficient CPUs, I reckon a MIPS CPU might be more suitable for this task than one from the ARM camp.
MIPS is an under-invested, older, but great technology.
Another historic winner was the DEC Alpha.
As the folk at Transmeta (and others) demonstrated, logic to decode an arbitrary ISA and drive a RISC core, far faster than in the old VAX microcode days, is very possible. This seems to be the way of modern processors. So the ARM/x86/x86_64 ISA almost does not matter, except to the compiler and API/ABI folk. If you want to go fast, feed your compiler folk well.
Re: (Score:3)
As the folk at Transmeta (and others) demonstrated, logic to decode an arbitrary ISA and drive a RISC core, far faster than in the old VAX microcode days, is very possible. This seems to be the way of modern processors. So the ARM/x86/x86_64 ISA almost does not matter, except to the compiler and API/ABI folk. If you want to go fast, feed your compiler folk well.
One of the best ways you can help the compiler folk is with an orthogonal and sensible architecture. Furthermore, consider that generating good code is a problem that must be solved for every language, so starting with a good ISA makes for a lot less work.
Re: (Score:3)
Another advantage of MIPS is that 64bit MIPS is already mature, having been around since the early 90s... 64bit ARM on the other hand is new and not widely supported yet.
Re: (Score:3)
The Core i7 might very well still win. Remember that Intel is more efficient in computing work per watt, and an Ivy Bridge Core i7 3770K uses 77 W. If your average ARM chip uses 2 watts, that means that ~30 ARM chips will still get beaten by the Core i7....
Re:IMHO - No thanks. (Score:4, Interesting)
Why would an ARM chip use 2 Watts?
- ARM Cortex-A9
  - 1 op / cycle @ 800 MHz - 2 GHz
  - 0.25 - 1 Watt
- ARM Cortex-A15
  - 4 ops / cycle @ 1 - 2.5 GHz*
  - 0.35 Watt
Re: (Score:3)
No doubt your CPU would win. But when looking at power/price as well, you'd have to pit your CPU against 50 or so ARM chips in parallel. For some solutions, it may be a far better choice. One size doesn't fit all.
50 chips cost more in silicon than a single x86.
Basically you need a "new generation" of ARM chips. But they'll have to compete against a new generation of x86 chips - and remember, x86 chips are priced as they are only because they're the fastest you can buy!
The thing is, we have been hearing for years that in a few years ARM will take over everything. Yet it hasn't.
Instead of supercomputing, I would foresee the lowest tier of rent-a-webservers moving to ARM.. what's a better business than renting a mach
Re: (Score:2)
Alpha used to be the fastest you could buy, and it used to be priced high too...
ARM is doing what x86 did to the high-end RISC CPUs of the '90s.
Re: (Score:2)
I for one am happy to see WinTel crumbling at both ends. Windows and x86, each as ugly as the other.
Re: (Score:2)
The x86 addressing modes are so powerful that they even created an instruction to leverage the address-generation logic without accessing memory...
The fact is that neither RISC nor CISC is best; a hybrid of the two is best. The problem with the RISC camp is that they can't make it hybrid while still being RISC, while the CISC camp hybridized long ago and even remained entirely compatible while do
Re: (Score:2)
No. Alpha anything was priced insanely.
There have always been cheap x86 chips. It's only the extreme high end that's been ridiculous. There has always been a sweet spot with x86 in terms of price and performance.
Although Alpha does provide a nice example of how performance per core trumps everything else. There were some problems you simply could not solve by throwing lesser CPUs at them, no matter how much you might have wanted to.
Re:IMHO - No thanks. (Score:5, Insightful)
For the record, I think being able to address a bit more memory, and larger/faster storage channels, are what's holding back some of these systems... which isn't a problem at supercomputer scale... but for someone wanting to put together a small cluster, it gets irritating.
Re: (Score:3)
A single Sandy Bridge system will outperform many dozens of Raspberry Pis.
Re: (Score:2)
With sufficient abstraction.
Re: (Score:3)
50 ARM CPUs, eh? The problem comes down to finding something that can scale to that many CPUs.
Well, the article is about ARMs being used in supercomputers, so scalability is probably not going to be a problem.
Re: (Score:2)
Those are generally the problems people run on massively parallel supercomputers.
Re: (Score:3)
"For supercomputers? Battery life isn't a term."
You say that until the power grid fails and your generator fails to kick on, leaving you with only battery backup in place.
Re: (Score:2, Interesting)
Architecture is complicated. But in terms of ops per mm^2, ops per watt, ops per $, or cycles per useful op, the x86 architecture is a heinous pox on the face of the earth.
Worse yet, your beloved x86 doesn't even have any source implications; it's just a useless thing.
Re:IMHO - No thanks. (Score:5, Informative)
Architecture is complicated. But in terms of ops per mm^2, ops per watt, ops per $, or cycles per useful op, the x86 architecture is a heinous pox on the face of the earth.
Worse yet, your beloved x86 doesn't even have any source implications; it's just a useless thing.
In TFA's slides 10 and 11, Intel i7 chips are shown to be more efficient in terms of performance per watt than ARM chips. However, they're close to each other and Intel's prices are significantly higher.
Re: (Score:2)
Is it worth the wait for the next gen of low power chips to arrive?
Re: (Score:2)
Useless for what you do. The second performance... not performance per watt... PERFORMANCE becomes an issue, ARM is a steaming pile of shit and you know it. If you're doing anything more than what the above AC said (keep playing Sudoku and Portal) it can't handle it. How about everyday consumers who need a tablet that can actually do work? A gimp version of Windows is not going to get the job done. Some of the Samsung Slate tablets however come with an x86... and are actually fully functional! Can you point to an ARM tablet that can do everything it can?
GNU/Linux on ARM (Score:3)
A gimp version of Windows is not going to get the job done.
On the other hand, a Windows version of GIMP does get a lot of jobs done that don't quite need Adobe Photoshop.
But seriously, the reason Windows RT is "gimped" is that Microsoft has refused to endorse recompiling desktop applications. That's not a failing of ARM (ARM ran RISC OS on Acorn computers) so much as a power grab by Microsoft.
Some of the Samsung Slate tablets however come with an x86...and are actually fully functional! Can you point to an ARM tablet that can do everything it can?
Some ARM tablets run Ubuntu [ubuntu.com]. Other Android tablets run Debian in a chroot, with video out through an X11 server app for Android. These can't run Windows application
Not only Performance per $ (Score:3)
Re: (Score:2)
Why did you even say this? "PC users" aren't even mentioned in this article. This article is about supercomputers, where the workloads are virtually by definition extremely parallel and the restrictions are around price and power consumption, not "FPS in a single game".
Re:IMHO - No thanks. (Score:4, Informative)
what do you think goes on at the other end of the copper/fibre cable?
No supercomputing whatsoever. I'm not a physicist, a mathematician, a code breaker, nor anyone else with supercomputing needs. My HTTP request for a web page is quite likely served by a single core. Maybe 2.
Re:IMHO - No thanks. (Score:4, Informative)
Granted, my experiences are pretty dated, and things have gotten better... for me, Linux is on the server(s) or in a virtual machine... every time I've tried to make it my primary OS, I've been met with heartache and pain. I replaced my main desktop a couple of months ago, and tried a few Linux variants. The first time, I installed on my SSD; then, when I plugged in my other hard drives, it still booted, but an update to Grub screwed things up and it wouldn't boot any longer. This was after 3 hours spent getting my displays working properly... I wasn't willing to spend another day on the issue, so back to Windows I went. I really like Linux, and I want to make it my primary desktop, but I don't have extra hours and days to tinker with problems an over-the-wire update causes... let alone the initial setup time, which I really felt was unreasonable.
I've considered putting it as my primary OS on my MacBook, but similar to Windows, the environment pretty much works out of the box, and brew takes things a long way towards how I want it to work. Linux is close to 20 years old... and still seems to be more crusty for desktop users than Windows was a decade and a half ago in a lot of ways. In the end, I think Android may be a better desktop interface than what's currently on offer from most of the desktop bases in the Linux community, which is just plain sad... I really hope something good comes out of it all. I don't like being tethered to Windows or OS X... I don't like the constraints... but they work, with far fewer issues, the biggest ones being security related... I think that Windows is getting secure faster than Linux is getting friendlier, or at least easier to get up and running with.
Re: (Score:3)
I had a desktop with two graphics cards in SLI, and two monitors
Given SLI barely works in Windows, expecting it to work in Linux was optimistic. I recently booted up a Linux Mint DVD on my laptop to try it out and... everything just works. Even using the 'recovery partition' to reinstall Windows on there takes over three hours, reboots about thirty times, and breaks with barely decipherable and completely misleading error messages if you installed a hard drive larger than the one that came with it.
Linux is close to 20 years old...
And the BSD core in MacOS is close to 40 years old.
Android would m
Re: (Score:2)
Yeah, yeah, you had no problems, therefore they don't exist. I wish Linux advocates would be more honest about its flaws. I think it's great, but it's nowhere near perfect. I swapped a Mint hard drive from another machine into this one and it works flawlessly, which Windows most certainly wouldn't; however, when I put Ubuntu on that other machine it was a nightmare.
Re: (Score:2)
Far more games are played on ARM CPUs than x86 CPUs these days. Of course the takeover started at the bottom end with Snake, and moved on through Angry Birds etc.; it's only a matter of time before ARM takes over the hardcore gamers too. It's more a matter of having a platform with a big screen and interesting controllers. ARM CPUs are already up to the task of running such systems.
Re:IMHO - No thanks. (Score:5, Funny)
The article is aimed at supercomputers, not commodity PCs. You are not the target.
While not the target, you'll be collateral damage anyway.
Re:IMHO - No thanks. (Score:5, Interesting)
Damage or a winner? I feel so bad about having a cheap, efficient, and above all, quiet box.
I bought this [hardkernel.com] 4*2GHz baby, and the only reason it's not my main desktop yet is a weird and asinine requirement for the monitor resolution to be exactly 720 or 1080 (WTF?!?). I think I'll replace my old but perfectly working pair of 1280x1024 monitors (I hate 16x9!), and put the big loud clunker in the cellar. I just hate the noise so much. x86 machines with no moving parts are extremely hard to get, and have terrible performance/price. Anything that requires lots of processing power (compilation, running Windows VMs, etc.) can be done remotely from the cellar just as well, while a 2 GHz ARM is fast enough to do client stuff, running a browser being the most demanding part.
And what else do you need to reside directly on the machine you plop your butt at?
Re: (Score:3)
I feel so bad about having a cheap, efficient, and above all, quiet box.
So do I. I can't even hear my i7 machine when playing games on it, whereas the old Pentium-4 sounded like a vacuum cleaner.
Re: (Score:2)
If it's the OP AC, whinging about how his games don't work well on ARM - then it's damage (not that I regret it).
If it's you (thanks for the link: nice to see others on top of RasPi) or me - then it's winning.
Speaking of quiet: I recently bought a ProLiant MicroServer for the "home FS"/NAS - at 15 W for the Turion and the 4 NAS-grade WD HDDs... mums, I can't hear it (under 60 W at peak use). I would have gone with an ARM board, but couldn't find enough support for NAS-ing (not when RAID-ing, anyway).
btw: I
Re: (Score:3, Insightful)
A single 4-core ARM A15 running 1.5 GHz per core blows away any competing chip at the same specs, on power AND price. It's not limited to the calculations x86 is and can process graphics and physics better as a result.
Translation: It gets raped sideways on single-threaded performance and you have to double up on sockets right out of the gate.
There's a bit of a misconception about ARM and x86. ARM wins on watts/socket and MHz/watt, but Intel's i7s cream ARM on performance/watt; once you account for those two factors, ARM isn't as competitive as you might think. Now, I'm not saying it isn't competitive, just that it's nowhere near as one-sided as you might be led to believe by cherry-picking.
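A tiny illustration of how those three metrics can disagree; every number below is invented purely to show the effect and is not a measurement of any real i7 or A15:

    # Invented figures, only to show that the ranking flips with the metric chosen.
    big   = {"name": "big x86 socket",   "gflops": 100.0, "watts": 80.0, "ghz": 3.5}
    small = {"name": "small ARM socket", "gflops": 6.0,   "watts": 6.0,  "ghz": 2.0}

    for chip in (big, small):
        print(chip["name"],
              chip["watts"], "W/socket",                             # small chip wins
              round(chip["ghz"] * 1000 / chip["watts"], 1), "MHz/W",  # small chip wins
              round(chip["gflops"] / chip["watts"], 2), "GFLOPS/W")   # big chip wins

Quote only the first two ratios and the small chip looks unbeatable; quote the last one and the big socket wins. That is the cherry-picking being complained about.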
Re: (Score:2)
..if it runs x86 native, isn't it an x86 CPU?
You look like an idiot who read some hyped-up article a few years back and is still waiting for it to come true. Keep waiting! Like for the magic parallelism! (Plenty of games utilize parallel code nowadays.)
Does it really matter? (Score:5, Interesting)
Most of the actual processing power in current supercomputers comes from GPUs, not CPUs. There are exceptions (that all-SPARC Japanese one, or a few Cell-based ones), but they're just that, exceptions.
So sure, replace the Xeons and Opterons with Cortex-A15s. Doesn't really change much.
What might be interesting is a GPU-heavy SoC - some light CPU cores on the die of a supercomputer-class GPU. I have heard Nvidia is working on such (using Tegra CPUs and Tesla GPUs), and I would not be surprised if AMD is as well, although they'd be using one of their x86 cores for it (probably Bulldozer - damn thing was practically built for heavily-virtualized servers, not much different from supercomputers).
Re:Does it really matter? (Score:5, Informative)
Also, a lot of algorithms, perhaps even most, rely on branching, which is something GPUs suck at. And only some can be reasonably rewritten in a branchless way.
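As a loose illustration of what a "branchless rewrite" means, here is a NumPy sketch (CPU code, only an analogy for what keeps GPU lanes from diverging; assumes NumPy is available): the per-element if/else is replaced by evaluating both alternatives everywhere and selecting with a mask, which is essentially what a GPU ends up doing with a divergent warp anyway.

    import numpy as np

    x = np.random.rand(1_000_000)

    # Branchy version: a data-dependent decision per element.
    def branchy(values):
        out = np.empty_like(values)
        for i, v in enumerate(values):
            out[i] = np.sqrt(v) if v > 0.5 else v * v
        return out

    # Branchless version: compute both sides, then select with a mask.
    def branchless(values):
        return np.where(values > 0.5, np.sqrt(values), values * values)

Plenty of algorithms reduce to a select like this; the ones that genuinely need irregular, data-dependent control flow are the ones the parent is talking about.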
Re:Does it really matter? (Score:5, Funny)
Also, a lot of algorithms, perhaps even most, rely on branching, which is something GPUs suck at. And only some can be reasonably rewritten in a branchless way.
Nonsense, I play Far Cry 3 on my GPU, and it renders branches just fine, thank you very much.
Re: (Score:2)
Isn't the ironic thing here that ARM is also not very good at branching? No branch prediction; that at least used to be the case.
Sure, but (Score:2)
An ISA is only as good as... (Score:2)
Re: (Score:2)
It really doesn't seem like portability should be a huge goal for writing code for top-100 supercomputers. The cost of the computer would dwarf (or at least be a significant portion of) the cost of developing the software for it. It seems like writing purpose-built software for this type of machine would be desirable.
If you can cut the cost of the computer in half by doubling the speed of the software, it seems a valid fiscal tradeoff, and the way to do that would be to write it for purpose-built hardware.
Re: (Score:3)
System and numerical libraries and compilers are of course written specifically for the machine. But user-level apps (and a lot of scientific computing uses finished apps) are ported across multiple systems.
Portability is not as big an issue as it was a generation ago, as most supercomputers basically are Linux machines today, and are made to more or less look like a typical Linux installation from a user-application level, with a POSIX API; pthreads, OpenMP and OpenMPI; a standard set of numerical libraries; a
Re: (Score:2)
It depends on what you are doing. If you have a relatively short-term project (say, less than a couple of years), you are right.
I've got to take issue with this statement. Anything that takes over a couple of years probably should not be started on new silicon, as it doesn't make sense to start it yet due to Moore's law. The guy who starts the same project a year from now, using the same amount of money that you used, will beat you to the final calculation and get the hookers and blow that you thought you deserved.
The only time it makes sense is when the hardware is otherwise at end of life, so that there is no longer an initial inv
Re: (Score:2)
It's the same for ARM. Java doesn't run properly yet because of the floating-point limitations of ARM.
Re: (Score:2, Insightful)
False. According to the Top 500 computer survey from November, 2012 (Category: Accelerator/Co-Processor), 87% of systems are not using any type of GPU co-processor, and 77% of the processing power is coming from the CPU.
This is, however, a decrease from the June 2012 survey, so GPU is certainly making inroads, but it is not yet the main source of computation.
http://www.top500.org/statistics/list/
I still remember when the IBM Blue Gene architecture came out, using embedded PowerPC processors, and it was a huge po
Re:Does it really matter? (Score:5, Informative)
On the last published Top500 list, 7 out of the top 10 had no GPUs. This is a clear indication that while GPU is definitely there, claiming 'most of the actual processing power' is overstating it a touch. It's particularly telling that there are so few, as overwhelming the specific HPL benchmark is one of the key benefits of GPUs. Other benchmarks in more well-rounded test suites don't treat GPUs so kindly.
Re: (Score:2)
These ARM cores are halfway between the extremely limited GPU cores and the extremely flexible x86 cores. They may be the "happy medium".
Not at all. They are much more like slow x86 processors. They can branch just as well, but are much slower and don't have a narrow, very-high-performance sweet spot like GPUs do.
I somewhat expect AMD's new unreleased APUs to be the happy medium. Not as much grunt or memory bandwidth as a discrete GPU, but still some stream processors and much easier to program.
Re: (Score:2)
Not really. The main difference between ARM and x86 cores in this application is that ARM has an equally flexible but lower-performance ALU. For scientific applications that is a good trade-off, because performance tends to be mostly dependent on the FPU and on things like network and memory latency.
In other words, it is hard to max out an x86 core constantly in a supercomputer, so much of its performance is unused. ARM does away with the bits that are less critical, which results in lower power consumption and
Questions... (Score:5, Interesting)
As I understand it, Intel still has the advantage in the performance per watt category for general processing and GPUs have better performance per watt IF you can optimize for that specific environment--both things which have been commented to death endlessly by people far more knowledgeable than I.
However, to me there are at least 3 questions unanswered:
1. ASICs (and possibly FPGAs): Bitcoin miners and DES breakers are the best-known examples. Where is the dividing line between your operations being specific enough to employ an ASIC vs. not specific enough and needing a GPU (or even a CPU)? Could further optimization move this line more toward the ASIC?
2. Huge dies: This has been talked about before, but it seems that, for applications that are embarrassingly parallel, this is clearly where the next revolution will be, with hundreds of cores (at least, and of whatever kind of "core" you want). So when will this stop being vaporware?
3. But what do we do about all the NON-parallel jobs? If you can't apply an ASIC and you can't break it down, you're still stuck at the basic wall we've been at for around a decade now: where's Moore's (performance) law here? It would seem the only hope is new algorithms: TRUE computer science!
Re: (Score:2)
In ASICs, ARM is an ideal choice because you can build it right into the chip from a reference design. A lot of ASICs feature an 8502 core for management and I/O tasks, but if you needed to execute a more complex application, then a simple ARM core running Thumb, or even a full 32-bit ARM core, would be ideal.
Re: (Score:2)
A lot of ASICs feature an 8502 core for management and I/O tasks
I thought only a Commodore 128-on-a-chip would have an 8502 core [wikipedia.org]. What am I missing?
Re: (Score:3)
The reason for the question is that nothing in Moore's law says anything about single-threaded performance doubling every 1.5 years, as many think.
Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years.
Re: (Score:2)
Did you see where he put "(performance)" in there?
So, when can I buy an ARM ATX board? (Score:2)
Hopefully this means we should start seeing ARM-using motherboards in an ATX form factor. The Pi and BeagleBone are nice, but I want something that's essentially just like a commodity x86 motherboard except it uses ARM.
Re: (Score:2)
Hopefully this means we should start seeing ARM-using motherboards in an ATX form factor. The Pi and BeagleBone are nice, but I want something that's essentially just like a commodity x86 motherboard except it uses ARM.
Why? Is Mini-ATX not good enough for a commodity MB? 'Cause you don't need high google-fu to find heaps of them.
Re: (Score:2)
Mini-ATX or Mini-ITX will do fine. I just haven't seen any that have the kinds of things you take for granted on x86 boards. I want an ARM board with SATA ports, PCIe slots, and DIMM (or SODIMM) slots. Is that too hard to produce? I don't see anything like this anywhere.
Re: (Score:2)
Ditto. I went looking for an ARM board last time I built a home server, but found nothing that could compete in the slightest against a $90 Atom board.
No, they won't. (Score:5, Informative)
Current ARM processors may indeed have a role to play in supercomputing, but the advantages this article implies don't exist.
Go look at performance figures for the Cortex-A15. It's *much* faster than the Cortex-A9. It also draws far more power. There's a reason why ARM's own product literature identifies the Cortex-A15 as a smartphone chip at the high end, but suggests strategies like big.LITTLE for lowering total power consumption. Next year, ARM's Cortex-A57 will start to appear. That'll be a 64-bit chip, it'll be faster than the Cortex-A15, it'll incorporate some further power efficiency improvements, and it'll use more power at peak load.
That doesn't mean ARM chips are bad -- it means that when it comes to semiconductors and the laws of physics, there are no magic bullets and no such thing as a free lunch.
http://www.extremetech.com/computing/155941-supercomputing-director-bets-2000-that-we-wont-have-exascale-computing-by-2020 [extremetech.com]
I'm the author of that story, but I'm discussing a presentation given by one of the US's top supercomputing people. Pay particular attention to this graph:
http://www.extremetech.com/wp-content/uploads/2013/05/CostPerFlop.png [extremetech.com]
What it shows is the cost, in energy, of moving data. Keeping data local is essential to keeping power consumption down in a supercomputing environment. That means that smaller, less-efficient cores are a bad fit for environments in which data has to be synchronized across tens of thousands of cores and hundreds of nodes. Now, can you build ARM cores that have higher single-threaded efficiency? Absolutely, yes. But they use more power.
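A toy model of that data-movement cost; the picojoule figures below are invented stand-ins (the real per-op and per-byte numbers are in the linked graph and are not reproduced here):

    # Invented energy constants, for illustration only.
    PJ_PER_FLOP = 10.0            # assumed cost of one arithmetic op
    PJ_PER_OFFCHIP_BYTE = 100.0   # assumed cost of moving one byte between nodes

    def job_energy_pj(flops, offchip_bytes):
        return flops * PJ_PER_FLOP + offchip_bytes * PJ_PER_OFFCHIP_BYTE

    # The same 1e12-flop job; spreading it over more, weaker nodes forces more
    # synchronization/halo traffic to cross node boundaries.
    print(job_energy_pj(1e12, 1e9))    # few fast nodes:  ~1.01e13 pJ
    print(job_energy_pj(1e12, 1e11))   # many slow nodes: ~2.0e13 pJ, movement dominates

The arithmetic is identical in both cases; the energy difference is entirely the extra bytes that have to leave the chip, which is the point of the graph.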
ARM is going to go into datacenters and supercomputers, but it has no magic powers that guarantee it better outcomes.
Re: (Score:2)
Didn't Intel say that bringing down the cost and improving the performance of the interconnect was the goal of silicon photonics, and that they are now very close to mass production?
However, I don't know how power efficient it is.
Could silicon photonics help close that gap ?
That's what is so funny to me (Score:5, Insightful)
Slashdot seems to have lots of ARM fanboys that look at ARM's low power processors and assume that ARM could make processors on par with Intel chips but much more efficient. They seem to think Intel does things poorly, as though they don't spend billions on R&D.
Of course that raises the question of why ARM doesn't, and the answer is they can't. The more features you bolt onto a chip, the higher the clock speed, and so on, the more power it needs. So you want 64-bit? More power. Bigger memory controller? More power. Heavy-hitting vector unit? More power. And so on.
There's no magic ju ju in ARM designs. They are low-power designs, in both senses of the term. Now that's wonderful; we need that for cellphones. You can't be slogging around with a 100 watt chip in a phone or the like. However, don't mistake that for meaning that they can keep that low consumption and offer performance equal to the 100 watt chip.
Re: (Score:2)
The point is that an ARM processor can provide, say, 75% of the performance for 25% of the power compared to x86. You can see it in tablet computers, particularly those running Windows RT or Ubuntu, where a direct comparison is possible. Since most of the bottlenecks are not due to processing power but rather disk, RAM, graphics rendering, network, etc., you very quickly reach the point of diminishing returns with increasing CPU performance.
In the case of supercomputers the same thing applies. You might want
Re: (Score:2)
Yeah well, we'll see when it does 75% of the performance for 25% of the power. It doesn't. You can't see it in tablets right now. That's what the next gen is supposed to fix. But the next-gen ARM design is going to use more power to get there.
(Incidentally, memory access, network, etc. are all slower on ARM, and for most supercomputing they do matter.)
It is a bit boring to have been reading these articles for a decade now, though. "Intel is dead due to ARM in two years!! Yeehaw!!" They were even more boring back in the day when Intel was m
Re:That's what is so funny to me (Score:4, Interesting)
There's no magic ju ju in ARM designs.
The magic ju ju is the ARM business model. There is one trump card ARM holds that precludes Intel from many portable devices: chip makers can build custom SoCs in-house with whatever special circuits they want on the same die. Intel doesn't do that and they don't want to do it; it would mean licensing masks to other manufacturers like ARM does. For example, the Apple A5, manufactured by Samsung, includes third-party circuits like the Audience EarSmart noise-cancellation processor, among others. It is presently not feasible to imagine Intel handing over masks such that Apple could then contract with some foundry to manufacture custom x86 SoCs. This shuts Intel out of many portable use cases.
That feature of the ARM business model might be very useful to large scale computing. One can imagine integrating a custom high-performance crossbar with an ARM core. Cores on separate dies could then communicate with the lowest possible latency. Using a general purpose ARM core to marshal data to and from high-performance SIMD circuits on the same die is another obvious possibility. A custom cryptography circuit might be hosted the same way.
Contemporary supercomputers are great aggregations of near-commodity components. However, supercomputing has a long history of custom circuit design, and if the need arises for a highly specialized circuit, then a designer may decide that integrating an ARM core to do the less exotic legwork computing that is always necessary is a good choice.
I want (Score:3)
Also, by breaking up the system into physically separate CPUs, I suspect that an interesting memory-access architecture could be conjured up, preventing another potential choke point.
Re: (Score:2)
Also, by breaking up the system into physically separate CPUs, I suspect that an interesting memory-access architecture could be conjured up, preventing another potential choke point.
I suspect you mean it would have to be conjured up, or you'd spend all the time waiting to access RAM on other cores rather than doing anything useful.
Re: (Score:2)
Your description sounds like you would benefit more from them having separate memories as well. Otherwise a "big powerful Intel" would fit the bill, getting higher throughput of requests.
Xilinx Zynq anybody? (Score:5, Informative)
Has anybody else seen/considered the Xilinx Zynq [xilinx.com]? It's a mix of ARM cores and FPGA fabric, which could be interesting in supercomputing solutions.
For anyone willing to tweak around with it, there are development boards around like the ZedBoard [zedboard.org], which is priced at US$395. Not the cheapest device around, but for anyone willing to learn more about this interesting chip it is at least not an impossible sum. Xilinx also has the Zynq®-7000 AP SoC ZC702 Evaluation Kit [xilinx.com], priced at US$895, which is quite a bit more expensive and not as interesting for hobbyists.
Done right, you may be able to do a lot of interesting stuff with an FPGA a lot faster than an ordinary processor can, and then let the processor take care of the stuff where performance isn't critical.
Those chips are right now starting to find their way into vehicle ECUs [xilinx.com], but it's still an early phase, so there aren't many mass-produced cars with them yet.
As I see it, supercomputers will have to look at every avenue to get maximum performance for the lowest possible power consumption - and avoid solutions with high power consumption in standby situations.
Not this week... (Score:2)
I am a fanboy of the small ARM boards... I have built an MPI cluster out of Raspberry Pi boards, and it is not even close, except as a teaching exercise, where it excels.
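For context, the trivial MPI check such a Pi cluster runs looks something like this; a minimal sketch assuming mpi4py is installed on every node, launched with something like mpiexec -n 4 python hello.py:

    # hello.py -- minimal mpi4py smoke test for a small ARM cluster.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()              # this process's id within the job
    size = comm.Get_size()              # total number of MPI processes
    node = MPI.Get_processor_name()     # hostname of the board it landed on

    print("rank %d of %d on %s" % (rank, size, node))

    # One small collective so the run actually touches the interconnect.
    total = comm.allreduce(rank, op=MPI.SUM)   # sum of 0..size-1, same on every rank
    if rank == 0:
        print("allreduce total:", total)

Even this much shows ranks, hostnames and a collective operation, which is plenty for a teaching cluster.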
However, many site services can be dedicated to these little boards where corporate IT seems to dedicate virtual machines.
Department web servers... with mostly static content... via NFS or a revision control system like hg.
Department and internal caching name servers... NTP servers and managed central storage for each bu
Exactly. (Score:2, Interesting)
This isn't to say that ARM *can't* be there, but thus far all of the implementations have focused on 'good enough' performance within a tightly constrained power envelope. Intel's designs have traditionally been highly inefficient in that power band, but at peak conditions they are still compelling.
I recall one 'study' which claimed to demonstrate ARM as inarguably better. It got way more attention than it should have. The reason being that they measured the performance on the ARM test, but just *
Re: (Score:2)
I'm thinking you don't understand. The whole "shared memory" thing is not exclusive to x86 cores. At some level it's a software abstraction relating to latency of storage. GPUs can have terabytes of RAM too as a sixth level cache.
Intel really needs some help here because the ground has shifted too much for them.
I understand all right - try reading full posts! (Score:2)
Re:Not buying it. (Score:4, Informative)
I don't buy your response: http://top500.org/statistics/list/ [top500.org] ... click Accelerator and hit submit.
87.6% of the top 500 supercomputers have no NVIDIA etc. co-processing.