


HP Announces ARM-Based Server Line

sammcj writes with news that HP is developing servers based on 32-bit ARM processors from Calxeda. Their current model is only a test setup, but they plan to roll out a finalized design by the middle of next year. "HP's server design packs 288 Calxeda chips into a 4U rack-mount server, or 2,800 in a full rack, with a shared power, cooling, and management infrastructure. By eliminating much of the cabling and switching devices used in traditional servers and using the low-power ARM processors, HP says it can reduce both power and space requirements dramatically. The Redstone platform uses a 4U (7-inch) rack-mount server chassis. Inside, HP has put 72 small server boards, each with four Calxeda processors, 4GB of RAM and 4MB of L2 cache. Each processor, based on the ARM Cortex-A9 design, runs at 1.4GHz and has its own 80 gigabit cross-bar switch built into the chip"


  • SATA?! (Score:1, Insightful)

    by Anonymous Coward

    Come on, guys, it's 2011. We're talking servers here. Forget SATA; throw in native iSCSI support (or fibre channel, but iSCSI would probably be significantly easier - if only because it uses standard Ethernet ports, rather than needing extra protocol support), and you'll have something that's a serious contender in that space.

    Think about it: with SATA, you have a bunch of hard disks, probably mostly disused, almost all of them performing atrociously (SATA is notorious for only being good with large sequenti

    • by Junta ( 36770 ) on Wednesday November 02, 2011 @06:59AM (#37917962)

      FC/FCoE/iSCSI all deliver much much lower aggregate I/O performance than coordinated use of direct attached storage. Google, Hadoop, GPFS, Lustre all facilitate that sort of usage. In any of those remote-disk architectures you will have an I/O bottleneck somewhere along the line.

      That said, I would presume netboot at least would be there, and from there you can do iSCSI in software certainly. FCoE tends to be a bit pickier, so they may not be able to do that in the network fabric provided.

      On the whole, I'm still skeptical. So far ARM has proved itself where low power is critical and performance is secondary. I'm not sure if performance per watt is going to be impressive (e.g. if it hypothetically took 10% of the power of a competitor and gave 9% of the performance, that can work well for places like cell phones but perhaps not so much for a datacenter). ARMv8 may make things very interesting though...
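To make the hypothetical above concrete, here is a minimal sketch of the performance-per-watt arithmetic. The 10% and 9% figures are the hypothetical ones from the comment, not measurements, and the function name `relative_perf_per_watt` is made up for illustration.

```python
def relative_perf_per_watt(relative_perf: float, relative_power: float) -> float:
    """Efficiency of a chip relative to a baseline chip (baseline = 1.0)."""
    if relative_power <= 0:
        raise ValueError("relative_power must be positive")
    return relative_perf / relative_power


# Hypothetical figures from the comment: 9% of the performance at 10% of the power.
ratio = relative_perf_per_watt(relative_perf=0.09, relative_power=0.10)
print(f"perf/watt vs. baseline: {ratio:.2f}")

# How many such chips it takes to match one baseline chip, and what they draw in total.
chips_needed = 1 / 0.09
power_needed = chips_needed * 0.10
print(f"chips needed: {chips_needed:.1f}, total relative power: {power_needed:.2f}")
```

A ratio below 1.0 means that matching the baseline's throughput costs more total power, which is the datacenter concern; in a phone, where the absolute draw matters more than the throughput, the same chip can still be the right choice.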

      • by postbigbang ( 761081 ) on Wednesday November 02, 2011 @07:59AM (#37918272)

        You can argue, successfully, that with virtualization and multi-core relationships the ARM power argument is goofy, as the number of threads per process and virtualization favor the CISC architectures. The ARM infrastructure, however, could be the foundation for a couple of decent server product lines. The architecture cited is very much like getting a bunch of ARM CPUs together to do what more power-hungry quad/multi-core Intel and AMD chips are doing today. Remember: the ARM is 32-bit, and the number of threads is limited both by the inherent architecture and by the memory ceiling.

        What's scary to me is that someone wrote that it has a crossbar switch on it without understanding what that implies in terms of inter-CPU communications, cache, cache sync/coherence, etc. A well-designed system will perform almost as well with iSCSI (on a non-blocking, switched backplane) as it will with SAS so IO isn't quite the issue; the power claim vs thread density per watt expended claim has yet to be proven.

      • by Lumpy ( 12016 )

        BAH, why? Build a metric buttload of RAM on it and have it simply make snapshots of the ramdisks to rotating media when changes are made, using a coprocessor and letting the main process scream along. You get insane speeds and RAM is dirt cheap. If each processor had 64 gig of RAM, each could run 4 website VMs with plenty of memory and storage and still outperform the quad bonded OC48 connections into the server farm.

        This is how Comcast's video on demand system runs. Main spinning storage servers spool out to ra

        • by Junta ( 36770 )

          Keep in mind that many systems have many *terabytes* of data per compute node on local spindles. Boot volumes/partitions and many little apps may barely be a blip on the big drives of today, but a whole lot of stuff has a lot more data than you realize, *particularly* if they have a meaningful application of a distributed filesystem..

      • Not where I work, they don't. I/O on VM's (ESX, etc) is generally woeful, and it's significantly faster to pass through a FC card and access LUN's on a DMX or VMax than to use local storage. Hadoop uses local storage for a completely different reason.
        • by Junta ( 36770 )

          Ignoring virtualization overhead (which is a factor), if the storage is underutilized, yes a massive amount of cache/number of spindles a FC hop away in certain scenarios can blow away one or two local spindles. The problem is when you up utilization, the equation slips the other way. If you have low utilization or insane number of disks behind an FC compared to number of hosts in the SAN, the SAN can do better. Most places I see are heavily utilized on a relatively small amount of storage relative to nu

    • Has anybody seen the Googleplex "server" spec? from what little I've read, I'd assume they're on SATA.

      • It looks like a SATA cable [cnet.com] is used but Google doesn't mention it specifically in the article.
        • Isn't it well known that they use cheap disks?
        • by swalve ( 1980968 )
          I forget the exact specifics, but I think SAS can use the same cable. And those look like SCSI drives.
          • The functional difference between SATA and SAS is intelligence on the drive. SATA is dumb compared to SAS. The pinouts and cables and all that were designed to be interoperable. I think SAS drives can be dumbed down to SATA if you need to in a pinch, and SAS controllers can handle SATA drives natively (at least some can).

    • The fact that they've special-magic-backplane-fabric-ed away all the other busses, while leaving each card bristling with SATA connectors, seems rather weird, just because that's a lot of headers to bring out if nobody is going to use them, and it'll be a hell of a rat's nest if you actually try (could they really not have stretched their backplane fabric a little bit more, to include allocating direct attached storage to nodes across it?).

      The use of SATA, though, seems reasonable enough, given the low-per
  • Let's count - they have Xeon/Opteron, Itanium, and among their dead platforms, they have PA-RISC, Alpha (DEC/Compaq) and MIPS (Tandem/Compaq). What made them pick this for servers?

    Would one be right in guessing that their Itanium based Integrity servers have been a disaster?

    • Sounds like a good choice for file servers.
    • by jimicus ( 737525 )

      HP's balance sheet is up and down like a whore's drawers - one quarter they make a stonking loss, the next they're making solid profits. They haven't been consistent in years.

      Their core businesses are being eaten away by ever-tougher competition; the days when you could confidently recommend an HP inkjet are long gone (have you seen their software suite lately? Multi-function devices are even worse because with them you often can't install just the bare driver and have it work); I wouldn't be surprised if s

    • Let's count - they have Xeon/Opteron, Itanium, and among their dead platforms, they have PA-RISC, Alpha (DEC/Compaq) and MIPS (Tandem/Compaq). What made them pick this for servers?

      You can already add ARM to the mix. Their current crop of low power thin clients are ARM based:
      http://h10010.www1.hp.com/wwpc/us/en/sm/WF05a/12454-12454-321959-338927-3640405-4063703.html [hp.com] (Wow, nice memorable URL!)

      • Why would you need a URL, just call your rep for a quote! We understand the internet!
      • This is an example of how badly corporate sites fuck it up (my current employer is a perfectly good example).

        The browser tells you which language is preferred - there's no need to hardcode it in a URL. And if they want to switch/override, put it in a fucking cookie.

        www.hp.com/products/PRODUCTNUM. WTF is so hard about that?
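The "browser tells you which language is preferred" point refers to the HTTP `Accept-Language` request header. A minimal sketch of reading it on the server side follows; the function name `preferred_languages` and the sample header value are made up for illustration, and a real site would also need a fallback for languages it does not serve.

```python
def preferred_languages(accept_language: str) -> list[str]:
    """Return the language tags from an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(accept_language.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        if not tag:
            continue
        quality = 1.0  # a missing q parameter means q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


print(preferred_languages("en-US,en;q=0.8,es-AR;q=0.9"))
```

With the language negotiated this way (and an override stored in a cookie, as suggested above), the product URL itself can stay short and language-free.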
        • by hjf ( 703092 )

          my old (old!) hp cd-writer (1998) said: www.hp.com/go/storage
          why they fuck up with the wwwXX server stuff like it's 1995 and there's no load balancing, I don't know.

    • Let's count - they have Xeon/Opteron, Itanium, and among their dead platforms, they have PA-RISC, Alpha (DEC/Compaq) and MIPS (Tandem/Compaq). What made them pick this for servers?

      Would one be right in guessing that their Itanium based Integrity servers have been a disaster?

      It is entirely possible that their Itanium units haven't been doing so hot (though, from what I've read, it's more a case of a 'small number of cost-insensitive customers', which is why neither HP nor Intel can just shoot the program in the head, but also why they can't seem to get it to expand and gain any economies of scale).

      However, the fate of Itanium and the fate of this curious box should be almost 100% unconnected with one another: The two are about as different in design and intended workload as two servers co

  • by Viol8 ( 599362 ) on Wednesday November 02, 2011 @05:49AM (#37917600) Homepage

    With the world moving to 64 bits to accommodate huge databases in memory and on disk they must be aiming for low hanging fruit here. Still, I'd like to get hold of one IF they ever convert it into a desktop version - would be nice to have a linux installation at home that doesn't pay homage to wintel in any way.

    • Not just that, what does ARM have that the other processors of HP don't? Even if one doesn't count PA RISC and Alpha, which are dead, HP could still use MIPS processors in their platforms. And how would Xeons be any worse?
      • My guess is popularity (compared to MIPS), and power consumption (compared to Xeons)

      • It has yet to be demonstrated that MIPS will scale as well as ARM.

        PA-RISC and Alpha should not even be mentioned, since they are dead, though everything relevant about the Alpha lives on at AMD.

        Xeons are horribly power-inefficient and always have been.

        • Do you remember SGI? Their lineup was all MIPS prior to the ORIGIN 4000 and Altix lines. Those were capable of scaling up to thousands of processors.
          • I'm not talking about fetishistically throwing more cores at the problem, I'm talking about minimizing the number of cores in a multi core system. I'm talking not about the number of processors, but about the speed of individual processors. What's the fastest MIPS core you've ever seen? Was it a particularly good core? Yeah, OK, that's what I thought.

            • Not even SGI, remember Tandem's Himalaya NonStop servers - the S series used up to the R14000 processor? Those were MIPS based as well, and scaled pretty well, and that is a platform HP still owns. The fastest MIPS was the R10000 and above, and they were pretty competitive - when the R10000 surfaced, it was about the same as the PA 8000, but slower than Alphas. HP already has several server models that it could use, even w/o considering PA RISC and Alpha.
            • Of course, calling multi-core computing a fetish is ridiculous and ignores the fact that, barring some amazing new physics, processors can only get so fast. Scalability is not about having one 10 GHz CPU that costs $100,000,000 but having 12 3 GHz CPUs for a thousand bucks.

              And outside the datacenter, the idea of screaming CPUs just seems retarded in this day and age when even a 10 year old processor can handle a typical workload from today's typical user without even straining.

      • by necro81 ( 917438 )
        Multi-source supply - ARM processors are produced by lots of companies. And although Calxeda is the only source of these new server-intended ARM processors, they are only the first.
      • by Surt ( 22457 )

        Cheapness in bulk.

    • by janoc ( 699997 )

      Easy - ARM doesn't yet have 64bit cores available, they were only recently announced. It will take a while until the manufacturers license them, integrate them into their products and only then can HP buy them and build a server around them.

      From the looks of it, this prototype machine is unlikely to be built for databases (4GB of RAM per chip is not a lot for something like Oracle), so the 32bit limit is not really an issue. On the other hand, this screams HPC cluster/supercomputing or some other well paral

      • All very good - but what about the software? What software are they going to offer on ARM that's not already on Xeon (which itself is both 32-bit & 64-bit flavors)? And what performance advantage will ARM bring? If it's power consumption, how compelling is the argument to switch to a completely new platform w/ little supported software (no, Android apps don't count) and no performance advantages just to lower the electric bills? HP might as well have worked w/ either Intel or AMD to get lower powere
        • by Anonymous Coward

          TheRegister had the best analysis of what the Sales pitch for one of these is:

          "The sales pitch for the Redstone systems [the HP hyperscale offering with the EnergyCore ARM boards], says Santeler, is that a half rack of Redstone machines and their external switches implementing 1,600 server nodes has 41 cables, burns 9.9 kilowatts, and costs $1.2m.

          A more traditional x86-based cluster doing the same amount of work would only require 400 two-socket Xeon servers, but it would take up 10 racks of space, have 1,6

          • If....if...if...you have access to the source code, have software vendors working (or willing to work) on a recompile, or an in-house development team who is familiar with ARM architecture, to include best practices to get the highest performance. This is the Achilles' heel, really. You toss a stone and you will hit a halfway-competent developer who understands X86...not so easy with any of the RISC architectures, and to find efficient coders working with ARM processors, you are going to have to go shopping
            • by janoc ( 699997 )
              That's a red herring. For the majority of Linux applications you *do have* source code, thanks to the OSS licensing. And you won't even have to recompile, there are distros targeting ARM already. The only exceptions are proprietary applications like Oracle, SAP or Exchange, but this machine isn't designed for such workloads (Oracle needs more memory, SAP and Exchange are Windows-only).

              Regarding development - development for Linux on ARM is exactly the same as development for Linux on x86 and very similar to an

            • Remember the early days of Linux? Silly people trying to run a wanna-be Unix on their little piddly home computers using the ridiculous Intel architecture. What a bunch of tards. Those little pissant boxes didn't even have SCSI*, and certainly didn't sport the massive RAM expansion that a real computer like a VAX could boast.

              Of course, the naysayers from back then are all retired now, and those piddly X86 machines run practically all the servers on the planet, and that OS has turned into a multi-billion

        • by Pieroxy ( 222434 )

          Linux provides good software for servers. Ubuntu even has released Ubuntu-server for ARM.

          As far as performance per Watt, that's the key point and it is missing from the article. A pity.

          That said, what makes an architecture successful? I think it's the amount of R&D that everyone puts in it. x86 has seen obscene amounts of R&D (as compared with other platforms). ARM is getting a fair share with all the smartphones and tablets nowadays. So in my view, it is much much much better to bet on ARM for the

        • by janoc ( 699997 )
          FYI - ARM has been well supported by Linux for ages, not only by Android. These CPUs have been around for a very long time, probably longer than Intel's Xeon. So while you probably won't run your Exchange or IIS on such a machine in the near future, it will do just fine for everything else. There are plenty of uses for non-Windows servers ...
    • Re: (Score:3, Insightful)

      by Imbrondir ( 2367812 )

      In 2010 ARM announced a 40-bit physical address extension (LPAE) for 32-bit ARMv7. That's 1 Terabyte of RAM. Which should be enough for everybody :)

      On the other hand ARM a couple of days ago announced 64 bit ARMv8. But you probably can't buy one of those for 6-12 months or so. Perhaps HP is simply using ARM chips available now more as a pilot for when the knight in full shining 64 bit address space comes along.

      • by Bengie ( 1121981 )

        40bit addressing on a 32bit CPU takes a hefty performance penalty when switching 4GB "views" as the CPU can still only see 4GB at a time. Since the CPU can't see the memory as one flat memory range, it has to waste time copying certain things between these "views". It also increases the complexity of the code since any single app trying to use more than 4GB will have to manually manage these "views". So a pointer to a memory location may be valid in one view, but not another. Fun times.

        • Thanks. I actually hoped somebody would elaborate on the downside of the 'hacky' 40 bit solution. I knew there'd be some.

          Though outside of big databases, the hacky solution still sounds useful. For example on application servers, or cheap LAMP hosting, a massive amount of cool low power cores might perform better

          • by Bengie ( 1121981 )

            That was based on my general understanding. I would not "quote" what I said.. :p

            And yes, it definitely "works", especially with apps that don't need more than your 32bit range. The OS can transparently handle a sum of more than 4GB of app allocated memory. So any single app may not use more than your standard 4GB, but all of the apps together could. It wouldn't be as fast as a native 64bit CPU, but it would be close enough.

            If a single app needed more than 4GB, it would have to make use of special calls to

        • by mzs ( 595629 )

          Though you can then have more than 4GB of RAM in a system and each process has a 32bit VA space. This works for processes that do not need to mmap gigantic things; only the OS moves the 'views' around for the processes, and you can run more io bound processes without paging.
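The numbers behind this sub-thread can be checked with a few lines of arithmetic. This is only a sketch of the address-space sizes being discussed (40 bits of physical addressing, 32 bits of virtual addressing per process); it says nothing about how any particular kernel lays out memory, and the variable names are illustrative.

```python
GIB = 2**30
TIB = 2**40

physical_bits = 40  # extended physical addressing discussed above
virtual_bits = 32   # what a single process can address

physical_limit = 2**physical_bits
per_process_limit = 2**virtual_bits

print(f"physical limit:    {physical_limit // TIB} TiB")
print(f"per-process limit: {per_process_limit // GIB} GiB")
print(f"full 4 GiB address spaces that fit in RAM at once: "
      f"{physical_limit // per_process_limit}")
```

So the machine as a whole could hold far more than 4 GiB, while each individual process stays inside an ordinary 32-bit address space, which is the case described in the comment above.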

      • That's 1 Terabyte of RAM. Which should be enough for everybody

        I think I've heard that before. It wasn't true then, it isn't true now. I ALREADY have systems with 265 GB Ram in them, and looking to get even more.

        • I think I've heard that before. It wasn't true then, it isn't true now.

          You left off the emoticon in his quote before you deadpan refuted his sarcasm.

    • by raddan ( 519638 ) *
      There are plenty of applications that don't need to be able to address 64 bits worth of memory. Think webapps. Lots of cores with fast I/O are what you want. Core speed itself is less important since you're usually I/O bound.
    • by Alioth ( 221270 )

      Not all servers accommodate huge databases. There are plenty of servers that have to service high numbers of users for tasks which are not computationally or memory intensive. 32 bit is likely to be better for these kinds of tasks.

    • Each thread/process deals with a 32-bit slice of a larger processing domain. Even when working with huge databases, there's no reason that each processing node of it can't work well within 1GB of RAM. (It seems there are 4 cores per 4GB of RAM).

      In the "many low-power CPU" strategy, saddling each CPU to work with 64-bit by default could be a real waste of memory bandwidth compared to the actual slice of the workload that it will get. But I expect this line to get full 64-bit just for ease & transparen

    • The low hanging fruit is probably 95% of the server market. Most servers sit around all day doling out a few files and maybe handling email. This could all have been done on a PDP-11 with plenty of juice left over.

      Whatever fantasy land you are living in sounds very hot and noisy. Take a look at how many machines in a typical corporate datacenter are running under any significant load sometime - it's usually only a few, if any.

  • Are we going back to transputers again, then?
    • Yes, but back is not the right word, since the idea (cheap, not very powerful, but many processors) never went away...

      Multicore is the same idea but on one chip

      Many of the world's fastest computers are based on this ... e.g. BlueGene/L 106,496 x PowerPC 440 700 MHz ...

      The issues are getting the processors/cores to take the load evenly, and writing the software with parallel running in mind. Many systems up until recently were bad at this ...

    • We would have to have used them to go back to them, but hardly anyone ever did, since the cost of the hardware AND the cost of the development were both staggering.

      Transputers lost out to software solutions.

  • Alike most DSLAMs (Score:5, Informative)

    by La Gris ( 531858 ) <lea@gris.noiraude@net> on Wednesday November 02, 2011 @06:02AM (#37917676) Homepage

    This type of setup is already used in most DSLAMs. Full rack, 2 PSUs, cooling, 24 or 48 port (x)DSL cards with ARM CPU as independent servers, internal management card and network switch. Think of blade server racks.

  • by bertok ( 226922 ) on Wednesday November 02, 2011 @06:24AM (#37917806)

    Those processors run at only about 1.1 GHz, and ARM isn't quite as snappy on a "per GHz" basis as a typical Intel core because of the power-vs-speed tradeoff, so I figure that a 1.1 GHz ARM quad-core chip has about the same compute power as a single ~3GHz latest generation Intel Xeon core.

    They say they can pack 288 quad core ARM processors into 4 rack units (with no disks). For comparison, HP sells blade systems that let you pack 16 dual-socket blades into 10 rack units. Populate each socket with a 10 core Intel Xeon, and we're talking 320 cores. So for comparison, that's the equivalent of 72 cores per rack unit with ARM, vs 32 with Intel. The memory density is the other way around, with 288 GB per rack unit for ARM, and 614 GB with Intel.

    So, if you have an embarrassingly parallel problem to solve that can fit into 4GB of memory per node, doesn't use much I/O, and can run on Linux, this might be a pretty good idea.
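The density figures in this comment can be reproduced with a short calculation. This is a sketch under the comment's own assumptions: one quad-core ARM chip is treated as roughly one Xeon core, and the 384 GB per blade is not stated anywhere in the thread; it is simply the value that reproduces the quoted ~614 GB per rack unit.

```python
# ARM chassis, figures from the summary: 288 quad-core chips, 4 GB each, in 4U.
arm_chips, arm_units, arm_gb_per_chip = 288, 4, 4
# Assumption from the comment: one quad-core ARM chip ~ one Xeon core.
arm_xeon_equiv_per_u = arm_chips / arm_units
arm_gb_per_u = arm_chips * arm_gb_per_chip / arm_units

# Blade enclosure, figures from the comment: 16 dual-socket blades,
# 10 cores per socket, 10 rack units.
blades, sockets, cores_per_socket, blade_units = 16, 2, 10, 10
xeon_cores_per_u = blades * sockets * cores_per_socket / blade_units
# Assumption: 384 GB per blade, chosen to match the quoted ~614 GB per U.
gb_per_blade = 384
xeon_gb_per_u = blades * gb_per_blade / blade_units

print(f"ARM:  {arm_xeon_equiv_per_u:.0f} Xeon-core equivalents/U, {arm_gb_per_u:.0f} GB/U")
print(f"Xeon: {xeon_cores_per_u:.0f} cores/U, {xeon_gb_per_u:.1f} GB/U")
```

Changing the equivalence assumption (for example, if a quad-core ARM chip is worth less than one Xeon core) scales the ARM compute figure directly, so the conclusion is only as good as that guess.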

    • BlueGene/L Each node is a Dual Core processor with 4MiB of memory (M not G) and they seem to do OK... it's a case of writing the software correctly to distribute it correctly ... That's a single system running one application

      But this is about servers not cores ... this gives you 288 servers per 4U chassis. Your blade solution gives you far fewer independent servers, with each server having many more cores and more memory, which is not the market they are aiming at ...

      • The limitation on "servers" per rack is not processors, and hasn't been in a while, it is RAM, at least where I work. We need higher density RAM capability.

    • by gl4ss ( 559668 )

      quad core arm still has nothing on a 3ghz intel, not even in things that parallelize highly; with floating point things go even worse (per the semi-recent tegra benches).
      it's kinda sad, really. there's a lot of nifty stuff that could be done realtime if per cpu-core(per thread) power went up. anyhow
      http://www.xbitlabs.com/news/mobile/display/20110921142759_Nvidia_Unwraps_Performance_Benchmarks_of_Tegra_3_Kal_El.html [xbitlabs.com]

      smack a lot of shitty cpu's in a small case and call it a day has been done before "supercompu

      • by bertok ( 226922 )

        I was thinking renderfarm myself, but a) that's 90% about the floating point performance, not integer, and ARM isn't stellar on floating point throughput, and b) a lot of scenes these days are greater than 4GB. While it may be possible to "tile" some scenes, the most compute expensive bit (that you'd want to accelerate the most) is global illumination, which basically needs the whole scene in RAM.

        Being forced to stay under some arbitrary scene complexity limit would suck, especially with tools like ZBrush t

      • but it's not for a single server that runs a renderfarm, that's not the target market. (for that you'd want dedicated graphics type chips anyway)

        These are to support cloud and web type servers. Note that the intent here is not to provide a single massively virtualized server that you cram hundreds of paying customers onto, but to create a single server that runs 4000 individual OSs. At 1.25W per OS that makes a huge cost saving for most datacentres that are filled with web servers that pretty much don't nee

    • by Anonymous Coward
      TFA says they run at 1.4GHz
    • So, if you have an embarrassingly parallel problem to solve that can fit into 4GB of memory per node, doesn't use much I/O, and can run on Linux, this might be a pretty good idea.

      I'd imagine people who do 'cloudy' things like remote voice recognition for cell phones are jumping up and down and not renewing all their rackspace commitments.

      Now, let's see if HP can actually deliver or if the 6th CEO from now fails to understand how this sells ink.

    • While comparing the performance specs is sexy and nerdish and l33t and all that - you leave off important bits, like power consumption and heat production. These matter in the real world of engineering data centers.

    • Another use case is that you have some massive web site like Facebook which creates and destroys a lot of little server processes which handle a few KB to a MB of data or so in each instance, then die. Today those processes are running on Intel machines and they are almost totally IO bound (network drives and all that) so it would be easy peasy to slip in a bunch of ARM processors into the same role, and save massively on the power bill.

  • This looks to me to be similar to Bluegene supercomputers. A Bluegene essentially consists of packaged PowerPC processors with a scalable high-performance switch interface on board. The first two Bluegene generations used 32-bit CPUs as well.


  • Make a Minecraft themed one and I will find a reason to need it.
  • So, HP, are you really going to do this or should I just wait a few weeks and wait for the cancellation announcement?

    'Cause recently you guys have been a little wishy-washy...

  • Where would this fit in the market? My first thought is things with high number of threads but low compute complexity like web servers or something but Oracle essentially flopped in that arena with their ultrasparc or whatever it was with a bunch of threads. It's possible ARM is very fast but I'm only accustomed to seeing it in set top boxes, phones, and such. My understanding is they're great on power consumption but not so great on compute speed...
    • by Amouth ( 879122 )

      Oracle essentially flopped in that arena with their ultrasparc or whatever it was with a bunch of threads

      It was Sun who did it before Oracle bought them - it was the Niagara CPU line. It didn't flop; for the people who needed that and were Sun customers it was wonderful, but outside of that ecosystem it had nearly zero application. Then Oracle bought Sun and well everything seems to have flopped from that.

  • ...does it run Android?

  • What kind of applications would this be used for? The only thing I can think of would be web hosting. Does KVM / Xen even work on ARM?

    Are there any serious enterprise applications that would run on ARM (right now)? Java?
