Facebook VP Slams Intel's, AMD's Chip Performance Claims (370 comments)

Posted by timothy on Thursday June 25, @10:23PM
from the more-than-half-of-you-less-than-half-as-well-as-you-deserve dept.
intel
amd
hardware
narramissic writes "In an interview on stage at GigaOm's Structure conference in San Francisco on Thursday, Jonathan Heiliger, Facebook's VP of technical operations, told Om Malik that the latest generations of server processors from Intel and AMD don't deliver the performance gains that 'they're touting in the press.' 'And we're, literally in real time right now, trying to figure out why that is,' Heiliger said. He also had some harsh words for server makers: 'You guys don't get it,' Heiliger said. 'To build servers for companies like Facebook, and Amazon, and other people who are operating fairly homogeneous applications, the servers have to be cheap, and they have to be super power-efficient.' Heiliger added that Google has done a great job designing and building its own servers for this kind of use."

This discussion has been archived. No new comments can be posted.
  • So let me get this straight, the Vice President of a web company is criticizing the hardware guys in two of the world's biggest chip makers?

    You guys don't get it

    Is it possible to take out a massive life insurance policy on Jonathan Heiliger?

    To build servers for companies like Facebook, and Amazon, and other people who are operating fairly homogeneous applications, the servers have to be cheap, and they have to be super power-efficient.

    I assure you, despite your misconception that the world revolves around you, everyone has those requirements. From the people who build supercomputers right down to the netbook I am typing on while watching Gurren Lagann.

    Can we get like a panel of hardware engineers to have a discussion with this guy and can I get some popcorn?

  • Hm... (Score:4, Insightful)

    by Darkness404 (1287218) on Thursday June 25, @10:28PM (#28476941)

    To build servers for companies like Facebook, and Amazon, and other people who are operating fairly homogeneous applications, the servers have to be cheap, and they have to be super power-efficient.

    Hm, let's see... perhaps because Facebook and Amazon are niche markets? The average server isn't even going to need all the computing horsepower, and the power efficiency is simply a drop in the bucket for most companies' electrical bills. The average server is going to be much more I/O intensive than CPU intensive unless you do cluster computing or render a lot of stuff. The average server, such as a web server or a file server, doesn't use that much CPU, and usually you are running 1-3 servers, not the hundreds that Facebook or Amazon would run.

    And really, why is a VP complaining about this stuff? That he can't either afford custom solutions or spend the money buying more servers?

    • Re:Hm... (Score:5, Informative)

      by HockeyPuck (141947) on Thursday June 25, @11:25PM (#28477411)

      The average server is going to be much more I/O intensive than CPU intensive unless you do cluster computing or render a lot of stuff.

      As someone who designs and deploys large storage environments for a living, I call BS. While the current generation of HBAs are 8Gb FibreChannel, I would say that the "average server" (as you put it) could happily live on a 1Gb HBA. Recall that almost all servers, or at least those you care about, have DUAL HBA connections to their respective storage. So that's actually 2Gb of storage connectivity. Sure, there are servers which have multiple HBAs, or use a higher utilization of the HBAs, such as database servers or backup/media servers. Most servers today are deployed with dual 4Gb HBAs as the 8Gb SFPs/optics are still quite pricey, and you cannot, in all seriousness, purchase 1 or 2Gb FC HBAs.

      Even as we deploy VMware based servers, the VMware servers themselves tend to be more memory/cpu strapped than IO.

      It would be very rare, or almost impossible, for a server to be driving line-rate HBAs with plenty of headroom still left in the CPU. Even basic test tools like IOmeter require significant CPU usage to drive an HBA to capacity. And that is when it's writing/reading all zeros. It doesn't actually need to do anything with the data, unlike a database server that requests 2Gb/s from a disk array and then has to join/sort/add/whatever the tables it retrieved.

       

      • Re: (Score:3, Insightful)

        Niche as in, only a few companies (~100) are going to need the same solutions. On the other hand the vast majority of servers will be for much, much, much less intense use. Then you have the problem that really Facebook isn't super profitable, Amazon is but they seem to be doing decent with their servers and have the spare cash to simply upgrade them. I mean, other than a few websites who needs a "perfect" server?
  • I have heard from some reliable sources that Facebook and Twitter's backend applications are poorly written.

    Are Intel and AMD's claims overblown? Sure, what hardware manufacturer doesn't cherry-pick performance claims?

    But I don't care what sort of hardware you throw at crap code, you are always going to get crap performance.

    • Crap code on a faster computer is still going to be faster than it was on a slower computer. He's not saying anything about how efficient their software is, just that buying new processors didn't get him the performance delta that it was supposed to. More advanced hardware should deliver a performance benefit no matter what is running on it.
      • More advanced hardware should deliver a performance benefit no matter what is running on it.

        Not if your code is not tuned for this new "advanced hardware". Surely there are new compile flags to consider, and if you are not tuning your code for the new processor features it could very well be slower than before.

          • by hidden (135234) on Thursday June 25, @11:14PM (#28477323)

            Facebook is written in PHP; there are no compile flags.

            Apache and the PHP engine have plenty of compile flags, not to mention whatever the database is.

                • by cowbutt (21077) on Friday June 26, @02:27AM (#28478617) Journal
                  Essentially our disks are no faster than they were 3 years ago, or even 5 years ago

                  # hdparm -Tt /dev/sdc

                  /dev/sdc:
                  Timing cached reads: 5120 MB in 2.00 seconds = 2562.04 MB/sec
                  Timing buffered disk reads: 84 MB in 3.02 seconds = 27.77 MB/sec
                  # hdparm -i /dev/sdc | grep Model
                  Model=ST3200822A, FwRev=3.01, SerialNo=xxxxxx
                  # hdparm -Tt /dev/sda

                  /dev/sda:
                  Timing cached reads: 6078 MB in 1.99 seconds = 3052.95 MB/sec
                  Timing buffered disk reads: 338 MB in 3.01 seconds = 112.22 MB/sec
                  # hdparm -i /dev/sda | grep Model
                  Model=ST31000333AS, FwRev=SD1B, SerialNo=xxxxxx

                  It's not even a full order of magnitude faster, but 112MB/s is still roughly four times as fast as 28MB/s. And these are both magnetic discs, rather than SSDs.

    • by evanbd (210358) on Thursday June 25, @10:53PM (#28477167)

      Developers have been known to trade off performance for development ease. Frequently the result is crap code. Yes, it performs like crap on both sets of processors. But if the application is CPU-limited (rather than IO or memory or...), then throwing faster CPUs at it ought to make it proportionally faster, no? Obviously they thought the previous performance was acceptable; is it unreasonable to think that buying CPUs marketed as 50% faster should give a 50% performance increase? Clearly crap code will still run like crap, but you ought to be able to throw more CPU power at it and get 150% of crap performance.

    • by Necroman (61604) on Thursday June 25, @11:14PM (#28477337)

      One of the server techs from Twitter was at SXSW 2 years ago and gave some details about how their backend servers worked. If I remember correctly (there were 4 sites on the panel, so I may be confusing them with someone else), the original code was written in Ruby on Rails, which did not scale well to the multi-server systems that they had set up. They have spent a lot of time improving their code over the years, but from what I could tell, their initial implementation wasn't the most thought-out thing in the world.

    • by Stormie (708) on Friday June 26, @12:26AM (#28477819) Homepage

      I have heard from some reliable sources that Facebook and Twitter's backend applications are poorly written.

      Given the quality of Facebook's developer API (it's horrible), I'd be amazed if the back-end of the actual site wasn't poorly written.

  • by cptnapalm (120276) on Thursday June 25, @10:31PM (#28476967)

    Well, I suppose that if he does not like the offerings from Intel and AMD, they could always go with...

    Uh..

    Oh.

    • Re:Well I suppose... (Score:5, Informative)

      by the linux geek (799780) on Thursday June 25, @10:39PM (#28477041)
      Let's see... IBM, Sun, Fujitsu, Itanium (yeah, it's still Intel, but has great performance)... All of these can offer equivalent or much better performance at these kinds of applications than what they're using. Don't bitch if you're not willing to consider the alternatives.
    • It's the next logical solution... Those T5440 servers with 256 processing threads are MONSTERS in terms of handling simultaneous connections, which makes them very good web servers, database servers, and file servers, all of which means they are very good for a company whose product is a website.
  • by joeflies (529536) on Thursday June 25, @10:33PM (#28476989)

    1) Facebook & Amazon need cheap, power efficient systems
    2) Intel and AMD aren't measuring up with processors to power these systems
    3) However, Google has systems appropriate for this use (presumably using Intel or AMD processors)

    If that's his argument, then it would seem that the real conclusion is that Facebook can't build systems as good as Google's, even though they are using the same processor technology.

  • PHP (Score:3, Interesting)

    by Anonymous Coward on Thursday June 25, @10:35PM (#28477011)

    And we're, literally in real time right now, trying to figure out why that is,' Heiliger said.

    It's because your shitty website doesn't have a single line of compiled code. PHP only goes so far.

    • Re:PHP (Score:5, Interesting)

      by afidel (530433) on Thursday June 25, @10:51PM (#28477151)
      Yeah, this. Most of us don't have too much trouble wringing performance out of x64 processors when we need to. He wants a miracle of hardware he can throw at poor code, which is NOT what Google asks for. Google simply wants to wring every last flop/dollar (TCO) out of their systems, which is slightly more than most of us need (the cost of engineering Google-type solutions is more than 99.9+% of shops could reap through improved efficiency).
    • PHP "extension" (Score:5, Insightful)

      by RGRistroph (86936) <rgristroph@yahoo.com> on Friday June 26, @02:41AM (#28478723) Homepage

      I once did a large project in which I took a large, slow site in PHP (it was pretty complicated: a CRM with a lot of custom business logic) and rewrote all the core functionality from PHP to C / C++, and made it a "module" of PHP. The rewriting was mostly simple translation -- literally removing all dollar signs, adding some types, and attempting to compile, and just fixing the compile errors until it would build. Then going back through it with a fine-tooth comb to track down all the memory leaks.

      The speed increase from doing that is pretty surprising. Simple loops that do a bit of math or something speed up by 100 times, and a loop that creates and destroys an object within the loop will be 100,000 times faster. This is without actually trying to write fast C/C++ code, or to avoid creating and deleting the same thing over and over in a loop -- just pure dumb translation of the code.

      At that point, the web site guys can keep tweaking and changing the web page in PHP just like before; but they load that module in the php.ini and then they have a basic library of stuff, like login_user() or get_user_balance() and etc, that are really fast and do all the heavy lifting.

      I would be surprised if Facebook has not already done this. How to do it is well documented in several books, and there are lots of PHP modules written in C/C++ to look at for examples.

      I suspect that Facebook's VP is right that AMD and Intel exaggerate their claims, but it is also generally true that most computer programs are more IO bound than you expect. This is not a reason to avoid something like I describe above; once you have the more complete control of programming in C, IO issues may be easier to find and address.

      He also mentions that the servers offered by Dell and others aren't very power efficient or practical for him, and he mentions Google designing their own servers. Nothing Google did was really rocket science, from what we know, and Facebook probably doesn't have to go as far as they did to get a reasonable benefit. It's not that hard to set up motherboards to run without a case, booting off the network with no hard drive attached.

  • by stox (131684) on Thursday June 25, @10:47PM (#28477115) Homepage

    Assuming that a solution was properly engineered, this should not have been a surprise.

    Cheap, power efficient, performance. Pick two.

  • by 1sockchuck (826398) on Thursday June 25, @10:56PM (#28477189) Homepage
    This is becoming an annual event for Heiliger, who also complained about server vendors [datacenterknowledge.com] at GigaOm's Structure 08 conference last year. Facebook used to buy a lot of cloud-optimized gear from Rackable/SGI, but no longer appears on the list of their largest customers. Makes you wonder if they're not going to follow Google's lead and build their own servers.
  • by SeaFox (739806) on Thursday June 25, @11:12PM (#28477305)

    'You guys don't get it,' Heiliger said. 'To build servers for companies like Facebook, and Amazon, and other people who are operating fairly homogeneous applications, the servers have to be cheap, and they have to be super power-efficient.'

    NEWSFLASH! Customers are tightwads.

    Performance/Reliability/Price.

    Pick any two, Heiliger.

  • And yet... (Score:5, Interesting)

    by Junta (36770) on Thursday June 25, @11:37PM (#28477487)

    Every major server vendor has jumped on the bandwagon of 'look how efficient we are, and cheap'. Three years ago, by and large the tier ones wouldn't bother designing systems without forcing even the cheap design to have parts included to facilitate purchase of redundant add-ons (i.e. power distribution cards designed for dual power supplies regardless of one being bought or not). They would always put a high end storage controller on the planar. They would always make their 'entry' platform be burdened with expensive components to make it easier to option it up.

    Now, we have tons of 'internet scale', or 'cloud', or whatever buzzword you feel like. They tend to stress energy efficiency and low cost components, with sales and management strategies targeted at thousands of servers (i.e. IBM iDataplex, HP SL6000). Basically, precisely what he prescribes, though probably not as 'cheap' as he wants. The incentive he gives is that the vendors should have zero margin, which is not particularly compelling for companies to work toward. Google's situation works because they brought it in-house and thus have fewer middle-men. Honestly, from all the rumours I hear, it's the logical thing to do when your server consumption is larger than some respectable computer companies' entire production. If he thinks the volume of servers is high enough to pull a Google, by all means do it. Otherwise, be prepared for people not to jump at the chance to give their designs to him at zero margin.

    Of course, if he is calling them out on performance per watt by avoiding non-x86 solutions, including ARM, that might be a fair criticism. However, I think company forays into 'exotic' architectures have not panned out in the market recently. Sun's Niagara, despite all the worthy praise, couldn't attract the mass market required to subsidize it for those who benefited most from it. Last year, IBM seemed to be saying Cell architecture would light the world on fire, but they have been a lot quieter about it now. The message their business leaders have probably taken in is that while these things have their target market, that market isn't worth the expense of developing products that are refused by the larger market, and that they should focus instead on leveraging commonly accepted building blocks to do the best they can for that niche, even if it means skipping the 'perfect' solution. Sure, IBM still sells plenty of POWER, but I haven't heard that be *particularly* praised in the performance/watt category like I hear a lot for Niagara, Cell, and ARM. And if not for POWER's legacy, it probably would be stillborn in the market today. The PA-RISC->Itanium decision for HP probably sank their HP-UX product line faster than banking on the legacy of PA-RISC installs would have, and it seems IBM won't make that mistake, but at the same time I don't hear much about *new* POWER customers.

  • Strange... (Score:5, Informative)

    by spire3661 (1038968) on Thursday June 25, @11:55PM (#28477581)

    Since when do we listen to manufacturer's claims? You take the new hardware, stress test it with your custom software, record results, plan servers accordingly. How hard is it really to commission a server design that meets your needs and then QA some prototypes?
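
The measure-it-yourself step described above can be as simple as a timed loop. This is only a sketch: `sample_workload` is a made-up stand-in for one unit of real application work, and `ops_per_second` is a hypothetical helper name, not part of any vendor tool.

```python
import time


def sample_workload() -> int:
    """Stand-in for one unit of real application work (an assumption)."""
    return sum(i * i for i in range(10_000))


def ops_per_second(workload, duration_s: float = 1.0) -> float:
    """Run `workload` repeatedly for about `duration_s` seconds and return throughput."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        workload()
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


if __name__ == "__main__":
    print(f"{ops_per_second(sample_workload):.0f} ops/sec on this machine")
```

Running the same script on the old and the new box, with the real workload swapped in, gives two throughput numbers that can be compared directly against the vendor's claim.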

  • by OneSmartFellow (716217) on Friday June 26, @01:08AM (#28478081)
    ...I'm supposed to care about the comments of the guy who wrote Facebook?

    Hah, hah, hah, hah, hah! At least Google needed to actually engineer their solution, but Facebook, come on! The next time I need to write a PHP script for displaying photos and text, I'll hire my 13 year old daughter.
    • Re:WTF? (Score:5, Insightful)

      by Zebai (979227) on Thursday June 25, @10:42PM (#28477069)
      I agree. I think he was writing his own resignation with this crap. The guy is basically telling everyone that he is incapable of finding an acceptable solution for his company, and blaming Intel and AMD because he has committed a great deal of money to something he didn't plan well enough to know exactly what the long-term cost vs. performance would be. In the very article he says, though in many more words than necessary (probably to try to disguise what he is saying, like most politicians), that they were not only too cheap but also made bad decisions about what to be cheap with. It's as if he's already in public office: he's telling everyone he screwed up and why he screwed up, and trying to make it look like he's teaching everyone a lesson so that his mistake seems less of a disaster.
      • Re:WTF? (Score:5, Interesting)

        by Spit (23158) on Thursday June 25, @11:14PM (#28477331)

        Looks like that to me; he scoped for cheap and cheerful and was bit on the ass when he realised that sometimes you get what you pay for. Like, what's the point in having a quad-core server CPU without the high-bandwidth buses of server-grade hardware?

        In the concurrent DNS/Kaminsky thread, I saw a reference that facebook's DNS TTL is low. A quick investigation reveals that they have a 30 second TTL and are using DNS round-robin for their load balancing.

        He's nothing but a blame-shifting cretin.

      • Re:WTF? (Score:5, Insightful)

        by cryogenix (811497) on Thursday June 25, @11:19PM (#28477361)

        I think we read different articles. He's not saying he didn't plan well enough, he's saying that Intel and AMD promise that Gen Y processor is 35% faster than Gen X processor, and he's not seeing anywhere near 35% in real world performance. The 35% is a made up number but it doesn't matter what the number is that they claim. He's probably correct. Manufacturers pull this stuff all the time. Look at the recent articles on battery life claims on notebooks. AMD came out and called BS on the whole thing and basically said if you guys don't stop lying through your teeth, the FTC is going to regulate us. From the perspective you are taking, that would mean we have to call AMD incompetent for not understanding how batteries work and not properly selecting them.

        Manufacturers ALWAYS overstate claims in computer related products. CRT actual inches vs viewable inches (thank you LCDs for finally being honest... about inches anyway.. brightness and contrast however....) Computer speaker wattage being 1/2 or 1/4 of what's claimed. Power supply efficiency or wattage not measuring up to claims... you name it. He's calling out what he sees as bogus claims, based on real-world experience in a demanding environment, the type of environment where one is always looking for better performance. I think we should get some more information before declaring him to be the problem, as I'm sure he has a whole team of people that are working on this issue.

        What I'd like to see from him is some numbers. On this Intel (or AMD) rig, we get so many operations per hour/minute/whatever. On this new Intel (or AMD) rig which they claim is 20% faster than the previous rig, we're only seeing this number of operations per hour which amounts to only a 7% gain, thus we're seeing 13% less than they are claiming. Again, numbers made up for example's sake. I'd also be very interested in what a typical rig of theirs looks like... X Processor, Y RAM, what type of storage system is it connected to, etc. I think such numbers are vital to understanding the issues at hand. We all know that vendors will run the benchmarks that make their stuff look the best, and that is often not reflective of real world performance. If I was Intel/AMD I'd be chiming in right about now and opening a dialog with Facebook and looking to see what the issues are. Facebook is a big customer with huge name recognition and you want to be able to use them as an example of your solution delivering the promised performance for your marketing. I'm going to assume (I know I know) that they are already working with the server vendor to try and see what's going on here.
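
The comparison asked for above fits in a few lines of Python. The throughput figures are placeholders chosen to reproduce the made-up 20% and 7% from the comment, not measurements from Facebook, Intel, or AMD, and the function names are hypothetical.

```python
def percent_gain(old_ops_per_hour: float, new_ops_per_hour: float) -> float:
    """Return the throughput gain of the new rig over the old one, in percent."""
    return (new_ops_per_hour / old_ops_per_hour - 1.0) * 100.0


def shortfall_points(claimed_gain_pct: float, measured_gain_pct: float) -> float:
    """Return how many percentage points the measured gain falls short of the claim."""
    return claimed_gain_pct - measured_gain_pct


# Placeholder numbers (assumptions for illustration only).
old_rig = 1_000_000   # operations per hour on the previous-generation rig
new_rig = 1_070_000   # operations per hour on the new rig
claimed = 20.0        # vendor's claimed gain, in percent

measured = percent_gain(old_rig, new_rig)
print(f"measured gain: {measured:.1f}%")
print(f"shortfall: {shortfall_points(claimed, measured):.1f} percentage points")
```

Note that the shortfall is in percentage points. Relative to the claim, a 7% gain against a promised 20% delivers only about a third of the advertised improvement.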

        • Re:WTF? (Score:5, Insightful)

          by edmudama (155475) on Friday June 26, @12:21AM (#28477763)

          I think we read different articles. He's not saying he didn't plan well enough, he's saying that Intel and AMD promise that Gen Y processor is 35% faster than Gen X processor, and he's not seeing anywhere near 35% in real world performance.

          If the application was purely CPU bound, and Y wasn't giving me 35% more than X, I'd complain.

          However, if it's a complex system like almost everything else, why would they expect their application to get 35% faster when there's probably 6 or 8 critical subsystems that could all be bottlenecks as well?

          • Re:WTF? (Score:5, Interesting)

            by TheRaven64 (641858) on Friday June 26, @04:30AM (#28479401) Homepage Journal

            One of the fun toys Intel has to play with is a complete system simulator, which simulates every single component in a computer for early testing. This lets them vary parameters that aren't feasible yet while they're working on their design goals. A few years ago they did a test; what happens to the system performance if you make the CPU infinitely fast? They adjusted the simulator so that every CPU operation took zero simulated time and ran their benchmark suite. It ran twice as fast (in simulated time) as it was before.

            A CPU-bound workload can quickly become a RAM-speed bound or a disk-speed bound workload if you make the CPU faster but don't upgrade anything else.
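
Taking the anecdote's figure at face value, Amdahl's law puts a number on this. If an infinitely fast CPU only doubles throughput, the CPU accounted for about half of the original run time, and the same formula predicts what a merely faster CPU would deliver. A minimal sketch, assuming run time splits cleanly into a part that scales with CPU speed and a remainder that does not; the function names are made up for illustration, and the 35% is the made-up figure from earlier in the thread.

```python
def cpu_fraction_from_limit(limit_speedup: float) -> float:
    """Infer the CPU-bound share of run time from the speedup seen with an
    infinitely fast CPU: 1 / (1 - f) = limit_speedup, so f = 1 - 1 / limit_speedup."""
    return 1.0 - 1.0 / limit_speedup


def overall_speedup(cpu_fraction: float, cpu_speedup: float) -> float:
    """Amdahl's law: only the CPU-bound share of run time shrinks."""
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


f = cpu_fraction_from_limit(2.0)   # "ran twice as fast" with a zero-time CPU
gain = overall_speedup(f, 1.35)    # a CPU marketed as 35% faster
print(f"CPU-bound fraction: {f:.2f}")
print(f"whole-system speedup from a 35% faster CPU: {gain:.2f}x")
```

Under these assumptions the 35% faster part yields roughly a 15% gain for the system as a whole. Only a purely CPU-bound workload (fraction 1.0) would see the full 35%.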

        • Re:WTF? (Score:4, Interesting)

          by Runaway1956 (1322357) on Friday June 26, @12:53AM (#28477977) Homepage Journal

          Uhhh, correct me if I'm wrong. I've been looking at aftermarket bolt-on parts for my car. The headers claim increased fuel mileage, as do the spark plugs, the air filter, the tires, and a turbocharger. The glass pack mufflers, and the resonator. Oh yeah, the aerodynamic rims, the hood, and spoiler. Don't forget the carbon fiber body panels. Taken all together, those increased MPGs add up to about 150 MPG. You're saying I may not see that much improvement on my 1968 Chevy Malibu? It's just hype? Man - you just saved me about $5,000!!!
