Networking Communications Hardware

Behind the Scenes at Hotmail

Posted by Zonk
from the that's-a-big-datacenter-you-have-grandma dept.
mallumax writes "ACM Queue interviews Hotmail engineer Phil Smoot on how they manage more than 10,000 servers spread around the globe. Between them, they process billions of emails per day and are overseen by hundreds of administrators. To do that they have returned to the command line. From the article: 'Our operations group never wants to rely on any sort of user interface. Everything has to be scriptable and run from some sort of command line'. The overriding philosophy seems to be KISS. Also: tape backups are out and spam levels have stabilized."
This discussion has been archived. No new comments can be posted.

  • by digitaldc (879047) * on Friday January 13, 2006 @12:13PM (#14464041)
    The overriding philosophy seems to be KISS.

    Don't try to tell me that the guys at Hotmail only want to Rock & Roll all night and party every day?!?
  • by kurth (221375)
    ..who they call for support? :-)
    • AOL of course :-)
    • by daikokatana (845609) on Friday January 13, 2006 @12:20PM (#14464129)
      ..who they call for support? :-)

      Ghostbusters?

    • by yobjob (942868)
      who they call for support?

      The French?
    • ACM Queue interviews Hotmail engineer Phil Smoot on how they manage more than 10,000 servers spread around the globe.

      This apparently appeared in "People Weekly", April 24, 1989, v. 31, p. 93+

      Harvard Bridge spans the Charles River linking Boston and Cambridge. In 1958 Lambda Chi Alpha took 5' 7" MIT freshman pledge Oliver R. Smoot, Jr. and rolled him head over heels the entire length of the bridge. Every ten smoots they calibrated the bridge, painting marks. The bridge was found to be exactly 364.4 smo

      • The Boston police have been known to use smoot markers to indicate accident locations on the bridge. Apparently Smoot's experience as a unit of measurement led to a life-long career; he eventually became Chairman of the Board of the American National Standards Institute, and later President of the International Organization for Standardization.
  • UNIX? (Score:5, Interesting)

    by IAmTheDave (746256) <(moc.oohay) (ta) (ds-evademanesab)> on Friday January 13, 2006 @12:16PM (#14464080) Homepage Journal
    If I recall correctly, wasn't Hotmail originally run on UNIX boxes?

    •   If I recall correctly, wasn't Hotmail originally run on UNIX boxes?


      Yes, it was run on a combination of BSD & Solaris boxes, IIRC.
    • BSD boxen, yes. I think they still do, but I'm not sure. I do recall they tried migrating to Exchange once, and had to switch back for a bit, but I think(?) that they've finally switched over to Exchange by now.
      • Re:UNIX? (Score:2, Interesting)

        by jcaldwel (935913)
        Last time I was able to get a sniff out of it, they had changed over to Win-ders boxes, at least at the visible part of the Internet.
      • Re:UNIX? (Score:3, Informative)

        by ThinkFr33ly (902481)
        While there were initial problems migrating to Windows, 100% of Hotmail now runs on Windows.

        Also, Exchange was never involved in the migration. Hotmail is a combination of C++ ISAPI filters, COM+ (ATL) Enterprise Components, and SQL Server.
        • Re:UNIX? (Score:2, Troll)

          by GoodOmens (904827)
          While there were initial problems migrating to Windows, 100% of Hotmail now runs on Windows.

          No wonder hotmail sucks. Then again I am a diehard gmail fan :-p
          • Re:UNIX? (Score:3, Interesting)

            by ThinkFr33ly (902481)
            Hotmail sucks because of the feature set when compared to Gmail or Yahoo mail, not because it runs on Windows.

            The new Windows Live Mail beta is fairly good. Doesn't have the feature set of Gmail or Yahoo yet, but it's getting there.

            If it wasn't for the near impossibility of migrating 20,000+ e-mails from Hotmail to Gmail, I probably would have jumped ship long ago... but Live Beta is keeping me interested.
      • From TFA, pg 2:

        BF I'd like to talk a little about tools. In particular, what tools do you need to build rather than buy?

        PS Clearly, we're a Microsoft shop and we're going to leverage everything that the public can leverage, which would be Visual Studio, SDK tools, and SQL and all the tools associated with it. Custom tools that we may build would be more in the area of deployment, metrics gathering, ticketing, bug tracking, code coverage, monitoring, inventory, failure detection, and build systems.

        We do leve

        • by CmdrGravy (645153) on Friday January 13, 2006 @12:46PM (#14464378) Homepage
          Why does he keep mistaking the word "use" for the word "leverage" ? The only possible advantage I can see in substituting the word "leverage" is that it sort of implies they are making the best use of these tools that they can in which case you would think that most people would have already assumed they are not making the worst possible use they could of the tools and it's interesting that the author feels it necessary to make that distinction.
          • by Anonymous Coward
            Speaking of English, most speakers frown on the use of long run-on sentences.
          • by Tet (2721)
            The only possible advantage I can see in substituting the word "leverage" is that it sort of implies they are making the best use of these tools that they can

            No, in fact it just makes no sense at all. The word "leverage" is a noun. The verb he was looking for is "lever", at which point it would at least have been grammatically correct. Of course, "use" would still have been a better option.

          • Simple. It's PHB-speak. Haven't you ever read Dilbert?
          • by kotj.mf (645325) on Friday January 13, 2006 @01:59PM (#14465129)
            Why does he keep mistaking the word "use" for the word "leverage"?

            Because his team leverages best-of-breed systems to utilize the synergistic effects of the paradigm shift in relationships among stakeholders and the knowledge infrastructure, silly.

    • Re:UNIX? (Score:5, Informative)

      by Kraegar (565221) on Friday January 13, 2006 @12:25PM (#14464174)
      It used to be on FreeBSD w/Apache, now it runs on Windows w/IIS. It's not exchange based.

      Read about it [microsoft.com]

      • Re:UNIX? (Score:5, Informative)

        by enantiodromia (895412) on Friday January 13, 2006 @02:57PM (#14465657) Journal
        This is only half true. The _front end_ runs on Windows with IIS. The _back end_, where the email data is stored (the User Stores), runs Solaris. The front end machines don't mean much. If one or twenty go down, there are tons more to take their place. They are simply removed from the load balancing and marked as "admin plz fix this some day". The back end machines, however, are super critical, as each user lives on one, and only one, user store. That machine goes down, and hundreds of thousands, to millions, of Hotmail users can't get to their mail. And that's why those machines run Solaris.
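The split this comment describes (interchangeable front ends that can be pulled from rotation, versus user stores where each user lives on exactly one machine) can be sketched roughly as below. This is an illustrative model only, not Hotmail's actual design; the class names `FrontEndPool` and `UserStores` and the server names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FrontEndPool:
    """Interchangeable front-end servers behind a load balancer (hypothetical model)."""
    healthy: set = field(default_factory=set)
    needs_repair: set = field(default_factory=set)

    def mark_failed(self, server: str) -> None:
        # A failed front end is pulled from rotation and queued for later repair.
        if server in self.healthy:
            self.healthy.remove(server)
            self.needs_repair.add(server)

    def pick(self, request_id: int) -> str:
        # Any healthy front end can serve any request.
        servers = sorted(self.healthy)
        if not servers:
            raise RuntimeError("no healthy front ends")
        return servers[request_id % len(servers)]


@dataclass
class UserStores:
    """Each user lives on exactly one store, so a store outage hits all of its users."""
    assignment: dict = field(default_factory=dict)  # user -> store name
    down: set = field(default_factory=set)

    def store_for(self, user: str) -> str:
        store = self.assignment[user]
        if store in self.down:
            raise RuntimeError(f"{store} is down; {user}'s mail is unreachable")
        return store
```

Losing a front end only shrinks the pool that `pick` chooses from, while losing a store makes `store_for` fail for every user assigned to it, which is the asymmetry the comment is pointing at.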
    • Re:UNIX? (Score:5, Informative)

      by Amoeba (55277) * on Friday January 13, 2006 @12:44PM (#14464369)
      Yes. Hotmail was originally run on clusters of E3500s and E4500s running Solaris 2.5.1. After they got bought by Microsoft, a major initiative to migrate all boxes to Windows was undertaken in 2000. Hotmail has been 99.9% Windows for over 3 years now. The remaining 0.1% are some legacy Solaris boxes used to handle backups for clusters... and even they are being phased out slowly.

      --Amoeba (who no longer works there)
    • Re:UNIX? (Score:2, Funny)

      by Anonymous Coward
      Bingo! See, that's why they need 10000 servers spread around the world to keep the thing (barely) afloat. Had they stayed on FreeBSD they could have run the whole operation on a Duron 800 with a good DSL connection! I mean, just look at GMail. Ok, that runs on Linux, so it needs a little more power, probably a Centrino (which is why you get those "temporarily unavailable" thingies when the administrator takes the laptop home). 10000 servers is just bad Karma for migrating the thing to NT-Server :P
      • And don't forget that the whole thing could probably be run off a single install of Qmail / Clam / Spamassassin... I mean, it works for me, it should work for Hotmail if only they would stop thinking proprietary technology...
  • by pegr (46683) on Friday January 13, 2006 @12:17PM (#14464086) Homepage Journal
    "Between them, they process billions of emails per day and are overseen by hundreds of administrators."
     
    And how does the NSA process all that email? Now THAT would be an interesting technical challenge!
    • Easy.

      They use Microsoft MailSpy (version 1.0) with the Decrypt plugin.

      Slick gui (although I hear it uses non standard widgets)
      with the massive processing power of Microsoft Speed.

      No need to be patched though as it has a very small userbase and isn't a virus vector target as such (like Linux/unix/OSX)

      (removes tongue from cheek)

    • Re:Better subject... (Score:3, Interesting)

      by alexjohns (53323)
      And how does the NSA process all that email?

      Has anyone ever considered that spam may actually help keep us all 'freer'? There are billions of spam messages every day that add to the legitimate traffic. If all spam email magically disappeared, all that would be left is 'legitimate' correspondence.

      Which would make the NSA's new job of spying on us much easier.

      I used to know a guy who always went to the limit on doing his taxes - exploited every loophole, deducted everything that could even vaguely be dedu

  • by ehaggis (879721) on Friday January 13, 2006 @12:18PM (#14464093) Homepage Journal
    What OS it runs on and which web server? I am not trying to be funny.
    • by jcaldwel (935913)
      [root@jboss html]# wget --save-headers -q -O- http://www.hotmail.com/ | grep "^Server:" 2>/dev/null
      Server: Microsoft-IIS/6.0
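The same check (pull the Server: line out of the saved response headers) can be written as a small function for use in a script. This is a minimal sketch; the function name `server_header` is made up for illustration, and it only parses header text you have already fetched rather than contacting any site.

```python
def server_header(raw_headers: str):
    """Return the value of the first Server: header line, or None if absent."""
    for line in raw_headers.splitlines():
        name, sep, value = line.partition(":")
        # Header names are case-insensitive, so compare in lower case.
        if sep and name.strip().lower() == "server":
            return value.strip()
    return None
```

Fed the header text saved by the wget command above, it would return the same string that grep prints after "Server:".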
  • They've gone back to the command line? I wonder if it's SFU (Services for UNIX) where they at least have bash, or if they're having to wear out the "\" key and give their right pinky-finger carpal tunnel? /P
  • Fairly Impressive (Score:5, Interesting)

    by eldavojohn (898314) * <eldavojohn.gmail@com> on Friday January 13, 2006 @12:19PM (#14464109) Journal
    I don't know about everyone else but this article was shocking to me.

    Not only are the questions well picked, but some of the answers are quite interesting. For instance, Phil on scalability:
    The problems are those of basic client-server programming--that is, figuring out the browser/http/server data-access patterns and optimizing the protocols, extending these protocols as new functionality is introduced, and ensuring that these protocols work across geo-distributed data centers when the speed of light becomes a factor. Designing applications with built-in redundancy so that they are resilient to abuse is also a challenge.
    Before reading this article, I always had hotmail pegged as a hacked together e-mail system less organized than a monkey sh*tfight but if Phil speaks the truth, I've underestimated them. They're a hacked together server mess with a lot of effort put into staying afloat--and they have been doing well for a long time.

    I guess I've always taken my free Hotmail account for granted.
    • Re:Fairly Impressive (Score:3, Informative)

      by mekkab (133181)
      Not only are the questions well picked

      The interviewer is ACM Queue editorial board member Ben Fried, who is the managing director of Morgan Stanley's worldwide IT department.
      • Re:Fairly Impressive (Score:2, Informative)

        by xtracto (837672)
        Man, it is the Association for Computing Machinery magazine, I mean, it is not any PC-Weekly WalMart mag.

        If you don't know about ACM publications, here [acm.org] are other interesting ones:

        Ubiquity: IT opinion magazine and forum
        TechNews: News Gathering Service for IT Professionals
        eLearn: Distance learning magazine
        MemberNet: Your Key to the World of ACM...and Beyond
        Computers in Entertainment: New ACM online magazine

        P.s. Sorry for the K.B.
    • Yet despite the talented people working on Hotmail, they still fall flat on their face in two apparently challenging areas:

      1.) Logging in. You would think that since I already typed hotmail.com in the address bar, I wouldn't have to type "@hotmail.com" in the log-in form, but alas, the solution has eluded them. In fact, it seems to have escaped them altogether, since it used to be that way. Apparently having separate hotmail.com and msnmail.com, storing a cookie, or even just having a radio button is bey
  • by saskboy (600063) on Friday January 13, 2006 @12:19PM (#14464110) Homepage Journal
    I used to get about 35 spam a day in my primary hotmail account that I'd had since 1997. Now it gets about 4 a day so things have improved, but my biggest concern about Hotmail is that its virus scanning is horrible. There have been several times when it would have let me download a virus attachment, or allowed multiple obvious virus messages through. They've switched to Trend from McAfee, but I think the problem still remains.
    • I've had a particular hotmail address for....a long time. Most likely in 97.

      I was getting a LOT of spam for a while, turned on most everything that could be turned on, I flag it all as junk...but I still get 20ish a day. Really, I just skim it fast to see if I have anything from people I know, then I go back to gmail or such (where I only get 1 or 2 spam emails a day).

      If 20ish a day is their version of stable, I'd prefer it stabilize a little bit...lower.
  • High level of QC! (Score:5, Informative)

    by Anonymous Coward on Friday January 13, 2006 @12:19PM (#14464115)
    From the article:
    Hotmail relies on less than 100 system administrators to manage it all.

    From the summary:
    Between them, they process billions of emails per day and are overseen by hundreds of administrators.

    Brought to you by the high quality control here at /.
  • The SPAM problem (Score:2, Insightful)

    by Billosaur (927319) *

    BF Can you quantify in some way the extent of the spam problem?

    PS It is massive. Years ago we saw as many as 3 billion incoming messages. This has declined, but the estimates are that 75 percent of all e-mail is spam. Over the past couple of years our techniques have gotten better, and our partnerships with other major ISPs have improved. I would say spam is still gross and abusive, but it hasn't been getting worse lately.

    We do continue to react to spam on a daily basis as spammers continue to seek

    • Also look at how they seem to be defining spam only as an "incoming messages" related problem.
      They have installed spamfilters, but only on the input. Every Nigerian can create a hotmail account and start spamming, and their filters don't bother to act.
      Receivers of those mails can complain at abuse@hotmail.com, but it will take two weeks for them to process the complaint and lock the account, at which time the spammer just opens a new one.

      Is it really that difficult to scan outgoing mail, rate-limit the ma
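The outbound rate limiting this comment asks about could be sketched as a per-account sliding window. This is an illustration under stated assumptions, not anything Hotmail is known to run; the class name `OutboundLimiter`, the limit, and the window length are all hypothetical.

```python
from collections import defaultdict, deque


class OutboundLimiter:
    """Allow at most `limit` sends per account within any `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.sent = defaultdict(deque)  # account -> timestamps of recent sends

    def allow(self, account: str, now: float) -> bool:
        recent = self.sent[account]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

A freshly created account that tries to send thousands of messages would be refused once it reaches the limit, while other accounts are unaffected because each one has its own queue of timestamps.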
  • Command line (Score:3, Insightful)

    by Anonymous Coward on Friday January 13, 2006 @12:20PM (#14464128)

    > To do that they have returned to the command line.

    Absolutely.

    I'm currently in the process of trying to change our company culture away from legacy GUI tools and toward command-line tools.

    Scriptability is a highly under-rated goal. I'm not against GUI tools -- but they need to be built on top of scriptable utilities.

  • by Phat_Tony (661117) on Friday January 13, 2006 @12:25PM (#14464171)
    "they have returned to the command line. From the article: 'Our operations group never wants to rely on any sort of user interface"

    I always thought that the command line was a user interface. You know, interfacing between a user and a computer.

    It's hard to picture using a computer without any sort of user interface. I'm pretty sure that, in order to call it "using" a computer, some sort of interface must exist, be it keyboard mouse and monitor, binary switch, light gun, real gun, neural link, telekinesis, or whatever. Otherwise, you're not using it, are you?

    On the other hand, maybe the article is correct- a lot of operations groups probably don't want to use "any sort of user interface" to communicate with their computers. They want to be sitting on a beach in Tahiti drinking daiquiris, thousands of miles away from the computers they're supposed to maintain.

    • I guess you could say that if it's scriptable then it can be automatically kicked off, like a cron job or something. So in that case the Operations group sets it up and then shouldn't have to do anything else with it. That might be a bit of a stretch; eventually you'll need to find out if it's working right or not. Even that could be done without a computer interface; it could be your manager yelling at you over the phone wondering why the system is down with you on the other end in Tahiti while your wonde
    • "they have returned to the command line. From the article: 'Our operations group never wants to rely on any sort of user interface"

      I always thought that the command line was a user interface. You know, interfacing between a user and a computer.

      No, this is Hotmail. They do not need any user interface. They managed to configure the servers so that they send each other billions of SPAM emails each day. Totally automatically. Then they deleted all user interfaces. That is also why the spam levels have sta

  • In the landscape of today's megaservices, Hotmail just might be Mount Everest

    Is this true? I thought Google might be the Everest. Anyway, speaking from personal experience, in my university every student has multiple yahoo/gmail accounts but just a handful use Hotmail. Can someone throw light on the actual number of users all over?
    • You have to take into account less developed countries than the US.
      I travel a lot to Mexico and it amazes me that *everyone* has a hotmail account there. They advertise it on fliers, on business cards, etc....
      Some people will have (own) a domain like http://www.muchostacos.com.mx/ [muchostacos.com.mx] and *still* print their muchostacos@hotmail.com email.

      It kills me....

      I think this is because of the proliferation of internet cafes back when having internet (or a computer) at home was prohibitive.
      All those machines with
  • by hackstraw (262471) * on Friday January 13, 2006 @12:31PM (#14464229)

    "Those who don't understand UNIX are doomed to reinvent it, poorly."

    From the article and elaborating on the /. summary (It has a print version that consolidates the 4 pages together if you want):

    Q: Are there scaling reasons to think about the benefits of a command line for managing over a GUI, or are there other things to think about?

    A: Our operations group never wants to rely on any sort of user interface. Everything has to be scriptable and run from some sort of command line. That's the only way you're going to be able to execute scripts and gather the results over thousands of machines.

    Also, we all remember the scaling issues that MS had when they took over Hotmail and initially tried to switch from FreeBSD to Windows.

    MS had to port over cron jobs because it's not something that is installed and used by default under Windows like UNIX. They had to rewrite the "inefficient" Perl code that ran fine on FreeBSD to C++. They had to redo the memory allocation to prevent memory leaks in the new C++ code. Read about it from the goat's mouth http://www.microsoft.com/technet/interopmigration/case/hotmail/default.mspx [microsoft.com].

    I can't wait until FreeBSD and other inferior OSes get tools to find memory leaks. One day....

    (That last line was sarcasm and not a flame).
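The point in the quoted answer, that only scriptable commands let you "execute scripts and gather the results over thousands of machines", can be sketched as a simple fan-out. This is a rough illustration, not Hotmail's tooling; it assumes key-based ssh access to every host, and the names `run_over_ssh` and `gather` are made up for the example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_over_ssh(host: str, command: str, timeout: float = 30.0) -> str:
    # Assumes key-based ssh access; BatchMode=yes avoids password prompts.
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", host, command],
        capture_output=True, text=True, timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip() or f"exit status {result.returncode}")
    return result.stdout.strip()


def gather(hosts, command, runner=run_over_ssh, max_workers=32):
    """Run command on every host in parallel; return {host: (ok, output_or_error)}."""
    def attempt(host):
        try:
            return host, (True, runner(host, command))
        except Exception as exc:  # one bad machine must not abort the sweep
            return host, (False, str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(attempt, hosts))
```

Because `runner` is a parameter, the same sweep can be pointed at a different transport or a test stub; a GUI-only tool offers no equivalent hook, which is the scaling argument the answer makes.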

  • Coral Cache (Score:4, Informative)

    by OctaneZ (73357) * <ben-slashdot2@NosPAM.uma.litech.org> on Friday January 13, 2006 @12:39PM (#14464313) Journal
    Looks like the site is down; there is, however, a Coral Cache copy [nyud.net].
  • During the last month I only used my gmail account. When I saw this story about hotmail I went and checked my hotmail inbox. Everything had been deleted.

    So long and thanks for all the spam
    So sad that it should come to sham

    Could anyone suggest a better rhyme for spam?

  • Windows (Score:2, Interesting)

    by certel (849946)
    It's interesting, but for some specific uses, IIS does a great job of handling traffic. For example, streaming video from servers seems to run a lot better on IIS and seems to be a little less resource intensive. I'm not sure about the overall use of Hotmail, though.
  • by Iphtashu Fitz (263795) on Friday January 13, 2006 @01:14PM (#14464633)
    more than 10,000 servers spread around the globe ... are overseen by hundreds of administrators.

    Heh. I used to work at Akamai [akamai.com] which provides content delivery services for many of the biggest sites on the web. They have somewhere over 15,000 servers that are managed by tens of administrators, not hundreds. In fact, a typical NOCC (yes, 2 'C's for Akamai) shift at Akamai is only staffed by 8 or so people, with only a couple of senior level admins on call. And they're delivering all sorts of web-based content, including streaming, not just e-mail.
    But then Akamai runs them all on Linux, whereas I believe Hotmail is all Windows based. You do the math.
    • Nothing to do with OS. It's the complexity of the application.
      MSN Search has admin/server ratios similar to Akamai's.
    • I think running a mail server is a bit more complicated than a webserver or a streaming server for video. From my past experiences, I've spent more time tuning mail servers and adding features. It seems with mail, I have to keep up on spammer countermeasures, antivirus and spam filtering software, and different imap/webmail packages.

      Does anyone know what mail server hotmail uses for smtp and imap or pop or whatever? I'm curious what scales up that well regardless of platform. If there's actually a decent
      • by Iphtashu Fitz (263795) on Friday January 13, 2006 @02:49PM (#14465576)
        I think running a mail server is a bit more complicated than a webserver or a streaming server for video

        It sounds to me like you don't understand what it is that Akamai does. They're not just running web & streaming servers on their 15k machines. They're distributing content in real time in a way that vastly improves user access all around the world. You may have heard when Victoria's Secret held their first video-streaming lingerie show. Well, their servers couldn't handle the load because of all the people trying to watch it. They became an Akamai customer, and Akamai was able to redistribute their streams in real-time all over the globe. To be able to take video (or just web content) from a single source and distribute it quickly and efficiently to thousands of distributed users in real-time is a huge undertaking. Akamai has some very impressive technology to be able to do this.

        I'm not saying that running a mail service like Hotmail is a piece of cake, but I do think that what Akamai does is a lot more difficult and impressive when you think about it. If Akamai's distributed environment were to drop off the net then you probably wouldn't be able to access any of the on-line services of most of their customers [akamai.com]. (And that's just a small subset of their customer base) The ability to keep websites like those of Microsoft, eBay, Fed Ex, Red Hat, etc. all highly responsive to end users is not a simple feat by any stretch of the imagination.
    • From TFA:
      What's interesting is that despite this enormous amount of traffic, Hotmail relies on less than 100 system administrators to manage it all.

    • by Anonymous Coward
      RTFA, not the /. summary. Hotmail runs on tens of administrators as well. The /. summary got it wrong.
  • by wsanders (114993) on Friday January 13, 2006 @01:22PM (#14464721) Homepage
    At first, it was BSD running on a bunch of identical custom-made sub-1U servers. But No! Then it was replaced by windows boxes . . . racks and racks of 99c Fry's keyboards velcroed to the backs and fronts of racks, with miles of small-gauge track, upon which ran diabolical steam-powered robots, each with a single arm and with fingers at the end, forever fixed at the precise spacing to stab the keyboards' CTRL-ALT-DEL keys. Noisily the robots rumbled back and forth on their appointed rounds . . .
  • "they process billions of emails per day" and probably most of it is spam. who in his right mind uses a hotmail account anyway?
  • by esconsult1 (203878) on Friday January 13, 2006 @02:22PM (#14465360) Homepage Journal
    After reading the article, it seems that they did not think out a really scalable platform to run their services and apps. So over time, it became a huge mashup of servers and services. Heck, they can't even properly map the production environment to a small development set.

    Compared to Google clusters [internetnews.com], they seem to be light years behind. As a software developer, I can tell you that the key to rolling out applications quickly is to have a decent framework in place. Whatever that framework might be (from shell scripts to java monstrosities), once it's in place, developing apps on top of it is easy. Similarly, a well thought out app execution environment is golden.

    If you ever check out Google's MapReduce [google.com], you'll see what I mean. It's just so well thought out and so elegant that it's easy to believe that they can scale outwards forever. You'd not be too far off if you thought that Microsoft were rethinking their whole production environment to compete with Google.

    There's no way that Microsoft can quickly and easily roll out vast new applications that scale, because that whole clustering framework is completely opposite to what Windows provides.

  • by bill_kress (99356) on Friday January 13, 2006 @02:57PM (#14465655)
    I was a strong hotmail user before Microsoft took it down, uh, I mean took it over.

    It was a great service! One of the first, and probably the best.

    Microsoft took it over and there was no advancement or innovation for years (a decade?). Spam ate up my tiny inbox while Microsoft just threw MSN graphics all over the place.

    When Gmail came out, I gave it a try. It was everything Hotmail could have been years ago if it hadn't been bought by MS! (Well, it COULD have been out of business, so I've got to give them that I suppose).

    They forced Microsoft to pay a little attention to features. They gave out a little more storage and started blocking some spam, but it was too little too late.

    In order to write this I decided to visit my hotmail inbox, I haven't been there for a while. 136 emails, and 43 have been detected as junk. They are ALL junk--A party invite from "heather", a Cola Quiz, etc. 136 undetected junk emails out of 179.

    And even at that, they still only give 1/8 the amount of storage that Google does.

    Crap, on top of that I just looked at a spam with pictures in it and it didn't auto-block them like Google does. Now I'm probably infected.

    Thanks Microsoft!

    From,

    The guy who used to argue the advantages of Microsoft to the Unix admins...

    • by nighty5 (615965) on Friday January 13, 2006 @09:56PM (#14469048)
      I agree with most of what you've said, but I'd say it was Yahoo that pushed Hotmail's innovation button, and not Gmail.

      Gmail didn't appear until much later on, but Yahoo were creating some fantastic portal features.

      I have Gmail, Yahoo & Hotmail accounts, but prefer to give out my Hotmail account for "free offers" and other junk; it's a junkbox.

      However, Gmail & Yahoo are both solid email solutions, and as you say, Gmail fares better than all of them in the spam war.

      Gmail: From a geek perspective, I admire them for creating key mappings that mimic those of vi/vim.

      There are features present in Yahoo that I'd love to see in Gmail:
      * Set up one-time (or temporary) email addresses that are bound to your email address.
      * A decent calendar that can sync to iCal and Sunbird. (I don't think Yahoo have this yet)
      * Events management: set up birthday reminders and the like.
      * A virtual notepad where you can scribble down notes.
      * Sharing your calendar; it's private by default.
      * Check the number of new messages without logging in or providing credentials (uses a cookie).

      Yahoo is awesome; if you haven't tried out their web portal, take a look. It's very impressive.
