
Cooling Challenges an Issue In Rackspace Outage

miller60 writes "If your data center's cooling system fails, how long do you have before your servers overheat? The shrinking window for recovery from a grid power outage appears to have been an issue in Monday night's downtime for some customers of Rackspace, which has historically been among the most reliable hosting providers. The company's Dallas data center lost power when a traffic accident damaged a nearby power transformer. There were difficulties getting the chillers fully back online (it's not clear whether this was due to equipment issues or subsequent power bumps), and temperatures rose in the data center, forcing Rackspace to take customer servers offline to protect the equipment. A recent study found that a data center running at 5 kilowatts per server cabinet may experience a thermal shutdown in as little as three minutes during a power outage. The short recovery window from cooling outages has been a hot topic in discussions of data center energy efficiency. One strategy being actively debated is raising the temperature set point in the data center, which trims power bills but may create a less forgiving environment in a cooling outage."
  • This is number 3 (Score:5, Informative)

    by DuctTape ( 101304 ) * on Tuesday November 13, 2007 @12:06PM (#21337871)
    This is actually Rackspace's number 3 outage in the past couple of days. My company was only (!) affected by outages 1 and 2. My boss would have had a fit if number 3 had taken us down for the third time.

    Other publications [valleywag.com] have noted it was number 3, too.

    DT

    • It seems crazy that the data centres seem to run in hot states. Surely Alaska would be better? C'mon Alaska, get the tax-breaks right.
      • by arth1 ( 260657 ) on Tuesday November 13, 2007 @01:17PM (#21338989) Homepage Journal
        (Disregarding your blatant karma whoring by replying to the top post while changing the subject)

        There are several good reasons why the servers are located where they are, and not, say, in Alaska.
        The main one is the speed of light through fiber: a cable from Houston to Fairbanks would induce a best case of around 28 ms of latency each way. Multiply by several billion packets.

        This is why hosting near the customer is considered a Good Thing, and why companies like Akamai have made it their business to transparently re-route clients to the closest server.

        Back to cooling. A few years ago, I worked for a telephone company, and the local data centre there had a 15 degree C ambient baseline temperature. We had to wear sweaters if working for any length of time in the server hall, but had a secure, normal-temperature room outside the server hall, with console switches and a couple of ttys for configuration.
        The main reason why the temperature was kept so low was to be on the safe side -- even if a fan should burn out in one of the cabinets, opening the cabinet doors would provide adequate (albeit not good) cooling until it could be repaired, without (and this is the important part) taking anything down.
        A secondary reason was that the backup power generators were, for security reasons, inside the server hall itself, and during a power outage these would add substantial heat to the equation.
        • Re: (Score:3, Informative)

          by Spazmania ( 174582 )
          the local data centre there had a 15 degree C ambient baseline

          Well that's just incompetent. For one thing, commercial electronics experience increased failure as you move away from an ambient 70 degrees F regardless of which direction you move. Running them at 59 degrees F (15 C) is just as likely to induce intermittent failures as running them at 80 degrees F.

          For another, you're supposed to design your cooling system to accommodate all of the planned heat load in the environment. If your generators will be a
          • Re: (Score:3, Interesting)

            by RockDoctor ( 15477 )

            the local data centre there had a 15 degree C ambient baseline

            Well that's just incompetent. For one thing, commercial electronics experience increased failure as you move away from an ambient 70 degrees F regardless of which direction you move. Running them at 59 degrees F (15 C) is just as likely to induce intermittent failures as running them at 80 degrees F.

            I was considering asking why the GP poster was bothering with a sweater when working (as opposed to sleeping) in his server room at 15 centigrade, b

      • It seems crazy that the data centres seem to run in hot states
         
        Several people have asked this and you're all thinking inside the box; a better idea: why not locate data centers near large cavern complexes or abandoned mines, where the air temperature inside stays in the 60's no matter what the temperature is outside? Run a duct into the cave and you have natural AC. You just need a fan, and if the power to that goes out, have one of those exercise bikes with the big fan as a wheel for the backup.
        • Re: (Score:3, Informative)

          by arth1 ( 260657 )
          While thinking outside the box is all well and fine, it's even better when combined with Common Knowledge. Like knowing that caves and mines (a) tend to be rather warm when deep enough, and (b) have a fixed amount of air.

          As for the power efficiency of pumping air from several hundred meters away compared to pumping it through the grille of an AC unit, well, there's a reason why skyscrapers these days have multiple central air facilities instead of just one: Economics.

          I'd like to see you pump air for any l
  • Which only shows (Score:3, Informative)

    by CaptainPatent ( 1087643 ) on Tuesday November 13, 2007 @12:06PM (#21337873) Journal
    If you want 100% uptime, it's important to have backup power for the cooling as well as for the server systems themselves.
     
    Is this really news?
    • Comment removed based on user account deletion
      • Re: (Score:3, Interesting)

        by lb746 ( 721699 )
        I actually use a vent duct to suck in cold air from outside during the winter to help cool a server in my house. Originally I was more concerned with random objects/bugs/leaves, so I made it a closed system (like water cooling) to help protect the actual system. It works nicely, but only for about 1/3 or less of the year, when the temperature is cold enough to make a difference. I've always wondered about a larger scale of something like this, such as how the parent suggested servers in a colder/arctic region.
        • They have this already. It's called "Economizer Mode". There are basically two types: the water-side economizer and the air-side economizer. The air side is similar to what you're doing: taking raw, unconditioned air, filtering it, and using it to cool the space.

          The other type is the Water Side Economizer which is probably going to be more prevalent in Data Centers. The way it works is, at a predetermined Outside Wetbulb Temperature, the cooling tower(s) speed up to cool the Condenser Water to
      • Or Siberia.

        I was thinking the same thing.

        AC is out? Crank open the vents and turn on the fans.

        Admittedly it wouldn't work so well in the summer, but spring/winter/fall could be nice.
        • In the winter, if you heat with electricity, you can basically run your computer for free, since its waste heat reduces the amount of heat needed to be generated by resistance heaters.

          • An Athlon XP 2500+ with two HDDs and a GeForce 6600GT running 24x7 can give a 12x14 foot room a 10-15 degree temperature boost.

            My computer room is quite toasty in the winter...

          • I'm about to move to a colder (and more damp) environment (and an older house with wood floors), and have thought about putting my server and NAS in a hallway cupboard, drawing cool air from under the house and venting it to the hallway.

            Will have to see how big a hole I can cut (and repair) in the floor of the cupboard without the new landlord noticing....
      • Re: (Score:3, Interesting)

        by Ironsides ( 739422 )
        Yes, actually. This was looked into by multiple companies during the late 90's. I'm not sure if any were ever built. I think one of the considerations was weighing the savings of not having to run chillers against the cost of getting fibre and power laid to the facility.
      • Re:Which only shows (Score:4, Interesting)

        by blhack ( 921171 ) * on Tuesday November 13, 2007 @12:21PM (#21338111)
        I think the problem is availability of power. When you are talking about facilities that consume so much power that their proximity to a power station is taken into account when they are built, you can't just slap one down at the poles and call it good. I would imagine that lack of bandwidth is a MAJOR issue as well... One field where I think storing servers at the poles would be amazing is supercomputing. Supercomputers don't require the massive amounts of bandwidth that webservers etc. do. You send a cluster a chunk of data for processing, it processes it, and it gets sent back. For really REALLY large datasets (government stuff)... just fill a jet with hard disks and have it at the server center in a few hours.
        • Norway is obviously the answer then. Bloody freezing, and loads of hydroelectric power.
          • by arth1 ( 260657 )
            Also the land of mountains and fjords, meaning that each mile of distance can equate to five miles or more of cable. Latency is a bitch, and Norway isn't exactly centrally located to start with.
            Not to mention that Norway, being one of the richest countries in the world, would be prohibitively expensive, with starting salaries around twice those in the US, 37.5 hr work weeks, a minimum of 4 weeks of paid vacation, and very high taxes (including wealth tax, capital gain taxes, employer taxes and a 25% V
        • by Average ( 648 )
          Bandwidth is comparatively cheap to get somewhere. A few redundant loops of fiber... undersea if need be. Fiber does not suffer transmission losses in the way that sending electricity the other way would.

          One fairly obvious location for this would be Labrador in Canada. Very well cooled. Absolutely lots of hydroelectric. Churchill Falls is huge. They lose half the energy sending it down to the US, but no one closer needs the power. Several major untapped hydro locations, too. Lots of land for approxima
        • by scheme ( 19778 )

          one field where I think storing servers at the poles would be amazing is supercomputing. Supercomputers don't require the massive amounts of bandwidth that webservers etc. do.

          Supercomputing absolutely requires massive amounts of bandwidth. In particle physics, detectors at places like the LHC generate 1-5 petabytes of data each year, and this data needs to be sent out to centers and processed. Likewise, bioinformatics applications tend to generate lots of data (sequences, proteins, etc.) and this dat

    • Exactly! The fact that these Chillers weren't on Emergency Generator Power is rookie mistake #1. All the generator power and UPS power in the world ain't gonna help if your Data Center gets too hot.
      • Re: (Score:3, Informative)

        by autocracy ( 192714 )
        Reading the article, they WERE on backup power. Emergency crews shut down the backup power to the chillers temporarily while they were working in the area, so the chillers had to start again. Cycling these big machines isn't instant.
    • Re:Which only shows (Score:5, Informative)

      by jandrese ( 485 ) <kensama@vt.edu> on Tuesday November 13, 2007 @12:15PM (#21338003) Homepage Journal
      If you want 100% uptime (which is impossible, but you can put enough 9s in your reliability to be close enough), you need to have your data distributed across multiple data centers, geographically separate, and over-provisioned enough that the loss of one data center won't cause the others to be overloaded. It's important to keep your geographical separation large because you never know when the entire eastern (or western) seaboard will experience complete power failure or when a major backhaul router will go down/have a line cut. Preferably each data center should get power from multiple sources if it can, and multiple POPs on the internet from each center are almost mandatory.
      • Re:Which only shows (Score:4, Interesting)

        by NickCatal ( 865805 ) on Tuesday November 13, 2007 @12:31PM (#21338275)
        I can't stress this enough. When I talk to people about hosting who rely on 100% availability, I tell them they NEED to go with geographically diverse locations. Even if it is just a single backup somewhere, you have to have something.

        For example, Chicago's primary datacenter facility is at 350 E. Cermak (right next to McCormick Place), and the primary interconnect facility in that building is Equinix (which has the 5th and now 6th floors). A year or so ago there was a major outage there (which mucked up a good amount of the internet in the midwest) when a power substation caught fire and the Chicago Fire Department had to shut off power to the entire neighborhood. So the backup system started like it should, with the huge battery rooms powering everything (including the chillers) for a bit while the engineers started up the generators. The only thing is, the circuitry that controls the generators shorted out, so while the generators themselves were working, the UPS was working, and the chillers were working, this one circuit board blew at the WRONG moment. And this wasn't the only time this circuit had been used; they test the generators every few weeks.

        Long story short, once the UPSes started running out of power the chillers started going, the lights flickered, and for a VERY SHORT period of time the chillers went out before all of the servers did. Within a minute or two it got well over 100 degrees in that datacenter. Thank god the power cut out as quickly as it did.

        So yes, Equinix in that case did everything by the book. They had everything set up as you would set it up. It was no big deal. But something went wrong at the worst time for it to go wrong and all hell broke loose.

        It could be worse, your datacenter could be hit by a tornado [nyud.net]
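
        A minimal sketch of the "enough 9s" arithmetic from the parent comment, assuming sites fail independently (a simplification: correlated events like a regional blackout break that assumption, and the 99.9% per-site figure is made up):

        def combined_availability(site_availability, n_sites):
            """Probability that at least one of n independent sites is up."""
            return 1 - (1 - site_availability) ** n_sites

        single = 0.999  # one site at "three nines" is roughly 8.8 hours down per year
        for n in (1, 2, 3):
            a = combined_availability(single, n)
            downtime_min = (1 - a) * 365 * 24 * 60
            print(f"{n} site(s): unavailability {1 - a:.2e}, ~{downtime_min:.3g} min downtime/yr")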

    • Re: (Score:3, Interesting)

      by Azarael ( 896715 )
      Some data centers also have multiple incoming power lines (which hopefully don't share a single transformer bottleneck). Anyway, I know for sure that at least one data center in Toronto had 100% uptime during the big August 2003 Blackout, so it is possible to prevent these problems.
    • Re: (Score:3, Interesting)

      by afidel ( 530433 )
      It sounds like they DID have backup power for the cooling, but that the switchover to backup power caused some problems. This isn't really all that unusual, because cooling is basically never on UPS power, so the transition to backup power may not go completely smoothly unless everything is set up correctly and tested, and there are few or no unusual circumstances during the switchover. I've seen even well designed systems have problems in the real world. One time we lost one leg of a three-phase power system so
      • This actually IS unusual. It's true that you would never put your cooling units on UPS power, but you would absolutely put them on generator power. Most people fail to differentiate between UPS power and Emergency Power. Having redundant utility feeds is great, but you have to have the generators in case both feeds fail.

        It doesn't sound like the systems from your story were very well designed/maintained. If you were to lose 1 leg of your 3 phase power, the transfer switch would have gone to emergen
        • Re: (Score:3, Interesting)

          by afidel ( 530433 )
          Ok, I specifically said UPS power, as in it takes time to spin up the generators, and switching from one source to the other does not always go perfectly in the real world. One factor is the minimum cycle time on the compressors. The 3 minute time frame was from TFA, which says that at a density of 5 kVA per cabinet, thermal shutdown can happen in 3 minutes due to thermal load.

          Oh and as far as the one leg collapsing thing, yes we were VERY pissed at everyone involved in that little problem, it turns out it was a
    • by Bandman ( 86149 )
      I'm not disagreeing, but if people really want 100% uptime, they're much better off investigating an infrastructure using GSLB or something similar, where a single geographically isolated event won't impact them
    • by beavis88 ( 25983 )
      From a message to Rackspace customers:

      "When generator power was established two chillers within the data center failed to start back up"

      They had backup power for the chillers - but obviously, something didn't go quite right.
    • Re: (Score:3, Interesting)

      by spun ( 1352 )
      Hmph. We have backup power for the cooling in our server room, but we had to deal with a fun little incident two weeks ago. Trane sent out a new HVAC monkey a month ago for routine maintenance. I was the one who let this doofus in, and let me tell you, he was a slack-jawed mouth-breathing yokel of tender years. He took one look at our equipment and said, I quote, "I ain't never seen nutin' like this'un before, hee-yuck!" I was a bit taken aback, but he seemed to go through all the proper motions.

      Fast forwar
  • by Dynedain ( 141758 ) <slashdot2&anthonymclin,com> on Tuesday November 13, 2007 @12:08PM (#21337899) Homepage
    Actually this brings up an interesting point of discussion for me at least. Our office is doing a remodel and I'm specifying a small server room (finally!) and the contractors are asking what AC unit(s) we need. Is there a general rule for figuring out how many BTUs of cooling you need for a given wattage of power supplies?
    • by CaptainPatent ( 1087643 ) on Tuesday November 13, 2007 @12:15PM (#21338001) Journal

      Is there a general rule for figuring out how many BTUs of cooling you need for a given wattage of power supplies?
      I actually found a good article [openxtra.co.uk] about this earlier on and it helped me purchase a window unit for a closet turned server-room. Hope that helps out a bit.
    • Re: (Score:3, Informative)

      Try this. [anver.com]
    • by trolltalk.com ( 1108067 ) on Tuesday November 13, 2007 @12:19PM (#21338077) Homepage Journal

      Believe it or not, in one of those "life coincidences", pi is a safe approximation. Take the number of watts your equipment, lighting, etc. use, multiply by pi, and that's the number of BTUs per hour of cooling. Don't forget to include 100 watts per person for body heat.

      It'll be 90 degrees F outside, and you'll be a cool 66F.

      • Re: (Score:3, Funny)

        Believe it or not, in one of those "life coincidences", pi is a safe approximation. Take the number of watts your equipment, lighting, etc. use, multiply by pi, and that's the number of BTUs per hour of cooling. Don't forget to include 100 watts per person for body heat.

        It'll be 90 degrees F outside, and you'll be a cool 66F.

        And if that doesn't work, you can always tell your VP that you were taking your numbers from some guy named TrollTalk on ./
        I'm sure he'll understand.

        • Re: (Score:3, Interesting)

          Think for 2 secs ... each kW of electricity eventually gets converted to heat. Resistive heating generates ~3,400 BTU/hr per kilowatt, so multiplying electrical consumption in watts by pi gives you a decent cooling capacity in BTU/hr. Add an extra 10% and you're good to go (you *DO* remember to add in a fudge factor of between 10 and 20% for "future expansion", right?)
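
          As a rough illustration of the rule of thumb in this sub-thread (pi is just shorthand for the exact 3.412 BTU/hr per watt; the 3 kW load and the 15% fudge factor are made-up example values):

          import math

          def cooling_btu_per_hr(equipment_watts, people=0, fudge=0.15):
              """Rule-of-thumb cooling load: watts x pi (~3.412 BTU/hr per watt),
              plus ~100 W of body heat per person, plus a 10-20% expansion margin."""
              watts = equipment_watts + 100 * people
              return watts * math.pi * (1 + fudge)

          # Example: 3 kW of servers and lighting plus one admin in the room
          print(round(cooling_btu_per_hr(3000, people=1)))  # ~11,200 BTU/hr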
    • 1 Watt = 3.41 BTU/hr

      So if your server/network setup has 1000 watts of power supply capacity, I would recommend no less than 3410 BTU/Hr of cooling capacity, or 3 tons of cooling. This is given that power supplies don't usually run at their peak rated capacity, so it's slightly more than you technically need if there were no other sources of heat. Add in 10-25% for additional heat gains. Add 250 watts on top of that for each person that may work in that room more than an hour at a time.

      Final formula I used o
      • by afidel ( 530433 )
        Your numbers are off by a factor of 10 because one ton = 12,000 BTU/hr! I guess you have WAY too much cooling capacity, which isn't good because you will be constantly cycling your cooling which is way less power efficient than running a properly sized unit more or less continuously.
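
        A quick check of that correction (one ton of refrigeration = 12,000 BTU/hr), using the grandparent's hypothetical 1,000-watt load:

        BTU_PER_HR_PER_WATT = 3.412   # 1 W of load ends up as ~3.412 BTU/hr of heat
        BTU_PER_HR_PER_TON = 12_000   # 1 ton of refrigeration = 12,000 BTU/hr

        watts = 1000                  # the grandparent's example load
        btu_per_hr = watts * BTU_PER_HR_PER_WATT
        tons = btu_per_hr / BTU_PER_HR_PER_TON
        print(f"{btu_per_hr:.0f} BTU/hr is about {tons:.2f} tons, not 3 tons")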
    • Physics (Score:4, Informative)

      by DFDumont ( 19326 ) on Tuesday November 13, 2007 @12:40PM (#21338411)
      For those of you who either didn't take Physics, or slept through it, Watts and BTUs/hr are both measurements of POWER. Add up all the (input) wattages, and use something like http://www.onlineconversion.com/power.htm/ [onlineconversion.com] to convert. This site also has a conversion to 'tons of refrigeration' on that same page.
      Also note - Don't EVER use the rated wattage of a power supply, because that's what it SUPPLIES, not what it uses. Instead use the current draw multiplied by the voltage (US - 110 for single phase, 208 for dual phase in most commercial buildings, 220 only in homes or where you know that's the case). This is the 'VA' [Volt-Amps] unit. Use this number for 'watts' in the conversion to refrigeration needs.
      Just FYI - a watt is defined as 'the power developed in a circuit by a current of one ampere flowing through a potential difference of one volt'; see http://www.siliconvalleypower.com/info/?doc=glossary/ [siliconvalleypower.com], i.e. 1W = 1VA. The dirty little secret about power calculations is that there is another factor thrown in, typically about 0.65, called the 'power factor', that UPS and power supply manufacturers use to lower the overall wattage. That's why you always use VA (rather than the reported wattage): in a pinch you can always measure both voltage and amperage (under load).
      Basically do this - take all the amperage draws for all the devices in your rack/room/data center, multiply them by the applied voltage for that device (110 or 208), and add all the products together. Then convert that number to tons of refrigeration. This is your minimum required cooling for a lights-out room. If you have people in the room, count 1100 BTUs/hr for each person and add that to the requirements (after conversion to whatever unit you're working with). Some HVAC contractors want specifications in BTUs/hr and others want it in tons. Don't forget lighting either, if it's not a 'lights out' operation. A 40W fluorescent bulb means it's going to dissipate 40W (as in heat). You can use these numbers directly, as they are a measure of the actual heat thrown, not of the power used to light the bulb.
      Make sense?

      Dennis Dumont
      • Re: (Score:3, Informative)

        by timster ( 32400 )
        The dirty little secret about power calculations is that there is another factor thrown in, typically about 0.65, called the 'power factor' that UPS and power supply manufacturers use to lower the overall wattage.

        It's not "thrown in" by the manufacturers. The dirty little secret is simply that you are talking about AC circuits. 1W = 1VA in AC circuits only if the volts and the amps are in phase -- which they aren't.

        Take a sine wave -- in AC, that's what your voltage looks like, always changing. If you're
      • Re: (Score:2, Insightful)

        by EmagGeek ( 574360 )
        First, 1 Watt is the movement of energy at the rate of 1 Joule per second, and need not be electrically related at all. A watt is energy per unit time.

        Second, power factor is irrelevant to cooling calculations because reactive power does not generate heat, even though it does generate imaginary current in the generating device. This is why power companies bill industrial power based on VAH and not on KWH.

        Generators are rated for the magnitude of their output current, not just the real component of it. This
      • Just FYI - a watt is defined as 'the power developed in a circuit by a current of one ampere flowing through a potential difference of one volt." see http://www.siliconvalleypower.com/info/?doc=glossary/ [siliconvalleypower.com], i.e. 1W = 1VA. The dirty little secret about power calculations is that there is another factor thrown in, typically about 0.65, called the 'power factor' that UPS and power supply manufacturers use to lower the overall wattage. That's why you always use VA (rather than the reported wattage) because in a p

      • by afidel ( 530433 )
        If your PSUs have a PF of .65 they absolutely SUCK. Our datacenter has a PF of .89 as measured by our UPSes. This is using IBM xSeries and HP ProLiants along with a smattering of other systems, including a midsized PBX and a Xiotech SAN. My home PC has active PF correction and has a PF of .95. If you had an entire datacenter running at .65 you would be getting a HUGE bill from the power company, because they have to use a dummy load at the generating facility to balance out that uneven load.
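
        A sketch of the sizing procedure from the "Physics" comment above, with the power factor made explicit and the cooling sized on real watts as the replies argue (the voltages, currents, and power factors below are illustrative, not anyone's measurements):

        def cooling_load(devices, people=0, lighting_watts=0.0):
            """Estimate cooling load from measured volts/amps per device.
            Apparent power (VA) = V x A; real power (W) = VA x power factor,
            and only the real watts end up as heat in the room."""
            real_watts = sum(v * a * pf for (v, a, pf) in devices)
            btu_per_hr = (real_watts + lighting_watts) * 3.412 + people * 1100
            return real_watts, btu_per_hr, btu_per_hr / 12_000  # W, BTU/hr, tons

        # (volts, amps, power factor) per device -- a hypothetical rack
        rack = [(208, 4.0, 0.89), (208, 3.5, 0.89), (120, 2.0, 0.95)]
        watts, btu, tons = cooling_load(rack, people=1, lighting_watts=80)
        print(f"{watts:.0f} W real -> {btu:.0f} BTU/hr -> {tons:.2f} tons")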
    • Re: (Score:2, Informative)

      by R2.0 ( 532027 )
      The general rule I always follow is HIRE AN ENGINEER! Ferchrissake, there are people who do these calculations for a living, and they have insurance in case they screw up. You want to trust your data center to advice from slashdot and a back-of-the-envelope calculation?

      Sheesh - what's the name of your company so I can sell short?

    • Chiming in with this: http://www.dell.com/html/us/products/rack_advisor/index.html [dell.com].

      Dell-centric, but Dell is what we use here.

      It'll tell you how much power / cooling / rack space / etc. you need.
  • When systems start shutting down because the on-board temperature alarms trip, just disable the alarms.

    Man, I wish I was making that up.

  • Liquid nitrogen is the cooling answer, for sure. Then you're not dependent upon power of any kind at all. The nitrogen dissipates as it warms, just like how a pool stays cool on a hot day by 'sweating' through evaporation, and you just top up the tanks when you run low. It's cheap and it's simple. That's why critical cold storage applications like those in the biomedical industry don't use 'chillers' or refrigerators or anything like that. If you really want to put something on ice and keep it cold, y
    • Seems that having an industrial-sized tank of LN2 outside the building for such a purpose might make sense as a rather inexpensive emergency backup cooling system. Diesel generators keep the server farm online while cool N2 gas (after evaporation) keeps the server room cool. Just keep the ventilation system balanced properly so that you don't displace the oxygen in the rest of the building, too.

      And that brings up one caveat: You wouldn't have access to the areas cooled without supplied air when such a syst

      • Liquid oxygen?

        The boiloff is a little bit worse but the stuff is almost as cheap as dirt. A mix of LOX and NOX would be breathable and not risk explosion.
        • Re: (Score:3, Informative)

          by MenTaLguY ( 5483 )
          Even oxygen levels elevated to as little as 23% can lead to a violent increase in the flammability of materials like cloth and hair. Controlling gas concentrations so they remain at safe levels can be very tricky.

          Setting aside evaporation, be careful not to get it on anything. LOX can easily saturate anything remotely porous and oxidisable, effectively turning it into an unstable explosive until the LOX evaporates... at LOX or LN temperatures, that can even become an issue with oxygen condensing fro
    • Re: (Score:3, Informative)

      by DerekLyons ( 302214 )

      Liquid nitrogen is the cooling answer, for sure. Then you're not dependent upon power of any kind at all.

      Except of course the power needed to create the LN2.

      That's why critical cold storage applications like those in the biomedical industry don't use 'chillers' or refrigerators or anything like that. If you really want to put something on ice and keep it cold, you use liquid nitrogen.

      As above - how do you think they prevent the LN2 from evaporating? The LN2 is a buffer against loss of pow

    • It's not that simple. If it were, others would have already implemented it. Liquid nitrogen has its own issues.

      The biomedical industry has used LN2 because they have different needs. They use it for cold storage (keeping things very cold). For computers, you need cooling (heat transfer). Also the footprint of their needs is small. The industry stores small things like vials, jars, etc. The computer industry needs to chill an entire rack. The average tank for the biomedical industry is about t

  • by MROD ( 101561 ) on Tuesday November 13, 2007 @12:21PM (#21338099) Homepage
    I've never understood why data centre designers haven't used a cooling strategy other than re-circulating cooled air. After all, for much of the temperate latitudes, for much of the year, the external ambient temperature is at or below that needed for the data centre, so why not use conditioned external air to cool the equipment and then exhaust it (possibly with a heat exchanger to recover the heat for other uses such as geothermal storage and use in winter)? (Oh, and have the air-flow fans on the UPS.)

    The advantage of this is that even in the worst-case scenario where the chillers fail totally during mid-summer, there is no runaway, closed-loop, self-reinforcing heat cycle; the data centre temperature will rise, but it would do so more slowly and the maximum equilibrium temperature would be far lower (and dependent upon the external ambient temperature).

    In fact, as part of the design for the cluster room in our new building I've specified such a system, though due to the maximum size of the ducting space available we can only use this for half the heat load.
    • After all, for much of the temperate latitudes for much of the year the external ambient temperature is at or below that needed for the data centre so why not use conditioned external air to cool the equipment and then exhaust it...
      The next major source of "global warming"...
      • Re: (Score:2, Insightful)

        by cjanota ( 936004 )
        Where do you think current AC units dump all the heat that they extract? What the GP is suggesting just cuts out the middle man (the AC). The AC units produce quite a bit of heat themselves.
    • by afidel ( 530433 ) on Tuesday November 13, 2007 @12:42PM (#21338433)
      The problem is humidity; a big part of what an AC system does is maintain humidity in an acceptable range. If you were going to try to do once-through cooling with outside air, you'd spend MORE power during a significant percentage of the year in most climates trying to either humidify or dehumidify the incoming air.
    • They do; big data centers use glycol: when it's cool outside, the compressors turn off and they just run the fans. It's a bit more up front, but it has savings in areas where it gets below 45 on a regular basis. Another option is large blocks of ice with coolant running through them, to shift power consumption to the night and reduce the amount of energy required (it's cheaper to make ice at night), but that's only for smaller facilities; they leave a reserve capacity of x hours and/or go with n+1 setups.
    • Re: (Score:2, Insightful)

      by R2.0 ( 532027 )
      Part of the problem is that it is a lot easier to move heat via liquid than air. The conventional design uses chillers mounted outside the space to cool a liquid medium/refrigerant, which is then pumped very efficiently to cooling coils in the space (modify for DX coils). The air inside the conditioned space makes a very short trip through the servers, across the room, over the coil, and back out again.

      Under your scenario, the AIR is the working medium - it is cooled on the outside, and then moved inside
  • Ah, the dangers of context-sensitive advertising.

    Ad on the main page [2mdn.net] when this article was at the top of the list.

    Does "50% off setup" mean you'll only be set up halfway before they run out of A/C?
  • If your data center's cooling system fails, how long do you have before your servers overheat?

    The first occasion was over a weekend (no-one present) in a server room full of VAXes. On the Monday when it was discovered, we just opened a window and everything carried on as usual.

    The next time was when an ECL model Amdahl was replaced by a CMOS IBM. No-one downgraded the cooling and it froze up - solid. This time the whole shebang was down for a day while the heat-exchangers thawed out. It was quite interesti

    • by bstone ( 145356 )
      This problem has been around since the dawn of data centers. One bank in Chicago with IBM mainframes in the 60's had battery UPS + generators to back up the mainframes, an identical setup to back up the cooling system, plus one more identical backup system to cover failure in either of the other two.

      • yes, quite. Any datacentre that relies on utility power and does not have the ability to run everything standalone is at least incompetent - bordering on negligent. Plus, if you buy space in one, without having a backup plan you deserve every bad thing that happens to you.

        Over here, there are laws that require certain establishments (i.e. financial ones) to have redundant everything, including locations.

  • by Leebert ( 1694 ) on Tuesday November 13, 2007 @12:39PM (#21338397)
    A few weeks ago the A/C dropped out in one of our computer rooms. I like the resulting graph: http://leebert.org/tmp/SCADA_S100_10-3-07.JPG [leebert.org]
    • by caluml ( 551744 )
      At 17:00 too - just when you're ready to head home.
    • by milgr ( 726027 )
      That graph doesn't look bad. It indicates that the high temperature was 92F.
      Where I work, the AC in one of the two main labs goes out. I have seen thermometers register 120F. And the computer equipment keeps running until someone notices and asks people to shut down equipment that is not currently needed.

      One of the labs has exterior windows. Once when the AC failed in the middle of the winter, they removed a pane of glass to help cool the lab (this kept the temperature to the low 90's with some equipmen
      • by Leebert ( 1694 )

        That graph doesn't look bad. It indicates that the high temperature was 92F.

        Yes, because the A/C came back online. That curve was nowhere near leveling off. There's 200 or so TiB of SATA in that room along with ~1500 ItaniumII processors... :)

  • What happens when the primary, secondary, and tertiary air conditioners all shut down?

    http://worsethanfailure.com/Articles/Im-Sure-You-Can-Deal.aspx [worsethanfailure.com]

    steveha
  • I have worked for over a decade as a sysadmin and have seen firsthand the correlation between temperature and server failure. I have witnessed two small server rooms melt down due to lack of A/C. It is important to me because I know high temperatures mean a greater likelihood that I will get a phone call in the middle of the night or on a weekend that a drive, processor or whatnot has failed on a machine.

    One thing to consider is that if the heat measured outside a box is high, the heat on the surface of the proce

  • by Animats ( 122034 ) on Tuesday November 13, 2007 @12:42PM (#21338447) Homepage

    Most large refrigeration compressors have "short-cycling protection". The compressor motor is overloaded during startup and needs time to cool, so there's a timer that enforces a minimum time between two compressor starts. Four minutes is a typical delay for a large unit. If you don't have this delay, compressor motors burn out.

    Some fancy short-cycling protection timers have backup power, so the "start to start" time is measured even through power failures. But that's rare. Here's a typical short-cycling timer. [ssac.com] For the ones that don't, like that one, a power failure restarts the timer, so you have to wait out the timer after a power glitch.

    The timers with backup power, or even the old style ones with a motor and cam-operated switch, allow a quick restart after a power failure if the compressor was already running. Once. If there's a second power failure, the compressor has to wait out the time delay.

    So it's important to ensure that a data center's chillers have time delay units that measure true start-to-start time, or you take a cooling outage of several minutes on any short power drop. And, after a power failure and transfer to emergency generators, don't go back to commercial power until enough time has elapsed for the short-cycling protection timers to time out. This last appears to be where Rackspace failed.

    Dealing with sequential power failures is tough. That's what took down that big data center in SF a few months ago.
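
    A toy sketch of the start-to-start lockout behaviour described above (a simplification for illustration only; real anti-short-cycle timers are standalone hardware relays, and the 4-minute figure is just the typical delay mentioned in the comment):

    import time

    class ShortCycleGuard:
        """Enforce a minimum start-to-start interval for a compressor.
        If the timer state survives power loss (battery-backed), a compressor
        that was already running can restart immediately after a brief outage."""
        def __init__(self, min_start_interval_s=240.0):  # ~4 minutes
            self.min_interval = min_start_interval_s
            self.last_start = None

        def can_start(self, now=None):
            now = time.monotonic() if now is None else now
            return self.last_start is None or now - self.last_start >= self.min_interval

        def start(self, now=None):
            now = time.monotonic() if now is None else now
            if not self.can_start(now):
                return False  # still inside the lockout window
            self.last_start = now
            return True

    guard = ShortCycleGuard()
    print(guard.start(0))    # True  -- first start allowed
    print(guard.start(60))   # False -- power blip at t=60s; must wait out the timer
    print(guard.start(240))  # True  -- a full 4 minutes after the previous start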

  • I fail to see where this could be news to anyone who works with data centers. If you want your datacenter to operate during a power outage, you need a generator with enough capacity for your servers/network and your cooling. If a fancy hosting site with SLAs making uptime guarantees doesn't understand this, I think their customers should start looking elsewhere.
    • by bizitch ( 546406 )
      That's the first thing that came to mind for me as well -

      What? No freaking generator? Umm....wtf?

      It's not a matter of IF you lose power - just WHEN you lose power.

      It will happen - I guarantee it
  • by Ron Bennett ( 14590 ) on Tuesday November 13, 2007 @01:09PM (#21338859) Homepage
    While many here are discussing UPSes, chillers, set-points, etc., the most serious flaw is being glossed over ... the lack of redundancy outside the data center, such as multiple, diverse power lines coming in...

    From the articles, it appears that the Rackspace datacenter doesn't have multiple power lines coming in and/or many come in via one feed point.

    How else is it that a car crash quite some distance from the datacenter can cause such disruption? Does anyone even plan for such events? I get the feeling most planners don't, since I've seen first-hand many power failures caused by dumb things like a vehicle hitting a utility pole in places where one would expect more redundancy.

    Ron
    • Re: (Score:3, Informative)

      by PPH ( 736903 )
      You have to pay for redundant feeds from the local utility company. And they aren't cheap. If you don't select a location on the boundary of two independent distribution circuits, the two feeds are worthless.

      I live near a hospital which is located on the boundary between two distribution circuits, each fed from a different substation. That redundancy cost the hospital tens or hundreds of thousands of dollars. But the two substations are fed from the same transmission loop, which runs through the woods (lo

  • by techpawn ( 969834 ) on Tuesday November 13, 2007 @01:10PM (#21338871) Journal
    We've summoned a small demon to let in cool air particles and shunt out hot ones. Sure the weekly sacrifice gets to be a pain after a while, but there's always a pool of willing interns right?
  • by mwilliamson ( 672411 ) on Tuesday November 13, 2007 @01:15PM (#21338967) Homepage Journal
    Every single watt consumed by a computer is turned into heat, and generally released out the back of the case. Computers behave the same as the coil of nichrome wire used in a laundromat clothes dryer. (I guess a few milliwatts get out of your cold room via ethernet cables and photons on fiber.)
  • 5kw? ow. (Score:3, Insightful)

    by MattW ( 97290 ) <matt@ender.com> on Tuesday November 13, 2007 @01:18PM (#21339005) Homepage
    5 kilowatts is a heck of a lot to have on a single rack - assuming you're actually utilizing that. I recently interviewed a half dozen data centers to plan a 20-odd server deployment, and we ended up using 2 cabinets in order to ensure our heat dissipation was sufficient. Since data centers are usually supplying 20 amp, 110 or 120v power, you get 2200-2400 watts available per drop, although it's considered a bad idea to draw more than 15 amps per circuit. We have redundant power supplies in everything, so we keep ourselves at 37.5% of capacity on the drops, and each device is fed from a 20 amp drop coming from a distinct data center PDU. That way, even if one of the data center PDUs implodes, we're still up and at no more than 75% of capacity.

    Almost no data center we spoke to would commit to cooling more than 4800 watts of power at an absolute maximum per rack, and those were facilities with hot/cool row setups to maximize airflow. But that meant they didn't want to drop more than 2x20 amp power drops, plus 2x20 for backup, if you agreed to maintain 50% utilization across all 4 drops. But since you'd really want to stay at or below 75% even in the case of failure, you'd only be using 3600 watts. (In the facility we ended up in, we have a total of six 20 amp drops, and we only actually utilize ~4700 watts.)

    Ultimately, though, the important thing is that cooling systems should be on generator/battery backup power. Otherwise, as this notes, your battery backup won't be useful.
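
    A small sketch of the drop-budgeting arithmetic above (the 120 V, 20 A, 37.5% and 75% figures come from the comment; everything else is illustrative):

    def usable_watts(volts=120.0, breaker_amps=20.0, derate=0.75):
        """Usable watts per drop: the breaker rating derated so you never
        draw more than ~15 A on a 20 A circuit."""
        return volts * breaker_amps * derate

    per_drop_usable = usable_watts()   # 120 V x 20 A x 0.75 = 1800 W
    normal_draw = 0.375 * 120 * 20     # ~37.5% of a drop's raw rating = 900 W
    after_failure = 2 * normal_draw    # one PDU dies: both halves land on one drop
    print(f"{per_drop_usable:.0f} W usable per drop, {normal_draw:.0f} W drawn per drop "
          f"normally, {after_failure:.0f} W on the surviving drop after a PDU failure")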
  • The one, seemingly obvious, question I have is: why aren't the cooling needs on generator/UPS backup?

    I have toured data centers where even the cooling was on battery backup. The idea is that the battery banks hold everything as it is until the generators come fully online (usually within 30 seconds). The batteries/UPS transformers were able to hold the entire system for approx. 30 minutes on battery alone, irrespective of generator status. This also reduced the issues from quick brown-outs...no need to fir
