
Cooling Challenges an Issue In Rackspace Outage 294

miller60 writes "If your data center's cooling system fails, how long do you have before your servers overheat? The shrinking window for recovery from a grid power outage appears to have been an issue in Monday night's downtime for some customers of Rackspace, which has historically been among the most reliable hosting providers. The company's Dallas data center lost power when a traffic accident damaged a nearby power transformer. There were difficulties getting the chillers fully back online (it's not clear if this was equipment issues or subsequent power bumps) and temperatures rose in the data center, forcing Rackspace to take customer servers offline to protect the equipment. A recent study found that a data center running at 5 kilowatts per server cabinet may experience a thermal shutdown in as little as three minutes during a power outage. The short recovery window from cooling outages has been a hot topic in discussions of data center energy efficiency. One strategy being actively debated is raising the temperature set point in the data center, which trims power bills but may create a less forgiving environment in a cooling outage."
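A back-of-the-envelope sketch (with assumed numbers, not figures from the study cited above) shows why the recovery window is so short if you treat the room air as the only thermal buffer; real rooms have extra thermal mass, so this is pessimistic but in the right ballpark:

    # Hedged sketch: how quickly room air alone heats up once cooling stops.
    # All inputs are illustrative assumptions, not figures from the article.
    AIR_DENSITY = 1.2          # kg/m^3
    AIR_SPECIFIC_HEAT = 1005.0 # J/(kg*K)

    def minutes_to_rise(heat_load_w, room_volume_m3, allowed_rise_c):
        """Minutes for the room air to warm by allowed_rise_c under heat_load_w."""
        air_mass_kg = AIR_DENSITY * room_volume_m3
        seconds = air_mass_kg * AIR_SPECIFIC_HEAT * allowed_rise_c / heat_load_w
        return seconds / 60.0

    # One 5 kW cabinet sharing roughly 40 m^3 of room air, allowed to rise 15 C:
    print(round(minutes_to_rise(5000, 40, 15), 1), "minutes")  # ~2.4 minutes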
  • This is number 3 (Score:5, Informative)

    by DuctTape ( 101304 ) * on Tuesday November 13, 2007 @01:06PM (#21337871)
    This is actually Rackspace's number 3 outage in the past couple of days. My company was only (!) affected by outages 1 and 2. My boss would have had a fit if number 3 had taken us down for a third time.

    Other publications [valleywag.com] have noted it was number 3, too.

    DT

  • Which only shows (Score:3, Informative)

    by CaptainPatent ( 1087643 ) on Tuesday November 13, 2007 @01:06PM (#21337873) Journal
    If you want 100% uptime, it's important to have backup power for the cooling as well as for the server systems themselves.
     
    Is this really news?
  • by CaptainPatent ( 1087643 ) on Tuesday November 13, 2007 @01:15PM (#21338001) Journal

    Is there a general rule for figuring out how many BTUs of cooling you need for a given wattage of power supplies?
    I actually found a good article [openxtra.co.uk] about this earlier on and it helped me purchase a window unit for a closet turned server-room. Hope that helps out a bit.
  • Re:Which only shows (Score:5, Informative)

    by jandrese ( 485 ) <kensama@vt.edu> on Tuesday November 13, 2007 @01:15PM (#21338003) Homepage Journal
    If you want 100% uptime (which is impossible, but you can put enough 9s in your reliability to get close enough), you need your data distributed across multiple, geographically separate data centers, overprovisioned enough that the loss of one data center won't overload the others. Keep the geographical separation large, because you never know when the entire eastern (or western) seaboard will lose power or when a major backhaul router will go down or have a line cut. Preferably each data center should get power from multiple sources, and multiple POPs onto the Internet from each center are almost mandatory.
  • by Critical Facilities ( 850111 ) on Tuesday November 13, 2007 @01:16PM (#21338015)
    Try this. [anver.com]
  • by Bombula ( 670389 ) on Tuesday November 13, 2007 @01:19PM (#21338065)
    Liquid nitrogen is the cooling answer, for sure. Then you're not dependent upon power of any kind at all. The nitrogen dissipates as it warms, just like a pool stays cool on a hot day by 'sweating' through evaporation, and you just top up the tanks when you run low. It's cheap and it's simple. That's why critical cold storage applications, like those in the biomedical industry, don't use 'chillers' or refrigerators or anything like that. If you really want to put something on ice and keep it cold, you use liquid nitrogen.
  • by fifedrum ( 611338 ) on Tuesday November 13, 2007 @01:28PM (#21338245) Journal
    It's not really this simple, but a decent back-of-the-napkin method is to take the amperage and the voltage and multiply them, then multiply again to get BTU/hour, and divide to get tons.

    20A x 110V = 2200 VA, which doesn't directly translate to Watts (as someone will surely point out), but for cooling purposes it's not a bad rule of thumb to treat the VA as Watts, because that builds in overhead into which you will surely grow your server space. Then go from Watts to BTU/hour.

    2200 Watts x 3.412 BTU/hr per Watt ≈ 7506 BTU/hr

    12000 BTU/hr = 1 ton. Do that calculation for all possible hosts in your space, round up. Then purchase an additional, but portable, cooler for the space. Use that cooler for emergencies, like chilling beer, and if the main chillers break, you'll have nice cold beer to drink while the HVAC guys fix the big units and you wait for your less-essential machines to come up.

    Most people will do the calculation and find their data center cooling systems are woefully undersized, running at 100% whenever the outside air temperature is above 50F...
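    As a sketch, the same napkin math in code (deliberately treating VA as Watts for headroom, as above; the circuit values are made up):

        # Napkin sizing: sum amps*volts per circuit, treat VA as Watts,
        # convert to BTU/hr, then to tons of refrigeration.
        BTU_PER_HR_PER_WATT = 3.412
        BTU_PER_HR_PER_TON = 12000.0

        def cooling_tons(circuits):
            """circuits: list of (amps, volts) pairs for every host in the space."""
            total_va = sum(amps * volts for amps, volts in circuits)
            return total_va * BTU_PER_HR_PER_WATT / BTU_PER_HR_PER_TON

        # Example: five 20A/110V circuits
        print(round(cooling_tons([(20, 110)] * 5), 2), "tons")  # ~3.13 tons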
  • Physics (Score:4, Informative)

    by DFDumont ( 19326 ) on Tuesday November 13, 2007 @01:40PM (#21338411)
    For those of you who either didn't take Physics or slept through it, Watts and BTU/hr are both measurements of POWER. Add up all the (input) wattages and use something like http://www.onlineconversion.com/power.htm/ [onlineconversion.com] to convert. The same page also has a conversion to 'tons of refrigeration'.
    Also note - don't EVER use the rated wattage of a power supply, because that's what it can SUPPLY, not what it uses. Instead use the current draw multiplied by the voltage (US: 110 for single phase, 208 for dual phase in most commercial buildings, 220 only in homes or where you know that's the case). This is the 'VA' [Volt-Amps] figure. Use this number for 'watts' in the conversion to refrigeration needs.
    Just FYI - a watt is defined as 'the power developed in a circuit by a current of one ampere flowing through a potential difference of one volt' (see http://www.siliconvalleypower.com/info/?doc=glossary/ [siliconvalleypower.com]), i.e. 1W = 1VA. The dirty little secret about power calculations is that there is another factor thrown in, typically about 0.65, called the 'power factor', which UPS and power supply manufacturers use to lower the overall wattage. That's why you always use VA (rather than the reported wattage): in a pinch you can always measure both voltage and amperage (under load).
    Basically, do this: take the amperage draws for all the devices in your rack/room/data center, multiply each by the applied voltage for that device (110 or 208), and add all the products together. Then convert that number to tons of refrigeration. This is your minimum required cooling for a lights-out room. If you have people in the room, count 1100 BTU/hr for each person and add that to the requirements (after conversion to whatever unit you're working with). Some HVAC contractors want specifications in BTU/hr and others want tons. Don't forget lighting either, if it's not a 'lights out' operation. A 40W fluorescent bulb is going to dissipate 40W (as heat). You can use lighting numbers directly, since they are a measure of the actual heat thrown, not of the power used to light the bulb.
    Make sense?

    Dennis Dumont
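    A rough sketch of the procedure described above (circuit, occupancy, and lighting values are illustrative only):

        # Sum device VA (used as Watts), add people at ~1100 BTU/hr each,
        # count lighting wattage directly as heat, then convert to tons.
        BTU_PER_HR_PER_WATT = 3.412
        BTU_PER_HR_PER_TON = 12000.0
        BTU_PER_HR_PER_PERSON = 1100.0

        def required_tons(device_circuits, people, lighting_watts):
            device_btu = sum(a * v for a, v in device_circuits) * BTU_PER_HR_PER_WATT
            lighting_btu = lighting_watts * BTU_PER_HR_PER_WATT
            total_btu = device_btu + lighting_btu + people * BTU_PER_HR_PER_PERSON
            return total_btu / BTU_PER_HR_PER_TON

        # Example: ten 16A/208V feeds, two people, 800 W of lighting
        print(round(required_tons([(16, 208)] * 10, 2, 800), 1), "tons")  # ~9.9 tons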
  • by afidel ( 530433 ) on Tuesday November 13, 2007 @01:42PM (#21338433)
    The problem is humidity: a big part of what an AC system does is keep humidity in an acceptable range. If you were going to try once-through cooling with outside air, you'd spend MORE power during a significant part of the year in most climates trying either to humidify or to dehumidify the incoming air.
  • by DerekLyons ( 302214 ) <fairwater@@@gmail...com> on Tuesday November 13, 2007 @02:08PM (#21338851) Homepage

    Liquid nitrogen is the cooling answer, for sure. Then you're not dependent upon power of any kind at all.

    Except of course the power needed to create the LN2.
     
     

    That's why critical cold storage applications like those in the biomedical industry don't use 'chillers' or refrigerators or anything like that. If you really want to put something on ice and keep it cold, you use liquid nitrogen.

    As above - how do you think they prevent the LN2 from evaporating? The LN2 is a buffer against loss of power, but typically they have a pretty serious cryocooler to keep the LN2 there when they do have power.
  • Re:Physics (Score:3, Informative)

    by timster ( 32400 ) on Tuesday November 13, 2007 @02:16PM (#21338987)
    The dirty little secret about power calculations is that there is another factor thrown in, typically about 0.65, called the 'power factor' that UPS and power supply manufacturers use to lower the overall wattage.

    It's not "thrown in" by the manufacturers. The dirty little secret is simply that you are talking about AC circuits. 1W = 1VA in AC circuits only if the volts and the amps are in phase -- which they aren't.

    Take a sine wave -- in AC, that's what your voltage looks like, always changing. If you're powering something purely resistive like an incandescent bulb, your amps follow the same sine wave and 1W=1VA. But inductive loads like power supplies introduce a lag in the current, so that the amps aren't in phase with the volts. As a result, you cannot naively multiply the RMS volts by the RMS amps to get the average wattage -- you have to take the integral of volts times amps through the curve. And for part of that curve, the voltage and the current flow in different directions, which represents negative power (that is, the inductive circuitry is pushing current back across the wire). As a result of this the overall power will always be less than the volt-amps.
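    A quick numerical sketch of that point (voltage, current, and lag values are illustrative):

        # With the current lagging the voltage, average power (mean of v*i over a
        # cycle) comes out below RMS volts * RMS amps -- that ratio is the power factor.
        import math

        V_RMS, I_RMS = 120.0, 10.0       # illustrative values
        PHASE_LAG = math.radians(49.5)   # lag giving a power factor near 0.65

        def average_power(v_rms, i_rms, phase, steps=100000):
            total = 0.0
            for k in range(steps):
                wt = 2 * math.pi * k / steps
                v = math.sqrt(2) * v_rms * math.sin(wt)
                i = math.sqrt(2) * i_rms * math.sin(wt - phase)
                total += v * i
            return total / steps

        print(round(average_power(V_RMS, I_RMS, PHASE_LAG)))  # ~779 W real power
        print(round(V_RMS * I_RMS))                           # 1200 VA apparent power
        print(round(math.cos(PHASE_LAG), 2))                  # power factor ~0.65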
  • by Anonymous Coward on Tuesday November 13, 2007 @02:21PM (#21339035)
    Well, specifically, a ton, in HVAC terms, is the amount of heat one ton of ice can absorb as it melts over 24 hours. I.e., back when refrigeration replaced ice, if you had one ton of ice delivered per day, a one-ton cooler would fit the bill.
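    That definition gives the familiar 12,000 BTU/hr directly, using the latent heat of fusion of ice (roughly 144 BTU/lb):

        # One ton of ice melting over 24 hours: 2000 lb * ~144 BTU/lb / 24 hr
        POUNDS_PER_TON = 2000
        LATENT_HEAT_BTU_PER_LB = 144
        print(POUNDS_PER_TON * LATENT_HEAT_BTU_PER_LB / 24, "BTU/hr")  # 12000.0 BTU/hr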
  • by R2.0 ( 532027 ) on Tuesday November 13, 2007 @02:24PM (#21339089)
    The general rule I always follow is: HIRE AN ENGINEER! Ferchrissake, there are people who do these calculations for a living, and they have insurance in case they screw up. You want to trust your data center to advice from Slashdot and a back-of-the-envelope calculation?

    Sheesh - what's the name of your company so I can sell short?
  • Re:Which only shows (Score:3, Informative)

    by autocracy ( 192714 ) <slashdot2007@sto ... .com minus berry> on Tuesday November 13, 2007 @02:49PM (#21339513) Homepage
    Reading the article, they WERE on backup power. Emergency crews shut down the backup power to the chillers temporarily while they were working in the area, so the chillers had to start again. Cycling these big machines isn't instant.
  • by arth1 ( 260657 ) on Tuesday November 13, 2007 @02:53PM (#21339569) Homepage Journal
    While thinking outside the box is all well and fine, it's even better when combined with Common Knowledge. Like knowing that caves and mines (a) tend to be rather warm when deep enough, and (b) have a fixed amount of air.

    As for the power efficiency of pumping air from several hundred meters away compared to pumping it through the grille of an AC unit, well, there's a reason why skyscrapers these days have multiple central air facilities instead of just one: Economics.

    I'd like to see you pump air for any long distance with your exercise bike :-)
  • by PPH ( 736903 ) on Tuesday November 13, 2007 @03:18PM (#21339955)
    You have to pay for redundant feeds from the local utility company. And they aren't cheap. If you don't select a location on the boundary of two independent distribution circuits, the two feeds are worthless.

    I live near a hospital which is located on the boundary between two distribution circuits, each fed from a different substation. That redundancy cost the hospital tens or hundreds of thousands of dollars. But the two substations are fed from the same transmission loop, which runs through the woods (lots of trees and inaccessible rights-of-way), so the most probable fault will take both stations, circuits, and sources to the hospital offline.

    The moral of the story: don't depend on an outside organization (the local utility) for service when it's your neck on the line and not theirs.

  • by R2.0 ( 532027 ) on Tuesday November 13, 2007 @04:03PM (#21340621)
    What you and the OP are describing is called "free cooling", a long-established principle in HVAC design. It is used in commercial and industrial buildings all the time. The reasons it is not used much in residential construction are:

    1) until relatively recently, houses "breathed" quite well on their own due to loose construction. With tightening energy codes and the use of Tyvek and better windows, houses don't have a lot of air exchange through the boundaries, and problems ensue - "stuffiness", moisture, mold, "sick building". Residential construction hasn't thought this through yet - there are some builders who now refuse to use Tyvek due to ventilation (and liability) issues.

    2) Controls become an order of magnitude more complicated. Most residential systems are "bang bang" systems - on or off based on one criterion. To introduce free cooling, you need outside air sensors, dampers, actuators, and a controller a lot more complex than a home t-stat. For most residential builders, that's a couple thousand dollars in extra costs that can't be recouped in the sale price - most owners just don't care, and when you are building 5000 of the same unit, "most owners" rule.

    As for dehumidification, you have it backwards - dehumidification typically requires a COLDER coil than cooling alone, and then you reheat the air. It is horribly inefficient, but sometimes necessary - with a "tight" building, you have to get the moisture out somehow, and supercooling the air inside just isn't a good idea (other than making for lots of erect nipples, that is).

    Finally, what makes sense for one situation may not for another - a data center uses orders of magnitude more cooling than a house or a common office building. Moving the amount of air necessary to provide that cooling gets really hard - the power a fan requires increases with the CUBE of the flow required, so to get twice the airflow you use 8x the power. It's the same with pumps, but because the heat capacity of water or glycol is so much greater than that of air, the effect is much smaller.
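    A tiny sketch of that fan-law point (the baseline numbers are made up):

        # Fan affinity law: power scales with the cube of the flow ratio.
        def fan_power_kw(base_power_kw, base_flow_cfm, new_flow_cfm):
            return base_power_kw * (new_flow_cfm / base_flow_cfm) ** 3

        print(fan_power_kw(5.0, 10000, 20000), "kW")  # doubling flow: 40.0 kW, i.e. 8x the power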
  • by Critical Facilities ( 850111 ) on Tuesday November 13, 2007 @04:10PM (#21340713)
    I agree with almost all of your post with the only exception being the cooling systems on UPS. There is absolutely no reason to put cooling systems on UPS power. Large, inductive loads are a UPS's enemy. A big inrush current of a chiller starting up would beat the crap out of your battery string(s).

    Having said that, you are exactly right on having both your UPS system(s) and your cooling system(s) diversified. I tend to get into this argument with people regarding what constitutes a "data center" and one of the most significant parts of determining what actually constitutes a "data center" is redundancy. This means not just redundant utility power feeds, but redundant UPS systems/modules, redundant generators, redundant chillers/CRACs, redundant PDU's, etc etc.

    For our cooling systems, we have 4 chillers (we only need 2) and 20 CRACs (we only need 10). Any problem with any system can be mitigated by rolling to the redundant system.
  • by MenTaLguY ( 5483 ) on Tuesday November 13, 2007 @04:13PM (#21340765) Homepage
    Oxygen levels elevated to as little as 23% can lead to a violent increase in the flammability of materials like cloth and hair. Controlling gas concentrations so they remain at safe levels can be very tricky.

    Setting aside evaporation, be careful not to get it on anything. LOX can easily saturate anything remotely porous and oxidisable, effectively turning it into an unstable explosive until the LOX evaporates... at LOX or LN2 temperatures, that can even become an issue with oxygen condensing from the air onto your equipment/insulation. Forget just avoiding the creation of sparks -- you'd better be sure that the safety measures have eliminated all LOX-incompatible materials, and be careful not to bump anything too hard!

    And of course, even a tiny fire or explosion can easily lead to rapid boiloff. Sudden boiloff can be an issue simply because of drastically increased pressure and still-cold temperature. Liquefied gases like LOX, LN2, etc. expand a LOT when they boil (about a 600-800x increase in volume simply in transitioning from liquid to gas), even while remaining dangerously cold. Imagine being in a closed room with a punctured dewar. Assuming you've escaped being hit by the dewar, which has gone flying like a deflating balloon with reinforced-concrete-shattering force, you've potentially got ruptured eardrums and possibly internal injuries from the abrupt pressure change, which has also jammed the door. You fall to the floor from the pain of burns on your lower body from the ultracold gas that has quickly filled the lower part of the room -- which then starts to burn your face and lungs too as you begin breathing it.

    Hopefully the facility you're in has proper emergency ventilation, adequate room size, properly constructed doors, and protective equipment to avoid this scenario, but you still don't want to be in the room if it happens, if you can help it... Cryogenic gases are seriously dangerous. Don't underestimate them or treat them lightly.
  • by R2.0 ( 532027 ) on Tuesday November 13, 2007 @05:52PM (#21342185)
    Are you stupid? A heat exchanger (btw, not "very common" at all in residential) is the OPPOSITE of free cooling! In free cooling, COLD outside air is brought directly into the space, bypassing the cooling coils. Why? Because the air is cool already.

    With a heat exchanger, you are bringing cool air in, and then HEATING IT UP with the waste heat from the exhaust air. Great for saving energy in a residence, when one wants to stay toasty warm - not so great in a data center or office building when there is still a cooling load in winter. So absolutely nothing you said has anything to do with free cooling. Heat exchangers are great for what they do, but free cooling isn't it.

    "That's also untrue. The direct, free-flowing heat exchange between hot and cold coils allows dehumidifiers to be much more energy efficient, using typically around 1/3rd as much power for the same volume of air."

    True - IF you are using a heat exchanger. But if one is not - let's say, in an office building on a cool spring morning - then you have a problem. You bring in nice 65F air at 65-70% RH - it's wet. You don't heat it up through a HX, because you need the 65F air to maintain the temperature setpoint. But now you are dumping a lot of water into the space, and it doesn't *feel* cool. So you run your cooling coil at, say, a 50F discharge temp. That is below the dew point, and it pulls moisture out of the air. But now you are dumping 50F air into the space, so the space temp gets driven down, and you get the nipple effect. So what do you do? REHEAT the air to 65F. Which, BTW, is exactly what home dehumidifiers do - the discharge air is reheated to a temp greater than the intake air, reflecting the energy added by the electricity. TANSTAAFL.

    You can throw an HX into that equation, but it certainly isn't a dumb device - the control logic needs to know when to open and close the air dampers, so as not to interfere with free cooling.

    "As to coil temperature, obviously any temperature will work, to varying degrees of effectiveness. You'll need to provide some numbers to back up your claim. General-purpose dehumidifiers are usually just slightly modified AC units"

    Bullshit. The coil temperature MUST be less than the dew point of the air, by the very definition of "dew point". Practically, it needs to be substantially lower for the dehumidification to really work. Often, that temp is lower than the desired discharge air temp. See the example above.
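    As a rough check on the numbers in the example above, a Magnus-type approximation puts the dew point of 65F air at 70% RH near 55F, so a 50F coil is indeed below it. A sketch with standard Magnus constants (not a design tool):

        # Dew point via the Magnus approximation (constants a=17.27, b=237.7 C).
        import math

        def dew_point_f(temp_f, rel_humidity):
            """Approximate dew point (F) for a dry-bulb temp (F) and RH as a fraction."""
            t_c = (temp_f - 32.0) * 5.0 / 9.0
            a, b = 17.27, 237.7
            gamma = a * t_c / (b + t_c) + math.log(rel_humidity)
            dp_c = b * gamma / (a - gamma)
            return dp_c * 9.0 / 5.0 + 32.0

        print(round(dew_point_f(65, 0.70), 1), "F")  # ~55 F dew point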

    Call me when you've bought a psychrometric chart and a ductulator. There are plenty of design decisions to be made when designing an HVAC system, unfortunately including appeasing owners who think they are design geniuses.
  • by Spazmania ( 174582 ) on Tuesday November 13, 2007 @06:01PM (#21342305) Homepage
    the local data centre there had a 15 degree C ambient baseline

    Well, that's just incompetent. For one thing, commercial electronics experience increased failure rates as you move away from an ambient 70 degrees F, regardless of which direction you move. Running them at 59 degrees F (15 C) is just as likely to induce intermittent failures as running them at 80 degrees F.

    For another, you're supposed to design your cooling system to accommodate all of the planned heat load in the environment. If your generators will be adding heat then the A/C needs to have sufficient capacity to take that heat back out.

    And anyway, your generators shouldn't be adding heat. They should be walled off from the data center with exterior air exchange. Otherwise an error in the exhaust ducting risks killing your operators with CO poisoning.
  • Re:Which only shows (Score:3, Informative)

    by shoemakc ( 448730 ) on Wednesday November 14, 2007 @03:48AM (#21347037) Homepage

    Well, it's clear by that statement that you have no idea of the infrastructure of a Data Center.


    Believe it or not, I've designed both, and while I certainly don't claim to be an expert on all the IT equipment, I've got a pretty good idea of the electrical systems that go into them.

    My description of the emergency branches was intentionally vague because their full definitions comprise some dozens of pages in NFPA 99. I assumed most people wouldn't care about that level of detail :-)

    Anyway, my point was that while a typical data center has 3 types of power available (Normal, Emergency and UPS), a typical hospital usually has at least 5:

    Normal
    Emergency (Life Safety)
    Emergency (Critical)
    Emergency (Equipment)
    Emergency (UPS)

    These generally include separate panels, feeders, automatic transfer switches, etc., so I still stand by my claim that hospitals have the more complex electrical system. Also consider that hospitals now contain increasingly critical data center facilities. Of course I will concede that the UPS topology of a large data center is generally far more complex than a hospital's... but again, that's just one part of the puzzle.

    "Starting" and closing to the Buss are 2 very different things. If you believe that large generators are starting and closing to the buss at full voltage and balanced frequencies in 3 seconds, I have a bridge that you may be interested in purchasing. To give you some perspective, our 2 generators for our Data Center (2 Megawatts each) start and close to the buss (and are assume the building load) in 15 seconds. We, of course, circulate the heated jacket water to keep the oil, cylinders, etc warm and ready as you described.


    I'll take that bridge. The reason your generators take 15 seconds to start is that they comprise a Level 2 system (as defined in NFPA 110), not the Level 1 system that hospitals require. Level 1 includes a whole bunch of additional requirements (i.e., expense) that are simply not required where an outage will not potentially risk human life, i.e., data centers. Now, I'm not sure about all the modifications manufacturers must make to their gen sets to meet these requirements, but I can assure you that the 10-second start (which includes startup, sync, and bus connection) is required by code. Also, I've been there for the monthly test that hospitals are required to perform, and yep... they really are that quick.

    Now again... it's not that your generators are bad - it's just that there's no reason for a company to spend the extra cash on that sort of system when a longer startup time will do; typically the UPS is sized for 15 minutes of runtime, and the HVAC equipment can go down for a few minutes without the room overheating.

    There are many components that you're probably unaware of and layers of redundancy that are invisible to those who do not work in the "back of house" Critical Environments. To reiterate, I'm not saying that hospitals aren't complex nor am I saying that they do not have Critical Environments within them. I'm simply saying that you may have a perception of what a Data Center is that is not necessarily consistent with what is actually the case.


    Similarly, I'm not claiming that hospitals are more complex overall systems....just that their electrical distribution systems typically are.

    -Chris
