
Cooling Challenges an Issue In Rackspace Outage

Posted by Zonk
from the getting-a-touch-warm-in-here dept.
miller60 writes "If your data center's cooling system fails, how long do you have before your servers overheat? The shrinking window for recovery from a grid power outage appears to have been an issue in Monday night's downtime for some customers of Rackspace, which has historically been among the most reliable hosting providers. The company's Dallas data center lost power when a traffic accident damaged a nearby power transformer. There were difficulties getting the chillers fully back online (it's not clear if this was equipment issues or subsequent power bumps) and temperatures rose in the data center, forcing Rackspace to take customer servers offline to protect the equipment. A recent study found that a data center running at 5 kilowatts per server cabinet may experience a thermal shutdown in as little as three minutes during a power outage. The short recovery window from cooling outages has been a hot topic in discussions of data center energy efficiency. One strategy being actively debated is raising the temperature set point in the data center, which trims power bills but may create a less forgiving environment in a cooling outage."

  • by Anonymous Coward on Tuesday November 13, 2007 @01:20PM (#21338091)
    Has anyone thought about putting data-centers in upper Canada / arctic regions? Just (honestly) curious.

    There isn't any cheap high-speed fiber up there. Even if there was, the additional lag due to 5000 miles of fiber would be annoying, not to mention shipping & transport costs.
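To put a rough number on the lag mentioned above, here is a minimal sketch of the propagation delay over 5000 miles of fiber. The refractive index of 1.47 is an assumed typical value for silica fiber, not a figure from the thread, and the function name is made up for illustration. Routing, switching, and queuing delays are ignored, so real latency would be higher.

```python
# Rough one-way propagation delay over a long fiber run.
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.47       # ASSUMPTION: typical silica fiber
KM_PER_MILE = 1.609344


def fiber_delay_ms(distance_miles: float) -> float:
    """Return the one-way propagation delay in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    speed_km_s = SPEED_OF_LIGHT_KM_S / FIBER_REFRACTIVE_INDEX
    return distance_km / speed_km_s * 1000.0


one_way = fiber_delay_ms(5000)
print(f"one-way: {one_way:.1f} ms, round trip: {2 * one_way:.1f} ms")
```

Under these assumptions the delay comes out a little under 40 ms each way, so roughly 80 ms is added to every round trip before any equipment delay is counted.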
  • by Ron Bennett (14590) on Tuesday November 13, 2007 @02:09PM (#21338859) Homepage
    While many here are discussing UPSes, chillers, set-points, etc., the most serious flaw is being glossed over ... the lack of redundancy outside the data center, such as multiple, diverse power lines coming in...

    From the articles, it appears that the Rackspace datacenter doesn't have multiple power lines coming in and/or many come in via one feed point.

    How else is it that a car crash quite some distance from the datacenter can cause such disruption? Does anyone even plan for such events? I get the feeling most planners don't, since I've seen first-hand many power failures, in places where one would expect more redundancy, caused by dumb things like a vehicle hitting a utility pole.

    Ron
  • by Bandman (86149) <bandman@gmail . c om> on Tuesday November 13, 2007 @02:10PM (#21338889) Homepage
    I hate to pick, but I think you'll find rotund people actually have a lower surface to mass ratio than thin people.

  • by mwilliamson (672411) on Tuesday November 13, 2007 @02:15PM (#21338967) Homepage Journal
    Every single watt consumed by a computer is turned into heat, and generally released out the back of the case. Computers behave the same as a coil of nichrome wire like the one used in a laundromat clothes dryer. (I guess a few milliwatts get out of your cold room via ethernet cables and photons on fiber.)
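If every watt ends up as heat, the cooling load follows directly from the electrical load. The sketch below converts watts to BTU per hour and tons of refrigeration using the standard conversion factors (1 W is about 3.412 BTU/h, 1 ton is 12,000 BTU/h). The function names are hypothetical, and the 5 kW figure is the per-cabinet load from the story summary.

```python
# Convert an electrical load into the cooling load it creates,
# assuming all consumed power is released as heat in the room.
BTU_PER_HOUR_PER_WATT = 3.412142    # standard conversion factor
BTU_PER_HOUR_PER_TON = 12_000.0     # one ton of refrigeration


def watts_to_btu_per_hour(watts: float) -> float:
    """Heat output in BTU/h for a given electrical load in watts."""
    return watts * BTU_PER_HOUR_PER_WATT


def watts_to_cooling_tons(watts: float) -> float:
    """Tons of refrigeration needed to remove that heat continuously."""
    return watts_to_btu_per_hour(watts) / BTU_PER_HOUR_PER_TON


rack_watts = 5000.0  # the 5 kW cabinet from the story summary
print(f"{watts_to_btu_per_hour(rack_watts):.0f} BTU/h, "
      f"{watts_to_cooling_tons(rack_watts):.2f} tons of cooling")
```

This counts only the IT load. Fans, lighting, and losses in the power distribution would add to it.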
  • 5kw? ow. (Score:3, Insightful)

    by MattW (97290) <matt@ender.com> on Tuesday November 13, 2007 @02:18PM (#21339005) Homepage
    5 kilowatts is a heck of a lot to have on a single rack - assuming you're actually utilizing that. I recently interviewed a half dozen data centers to plan a 20-odd server deployment, and we ended up using 2 cabinets in order to ensure our heat dissipation was sufficient. Since data centers are usually supplying 20 amp, 110 or 120v power, you get 2200-2400 watts available per drop; although it's considered a bad idea to draw more than 15 amps per circuit. We have redundant power supplies in everything, so we keep ourselves at 37.5% of capacity on the drops, and each device is fed from a 20 amp drop coming from a distinct data center pdu. That way even if one of the data center pdus implodes, we're still up and at 75%- capacity.

    Almost no data center we spoke to would commit to cooling more than 4800 watts of power at an absolute maximum per rack, and those were facilities with hot/cool row setups to maximize airflow. But that meant they didn't want to drop more than 2x20 amp power drops, plus 2x20 for backup, if you agreed to maintain 50% utilization across all 4 drops. But since you'd really want to maintain 75%- even in the case of failure, you'd only be using 3600 watts. (In the facility we ended up in, we have a total of 6 20 amp drops, and we only actually utilize ~4700 watts.)

    Ultimately, though, the important thing is that cooling systems should be on generator/battery backup power. Otherwise, as this story notes, your battery backup won't be useful.
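The power-drop arithmetic in the comment above can be checked with a short sketch. The function names are hypothetical. The inputs (20 amp drops at 110 or 120 volts, 37.5% and 50% utilization, four drops per rack) are the figures quoted in the comment, and the sketch assumes load is spread evenly across drops and shifts evenly onto the survivors after a failure.

```python
def drop_capacity_watts(amps: float = 20.0, volts: float = 120.0) -> float:
    """Nominal capacity of a single power drop."""
    return amps * volts


def usable_watts(drops: int, fraction: float,
                 amps: float = 20.0, volts: float = 120.0) -> float:
    """Total load when every drop runs at the given fraction of capacity."""
    return drops * drop_capacity_watts(amps, volts) * fraction


def fraction_after_failure(normal_fraction: float, drops: int,
                           failed: int = 1) -> float:
    """Per-drop utilization once the load of failed drops moves to the rest."""
    if failed >= drops:
        raise ValueError("at least one drop must survive")
    return normal_fraction * drops / (drops - failed)


print(drop_capacity_watts(20, 110), drop_capacity_watts(20, 120))  # per-drop range
print(usable_watts(4, 0.50))    # four drops at 50%
print(usable_watts(4, 0.375))   # four drops at 37.5%
print(fraction_after_failure(0.375, 2, failed=1))  # one of a redundant pair fails
```

Running at 37.5% per drop is what keeps the surviving feed at 75% when its partner fails, which in turn is why the four-drop rack works out to 3600 watts and not 4800.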
  • by cjanota (936004) on Tuesday November 13, 2007 @02:37PM (#21339297)
    Where do you think current AC units dump all the heat that they extract? What the GP is suggesting just cuts out the middle man (AC). The AC units produce quite a bit of heat themselves.
  • by R2.0 (532027) on Tuesday November 13, 2007 @02:41PM (#21339351)
    Part of the problem is that it is a lot easier to move heat via liquid than air. The conventional design uses chillers mounted outside the space to cool a liquid medium/refrigerant, which is then pumped very efficiently to cooling coils in the space (modify for DX coils). The air inside the conditioned space makes a very short trip through the servers, across the room, over the coil, and back out again.

    Under your scenario, the AIR is the working medium - it is cooled on the outside, and then moved inside via relatively inefficient fans. And it is a SHITLOAD of air - that means either high volumes (huge ductwork) or high velocity (how do you like working in a wind tunnel?).

    Fans on UPS? Are you kidding? How big do you want your UPS to be? Fans suck a LOT of power, especially when you have them doing what you propose.

    "as part of the design for the cluster room in our new building I've specified such a system"

    You've specified? From your post, it's obvious you aren't an HVAC engineer, so what are your qualifications? Did you do an analysis to see what the real ROI is? Or is it just so obvious to you why years of HVAC design are totally wrong?
  • Re:Physics (Score:2, Insightful)

    by EmagGeek (574360) <gterich@aol.cTWAINom minus author> on Tuesday November 13, 2007 @02:44PM (#21339417) Journal
    First, 1 Watt is the movement of energy at the rate of 1 Joule per second, and need not be electrically related at all. A watt is energy per unit time.

    Second, power factor is irrelevant to cooling calculations because reactive power does not generate heat, even though it does generate imaginary current in the generating device. This is why power companies bill industrial power based on VAH and not on KWH.

    Generators are rated for the magnitude of their output current, not just the real component of it. This is also why power companies try their best to load all three phases equally - because in that case the net instantaneous current out of the generator is zero and the physical forces on the windings and stators are constant and uniform.

    Also, most server-class power supplies have power factor correction which adjusts the power factor to 1 by adding shunt capacitance to the input of the supply.

    The major point that most people seem to be missing in this dialogue is that a 500W PC power supply does not draw 500W from the wall by simply being plugged in. A PC power supply will deliver only what power is needed by the devices connected to it. For example, my server at home is an X2-4800 with 8 hard disks in it, 4 cooling fans, and a 600W power supply. The total power draw on the server box, two UPS units, the 24 port ethernet switch, the router, the cable modem, and the overhead light, is 262W at CPU idle. Just because it has redundant 500W supplies doesn't mean it's going to draw 1000W just sitting there. I have not measured the power with both CPUs at 100%.
  • by Skapare (16644) on Tuesday November 13, 2007 @03:33PM (#21340167) Homepage

    A large data center should not have one big massive UPS anyway. It should all be divided out into various load sections, each with its own UPS+battery system. Once you do that, then you can have cooling on its own UPS without any risk of the cooling system impacting the UPS feeding the computers ... if you really want cooling on UPS (it can be done, but generally is not the best way). Surely you would have the cooling on its own three phase circuits.

    Perhaps a better approach is a smart cooling system that rotates the starting of compressors on various units so you always have some number of units running and some number not running, at the ratio needed for the current thermal demands. Then when there is an outage that has to go to generators, only a limited number of units will have been recently started just before the outage and need to be thermally protected. The controller skips those and starts the idle units (unless you are already maxed out, in which case you'd have no idle units). But you will need to have the cooling on the generators.

    If you are going to have a backup distribution circuit from the utility, it should be physically separate from the primary circuit so that it is not necessary to shut down both to deal with things like a traffic accident.
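The compressor rotation idea in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `CoolingUnit` class, the `units_to_start` function, and the 300 second lockout are all hypothetical, and a real controller would follow the compressor manufacturer's restart limits.

```python
from dataclasses import dataclass


@dataclass
class CoolingUnit:
    name: str
    last_start_s: float  # time of the most recent compressor start, in seconds


def units_to_start(units, needed, now_s, lockout_s=300.0):
    """Pick units to start once generator power is available.

    Units whose compressor started within the last `lockout_s` seconds
    are skipped (ASSUMED protection window); among the rest, the units
    that have gone longest since their last start are chosen first.
    """
    eligible = [u for u in units if now_s - u.last_start_s >= lockout_s]
    eligible.sort(key=lambda u: u.last_start_s)
    return [u.name for u in eligible[:needed]]


units = [
    CoolingUnit("A", last_start_s=0.0),
    CoolingUnit("B", last_start_s=900.0),   # started just before the outage
    CoolingUnit("C", last_start_s=400.0),
    CoolingUnit("D", last_start_s=950.0),   # started just before the outage
]
print(units_to_start(units, needed=2, now_s=1000.0))
```

If fewer eligible units exist than the thermal load needs, the function simply returns what it can, which matches the comment's caveat about a system that is already maxed out.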

  • by Anonymous Coward on Tuesday November 13, 2007 @05:53PM (#21342207)
    And stop running the SETI client, install the power scaling software of choice for your architecture, and stop wasting so damn much electricity to keep your precious snowflake busy 24/7.
