Why Power Failures Can Always Lead To Data Loss

Posted by timothy on Wednesday July 23, @01:03PM
from the when-velcro-snags-shoelaces dept.
bigsmoke writes "So, all your servers run on RAID. You back up religiously. You're even sure that your backups are recoverable. But do you also need a UPS? According to Halfgaar (on Slashdot before to promote better Linux backup practices), yes, usually you do. He argues that despite technological advancements such as file system journaling, power failures can still cause data loss in most setups."

Related Stories

Backing up a Linux (or Other *nix) System (134 comments)
bigsmoke writes "My buddy Halfgaar finally got sick of all the helpful users on forums and mailing lists who keep suggesting backup methods and strategies to others which simply don't, won't and can't work. According to him, this indicates that most of the backups made by *nix users simply won't help you recover, while you'd think that disaster recovery is the whole point of doing backups. So, now he explains to the world once and for all what's involved in backing up *nix systems."
  • by Skyshadow (508) * on Wednesday July 23, @01:05PM (#24306965) Homepage

    Power losses can cause data loss? Gee, you mean that my system that relies on electricity for everything it does can be adversely affected by power outages even if I take precautions? That's some good admin work there, Lou -- if only there were some sort of law that covered the tendency of things that can go wrong to go wrong...

    Next week: Fires can make things warm, floods can make things wet.

  • Illiteracy (Score:5, Funny)

    by carou (88501) on Wednesday July 23, @01:06PM (#24307005) Homepage Journal

    From TFA:

    (DRAM needs to be refreshed constantly otherwise it will loose it's data)

    Fly, little data! Be free!

  • by internerdj (1319281) on Wednesday July 23, @01:07PM (#24307009)
    Definitely maybe?
  • Duh! (Score:5, Insightful)

    by mlwmohawk (801821) on Wednesday July 23, @01:08PM (#24307029)

    I remember a discussion on the PostgreSQL hacker's list about recoverability and transaction logs.

    You can't make a system that will not lose data, you can only make a system that knows the last save point of 100% integrity.

    There are too many variables and too much randomness on a cold hard power failure. You absolutely need a UPS that gives you time to shut down cleanly.

  • by pembo13 (770295) on Wednesday July 23, @01:12PM (#24307103) Homepage
    APC is the only UPS maker on the market that has at least spent some small effort so that their UPSs can be properly integrated with a Linux machine. I made the mistake of purchasing an Ultra UPS as it was cheaper than the APC.
  • by Joebert (946227) on Wednesday July 23, @01:13PM (#24307123) Homepage
    The funny part is someone had to have thought they were safe without a UPS for this to become news.
  • by sco_robinso (749990) on Wednesday July 23, @01:14PM (#24307133)
    In my company, everything is behind UPSs. Our SAN is even behind 2 separate UPSs. We thought everything was configured properly, but you'd be surprised what comes to roost when you test everything.

    We recently had a test night where all we did was test the UPS system and shutdown procedures, and there were a couple of gotchas. Interestingly, the APC PowerChute app we were using defaulted to shutting down the UPS completely after the [first] server went down - not good. This was buried fairly deeply in the configuration.

    Just as important as any protection measure, be it RAID, power protection, or whatever, is testing!
  • by alta (1263) on Wednesday July 23, @01:23PM (#24307279) Homepage Journal

    Ok, now everyone has something to give to their kid for the sysadmin-in-training class.

    For the rest of us... back to work, nothing here you didn't learn your first year.

    For the poster... Shame shame... Turn in your card.

    • by mlwmohawk (801821) on Wednesday July 23, @01:19PM (#24307219)

      Computer power supplies should be built with enough spare capacitance to run things long enough for the computer to save critical data

      Here's a question for you: Calculate the size of the capacitor needed that can hold enough power to run a 200W load for 5 minutes and maintain a voltage level within a specific usable range.

      Hint: it's BIG. Batteries are more space efficient, but the chemicals and outgassing make them inappropriate for placement INSIDE the computer box.
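
A rough way to answer the capacitor question above is the ideal energy balance E = P * t = 1/2 * C * (V_start^2 - V_min^2). The sketch below assumes an ideal capacitor feeding a constant-power load with no converter losses, and the 12 V to 10 V usable window is an assumption chosen for illustration (the comment only says "a specific usable range"). The function name is hypothetical.

```python
def required_capacitance(power_w, seconds, v_start, v_min):
    """Farads needed to deliver constant power while the capacitor
    voltage sags from v_start down to v_min (ideal, lossless model)."""
    if v_min <= 0 or v_start <= v_min:
        raise ValueError("need v_start > v_min > 0")
    energy_j = power_w * seconds  # energy the load consumes, in joules
    # Usable energy of a capacitor between two voltages:
    #   E = 0.5 * C * (v_start**2 - v_min**2)  ->  solve for C
    return 2.0 * energy_j / (v_start ** 2 - v_min ** 2)


if __name__ == "__main__":
    # 200 W for 5 minutes, with an ASSUMED 12 V -> 10 V usable window
    c = required_capacitance(200, 5 * 60, 12.0, 10.0)
    print(f"{c:.0f} F")
```

Under those assumptions the answer comes out in the thousands of farads, which is consistent with the hint; a real design would need even more to cover conversion losses and the capacitor's internal resistance.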

    • by Macman408 (1308925) on Wednesday July 23, @01:27PM (#24307321)

      This is old hat in embedded systems.

      Yes, but embedded systems usually have lower power requirements, or at the very least, a smaller range of power requirements. You can't add 3 PCIe cards, a few extra drives, and a few more GB of RAM to most embedded systems.

      I worked on the design of an embedded system a few years ago that had a holdup spec - I think it was supposed to survive for 50 ms with no power. So a 50 ms power interruption would result in continued operation, while an outage longer than that was allowed to reset the board. However, the power draw on the board was around 200 Watts; being able to supply that much power for that long in a fairly compact form factor was a huge hurdle. It also caused airflow problems, because the giant capacitors would prevent air from getting to other components on the board, like the CPU. In the next version of the spec, I believe the holdup requirement was eliminated - apparently we weren't the only ones having trouble meeting that requirement.
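
To get a feel for why a 50 ms holdup at around 200 W is hard to fit on a board, here is a minimal sketch that turns the question around: given a capacitor bank, how long can it carry the load? It uses the ideal relation t = 1/2 * C * (V_start^2 - V_min^2) / P. The 0.5 F bank and the 12 V to 10 V window are assumptions for illustration only; the comment gives no rail voltages or capacitor values, and the function name is hypothetical.

```python
def holdup_seconds(capacitance_f, power_w, v_start, v_min):
    """Seconds an ideal capacitor can feed a constant-power load
    while its voltage falls from v_start to v_min (no losses)."""
    if power_w <= 0 or v_min <= 0 or v_start <= v_min:
        raise ValueError("need power_w > 0 and v_start > v_min > 0")
    usable_energy_j = 0.5 * capacitance_f * (v_start ** 2 - v_min ** 2)
    return usable_energy_j / power_w


if __name__ == "__main__":
    # Energy the 50 ms spec demands at 200 W: 200 * 0.05 = 10 joules
    print("energy needed (J):", 200 * 0.05)
    # ASSUMED bank: 0.5 F, usable from 12 V down to 10 V
    t = holdup_seconds(0.5, 200, 12.0, 10.0)
    print(f"holdup: {t * 1000:.0f} ms")
```

With these assumed numbers, roughly half a farad is needed just to clear 50 ms, which fits the description of giant capacitors crowding the board.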

    • Our Tandem (Score:5, Interesting)

      by PIPBoy3000 (619296) on Wednesday July 23, @01:37PM (#24307541)
      This reminds me of my favorite power loss story. The facility was doing a generator test, where we were supposed to switch over from city power to the generator. Unfortunately it didn't happen smoothly and the UPS kicked in. Sadly it turned out that so many servers had been added since the original design that the UPS was really only good for fifteen minutes or so. The final problem was that our operator didn't notice the issue quickly enough, and so the next thing everyone in IT knew was that our main data center had just lost power.

      We spent most of the day getting our servers back up from various states of disrepair (confirming the article, power loss is superbad). It turns out that our main medical software ran on a Tandem. Though the drives and such lost power, the CPU had a backup of D-batteries and survived the power loss just fine. Needless to say, we stopped making fun of their seemingly primitive emergency backup power.
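
The shrinking UPS runtime in the story above follows from simple arithmetic: for a fixed battery, runtime falls as load is added. The sketch below is a first-order estimate only. It assumes runtime is usable battery energy divided by load, with a flat inverter efficiency; real batteries deliver less at high discharge rates, so treat the results as optimistic. The 1000 Wh battery, the 0.9 efficiency, and the load figures are all made-up illustration values, not numbers from the story.

```python
def ups_runtime_minutes(battery_wh, load_w, efficiency=0.9):
    """First-order UPS runtime estimate in minutes.

    battery_wh -- usable battery energy in watt-hours (assumed value)
    load_w     -- total load on the UPS in watts
    efficiency -- assumed flat inverter efficiency (0..1]
    """
    if load_w <= 0 or battery_wh <= 0 or not 0 < efficiency <= 1:
        raise ValueError("inputs must be positive and efficiency in (0, 1]")
    return battery_wh * efficiency / load_w * 60.0


if __name__ == "__main__":
    # Same hypothetical battery, growing load as servers get added
    for load in (500, 1000, 2000, 3600):
        minutes = ups_runtime_minutes(1000, load)
        print(f"{load:>5} W -> {minutes:6.1f} min")
```

Re-running an estimate like this whenever hardware is added, and then confirming it with a real test, is one way to avoid finding out about the shortfall during an outage.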
      • by SuperQ (431) * on Wednesday July 23, @01:45PM (#24307709) Homepage

        Yup, these are the 3 major types of battery UPSs I know of:

        Offline - Relay or simple failover. (APC Back-UPS)

        Line Interactive - Can correct line over/under voltage to a point (APC Smart-UPS)

        Online - Full AC -> DC -> AC conversion. (APC Symmetra, Liebert, anything that doesn't suck)

        Basically outside of home use you want an online type UPS.

        There are other systems like motor/generator flywheel types, but they need a very fast backup generator to sustain anything more than 30 seconds of outage. But they're great for smoothing out some types of line issues.