Server Failure Destroys Sidekick Users' Backup Data

Posted by timothy on Sunday October 11, 2009 @05:29AM from the oh-well-enough-said dept.

Expanding on the T-Mobile data loss mentioned in an update to an earlier story, reader stigmato writes "T-Mobile's popular Sidekick brand of devices and their users are facing a data loss crisis. According to the T-Mobile community forums, Microsoft/Danger has suffered a catastrophic server failure that has resulted in the loss of all personal data not stored on the phones. They are advising users not to turn off their phones, reset them or let the batteries die in them for fear of losing what data remains on the devices. Microsoft/Danger has stated that they cannot recover the data but are still trying. Already people are clamoring for a lawsuit. Should we continue to trust cloud computing content providers with our personal information? Perhaps they should have used ZFS or btrfs for their servers."

This discussion has been archived. No new comments can be posted.

A server failure? (Score:4, Informative)

by corsec67 ( 627446 ) writes: on Sunday October 11, 2009 @05:36AM (#29709765) Homepage Journal

A server failure caused all of the data to be lost?
No backups? Not even a spare server with a mirror of the data? No servers in different places? No off-site backup strategy?
As an aside, why would that data be stored in volatile, non-battery-backed RAM? All of my graphing calculators have a special battery to keep the RAM, and they aren't even supposed to store important stuff. Flash is cheap enough these days; why should simply removing the battery cause important data to be lost?

Re:"they should have used ZFS or btrfs" (Score:4, Informative)

by rastilin ( 752802 ) writes: on Sunday October 11, 2009 @05:50AM (#29709807)

This seems a rather silly point to make. I know this is Slashdot and we have to suggest Open Source alternatives but throwing out random file systems as a suggestion to fix poor management and HARDWARE issues is some place between ignorant and silly.
Not as silly as it might appear. One of ZFS's main functions is that it can compensate for some degree of hardware failure.

Re:Why not store the data on phone permanent memor (Score:4, Informative)

by Anonymous Coward writes: on Sunday October 11, 2009 @05:58AM (#29709843)

Because the entire Sidekick architecture is very client-serverish, not transparent as with ordinary phones (GPRS/EDGE/UMTS/etc. through a NAT to internet at large); the server is supposed to be responsible for all that data, and the phone is just caching it. Given that architecture, asking why the local copy is on volatile RAM is analogous to asking why your CPU doesn't have a battery backup for system RAM, or even L2 cache.
That's one of the big reasons I didn't go with a Sidekick, even though they have (or had, last I was shopping around) basically the cheapest internet plans available; they push all sorts of stuff that's handled by the phone in any other system off to the Danger servers. While that does expose you to other people losing your data, as seen here, I didn't even consider that. I just like having a direct internet pipe, so I can run whatever software I want locally.
That said, there are plain benefits to the Sidekick model, for some people. Basically, if you don't want to do funny stuff on your phone, and if you're no less incompetent than the MS/Danger sysadmins, it's better. After all, if you drop your sidekick in a toilet, run over it with a truck, and vaporise it with a plasgun, you can just get a new one and have all your data back -- which is good, since if you're 95% of people, you've _never_ backed up your phone's data. But it's not for me, and given your desire to have your phone work as a PDA even if you power-cycle it in a wilderness/cave/other net-less place, it's not for you either.

Re:"they should have used ZFS or btrfs" (Score:5, Informative)

by gravos ( 912628 ) writes: on Sunday October 11, 2009 @06:01AM (#29709857) Homepage

The current major cloud providers (Google and Amazon) both replicate your permanent data to multiple hard disks (Google: 3, not sure about Amazon) in multiple areas of the datacenter, and I know Google is looking at providing replication to different datacenters (which is more complex than replication in the same datacenter because of the time delay).

DIY phone backups (Score:4, Informative)

by golfnomad ( 1442971 ) writes: on Sunday October 11, 2009 @06:03AM (#29709869)

There are 3rd party apps out there that will let you back up your phone data yourself. I personally use a program called bitpim, www.bitpim.org (make sure you d/l the latest version). It works with many different phone models and I have used it several times to restore my phone data (had 2 phones with hardware issues). It restored my calendar, notes, phone book and ringtones (that last one can save you d/l $$$). It is easy enough to install and use; you do not have to be a total geek to make it functional (but having one available to help you set up backups would probably help). Been working in the IT industry too long to rely on someone else backing up my data for me, and I will not encourage Murphy to have a party in my honor!

Re:A server failure? (Score:5, Informative)

by Serious Callers Only ( 1022605 ) writes: on Sunday October 11, 2009 @06:44AM (#29710031)

There are some interesting background leaks on the takeover of Danger in this article [appleinsider.com], which seem to imply they cut a lot of staff and gutted the company, which is now running on a skeleton staff. So I guess it's not too surprising when this sort of mistake is made. Not the most reliable source, but they did definitely cut a lot of Danger staff after the acquisition.

Re:undelete (not de-corrupt) (Score:3, Informative)

by myxiplx ( 906307 ) writes: on Sunday October 11, 2009 @07:16AM (#29710155)

Yes, it's called a snapshot. Take a snapshot and you can either roll the entire system back to that point in time, or just browse its contents and extract the files you want.
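
To make the two operations above concrete (roll everything back, or browse and pull out one file), here is a toy sketch in Python. It is not how ZFS does it: a real copy-on-write filesystem shares unchanged blocks instead of copying everything, and `SnapshotStore` and all of its method names are made up for this illustration.

```python
class SnapshotStore:
    """Toy key-value 'filesystem' with named snapshots (hypothetical, illustration only)."""

    def __init__(self):
        self.files = {}       # live state: path -> contents
        self.snapshots = {}   # snapshot name -> frozen copy of the live state

    def write(self, path, data):
        self.files[path] = data

    def delete(self, path):
        self.files.pop(path, None)

    def snapshot(self, name):
        # A real filesystem would share unchanged blocks; here we simply copy the dict.
        self.snapshots[name] = dict(self.files)

    def browse(self, name, path):
        # Extract a single file from a snapshot without touching the live state.
        return self.snapshots[name][path]

    def rollback(self, name):
        # Roll the entire store back to the point in time of the snapshot.
        self.files = dict(self.snapshots[name])


store = SnapshotStore()
store.write("contacts.txt", "alice, bob")
store.snapshot("before-upgrade")
store.delete("contacts.txt")                                 # the accident
recovered = store.browse("before-upgrade", "contacts.txt")   # pull out one file
store.rollback("before-upgrade")                             # or restore everything
```

Note that a snapshot kept on the same storage as the live data does nothing for you if that storage itself is destroyed.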

It is an ancient story, endlessly repeated (Score:5, Informative)

by SmallFurryCreature ( 593017 ) writes: on Sunday October 11, 2009 @07:52AM (#29710275) Journal

It is development dome.
Two companies enter, MS comes out, slightly fatter.
If you do business with MS, you are riding a tiger with the brains to realize that lunch is only a roll on the ground away.
MS really should be renamed to BubbaSoft. Get into the shower with BubbaSoft and you know what is going to happen.

Re:"they should have used ZFS or btrfs" (Score:5, Informative)

by IamTheRealMike ( 537420 ) writes: on Sunday October 11, 2009 @09:02AM (#29710587)

I'm not sure what you mean by "cloud provider" as such but Google App Engine has always been replicated across datacenters [blogspot.com].

Re:"they should have used ZFS or btrfs" (Score:3, Informative)

by Tweezer ( 83980 ) writes: on Sunday October 11, 2009 @09:03AM (#29710595)

Even with a SAN you need to limit volume sizes to whatever size you can restore within the acceptable restoration window. There are also those times where you just want to run a chkdsk and if the volume is too big, it takes too long.
That being said, I can't believe they didn't have any backup. Even if they skipped the pre-upgrade backup, they should have had one from last night/week/month. Any of those options would be better than nothing. I have to assume they were doing backup to disk on the same SAN they were upgrading, which is pretty dumb. I still can't understand why they didn't have a backup at another site somewhere else in the world. We do that sort of thing all the time where I work.
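
The volume-sizing rule above is just arithmetic: the largest volume you can afford is the restore throughput multiplied by the restoration window. A minimal sketch in Python follows; the function name and the numbers (100 MiB/s sustained restore, 8 hour window) are assumptions picked for the example, not figures from anyone's real SAN.

```python
def max_volume_gib(restore_mib_per_s: float, window_hours: float) -> float:
    """Largest volume (in GiB) restorable within the window at the given throughput."""
    seconds = window_hours * 3600
    return restore_mib_per_s * seconds / 1024   # MiB -> GiB


# Assumed figures: 100 MiB/s sustained restore speed, 8 hour restoration window.
limit = max_volume_gib(restore_mib_per_s=100, window_hours=8)
print(f"Keep volumes under about {limit:.1f} GiB")
```

Measure the throughput with a real test restore; quoted drive or network speeds are usually optimistic.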

Re:"they should have used ZFS or btrfs" (Score:5, Informative)

by jimicus ( 737525 ) writes: on Sunday October 11, 2009 @09:08AM (#29710613)

I've always been amazed that tape is trusted as much as it is. It seems (anecdotally at least) to have a disproportionately high failure rate.
I'm not sure that's the problem so much - after all, LTO has a read head positioned directly after the write head and automatically verifies as it goes along. A tape error is dead easy to spot.
There are a number of places where things can fall apart, and tapes don't even need to come into the matter:
- Nobody checking the logs
- Failure to understand the processes necessary to get a good backup. (You can't just dump the files that comprise a database to disk - you must either quiesce the database or use the DBMS' inbuilt backup routine - or you will wind up with inconsistent files and hence an inconsistent database. You'd be amazed how many people don't understand this.)
- Failure to maintain backup processes. (When you moved the database to another disk because you were running out of space, you did update your backup process? Right?)
- Not doing any test restores.
- Not doing enough test restores, or not doing them carefully enough. (If you're unlucky, your database will come back up OK even though you didn't quiesce it before carrying out the backup. Why do I say unlucky? Well, if it had not come up OK, you'd know immediately that there was a problem with your process. Then once the database is back up, make sure you check the restored data to ensure that recent transactions which should be on the backup actually are.)
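
As a small-scale illustration of two items in the list above (use the DBMS' inbuilt backup routine, then do a test restore and check the data), here is a sketch using Python's standard `sqlite3` module. `Connection.backup()` and `PRAGMA integrity_check` are real; the `contacts` table, the in-memory databases and the `backup_and_verify` helper are assumptions made up for this example.

```python
import sqlite3


def backup_and_verify(source: sqlite3.Connection) -> sqlite3.Connection:
    """Copy `source` with SQLite's online backup API, then test-restore the copy."""
    restored = sqlite3.connect(":memory:")
    # Connection.backup() takes a consistent copy even while `source` is in use,
    # unlike copying the database files out from under a running DBMS.
    source.backup(restored)

    # Test restore, step 1: does the copy pass the engine's own consistency check?
    status = restored.execute("PRAGMA integrity_check").fetchone()[0]
    if status != "ok":
        raise RuntimeError(f"restored copy failed integrity check: {status}")

    # Test restore, step 2: is the data we expect actually there?
    # (`contacts` is a hypothetical table used only for this example.)
    expected = source.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
    actual = restored.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
    if actual != expected:
        raise RuntimeError(f"row count mismatch: {actual} != {expected}")
    return restored


live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
live.executemany("INSERT INTO contacts VALUES (?, ?)",
                 [("alice", "555-0100"), ("bob", "555-0101")])
live.commit()

copy = backup_and_verify(live)
restored_names = [row[0] for row in copy.execute("SELECT name FROM contacts ORDER BY name")]
```

A row count is a weak check. In a real process you would compare against recent transactions you know should be in the backup, as the last list item says.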

Re:Thin client: Android, too? (Score:5, Informative)

by RedK ( 112790 ) writes: on Sunday October 11, 2009 @09:33AM (#29710723)

No, it's not how Android works, or how the iPhone works either. You can have cloud enabled applications, but you can also have local storage based ones without any problems. There is nothing in the SDKs that force you to use the cloud for storage at all.

Re:Thin client: Android, too? (Score:3, Informative)

by hedwards ( 940851 ) writes: on Sunday October 11, 2009 @10:17AM (#29710919)

It's not as much of an issue. You might be using a product that the Data Liberation Front [dataliberation.org] hasn't gotten to yet, but Google does have people working on all of those applications to make it possible to make one's own backup. I'm not sure what specifically triggered that, but I keep a backup of any important information on my computer, which is backed up to my local backup mirror and remotely.

Re:"they should have used ZFS or btrfs" (Score:3, Informative)

by uncleFester ( 29998 ) writes: on Sunday October 11, 2009 @10:40AM (#29711005) Homepage Journal

"Who the F*ck in the right mind fiddle something on SAN without confirming a full backup of all applications/databases?"
people who drink the kool-aid whenever vendors of said products repeatedly swear up and down all their tasks/patching/operations are 'totally no-impact and no-visibility changes.' combine that with people unwilling to take downtime or spend $$$ to properly protect the contents ahead of time and you have just cooked a recipe for disaster.
-r (not speaking from personal experience.. of course.. :/ )

Re:Sidekick (Score:3, Informative)

by tangent3 ( 449222 ) writes: on Sunday October 11, 2009 @11:06AM (#29711137)

Ohh yes.. Need an ASCII table? It's just a Ctrl-Alt away

Re:Why not store the data on phone permanent memor (Score:2, Informative)

by Oshawapilot ( 1039614 ) writes: on Sunday October 11, 2009 @11:21AM (#29711221) Homepage

I'll admit to having one of the original (and second version) of the Sidekick (They were called the Hiptop everywhere else except the USA) and the idea of storing everything on the cloud seemed great at the time - through several device upgrades, warranty replacements, and other hardware changes everything just automagically restored to the new phone within 10-15 minutes of switching the SIM.
One should add that the devices themselves are designed to "play dead" when the battery gets low and shut down while still maintaining enough power to ensure the volatile RAM holding the device's local cache of data remains intact. It's only if the battery is fully exhausted to the point of not being able to accomplish this, or a critical error/OS crash (the dreaded "red X of death") is encountered, that the volatile RAM is actually in danger of being erased.
Therefore all the warnings about not letting the phones go "dead" or turning them off are a bit misleading since, excluding one of the two above situations, everything is actually safe, but it's not without warrant since I'm sure MS/Danger are going to try to "backwards restore" whatever is salvageable.
Furthermore, since the OS is locked down extremely tight there's no (to my understanding, admittedly a few years old now) method of locally backing up a Sidekick's data. Contacts stored on the device can be backed up to the SIM card one at a time (with only the basic name/phone data; all other extraneous data such as profile pics, etc. will not be included) but it was tedious to accomplish (one contact at a time) and the average Sidekick user (read as teen/clueless) probably has no idea how to do it anyways.

Re:When Paranoia Pays (Score:2, Informative)

by larry bagina ( 561269 ) writes: on Sunday October 11, 2009 @12:53PM (#29711757) Journal

it runs NetBSD and Java.

Means what it said (Score:3, Informative)

by SuperKendall ( 25149 ) writes: on Sunday October 11, 2009 @01:42PM (#29712027)

shit, is that TSR still hanging around? goodness!
Dude, what part of "Stay Resident" did you not understand. It's not like selling your computer rids you of it.
That's why I never ran them, nor consorted with Deamons.

Re:You assume Danger used a MSFT platform (Score:2, Informative)

by Anonymous Coward writes: on Sunday October 11, 2009 @02:44PM (#29712395)

You know nothing of which you speak. I assure you it was running on Microsoft software. Unfortunately, I should know.

Re:"they should have used ZFS or btrfs" (Score:3, Informative)

by AK Marc ( 707885 ) writes: on Sunday October 11, 2009 @04:44PM (#29713023)

Ever have a tape drive with misaligned heads? That one drive and only that one drive will be able to read those tapes, and sometimes even it can't read them after the tape is ejected, but will show OK on a verify done before the tape is ejected. You either have a verified backup that can't be used, or a pile of tapes that are completely useless if that drive ever fails.

I found one of these when doing a backup/restore to upgrade a server (back up the data from ServerA and restore the data on ServerB). It took a while to figure out why the tapes worked perfectly in ServerA and not at all in ServerB (internal tape drives, fixed by swapping the drive from ServerA into ServerB for the restore, then discarding ServerA and the drive from it after).

For a server-loss scenario (fire, theft), this means there is no backup, yet something that wouldn't be discovered without restoring on a separate system. No idea how common this is, but in dealing with not many situations where it could pop up, I've seen it all of once.
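
The point above is that a verify done by the same drive proves little; what counts is reading the backup back somewhere else and comparing it with the original. A rough Python sketch of that comparison follows. The file names are made up, and a plain file copy stands in for "write to tape, read back on a different drive".

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large backups do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True only if the restored copy is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)


with tempfile.TemporaryDirectory() as workdir:
    source = Path(workdir) / "data.db"          # hypothetical file names
    source.write_bytes(b"important records")

    # Stand-in for a restore performed on a second, independent system.
    restored = Path(workdir) / "restored.db"
    shutil.copyfile(source, restored)
    good = restore_matches(source, restored)

    # Simulate what a misreading drive might hand back.
    restored.write_bytes(b"important rec\x00rds")
    bad = restore_matches(source, restored)
```

In practice you would record the checksum at backup time and keep it away from the server, since after a fire or theft the original is no longer around to hash.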

You insensitive clod! (Score:1, Informative)

by Anonymous Coward writes: on Sunday October 11, 2009 @05:43PM (#29713465)

My 9 year old daughter has a sidekick.
Microsoft has made her cry.

Re:"they should have used ZFS or btrfs" (Score:1, Informative)

by Anonymous Coward writes: on Monday October 12, 2009 @04:16AM (#29716493)

Just as well they didn't promise to never lose any email. In fact, I don't even know what it means to "loose" email.

Re:"they should have used ZFS or btrfs" (Score:3, Informative)

by Cytotoxic ( 245301 ) writes: on Monday October 12, 2009 @11:18AM (#29719211)

We had a similar failure here. Had to replace a battery in a redundant SAN controller... it was under support with the vendor so they sent out a rep to do the fix - everything went just fine. Then poof - one whole shelf went dark. No problem, we designed the system to handle that - all arrays striped vertically with no two drives on any one shelf. Then the vendor took the backup card offline to repair the problem. Poof - another shelf down. Uh, oh! A little more work got the shelves back on line - but the drives had been totally corrupted by the glitchy controller. Luckily, not being idiots our engineers had full backups. Unluckily it took days to fully recover everything. Lesson learned - there is no such thing as a safe fix. We moved critical systems off of our "Fisher-Price SAN" over the next several months and it has not caused any additional catastrophes, but we learned a lot about redundancy - a single hardware failure can cut through a lot of layers of redundancy and bring you down hard when the failure mode is less than "off".

Re:"they should have used ZFS or btrfs" (Score:4, Informative)

by cbreaker ( 561297 ) writes: on Monday October 12, 2009 @11:43AM (#29719571) Journal

The technology is available to get good, solid backups for anything. They just didn't use it, test it, verify it, etc. And in the case of this, users cannot back up their own data. And what they lost isn't backups.

I used to have one of these things.

The phone is (like someone above pointed out) a local cache of what's on the server side. The live database/back end is what crashed. When you make a change on the phone, it immediately sends that change to the server. You can log in to the Sidekick web site and make changes there, which appear quickly on your phone. If you reboot your phone, it will retrieve anything it needs from the server side. Apparently, the phone doesn't even keep a permanent local copy on some sort of non-volatile storage (hence "Don't turn off your phone.")
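
Here is a toy model in Python of the behaviour described above, to show why "don't turn off your phone" follows from it. This is a guess at the general shape based on this description, not Danger's actual protocol; the `Server` and `Phone` classes and their methods are invented for the illustration.

```python
class Server:
    """Stands in for the back end that holds the authoritative copy."""

    def __init__(self):
        self.records = {}


class Phone:
    """Stands in for the handset, which only caches what the server holds."""

    def __init__(self, server):
        self.server = server
        self.cache = {}                       # volatile RAM: gone after a power cycle

    def save(self, key, value):
        self.cache[key] = value
        self.server.records[key] = value      # every change is sent to the server at once

    def reboot(self):
        # RAM is wiped, then the phone re-fetches whatever the server still has.
        self.cache = dict(self.server.records)


server = Server()
phone = Phone(server)
phone.save("mom", "555-0100")

server.records.clear()                # the server-side failure
still_on_phone = dict(phone.cache)    # the cached copy survives for now

phone.reboot()                        # battery dies or phone is reset
after_reboot = dict(phone.cache)      # nothing left to fetch
```

After the server loss the cache is the only copy, and a reboot replaces it with the server's now-empty state.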

It's like someone that uses Google apps and stores all their documents on their system. If that system should go down, you'd be screwed, except that you COULD back up your documents locally. With this case, you cannot.

I don't really like the term "cloud computing." All it means is server storage somewhere on the Internet. Under this term you could call any web site a "Cloud." It's ambiguous at best.
