Why a decent backup strategy is important

I wrote this a long time ago with the intention of posting it on a blog one day once the dust had settled a bit:

—–

On Monday 21 April 2008 at around 2pm, while talking on the phone, I started up a Parallels VM. I looked away for a few seconds only to find the machine had gone to sleep. I did the usual wiggle of the mouse but the machine didn’t wake up. At this point I was perplexed, but never mind, sometimes the machine does crash. So I restarted it and attempted to bring the VM back up again. This time I was greeted with “NTLDR not found”.

Surely this is not a big deal. I’d seen this a few times with Windows, so I did a quick Google search to see if there was a fix for it. Satisfied that another user had had the same experience, I thought nothing of it and decided to go shopping for a couple of new office chairs.

Upon my return I attempted to boot the virtual machine from the Windows XP media so I could use the Recovery Console to fix the broken boot loader. The boot failed. How could this be? It’s only a virtual machine, it should boot from CD. I decided to investigate a little further: maybe I could mount the image using the Parallels Explorer utility.

To my dismay I was informed that the disk was the wrong format. WHAT? The wrong format? Surely Parallels can read its own format? In a panic I checked the VHD file size and realised that 2 GB was a little small for something that used to be 22 GB. My worst fears had come true. I checked to see if I had a backup, but wait: I had attempted to back it up a week earlier but stopped halfway through because something pressing came up and I decided to deal with that instead. Being able to find a backup of everything else on the machine except the one file I desperately needed, the one containing the only copy of the source code of a $40,000 project, was a slight problem.

The first thing that came to mind was to email Parallels support to see if they could do anything for me. Next I called a data recovery company (they claim that if they can’t recover it, no-one can). After talking to them I realised that I was going to be in for about $900-$2500 for them to recover the file for me. I thought to myself, this is OK, the data is worth a lot more than the amount of money they want to get it back. They have a no fix, no fee policy. So the very next morning I dropped my laptop off with them so they could “clone” my disk and attempt to recover the VHD.

The consultant assured me that they would be able to recover my data as it sounded like a typical issue. There was nothing physically wrong with the disk when their lab tech had a look at it so that gave me hope. 

The first thing I did was to go out and buy a Mac mini to set up as a source code repository. I had never set up a repository server, since I was using Mercurial locally and had not had any problems with my virtual machines until then. Keep in mind that just the previous week I got sick of my Netgear ADSL router dropping out all the time and got a Cisco 857 ADSL router (very happy with it but that’s another story).

It took me all day to get Mercurial and SSH (on Mac OS, and using PuTTY on Windows) configured to work the way I wanted. Once it was all configured I pushed all the source code that was not on that particular VM to the Mac mini (I keep 2 virtual machines, one for personal development and one for commercial; funnily enough the personal development one was backed up 3 days before) and optimistically awaited the return of my data (they claim to be the best).

On Thursday 24 April 2008 at around 9:30 am I received another phone call from the data recovery company. I was crossing my fingers and eyes in hope of good news. The news was bad, the worst news I’d had in a long time. They were unable to recover my file! I nearly broke down in tears. This could mean the end of me; my young consulting company would surely go bankrupt because of this. How could I be so stupid? I always preach “back up your data” to everybody!

Having to go see a customer, I made my way to their office, but on the way there I realised that I might be able to reverse engineer the binaries, which I did have in my sent email, and that’s stored on the server! Great idea, so I called a friend, who told me to use Reflector instead of buying something expensive. So I talked to the customer (whose code I didn’t lose: I had made a backup and given it to someone else a while back, and had only made one very small change since then), still wanting to cry but keeping it together.

When I got home I started to download decompilation tools like my life depended on it. I started to decompile the first DLL and it worked, well kinda, the code was almost correct and looked almost exactly the way I wrote it!

This was good news. I decompiled all the DLLs I had and grabbed the web files (this is an ASP.NET 2.0 application). I changed the aspx files to include the CodeFile attribute, renamed the .cs files from the decompilation, changed the class declarations to partial and removed the redundant code that the compiler complained about. Almost there…
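The two mechanical edits described above (adding a CodeFile attribute to each page directive and marking the decompiled classes as partial) could be scripted rather than done by hand. The following is only a rough sketch in Python: the function names, the `Orders` page and the file name `Orders.aspx.cs` are made up for illustration, and it assumes each page has a single `<%@ Page ... %>` directive and one code-behind class.

```python
import re

# Matches the opening of an ASP.NET page directive, e.g. `<%@ Page`.
PAGE_DIRECTIVE = re.compile(r"<%@\s*Page\b", re.IGNORECASE)


def add_codefile(aspx_text: str, code_file: str) -> str:
    """Add a CodeFile attribute to the first Page directive, if none exists."""
    if re.search(r"\bCodeFile\s*=", aspx_text, re.IGNORECASE):
        return aspx_text  # already points at a code-behind file
    return PAGE_DIRECTIVE.sub(
        lambda m: f'{m.group(0)} CodeFile="{code_file}"', aspx_text, count=1
    )


def make_partial(cs_text: str, class_name: str) -> str:
    """Turn `class Name` into `partial class Name` unless it is already partial."""
    pattern = re.compile(rf"(?<!partial )\bclass\s+{re.escape(class_name)}\b")
    return pattern.sub(f"partial class {class_name}", cs_text, count=1)


if __name__ == "__main__":
    # Hypothetical page and class, purely for demonstration.
    page = '<%@ Page Language="C#" Inherits="Orders" %>'
    code = "public class Orders : System.Web.UI.Page { }"
    print(add_codefile(page, "Orders.aspx.cs"))
    print(make_partial(code, "Orders"))
```

Removing the redundant code that the compiler complains about still has to be done by hand, since what counts as redundant depends on what the decompiler produced.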

I was missing the web.config and web.sitemap files. Luckily for me I had messed up and accidentally included these in a build that I sent out to the customer a few days earlier, so I recovered them from that build (I produced custom ones for the production environment and excluded them from the package because their IT guy would botch the job if I didn’t). One more critical piece of the puzzle was missing: the database. As it happens these are easy to recover, you just grab a backup of an existing copy, and thankfully I had a copy on a development server on site with the customer. I grabbed my DB and eureka! The application worked flawlessly.

Now, not everything is fine and dandy. There was some code that I had commented out to disable certain features, and that is gone now. But it was a good thing that I had tried to disable as much as I could programmatically to keep the integrity of the code intact. I’d say I have about 99% of the code back. The other issue is that all local variable names are very generic, like str, str2, str3, etc. It’s a good thing that I’m too lazy to write comments and rely on the code explaining itself, because the meaning of those variables is still clear from the way I wrote the code. I can easily restore the names using the Visual Studio refactoring commands. Since I’m the only developer working on the code and there are some changes that need to be made in the future, I will fix these up as I need to. The customer is none the wiser and I have learnt a lesson I will never forget.

I went out and bought a 500 GB Time Capsule to ensure that the Mac mini source control server will be backed up without issue, and I’m also backing up all my other Macs to it. I still need to write a batch file to automatically commit and push all my source, make DB backups and copy document directories to my NAS. Then I can run this script either manually or as a scheduled task to keep myself backed up.
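As a rough illustration of what such a script might look like, here is a minimal sketch written in Python rather than as a batch file. The function names and the timestamped folder layout are assumptions made for the example, and the database backup step is left out because it depends on which database engine is in use. The only Mercurial commands used are `hg commit --addremove` and `hg push`.

```python
import shutil
import subprocess
from datetime import datetime
from pathlib import Path


def hg_commands(message: str) -> list:
    """Mercurial commands to record and publish all local changes."""
    return [
        ["hg", "commit", "--addremove", "-m", message],
        ["hg", "push"],
    ]


def commit_and_push(repo: Path, message: str) -> None:
    """Run the Mercurial commands inside one repository."""
    for cmd in hg_commands(message):
        result = subprocess.run(cmd, cwd=repo)
        # hg exits with 1 when there is nothing to commit or nothing to push.
        if result.returncode not in (0, 1):
            raise RuntimeError(f"{' '.join(cmd)} failed in {repo}")


def copy_tree(source: Path, backup_root: Path, stamp: str) -> Path:
    """Copy source into backup_root/<stamp>/<source name> and return the copy."""
    target = backup_root / stamp / source.name
    shutil.copytree(source, target)
    return target


def run_backup(repos, folders, backup_root: Path, now=None) -> list:
    """Commit and push every repository, then copy every folder to the backup root."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    for repo in repos:
        commit_and_push(Path(repo), f"Automatic backup commit {stamp}")
    return [copy_tree(Path(folder), backup_root, stamp) for folder in folders]
```

Calling `run_backup` with the list of repositories, the document directories and the path where the NAS is mounted would do one full pass. Writing each run to its own timestamped folder means an interrupted run never overwrites the previous good copy.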

The other thing I should do is configure a Boot Camp partition so that I have the safety of a real file system instead of a file system in a file. If I had had a Boot Camp partition I would have recovered 100% of the data easily myself.

My next purchase is a UPS to ensure that I do not get bitten by a power spike, and I will look into an offsite backup solution too. This has been an extremely painful and expensive lesson to learn. The moral of the story is to never have a false sense of security, and remember that disaster will always strike when you least expect it. Back your data up as often as you can and make sure you test your backup strategy. I’ve been caught with my pants down on this one and I got lucky in the end. I did however lose 4 days of work because of this, which I now need to make up somehow. The good thing is that I hopefully have a more secure work environment and I will not get caught out again. You never know how valuable your data is until you lose it!
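One simple way to test a backup is to compare every file in the source against its copy, which would also catch a half-finished backup or an image that has silently shrunk. Below is a minimal sketch in Python, assuming the backup is a plain directory copy. The function names are made up for the example, and SHA-256 is just one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list:
    """Return relative paths that are missing from the backup or differ from the source."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src_file.relative_to(source)
        copy = backup / rel
        if not copy.is_file() or file_digest(copy) != file_digest(src_file):
            problems.append(str(rel))
    return problems
```

An empty list means every source file has an identical copy. Anything else is a list of files to back up again before trusting the backup.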

—–

I’m a lot smarter than I was back then and my backups are solid. I can lose my entire laptop and recover everything that’s valuable on it from my backups. My consulting company does not just consist of me anymore, so the source control system I’ve got in place actually deals with a lot more than just me, and yes, it’s still a Mac mini and a Time Capsule. We have a lot of code on it and it still functions well enough to not be an issue. I’ve actually had to explain my backup strategy to a customer, which I was only too eager to do since the code is pretty safe given multiple developers using Mercurial plus the Mac mini/Time Capsule combination. I hope that you have learnt something from my painful experience and will audit your backup strategy. I cannot begin to drive home how painful this was!

Posted by: Eben Bruyns Wednesday, March 4th, 2009 General

2 Comments to Why a decent backup strategy is important

  • ine8181 says:

    Being paid for a project and not backing it up is bordering on insanity. Lucky that we’re working with such high level languages that decompiling works reasonably well.

    Were it a C++ project with similar complexity, you might have had to pay it back in drams of blood.
