tadhg.com

Always Have Good Backups, Reprise

20:19 Sun 04 Mar 2012

Last week I went through a data loss scenario similar to last year’s, but with a less happy outcome: no miraculous recovery of the data this time. I had left holes in my backup infrastructure, so I did lose some data, and this should serve as a warning to you all. Back up your stuff, do it well, do it often, and don’t leave any holes.

The amount of data you have in digital form is quite significant, and likely includes:

  • Books.
  • Code.
  • Contact information (how many current friends’ phone numbers can you remember?).
  • Correspondence.
  • Databases.
  • Financial data.
  • Health records.
  • Music.
  • Notes.
  • Personal writing.
  • Photos.
  • Professional writing.
  • Public writing.
  • Software.
  • Video.

It might not be on your computer, but it’s digital somewhere, and that means one of two things:

  1. It’s on a consumer-grade storage device, most likely a hard drive.
  2. It’s stored by some online service (your email, your blog, your contact information).

Neither of these is sufficient on its own.

Intertwined with your data are your services: email is both data and a service, as are Twitter, Facebook, your blog, your instant messaging accounts, and so on.

Threats

Here are the common threats to your data (and services):

  • Digital obsolescence: You have all of your data organized and backed up, but on media that readers are no longer made for, or in formats that it is now difficult to find software to read. If this hasn’t happened to you yet, it will in the future.
  • Hardware failure: Extremely common. Hard drives fail way more than is comfortable. Other storage media get damaged too, and none are durable to the extent that they can be considered trustworthy. Also covers disasters, such as fire or flood destroying your home and its contents.
  • Malice: You’re subject to some attack, targeted or random, that deletes your data and/or strips you of your access to services.
  • Privacy loss/exposure: A special case, but worth mentioning; here you don’t lose data, but are adversely affected by its becoming public beyond the extent you wish it to.
  • Rights revocation: You have your data entrusted to some highly-reliable service, and they decide that you can’t see it anymore.
  • Security failure: You have everything organized, backed up, and locked away with secure encryption—but then you lose the key.
  • User error/data corruption: Also common. You fall for the rm -rf / trick (never execute that unless you really understand what you’re doing!), or you move something instead of copying it and later forget and destroy the “copy”, or you save the brief notes for your next project over your almost-finished thesis; the possibilities are endless. Also in this category: due to software or hardware issues, your files become damaged in some way that doesn’t involve deletion or noticeable hardware failure, so you don’t notice for an extended period.
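
The last case above, damage that goes unnoticed for a long time, can be caught by recording a checksum for every file and comparing again later. What follows is a minimal sketch in Python, not a finished tool: the function names (file_digest, build_manifest, find_changed) are made up for illustration, and it only handles a single directory tree.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files from the old manifest that are now missing or different."""
    return sorted(name for name, digest in old.items() if new.get(name) != digest)
```

A file also shows up as changed when you edited it on purpose, so the result is a list to review rather than proof of corruption.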

Suggestions

A description of increasing levels of backup safety follows. If you’re at one of the first three levels, rectify that as soon as possible.

Two caveats:

  1. Individual circumstances differ, and it’s possible that some of these suggestions may not work for yours. As always, carefully evaluate instructions given out by Some Internet Person.
  2. I’m concerned primarily with avoiding data loss, and not with privacy or security. Sensitive data should be handled with additional care, and if most of what you do is secret and must not be seen by others, you should not follow these suggestions.

No Backups, Local

The typical case, where your data is on your computer and nowhere else. You are entirely at the mercy of the hard drive gods; history suggests they should not be trusted. Malice and user error could really ruin your day too.

No Backups, Remote

You’re past that “my computer” mindset, and instead you need only to connect to various online services to access your data. You don’t keep any of it locally, because your service providers have excellent backup strategies and very competent systems administrators. The hardware gods have little sway over you; instead, you are at the mercy of your service providers and malicious actors, who may deliberately or mistakenly revoke your rights to your own data at any moment.

RAID, Local

You’re onto the hardware gods, and so have automatic mirroring to help you avoid their wrath. However, you’re now at the mercy of your RAID controller(s), as well as user error.

Local Backups

I think this is just slightly better than RAID, if only because you don’t have a single point of failure.

None of you should be at a lower level of safety than this one. Local backups are the absolute minimum. Hard drives are cheap. If you’re on a Mac, use Time Machine, and either have it run all the time, or run it once per day. There are equivalents for other platforms; use whichever one is easiest for you.
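
If you have no backup tool at all, even a plain copy to an external drive beats nothing. A rough sketch in Python, assuming the external drive is mounted at a path you pass in. The name snapshot is a hypothetical helper, and it makes a full copy every time, unlike Time Machine, which only stores what changed.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(source: Path, backup_root: Path) -> Path:
    """Copy source into a new timestamped directory under backup_root."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    # copytree refuses to overwrite: it raises if target already exists,
    # which happens if this runs twice within the same second.
    shutil.copytree(source, target)
    return target
```

Because every run is a complete copy, old snapshots need to be cleared out by hand before the drive fills up.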

Remote Backups

Better than local backups because there’s no longer a single geographic point of failure. These services can cost money, or require you to be selective and organized about what you back up—although I don’t use it myself, Dropbox should be a perfectly reasonable online backup solution for data that isn’t photos, music, or video.

Local and Remote Backups

You should be here at minimum. Hard drives are cheap, and Dropbox gives you 2GB for free.

You should be backing up everything to a local external hard drive and select data to a remote service.

At this point, your local machine and hard drive could be stolen or consumed by flames, but you’d still have the most important stuff online. Conversely, your chosen online backup service could suddenly disappear and your hard drive could fail, but you’d still have your local backup.

Local and Remote Backups Plus Basic Version Control

Time Machine gives you something like version control already, and a 30-day version of this can be enabled for free Dropbox accounts (paid accounts can look back indefinitely). This is important because it protects you from yourself; file deletion isn’t necessarily permanent.

Local and Remote Backups Plus Strong Version Control

As above, but using a distributed version control system so that you have the entire history of your important data both locally and remotely.

I use git for this, and it’s great: any of my machines has the full history of my important data, and it’s easy to keep them synchronized. It is a programmer’s tool, but the basic concepts are very easy, and to use it effectively you’d need to understand maybe six commands. It’s file-based, so you could use it locally (i.e. without synchronizing to other machines) and still have it backed up by your existing backup tools. It’s also better for text formats, and wouldn’t be ideal for photo backups[1].

This is where I was before the recent data loss[2], and where I am again, several days later. I did not lose any data covered by version control or local backups.

Local Backups, Multiple Remote Backups, Strong Version Control, Services Backup

This is where I would like to be[3]. While multiple remote backups are additional peace of mind, the real value here is in “services backup”.

The setup for this is different for me because I’m comfortable running my own services, but the basic concept is the same: have a backup for each of the services you rely on, and ensure that you have a plan for switching over to that backup.

  • What would you do if your email address disappeared?
  • What would you do if, after using Facebook for your online social contact for years, Facebook banned you?
  • What would you do if your blog was deleted due to a spurious legal claim and your provider had no interest in your side?

The technical answers are different if you run your own services, as I did, but the basics are the same: have a backup email address that you can tell people to contact you at, and have communication channels that let you announce this (among other things, this means you should have an email address book that’s separate from your primary email service). In addition, have a list of the services that use this email address and a way of switching them to the new one; control of an email address is the Web’s de facto proof of identity system.

For each other service that’s important to you, have a backup plan. Make sure you download and back up any data of yours stored in the service. In addition, have multiple communication channels; for example, Twitter is a reasonable way to tell blog followers where they can find you.

What I Lost This Time

I run my own blog and ran my own email, as well as the master repository for my version control system. The machine I used for these suffered multiple hard drive failures over the course of a couple of days, leaving it unable to boot.

I did not have a backup plan for what I would do if this happened.

I have a backup service for my email (a Gmail account), but no backup for the email data itself[4], an almost inexcusable oversight.

My blog is composed of three elements: the code, including modifications I’ve made to WordPress; the content I created (i.e. the posts); and the comments of others, stored (along with a lot of other details) in the WordPress database. The code and the content were covered by my version control system—but the database was not, another almost inexcusable oversight on my part.
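
A WordPress database lives in MySQL, so it can be dumped to a plain SQL file that existing backups or version control then pick up. Here is a sketch of what a scheduled dump could look like, in Python. The database name, user name, and output directory are placeholders, and it assumes mysqldump is installed and that the password comes from a MySQL option file such as ~/.my.cnf rather than the command line.

```python
import subprocess
from datetime import date
from pathlib import Path


def dump_command(database: str, user: str, out_dir: Path, day: date) -> list[str]:
    """Build a mysqldump invocation that writes one dated SQL file."""
    out_file = out_dir / f"{database}-{day.isoformat()}.sql"
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot of InnoDB tables
        f"--user={user}",
        f"--result-file={out_file}",
        database,
    ]


def run_dump(command: list[str]) -> None:
    """Run the dump; raises CalledProcessError if mysqldump fails."""
    subprocess.run(command, check=True)
```

For example, run_dump(dump_command("wordpress", "backup", Path("/var/backups/db"), date.today())) would write one file per day, with all three arguments being made-up values to replace with your own.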

The master repository for the version control system was easiest to remedy: since git is a distributed system, it was easy (if a little time-consuming in terms of file transfer) to copy a local repository to a new server and make it the master[5].
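
An alternative to copying the repository itself is to push from a surviving clone into an empty repository on the new server. A sketch under assumptions: a bare repository already exists at the new URL (for example, one created with git init --bare), the clone has a remote named origin, and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def repoint_commands(new_url: str) -> list[list[str]]:
    """Git commands that point origin at new_url and push everything local."""
    return [
        ["git", "remote", "set-url", "origin", new_url],
        ["git", "push", "origin", "--all"],   # every local branch
        ["git", "push", "origin", "--tags"],  # every tag
    ]


def repoint(repo: Path, new_url: str) -> None:
    """Run the commands inside repo, stopping at the first failure."""
    for command in repoint_commands(new_url):
        subprocess.run(command, cwd=repo, check=True)
```

Only branches that exist locally in the clone are pushed, so it is worth checking the branch list before relying on the new server.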

Since I didn’t have a plan for this eventuality, I had to create one while everything was down. That plan was to sign up for a cheap virtual private server on a hosting service and then move things there.

I had some luck, in that someone else had made a backup of the contents of the now-dead machine about three weeks prior to the hard drive failure, so I had my email and my blog database to that point. Even so, the restoration process took considerably longer, and was markedly more stressful, than it would have been if I’d had a plan in place beforehand.

So far, the price of the holes in my backup strategy:

  • 3 weeks of email correspondence—not that much, since I haven’t been emailing a tremendous amount these days, but still more than I would like.
  • An unknown amount of lost email—whatever might have bounced during the time the hard drive problems were occurring but before I noticed and pointed my email elsewhere.
  • 3 weeks of blog comments—also not that much, since a lot of commentary seems to be on Facebook these days, and in any case there’s less commentary now that I’m only posting once per week.
  • About a week of blog downtime.
  • A lot of time, stress, and effort in setting up the new server and getting services running on it; I still haven’t set up email for myself on it, and the other websites I host are still down.
  • My WordPress blog’s automatic posting of new-post notifications to Twitter is now broken. I haven’t yet figured out why, so I have to post them manually.

If I had not been lucky regarding someone else having made a backup three weeks before, I would have lost an unknown but large amount of past email, and all comments on my blog going back to March 2011, an obviously much worse outcome. There’s not really much excuse for my having run such a risk, given how little work it would have taken to have backed both of those things up regularly myself.

On the positive side, I didn’t lose any personal or financial data, nor any code, nor any of my configuration files[6].

Next Steps

I have to restore the rest of the services I lost, and then determine how to get a backup machine so that I’m no longer left with a single point of failure.

You need to make sure you’re at least at the “Local and Remote Backups” stage, and to figure out what your backup plans for your remote services are[7].

We should probably all think about the data and services on our phones, and how to effectively deal with backing those up.

Finally, we should review the list of threats and consider how to better protect ourselves against them.

[1] But what is? This seems to remain an unsolved problem.

[2] So at least I learned enough from the previous failure to move to a distributed version control system, which definitely made things much easier this time.

[3] It’s not the ultimate level by any means. The next one would have “Services Failover” instead of “Services Backup”, meaning that without my having to do anything significant, my services would come up and run on backup machines, ideally in a manner seamless to users (including myself).

[4] I discovered the hard way something I should already have known—local IMAP data isn’t really a backup for your email.

[5] More accurately, make it the “origin”; I’m technically misusing “master” here, but using it this way makes that paragraph briefer and clearer.

[6] That last is no small thing, as I’ve put considerable time and effort into making and tweaking the tools I use every day. My Vim configuration involves a lot of custom code and would take a long time to recreate if lost.

[7] If you have all that covered already, excellent work. You should probably just do a quick review of your backup strategy and see where any remaining weaknesses are.
