November 16, 2006

System Administration in Self-Defense

(in which I resort to webloggery)

It is the pride and the shame of unix: every user must be a sysadmin.

And how good are your backups

Had some "unscheduled down-time" yesterday (I kicked a power switch on my aging desktop machine, which still doubles as my home webserver). Recently, there had been signs of problems with the main disk partition, but I was being lazy about finally moving operations to my new box(es).

All of a sudden I had to hustle to "maintain my web-presence"... but that shouldn't be too big a deal, because I, of course, Have Backups.
My backup strategy involves two grades of locations that I care about:
  1. places where code and text writing is done frequently
  2. places where I stash large files (e.g. digital photos)
The first can be backed up to a single DAT tape (which holds around 2 Gigs); the second is a bigger production to deal with, so I back it up less frequently (my digital photos alone take up a good 5 DAT tapes at this point).

It turns out there's a problem with this strategy -- and there's one with every backup strategy, it seems -- in that it relied on me to classify important locations as one or the other, and it turns out that I missed one. There's a location with some image files that was too big to want to include in the frequent backups, but small enough that it didn't occur to me to put it in the infrequent backup procedure.

I think part of the trouble is that the spot that I missed is a place that I stash some images that I have on display on the web, and that location has gradually been turning into a place where I store generated files, not originals (e.g. photos rescaled for web viewing, with the originals kept elsewhere). The trouble is that I wasn't always using it that way, and it looks like there were some originals of my early photos that got trashed along with the drive.

This is embarrassing, because these were indeed up on the web -- once I put something up I really don't like to take it down (in my opinion, the fact that you can "unpublish" on the web is a bug, not a feature).

summary: pitfalls of tailoring backup strategy to circumstances

I might summarize this problem as an example of what can go wrong if you tailor your backup strategy using your knowledge of how the information is being used: it's too easy to make a mistake; you can forget a location; you can classify a location in the wrong way, and so on. Directories are easy to create, and it's easy to change your mind about how you're using a directory -- making sure that your backup strategy tracks these changes correctly is always going to be a problem.
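
One way to catch a forgotten location is to have a script compare what's actually on the disk against the lists of places each backup covers. What follows is only a rough sketch in Python, not something I actually run: the function name `uncovered_dirs` and the paths in the usage comment are made up for illustration. It walks down from a root directory and reports any directory that isn't inside one of the covered locations.

```python
from pathlib import Path


def uncovered_dirs(root, covered):
    """Return directories under root that no backup set covers.

    `covered` is the combined list of locations from every backup class
    (frequent and infrequent alike). Hypothetical helper for illustration.
    """
    root = Path(root).resolve()
    covered = [Path(c).resolve() for c in covered]

    def is_covered(d):
        # d is safe if it is a covered location or lives inside one.
        return any(d == c or c in d.parents for c in covered)

    def contains_covered(d):
        # Some covered location sits beneath d, so look one level deeper
        # instead of reporting d as a whole.
        return any(d in c.parents for c in covered)

    missing = []

    def walk(d):
        try:
            children = sorted(
                p for p in d.iterdir() if p.is_dir() and not p.is_symlink()
            )
        except OSError:
            return  # unreadable directory: skip it in this sketch
        for child in children:
            if is_covered(child):
                continue
            if contains_covered(child):
                walk(child)
            else:
                missing.append(child)

    walk(root)
    return missing


# Usage (paths are assumptions, not my real layout):
#   for d in uncovered_dirs("/home/me", ["/home/me/code", "/home/me/photos"]):
#       print("not in any backup:", d)
```

Note that this only checks files that live in directories; loose files sitting directly in a partially covered directory would still slip through, so it narrows the problem rather than solving it.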

The alternate approach: just do it all.

The obvious alternate approach is to just use a standard way of backing up everything on the drive -- one approach is to do something like a full backup of everything (possibly using "dump"), and then afterwards do incremental backups of all files that have been modified since your last backup.

Doing an incremental backup is relatively painless, so you're more likely to keep up with them (any strategy that assumes superhuman dedication and efficiency on the part of the user is guaranteed to fail -- you've got other things to do besides make backups, eh?).
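
To make the "modified since your last backup" idea concrete, here is a minimal Python sketch that picks out the files an incremental pass would need to copy. It is an illustration under assumptions, not how "dump" works internally: the function name `modified_since` is hypothetical, and it judges files purely by modification time.

```python
import os
import stat


def modified_since(root, last_backup_time):
    """Yield regular files under root modified after last_backup_time.

    last_backup_time is in seconds since the epoch (e.g. from time.time()
    recorded when the previous backup finished).
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_mtime > last_backup_time:
                yield path


# Usage (the path and the timestamp file are assumptions):
#   last = float(open("/var/tmp/last_backup_stamp").read())
#   for path in modified_since("/home/me", last):
#       print(path)
```

A list built this way says nothing about files that were deleted since the last backup, which is one reason a stack of incrementals doesn't by itself tell you the exact state of the system.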

There's one big problem with this approach -- the individual pieces of backup media (disks or tape) are meaningless by themselves. In the case of a series of "dump" tapes, it's particularly bad -- as I understand it you need to apply all of those tapes in sequence, and if just one of them has gone bad, your whole backup is shot.

You're a little better off with the "incremental" tapes: each one of those is comprehensible on its own... but you don't know what's on it without looking, and it's rarely going to be a complete set of anything -- to restore the entire system's state you have to apply the "dump", and then apply every incremental since then... and that whole chain has to work, or you won't know what you've got.

This is the kind of problem that steers me towards the custom-tailored approach: I can know that one stack of media contains all of my digital photos as of a certain date. If I need to restore one of them, I know where to go looking for it; I don't have to scan a stack of a dozen DAT tapes.

Still other approaches, e.g. disk mirroring

I realize that these days many people are just taking advantage of the general cheapness of harddrives to make "backups" to other harddrives, e.g. via disk mirroring, or just manually copying everything to some cheap IDE drive and then yanking the IDE drive and filing it away.

I don't mean to turn my nose up at these approaches entirely -- they're fairly easy to do, and "easy" isn't to be sneered at when the number one problem you have to overcome is human laziness -- but they do have their problems.

For example, if you use disk mirroring, there's a whole class of problems that might easily trash both of your disks, including user errors: an accidental delete will just get mirrored to the other drive.

Treating IDE drives as removable media is certainly a cute trick, but I'm a little skeptical about that one as well... for one thing, you're probably not going to have a lot of those IDE drives (they may be cheap, but they're not as cheap as DAT cartridges or DVD-Rs), so you're going to make do with a relatively small number of duplicate copies. For another, disk capacities that seem infinitely huge one year can easily seem a little cramped the next, and I suspect that many in the IDE-drive backup camp are often tempted to leave their "backup drives" inside the box to get some extra space to play with.

Why DAT tapes

As you may have noticed, I have a bias toward using DAT for backup. This may just be me being stuck in my ways (I've had an external SCSI DAT drive kicking around for ages, since back in the days when its 2 gigs per tape was amazingly impressive).

But I suspect that DAT still has some advantages over DVD-Rs. One is that the media has been around longer, so there's less likely to be some unexpected gotcha with it ten years down the road (with burned DVDs, people like to debate dye longevity issues and so on). Another point is that DATs have essentially a zero coaster rate: even without verifying, you can be pretty sure your information made it to the tape. A third point might be that DATs are reusable: many people just keep two sets of tapes around, and have schemes to cycle through them when making backups.

On the other hand, DVDs are up to around 4 gigs, twice what I can get on a 90 meter DAT tape; the media is certainly cheaper (blank 90 meter DATs cost $3 or so, versus something like 25 cents for a DVD blank); and DVD burners are getting relatively ubiquitous (for a long time they were selling computers to consumers with essentially no means of making a backup: DVD-Rs are a huge improvement). I may be right that DATs are a little better than DVD-Rs, but they're not so much better that you're likely to run out and buy the special equipment (and tapes) that you'd need.
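
For what it's worth, the arithmetic behind that comparison is easy to run. The sketch below uses only the rough figures quoted above (2 gigs and about $3 per DAT, around 4 gigs and about 25 cents per DVD blank); the 10 gig collection size is my own assumption, taken as roughly what 5 full DAT tapes of photos would come to.

```python
import math

# Rough figures from the text above; treat them as ballpark numbers.
MEDIA = {
    "DAT (90 m)": {"capacity_gb": 2.0, "price_usd": 3.00},
    "DVD-R": {"capacity_gb": 4.0, "price_usd": 0.25},
}


def cost_per_gb(medium):
    """Price of one blank divided by its capacity."""
    m = MEDIA[medium]
    return m["price_usd"] / m["capacity_gb"]


def media_needed(medium, data_gb):
    """Number of blanks needed to hold data_gb, rounding up."""
    return math.ceil(data_gb / MEDIA[medium]["capacity_gb"])


if __name__ == "__main__":
    photos_gb = 10  # assumption: about 5 full DAT tapes' worth
    for name, m in MEDIA.items():
        n = media_needed(name, photos_gb)
        print(f"{name}: ${cost_per_gb(name):.4f}/gig, "
              f"{n} blanks, ${n * m['price_usd']:.2f} total")
```

By these numbers a DAT works out to about $1.50 per gig against roughly 6 cents for a DVD blank, and a 10 gig photo collection takes 5 tapes ($15) or 3 discs (75 cents).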

Joseph Brenner, 16 Nov 2006