rc3.org

Strong opinions, weakly held

Being smart about database backups

The other day I mentioned a problem I created for myself using CocoaMySQL. There were a number of mishaps involved, but the end result was data loss. Basically what happened is that I unintentionally told CocoaMySQL to run an alter statement on a table in a production database which zeroed out all of the data in a particular column.

I was, in fact, just trying to look at the structure for a table and clicked in the wrong place. Then, poof, the data was gone. Of course, aside from the issue of how to get back as much data as possible, the question is how to prevent such data loss in the future. One problem is that the development and production databases live on the same server, and I generally accessed them with the same account. (Not root, but still an account with all privileges for both databases.) When I connected with CocoaMySQL, the production database was selected by default. This meant that I was putting myself in a position to mess up the production data even though there was no reason to assume this risk. I immediately created a new account with access only to staging, which makes mistakes that could destroy production data much less likely.

The second issue, though, is backups. There are three basic means by which you can back up a MySQL server that uses InnoDB tables. The options are offline data dump, hot backups, and replication. What I learned from this episode it’s best to apply several of these approaches at once.

The database in question was once backed up using the hot backup tool, which creates backup files that can be used to completely restore the database to the point before it was backed up. Then replication was set up, and the hot backups were turned off. In this case, the fact that replication was the only backup option we employed turned out to be costly. The alter command was replicated just like any other database operation and within seconds the data was deleted from both databases.

Even if I had a full backup from the hot backup tool, I would have had to fully restore the database to some server to get back the data from the one column I needed, which would have been painful. Both the replication and the hot backup tool are fine disaster recovery tools, but aren’t so useful in a case like this, where data is inadvertently destroyed. Given my recent experience, protecting against inadvertent data destruction is probably more important that insuring against natural disaster or hardware failure (although they’re important too).

I think that in the future, for MySQL, the solution I’m going to recommend is a replicated database that is taken offline regularly to be backed up using mysqldump, which yields ASCII files that are more accessible than the files the hot backup tool creates. The way replication works, the slave database will catch back up with the master database when it comes back online.

I would be curious to know what readers recommend for database backup practices.

4 Comments

  1. I use replication to protect against a server failing and mysqldump backups in case I need to inspect old data or back out a change made more than a short time in the past. It’s a lot like RAID and tape backups – some people believe that they don’t need to worry about a server with a RAID array, a misunderstanding which lasts until the first time they need to undelete a file.

    I also have to share your caution about tools like CocoaMySQL outside of development – even if they do properly wrap things in transactions I would argue that it’s just too risky to use them against a production database. What I prefer is to design the system so it’s inconvenient to make interactive changes on the live database and force yourself to write a script and follow some sort of testing and deployment process, particularly since you really shouldn’t need to change your production database that frequently.

  2. Sounds like you need a QA engineer/config mgmt person (no bias in that assessment)! 🙂

  3. on our ISPs webserver, i’ve got a daily cron job that does a mysqldump and then emails the backup to a gmail account. it’s saved my ass a couple of times already.

Leave a Reply

Your email address will not be published.

*

© 2024 rc3.org

Theme by Anders NorenUp ↑