Moving tadhg.com

21:38 Sun 11 Nov 2012

Since my primary server died in February this year, I’ve been running tadhg.com on a cheap virtual machine. That’s worked fine, but the original server came back to life quite some time ago, and today I finally finished moving tadhg.com back to it. Hopefully you’re not seeing anything unexpected[1]. This post is about what that move involved and what I’ve tried to improve along the way.

My blog still runs on WordPress, so moving it is not as simple as I’d like, primarily because it involves database dumps and imports. Given that I lost some data stored in the database when the server died, I didn’t want to make the move back without also making sure my backup system was better.

I use git for all of my version control now, and so wanted to move my blog to git repositories as well. Doing that correctly required splitting the various components of the blog into separate repositories, described below.

The WordPress database is where the transformed pseudoHTML lives, as well as all of the information about comments and approved commenters. Not having this backed up in the past meant losing some comments. To store it in a repository, I export it from MySQL and check it in[2].
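A minimal sketch of that export-and-commit step, using the mysqldump options from footnote [2]. The function name, the run helper, the DRY_RUN switch, and the repository layout are all hypothetical; the sketch also assumes MySQL credentials come from an option file such as ~/.my.cnf, and it swaps the shell redirect for mysqldump's --result-file option so that the command can be printed in dry-run mode.

```shell
#!/bin/sh
# Sketch only: backup_database, run, and DRY_RUN are hypothetical names.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Dump one database into a git working copy and commit the dump.
# Assumes credentials are supplied by a MySQL option file (e.g. ~/.my.cnf).
backup_database() {
    db_name=$1    # database to export
    repo_dir=$2   # existing git working copy that stores the dump
    dump_file="$repo_dir/$db_name.sql"

    # Same options as footnote [2]; --result-file replaces the "> file" redirect.
    run mysqldump --databases "$db_name" \
        --complete-insert --hex-blob --skip-add-drop-table \
        --single-transaction --order-by-primary --skip-dump-date --force \
        --result-file="$dump_file" || return 1

    run git -C "$repo_dir" add "$db_name.sql" || return 1
    # git commit exits non-zero when the dump is unchanged since the last commit.
    run git -C "$repo_dir" commit -m "Database export"
}
```

Calling `DRY_RUN=1 backup_database <database> <repo-dir>` prints the three commands without running them. The --skip-dump-date and --order-by-primary options matter for version control: they drop the timestamp and keep row order stable, so successive dumps differ only where the data actually changed.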

There are various files that I’ve stored on my site over the years that aren’t necessarily related to WordPress or my blog, but which I want to keep up there (partly because broken links are web pollution), and so I have a separate repository for those.

The photos I publish on my blog were already in version control, but as I worked through this I realized that they needed to be in their own repository, separate from the rest of my photos and graphical assets. I created a repository to store, essentially, all of the local assets I’ve ever linked to in a blog post.

Some of the WordPress plugins I use I wrote myself, and some are the work of others. In both cases, having them be part of the larger project meant mixing their version history into the rest of the items here, which is far from ideal. Ultimately, each of my plugins needs to have its own repository, and each of the other plugins needs to be handled individually so that they can be updated one at a time.

For the moment, I’ve put all the plugins in one repository. The next step will be to break each of mine out into its own repository. After that, I’m not sure how I’ll proceed with the others; I might simply keep one repository for all of them, allowing easy reversion if an update breaks something on my site. This is an acceptable compromise because I’m not likely to make any changes to those plugins myself and thus only need to track the upstream versions.

There are also bits and pieces that I manage myself, such as the .htaccess files and 404 pages that catch old URLs for my site and redirect them to the correct places.

Text Source
The reStructuredText files I write my posts in, which get transformed into WordPress-style pseudoHTML at publication time. These have been in a git repository for quite some time, so were not a concern.

The layout and design of the blog is determined by a custom WordPress theme, which needs to be its own repository, including its images, CSS, and PHP/HTML. The source files for it, primarily mockups made in The GIMP or Inkscape, are in another repository.

WordPress is another project I need to track, but not one I need to contribute to or make significant local changes to. I don’t need a repository for it at all, and can simply export it from Subversion whenever I need to update. Further, if such an update goes wrong, it’s not difficult to just export the previous version in order to revert.
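A minimal sketch of that export step. It assumes WordPress release tags are published under https://core.svn.wordpress.org/tags, which should be verified before relying on it; the export_wordpress function, the run helper, and the DRY_RUN switch are hypothetical names.

```shell
#!/bin/sh
# Sketch only: export_wordpress, run, and DRY_RUN are hypothetical names.

# Assumed location of WordPress release tags; verify before use.
WP_SVN_BASE="https://core.svn.wordpress.org/tags"

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Export an unversioned copy of one WordPress release into a new directory.
export_wordpress() {
    version=$1   # release tag, e.g. 3.4.2
    target=$2    # directory to create; must not already exist
    run svn export "$WP_SVN_BASE/$version" "$target"
}
```

For example, `export_wordpress 3.4.2 wp-3.4.2` would fetch that release. Because svn export refuses to write into an existing directory unless --force is given, reverting is probably cleanest by exporting the previous tag into a fresh directory and swapping it into place, which also avoids leaving behind files that exist only in the newer release.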

Apart from the plugins, that’s a reasonably logical separation of components. Splitting them out this way (rather than storing everything in a single repository) has the minor disadvantage of making deployment slightly trickier, as the above pieces have to be stitched back together—although because of the database component, some stitching would be required in any case. I created a handful of scripts to do deployment, and naturally store these scripts in their own repository as well.
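To make the stitching concrete, here is a minimal sketch of what such a deployment script could look like. Every specific in it is an assumption made for illustration: the repository names, the directory layout, the dump filename, the theme directory name, the WordPress Subversion URL, and the run helper with its DRY_RUN switch. It also assumes a fresh server with no existing database or site directory.

```shell
#!/bin/sh
# Sketch only: repository names, paths, and helper names are hypothetical.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy_blog() {
    git_base=$1     # where the repositories live, e.g. user@host:git
    site_root=$2    # directory the site is deployed into
    wp_version=$3   # WordPress release tag to export
    repos_dir="$site_root/repos"
    web_dir="$site_root/htdocs"

    run mkdir -p "$repos_dir" "$web_dir" || return 1

    # Clone each component repository.
    for repo in database files photos plugins sitefiles theme; do
        run git clone "$git_base/$repo.git" "$repos_dir/$repo" || return 1
    done

    # Import the dump. A dump made with mysqldump --databases carries its own
    # CREATE DATABASE and USE statements, so no database name is passed.
    run mysql --execute="source $repos_dir/database/blog.sql" || return 1

    # Export WordPress (the tags URL is an assumption).
    run svn export "https://core.svn.wordpress.org/tags/$wp_version" \
        "$web_dir/wp" || return 1

    # Point WordPress at the theme repository.
    run ln -s "$repos_dir/theme" \
        "$web_dir/wp/wp-content/themes/site-theme" || return 1

    # Overlay site-specific files (wp-config.php, .htaccess, ...) last.
    run cp -R "$repos_dir/sitefiles/." "$web_dir/"
}
```

Running it with DRY_RUN=1 prints each command instead of executing it, which is a cheap way to check the layout before touching a real server. The remaining repositories (plugins, photos, other files) would be wired in along similar lines.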

There are two major gains from doing things this way: first, having a script for deployment means that restoring from backups after another data loss scenario simply requires running the script; second, I can now run my site locally (or in just about any Unix-like environment). In addition, separating the components should make development of any kind easier, from fixing some of my plugins to changing my theme to, ultimately, getting away from WordPress entirely.

The steps for moving the blog were as follows:

  1. Figure out the repository structure, as outlined above.
  2. Write the deployment script, which:
    • Clones the various git repositories.
    • Creates the directory structure for the blog, most of which points at the various repositories.
    • Imports the WordPress database from the file in the database repository.
    • Exports a WordPress version to the directory structure.
    • Overlays the various files specific to my site onto that version of WordPress (wp-config.php, .htaccess files, etc.).
  3. Dump the database from the old site and check it in.
  4. Create a subdomain for the old site so that after DNS propagation I can still reach it.
  5. Create a subdomain for the new site so that I can reach it, and redirect to it, before DNS propagation.
  6. Run the deployment script on the new site.
  7. Verify that it worked by visiting the new site’s subdomain.
  8. Change the WordPress setting for the blog’s hostname so that the WordPress admin panel doesn’t redirect to tadhg.com[3]. This step led to the database upgrade page—WordPress changed their schema in some recent version and so my site’s database needed to be upgraded to work with the new code.
  9. Create a URL rewrite rule for the old site that redirects all connections to the new site’s subdomain.
  10. Remove the URL rewrite rule on the new site that would normally redirect all requests to use the tadhg.com hostname. (If I didn’t do this, prior to DNS propagation the previous step would result in a loop: tadhg.com would resolve to the old site, which would redirect to the new site’s subdomain, which would change the host to tadhg.com, sending the user request back to the old site.)
  11. Change the DNS for tadhg.com to point at the new site.
  12. Wait a few days for DNS propagation to finish, then restore the rule removed in step 10.
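The redirect rules in steps 9, 10, and 12 could be generated along the following lines. This assumes the site is served by Apache with mod_rewrite and that the rules live in .htaccess files; the post mentions .htaccess files, but the exact rules here are guesses. The function names are hypothetical, and using a 302 status for the temporary redirect is a choice made for this sketch.

```shell
#!/bin/sh
# Sketch only: function names and the exact rules are hypothetical.

# Step 9: rules for the OLD site, sending every request to the new host.
old_site_redirect_rules() {
    new_host=$1
    printf '%s\n' \
        'RewriteEngine On' \
        "RewriteRule ^(.*)\$ http://$new_host/\$1 [R=302,L]"
}

# Steps 10 and 12: rules for the NEW site that force the canonical hostname.
# Leaving these out until DNS has propagated avoids the redirect loop.
canonical_host_rules() {
    canonical_host=$1
    # Escape dots so the hostname is matched literally in the regex.
    escaped_host=$(printf '%s' "$canonical_host" | sed 's/\./\\./g')
    printf '%s\n' \
        'RewriteEngine On' \
        "RewriteCond %{HTTP_HOST} !^${escaped_host}\$ [NC]" \
        "RewriteRule ^(.*)\$ http://$canonical_host/\$1 [R=301,L]"
}
```

For example, `old_site_redirect_rules uw.tadhg.com` prints the block for the old server, and `canonical_host_rules tadhg.com` prints the block to restore on the new server once DNS has settled.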

That all seems to have worked. It might be my imagination, but this version of WordPress seems to be slower, something I’ll hopefully be able to fix soon.

[1] If your DNS hasn’t acquired the update, you might see my blog’s hostname as uw.tadhg.com instead of tadhg.com; this will go away in a couple of days once I’m satisfied that DNS propagation is complete.

[2] The command I use for export is:

mysqldump --databases <database> --complete-insert --hex-blob --skip-add-drop-table --single-transaction --order-by-primary --skip-dump-date --force > <outputfile>

[3] Done via mysql:

UPDATE wp_options SET option_value = 'http://<newurl>/wp' WHERE option_name = 'siteurl';
