Tuesday, November 18, 2008

Old WordPress posts are (mostly) redirecting

In moving 3,000 posts over from WordPress to Blogger, errors when users visit old links are inevitable. My implementation of WordPress published posts to URLs that looked like this:

http://www.rklau.com/tins/archives/2008/11/04/yes-we-did.php

I've converted to Blogger, and Blogger's permalinks are a tad different:

http://tins.rklau.com/2008/11/yes-we-did.html

Specific differences: no /archives sub-directory, no /day in the path (just /year and /month), and the file extension is .html instead of .php. Ideally, I wanted a way to ensure that people visiting the first URL end up at the second.

For the most part, that's now happening. With the WordPress Redirection plugin (I'll eventually move this out of WP itself and handle it in .htaccess), I'm matching old URLs against the following regex:

/tins/archives/(\d*)/(\d*)/(\d*)/(.*)\.php

And I'm redirecting those URLs to this string:

http://tins.rklau.com/$1/$2/$4.html
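
For anyone who wants to try the rule outside the plugin, here is a minimal Python sketch of the same rewrite. It is not what the Redirection plugin runs internally; the function name `redirect_target` is made up for illustration, and the pattern is slightly stricter than the one above (anchored, `\d+` instead of `\d*`, escaped dot), which is an assumption on top of the original rule.

```python
import re

# Stricter variant of the rule above (assumption): anchored at both ends,
# at least one digit per date part, and a literal dot before "php".
OLD_PATH = re.compile(r"^/tins/archives/(\d+)/(\d+)/(\d+)/(.*)\.php$")


def redirect_target(path):
    """Return the Blogger URL for an old WordPress path, or None if no match."""
    match = OLD_PATH.match(path)
    if match is None:
        return None
    # The day is captured but dropped, like $3 in the replacement string.
    year, month, _day, slug = match.groups()
    return f"http://tins.rklau.com/{year}/{month}/{slug}.html"


print(redirect_target("/tins/archives/2008/11/04/yes-we-did.php"))
# → http://tins.rklau.com/2008/11/yes-we-did.html
```

Paths that don't fit the old archive layout return None, so they can fall through to whatever handles everything else.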

A few edge cases where this breaks: Blogger caps its permalinks at 5 words. So posts-with-lots-of-words-in-the-title.php become posts-with-lots-of-words.html, and my redirection won't work. Also, Blogger doesn't include the word "the" in permalinks, so the-day-is-here.php won't properly resolve to day-is-here.html.
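
Both edge cases could in principle be handled by normalizing the slug before redirecting. The Python sketch below assumes Blogger drops "the" first and then keeps the first five words. That order, the helper name `blogger_slug`, and its parameters are assumptions; only the two rules themselves come from what I've observed.

```python
def blogger_slug(wp_slug, max_words=5, dropped=("the",)):
    """Approximate a Blogger permalink slug from a WordPress slug.

    Assumption: dropped words are removed before the word cap is applied.
    """
    words = [w for w in wp_slug.split("-") if w and w not in dropped]
    return "-".join(words[:max_words])


print(blogger_slug("posts-with-lots-of-words-in-the-title"))
# → posts-with-lots-of-words
print(blogger_slug("the-day-is-here"))
# → day-is-here
```

A plain regex substitution can't do this on its own, so a rule like this would have to live in a script rather than in the redirect pattern.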

Nevertheless, this is about a 98% effective solution, which I'm quite happy with. I'd love to have a custom 404 page so confused visitors could figure out what was going on, but on balance, I've accomplished what I set out to do.

And so concludes this period of meta-blogging, in which I blog about the blogging engine that lets me blog. Even I'm a tad tired of it, so look for obsessive political blogging and the random nerd post to reappear any day now.

2 comments:

  1. One thing I see absent from these posts about moving over to Blogger is an answer to the question "why"? WordPress is awesome for a number of reasons, not the least of which are the plugins, categories, and [insert any number of tired observations here]. So, um, what's the benefit?

  2. If you want to catch the long ones too:

    ^\/tins\/archives\/(\d+)\/(\d+)\/\d+\/(([^-]+-?){1,4}[^-]*).*?\.php

    http://tins.rklau.com/$1/$2/$3.html

    That should turn
    /tins/archives/2008/12/13/this-is-going-to-rock-the-world.php

    into:
    http://tins.rklau.com/2008/12/this-is-going-to-rock.html
