Friday, February 05, 2010

Relocation accomplished

G4ILO's Blog is now at its new home http://blog.g4ilo.com and I'm hoping that nothing got left behind in the move!

The actual change from FTP hosting to Blogger hosting was fairly easy. It was just a matter of selecting the "custom domain" option from the publishing options, then clicking the Advanced button and putting in the name I wanted: blog.g4ilo.com.

I also had to specify something called a "Missing Files Redirect" which tells the Blogger server where to look for any files used in the blog that it doesn't have. This is important because although the text content is now served up by Blogger (and presumably any images included in new postings will be hosted there) all the images in my existing postings remain where I uploaded them, on the old server. And I certainly did not want to go through over 300 postings editing all the image locations to provide their full path!

Next I had to go to my web host and create a CNAME record for the new domain blog.g4ilo.com that points to ghs.google.com. This tells the world, via the Domain Name System (DNS), that blog.g4ilo.com is hosted on Google's servers. I then had to wait while this information propagated around the Internet, so I went and helped Olga get the shopping.

That was the easy part. The difficult bit - the part I was concerned about - was ensuring that all references to my blog at its old location would automatically redirect to the new one. Google was promising to build a migration tool that would take care of all this, but I was afraid that it would only handle the case where the blog was the only thing on the server. My blog was cohabiting with my G4ILO's Shack website, so I wanted only links to the blog pages to be redirected.

Blogger techs told me "you can write an .htaccess file that can do this" but they seemed to have rather more faith in my ability to do this than I had. The .htaccess file is a configuration file used by Apache web servers that allows them to do more than simply serve the page "blah.html" when someone's browser requests that exact page. None of the .html pages at G4ILO's Shack exist as real files at all. They are all generated on the fly from a MySQL database by a content management system which knows what text is required thanks to an .htaccess file that the CMS authors thankfully provided. And I didn't want to break that.

The .htaccess file uses something called "regular expressions" to match against the filenames that are requested. They caused a perplexed expression to appear on my face because I don't have the kind of mind that is good at puzzles and I just couldn't figure out how to use them. So I did what I normally do when I hit a technical problem: Google to see if someone cleverer than me had had the same problem and managed to solve it.

I found several examples where people had transferred their blog from domain.com/blog, or even from just domain.com where the blog was the only thing on the server, neither of which matched my situation. But eventually I managed to find some examples which, with a bit of trial and error testing, seem to do the job of redirecting all requests for blog files to the new server. For the benefit of those following in my footsteps, the lines I had to insert in my .htaccess file are:

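# Old blog main page (dots are escaped so they match only a literal dot)
RedirectMatch permanent ^/blog\.html$ http://blog.g4ilo.com/
# RSS feed
RedirectMatch permanent ^/rss\.xml$ http://blog.g4ilo.com/rss.xml
# Atom feed
RedirectMatch permanent ^/atom\.xml$ http://blog.g4ilo.com/atom.xml
# Archive pages, e.g. /20YY_MM_DD_archive.html
RedirectMatch permanent ^/20([0-9][0-9])_([0-9][0-9])_([0-9][0-9])_archive\.html$ http://blog.g4ilo.com/20$1_$2_$3_archive.html
# Individual posts, stored under a /20YY/ directory
RedirectMatch permanent ^/20([0-9][0-9])/(.*)$ http://blog.g4ilo.com/20$1/$2
# Label pages, which live under /search/label/ on Blogger
RedirectMatch permanent ^/labels/(.*)\.html$ http://blog.g4ilo.com/search/label/$1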


These take care of, respectively, my old blog main page (blog.html), the RSS and Atom feeds, the archives, the individual posts, and the labels.
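
If you want to check patterns like these before putting them on a live server, one option is to mirror them in a few lines of Python. This is only a rough sketch and not something Apache runs: the names NEW_HOST, RULES and redirect_target are made up for the illustration, the sample paths are invented, and Python's \1 stands in for Apache's $1. The dots are escaped here so that each one matches only a literal dot.

```python
import re

# Hypothetical Python mirror of the six RedirectMatch rules.
# Each entry is (pattern, replacement); \1, \2, \3 play the role of $1, $2, $3.
NEW_HOST = "http://blog.g4ilo.com"

RULES = [
    (r"^/blog\.html$", NEW_HOST + "/"),
    (r"^/rss\.xml$", NEW_HOST + "/rss.xml"),
    (r"^/atom\.xml$", NEW_HOST + "/atom.xml"),
    (r"^/20([0-9][0-9])_([0-9][0-9])_([0-9][0-9])_archive\.html$",
     NEW_HOST + r"/20\1_\2_\3_archive.html"),
    (r"^/20([0-9][0-9])/(.*)$", NEW_HOST + r"/20\1/\2"),
    (r"^/labels/(.*)\.html$", NEW_HOST + r"/search/label/\1"),
]


def redirect_target(path):
    """Return the new URL for an old path, or None if no rule matches."""
    for pattern, replacement in RULES:
        match = re.search(pattern, path)
        if match:
            # The first matching rule wins; fill in the captured groups.
            return match.expand(replacement)
    return None


if __name__ == "__main__":
    # Invented sample paths, one per kind of page plus one that should be left alone.
    for old_path in ["/blog.html", "/2010_02_01_archive.html",
                     "/2010/02/some-post.html", "/labels/QRP.html",
                     "/index.html"]:
        print(old_path, "->", redirect_target(old_path))
```

A path that returns None is one the rules leave alone, which is the behaviour I wanted for the G4ILO's Shack pages and the uploaded images.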

The thing that caused the most head scratching in the end was the bit of code that displays the last ten blog topics on the front page of G4ILO's Shack. This is generated from the RSS XML file produced by Blogger. For some reason it worked with the copy that was stored on my server but not with the one obtained direct from Blogger, even though the two looked practically identical. But after more trial and error that issue was also resolved at around midnight last night.
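
The front-page code itself isn't reproduced here, but the general idea of pulling the latest topics out of an RSS file can be sketched in Python. Treat this purely as an illustration under assumptions: the function name latest_topics, the sample feed and its example.com links are all invented, and a real Blogger feed carries many more fields than this.

```python
import xml.etree.ElementTree as ET

# Invented, cut-down RSS 2.0 feed used only for this illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>Newest post</title>
      <link>http://blog.example.com/2010/02/newest-post.html</link>
    </item>
    <item>
      <title>Older post</title>
      <link>http://blog.example.com/2010/01/older-post.html</link>
    </item>
    <item>
      <title>Oldest post</title>
      <link>http://blog.example.com/2009/12/oldest-post.html</link>
    </item>
  </channel>
</rss>"""


def latest_topics(rss_text, limit=10):
    """Return (title, link) pairs for the first `limit` items in the feed.

    Assumes the feed lists its newest items first.
    """
    root = ET.fromstring(rss_text)
    items = root.findall("./channel/item")
    return [(item.findtext("title"), item.findtext("link"))
            for item in items[:limit]]


if __name__ == "__main__":
    for title, link in latest_topics(SAMPLE_RSS):
        print(title, link)
```

In a sketch like this, anything that changes the shape of the XML, such as namespaces or different element names, would make the lookups come back empty, so two feeds that look the same to the eye can still behave differently.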

I'm crossing my fingers, but I think it all works. I have deleted all the blog files off the old server now, apart from the uploaded images of course and a few other files used by the template. Please let me know if you notice anything that seems wrong.

My blog address is now blog.g4ilo.com and you should update it if you are not receiving updates. But because of the redirection, hopefully you won't need to. Now I'm going for a lie down. Somebody please pass me an ice pack!

2 comments:

Gordon said...

Better to use Mod Rewrite.

See my article here.

http://www.ecalpemos.org/2010/02/moving-from-blogger-ftp-to-blogspot.html

It's near the bottom of the article. Also look at the second comment. I had to add another .htaccess file with a single line into my images directory to stop them being redirected.

Gordon
GM4SVM

Unknown said...

Hi Gordon. I don't understand the difference between RedirectMatch Permanent and RewriteRule with R=301. I don't have the problem of images being redirected because I'm redirecting specific file specs whereas you're doing an *.* kind of redirect.