Blog content migration from MovableType to Typo using XML-RPC
Posted by Marcus Crafter Thu, 21 Sep 2006 18:41:36 GMT
After installing my Typo-based blog at its new location last weekend, the next step was to find a way to migrate all of the data across from my older MovableType-based blog.
Since the two blog engines were on different servers and used different database schemas, I chose to migrate the content over XML-RPC rather than delve into the depths of SQL magic (mapping tables, foreign keys, data, etc.) to bring it all across.
XML-RPC is a style of web service that both MovableType and Typo support to provide programmatic access to blogs over the web. Several APIs are available for MovableType and Typo, including metaWeblog, Blogger, and MT, each supporting various methods for accessing and posting articles. Clients and servers for these APIs have been implemented in many languages.
For example, to create a new post using the metaWeblog API, the XML-RPC method to call is defined as follows:
metaWeblog.newPost
Description: Creates a new post, and optionally publishes it.
Parameters: String blogid, String username, String password, struct content, boolean publish
Return value: on success, String postid of new post; on failure, fault
Here's how you would call this method using the internet glue of choice, Ruby, with a Typo blog:
require "xmlrpc/client"
server = XMLRPC::Client.new("<my-blog-server.net>", "/backend/xmlrpc")
content = { :title => 'A new post', :description => 'hello world' }
result = server.call("metaWeblog.newPost", 1, "<username>", "<password>", content, true)
A few things to note:
- Typo's XML-RPC interface lives at <web-server>/<blog>/backend/xmlrpc; the endpoint differs for each blogging implementation.
- The content 'struct' in the XML-RPC API can be specified as a Ruby Hash.
- The 'true' at the end of the call indicates that the article should be published immediately.
So, to migrate all of our data, we need to fetch the posts from the source blog and re-post them to the new one. metaWeblog.getRecentPosts and metaWeblog.newPost do what we need.
My preference here was to use the metaWeblog API, since it uses a structure for post content rather than just a string. This means that in addition to the post data itself, you can also specify post metadata such as the original creation date (which ensures your original post dates are preserved), whether comments/trackbacks are enabled, and many other things specified in the API.
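As a rough illustration of the metadata mentioned above, here is a minimal sketch of a content struct built as a Ruby Hash. The build_content helper and its arguments are hypothetical. The field names dateCreated, mt_allow_comments and mt_allow_pings come from the metaWeblog API and the MovableType extensions to it. Whether a given server honours each of them is an assumption you should check against your own blog engine.

```ruby
# Build a metaWeblog content struct as a plain Ruby Hash.
# build_content is a hypothetical helper, used here for illustration only.
def build_content(title, body, created_at, allow_comments: true, allow_pings: true)
  {
    'title'             => title,
    'description'       => body,
    'dateCreated'       => created_at,             # original publication time
    'mt_allow_comments' => allow_comments ? 1 : 0, # MovableType extension field
    'mt_allow_pings'    => allow_pings ? 1 : 0     # MovableType extension field
  }
end

# Example values are made up.
content = build_content('A new post', 'hello world', Time.utc(2005, 3, 14, 9, 30, 0))
```

The resulting Hash would be passed as the content argument of metaWeblog.newPost, just like the simpler Hash in the earlier example. Ruby's XML-RPC client should send a Time value as a dateTime.iso8601 element, but it is worth confirming on a test post before migrating everything.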
Here's the migration code:
require 'xmlrpc/client'

# Thin wrapper around a blog's metaWeblog XML-RPC endpoint.
class Blog
  def initialize(host, path, username, password, blog_id = 1, port = 80)
    @server = XMLRPC::Client.new(host, path, port)
    @blog_id = blog_id
    @username = username
    @password = password
  end

  # Fetch the most recent posts, newest first.
  def posts(count = 5)
    @server.call('metaWeblog.getRecentPosts', @blog_id, @username, @password, count)
  end

  # Publish a post, retrying on timeouts. Returns nil once retries run out.
  def <<(content, retries = 5)
    while retries > 0
      begin
        return @server.call('metaWeblog.newPost', @blog_id, @username, @password, content, true)
      rescue Timeout::Error
        puts "Retrying #{content['title']}, retry #{retries}"
        retries -= 1
      end
    end
  end
end

mt = Blog.new('<source-blog-host>', '/MT/mt-xmlrpc.cgi', '<username>', '<password>', 5)
ty = Blog.new('<target-blog-host>', '/backend/xmlrpc', '<username>', '<password>', 1)

# Reverse so the oldest post is migrated first.
mt.posts(330).reverse.each_with_index do |post, index|
  puts "#{index}: migrating post #{post['title']}"
  ty << post
end
$> ruby moveblog.rb
0: migrating post This was my first post!
1: migrating post My second post!
...
329: migrating post This was my most recent post!
...and that's it.
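If you'd like to try the loop before pointing it at real servers, one option is a dry run against in-memory data. The sketch below is not part of the original script: migrate is a hypothetical helper that accepts any target responding to <<, so a plain Array can stand in for the destination blog. It assumes, as the output above suggests, that the source returns posts newest first.

```ruby
# Copy posts into any target that responds to <<, oldest first.
# migrate is a hypothetical helper, not part of the original script.
def migrate(recent_posts, target)
  # Posts are assumed to arrive newest first, so reverse them before posting.
  recent_posts.reverse.each_with_index do |post, index|
    puts "#{index}: migrating post #{post['title']}"
    target << post
  end
  target
end

# Dry run with made-up posts and an Array standing in for the target blog.
fake_posts = [
  { 'title' => 'Most recent post' },
  { 'title' => 'Second post' },
  { 'title' => 'First post' }
]
migrated = migrate(fake_posts, [])
```

Because the Blog class in the script defines <<, the same helper could in principle be called with a Blog instance as the target once the dry run looks right.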
There's one caveat to be aware of with this approach: comments. As far as I can see, none of the API methods for obtaining post data from an existing server return comment content. This means article comments won't be migrated to your new blog. In my case this was actually beneficial, since my older blog had been spammed beyond recognition and my true comment count was low, but your blog might be in a different position.
Once an XML-RPC API supports retrieval of comments, however, we'll be able to add support for it to the migration script.

Wow, mate, that is really cool.
More power to Ruby and open APIs.
Good work, Marcus. I have one question for you:
Have you ever used 'pubDate' in the content? I'm trying to use it to preserve the publishing dates on my old blog. But Typo keeps considering the post as just created.
Hi Simon,
Thanks!
When using the script I wrote, I didn't notice any problems with the published date being modified. It certainly comes across in the XML-RPC, and I've been able to verify that the dates are preserved using my weblog editor.
Perhaps there are additional dates for when updates are made to articles?
Cheers,
Marcus