Howdy! This importer allows you to extract posts from a Blosxom-generated RSS 2.0
feed file into your blog. It can also import writebacks from
the Blosxom writeback or writeback_plus plugins.
The feed should be generated using the following Blosxom .rss20 theme.
If you don't use themes, simply break the theme up into the necessary individual
flavour files.
Blosxom RSS 2.0 WordPress Import theme file:
<!-- blosxom theme .rss20 -->
<!-- blosxom content_type text/xml -->
<!-- blosxom head -->
<?xml version="1.0"?>
<rss version="2.0">
<channel>
<title>$blog_title</title>
<link>$url</link>
<description>$blog_description</description>
<language>$blog_language</language>
<copyright>Get Lost!</copyright>
<generator>Blosxom</generator>
<ttl>180</ttl>
<!-- blosxom date -->
<!-- blosxom story -->
<item>
<title>$title</title>
<link>$url$path/$fn.html</link>
<description><![CDATA[$body]]></description>
<pubDate>$dw, $da $mo $yr $hr:$min:00 PST</pubDate>
<category><@filesystem.path_basename path="$path" output="yes" /></category>
<guid isPermaLink="true">$url$path/$fn.html</guid>
<author>Administrator</author>
$writeback::writebacks
</item>
<!-- blosxom writeback -->
<wb>
<wb_name>$writeback::name</wb_name>
<wb_url>$writeback::url</wb_url>
<wb_date>$writeback::date</wb_date>
<wb_ip>$writeback::ip</wb_ip>
<wb_title>$writeback::title</wb_title>
<wb_comment><![CDATA[$writeback::comment]]></wb_comment>
</wb>
<!-- blosxom foot -->
</channel>
</rss>
If you look carefully at the above theme you will see that the category field is filled
in using the interpolate_fancy plugin, which calls the path_basename
function within the filesystem plugin. This causes the basename of
the post's file path to be used as the category name. For example, if a post exists
at /software/blosxom/hacks.txt, the path is /software/blosxom
and the resulting category will be blosxom. This might not be what you want, and
you have a few options. The first is to remove the category field from the
theme, so that every post is filed under the Uncategorized category; you will then
need to recategorize your posts manually using the WordPress admin pages. The second
is to write, or politely ask someone else to write, a Blosxom plugin that breaks apart
the post path and creates one category field for each directory in it. This
essentially cross-pollinates the post into multiple categories. Third... any ideas?
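To make the difference between the current behaviour and the second option concrete, here is a minimal PHP sketch. It is not a Blosxom plugin and not part of this importer; the function names `basename_category` and `path_categories` are hypothetical and only illustrate how a post path could map to one category or to several.

```php
<?php
// Hypothetical helpers for illustration only; neither exists in Blosxom or WordPress.

// Current behaviour: the last directory of the post path becomes the category.
function basename_category($path)
{
    $path = trim($path, '/');
    return $path === '' ? '' : basename($path);
}

// Second option: one category per directory in the post path.
function path_categories($path)
{
    // PREG_SPLIT_NO_EMPTY drops the empty pieces around leading or doubled slashes.
    return preg_split('|/+|', $path, -1, PREG_SPLIT_NO_EMPTY);
}

echo basename_category('/software/blosxom'), "\n";            // blosxom
echo implode(', ', path_categories('/software/blosxom')), "\n"; // software, blosxom
```

With the second approach a post under /software/blosxom would be filed under both software and blosxom, which may or may not suit your category scheme.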
The interpolate_fancy plugin can be downloaded
here and
the filesystem plugin can be downloaded
here.
If your Blosxom blog does not have comments or you don't use the writeback or
writeback_plus plugins then remove the $writeback::writebacks line
from the above theme.
To get started you must modify your Blosxom blog by installing the above theme and
changing your $num_entries configuration item to a very large number (i.e.
more than the number of posts in your blog). Then visit your blog using the following
URL: http://<your_site>/index.rss20. Save the output to a file and upload it to your web server. This
is your Blosxom-generated RSS 2.0 WordPress Import file.
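Before importing, it may be worth checking that the saved file really contains all of your posts, since a $num_entries value that is too small silently truncates the feed. The sketch below is an assumption-laden illustration, not part of the importer: the helper name `count_feed_items` is hypothetical, and it simply counts `<item>` elements, which is how the theme above wraps each post.

```php
<?php
// Hypothetical helper, not part of the importer: counts the <item> elements in a feed.
function count_feed_items($xml)
{
    // preg_match_all() returns the number of full pattern matches.
    return preg_match_all('|<item>(.*?)</item>|is', $xml, $matches);
}

// A tiny made-up feed; for a real check, pass the contents of your saved file instead.
$sample = '<rss version="2.0"><channel>'
        . '<item><title>First</title></item>'
        . '<item><title>Second</title></item>'
        . '</channel></rss>';

echo count_feed_items($sample), " posts found\n"; // 2 posts found
```

If the number reported for your own file is lower than the number of posts in your blog, raise $num_entries and save the feed again.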
Now edit the configuration variables in this file (import-blosxom.php), particularly this line:
define('BLOSXOM_RSSFILE', '');
Set it to the full path of the RSS file you saved above. For example:
define('BLOSXOM_RSSFILE', '/home/foobar/rss.xml');
You have to do this manually for security reasons. There are several other options that you might like to change while editing as well. When you're done,
reload this page and we'll take you to the next step.
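A quick way to catch a mistyped path before running the import is to test whether PHP can read the file. This is only a sketch and not part of import-blosxom.php; the function name `rssfile_problem` and its messages are hypothetical.

```php
<?php
// Hypothetical pre-flight check; not part of import-blosxom.php.
// Returns an empty string when the path looks usable, otherwise a message.
function rssfile_problem($path)
{
    if ($path === '') {
        return 'BLOSXOM_RSSFILE is still empty; set it to the saved feed file.';
    }
    if (!is_file($path) || !is_readable($path)) {
        return "Cannot read '$path'; check the path and the file permissions.";
    }
    return '';
}

// Example: the untouched default setting is reported as a problem.
echo rssfile_problem(''), "\n";
```

In the importer itself you would pass the BLOSXOM_RSSFILE constant to such a check.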
preg_match_all('|<item>(.*?)</item>|is', $importdata, $posts);
$posts = $posts[1];
echo "";
foreach ($posts as $post)
{
$title = $date = $categories = $content = $post_id = '';
echo "- Importing post... ";
preg_match('|<title>(.*?)</title>|is', $post, $title);
$title = addslashes(trim($title[1]));
$post_name = sanitize_title($title);
preg_match('|<pubDate>(.*?)</pubDate>|is', $post, $date);
$date = strtotime($date[1]);
$post_date_gmt = gmdate('Y-m-d H:i:s', $date);
preg_match_all('|<category>(.*?)</category>|is', $post, $categories);
$categories = $categories[1];
// The theme writes <guid isPermaLink="true">, so allow attributes on the tag
preg_match('|<guid[^>]*>(.*?)</guid>|is', $post, $guid);
$guid = addslashes(trim($guid[1]));
preg_match('|<description>(.*?)</description>|is', $post, $content);
// Strip the CDATA wrapper the theme puts around the post body
$content = str_replace(array('<![CDATA[', ']]>'), '', addslashes(trim($content[1])));
$content = unhtmlentities($content);
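// Clean up content: lowercase tag names and self-close <br> and <hr>.
// preg_replace_callback() is used because PHP 7 removed the /e modifier.
$content = preg_replace_callback('|<(/?[A-Z]+)|', function ($m) {
return '<' . strtolower($m[1]);
}, $content);
$content = str_replace('<br>', '<br />', $content);
$content = str_replace('<hr>', '<hr />', $content);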
// Check for a duplicate
$duplicate = $wpdb->get_var("SELECT ID FROM $wpdb->posts WHERE
post_title = '$title' AND post_date_gmt = '$post_date_gmt'");
if ($duplicate)
{
echo "Post already imported ";
continue;
}
// Insert the post into the database
$wpdb->query("INSERT INTO $wpdb->posts
(post_author, post_date,
post_date_gmt, post_content,
post_title, post_status,
comment_status, ping_status,
post_name, guid)
VALUES
('$post_author',
DATE_ADD('$post_date_gmt', INTERVAL '$add_hours:$add_minutes' HOUR_MINUTE), '$post_date_gmt',
'$content', '$title',
'$post_status', '$comment_status',
'$ping_status', '$post_name', '$guid')");
$post_id = $wpdb->get_var("SELECT ID FROM $wpdb->posts WHERE
post_title = '$title' AND post_date_gmt = '$post_date_gmt'");
if (!$post_id)
{
die("Couldn't get post ID");
}
// Insert and associate the categories with the post
if (count($categories) != 0)
{
foreach ($categories as $post_category)
{
// Decode entities, then escape the name before it is used in SQL
$post_category = addslashes(unhtmlentities($post_category));
// See if the category exists yet
$cat_id = $wpdb->get_var("SELECT cat_ID from $wpdb->categories WHERE
cat_name = '$post_category'");
if (!$cat_id && (trim($post_category) != ''))
{
$cat_nicename = sanitize_title($post_category);
$wpdb->query("INSERT INTO $wpdb->categories (cat_name, category_nicename)
VALUES ('$post_category', '$cat_nicename')");
$cat_id = $wpdb->get_var("SELECT cat_ID from $wpdb->categories WHERE
cat_name = '$post_category'");
}
if (trim($post_category) == '')
{
$cat_id = 1;
}
// Double check it's not there already
$exists = $wpdb->get_row("SELECT * FROM $wpdb->post2cat WHERE
post_id = $post_id AND category_id = $cat_id");
if (!$exists)
{
$wpdb->query("INSERT INTO $wpdb->post2cat (post_id, category_id)
VALUES ($post_id, $cat_id)");
}
// JPS' addition - increment count if cat ID exists
if ($cat_id) {
$wpdb->query("UPDATE $wpdb->categories SET category_count = category_count + 1 WHERE cat_ID = $cat_id");
}
// End JPS' addition
}
}
else
{
$exists = $wpdb->get_row("SELECT * FROM $wpdb->post2cat WHERE
post_id = $post_id AND category_id = 1");
if (!$exists)
{
$wpdb->query("INSERT INTO $wpdb->post2cat (post_id, category_id)
VALUES ($post_id, 1)");
}
}
// Insert the writebacks for the post
$wbs = '';
preg_match_all('|<wb>(.*?)</wb>|is', $post, $wbs);
$wbs = $wbs[1];
if (!$import_writebacks || (count($wbs) == 0))
{
echo "Done!";
continue;
}
foreach ($wbs as $post_wb)
{
$wb_name = $wb_url = $wb_email = $wb_date_gmt = '';
$wb_title = $wb_comment = $wb_ip = '';
preg_match('|<wb_name>(.*?)</wb_name>|is', $post_wb, $wb_name);
if ($wb_name)
{
$wb_name = addslashes(trim($wb_name[1]));
}
preg_match('|<wb_url>(.*?)</wb_url>|is', $post_wb, $wb_url);
if ($wb_url)
{
$wb_url = trim($wb_url[1]);
if (preg_match('|mailto|is', $wb_url) || preg_match('|.+@.+|is', $wb_url))
{
// The "URL" is really an e-mail address: copy it before clearing $wb_url
$wb_email = addslashes($wb_url);
$wb_url = '';
}
else
{
$wb_url = addslashes($wb_url);
$wb_email = '';
}
}
else
{
$wb_url = '';
$wb_email = '';
}
preg_match('|<wb_date>(.*?)</wb_date>|is', $post_wb, $wb_date_gmt);
if ($wb_date_gmt)
{
$wb_date_gmt = strtotime($wb_date_gmt[1]);
$wb_date_gmt = gmdate('Y-m-d H:i:s', $wb_date_gmt);
}
preg_match('|<wb_ip>(.*?)</wb_ip>|is', $post_wb, $wb_ip);
if ($wb_ip)
{
$wb_ip = addslashes(trim($wb_ip[1]));
}
preg_match('|<wb_title>(.*?)</wb_title>|is', $post_wb, $wb_title);
if ($wb_title)
{
$wb_title = addslashes(trim($wb_title[1]));
}
preg_match('|<wb_comment>(.*?)</wb_comment>|is', $post_wb, $wb_comment);
if ($wb_comment)
{
// Strip the CDATA wrapper the theme puts around the writeback text
$wb_comment = str_replace(array('<![CDATA[', ']]>'), '', addslashes(trim($wb_comment[1])));
}
if ($wb_title)
{
$wb_comment = $wb_title . "\n" . $wb_comment;
}
$wb_comment = unhtmlentities($wb_comment);
// Check if it's already there
if (!$wpdb->get_row("SELECT * FROM $wpdb->comments WHERE
comment_post_ID = $post_id AND
comment_date_gmt = '$wb_date_gmt' AND
comment_content = '$wb_comment'"))
{
$wpdb->query("INSERT INTO $wpdb->comments
(comment_post_ID, comment_author,
comment_author_email, comment_author_url,
comment_author_IP, comment_date, comment_date_gmt,
comment_content, comment_approved)
VALUES
($post_id, '$wb_name',
'$wb_email', '$wb_url',
'$wb_ip', DATE_ADD('$wb_date_gmt', INTERVAL '$add_hours:$add_minutes' HOUR_MINUTE), '$wb_date_gmt',
'$wb_comment', '1')");
}
}
echo "Done!";
}
?>
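The date handling in the loop above can be tried in isolation. The following is a sketch under stated assumptions: the sample pubDate value is made up but follows the theme's `$dw, $da $mo $yr $hr:$min:00 PST` layout, and `$timezone_offset` is a hypothetical setting (hours relative to GMT) from which `$add_hours` and `$add_minutes` might be derived; the part of the script that actually defines those two variables is not shown in this listing.

```php
<?php
// A sample pubDate in the layout the theme emits (the value itself is made up).
$pubdate = 'Mon, 03 Jan 2005 14:30:00 PST';

// strtotime() understands the PST abbreviation; gmdate() formats the result in GMT.
$post_date_gmt = gmdate('Y-m-d H:i:s', strtotime($pubdate));
echo $post_date_gmt, "\n"; // 2005-01-03 22:30:00

// Assumption: a fractional-hour offset is split into the HOUR_MINUTE parts
// that the INSERT statements hand to DATE_ADD().
$timezone_offset = 5.5;
$add_hours   = intval($timezone_offset);
$add_minutes = intval(60 * ($timezone_offset - $add_hours));
echo "$add_hours:$add_minutes\n"; // 5:30
```

Note that the theme hard-codes PST in pubDate, so if your Blosxom server runs in another timezone you may want to adjust that line of the theme before generating the feed.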