Importing an existing site
pelican-import is a command-line tool for converting articles from other software to reStructuredText or Markdown. The supported import formats are Blogger XML export, Dotclear export, Posterous API, Tumblr API, WordPress XML export, and feed.
The conversion from HTML to reStructuredText or Markdown relies on Pandoc. For Dotclear, if the source posts are written with Markdown syntax, they will not be converted (as Pelican also supports Markdown).
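Because the HTML conversion relies on Pandoc, it can help to confirm that Pandoc is installed before starting an import. The following is a minimal sketch, assuming the executable is named pandoc and is expected on the PATH; the helper name pandoc_available is hypothetical and not part of Pelican.

```python
import shutil


def pandoc_available(executable="pandoc"):
    """Return True if the given executable can be found on the PATH.

    The default name "pandoc" is an assumption about how Pandoc is installed.
    """
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if pandoc_available():
        print("pandoc found; HTML conversion should be possible")
    else:
        print("pandoc not found; install it before running pelican-import")
```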
Unlike Pelican, WordPress supports multiple categories per article. These are imported as a comma-separated string. You have to resolve these manually, or use a plugin such as More Categories that enables multiple categories per article.
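One way to resolve the imported categories by hand is to keep the first category and move the others into tags. The sketch below does this for a single post. It assumes the imported file is Markdown with "Category:" and "Tags:" metadata lines at the top, ending at the first blank line; the function name keep_first_category is hypothetical and not part of Pelican.

```python
def keep_first_category(text):
    """Keep the first category and append the others to the Tags line.

    Assumes a Markdown metadata header (lines such as "Category: a, b")
    that ends at the first blank line. Text after the header is untouched.
    """
    lines = text.split("\n")
    extra = []
    category_index = None
    tags_index = None
    for i, line in enumerate(lines):
        if line.strip() == "":
            break  # end of the metadata header
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip().lower()
        if key == "category":
            names = [n.strip() for n in value.split(",") if n.strip()]
            if names:
                lines[i] = "Category: " + names[0]
                extra = names[1:]
            category_index = i
        elif key == "tags":
            tags_index = i
    if not extra:
        return text  # nothing to resolve
    if tags_index is not None:
        current = lines[tags_index].partition(":")[2]
        existing = [t.strip() for t in current.split(",") if t.strip()]
        merged = existing + [t for t in extra if t not in existing]
        lines[tags_index] = "Tags: " + ", ".join(merged)
    else:
        # No Tags line yet: add one directly below the Category line.
        lines.insert(category_index + 1, "Tags: " + ", ".join(extra))
    return "\n".join(lines)


if __name__ == "__main__":
    post = "Title: Hello\nCategory: News, Python\nTags: blog\n\nBody text"
    print(keep_first_category(post))
```

Whether the extra categories belong in tags or should simply be dropped is a choice for each site; the sketch only illustrates one option.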
pelican-import has some dependencies not required by the rest of Pelican:
pelican-import [-h] [--blogger] [--dotclear] [--posterous] [--tumblr] [--wpfile] [--feed] [-o OUTPUT] [-m MARKUP] [--dir-cat] [--dir-page] [--strip-raw] [--wp-custpost] [--wp-attach] [--disable-slugs] [-e EMAIL] [-p PASSWORD] [-b BLOGNAME] input|api_token|api_key
Positional arguments
input
 The input file to read
api_token
 (Posterous only) api_token can be obtained from
api_key
 (Tumblr only) api_key can be obtained from
Optional arguments
-h, --help
 Show this help message and exit
--blogger
 Blogger XML export (default: False)
--dotclear
 Dotclear export (default: False)
--posterous
 Posterous API (default: False)
--tumblr
 Tumblr API (default: False)
--wpfile
 WordPress XML export (default: False)
--feed
 Feed to parse (default: False)
-o OUTPUT, --output OUTPUT
 Output path (default: content)
-m MARKUP, --markup MARKUP
 Output markup format: rst, markdown, or asciidoc (default: rst)
--dir-cat
 Put files in directories named after their categories (default: False)
--dir-page
 Put files recognised as pages in a “pages/” subdirectory (Blogger and WordPress import only) (default: False)
--author AUTHOR
 Import only posts from the specified author
--strip-raw
 Strip raw HTML code that can’t be converted to markup, such as Flash embeds or iframes (WordPress import only) (default: False)
--wp-custpost
 Put WordPress custom post types in directories. If used with the --dir-cat option, directories will be created as “/post_type/category/” (WordPress import only)
--wp-attach
 Download files uploaded to WordPress as attachments. Files will be added to posts as a list in the post header, and links to the files within the post will be updated. All files will be downloaded, even if they aren’t associated with a post. Files will be downloaded with their original path inside the output directory, e.g. “output/wp-uploads/date/postname/file.jpg”. (WordPress import only) (requires an internet connection)
--disable-slugs
 Disable storing slugs from imported posts within output. With this disabled, your Pelican URLs may not be consistent with your original posts. (default: False)
-e EMAIL, --email=EMAIL
 Email used to authenticate Posterous API
-p PASSWORD, --password=PASSWORD
 Password used to authenticate Posterous API
-b BLOGNAME, --blogname=BLOGNAME
 Blog name used in Tumblr API
For Blogger:
$ pelican-import --blogger -o ~/output ~/posts.xml
For Dotclear:
$ pelican-import --dotclear -o ~/output ~/backup.txt
For Posterous:
$ pelican-import --posterous -o ~/output --email=<email_address> --password=<password> <api_token>
For Tumblr:
$ pelican-import --tumblr -o ~/output --blogname=<blogname> <api_key>
For WordPress:
$ pelican-import --wpfile -o ~/output ~/posts.xml
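When imports are run repeatedly, for example while tuning options, the same commands can be assembled from a script. The following is a rough sketch that builds the argument list from the options documented above and optionally runs it. The function names build_import_command and run_import are hypothetical, and the sketch assumes pelican-import is installed and on the PATH.

```python
import subprocess

# Source flags taken from the option list above.
SOURCES = {"blogger", "dotclear", "posterous", "tumblr", "wpfile", "feed"}


def build_import_command(source, input_arg, output="content", markup="rst",
                         dir_cat=False):
    """Build a pelican-import argument list.

    input_arg is the positional argument: an input file, or an
    api_token/api_key for the API-based sources.
    """
    if source not in SOURCES:
        raise ValueError("unsupported source: " + source)
    cmd = ["pelican-import", "--" + source, "-o", output, "-m", markup]
    if dir_cat:
        cmd.append("--dir-cat")
    cmd.append(input_arg)
    return cmd


def run_import(cmd):
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_import_command("wpfile", "posts.xml", output="output",
                               markup="markdown", dir_cat=True)
    # Print the command instead of running it; call run_import(cmd) to execute.
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.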
To test the module, one can use sample files:
