Moving from Jekyll to WordPress

This post is also available in German.

For the last few years I’d been using Jekyll to manage my blog. Jekyll is a static site generator: there’s no special software running on the webserver. Instead I run a program (Jekyll) on my local computer to convert a bunch of source files into the HTML files that make up this blog. In the end I just get a folder full of files that I have to upload to my webserver.

This has the advantage of being extremely simple work for my webserver. No pages have to be generated and rendered for each visitor, and I don’t need a database and stuff. There’s only a static HTML file to deliver to the visitor. I don’t even have to worry about security flaws, since there is no extra software running that could have them.

But it also has its disadvantages, mainly that it’s not that simple to write a blog post “on the go”. I need all my source files on a computer that has Jekyll installed. Using my smartphone is entirely out of the question. In WordPress, on the other hand, I can just visit the admin panel, open the editor and type away. And my argument about security flaws doesn’t really count, since I already have an instance of WordPress running on my server to serve my podcast Nerd, Nerd, Nerd & Uli.

Once the decision to move back from Jekyll to WordPress was made, I started looking for a converter. I didn’t want to just leave all my old posts behind and start anew. But I couldn’t find anything useful, so I built my own solution.

All the scripts and files you need can be found in my Gitea. Let me be clear: The code isn’t beautiful. I mean, I probably won’t use it ever again. Why should I put lots of time into it, just to make it more “beautiful”? It’s working (for me), that’s all I need.

The conversion happens in 5 steps:

  • Exporting all posts into an XML file
  • Extracting all media URLs
  • Importing the media into WordPress
  • Modifying the XML file to use the changed image URLs
  • Importing the posts into WordPress

Exporting all posts into an XML file

In Gitea you’ll find the file wp_export.xml. Put it in your Jekyll folder and modify it. The places you should modify are easy to spot. Essentially change everything containing my name or fabianonline.de. 😉

Then start Jekyll as usual by calling jekyll build. In your _site folder you’ll find a file called wp_export.xml containing all your posts. But don’t import this file into your new WordPress blog just yet!
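Before moving on, it may be worth checking that the export really contains all your posts. The following is only a rough sketch of such a check, not part of my scripts: count_items is a made-up helper, and the _posts glob assumes the default Jekyll layout.

```ruby
# Hypothetical helper: counts the <item> elements in an export file's content.
def count_items(xml)
  xml.scan(/<item[\s>]/).size
end

# Assumption: run from the Jekyll folder, after `jekyll build`.
if File.exist?("_site/wp_export.xml")
  exported = count_items(File.read("_site/wp_export.xml"))
  sources  = Dir.glob("_posts/**/*.{md,markdown,html}").size
  puts "#{exported} items exported, #{sources} source files in _posts"
end
```

The two numbers don’t have to match exactly (unpublished posts, for example, may not be exported), but a big difference hints at a problem in the template.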

Extracting all media URLs

Instead of just importing all media files into WordPress, I decided to import only the media files that are actually used. For this task, I wrote the script extract_media.rb. It searches the previously created wp_export.xml for links to media files and puts all these links into another XML file to be imported into WordPress.

Grab the script and modify it. The following parts are especially important:

images = STDIN.read.scan(/(?:src|href)="(\/uploads\/.+?)"/).to_a.map(&:first).uniq

This line looks for all src and href attributes beginning with /uploads/. This was tailored to my blog, which had all media files in the folder uploads. You may have to adapt this to your own folder structure.
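To see what this pattern picks up, here is a small sketch that runs the same regular expression against a made-up piece of HTML. The helper name extract_media_paths and the sample markup are assumptions for illustration only.

```ruby
# Same pattern as in the line above; adjust /uploads/ to your own folder.
MEDIA_PATTERN = /(?:src|href)="(\/uploads\/.+?)"/

# Hypothetical helper wrapping the scan.
def extract_media_paths(html)
  # scan returns one array of capture groups per match,
  # so map(&:first) pulls out the captured path.
  html.scan(MEDIA_PATTERN).map(&:first).uniq
end

# Made-up sample content.
sample = <<~HTML
  <a href="/uploads/2015/photo.jpg"><img src="/uploads/2015/photo_small.jpg"></a>
  <img src="/uploads/2015/photo.jpg">
  <a href="https://example.com/other.png">external</a>
HTML

puts extract_media_paths(sample)
```

The duplicate photo.jpg is listed only once thanks to uniq, and the external link is skipped because it doesn’t start with /uploads/.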

<wp:post_date><![CDATA[1970-01-01 13:41:29]]></wp:post_date>
<wp:post_date_gmt><![CDATA[1970-01-01 12:41:29]]></wp:post_date_gmt>

This block sets a fixed timestamp on January 1, 1970 for all media files. It does not use the timestamp of the media file because we need to be sure where WordPress will save the media file. Using this date, we know the file will be put into a folder 1970/01.

<wp:attachment_url><![CDATA[https://blog.fabianonline.de#{path}]]></wp:attachment_url>

This line contains the full URL at which the media file is reachable right now (!). WordPress will download the file from this URL, so this has to be correct.
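Put together, an entry for a single media file could be built roughly like this. This is a trimmed sketch, not the actual content of extract_media.rb: the attachment_item helper, the example base URL, and the extra title and post_type fields are my assumptions about a minimal attachment entry.

```ruby
# Assumption: the URL where the old blog is still reachable.
BASE_URL = "https://blog.example.com"

# Hypothetical helper: builds one <item> for a media file path.
def attachment_item(path, base_url: BASE_URL)
  <<~XML
    <item>
      <title>#{File.basename(path)}</title>
      <wp:post_date><![CDATA[1970-01-01 13:41:29]]></wp:post_date>
      <wp:post_date_gmt><![CDATA[1970-01-01 12:41:29]]></wp:post_date_gmt>
      <wp:post_type><![CDATA[attachment]]></wp:post_type>
      <wp:attachment_url><![CDATA[#{base_url}#{path}]]></wp:attachment_url>
    </item>
  XML
end

puts attachment_item("/uploads/2015/photo.jpg")
```

A real file needs the surrounding rss and channel elements as well, so take the script from Gitea as the starting point and use this only to understand the three parts described above.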

Now run the script, piping in wp_export.xml created in step 1:

ruby extract_media.rb < wp_export.xml > wp_media.xml

This will create wp_media.xml, a file containing information about all media files to import.

Importing the media into WordPress

Now we can import the media files into WordPress. Open the WordPress admin dashboard and navigate to “Tools > Import data”. There, select the “WordPress Importer”. Select and upload wp_media.xml. In the next step, select a user to be the owner of all media and activate the option “Import all attachments”. WordPress will now import all media files.

Modifying the XML file to use the changed image URLs

Most likely the paths to your media files in WordPress differ from those in your old Jekyll blog. WordPress doesn’t allow custom folders and puts all files into a path like wp-content/uploads/<YEAR>/<MONTH>/<FILENAME>. When you’re using a multisite blog, the path will even be wp-content/uploads/sites/<SITE_ID>/<YEAR>/<MONTH>/<FILENAME>. So we have to modify the paths in wp_export.xml to point to the new media paths. That’s what the script finalize_export.rb does.
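As a sketch of this mapping, the following helper builds the new path from an old one. The name wordpress_media_path and its parameters are made up for this example; the path layout is the one described above.

```ruby
# Hypothetical helper: maps an old Jekyll media path to the path
# WordPress uses after the import. site_id is only set for multisite.
def wordpress_media_path(old_path, year: "1970", month: "01", site_id: nil)
  parts = ["/wp-content/uploads"]
  parts << "sites/#{site_id}" if site_id
  parts << year << month << File.basename(old_path)
  parts.join("/")
end

puts wordpress_media_path("/uploads/2015/photo.jpg")
puts wordpress_media_path("/uploads/2015/photo.jpg", site_id: 5)
```

The first call prints /wp-content/uploads/1970/01/photo.jpg, the second one /wp-content/uploads/sites/5/1970/01/photo.jpg.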

First, check the line that selects all the href and src attributes. It should be the same as the one in extract_media.rb.

data = data.gsub("=&quot;#{path}&quot;", "=&quot;/wp-content/uploads/sites/5/1970/01/#{File.basename(path)}&quot;")

This line will modify the paths to the media files. Since we set the date to January 1, 1970 in extract_media.rb, we know that all files will be in the folder /1970/01/.
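Here is a small sketch of that replacement on made-up data. It assumes, like the line above, that the attribute quotes appear as &quot; in the export. The helper names and sample values are hypothetical. Because only the file name survives the move, the second helper lists files from different folders that share a name.

```ruby
# Target folder from the line above; adjust the site ID or drop "sites/5".
NEW_PREFIX = "/wp-content/uploads/sites/5/1970/01"

# Hypothetical helper: applies the replacement for every known media path.
def rewrite_media_paths(data, paths, new_prefix: NEW_PREFIX)
  paths.reduce(data) do |text, path|
    text.gsub("=&quot;#{path}&quot;", "=&quot;#{new_prefix}/#{File.basename(path)}&quot;")
  end
end

# Hypothetical helper: finds file names used in more than one folder.
def basename_collisions(paths)
  paths.group_by { |path| File.basename(path) }.select { |_, group| group.size > 1 }
end

# Made-up sample data.
paths = ["/uploads/2015/photo.jpg", "/uploads/2016/photo.jpg"]
data  = "&lt;img src=&quot;/uploads/2015/photo.jpg&quot;&gt;"

puts rewrite_media_paths(data, paths)
p basename_collisions(paths)
```

If basename_collisions returns anything, those files would end up with the same new path, so it’s probably a good idea to rename them before importing.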

Also, this script removes all newlines from your posts. Paragraphs are already marked up by the conversion from Markdown to HTML. Additional newlines make the post look funny because every one of them shows up as a line break.
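A rough sketch of what removing the newlines can look like, assuming each newline is replaced by a single space so that words on neighbouring lines don’t get glued together. strip_newlines is a made-up name and not necessarily how finalize_export.rb does it.

```ruby
# Hypothetical helper: collapses every newline (plus surrounding
# whitespace) into one space and trims the ends.
def strip_newlines(html)
  html.gsub(/\s*\r?\n\s*/, " ").strip
end

# Made-up sample content.
post = "<p>First line\nsecond line</p>\n\n<p>Next paragraph</p>\n"

puts strip_newlines(post)
```

Keep in mind that this also flattens code samples inside pre blocks, so posts containing those may need some manual cleanup afterwards.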

Call this script, piping in the content of wp_export.xml:

ruby finalize_export.rb < wp_export.xml > wp_posts.xml

The file wp_posts.xml will be created.

Importing the posts into WordPress

This step is kinda a repeat of step 3: Open the “WordPress Importer” again, using the file wp_posts.xml this time. Again select a user to be the owner of all posts, but this time do not activate the option “Import all attachments”.

That’s it. Now you should have all your posts and media files in WordPress.

If it didn’t work and you have to try again, you might find the plugin Bulk Delete useful to quickly delete all posts.
