Wordpress to Ghost

My experience of moving a mid-sized website from self-hosted Wordpress to a managed Ghost server. Easier than I'd feared, but there were a few gotchas ...
Light at the end of the tunnel (Photo by Aaron Burden on Unsplash)

This site was started in late 2013 using a hosted Wordpress setup. After about five years I was starting to get more serious about writing but was disappointed with the costs involved in using a hosted Wordpress site, so I moved to AWS. If you do this, avoid Bitnami's LAMP stack ... just install everything from scratch. More difficult to set up, but so much easier to maintain.

Five years later, my real frustrations with Wordpress were the plugin maintenance, the ever-increasing integration with Jetpack for readership engagement (subscriptions, social media links etc., and the creeping charges for access to my subscribers) and the never-ending system maintenance ... I spent my spare time with ufw, fail2ban and Cloudflare, not writing.

Enough was enough. With thousands of readers a week, an outdated theme, a smorgasbord of plugins that needed updating on a near-daily basis, and a less-than-satisfying writing experience, I decided to change platforms.

I looked at Substack, Medium and Patreon, setting up trial accounts and investigating the writing/editing interface, the publishing/reader experience and likely migration woes. I considered - and rapidly rejected (too 'bare bones') - Hugo, which I use for a couple of other small sites. For a fresh start I might now choose Hugo, but migrating an established site did not appeal.

I finally decided on using a hosted Ghost installation ... I want to write, not maintain the server, but I could always revert to self-hosted if needed.

Which hosted Ghost?

I did what I always do ... set up a trial account with several providers and then send a pre-subscription, slightly techy, question to their technical support about migration.

Gloat never responded, either to my initial question or to a follow-up email and social media contact. I assumed they were either defunct or not interested.

Ghost(Pro), DigitalPress, Firepress and MagicPages all responded promptly, though some faster (hours, not days) than others, providing coherent, relevant and encouraging answers to my question.

Of these, Ghost(Pro) was looking to be too expensive. With ~4,000 subscribers (none currently paying) I estimated it would cost about $750/annum.

Of the remainder, DigitalPress looked promising - and about half the cost of Ghost(Pro) - though I'd need to pay for their top offering and a surcharge for the subscriber/email numbers, or run my own Mailgun account. The latter did not appeal. However, the thing that ruled them out was the inability to upload a theme during the trial period, meaning I could not test the template changes I wanted for the site (for the commenting system, see below).

I chose MagicPages.

Jannis was quick to respond, the pricing was attractive (and became more so during my trial period as he reduced charges for additional emails 😃) and I could test all the features I needed before subscribing.

I have not been disappointed. Since subscribing, Jannis has quickly resolved one or two teething problems I experienced. So far, I'm a 'happy camper'.

The scale of the problem

My site consisted of ~600 posts, ~50 pages, ~1,000,000 words, ~7,500 comments and more than a gigabyte of images.

I'd used a range of plugins to handle galleries and footnotes, the latter being an issue as they are not natively handled in the Ghost editor (a surprising weakness for a platform supposedly dedicated to writing).

I read the official guide to migrating from Wordpress to Ghost. It made it sound a little easier than it actually was.

Export

I used the Export to Ghost plugin for Wordpress to generate a JSON file.

💡
Sublime Text is useful for opening and viewing large JSON files. Use CMD-OPT-F to reformat it. It's possible you could make extensive search/replace type changes with Sublime Text, but - with 27,000 lines and a lot of changes to make - I chose to use perl scripts.

The majority of the edits need to be made to the mobiledoc records for each post in the JSON file, but I also needed to work out how tags were assigned to posts.

I created a single .tar.gz archive of the /wp-content/uploads/ directory and downloaded it using scp.

I exported all the Wordpress comments using this WP plugin.

Finally, I exported a list of all the posts with tags and categories into a CSV format file using a WP plugin, the name of which I am afraid I have forgotten.

The other useful download, though I also cannot remember how I generated it, was a list of posts and the count of page reads over the last year. This provides two things:

  • an indication of which posts are most popular and therefore which I needed to focus on in terms of formatting/broken links etc. to maintain a good experience for my readers. There's no point in putting effort into a post published half a decade ago and read only 53 times last year if there's something older that was read 17,000 times.
  • an aide-mémoire that some posts aren't posts, but are actually search results from Wordpress, or categories, or similar that will need to be corrected, either by editing individual posts or by adding them to the redirects.yaml file

Gotchas

Wordpress allows files that are not images to be uploaded to the images directories {{56}}. The Ghost image import does not work if there are non-image files in the uploaded zip files; the upload fails with a wholly uninformative error. It is therefore necessary to exclude anything that isn't an image. I wrote a short script to recursively search through ~20,000 files and retain only directories and jpg/jpeg/gif/webp files. In my case, ~50 non-image files were littered around the /uploads/ directory. They are there for a reason ... don't just delete them. Mine were mainly PDFs that had to be hosted elsewhere (and the links in posts amended accordingly).
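For illustration, here is a minimal Python sketch of that kind of filter. It is not the script used for the migration (which isn't shown); the function name and the idea of copying into two separate trees are assumptions. It copies rather than deletes, so the non-image files stay available for re-hosting.

```python
import shutil
from pathlib import Path

# Extensions retained in the post; extend this set if your library differs.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".webp"}


def split_uploads(src: Path, images_dst: Path, others_dst: Path) -> tuple:
    """Copy images to images_dst and every other file to others_dst.

    The directory layout under src is preserved in both destinations.
    Both destinations should live outside src. Returns (images, others).
    """
    images = others = 0
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        is_image = path.suffix.lower() in IMAGE_EXTENSIONS
        target_root = images_dst if is_image else others_dst
        target = target_root / path.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copy2 keeps timestamps
        if is_image:
            images += 1
        else:
            others += 1
    return images, others
```

Something like `split_uploads(Path("uploads"), Path("images-only"), Path("not-images"))` would leave the original download untouched, and the second tree doubles as the list of files whose links need amending.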

Illogically (but it's Wordpress ... was I surprised?) some plugins also upload or store files in the /wp-content/uploads/ directory. I archived and then deleted these. They weren't missed.

Ghost file upload can struggle with large files - at least that was my experience. I split my downloaded ~1 GB /wp-content/uploads/ file into about 10 individual zip files for uploading to Ghost.

💡
My recollection from testing is that Ghost cannot handle zip files divided using the -s (split) option. I did it manually, based roughly on the size of the directories in /wp-content/uploads/.
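The manual split could be approximated with a short Python sketch like the one below. It groups the top-level directories of the uploads folder into roughly equal batches by size and writes one ordinary zip per batch. The function names are hypothetical, and the internal layout of each zip (paths relative to the uploads folder) is an assumption that may need adjusting to whatever your Ghost import expects.

```python
import zipfile
from pathlib import Path


def directory_size(path: Path) -> int:
    """Total size in bytes of all files below path."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())


def plan_batches(sizes: dict, batch_count: int) -> list:
    """Greedy split: largest directory first, into the lightest batch so far."""
    batches = [[] for _ in range(batch_count)]
    totals = [0] * batch_count
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        lightest = totals.index(min(totals))
        batches[lightest].append(name)
        totals[lightest] += size
    return [batch for batch in batches if batch]


def write_batches(uploads: Path, out_dir: Path, batch_count: int = 10) -> list:
    """Write one zip per batch; files directly in uploads/ are ignored."""
    sizes = {d.name: directory_size(d) for d in uploads.iterdir() if d.is_dir()}
    out_dir.mkdir(parents=True, exist_ok=True)
    archives = []
    for number, batch in enumerate(plan_batches(sizes, batch_count), start=1):
        archive = out_dir / f"images-{number:02d}.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
            for name in batch:
                for f in sorted((uploads / name).rglob("*")):
                    if f.is_file():
                        zf.write(f, f.relative_to(uploads).as_posix())
        archives.append(archive)
    return archives
```

Each archive is a complete, standalone zip, which sidesteps the problem with archives divided using the split option.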

The Export to Ghost plugin automagically converts image links to /content/images/wordpress/image_name {{63}}. Useful, except I didn't want a billion references to 'wordpress' buried in the pages. These were modified in a script that made additional changes to the JSON file (see below).

Footnotes

I had used the WP plugin Easy Footnotes for years. This works by wrapping the footnote text in a shortcode [efn_note]like this[/efn_note]. Some posts had none, others had 20 or more.

Ghost does not have native footnotes. The best solution I found when testing everything was the littlenote.js library as recommended by Cathy Sarisky {{77}}. This works entirely by code-injection and - with a few of my amateur CSS tweaks - provided the functionality I wanted.

However, it involves wrapping the footnote number in paired square brackets and the footnote in a matching bracket followed by a colon.

This means that the footnote would be marked in the HTML like this [[3]] and - somewhere else - there would be the footnote, which would appear like this.

[[3]]: Here is the footnote text.

All well and good except I use Obsidian for all my writing and paired square brackets are used for Obsidian document links. I therefore changed the regex in the injected code to instead work with squiggly brackets {{3}}. To match this I added a subroutine to the script I used to modify the JSON file prior to importing it to Ghost (see below) to substitute all the footnote citations and reformat the footnotes.
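As a rough illustration of that substitution, here is a Python sketch (the real subroutine was perl and is not reproduced here). It assumes the post body is available as a plain HTML string; how that string is pulled out of, and put back into, the mobiledoc record is left out. The function name and the choice of one paragraph per footnote are assumptions.

```python
import re

# Easy Footnotes shortcode: [efn_note]footnote text[/efn_note]
EFN_PATTERN = re.compile(r"\[efn_note\](.*?)\[/efn_note\]", re.DOTALL)


def convert_footnotes(html: str) -> str:
    """Replace each shortcode with a {{n}} citation and append the notes."""
    notes = []

    def cite(match):
        notes.append(match.group(1).strip())
        return "{{%d}}" % len(notes)

    body = EFN_PATTERN.sub(cite, html)
    if not notes:
        return html  # nothing to do, leave the post untouched
    footer = "".join(
        "<p>{{%d}}: %s</p>" % (number, text)
        for number, text in enumerate(notes, start=1)
    )
    return body + footer
```

Numbering restarts at 1 for every post, and the footnote list is appended after the existing body text.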

The littlenote.js code-injection only recognises footnotes in particular HTML elements, for example within blockquotes or <p> tags. If you have footnotes in headings you will need to modify the injected code.

Tags and categories

My tags and categories in Wordpress were a decade old and a hopeless shambles. Having exported them all to a CSV file I imported them to Numbers (Excel for the Mac), added a few extra columns and manually re-tagged everything.

Painful, but worthwhile.

Rationalising the tags was one of the most time consuming parts of the migration, and something that was difficult to meaningfully automate. However, it was time well spent as finding related information is now much easier.

It's worth creating a 'miscellaneous' tag and/or a hidden #wtf tag for posts you might want to return to in the future.

It's worth remembering that the order of tags in Ghost is important. My primary tag was going to be used for 'read more like this' lists.

I also added tags for year and month (e.g. year-2022 or month-may) to help create an archive page searchable by date of publication; my site involves a seasonal hobby so I might write about similar topics every November that others want to read.
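Date tags like these are easy to derive from the publication date. The Python sketch below is one way to do it, assuming the date is an ISO-style string starting with YYYY-MM-DD and that full English month names are wanted ('may' is the same either way, so that choice is a guess).

```python
from datetime import datetime

MONTHS = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)


def date_tags(published_at: str) -> list:
    """Return e.g. ['year-2022', 'month-may'] for a publication date."""
    # Only the leading YYYY-MM-DD part is used; any time or timezone is ignored.
    published = datetime.strptime(published_at[:10], "%Y-%m-%d")
    return [f"year-{published.year}", f"month-{MONTHS[published.month - 1]}"]
```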

All the new tags need to be added to the appropriate posts and appended to the JSON file.

Butchering the JSON file

I know enough perl coding to be useful. It, like me, is old-fashioned and a little idiosyncratic. But it gets the job done. Eventually.

I wrote a perl script (making extensive use of HTML::Entities and JSON::Tiny modules) that took the Wordpress to Ghost JSON export as its input and did the following:

  • erased all current tags and categories from all posts and pages
  • repopulated every post with the new tags (remembering the importance of the primary tag for subsequent sorting) imported from the edited CSV file
  • searched every post/page for footnotes based on the presence of a shortcode; appended the footnote text to a block of HTML and re-inserted a footnote citation within {{brackets}} within the body text, finally inserting the complete list of footnotes at the end of the post/page text
  • based upon the length of the post I also inserted a block of HTML to provide a progress indicator and 'back to the top' link to posts of more than a couple of screens in length (some of my posts are 3,000 words long)
  • replaced all mentions of 'wordpress' in the image URLs with 'wp' (a trivial change, but satisfying)
  • purged some clearly aberrant HTML hidden in commented regions that increased the length of some posts by 2 or 3-fold ... I've no idea where this originated, it was in the exported JSON file from Wordpress so must have been there for years
  • a whole host of additional minor HTML tweaks, often involving stripping out references to font size which caused all sorts of display problems. Almost without exception these were due to vagaries of the Wordpress editor as I'd never deliberately changed font sizes (other than the H2 etc. headings)
  • some final sanity checking to ensure nothing was obviously wrong with the file
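Two of the steps above, re-tagging and renaming the image path, can be sketched in Python as follows. This is not the perl script, and it leans on assumptions about the export: that the data section holds "posts", "tags" and "posts_tags" lists, that posts carry "id", "slug" and a "mobiledoc" (or "html") string, and that "sort_order" records tag order. Check these against your own file before relying on any of it. The tag slug rule is a simplification.

```python
import json

OLD_PREFIX = "/content/images/wordpress/"
NEW_PREFIX = "/content/images/wp/"


def retag(data: dict, new_tags: dict) -> None:
    """Discard existing tags and rebuild them from new_tags (slug -> names).

    List order is kept via sort_order, so the first name in each list
    ends up as that post's primary tag.
    """
    tag_ids = {}
    data["tags"], data["posts_tags"] = [], []
    for post in data["posts"]:
        for order, name in enumerate(new_tags.get(post["slug"], [])):
            if name not in tag_ids:
                tag_ids[name] = len(tag_ids) + 1
                data["tags"].append({
                    "id": tag_ids[name],
                    "name": name,
                    "slug": name.lower().replace(" ", "-"),
                })
            data["posts_tags"].append(
                {"post_id": post["id"], "tag_id": tag_ids[name], "sort_order": order}
            )


def rename_image_prefix(data: dict, old: str = OLD_PREFIX, new: str = NEW_PREFIX) -> int:
    """Swap the image path prefix in every post; return the number of hits."""
    hits = 0
    for post in data["posts"]:
        for field in ("mobiledoc", "html", "feature_image"):
            value = post.get(field)
            if not isinstance(value, str) or not value:
                continue
            if field == "mobiledoc":
                # Re-serialise so escaped slashes (\/) become plain ones.
                value = json.dumps(json.loads(value), ensure_ascii=False)
            hits += value.count(old)
            post[field] = value.replace(old, new)
    return hits
```

Returning the hit count gives a cheap sanity check: it should be in the same ballpark as the number of images referenced in the posts.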

Many of the changes involved regular expressions of one form or another. I can recommend regex101.com as a way to tweak these so that they work as intended.

The resulting file could be imported to Ghost without a problem and most of the pages displayed correctly (though see my comment below on 'problems').

💡
A local installation of Ghost, including a complete copy of the entire image library, made troubleshooting the JSON file changes much easier. This was a game changer.

Upload and import everything

Having uploaded the modified JSON file to Ghost and all the separate zipped image files I did some sanity checking to see that internal and external links worked as expected and that the formatting was more or less as expected (this was just a 'belt and braces' repeat of the same thing with a local install of Ghost).

I had also pre-screened the subscriber CSV file downloaded from Wordpress, removing a few dozen obviously wrong email addresses (something like @wordpress.inactive).
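That pre-screening was done by eye, but a first pass could be automated along these lines. The Python sketch assumes a CSV with an "email" column; the blocked domain is the approximate example from the paragraph above, not a verified value, and the address check is deliberately crude.

```python
import csv
import io
import re

# Crude shape check: something@something.something, no spaces.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Assumption: placeholder domains to discard (approximate example only).
BLOCKED_DOMAINS = {"wordpress.inactive"}


def screen_subscribers(csv_text: str, email_column: str = "email") -> tuple:
    """Split rows into (kept, rejected); duplicates are rejected too."""
    kept, rejected, seen = [], [], set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        email = (row.get(email_column) or "").strip()
        key = email.lower()
        domain = key.rpartition("@")[2]
        if not EMAIL_SHAPE.match(key) or domain in BLOCKED_DOMAINS or key in seen:
            rejected.append(row)
            continue
        seen.add(key)
        kept.append({**row, email_column: email})
    return kept, rejected
```

Keeping the rejected rows, rather than silently dropping them, makes it easy to eyeball anything that was excluded by mistake.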

At this point I also realised that subscribers who had read my site via wordpress.com were not exported with those who had opted for the direct email. I'm pretty sure this was a change from the subscriber export function a year or so ago 😠.

Frustrating, but it was too late to do anything about it (and another of those opaque irritations that persuaded me to move from Wordpress in the first place). Again, the lesson here is to determine exactly what Wordpress/Jetpack allows access to, and to back up this information periodically just in case they arbitrarily remove your access in the future.

I uploaded all the subscriber email addresses and checked that a couple of dummy ones of mine worked as expected - sign-ins, comments etc.

Comments

I had decided quite early on to use a third-party commenting system (FastComments) as it allows import of legacy Wordpress comments. This required some theme modifications, but was relatively straightforward. Devon, the tech I dealt with at FastComments, helped a lot with some relatively minor yet still irritating issues.

I modified the theme (Dawn) to only display comments to signed in subscribers, added the FastComments code and then imported the ~8,000 legacy comments from Wordpress.

With one exception, commenting has worked well.

The exception relates to single sign on and the import of users from Wordpress to Ghost.

Wordpress only exports email addresses ... no other subscriber data is provided, at least not without paying for a Jetpack subscription (and even then I'm not certain). FastComments needs a username or first name and an email. Without a username the subscriber gets an error message about a Malformed SSO Request.

New subscribers to Ghost have to provide a username and email so this is a non-issue for them.

The temporary solution to this was an announcement banner on Ghost and a few frantic emails to enquiries from subscribers and to Devon at FastComments ... the longer term solution is to globally edit imported subscribers and add a 'generic' username to all of them, with a comment in future posts encouraging them to change it to something more personal. My analysis suggests only ~10-15% of my subscribers ever leave comments, so this is not a major issue (though it's never good to leave the screen littered with error messages).

Had I known about this problem in advance, I would have modified the subscriber CSV file prior to uploading it to Ghost.
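A pre-upload fix of that sort might look like the Python sketch below. It assumes the import CSV uses "email" and "name" column headings and that a placeholder such as "Reader" is acceptable; both are assumptions to check against the current Ghost member import template.

```python
import csv
import io


def add_placeholder_names(csv_text: str, placeholder: str = "Reader") -> str:
    """Return the CSV with every empty or missing name filled in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if "name" not in fieldnames:
        fieldnames.append("name")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep any real name; only blanks get the placeholder.
        row["name"] = (row.get("name") or "").strip() or placeholder
        writer.writerow(row)
    return out.getvalue()
```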

I've not investigated the native Ghost commenting system to know how it deals with 'missing' subscriber information.

Again ... look carefully at what you export from Wordpress.

Miscellaneous

I use a contact form from Letterbird to avoid leaving an email address in plain text on the site. Their technical support have been excellent and - whilst I could probably code something myself and/or use Google Forms - it's neater and easier, and one less thing to think about. I also investigated Formspree, but the Letterbird experience was altogether better in my view.

It works!

2-3 weeks ago the site went live. I made the necessary DNS changes to move my domain, adding all the stuff MagicPages needs for the Amazon SES mail backend (the instructions for which were commendably straightforward).

The first post went out, scheduled and on time {{1}}. I watched as the 'mail opened' count increased ... 17, 167, 890, 1974 ... opened a beer and relaxed.

My fears that I'd have thousands of emails bouncing back were unfounded. In fact, the bulk emailing was probably the most trouble-free part of the entire process.

I knew, or at least suspected, that some of the Wordpress subscribers were not active readers of my site. After all, some dated back to late-2013. In the first 2-3 weeks about 60-65% of the ~12,000 newsletters sent so far have been read, or at least opened.

I expect to cull totally inactive subscribers - or, more accurately, their email addresses - after a month or two. MagicPages email limits and costs are reasonable, but I'd prefer not to bombard the internet with unwanted emails.

I set up a noreply@domain to monitor bounced emails and was surprised how few there were (~1.5%) considering the imported subscriber list had been accumulated over a decade. Jannis at MagicPages also provided some more detailed delivery/bounce stats which were encouraging.

Page views on the website - I use Goatcounter, with the code injected into the page footer - are healthy. These will omit newsletter-only 'reads' as they require JavaScript.

I received some very positive comments from readers 😃.

I returned to the AWS site, archived the site, exported the MySQL database etc., downloaded everything and then pulled the virtual plug out of the back of the server (remembering to also cancel the static IP address that Amazon otherwise continues charging for ... been there, done that).

What didn't work?

I've mentioned the major issues above.

Some gallery plugins from Wordpress have caused a problem ... the thumbnail images display properly, but the linked full-size image is broken. Despite spending some time with the HTML there appeared no rhyme or reason for why some worked and others did not, and my regex-fu was not good enough to create an elegant solution.

I'd used these plugins on old (and so now little read) posts, so pragmatically just left them in knowing I was going to have to retrospectively make changes manually at some point.

The code to create the progress indicator/back to the top link leaves an ugly graphic in the footer of emails ... I need to look into this.

Other than that ... all good 😃.

Ongoing maintenance

I use Lychee to search for broken links on a subset of the site every night. Using the Ghost Content API (the read-only one) I pull down a complete list of posts and pages, choose about 10% of them and check for broken links. All this runs on a local cron job overnight and I get an email in the morning with a list of links that need checking. This works well.
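The shape of such a nightly job can be sketched in Python as below. This is a guess at the structure rather than the actual script: the function names are hypothetical, the site URL and Content API key are supplied by the caller, `limit=all` may be capped by newer Ghost versions, and lychee is simply called with the URLs as arguments and no extra flags. Emailing the report is left to cron.

```python
import json
import random
import subprocess
import urllib.parse
import urllib.request


def fetch_urls(site: str, content_key: str, resource: str = "posts") -> list:
    """List public URLs via the Ghost Content API (resource: posts or pages)."""
    query = urllib.parse.urlencode(
        {"key": content_key, "limit": "all", "fields": "url"}
    )
    endpoint = f"{site.rstrip('/')}/ghost/api/content/{resource}/?{query}"
    with urllib.request.urlopen(endpoint, timeout=30) as response:
        payload = json.load(response)
    return [item["url"] for item in payload[resource]]


def nightly_sample(urls: list, fraction: float = 0.10, rng=None) -> list:
    """Pick roughly `fraction` of the URLs at random (at least one)."""
    rng = rng or random.Random()
    unique = sorted(set(urls))
    if not unique:
        return []
    count = max(1, round(len(unique) * fraction))
    return rng.sample(unique, count)


def check_links(urls: list) -> subprocess.CompletedProcess:
    """Run lychee over the chosen pages and capture its report."""
    return subprocess.run(["lychee", *urls], capture_output=True, text=True)
```

A random 10% each night means a given page can go unchecked for a while; stepping through a fixed rotation instead would guarantee full coverage every ten nights or so.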

To do

I need to roll up my sleeves and look at making some changes to the Dawn template. I want to include pages in the list of featured posts and would like to order featured articles by something other than date.

I'm aware that the footnote popup on a mobile screen is partially covered by the 'notch' on the recent iPhone. I've had complaints. Well, one complaint. CSS is not my thing and this is not a priority ... my current advice is to buy a phone that adheres to standards.

The priority is now to largely ignore the 'backend' and instead concentrate on writing ... which was the reason for making the change in the first place.

Writing

I write almost entirely offline (or at least, not in the Ghost editor) using Obsidian and Markdown. Ideas, web-clippings, drafts, back-of-an-envelope sketches and outlines all go into Obsidian ... and periodically 3,000 words/week appear out the other end.

There is an Obsidian to Ghost plugin that uses the Ghost Admin API to upload new posts. These are uploaded natively, not as Markdown, so it's then easy to use the Ghost editor to make the final tweaks.

The Obsidian to Ghost upload works well, with a couple of caveats:

  • embedded images are not uploaded. I use the recently-introduced TK reminders in Ghost to mark where images are to be added
  • I've not managed to get the feature image to work (perhaps for the same reason)
  • excerpts added into the YAML header are also not transferred on upload

However, importantly, tagging and draft/published options do work. For safety, everything is uploaded as a draft.

Obsidian is extendable with templates and plugins and so I use a number of them to enhance my writing environment or to post-process the text before uploading to Ghost.

For example, to add a footnote I simply wrap the text in squiggly brackets and give it a unique number {{54 Here is a footnote 🦶🎶}} and my post-processing script - run using the Obsidian-Shell Commands plugin - makes the necessary changes for it to appear as a proper footnote {{54}} once uploaded to Ghost {{99}}.
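A minimal Python sketch of that post-processing step is shown below. It is a reconstruction from the description above, not the actual script; the function name is hypothetical. It keeps each footnote's own number, leaves only the citation in the body and appends the footnote text at the end of the document.

```python
import re

# Inline form: {{54 Footnote text}}. A bare citation like {{54}} has no text
# after the number, so it does not match and is left alone.
INLINE_NOTE = re.compile(r"\{\{(\d+)\s+(.+?)\}\}", re.DOTALL)


def extract_footnotes(markdown: str) -> str:
    """Move inline footnote text to a list at the end of the document."""
    notes = []

    def cite(match):
        number, text = match.group(1), match.group(2).strip()
        notes.append((number, text))
        return "{{%s}}" % number

    body = INLINE_NOTE.sub(cite, markdown)
    if not notes:
        return markdown
    listing = "\n\n".join("{{%s}}: %s" % note for note in notes)
    return body.rstrip("\n") + "\n\n" + listing + "\n"
```

Footnotes are listed in order of appearance; any renumbering for display is left to whatever renders them.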

Until Ghost gets native footnotes it's the easiest way - with littlenote.js code-injection - to add footnotes to posts. They appear in a logical place (at the end of the text) for newsletter readers and are all automagically renumbered properly.

And finally

I'm pleased I've switched away from Wordpress and am enjoying writing with Ghost without worrying about the firewall or a corrupted cache or yet another dodgy plugin.

I'm particularly pleased that the migration went reasonably smoothly and I can now forget perl regexes for a bit and focus on some writing instead.


{{1}}: No infuriating meddling with cron.php needed again 😃.

{{56}}: Yes, go figure!

{{54}}: Here is a footnote 🦶🎶

{{77}}: All sorts of excellent advice on this site ... take it, and buy a coffee in return ☕.

{{99}}: It might be better to use proper Markdown footnotes and then convert those to the squiggly brackets ... however, writing them in Markdown is more work, and if I really ever need pure Markdown I'll just write a filter to correct them.

{{63}}: Note that the official guide suggests that this needs to be edited in the JSON file, but the Export to Ghost plugin already does this for you.