Hack Day 2017: Publishing website pages to an e-book

Caplin Hack Day 2017 was held at the end of November. The theme was ‘Let’s get physical’ — every project had to have a hardware component. For my project this year, I set myself the challenge of exporting the online documentation for Caplin Liberator to the offline world of an e-reader: the Amazon Kindle.

Project background

In Caplin Dev Week last summer, I exported our developer website to Markdown files for the static-site generator Jekyll. Although Markdown was a simple entry point to static-site generation, its limited feature set could not support the formatting requirements of our content. So, since Dev Week, I’ve been working on exporting our developer content to a more powerful markup language supported by Jekyll: AsciiDoc.

One of the attractions of AsciiDoc is that it offers a range of output formats. The AsciiDoc plugin for Jekyll is part of the Asciidoctor suite of libraries, which includes tools to compile AsciiDoc to PDF (Asciidoctor PDF) and to e-book formats EPUB3 and Amazon’s KF8 (Asciidoctor EPUB3).

The process of writing content once and publishing it to multiple formats is termed single-source publishing. It’s something I’ve always wanted to try with Caplin documentation, but until recently our content has been too loosely structured to allow this. Hack Day was an opportunity to find out if the AsciiDoc source files for Caplin’s future static site could also be used to generate an e-book.

Hack Day

I successfully converted our website documentation for Caplin Liberator to a Kindle e-book, and I was overjoyed when my project achieved joint third place.

Some parts of the conversion were straightforward, and other parts required workarounds. If you’re thinking about moving your website to AsciiDoc and single-source publishing is one of your goals, then read on for the details of how I did it. I hope you find something here that helps you.

How to combine a website’s AsciiDoc files into an e-book

The instructions in this section should be read in conjunction with the main documentation for Asciidoctor EPUB3.

Requirements

To compile an AsciiDoc document to Amazon’s KF8 e-book format, you require the following software:

  • A standalone Ruby environment that allows you to manage gems without requiring administrator privileges. If you don’t have your own Ruby environment, I recommend using rbenv to install Ruby under your home directory. I installed Ruby 2.4.1.
  • The Asciidoctor and Asciidoctor EPUB3 Ruby gems. For installation instructions, see Installing the Asciidoctor Ruby Gem and Asciidoctor EPUB3: Getting started.
  • Amazon’s KindleGen utility. Download the KindleGen archive for your operating system and extract the kindlegen executable to a directory of your choice (for example, ~/bin). The 32-bit Linux executable works on Ubuntu 14.04 64-bit.

Preparing the AsciiDoc files

AsciiDoc files intended for a website require several changes before they can be compiled into an e-book. Some of these changes are always required when AsciiDoc documents are combined into a larger document, and other changes are required specifically for Asciidoctor EPUB3.

There are enough pre-compilation tasks that I recommend writing a script.

Directory structure

Create the project’s directory structure:

  1. Create a build directory (for example, ~/src/website-to-kindle/build)
  2. Copy the AsciiDoc files for the book into the root of the build directory. Do not create subdirectories; all the book’s AsciiDoc files must reside in the root of the build directory.
  3. In the root of the build directory, create an ‘images’ subdirectory. Copy your website images and the book’s front cover to this subdirectory. The name and location of the subdirectory are important: Asciidoctor EPUB3 will fail to locate your book’s front cover image if it is located anywhere else.
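The three steps above could be scripted along the following lines. This is a minimal sketch rather than the Hack Day script itself: the function name prepare_build_dir and its parameters are hypothetical, it assumes a fresh build directory located outside the source tree, and it assumes that no two source files share a file name once the directory hierarchy is flattened.

```python
import shutil
from pathlib import Path


def prepare_build_dir(adoc_src, images_src, cover_image, build_dir):
    """Flatten .adoc files into build_dir and gather images in build_dir/images."""
    build_dir = Path(build_dir)
    images_dir = build_dir / "images"  # the name 'images' matters to Asciidoctor EPUB3
    images_dir.mkdir(parents=True, exist_ok=True)

    # Step 2: copy every .adoc file into the root of the build directory.
    for adoc in sorted(Path(adoc_src).rglob("*.adoc")):
        target = build_dir / adoc.name
        if target.exists():
            # Flattening would silently overwrite a chapter, so stop instead.
            raise FileExistsError(f"{adoc.name} already exists in {build_dir}")
        shutil.copy2(adoc, target)

    # Step 3: copy the website images, then the front cover (Python 3.8+).
    shutil.copytree(images_src, images_dir, dirs_exist_ok=True)
    shutil.copy2(cover_image, images_dir / Path(cover_image).name)
    return build_dir
```

Raising an error on a name clash is a deliberate choice: a chapter that silently replaces another is much harder to spot in the finished book than a failed build.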

Validation

Check for duplicate IDs in each AsciiDoc file. If an ID is duplicated within a file, throw an error and exit.

IDs must be unique within an AsciiDoc document, whether the document comprises one file or many (as is the case with a book). We can prevent duplication of IDs between chapters by manually prefixing the IDs in each chapter with a unique namespace, but duplication of IDs within a chapter is a fault and must be fixed in the original AsciiDoc file.
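A duplicate-ID check for a single file might look like the sketch below. The function name duplicate_ids is hypothetical, and the regular expression is an assumption about how IDs are written: it recognises only explicit IDs in the [[id]] and [#id] forms, it does not see IDs that AsciiDoc generates automatically from headings, and it does not skip text inside code listings.

```python
import re
from collections import Counter

# Explicit IDs only: [[id]] or [[id,reftext]], and the shorthand [#id] or [#id.role].
ID_PATTERN = re.compile(
    r"\[\[([A-Za-z_:][\w:.-]*)(?:,[^\]]*)?\]\]"
    r"|\[#([A-Za-z_:][\w:-]*)[^\]]*\]"
)


def duplicate_ids(text):
    """Return a sorted list of explicit IDs that occur more than once in text."""
    ids = [block_id or shorthand_id for block_id, shorthand_id in ID_PATTERN.findall(text)]
    return sorted(name for name, count in Counter(ids).items() if count > 1)
```

A build script could call duplicate_ids on the contents of each chapter file and exit with an error message if the returned list is not empty.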

Spine file

The spine file (the ‘master’ document) sets AsciiDoc attributes and includes the chapter files in your e-book. Create the file in the root of your build directory. For more information, see Declaring the spine in the Asciidoctor EPUB3 documentation.

Include chapters in your spine file in the order they are listed in your website’s menu. If, for example, you use Jekyll to generate your website, then your script can retrieve the chapter order from the menu’s YAML data file.

IDs must be unique in the e-book. It’s unlikely that your website’s pages will have been written with the understanding that the pages might later be combined into a book, so commonly used subheadings are likely to have been assigned identical IDs (‘overview’, ‘getting-started’, ‘system-requirements’, …). To avoid ID duplication, we will prefix all IDs with a string unique to their chapter. I used ‘chapterN_’, where N was the chapter number. Generate the prefixes in advance and store them in a hash map keyed by the chapter’s filename minus its extension.
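Generating the prefixes takes only a few lines. In this sketch, the function name chapter_prefixes is hypothetical, and the list of chapter files is assumed to be already sorted into menu order.

```python
from pathlib import Path


def chapter_prefixes(chapter_files):
    """Map each chapter's file name (minus extension) to a prefix such as 'chapter1_'."""
    return {
        Path(filename).stem: f"chapter{number}_"
        for number, filename in enumerate(chapter_files, start=1)
    }
```

For example, chapter_prefixes(["index.adoc", "liberator-installing-liberator.adoc"]) maps "index" to "chapter1_" and "liberator-installing-liberator" to "chapter2_".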

AsciiDoc automatically generates heading IDs if you don’t specify your own. Before the instruction to include a chapter in the spine file, set AsciiDoc’s ‘idprefix’ attribute to the prefix you generated for the chapter. This ensures that automatically generated IDs use the same prefixing convention that you will apply, in the next section, to all explicitly defined IDs.

The listing below is an excerpt from the spine file generated by my Hack Day script. It shows the AsciiDoc header and the first few chapters.

= Liberator
Caplin Systems Ltd. 2017-12-01
:updated: 2017-12-01
:doctype: book
:producer: Asciidoctor
:imagesdir: images
ifndef::ebook-format[:leveloffset: 1]
:lang: en
:copyright: Caplin Systems Ltd.
:front-cover-image: image:caplin-liberator-cover.png[Front Cover,1050,1600]

:idprefix: chapter1_
include::index.adoc[]

:idprefix: chapter2_
include::liberator-liberator-7-release-highlights.adoc[]

:idprefix: chapter3_
include::liberator-installing-liberator.adoc[] 

// ... and so on
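A spine file with this shape could be generated by a function like the one sketched below. The function name build_spine and its parameters are hypothetical, and the sketch handles only simple name/value header attributes; a conditional line such as the ifndef directive in the listing above would need to be added separately.

```python
def build_spine(title, header_attributes, chapter_files):
    """Return spine-file text: a header, then an idprefix and include per chapter."""
    lines = [f"= {title}"]
    # Header attributes must follow the title without any blank lines.
    lines += [f":{name}: {value}" for name, value in header_attributes.items()]
    for number, filename in enumerate(chapter_files, start=1):
        # The blank line ends the header (first chapter) or separates chapters.
        lines += ["", f":idprefix: chapter{number}_", f"include::{filename}[]"]
    return "\n".join(lines) + "\n"
```

Setting idprefix immediately before each include keeps the automatically generated heading IDs in step with the per-chapter prefixes.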

Chapter files

Make the following alterations to your chapter files:

  1. Add an ID to the chapter’s level 0 heading. The ID must match the chapter’s file name (minus the .adoc extension). For more details, see Structuring your manuscript in the Asciidoctor EPUB3 documentation.
  2. Prefix IDs in each chapter with the unique string you generated for that chapter. For example, in the first chapter of your book, rewrite the ID [[overview]] as [[chapter1_overview]].
  3. Prefix IDs referenced in internal cross-references with the unique string you generated for that chapter. For example, in the first chapter of your book, rewrite xref:overview[Overview] as xref:chapter1_overview[Overview].
  4. Rewrite any URLs to chapters as: link:<chapter-id>.xhtml[<text>]. For example, rewrite link:/developer/platform/liberator/installing-liberator.html[Installing Liberator] as link:installing-liberator.xhtml[Installing Liberator].
  5. Rewrite any URLs to chapter anchors (deep-links) as: link:<chapter-id>.xhtml#<chapter-prefix><anchor-id>[<text>]. For example, rewrite link:/developer/platform/liberator/installing-liberator.html#system-requirements[Liberator’s system requirements] as link:installing-liberator.xhtml#chapter3_system-requirements[Liberator’s system requirements].
  6. [Optional] Remove all URLs to locations external to your book. E-readers do not traditionally have active Internet connections.

Note: Steps 4 and 5 rewrite URLs as URLs to the XHTML file in the e-book. Technically, these URLs should have been rewritten as inter-document cross-references, but in practice I could not get inter-document cross-references to work in Asciidoctor EPUB3 1.5.0 Alpha 7. Ultimately, inter-document cross-references are rendered by Asciidoctor EPUB3 as hyperlinks to the XHTML files of the book; this workaround creates those links manually. See Asciidoctor EPUB3: Issue 27 on GitHub.
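Steps 1 to 5 can be approximated with regular expressions, as in the sketch below. All names here (rewrite_chapter, chapters_by_url) are hypothetical, and the patterns are assumptions about how the source files are written: IDs use the plain [[id]] form, internal cross-references use the xref: macro rather than the << >> form, the level 0 heading has no ID already, and chapters_by_url maps each website URL to a pair of (chapter ID, chapter prefix). Step 6 is not covered.

```python
import re

ANCHOR = re.compile(r"\[\[([A-Za-z_][\w.-]*)\]\]")
XREF = re.compile(r"xref:([A-Za-z_][\w-]*)\[")
LINK = re.compile(r"link:([^\[\s#]+)(?:#([^\[\s]+))?\[")


def rewrite_chapter(text, chapter_id, prefix, chapters_by_url):
    """Apply steps 1 to 5 to the text of one chapter file."""
    # Steps 2 and 3: prefix explicit IDs and the xrefs that point at them.
    text = ANCHOR.sub(lambda m: f"[[{prefix}{m.group(1)}]]", text)
    text = XREF.sub(lambda m: f"xref:{prefix}{m.group(1)}[", text)

    # Steps 4 and 5: point website URLs at the chapter's XHTML file instead.
    def relink(match):
        url, anchor = match.group(1), match.group(2)
        if url not in chapters_by_url:
            return match.group(0)  # external or unknown link: leave untouched
        target_id, target_prefix = chapters_by_url[url]
        fragment = f"#{target_prefix}{anchor}" if anchor else ""
        return f"link:{target_id}.xhtml{fragment}["

    text = LINK.sub(relink, text)

    # Step 1: add the chapter ID above the level 0 heading. This runs last
    # so that the chapter ID itself is not prefixed.
    return re.sub(
        r"^(= .*)$",
        lambda m: f"[[{chapter_id}]]\n{m.group(1)}",
        text,
        count=1,
        flags=re.MULTILINE,
    )
```

Regular expressions cannot tell markup from the contents of a code listing, so it is worth checking the rewritten files for false matches before compiling.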

Compiling the book

To compile your book to a Kindle e-book, run the command below from the root of your build directory:

KINDLEGEN=<path-to-kindlegen> asciidoctor-epub3 -D output -a ebook-format=kf8 <spine-filename>

For example:

KINDLEGEN=/home/johnsmith/bin/kindlegen asciidoctor-epub3 -D output -a ebook-format=kf8 caplin-liberator.adoc

Note: The KINDLEGEN environment variable is required. Asciidoctor EPUB3 does not look for the kindlegen executable in your PATH.
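If the pre-compilation tasks are scripted, the compile step can be run from the same script. The sketch below only wraps the command shown above; the function names kf8_command and compile_kf8 are hypothetical, and it assumes that asciidoctor-epub3 is on the PATH of the process that runs it.

```python
import os
import subprocess


def kf8_command(spine_filename, output_dir="output"):
    """Build the argument list for the asciidoctor-epub3 command shown above."""
    return ["asciidoctor-epub3", "-D", output_dir, "-a", "ebook-format=kf8", spine_filename]


def compile_kf8(build_dir, spine_filename, kindlegen_path):
    """Run the compile command from the build directory with KINDLEGEN set."""
    env = dict(os.environ, KINDLEGEN=kindlegen_path)
    # check=True raises CalledProcessError if the compile command fails.
    subprocess.run(kf8_command(spine_filename), cwd=build_dir, env=env, check=True)
```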

Known issues

One known issue that affected my book is that Asciidoctor EPUB3 does not yet support inline images. Inline icons work as expected.

Next steps

In a future project, I want to investigate how much freedom there is to customise the book’s default theme.
