Just what is an e-book anyway? Part two
Check out Part one of Just what is an e-book anyway?
Now that we have looked at how an e-book functions, let’s look at what’s under the hood of a typical e-book file.
There are many different types of e-book file formats. Here are some:
- EPUB, EPUB3 – the open-standard, de facto file format and its newer sibling.
- MOBI, AZW, KF8, PRC – proprietary formats owned by Amazon, similar to EPUB.
- IBA – proprietary format owned by Apple, similar to EPUB.
- BBeB – proprietary format owned by Sony and Canon, similar to EPUB. BBeB files can have LRS and LRF or LRX file extensions.
- PDF, TXT, RTF and others – static file formats that can be read on various e-book readers and computers.
At the highest level, an e-book file is an archive, much like a ZIP file. An archive bundles multiple files into a single file and usually compresses them to save space. An EPUB is in fact a ZIP archive with a different file extension, and formats like MOBI similarly package a book's many parts into one file.
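An EPUB in particular is an ordinary ZIP archive underneath, so Python's standard zipfile module can open one. The sketch below is only an illustration under stated assumptions: it builds a tiny stand-in archive in memory (the names OEBPS/chapter1.xhtml and OEBPS/style.css are made up, and the result is not a complete, valid book) and then lists what is packed inside.

```python
import io
import zipfile


def build_sample_epub() -> bytes:
    """Build a tiny EPUB-like archive in memory (illustrative, not a valid book)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # EPUB expects this entry first and stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # Hypothetical content files, compressed like any other ZIP entry.
        archive.writestr("OEBPS/chapter1.xhtml",
                         "<html><body><p>Hello</p></body></html>",
                         compress_type=zipfile.ZIP_DEFLATED)
        archive.writestr("OEBPS/style.css", "p { font-family: serif; }",
                         compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


def list_entries(epub_bytes: bytes) -> list[str]:
    """Return the names of the files inside the archive, in stored order."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


entries = list_entries(build_sample_epub())
print(entries)
```

To try this on a real book, pass the path of an .epub file to zipfile.ZipFile instead of the in-memory buffer.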
Let’s break one apart and look at what’s inside. In this example, I’m using a test EPUB file that I created specifically for this blog post. Here’s what we get:
What’s all this? Remember that an EPUB file (as well as MOBI, AZW, PRC and others) is like a mini website, written in XHTML and CSS.
XHTML (or Extensible HyperText Markup Language) is a stricter, XML-based variation of HTML (or HyperText Markup Language), the main markup language for creating web pages. Each section of the book is kind of like its own web page.
CSS (or Cascading Style Sheets) is the code that describes and controls the look and formatting of the web page and its elements (like how the fonts look).
You’ll also notice some other important files for the EPUB format:
- content.opf – this file contains all the e-book’s metadata, the file manifest and the linear reading order. In other words, the file contains all the descriptive data for the book, lists all the parts of the book and directs the order of the parts.
- toc.ncx – this file describes and controls the hierarchical, navigational table of contents: the one that runs the navigation on the device. For example, the list of chapters that appears on the left-hand side in Adobe Digital Editions.
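- container.xml – this file, stored in the META-INF folder, tells the reading software where to find the content.opf file, which in turn defines the contents of the book.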
- mimetype – this file, which needs to be the first file in the archive and stored uncompressed and unencrypted for the e-book to work, basically tells other programs and devices that this is an EPUB file. It contains a single line of text: application/epub+zip.
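To make the roles of these files more concrete, here is a rough Python sketch of how reading software might walk through them: check mimetype, follow container.xml to the OPF file, then read the title, manifest and reading order. The XML below is a trimmed-down stand-in written for this example (the title, file names and ids are made up), not the contents of the test file above, and real reading systems do considerably more validation.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical, minimal stand-ins for the real files.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

CONTENT_OPF = """<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Test Book</dc:title>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_sample_epub() -> bytes:
    """Pack the stand-in files into an in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", CONTENT_OPF)
    return buffer.getvalue()


def describe_epub(epub_bytes: bytes) -> dict:
    """Follow mimetype -> container.xml -> content.opf and summarize the book."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        mimetype = archive.read("mimetype").decode("ascii").strip()
        # container.xml points at the OPF file.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        package = ET.fromstring(archive.read(opf_path))

    title = package.findtext("opf:metadata/dc:title", namespaces=NS)
    # The manifest maps each id to a file path relative to the OPF file.
    manifest = {item.get("id"): item.get("href")
                for item in package.findall("opf:manifest/opf:item", NS)}
    # The spine lists ids in linear reading order.
    opf_dir = posixpath.dirname(opf_path)
    reading_order = [posixpath.join(opf_dir, manifest[ref.get("idref")])
                     for ref in package.findall("opf:spine/opf:itemref", NS)]
    return {"mimetype": mimetype, "opf_path": opf_path,
            "title": title, "reading_order": reading_order}


info = describe_epub(build_sample_epub())
print(info["title"])
print(info["reading_order"])
```

Note that the spine only refers to ids. The actual file paths come from the manifest, which is why the sketch builds that lookup first.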
So, those are the parts of an EPUB file. MOBI and other proprietary formats have similar structures and use XHTML and CSS as well.
For those who are curious, I made the test EPUB file by creating a book file (and all supporting chapter files) in InDesign CS4 (Mac) and exporting the book to Adobe Digital Editions. To break apart the EPUB file, I used EPUB Unzip 1.0, a free application for Mac. For more info on how to break apart your own EPUB files, see this great article from Anne-Marie Concepcion at InDesign Secrets.
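If you don't have a dedicated unzip tool handy, a few lines of Python can do a similar job. This is a minimal sketch, not what EPUB Unzip does internally: the function name unpack_epub is my own, and the sample.epub it unpacks is a throwaway stand-in created on the spot in a temporary folder.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_epub(epub_path: Path, target_dir: Path) -> list[str]:
    """Extract every entry of the EPUB into target_dir and list what landed there."""
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(target_dir)
    return sorted(path.relative_to(target_dir).as_posix()
                  for path in target_dir.rglob("*") if path.is_file())


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    sample = workdir / "sample.epub"
    # Throwaway stand-in file; point unpack_epub at a real .epub instead.
    with zipfile.ZipFile(sample, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/chapter1.xhtml", "<html/>")
    extracted = unpack_epub(sample, workdir / "unpacked")
    print(extracted)
```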
In the final installment, we’ll explore how all this affects your approach to designing an e-book.