How are Ebook Files Made and Read?

1. The .epub File is a .zip File

Yes, that's right. A file with the .epub extension, the most common ebook format, is essentially a .zip file: an archive that bundles multiple files. Naming it .epub tells the computer, whenever a user opens the file, "Hey, I'm an ebook file, please open me with an app that can read ebook files."
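One quick way to confirm this claim is to ask a ZIP library whether it recognizes the file. The following is a minimal sketch using Python's standard zipfile module; the function name looks_like_zip and the file name book.epub are placeholders chosen for illustration.

```python
import zipfile


def looks_like_zip(file):
    """Return True if `file` (a path or a binary file object) is a ZIP archive.

    The check looks at the file's contents, not its extension, so a
    well-formed .epub should pass just like a .zip does.
    """
    return zipfile.is_zipfile(file)
```

Calling looks_like_zip("book.epub") on a real ebook should return True, even though the extension is not .zip.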

2. What's Inside the .epub File

Since an .epub is just an archive file, if you change its extension to .zip and unzip it, you'll see various files inside. (On a Mac, you can open the terminal in the file's folder and type unzip [ebook name].epub -d [folder name to save] to decompress it.) If you unzip any .epub file using this method, the contents will generally be structured as follows:
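The same inspection can be done without renaming anything. This is a small sketch with Python's standard zipfile module; list_epub_entries and extract_epub are hypothetical helper names, and the paths you pass in are your own.

```python
import zipfile


def list_epub_entries(epub_file):
    """Return the names of all files stored inside the .epub, in archive order."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


def extract_epub(epub_file, target_dir):
    """Unpack every file in the .epub into `target_dir` (like `unzip -d`)."""
    with zipfile.ZipFile(epub_file) as archive:
        archive.extractall(target_dir)
```

For example, list_epub_entries("book.epub") would return a list of entries such as mimetype and META-INF/container.xml, followed by the content files.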

a) mimetype

This is just a text file that contains the following line:

application/epub+zip

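It acts as an ID to confirm, "Yes, I am an .epub file." This file must never be modified. It also has to be the first file in the archive and must be stored without compression, so that a reader app can identify the format by looking at the very beginning of the file.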
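When a file refuses to open, the mimetype entry is a good first thing to check. The sketch below assumes three rules: the entry comes first in the archive, it is stored uncompressed, and it contains application/epub+zip. The function name check_mimetype is a placeholder, and the check is deliberately lenient about surrounding whitespace, which a strict validator may not be.

```python
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype(epub_file):
    """Return a list of problems with the mimetype entry (empty if none found)."""
    problems = []
    with zipfile.ZipFile(epub_file) as archive:
        entries = archive.infolist()
        # The mimetype file is expected to be the very first archive entry.
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry in the archive")
            return problems
        first = entries[0]
        # ZIP_STORED means the bytes were written without compression.
        if first.compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed; it should be stored as-is")
        # Lenient comparison: surrounding whitespace is ignored here.
        if archive.read(first).strip() != EPUB_MIMETYPE:
            problems.append("mimetype does not contain application/epub+zip")
    return problems
```

An empty list means the entry passed these three checks; it does not mean the whole ebook is valid.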

b) META-INF

This folder holds container.xml, which tells the ebook reader where content.opf is located. The folder name "META-INF" and the file name "container.xml" should not be arbitrarily changed either. They are essential for the ebook reader app to correctly read the book's information.
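To see how a reader app might follow this pointer, here is a minimal sketch that reads META-INF/container.xml and returns the path recorded in its rootfile element. The function name find_opf_path is hypothetical, and the sketch only looks at the first rootfile it finds.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by container.xml.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def find_opf_path(epub_file):
    """Return the archive path of the package file (e.g. content.opf)."""
    with zipfile.ZipFile(epub_file) as archive:
        root = ET.fromstring(archive.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no <rootfile> element")
    # The full-path attribute is relative to the root of the archive.
    return rootfile.get("full-path")
```

For a book that keeps its content in a folder named OEBPS (a common convention, not a requirement), this would return something like OEBPS/content.opf.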

c) Folder containing the content

(The structure can be freely organized, but the location of content.opf must be properly recorded in container.xml, and every other file must in turn be listed in content.opf.)

  • content.opf: Detailed information about the ebook file. Written in XML, it serves as the ebook's blueprint (including the title, author, language, unique identifier, list of contained files, file locations, and the reading order of the body text).
  • Table of Contents (toc.ncx in EPUB 2, typically nav.xhtml in EPUB 3): Contains the chapter titles and the links that let the user jump to a chapter by clicking its title.
  • Content (chapter.xhtml): The ebook's body text, marked up with XHTML tags.
    Example: <p> He said <b>this</b>. </p>
  • Style (style.css): These are the rules that determine how each tag will be displayed (color, size, etc.).
    Example: "Display the <p> tag at 11px size, in gray!"
  • images (cover_image.jpg): The cover image of the ebook file, recorded in content.opf as "The image with this name in this folder is the ebook file's cover."
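The role of content.opf as a blueprint is easier to see with a concrete sample. The sketch below parses a made-up package document and pulls out the title, the file list (manifest), and the reading order (spine). SAMPLE_OPF, the identifier value, and the function name summarize_opf are all invented for illustration; a real content.opf usually carries more metadata.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A made-up, stripped-down package document for illustration only.
SAMPLE_OPF = """\
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="style.css" media-type="text/css"/>
    <item id="cover" href="images/cover_image.jpg" media-type="image/jpeg" properties="cover-image"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""


def summarize_opf(opf_xml):
    """Extract the title, the manifest (id -> href), and the reading order."""
    root = ET.fromstring(opf_xml)
    title = root.findtext("opf:metadata/dc:title", namespaces=NS)
    # The manifest lists every file in the book and gives each one an id.
    files = {item.get("id"): item.get("href")
             for item in root.findall("opf:manifest/opf:item", NS)}
    # The spine refers to manifest ids, in the order the reader should show them.
    reading_order = [files.get(ref.get("idref"))
                     for ref in root.findall("opf:spine/opf:itemref", NS)]
    return {"title": title, "files": files, "reading_order": reading_order}


summary = summarize_opf(SAMPLE_OPF)
```

Note how the spine never names a file directly: it points at manifest ids, which is why a missing or misspelled id is a common cause of chapters not showing up.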

3. How are Ebook Files Read?

The content so far is simpler than you might have expected, isn't it? An "ebook file" can essentially be viewed as a "mini website." Ebook files rely almost entirely on the same technology used to build websites: XHTML for content, CSS for styling, and ordinary image files. If an ebook file is a website, then "Ebook Reader Apps" act as browsers like Chrome, Safari, or Edge. The difference is that they work primarily offline. When you "download an ebook file," you need the internet to get the file from the bookstore's server. However, once the file is stored on your device, the ebook reader app installed there can interpret the file's content and display it on the screen without an internet connection.

In practice, unless you want to build these files directly in code, you can easily obtain an ebook file by writing your content and exporting it with a tool like Adobe InDesign or one of the many other ebook editors. However, knowing an ebook file's structure is useful if you want to do something the standard tools don't support, or if you need to track down manually where the problem in a broken file lies.
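For those who do want to build a file directly in code, the packaging step can be sketched as follows. It assumes the content lives in a folder named OEBPS (a common convention, not a requirement) and that you supply your own content.opf, chapters, and styles; package_epub is a hypothetical helper, and the result is only a valid ebook if the files you pass in are themselves valid.

```python
import zipfile

# container.xml pointing at the package file; the OEBPS path is an assumption.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_epub(output, content_files):
    """Write an .epub archive to `output` (a path or binary file object).

    `content_files` maps archive paths such as "OEBPS/content.opf"
    to their contents (str or bytes).
    """
    with zipfile.ZipFile(output, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # mimetype goes in first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        for name, data in content_files.items():
            archive.writestr(name, data)
```

The order matters here: writing mimetype before anything else is what lets a reader app recognize the format from the start of the file.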

This concludes the explanation of how ebook files are made and how they are read.


© Dong-sun Han | Bug Loop