History
EPUB 2.0, a successor to the Open eBook Publication Structure, was approved in October 2007, with a maintenance update (2.0.1) approved in September 2010. The EPUB 3.0 specification became effective in October 2011 and was superseded by a minor maintenance update (3.0.1) in June 2014. Major new features in EPUB 3 include support for precise layout or specialized formatting (Fixed Layout Documents), such as for comic books, and MathML support. EPUB 3.2 became effective on May 8, 2019. In that version, the text of the specification was reorganized and cleaned up, and the format gained support for remotely hosted resources and new font formats (WOFF 2.0 and SFNT).
Version 2.0.1
EPUB 2.0 was approved in October 2007. A maintenance update (2.0.1), intended to clarify and correct errata in the specifications, was approved in September 2010. EPUB version 2.0.1 consists of three specifications:
* ''Open Publication Structure'' (OPS) 2.0.1, which contains the formatting of its content.
* ''Open Packaging Format'' (OPF) 2.0.1, which describes the structure of the .epub file in XML.
* ''Open Container Format'' (OCF) 2.0.1, which collects all files as a ZIP archive.
EPUB internally uses XHTML or DTBook to represent its content.
Open Publication Structure 2.0.1
As of version 2.0.1, an EPUB file uses XHTML 1.1 (or DTBook) to construct the content of a book. This differs from previous versions (OEBPS 1.2 and earlier), which used a subset of XHTML. There are, however, a few restrictions on certain elements. The mimetype for XHTML documents in EPUB is application/xhtml+xml.
Styling and layout are performed using a subset of CSS 2.0, referred to as ''OPS Style Sheets''. This specialized syntax requires that reading systems support only a portion of CSS properties and adds a few custom properties. Custom properties include oeb-page-head, oeb-page-foot, and oeb-column-number. Font embedding can be accomplished by using the @font-face rule and by including the font file in the OPF's manifest (see below). The mimetype for CSS documents in EPUB is text/css.
EPUB also requires that PNG, JPEG, GIF, and SVG images be supported, using the mimetypes image/png, image/jpeg, image/gif, and image/svg+xml. Other media types are allowed, but creators must include alternative renditions using supported types. For a table of all required mimetypes, see the specification.
Open Packaging Format 2.0.1
The OPF specification's purpose is to "[define] the mechanism by which the various components of an OPS publication are tied together and [provide] additional structure and semantics to the electronic publication". This is accomplished by two XML files with the extensions .opf and .ncx.
.opf file
The OPF file, traditionally named content.opf, houses the EPUB book's metadata, file manifest, and linear reading order. This file has a root element package and four child elements: metadata, manifest, spine, and guide. Furthermore, the package node must have the unique-identifier attribute. The .opf file's mimetype is application/oebps-package+xml.
The metadata element contains all the metadata information for a particular EPUB file. Three metadata tags are required (though many more are available): title, language, and identifier. title contains the title of the book, language contains the language of the book's contents in RFC 3066 format ''or'' its successors, such as the newer RFC 4646, and identifier contains a unique identifier for the book. The id attribute of identifier should equal the unique-identifier attribute from the package element.
The manifest element lists all the files contained in the package. Each file is represented by an item element, which has the attributes id, href, and media-type. All XHTML (content documents), stylesheets, images or other media, embedded fonts, and the NCX file should be listed here. Only the .opf file itself, the container.xml file, and the mimetype file should not be included.
The spine element lists all the XHTML content documents in their linear reading order. Any content document that can be reached through linking or the table of contents must be listed as well. The toc attribute of spine must contain the id of the NCX file listed in the manifest. Each itemref element's idref is set to the id of its respective content document.
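To illustrate how the spine refers back to the manifest, the following Python sketch resolves each itemref to a file path. The sample package document, its ids and file names, and the helper name reading_order are assumptions made for illustration; the namespace URI is the one commonly used for OPF 2.0 package documents.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used by OPF 2.0 package documents (assumed here).
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical package document; ids and file names are illustrative only.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chapter2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chapter1"/>
    <itemref idref="chapter2"/>
  </spine>
</package>
"""


def reading_order(opf_text):
    """Return (NCX href, list of content hrefs in spine order)."""
    root = ET.fromstring(opf_text)
    # Map every manifest id to the file it names.
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    spine = root.find("opf:spine", OPF_NS)
    # The spine's toc attribute holds the manifest id of the NCX file.
    ncx_href = hrefs[spine.get("toc")]
    # Each itemref's idref points at a manifest id; spine order is reading order.
    order = [hrefs[ref.get("idref")] for ref in spine.findall("opf:itemref", OPF_NS)]
    return ncx_href, order


ncx_href, order = reading_order(SAMPLE_OPF)
print(ncx_href, order)
```

Note that the manifest order in the sample differs from the spine order on purpose: only the spine determines the reading sequence.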
The guide element is an optional element for the purpose of identifying fundamental structural components of the book. Each reference element has the attributes type, title, and href. Files referenced in href must be listed in the manifest, and are allowed to have an element identifier (e.g. #figures in the example).
An example OPF file:
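As a hedged illustration, the Python sketch below embeds a hypothetical minimal package document and runs the basic checks described above: the required child elements, the three required metadata tags, and the match between unique-identifier and the identifier's id. The title, UUID, file names, and the helper name check_package are invented for this example; the OPF and Dublin Core namespace URIs are the commonly used ones and are not taken from the text above.

```python
import xml.etree.ElementTree as ET

# Commonly used namespace URIs (assumptions, not quoted from the text above).
OPF = "{http://www.idpf.org/2007/opf}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical content.opf; all values are placeholders.
EXAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="css/style.css" media-type="text/css"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chapter1"/>
  </spine>
  <guide>
    <reference type="loi" title="List Of Illustrations" href="chapter1.xhtml#figures"/>
  </guide>
</package>
"""


def check_package(opf_text):
    """Return a list of problems; an empty list means the basic checks passed."""
    root = ET.fromstring(opf_text)
    problems = []
    # guide is optional, so only the other three children are checked.
    for child in ("metadata", "manifest", "spine"):
        if root.find(OPF + child) is None:
            problems.append(f"missing {child} element")
    for tag in ("title", "language", "identifier"):
        if root.find(f"{OPF}metadata/{DC}{tag}") is None:
            problems.append(f"missing required metadata: {tag}")
    # The package's unique-identifier must name the id of an identifier element.
    uid = root.get("unique-identifier")
    identifier_ids = [el.get("id") for el in root.iter(DC + "identifier")]
    if uid is None or uid not in identifier_ids:
        problems.append("unique-identifier does not match any identifier id")
    return problems


print(check_package(EXAMPLE_OPF))
```

This is a sanity check on a sample file, not a full validator.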
.ncx file
The NCX file (Navigation Control file for XML), traditionally named toc.ncx, contains the hierarchical table of contents for the EPUB file. The NCX file has a mimetype of application/x-dtbncx+xml.
Of note here is that the values for the docTitle, docAuthor, and meta name="dtb:uid" elements should match their analogs in the OPF file. Also, the meta name="dtb:depth" element is set equal to the depth of the navMap element. navPoint elements can be nested to create a hierarchical table of contents. The content of navLabel is the text that appears in the table of contents generated by reading systems that use the .ncx. The content element of navPoint points to a content document listed in the manifest and can also include an element identifier (e.g. #section1).
A description of certain exceptions to the NCX specification as used in EPUB is in Section 2.4.1 of the specification. The complete specification for NCX can be found in the ''Specifications for the Digital Talking Book''.
An example .ncx file:
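The following Python sketch embeds a hypothetical two-level NCX document and walks its nested navPoint elements, so the declared dtb:depth can be compared with the actual nesting depth. The titles, ids, file names, and helper names are assumptions for illustration, and the namespace URI is the one commonly used for NCX files.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used by NCX documents (assumed here).
NCX = "{http://www.daisy.org/z3986/2005/ncx/}"

# Hypothetical toc.ncx; all values are placeholders.
EXAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head>
    <meta name="dtb:uid" content="urn:uuid:00000000-0000-0000-0000-000000000000"/>
    <meta name="dtb:depth" content="2"/>
  </head>
  <docTitle><text>Example Book</text></docTitle>
  <navMap>
    <navPoint id="chapter1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
      <navPoint id="section1" playOrder="2">
        <navLabel><text>Section 1</text></navLabel>
        <content src="chapter1.xhtml#section1"/>
      </navPoint>
    </navPoint>
  </navMap>
</ncx>
"""


def toc_entries(parent, level=1):
    """Yield (level, label, src) for each navPoint, depth first."""
    for point in parent.findall(NCX + "navPoint"):
        label = point.find(f"{NCX}navLabel/{NCX}text").text
        src = point.find(NCX + "content").get("src")
        yield level, label, src
        # Nested navPoint elements form the next level of the hierarchy.
        yield from toc_entries(point, level + 1)


def parse_ncx(ncx_text):
    """Return (table-of-contents entries, declared dtb:depth)."""
    root = ET.fromstring(ncx_text)
    entries = list(toc_entries(root.find(NCX + "navMap")))
    declared = next(
        int(meta.get("content"))
        for meta in root.iter(NCX + "meta")
        if meta.get("name") == "dtb:depth"
    )
    return entries, declared


entries, declared_depth = parse_ncx(EXAMPLE_NCX)
for level, label, src in entries:
    print("  " * (level - 1) + f"{label} -> {src}")
```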
Open Container Format 2.0.1
An EPUB file is a group of files that conform to the OPS/OPF standards and are wrapped in a ZIP file. The OCF specifies how to organize these files in the ZIP, and defines two additional files that must be included. The mimetype file must be a text document in ASCII that contains the string application/epub+zip. It must also be uncompressed, unencrypted, and the first file in the ZIP archive. This file provides a more reliable way for applications to identify the mimetype of the file than just the .epub extension.
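One way to satisfy the mimetype requirements when building an archive is sketched below with Python's standard zipfile module. The function name write_epub and the sample file contents are hypothetical; the sketch only shows the ordering and compression settings, not a complete EPUB builder.

```python
import io
import zipfile


def write_epub(target, files):
    """Write a ZIP archive whose first entry is an uncompressed mimetype file.

    target is a path or a writable binary file object; files maps archive
    paths to bytes. Both parameter names are illustrative.
    """
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Added first and stored without compression, as the OCF rules require.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        # Everything else may use the archive's default compression.
        for name, data in files.items():
            archive.writestr(name, data)


# Usage with an in-memory buffer and placeholder content.
buffer = io.BytesIO()
write_epub(buffer, {"META-INF/container.xml": b"<container/>"})
print(zipfile.ZipFile(buffer).namelist())
```

Because the entry is stored rather than deflated, the string application/epub+zip appears verbatim near the start of the file, which is what lets tools identify the format without unpacking the archive.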
Also, there must be a folder named META-INF, which contains the required file container.xml. This XML file points to the file defining the contents of the book. This is the OPF file, though additional alternative rootfile elements are allowed.
Apart from mimetype and META-INF/container.xml, the other files (OPF, NCX, XHTML, CSS, and image files) are traditionally put in a directory named OEBPS.
An example file structure:
--ZIP Container--
mimetype
META-INF/
  container.xml
OEBPS/
  content.opf
  chapter1.xhtml
  ch1-pic.png
  css/
    style.css
    myfont.otf
An example container.xml, given the above file structure:
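A hypothetical container.xml for this layout is embedded in the Python sketch below, which reads the rootfile entries to locate the OPF file. The namespace URI and the version attribute are the values commonly seen in EPUB containers and are assumptions here, as is the helper name package_paths.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used by container.xml (assumed here).
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Hypothetical container.xml pointing at OEBPS/content.opf.
EXAMPLE_CONTAINER = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def package_paths(container_text):
    """Return the archive paths of all rootfile entries that are OPF files."""
    root = ET.fromstring(container_text)
    return [
        rootfile.get("full-path")
        for rootfile in root.findall("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile.get("media-type") == "application/oebps-package+xml"
    ]


print(package_paths(EXAMPLE_CONTAINER))
```

The full-path value is relative to the root of the ZIP container, not to the META-INF directory.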
Version 3.0.1
The EPUB 3.0 Recommended Specification was approved on October 11, 2011. On June 26, 2014, EPUB 3.0.1 was approved as a minor maintenance update to EPUB 3.0. EPUB 3.0 supersedes the previous release, 2.0.1. EPUB 3 consists of a set of four specifications:
* ''EPUB Publications 3.0'', which defines publication-level semantics and overarching conformance requirements for EPUB Publications.
* ''EPUB Content Documents 3.0'', which defines profiles of XHTML, SVG and CSS for use in the context of EPUB Publications.
* ''EPUB Open Container Format (OCF) 3.0'', which defines a file format and processing model for encapsulating a set of related resources into a single-file (ZIP) EPUB Container.
* ''EPUB Media Overlays 3.0'', which defines a format and a processing model for synchronization of text and audio.
The EPUB 3.0 format was intended to address the following criticisms:
* While good for text-centric books, EPUB was rather unsuitable for publications that require precise layout or specialized formatting, such as comic books.
* A major issue hindering the use of EPUB for most technical publications was the lack of support for equations formatted as MathML. Equations were instead included as bitmap or SVG images, precluding proper handling by screen readers and interaction with computer algebra systems. Support for MathML is included in the EPUB 3.0 specification.
* Other criticisms of EPUB were the specification's lack of detail on linking within or between EPUB books, and its lack of a specification for annotation. Such linking is hindered by the use of a ZIP file as the container for EPUB. Furthermore, it was unclear whether it would be better to link by using EPUB's internal structural markup (the OPF specification mentioned above) or directly to files through the ZIP's file structure. The lack of a standardized way to annotate EPUB books led to difficulty in sharing and transferring annotations and therefore limited the use scenarios of EPUB, particularly in educational settings, because it could not provide a level of interactivity comparable to the web.
On June 26, 2014, the IDPF published EPUB 3.0.1 as a final Recommended Specification. In November 2014, EPUB 3.0 was published by the
Version 3.2
EPUB 3.2 was announced in 2018, and the final specification was released in 2019. A notable change is the removal of a specialized subset of CSS, enabling the use of non-epub-prefixed properties. The references to HTML and SVG standards are also updated to "newest version available", as opposed to a fixed version in time.
Version 3.3
The W3C announced version 3.3 on May 25, 2023. Changes included stricter security and privacy standards and the adoption of the WebP and Opus media formats.
Features
The format and many readers support the following:
* Reflowable documents: text is optimized for a particular display.
* Fixed-layout content: pre-paginated content can be useful for certain kinds of highly designed content, such as illustrated books intended only for larger screens, such as tablets.
* Like an
Digital rights management
An EPUB file can optionally contain DRM as an additional layer, but it is not required by the specifications. The specification does not enforce, suggest, or name any particular DRM scheme, so publishers can choose a DRM scheme to their liking. However, future versions of EPUB (specifically OCF) ''may'' specify a format for DRM. The absence of a specified scheme could affect the level of support for various DRM systems on devices and the portability of purchased e-books. Consequently, such DRM incompatibility may segment the EPUB format along the lines of DRM systems, undermining the advantages of a single standard format and confusing the consumer. DRMed EPUB files must contain a file called rights.xml within the META-INF directory at the root level of the ZIP container.
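Based on the rights.xml requirement stated above, a reading tool could look for that file as a first hint that DRM is present. The Python sketch below is a minimal illustration with a hypothetical function name and placeholder file contents; the presence of the file says nothing about which DRM scheme is in use.

```python
import io
import zipfile


def has_rights_file(epub):
    """Return True if the archive contains META-INF/rights.xml.

    epub is a path or a readable binary file object (illustrative name).
    """
    with zipfile.ZipFile(epub) as archive:
        return "META-INF/rights.xml" in archive.namelist()


# Usage with an in-memory archive that has no rights.xml (placeholder data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", "application/epub+zip")
    archive.writestr("META-INF/container.xml", "<container/>")
print(has_rights_file(buffer))
```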
Adoption
EPUB is a popular format for electronic data interchange as it is based on HTML and other published standards and requires no licensing to implement. EPUB is widely supported by software readers such as Google Play Books on Android and Apple Books on iOS.
Security and privacy concerns
EPUB requires readers to support the
Implementation
An EPUB file is an archive that contains, in effect, a website. It includes HTML files, images, CSS style sheets, and other assets. It also contains metadata. EPUB 3.3 is the latest version.
Container
An EPUB publication is delivered as a single file. This file is an unencrypted zipped archive containing a set of interrelated resources. An OCF (Open Container Format) Abstract Container defines a file system model for the contents of the container. The file system model uses a single common root directory for all contents in the container. All (non-remote) resources for publications are in the directory tree headed by the container's root directory, though EPUB mandates no specific file system structure for this. The file system model includes a mandatory directory named META-INF that is a direct child of the container's root directory. META-INF stores container.xml. The first file in the archive must be the mimetype file. It must be unencrypted and uncompressed so that non-ZIP utilities can read the mimetype. The mimetype file must be an ASCII file that contains the string application/epub+zip.
--ZIP Container--
mimetype
META-INF/
  container.xml
OEBPS/
  content.opf
  chapter1.xhtml
  ch1-pic.png
  css/
    style.css
    myfont.otf
  toc.ncx
The container.xml file in META-INF points to the file defining the contents of the book, the OPF file, though additional alternative rootfile elements are allowed. Apart from mimetype and META-INF/container.xml, the other files (OPF, NCX, XHTML, CSS, and image files) are traditionally put in a directory named OEBPS. An example container.xml:
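The Python sketch below embeds a hypothetical container.xml and checks an archive against the container rules described in this section: mimetype first and uncompressed, META-INF/container.xml present, and every rootfile it names actually in the archive. The namespace URI, the function names, and the placeholder file contents are assumptions for illustration. The check uses the order of entries reported by Python's zipfile module as a stand-in for physical order in the file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace commonly used by container.xml (assumed here).
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"

# Hypothetical container.xml for the layout shown above.
EXAMPLE_CONTAINER_XML = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def check_container(epub):
    """Return a list of problems found in the archive (empty if none)."""
    problems = []
    with zipfile.ZipFile(epub) as archive:
        entries = archive.infolist()
        names = [entry.filename for entry in entries]
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry")
        elif entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        elif archive.read("mimetype") != b"application/epub+zip":
            problems.append("mimetype has unexpected content")
        if "META-INF/container.xml" not in names:
            problems.append("META-INF/container.xml is missing")
        else:
            root = ET.fromstring(archive.read("META-INF/container.xml"))
            # Every rootfile named by container.xml must exist in the archive.
            for rootfile in root.iter(CONTAINER_NS + "rootfile"):
                path = rootfile.get("full-path")
                if path not in names:
                    problems.append(f"rootfile {path} is missing")
    return problems


def build_example():
    """Build a small in-memory archive with placeholder content."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        archive.writestr("META-INF/container.xml", EXAMPLE_CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", "<package/>")
    return buffer


print(check_container(build_example()))
```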
Publication
The EPUB container must contain:
* At least one content document.
* One navigation document.
* One package document listing all publication resources. This file should use the file extension ''.opf''. It contains metadata, a manifest, fallback chains, bindings, and a spine, which is an ordered sequence of ID references defining the default reading order.
The EPUB container may contain:
* Style sheets
* Pronunciation Lexicon Specification (PLS) documents
* Media overlay documents
Contents
Content documents include HTML 5 content, navigation documents, SVG documents, scripted content documents, and fixed layout documents. Contents also include CSS and PLS documents. Navigation documents supersede the NCX grammar used in EPUB 2.
Media overlays
Books with synchronized audio narration are created in EPUB 3 by using media overlay documents to describe the timing for the pre-recorded audio narration and how it relates to the EPUB Content Document markup. The file format for Media Overlays is defined as a subset of SMIL.
Software
EPUB reader software exists for all major computing platforms, such as Adobe Digital Editions and calibre on desktop platforms, Google Play Books and Aldiko on Android and iOS, and Apple Books on macOS and iOS. There is also cross-platform editor software for creating EPUB files.
Reading software
The following software can read and display EPUB files.
Creation software
The following software can create EPUB files.
Notes
References
External links