EPUB is an e-book file format that uses the ".epub" file extension. The term is short for ''electronic publication'' and is sometimes styled ''ePub''. EPUB is supported by many e-readers, and compatible software is available for most smartphones, tablets, and computers. EPUB is a technical standard published by the International Digital Publishing Forum (IDPF). It became an official standard of the IDPF in September 2007, superseding the older Open eBook (OEB) standard. The Book Industry Study Group endorses EPUB 3 as the format of choice for packaging content and has stated that the global book publishing industry should rally around a single standard. The EPUB format is implemented as an archive file consisting of XHTML files carrying the content, along with images and other supporting files. EPUB is the most widely supported vendor-independent XML-based e-book format; that is, it is supported by almost all hardware readers.


History

A successor to the Open eBook Publication Structure, EPUB 2.0 was approved in October 2007, with a maintenance update (2.0.1) approved in September 2010. The EPUB 3.0 specification became effective in October 2011 and was superseded by a minor maintenance update (3.0.1) in June 2014. New major features of EPUB 3 include support for precise layout or specialized formatting (Fixed Layout Documents), such as for comic books, and MathML support. The current version of EPUB is 3.2, effective May 8, 2019. In this version the text of the format specification was reorganized and cleaned up; the format supports remotely hosted resources and new font formats (WOFF 2.0 and SFNT) and relies more directly on standard HTML and CSS. In May 2016, IDPF members approved a merger with the World Wide Web Consortium (W3C), "to fully align the publishing industry and core Web technology".


Version 2.0.1

EPUB 2.0 was approved in October 2007. A maintenance update (2.0.1), intended to clarify and correct errata in the specifications, was approved in September 2010. EPUB version 2.0.1 consists of three specifications:
* ''Open Publication Structure'' (OPS) 2.0.1, which defines the formatting of the content.
* ''Open Packaging Format'' (OPF) 2.0.1, which describes the structure of the .epub file in XML.
* ''Open Container Format'' (OCF) 2.0.1, which collects all files as a ZIP archive.
EPUB internally uses XHTML or DTBook (an XML standard provided by the DAISY Consortium) to represent the text and structure of the content document, and a subset of CSS to provide layout and formatting. XML is used to create the document manifest, table of contents, and EPUB metadata. Finally, the files are bundled in a ZIP file as a packaging format.


Open Publication Structure 2.0.1

An EPUB file uses XHTML 1.1 (or DTBook) to construct the content of a book as of version 2.0.1. This is different from previous versions (OEBPS 1.2 and earlier), which used a subset of XHTML. There are, however, a few restrictions on certain elements. The mimetype for XHTML documents in EPUB is application/xhtml+xml.

Styling and layout are performed using a subset of CSS 2.0, referred to as ''OPS Style Sheets''. This specialized syntax requires that reading systems support only a portion of CSS properties and adds a few custom properties. Custom properties include oeb-page-head, oeb-page-foot, and oeb-column-number. Font embedding can be accomplished using the @font-face rule, together with listing the font file in the OPF's manifest (see below). The mimetype for CSS documents in EPUB is text/css.

EPUB also requires that PNG, JPEG, GIF, and SVG images be supported using the mimetypes image/png, image/jpeg, image/gif, and image/svg+xml. Other media types are allowed, but creators must include alternative renditions using supported types. For a table of all required mimetypes, see Section 1.3.7 of the specification.

Unicode is required, and content producers must use either UTF-8 or UTF-16 encoding. This is to support international and multilingual books. However, reading systems are not required to provide the fonts necessary to display every Unicode character, though they are required to display at least a placeholder for characters that cannot be displayed fully.

An example skeleton of an XHTML file for EPUB carries the title ''Pride and Prejudice'' and an elided body (...).
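As a hedged illustration of the content-document requirements described above, the following Python sketch embeds a minimal XHTML 1.1 skeleton and checks that it is well-formed XML with the expected title. The stylesheet path css/style.css, the body placeholder, and the helper name document_title are assumptions made for the example, not part of the specification.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical skeleton: the stylesheet path and body text are placeholders.
SKELETON = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Pride and Prejudice</title>
    <link rel="stylesheet" type="text/css" href="css/style.css" />
  </head>
  <body>
    <p>...</p>
  </body>
</html>
"""


def document_title(xhtml_text: str) -> str:
    """Parse an XHTML content document and return the text of its <title>."""
    # Encode to UTF-8 bytes so the parser honours the XML declaration.
    root = ET.fromstring(xhtml_text.encode("utf-8"))
    title = root.find("x:head/x:title", {"x": XHTML_NS})
    return title.text if title is not None and title.text else ""


if __name__ == "__main__":
    print(document_title(SKELETON))
```

Parsing only confirms well-formedness; it does not validate the document against the XHTML 1.1 DTD or the OPS restrictions.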


Open Packaging Format 2.0.1

The OPF specification's purpose is to define "the mechanism by which the various components of an OPS publication are tied together and provides additional structure and semantics to the electronic publication". This is accomplished by two XML files with the extensions .opf and .ncx.

.opf file

The OPF file, traditionally named content.opf, houses the EPUB book's metadata, file manifest, and linear reading order. This file has a root element package and four child elements: metadata, manifest, spine, and guide. Furthermore, the package node must have the unique-identifier attribute. The .opf file's mimetype is application/oebps-package+xml.

The metadata element contains all the metadata information for a particular EPUB file. Three metadata tags are required (though many more are available): title, language, and identifier. title contains the title of the book; language contains the language of the book's contents in RFC 3066 format ''or'' its successors, such as the newer RFC 4646; and identifier contains a unique identifier for the book, such as its ISBN or a URL. The identifier's id attribute should equal the unique-identifier attribute from the package element.

The manifest element lists all the files contained in the package. Each file is represented by an item element, which has the attributes id, href, and media-type. All XHTML (content documents), stylesheets, images or other media, embedded fonts, and the NCX file should be listed here. Only the .opf file itself, the container.xml, and the mimetype files should not be included.

The spine element lists all the XHTML content documents in their linear reading order. Any content document that can be reached through linking or the table of contents must be listed as well. The toc attribute of spine must contain the id of the NCX file listed in the manifest. Each itemref element's idref is set to the id of its respective content document.

The guide element is an optional element for the purpose of identifying fundamental structural components of the book. Each reference element has the attributes type, title, and href. Files referenced in href must be listed in the manifest, and are allowed to have an element identifier (e.g. #figures in the example).

An example OPF file carries the title ''Pride and Prejudice'', the language en, the identifier 123456789X, and the author Jane Austen.

.ncx file

The NCX file (Navigation Control file for XML), traditionally named toc.ncx, contains the hierarchical table of contents for the EPUB file. The specification for NCX was developed for Digital Talking Book (DTB), is maintained by the DAISY Consortium, and is not a part of the EPUB specification. The NCX file has a mimetype of application/x-dtbncx+xml.

Of note here is that the values for the docTitle, docAuthor, and meta name="dtb:uid" elements should match their analogs in the OPF file. Also, the meta name="dtb:depth" element is set equal to the depth of the navMap element. navPoint elements can be nested to create a hierarchical table of contents. navLabel's content is the text that appears in the table of contents generated by reading systems that use the .ncx. navPoint's content element points to a content document listed in the manifest and can also include an element identifier (e.g. #section1).

A description of certain exceptions to the NCX specification as used in EPUB is in Section 2.4.1 of the specification. The complete specification for NCX can be found in the ''Specifications for the Digital Talking Book''.

An example .ncx file carries the docTitle ''Pride and Prejudice'', the docAuthor "Austen, Jane", and a navLabel reading "Chapter 1".
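The consistency rules for the package document described above can be sketched as a small checker. This is a minimal sketch rather than a validator: the function name check_opf, the sample ids (BookId, chapter1, stylesheet, ncx), and the sample file names are assumptions for illustration. The namespaces used are the OPF 2.0 package namespace and the Dublin Core element namespace.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical package document: ids and file names are placeholders.
EXAMPLE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0"
         unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Pride and Prejudice</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId">123456789X</dc:identifier>
    <dc:creator>Jane Austen</dc:creator>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="stylesheet" href="css/style.css" media-type="text/css"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chapter1"/>
  </spine>
</package>
"""


def check_opf(opf_text: str) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    root = ET.fromstring(opf_text.encode("utf-8"))
    problems = []

    # The three required metadata elements.
    for name in ("title", "language", "identifier"):
        if root.find(f"opf:metadata/dc:{name}", NS) is None:
            problems.append(f"missing dc:{name}")

    # package/@unique-identifier must name the id of a dc:identifier.
    uid = root.get("unique-identifier")
    ids = {e.get("id") for e in root.findall("opf:metadata/dc:identifier", NS)}
    if uid is None or uid not in ids:
        problems.append("unique-identifier does not match a dc:identifier id")

    # Map manifest item ids to their media types.
    manifest = {
        item.get("id"): item.get("media-type")
        for item in root.findall("opf:manifest/opf:item", NS)
    }

    spine = root.find("opf:spine", NS)
    if spine is None:
        problems.append("missing spine")
        return problems

    # spine/@toc must point at the NCX item in the manifest.
    if manifest.get(spine.get("toc")) != "application/x-dtbncx+xml":
        problems.append("spine toc does not reference the NCX item")

    # Every itemref must resolve to a manifest item.
    for ref in spine.findall("opf:itemref", NS):
        idref = ref.get("idref")
        if idref not in manifest:
            problems.append(f"spine itemref '{idref}' is not in the manifest")
    return problems


if __name__ == "__main__":
    print(check_opf(EXAMPLE_OPF))
```

The checker covers only the rules stated in the text; it does not verify that the referenced files exist or that the NCX values match the OPF metadata.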


Open Container Format 2.0.1

An EPUB file is a group of files that conform to the OPS/OPF standards and are wrapped in a ZIP file. The OCF specifies how to organize these files in the ZIP, and defines two additional files that must be included. The mimetype file must be a text document in ASCII that contains the string application/epub+zip. It must also be uncompressed, unencrypted, and the first file in the ZIP archive. This file provides a more reliable way for applications to identify the mimetype of the file than just the .epub extension. In addition, there must be a folder named META-INF, which contains the required file container.xml. This XML file points to the file defining the contents of the book. This is the OPF file, though additional alternative rootfile elements are allowed. Apart from mimetype and META-INF/container.xml, the other files (OPF, NCX, XHTML, CSS, and image files) are traditionally put in a directory named OEBPS. An example file structure:
--ZIP Container--
mimetype
META-INF/
  container.xml
OEBPS/
  content.opf
  chapter1.xhtml
  ch1-pic.png
  css/
    style.css
    myfont.otf
An example container.xml, given the above file structure:
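A container.xml matching the file structure above can be sketched inside a short Python example that also assembles the archive with the mimetype entry first and stored uncompressed. The function name build_epub is hypothetical, and the sketch assumes the OPF sits at OEBPS/content.opf as in the listing.

```python
import io
import zipfile

# container.xml pointing at the OPF file of the listing above.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0"
           xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files: dict[str, bytes]) -> bytes:
    """Pack `files` (archive path -> content) into an EPUB-style ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        # Remaining resources may be compressed.
        for name, data in files.items():
            zf.writestr(name, data, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


if __name__ == "__main__":
    epub = build_epub({"OEBPS/content.opf": b"<package/>"})
    print(zipfile.ZipFile(io.BytesIO(epub)).namelist())
```

The sketch only arranges the archive; it does not check that the packed files are themselves valid OPS or OPF documents.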


Version 3.0.1

The EPUB 3.0 Recommended Specification was approved on 11 October 2011. On June 26, 2014, EPUB 3.0.1 was approved as a minor maintenance update to EPUB 3.0. EPUB 3.0 supersedes the previous release 2.0.1. EPUB 3 consists of a set of four specifications:
* ''EPUB Publications 3.0'', which defines publication-level semantics and overarching conformance requirements for EPUB Publications
* ''EPUB Content Documents 3.0'', which defines profiles of XHTML, SVG and CSS for use in the context of EPUB Publications
* ''EPUB Open Container Format (OCF) 3.0'', which defines a file format and processing model for encapsulating a set of related resources into a single-file (ZIP) EPUB Container
* ''EPUB Media Overlays 3.0'', which defines a format and a processing model for synchronization of text and audio
The EPUB 3.0 format was intended to address the following criticisms:
* While good for text-centric books, EPUB was rather unsuitable for publications that require precise layout or specialized formatting, such as comic books.
* A major issue hindering the use of EPUB for most technical publications was the lack of support for equations formatted as MathML. They were included as bitmap or SVG images, precluding proper handling by screen readers and interaction with computer algebra systems. Support for MathML is included in the EPUB 3.0 specification.
* Other criticisms of EPUB were the specification's lack of detail on linking within or between EPUB books, and its lack of a specification for annotation. Such linking is hindered by the use of a ZIP file as the container for EPUB. Furthermore, it was unclear if it would be better to link by using EPUB's internal structural markup (the OPF specification mentioned above) or directly to files through the ZIP's file structure. The lack of a standardized way to annotate EPUB books led to difficulty in sharing and transferring annotations and therefore limited the use scenarios of EPUB, particularly in educational settings, because it cannot provide a level of interactivity comparable to the web.
On June 26, 2014, the IDPF published EPUB 3.0.1 as a final Recommended Specification. In November 2014, EPUB 3.0 was published by the ISO/IEC as ISO/IEC TS 30135 (parts 1-7). In January 2020, EPUB 3.0.1 was published by the ISO/IEC as ISO/IEC 23736 (parts 1-6).


Version 3.2

EPUB 3.2 was announced in 2018, and the final specification was released in 2019. A notable change is the removal of a specialized subset of CSS, enabling the use of non-epub-prefixed properties. The references to HTML and SVG standards are also updated to "newest version available", as opposed to a fixed version in time.


Features

The format and many readers support the following:
* Reflowable document: optimize text for a particular display
* Fixed-layout content: pre-paginated content can be useful for certain kinds of highly designed content, such as illustrated books intended only for larger screens, such as tablets.
* Like an HTML web site, the format supports inline raster and vector images, metadata, and CSS styling.
* Page bookmarking
* Passage highlighting and notes
* A library that stores books and can be searched
* Re-sizable fonts, and changeable text and background colors
* Support for a subset of MathML
* Better analytical support with compatible platforms
* Digital rights management: can contain digital rights management (DRM) as an optional layer


Digital rights management

An EPUB file can optionally contain DRM as an additional layer, but it is not required by the specifications. The specification does not name, enforce, or suggest any particular DRM system, so publishers can choose a DRM scheme to their liking. However, future versions of EPUB (specifically OCF) ''may'' specify a format for DRM. This freedom of choice could affect the level of support for various DRM systems on devices and the portability of purchased e-books. Consequently, DRM incompatibility may segment the EPUB format along the lines of DRM systems, undermining the advantages of a single standard format and confusing the consumer. DRMed EPUB files must contain a file called rights.xml within the META-INF directory at the root level of the ZIP container.
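Since DRMed files are expected to carry META-INF/rights.xml, a reader could use the presence of that entry as a first hint that a DRM layer is applied. The following is a minimal sketch of such a check; the names has_rights_file and make_archive are hypothetical, and the result is only a heuristic because it says nothing about which DRM scheme is in use.

```python
import io
import zipfile


def has_rights_file(epub_bytes: bytes) -> bool:
    """Return True if the archive contains META-INF/rights.xml."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return "META-INF/rights.xml" in zf.namelist()


def make_archive(names: list[str]) -> bytes:
    """Hypothetical helper: build an in-memory ZIP with empty entries."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    return buf.getvalue()


if __name__ == "__main__":
    plain = make_archive(["mimetype", "META-INF/container.xml"])
    protected = make_archive(["mimetype", "META-INF/container.xml",
                              "META-INF/rights.xml"])
    print(has_rights_file(plain), has_rights_file(protected))
```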


Adoption

EPUB is widely used on software readers such as Google Play Books on Android and Apple Books on iOS and macOS. It is also supported by Amazon Kindle's e-readers, but not by the associated apps for other platforms. iBooks also supports the proprietary iBook format, which is based on the EPUB format but depends upon code from the iBooks app to function. EPUB is a popular format for electronic data interchange because it is an open format and is based on HTML, as opposed to Amazon's proprietary format for Kindle readers. Popular EPUB producers of public domain and openly licensed content include Project Gutenberg, Standard Ebooks, PubMed Central, SciELO and others. In 2022, Amazon's Send-to-Kindle service removed support for its own Kindle File Format in favor of EPUB.


Security and privacy concerns

EPUB requires readers to support the HTML5, JavaScript, CSS, and SVG formats, making EPUB readers use the same technology as web browsers. Because of their complexity and flexibility, such formats are associated with various types of security issues and privacy-breaching behaviors, e.g. web beacons, CSRF, and XSHM. Such vulnerabilities can be used to implement web tracking and cross-device tracking on EPUB files. Security researchers also identified attacks leading to local files and other user data being uploaded. The "EPUB 3.1 Overview" document provides a security warning. EPUB also requires support for PNG, JPEG, and GIF images.


Implementation

An EPUB file is an archive that contains, in effect, a website. It includes HTML files, images, CSS style sheets, and other assets. It also contains metadata. EPUB 3.2 is the latest version. By using HTML5, publications can contain video, audio, and interactivity, just like websites in web browsers.


Container

An EPUB publication is delivered as a single file. This file is an unencrypted zipped archive containing a set of interrelated resources. An OCF (Open Container Format) Abstract Container defines a file system model for the contents of the container. The file system model uses a single common root directory for all contents in the container. All (non-remote) resources for publications are in the directory tree headed by the container's root directory, though EPUB mandates no specific file system structure for this. The file system model includes a mandatory directory named META-INF that is a direct child of the container's root directory. META-INF stores container.xml.

The first file in the archive must be the mimetype file. It must be unencrypted and uncompressed so that non-ZIP utilities can read it. The mimetype file must be an ASCII file that contains the string "application/epub+zip". This file gives applications a more reliable way to identify the MIME type of the file than the .epub extension alone. An example file structure:
--ZIP Container--
mimetype
META-INF/
  container.xml
OEBPS/
  content.opf
  chapter1.xhtml
  ch1-pic.png
  css/
    style.css
    myfont.otf
  toc.ncx
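
The mimetype rules described above can be checked with Python's standard zipfile module. The following is a minimal sketch, not a full EPUB validator. The function names check_mimetype_entry and build_sample_epub are hypothetical, and the sample archive is a stub rather than a complete publication.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype_entry(epub_file):
    """Return a list of problems with the mimetype entry (empty if none)."""
    problems = []
    with zipfile.ZipFile(epub_file) as archive:
        entries = archive.infolist()
        # The mimetype file must be the first entry in the archive.
        if not entries or entries[0].filename != "mimetype":
            problems.append("mimetype is not the first entry in the archive")
            return problems
        first = entries[0]
        # It must be stored without compression.
        if first.compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype entry is compressed")
        # It must contain exactly the ASCII string application/epub+zip.
        if archive.read(first) != EPUB_MIMETYPE:
            problems.append("mimetype entry does not contain application/epub+zip")
    return problems


def build_sample_epub():
    """Build a stub archive in memory (hypothetical content, for illustration)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Write mimetype first and uncompressed; other entries may be deflated.
        archive.writestr("mimetype", EPUB_MIMETYPE,
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", "<container/>",
                         compress_type=zipfile.ZIP_DEFLATED)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(check_mimetype_entry(build_sample_epub()))
```

The check inspects only the first archive entry, which mirrors how a tool that does not understand ZIP could still find the MIME type near the start of the file.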
There must be a META-INF directory containing container.xml. This file points to the file defining the contents of the book, the OPF file, though additional alternative rootfile elements are allowed. Apart from mimetype and META-INF/container.xml, the other files (OPF, NCX, XHTML, CSS and image files) are traditionally put in a directory named OEBPS.
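
As an illustration of how container.xml points to the OPF file, the sketch below parses a typical container.xml with Python's standard xml.etree.ElementTree module and lists every rootfile path. The embedded XML is a commonly seen form matching the example file structure above; the function name find_package_documents is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by the OCF container.xml file.
CONTAINER_NS = {"ocf": "urn:oasis:names:tc:opendocument:xmlns:container"}

# A typical container.xml for the example layout (illustrative sample).
SAMPLE_CONTAINER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0"
           xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_package_documents(container_xml):
    """Return the full-path of every rootfile element, in document order."""
    root = ET.fromstring(container_xml)
    rootfiles = root.findall("ocf:rootfiles/ocf:rootfile", CONTAINER_NS)
    return [rootfile.get("full-path") for rootfile in rootfiles]


if __name__ == "__main__":
    print(find_package_documents(SAMPLE_CONTAINER_XML))
```

The paths are relative to the container's root directory, so the first result can be passed directly to a ZIP reader to open the package document.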


Publication

The EPUB container must contain:
* At least one content document.
* One navigation document.
* One package document listing all publication resources. This file should use the file extension ''.opf''. It contains metadata, a manifest, fallback chains, bindings, and a spine, which is an ordered sequence of ID references defining the default reading order.

The EPUB container may contain:
* Style sheets
* Pronunciation Lexicon Specification (PLS) documents
* Media overlay documents
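
The relationship between the manifest and the spine can be sketched as follows: each spine entry carries an ID reference, and the manifest maps that ID to a file. The package document below is a hypothetical, abbreviated sample (it is not guaranteed to pass validation), and the function name reading_order is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the package document (OPF).
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical, abbreviated package document used only for illustration.
SAMPLE_PACKAGE = b"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0"
         unique-identifier="pub-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="pub-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Book</dc:title>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml"
          properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="css" href="css/style.css" media-type="text/css"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>
"""


def reading_order(package_xml):
    """Resolve the spine's ID references to manifest hrefs, in spine order."""
    root = ET.fromstring(package_xml)
    # Map each manifest item ID to the resource it names.
    manifest = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", OPF_NS)
    }
    # The spine lists IDs only; look each one up in the manifest.
    return [
        manifest[itemref.get("idref")]
        for itemref in root.findall("opf:spine/opf:itemref", OPF_NS)
    ]


if __name__ == "__main__":
    print(reading_order(SAMPLE_PACKAGE))
```

Resources such as the style sheet appear in the manifest but not in the spine, which is why the manifest lists all publication resources while the spine defines only the default reading order.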


Contents

Content documents include HTML5 content, navigation documents, SVG documents, scripted content documents, and fixed layout documents. Contents also include CSS and PLS documents. Navigation documents supersede the NCX grammar used in EPUB 2.


Media overlays

Books with synchronized audio narration are created in EPUB 3 by using media overlay documents to describe the timing for the pre-recorded audio narration and how it relates to the EPUB Content Document markup. The file format for Media Overlays is defined as a subset of SMIL.
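
To illustrate how a media overlay ties audio timing to content markup, the sketch below reads a small SMIL document and lists each synchronization point: the content fragment being narrated and the audio clip that narrates it. The sample overlay, its file names, and the function name list_sync_points are hypothetical; clock values are kept as strings rather than parsed.

```python
import xml.etree.ElementTree as ET

# Namespace of SMIL, of which Media Overlays use a subset.
SMIL_NS = {"smil": "http://www.w3.org/ns/SMIL"}

# Hypothetical overlay: two sentences narrated from one audio file.
SAMPLE_OVERLAY = b"""<?xml version="1.0" encoding="UTF-8"?>
<smil xmlns="http://www.w3.org/ns/SMIL" version="3.0">
  <body>
    <par id="par1">
      <text src="chapter1.xhtml#sentence1"/>
      <audio src="audio/chapter1.mp3" clipBegin="0s" clipEnd="4.2s"/>
    </par>
    <par id="par2">
      <text src="chapter1.xhtml#sentence2"/>
      <audio src="audio/chapter1.mp3" clipBegin="4.2s" clipEnd="9.5s"/>
    </par>
  </body>
</smil>
"""


def list_sync_points(overlay_xml):
    """Return (text_src, audio_src, clip_begin, clip_end) for each par element."""
    root = ET.fromstring(overlay_xml)
    points = []
    for par in root.iter("{http://www.w3.org/ns/SMIL}par"):
        text = par.find("smil:text", SMIL_NS)
        audio = par.find("smil:audio", SMIL_NS)
        if text is None or audio is None:
            continue  # skip par elements without both a text and an audio child
        points.append((text.get("src"), audio.get("src"),
                       audio.get("clipBegin"), audio.get("clipEnd")))
    return points


if __name__ == "__main__":
    for point in list_sync_points(SAMPLE_OVERLAY):
        print(point)
```

Each text reference uses a fragment identifier into the content document, so a reading system can highlight the matching element while the audio clip plays.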


Software

EPUB reader software exists for all major computing platforms, such as Adobe Digital Editions and calibre on desktop platforms, Google Play Books and Aldiko on Android and iOS, and Apple Books on macOS and iOS. There is also cross-platform editor software for creating EPUB files, including the open source programs calibre and Sigil. Most modern web browsers also support EPUB reader plugins. The Microsoft Edge browser had EPUB reader capability built in until September 2019.


Reading software

The following software can read and display EPUB files.


Creation software

The following software can create EPUB files.


External links


ISO/IEC TS 30135-1:2014 - EPUB3 — Part 1: EPUB3 Overview

IDPF EPUB Validator (GitHub repository)