The following is a comparison of e-book formats used to create and publish e-books.
EPUB is the most widely supported e-book format; it can be read on most e-book readers except Amazon Kindle devices. Most e-book readers also support the PDF and plain text formats. E-book software can be used to convert e-books from one format to another, as well as to create, edit and publish e-books.
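As a rough illustration of format conversion, the sketch below wraps the `ebook-convert` command-line tool that ships with Calibre, which takes an input path and an output path and infers both formats from the file extensions. The helper names `build_convert_command` and `convert` and the file names are hypothetical, and the sketch assumes Calibre is installed and on the PATH.

```python
import shutil
import subprocess


def build_convert_command(source_path, target_path, tool="ebook-convert"):
    """Return the argument list for a Calibre conversion (formats come from the extensions)."""
    return [tool, source_path, target_path]


def convert(source_path, target_path):
    """Run the conversion, failing early if the Calibre tool is not installed."""
    command = build_convert_command(source_path, target_path)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} not found on PATH; is Calibre installed?")
    subprocess.run(command, check=True)


# Hypothetical usage: convert("book.epub", "book.mobi")
```

Only unprotected files can be converted this way; DRM-encrypted books are a separate matter.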
Format descriptions
Formats available include, but are not limited to:
Broadband eBooks (BBeB)
The digital book format originally used by Sony Corporation. It is a proprietary format, but some reader software for general-purpose computers, particularly under Linux (for example, Calibre's internal viewer), can read it. The LRX file extension represents a DRM-encrypted eBook. More recently, Sony has converted its books from BBeB to EPUB and is now issuing new titles in EPUB.
Comic Book Archive file
Compiled HTML
CHM format is a proprietary format based on HTML. Multiple pages and embedded graphics are distributed along with metadata as a single compressed file. The index supports both keyword lookup and full-text search.
DAISY – ANSI/NISO Z39.86
The Digital Accessible Information SYstem (DAISY) is an XML-based open standard published by the National Information Standards Organization (NISO) and maintained by the DAISY Consortium for people with print disabilities. DAISY has wide international support with features for multimedia, navigation and synchronization. A subset of the DAISY format has been adopted by law in the United States as the National Instructional Materials Accessibility Standard (NIMAS), and K-12 textbooks and instructional materials are now required to be provided to students with disabilities.
DAISY is already aligned with the EPUB technical standard, and is expected to fully converge with its forthcoming EPUB3 revision.
DjVu
DjVu is a format specialized for storing scanned documents. It includes advanced compressors optimized for low-color images, such as text documents. Individual files may contain one or more pages. DjVu files cannot be re-flowed.
The page images are divided into separate layers (such as a multi-color, low-resolution background layer using lossy compression, and a few-color, high-resolution, tightly compressed foreground layer), each compressed with the best available method. The format is designed to decompress very quickly, even faster than vector-based formats.
The advantage of DjVu is that it is possible to take a high-resolution scan (300–400 DPI), good enough for both on-screen reading and printing, and store it very efficiently. Several dozen 300 DPI black-and-white scans can be stored in less than a megabyte.
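To put the storage claim above in perspective, the following sketch computes the uncompressed size of a black-and-white page at 1 bit per pixel and the compression ratio needed to fit a batch of such pages into one megabyte. The US Letter page size, the figure of 36 pages standing in for "several dozen", and the decimal megabyte are assumptions chosen for illustration; none of them comes from the DjVu specification.

```python
def raw_bilevel_bytes(width_in, height_in, dpi):
    """Uncompressed size in bytes of a 1-bit-per-pixel scan of the given page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bits = width_px * height_px
    return (bits + 7) // 8  # round up to whole bytes


def required_ratio(pages, page_bytes, budget_bytes=1_000_000):
    """Compression ratio needed to fit `pages` raw pages into `budget_bytes`."""
    return pages * page_bytes / budget_bytes


# Assumed example: US Letter (8.5 x 11 in) at 300 DPI, 36 pages in one megabyte.
page = raw_bilevel_bytes(8.5, 11, 300)
ratio = required_ratio(36, page)
print(f"raw page: {page} bytes, required ratio: about {ratio:.0f}:1")
```

Under these assumptions each raw page is already about a megabyte, so the claim implies compressing each page to a few tens of kilobytes.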
DOC
DOC is a document file format that is directly supported by few ebook readers. Its advantages as an ebook format are that it can be easily converted to other ebook formats and that it can be reflowed. It can be easily edited using Microsoft software and any of several other programs. Note that the format has changed several times since its original release, and there are numerous incompatibilities between the various releases and the assorted programs that attempt to read and write the format.
DOCX
DOCX is a document file format that is directly supported by few ebook readers. Its advantages as an ebook format are that it can be easily converted to other ebook formats and it can be reflowed. It can be easily edited.
EPUB
The .epub or OEBPS format is a technical standard for e-books created by the International Digital Publishing Forum (IDPF).
The EPUB format has gained some popularity as a vendor-independent XML-based e-book format. The format can be read by the Kobo eReader, BlackBerry devices, Apple's iBooks app running on Macintosh computers and iOS devices, the Google Play Books app running on Android and iOS devices, Barnes & Noble Nook, Amazon Kindle Fire, Sony Reader, BeBook, Bookeen Cybook Gen3 (with firmware v2 and up), Adobe Digital Editions, Lexcycle Stanza, FBReader, PocketBook eReader, Aldiko, the Mozilla Firefox add-on EPUBReader, Lucifox, Okular and other reading apps.
Adobe Digital Editions uses the .epub format for its e-books, with digital rights management (DRM) protection provided through its proprietary ADEPT mechanism. The ADEPT framework and scripts have been reverse-engineered to circumvent this DRM system.
eReader
Formerly Palm Digital Media/Peanut Press
eReader is a freeware program for viewing Palm Digital Media electronic books, which use the pdb format used by many Palm applications. Versions are available for Android, BlackBerry, iOS, Palm OS (not webOS), Symbian, Windows Mobile Pocket PC/Smartphone, and macOS. The reader shows text one page at a time, as paper books do. eReader supports embedded hyperlinks and images. Additionally, the Stanza application for the iPhone and iPod Touch can read both encrypted and unencrypted eReader files.
The program supports features like bookmarks and footnotes, enabling the user to mark any page with a bookmark and any part of the text with a footnote-like commentary. Footnotes can later be exported as a Memo document.
On July 20, 2009, Barnes & Noble made an announcement implying that eReader would be the company's preferred format to deliver e-books. Exactly three months later, a press release by Adobe revealed that Barnes & Noble would be joining forces with the software company to standardize the EPUB and PDF eBook formats. Barnes & Noble e-books are now sold mostly in EPUB format.
FictionBook (fb2)
FictionBook is a popular XML-based e-book format, supported by free readers such as PocketBook eReader, FBReader, Okular, CoolReader, BeBook and STDU Viewer.
The FictionBook format does not specify the appearance of a document; instead, it describes its structure and semantics. All the ebook metadata, such as the author name, title, and publisher, is also present in the ebook file. Hence the format is convenient for automatic processing, indexing, and ebook collection management. It is also a convenient format for storing books for later automatic conversion into other formats.
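Because the metadata lives in the XML itself, it can be read with a general-purpose XML parser. The sketch below pulls the title, authors and publisher out of a tiny hand-written sample using Python's standard library. The sample document, the helper name `read_fb2_metadata`, and the namespace URI and element names (as commonly seen in FictionBook 2.0 files) are assumptions for illustration, not a complete or validated reading of the format.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used by FictionBook 2.0 files (assumed here).
FB2_NS = {"fb": "http://www.gribuser.ru/xml/fictionbook/2.0"}

# Minimal hand-written sample; real files carry much more structure.
SAMPLE_FB2 = """<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <description>
    <title-info>
      <author><first-name>Jane</first-name><last-name>Doe</last-name></author>
      <book-title>An Example Book</book-title>
    </title-info>
    <publish-info><publisher>Example Press</publisher></publish-info>
  </description>
  <body><section><p>Text.</p></section></body>
</FictionBook>"""


def read_fb2_metadata(xml_text):
    """Extract title, author names and publisher from FictionBook XML text."""
    root = ET.fromstring(xml_text)
    title_info = "fb:description/fb:title-info"
    authors = []
    for author in root.findall(f"{title_info}/fb:author", FB2_NS):
        first = author.findtext("fb:first-name", default="", namespaces=FB2_NS)
        last = author.findtext("fb:last-name", default="", namespaces=FB2_NS)
        authors.append(f"{first} {last}".strip())
    return {
        "title": root.findtext(f"{title_info}/fb:book-title", namespaces=FB2_NS),
        "authors": authors,
        "publisher": root.findtext(
            "fb:description/fb:publish-info/fb:publisher", namespaces=FB2_NS
        ),
    }


metadata = read_fb2_metadata(SAMPLE_FB2)
print(metadata)
```

A collection manager could run a function like this over every file in a library to build an index without opening a reader application.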
Founder Electronics
APABI is a format devised by Founder Electronics. It is a popular format for Chinese e-books. It can be read using the Apabi Reader software, and produced using Apabi Publisher. Both .xeb and .ceb files are encoded binary files. The Iliad e-book device includes an Apabi 'viewer'.
Hypertext Markup Language
HTML is the markup language used for most web pages. E-books using HTML can be read using a Web browser. The specifications for the format are available without charge from the W3C.
HTML adds specially marked meta-elements to otherwise plain text encoded using character sets like ASCII or UTF-8. As such, suitably formatted files can be, and sometimes are, generated by hand using a plain text editor or programmer's editor. Many HTML generator applications exist to ease this process and often require less intricate knowledge of the format details involved.
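The idea that HTML is plain text with markup added can be shown in a few lines. The following sketch turns blank-line-separated plain text into a minimal HTML page; the function name `text_to_html` and the choice of which elements to emit are assumptions for illustration, and real generators handle far more (headings, emphasis, images).

```python
from html import escape


def text_to_html(title, text):
    """Wrap plain-text paragraphs (separated by blank lines) in minimal HTML markup."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # escape() protects characters such as & and < that have meaning in HTML.
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        "<!DOCTYPE html>\n"
        '<html>\n<head>\n<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"{body}\n"
        "</body>\n</html>\n"
    )


page_html = text_to_html("Sample", "Tom & Jerry went out.\n\nThey came back.")
print(page_html)
```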
HTML on its own is not a particularly efficient format to store information in, requiring more storage space for a given work than many other formats. However, several e-book formats, including the Amazon Kindle, Open eBook, Compiled HTML, Mobipocket and EPUB formats, store each book chapter in HTML format, then compress the HTML data, images, metadata and style sheets into a single, significantly smaller file (EPUB, for example, uses a ZIP container).
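The packaging idea above, HTML chapters compressed into a single archive, can be sketched with Python's standard `zipfile` module. The chapter names and text are made up, and the result is a plain ZIP archive rather than a valid e-book: EPUB, for instance, additionally requires a `mimetype` entry and a `META-INF/container.xml` file, which this sketch omits.

```python
import io
import zipfile


def pack_chapters(chapters):
    """Pack {filename: html_text} into an in-memory ZIP archive and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, html_text in chapters.items():
            archive.writestr(name, html_text)
    return buffer.getvalue()


# Made-up, highly repetitive chapters so the effect of compression is visible.
chapters = {
    f"chapter{i}.html": "<html><body>"
    + "<p>It was a dark and stormy night.</p>" * 200
    + "</body></html>"
    for i in range(1, 4)
}

raw_size = sum(len(text.encode("utf-8")) for text in chapters.values())
packed = pack_chapters(chapters)
print(f"raw HTML: {raw_size} bytes, packed archive: {len(packed)} bytes")
```

Real book text is less repetitive than this sample, so the saving in practice is smaller, but markup-heavy HTML generally compresses well.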
HTML files encompass a wide range of standards, and displaying them correctly can be complicated. Additionally, many of the features supported, such as forms, are not relevant to e-books.
iBook (Apple)
The .ibooks format is created with the free iBooks Author ebook layout software from Apple Inc. This proprietary format is based on the EPUB standard, with some differences in the CSS tags used in an .ibooks file, which make it incompatible with the EPUB specification. The End-User Licensing Agreement (EULA) included with iBooks Author states that "If you want to charge a fee for a work that includes files in the .ibooks format generated using iBooks Author, you may only sell or distribute such work through Apple". In practice, "through Apple" typically means the Apple Books store. The EULA further states that "This restriction does not apply to the content of such works when distributed in a form that does not include files in the .ibooks format." Therefore, Apple has not included distribution restrictions in the iBooks Author EULA for .ibooks format ebooks created in iBooks Author that are made available for free, and it does not prevent authors from re-purposing the content in other ebook formats to be sold outside the iBookstore. The software currently supports import and export functionality for three formats: .ibooks, plain text and PDF. Versions 2.3 and later of iBooks Author support importing EPUB and exporting EPUB 3.0.
IEC 62448
IEC 62448 is an international standard created by the International Electrotechnical Commission (IEC), Technical Committee 100, Technical Area 10 (Multimedia e-publishing and e-book).
The current version of IEC 62448 is an umbrella standard that contains as appendices two concrete formats, XMDF of Sharp and BBeB of Sony. However, BBeB has been discontinued by Sony and the version of XMDF that is in the specification is out of date. The IEC TA10 group is discussing the next steps, and has invited the IDPF, the organization that standardized EPUB, to be a liaison. It is possible that the current version of EPUB and/or the forthcoming EPUB3 revision may be added to IEC 62448. Meanwhile, a number of Japanese companies have proposed that IEC standardize a new Japanese-centric file format that is expected to unify DotBook of Voyager Japan and XMDF of Sharp. This new format had not been publicly disclosed as of November 2010, but it is supposed to cover basic representations for the Japanese language. Technically speaking, this revision is supposed to provide a Japanese minimum set, a Japanese extension set, and a stylesheet language. These issues were discussed in the TC100 meeting held in October 2010, but no decisions were taken besides offering the liaison status to IDPF.
INF (IBM)
IBM created this e-book format and used it extensively for OS/2 and others of its operating systems. The INF files were often digital versions of printed books that came with some bundles of OS/2 and other products. Many newsletters and monthly publications (e.g., EDM/2) were available in the INF format too.
The advantage of INF is that it is very compact and very fast. It also supports images, reflowed text, tables and various list formats. INF files are generated by compiling markup text files, written in the Information Presentation Facility (IPF) format, into binary files.
Originally only IBM created an INF viewer and compiler, but later open source viewers like NewView, DocView and others appeared. There is also an open source IPF compiler named WIPFC, created by the Open Watcom project.
Kindle (Amazon)
With the release of the Kindle Fire reader in late 2011, Amazon.com also released Kindle Format 8, also known as .AZW3. The .azw3 file format supports a subset of HTML5 and CSS3 features, with some additional nonstandard features; the new data is stored within a container which can also be used to store a Mobi content document, allowing limited backwards compatibility.
Older Kindle e-readers use the proprietary AZW format. It is based on the Mobipocket standard, with a slightly different serial number scheme (it uses an asterisk instead of a dollar sign) and its own DRM formatting. It also lacks some Mobipocket features such as JavaScript. Publications in .prc format can be read directly on the Kindle.
Because the ebooks bought on the Kindle are delivered over its wireless system called Whispernet, the user does not see the AZW files during the download process. The Kindle format is available on a variety of platforms, such as through the Kindle app for the various mobile device platforms.
Microsoft LIT
DRM-protected LIT files are only readable in the proprietary Microsoft Reader program, as the .LIT format, otherwise similar to Microsoft's CHM format, includes Digital Rights Management features. Third-party readers, such as Lexcycle Stanza, can read unprotected LIT files.
The Microsoft Reader uses patented ClearType display technology. In Reader, navigation works with a keyboard, mouse, stylus, or through electronic bookmarks. The Catalog Library records reader books in a personalized "home page", and books are displayed with ClearType to improve readability. A user can add annotations and notes to any page, create large-print e-books with a single command, or create free-form drawings on the reader pages. A built-in dictionary allows the user to look up words.
In August 2011, Microsoft announced they were discontinuing both Microsoft Reader and the use of the .lit format for ebooks at the end of August 2012, and ending sales of the format on November 8, 2011.
Mobipocket
The Mobipocket e-book format is based on the Open eBook standard using XHTML and can include JavaScript and frames. It also supports native SQL queries to be used with embedded databases. There is a corresponding e-book reader.
The Mobipocket Reader has a home page library. Readers can add blank pages in any part of a book and add free-hand drawings. Annotations (highlights, bookmarks, corrections, notes, and drawings) can be applied, organized, and recalled from a single location. Images are converted to GIF format and have a maximum size of 64K, sufficient for mobile phones with small screens, but rather restrictive for newer gadgets. Mobipocket Reader has electronic bookmarks and a built-in dictionary.
The reader has a full screen mode for reading and support for many PDAs, Communicators, and Smartphones. Mobipocket products support most Windows, Symbian, BlackBerry and Palm operating systems, but not the Android platform. Using WINE, the reader works under Linux or Mac OS X. Third-party applications like Okular, Calibre, and FBReader can also be used under Linux or Mac OS X, but they work only with unencrypted files.
The Amazon Kindle can read unprotected .mobi files, as can Amazon's Kindle application for Windows and macOS. Amazon has also developed an .epub to .mobi converter called KindleGen, which supports the IDPF 1.0 and IDPF 2.0 EPUB formats.
Multimedia eBooks
A multimedia ebook is media and book content that utilizes a combination of different book content formats. The term can be used as a noun (a medium with multiple content formats) or as an adjective describing a medium as having multiple content formats.
The "multimedia ebook" term is used in contrast to media which only utilize traditional forms of printed or text books. Multimedia ebooks include a combination of text, audio, images, video, or interactive content formats. Much like how a traditional book can contain images to help the text tell a story, a multimedia ebook can contain other elements not formerly possible to help tell the story.
With the advent of more widespread tablet-like computers, such as the smartphone, some publishing houses, such as Penguin, are planning to make multimedia ebooks.
Newton Digital Book
Commonly known as a Newton Book, the format is officially referred to as a Newton Digital Book. A single Newton package file can contain multiple books (for example, the three books of a trilogy might be packaged together). Newton Books are created using Newton Press, or, for more advanced content, Newton Book Maker and Newton Toolkit.
All systems running the Newton operating system (the most common include the Newton MessagePads, eMates, Siemens Secretary Stations, Motorola Marcos, Digital Ocean Seahorses and Tarpons) have built-in support for viewing Newton books, through a system service known as Newton Book Reader. The Newton package format was released to the public by Newton, Inc. prior to that company's absorption into Apple Computer. The format is thus arguably open, and various people have written readers for it (writing a Newton book converter has even been assigned as a university-level class project).
Newton books have no support for DRM or encryption. They do support internal links, potentially multiple tables of contents and indexes, embedded grayscale images, and even some scripting capability using NewtonScript (for example, it's possible to make a book in which the reader can influence the outcome). Newton books utilize Unicode and are thus available in numerous languages. An individual Newton Book may actually contain multiple views representing the same content in different ways (such as for different screen resolutions).
Open Packaging Format
OPF is an XML-based e-book format created by E-Book Systems; it has been superseded by the EPUB electronic publication standard.
Portable Document Format
Invented by Adobe Systems and first released in 1993, PDF became ISO 32000 in 2008. The format was developed to provide a platform-independent means of exchanging fixed-layout documents. Derived from PostScript, but without language features like loops, PDF adds support for features such as compression, passwords, semantic structures and DRM. Because PDF documents can easily be viewed and printed by users on a variety of computer platforms, they are very common on the World Wide Web and in document management systems worldwide. The current PDF specification, ISO 32000-1:2008, is available from ISO's website and, under special arrangement, without charge from Adobe.
Because the format is designed to reproduce fixed-layout pages, re-flowing text to fit mobile device and e-book reader screens has traditionally been problematic. This limitation was addressed in 2001 with the release of PDF Reference 1.4 and "Tagged PDF", but third-party support for this feature was limited until the release of PDF/UA in 2012.
Many products support creating and reading PDF files, such as Adobe Acrobat, PDFCreator and LibreOffice, as do several programming libraries such as iText and FOP. Third-party viewers such as xpdf and Nitro PDF are also available. Mac OS X has built-in PDF support, both for creation as part of the printing system and for display using the built-in Preview application.
Older PDF files are supported by almost all modern e-book readers, tablets and smartphones. Newer PDF files may not display properly on older e-readers, may not open, or may crash them. However, PDF re-flow based on Tagged PDF, as opposed to re-flow based on the actual sequence of objects in the content stream, is not yet commonly supported on mobile devices. Such re-flow options as may exist are usually found under "view" options, and may be called "word-wrap".
Plain text files
The first e-books in history were in plain text (.txt) format, supplied for free by the Project Gutenberg community, but the format itself existed before the e-book era. The plain text format doesn't support digital rights management (DRM) or formatting options (such as different fonts, graphics or colors), but it has excellent portability: it is the simplest e-book encoding possible, as a plain text file contains only ASCII or Unicode text (text files with UTF-8 or UTF-16 encoding are also popular for languages other than English). Almost all operating systems can read ASCII text files (e.g. Unix, Macintosh, Microsoft Windows, DOS and other systems) and newer operating systems support Unicode text files as well. The only potential portability problems with ASCII text files are that operating systems differ in their preferred line-ending convention and in their interpretation of values outside the ASCII range (their character encoding). Converting files from one line-ending convention to another is easy with free software. DOS and Windows use CRLF, Unix and Apple's OS X use LF, and Mac OS up to and including OS 9 uses CR. By convention, lines are often broken to fit into 80 characters, a legacy of older terminals and consoles. Alternatively, each paragraph may be a single line.
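The line-ending conversion mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a reference tool: the function name `normalize_line_endings` is hypothetical, and the byte-level approach assumes ASCII or UTF-8 input (it would corrupt UTF-16 files, where CR and LF occupy two bytes each).

```python
def normalize_line_endings(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite CRLF, lone CR, and LF line endings to one convention.

    Assumes ASCII or UTF-8 input; the name and signature are illustrative only.
    """
    # Collapse CRLF first so its CR is not counted as a second line break.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    # Then expand every LF to the requested convention (LF, CRLF, or CR).
    return unified.replace(b"\n", newline)


if __name__ == "__main__":
    mixed = b"DOS line\r\nclassic Mac line\rUnix line\n"
    print(normalize_line_endings(mixed))             # all LF
    print(normalize_line_endings(mixed, b"\r\n"))    # all CRLF
```

Handling CRLF before lone CR is the one ordering that matters here; reversing the two replacements would turn each DOS line break into two.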
When Unicode is not in use, the size in bytes of a text file is simply the number of characters, including spaces, with each new line counting for 1 or 2 bytes. For example, the Bible, which is approximately 800,000 words, is about 4 MB.
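The Bible figure works out to roughly five bytes per word (4 MB divided by 800,000 words), spaces and line breaks included. The size rule itself can be checked with a short Python sketch; the helper name `text_size_bytes` is hypothetical, and the input is assumed to use `\n` for line breaks.

```python
def text_size_bytes(text: str, encoding: str = "ascii", newline: str = "\n") -> int:
    """Return the size in bytes of text saved with the given encoding and line ending.

    Illustrative helper: assumes line breaks in `text` are written as "\n".
    """
    # Apply the target line-ending convention, then count the encoded bytes.
    return len(text.replace("\n", newline).encode(encoding))


if __name__ == "__main__":
    sample = "In the beginning\nwas the Word\n"
    print(text_size_bytes(sample))                        # one byte per character
    print(text_size_bytes(sample, newline="\r\n"))        # each line break costs 2 bytes
    print(text_size_bytes(sample, encoding="utf-16-le"))  # two bytes per character here
```

With a Unicode encoding the one-byte-per-character rule no longer holds, which is why the sentence above restricts it to non-Unicode files.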
Plucker
Plucker is an open-source, free mobile and desktop e-book reader application with its own associated file format and software to automatically generate Plucker files from text, PDF, HTML, or other document format files, web sites or RSS feeds. The format is public and well-documented. Free readers are available for all kinds of desktop computers and many PDAs.
PostScript
PostScript is a page description language used in the electronic and desktop publishing areas for defining the contents and layout of a printed page, which can be used by a rendering program to assemble and create the actual output bitmap. Many office printers directly support interpreting PostScript and printing the result. As a result, the format also sees wide use in the Unix world.
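To give a feel for what a page description looks like, the following Python sketch writes out a tiny one-page PostScript program. The function name, default coordinates, and font choice are assumptions made for illustration; the operators used (`findfont`, `scalefont`, `setfont`, `moveto`, `show`, `showpage`) are standard PostScript.

```python
def postscript_page(text: str, x: int = 72, y: int = 720, size: int = 12) -> str:
    """Build a one-page PostScript program that draws a single line of text.

    Illustrative only: coordinates are in points, measured from the bottom-left corner.
    """
    # Backslashes and parentheses must be escaped inside a PostScript string literal.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    lines = [
        "%!PS",                                            # header comment identifying the file
        f"/Helvetica findfont {size} scalefont setfont",   # select and scale a font
        f"{x} {y} moveto",                                 # set the current point
        f"({escaped}) show",                               # paint the string
        "showpage",                                        # emit the finished page
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(postscript_page("Hello (world)"))
```

A rendering program or PostScript-capable printer interprets these instructions to produce the page bitmap; the Python code only assembles the text of the program.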
RTF
Rich Text Format is a document file format that is supported by many ebook readers. Its advantages as an ebook format are that it is widely supported, can be reflowed, and is easy to edit. It can also be easily converted to other ebook formats, increasing its support.
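Part of what makes RTF easy to edit and convert is that it is plain 7-bit ASCII markup, with other characters written as escape sequences. The Python sketch below builds a minimal document by hand; the helper names are hypothetical, and real-world files usually carry more header information (font tables and the like) than this bare `{\rtf1\ansi ...}` wrapper.

```python
import struct


def rtf_escape(text: str) -> str:
    """Escape plain text for an RTF body using only 7-bit ASCII output."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)        # characters that are RTF syntax themselves
        elif ch == "\n":
            out.append("\\par\n")        # paragraph break
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit decimal per UTF-16 code unit;
            # the trailing "?" is the fallback shown by readers without Unicode support.
            for (unit,) in struct.iter_unpack("<h", ch.encode("utf-16-le")):
                out.append(f"\\u{unit}?")
    return "".join(out)


def bold(text: str) -> str:
    """Wrap escaped text in a bold group."""
    return "{\\b " + rtf_escape(text) + "}"


def make_rtf(body_markup: str) -> str:
    """Wrap already-escaped body markup in a minimal RTF document."""
    return "{\\rtf1\\ansi " + body_markup + "}"


if __name__ == "__main__":
    print(make_rtf(rtf_escape("This is some ") + bold("bold") + rtf_escape(" text.")))
```

Because the result is ordinary text, it can be inspected or patched in any editor, which is the property the paragraph above relies on.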
SSReader
The digital book format used by 超星数字图书馆, a popular digital library company in China. It is a proprietary raster image compression and binding format, with plug-in modules that perform OCR at reading time. The company scanned a huge number of Chinese books in the China National Library, and these scans became the major stock of its service. The detailed format is not published. There are also some other commercial e-book formats used in Chinese digital libraries.
Text Encoding Initiative
TEI Lite is the most popular of the TEI-based (and thus XML-based or SGML-based) electronic text formats.
TomeRaider
The TomeRaider e-book format is a proprietary format. There are versions of TomeRaider for Windows, Windows Mobile (aka Pocket PC), Palm, Symbian and iPhone. Capabilities of the TomeRaider3 e-book reader vary considerably per platform: the Windows and Windows Mobile editions support full HTML and CSS. The Palm edition supports limited HTML (e.g., no tables, no fonts), and CSS support is missing. For Symbian there is only the older TomeRaider2 format, which does not render images or offer category search facilities. Despite these differences, any TomeRaider e-book can be browsed on all supported platforms. The TomeRaider website claims to have over 4000 e-books available, including free versions of the Internet Movie Database and Wikipedia.
Open XML Paper Specification
Open XML Paper Specification (also referred to as OpenXPS) is an open specification for a page description language and a fixed-document format. Microsoft developed it as the XML Paper Specification (XPS). In June 2009, Ecma International adopted it as international standard ECMA-388.
The format is intentionally restricted to sequences of:
Glyphs (a fixed run of text),
Paths (a geometry that can be filled, or stroked, by a brush), and
Brushes (a description of a shaped brush used in rendering paths).
This reduces the possibility of inadvertent introduction of malicious content and simplifies the implementation of compatible renderers.
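The benefit of such a restricted vocabulary can be illustrated with a whitelist check: a consumer can reject anything outside the permitted element kinds before rendering. The sketch below is a simplified model, not the ECMA-388 schema; the tag names `Page`, `Glyphs`, `Path` and `Brush` and the function name are assumptions chosen to mirror the three categories above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified element names; the real OpenXPS schema is richer.
ALLOWED_TAGS = {"Page", "Glyphs", "Path", "Brush"}


def disallowed_elements(markup: str) -> list:
    """Return the sorted names of elements that fall outside the allowed set."""
    root = ET.fromstring(markup)
    # iter() walks the root and every descendant element.
    return sorted({el.tag for el in root.iter() if el.tag not in ALLOWED_TAGS})


if __name__ == "__main__":
    ok = "<Page><Glyphs/><Path><Brush/></Path></Page>"
    bad = "<Page><Script/><Glyphs/></Page>"
    print(disallowed_elements(ok))   # nothing to report
    print(disallowed_elements(bad))  # the unexpected element is flagged
```

With only a handful of element kinds to support, both this kind of validation and the renderer itself stay small, which is the simplification the paragraph above describes.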
Comparison tables
Features
Supporting platforms
See also
* Comparison of e-book readers
* Comparison of Android e-book reader software – includes software e-book readers for Android devices
* Comparison of iOS e-book reader software – includes software e-book readers for iOS devices
External links
ebook reader articles at Mobile Read Wiki
Daisy 3: A Standard for Accessible Multimedia Books
An E-Book Buyer's Guide to Privacy