Comparison of e-book formats

The following is a comparison of e-book formats used to create and publish e-books. The EPUB format is the most widely supported e-book format, supported by most e-book readers including Amazon Kindle devices. Most e-book readers also support the PDF and plain text formats. E-book software, like the cross-platform Calibre, can be used to convert e-books from one format to another, as well as to create, edit and publish e-books.
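As a minimal sketch of such a conversion, the snippet below builds (and, if Calibre is present, runs) a command line for Calibre's `ebook-convert` tool, which takes an input file and an output file and infers both formats from their extensions. It assumes Calibre is installed and `ebook-convert` is on the PATH; the helper names and file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: str, target: str) -> list[str]:
    """Return the argument list for Calibre's ebook-convert CLI."""
    if Path(source).suffix.lower() == Path(target).suffix.lower():
        raise ValueError("source and target use the same format")
    return ["ebook-convert", source, target]


def convert(source: str, target: str) -> bool:
    """Run the conversion; return False if ebook-convert is not installed."""
    if shutil.which("ebook-convert") is None:
        return False
    subprocess.run(build_convert_command(source, target), check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names, shown only to illustrate the command shape.
    print(build_convert_command("book.epub", "book.pdf"))
```

Separating command construction from execution keeps the sketch testable on machines without Calibre.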


Format descriptions

Formats available include, but are not limited to:


Broadband eBooks (BBeB)

BBeB is the digital book format originally used by Sony Corporation. It is a proprietary format, but some reader software for general-purpose computers, particularly under Linux (for example, Calibre's internal viewer), can read it. The LRX file extension represents a DRM-encrypted e-book. More recently, Sony has converted its books from BBeB to EPUB and is now issuing new titles in EPUB.


Comic Book Archive file


Compiled HTML

CHM format is a proprietary format based on HTML. Multiple pages and embedded graphics are distributed along with metadata as a single compressed file. The indexing is both for keywords and for full text search.


DAISY – ANSI/NISO Z39.86

The Digital Accessible Information SYstem (DAISY) is an XML-based open standard published by the National Information Standards Organization (NISO) and maintained by the DAISY Consortium for people with print disabilities. DAISY has wide international support with features for multimedia, navigation and synchronization. A subset of the DAISY format has been adopted by law in the United States as the National Instructional Material Accessibility Standard (NIMAS), and K-12 textbooks and instructional materials are now required to be provided to students with disabilities. DAISY is already aligned with the EPUB technical standard, and is expected to fully converge with its forthcoming EPUB3 revision.


DjVu

DjVu is a format specialized for storing scanned documents. It includes advanced compressors optimized for low-color images, such as text documents. Individual files may contain one or more pages. DjVu files cannot be re-flowed. The contained page images are divided into separate layers (such as a multi-color, low-resolution background layer using lossy compression, and a few-colors, high-resolution, tightly compressed foreground layer), each compressed with the best available method. The format is designed to decompress very quickly, even faster than vector-based formats. The advantage of DjVu is that it is possible to take a high-resolution scan (300–400 DPI), good enough for both on-screen reading and printing, and store it very efficiently. Provided the images are reasonably clean and the most aggressive compression settings are used, a couple hundred 600-DPI black-and-white text scans can be stored in less than a megabyte.
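To put the last figure in perspective, the following back-of-the-envelope sketch computes the uncompressed size of the same scans. It assumes US Letter pages (8.5 × 11 inches), one bit per pixel, and 200 pages for "a couple hundred"; none of these values come from the DjVu specification.

```python
def raw_bitonal_bytes(width_in: float, height_in: float, dpi: int) -> int:
    """Bytes needed for an uncompressed 1-bit-per-pixel scan of one page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px // 8


PAGES = 200  # assumed reading of "a couple hundred"
per_page = raw_bitonal_bytes(8.5, 11, 600)  # assumed US Letter page size
total = per_page * PAGES

# Compare against a 1 MB file, the upper bound quoted in the text.
ratio = total / 1_000_000
print(f"{per_page:,} bytes per page, {total:,} bytes in total, about {ratio:.0f}:1")
```

Under these assumptions each raw page is about 4.2 MB, so fitting the whole set in under a megabyte implies a compression ratio on the order of several hundred to one.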


DOC

DOC is a document file format that is directly supported by few e-book readers. Its advantages as an e-book format are that it can be easily converted to other e-book formats and that it can be reflowed. It can be easily edited using Microsoft software and any of several other programs. Note that the format has changed several times since its original release, and there are numerous incompatibilities between the various releases and the assorted programs that attempt to read or write the format.


DOCX

DOCX is a document file format that is directly supported by few e-book readers. Its advantages as an e-book format are that it can be easily converted to other e-book formats and it can be reflowed. It can be easily edited.


EPUB

The EPUB (formerly OEBPS) format is a technical standard for e-books created by the International Digital Publishing Forum (IDPF). It has become the most popular vendor-independent XML-based e-book format. The format can be read by Amazon Kindle, Kobo eReader devices, BlackBerry devices, Apple's Apple Books app running on Macintosh computers and iOS/iPadOS devices, the Google Play Books app running on Android and iOS/iPadOS devices, Barnes & Noble Nook, Sony Reader, BeBook, Bookeen Cybook Gen3 (with firmware v2 and up), Adobe Digital Editions, Lexcycle Stanza, FBReader, PocketBook eReader, Aldiko, the Mozilla Firefox add-on EPUBReader, Lucifox, Okular and other reading apps. Adobe Digital Editions uses the .epub format for its e-books, with digital rights management (DRM) protection provided through its proprietary ADEPT mechanism. The ADEPT framework and scripts have been reverse-engineered to circumvent this DRM system.


eReader

eReader is a freeware program for viewing Palm Digital Media electronic books which use the pdb format used by many Palm applications. Versions are available for Android, BlackBerry, iOS, Palm OS (not webOS), Symbian, Windows Mobile Pocket PC/Smartphone, and macOS. The reader shows text one page at a time, as paper books do. eReader supports embedded hyperlinks and images. Additionally, the Stanza application for the iPhone and iPod Touch can read both encrypted and unencrypted eReader files. The program supports features like bookmarks and footnotes, enabling the user to mark any page with a bookmark and any part of the text with a footnote-like commentary. Footnotes can later be exported as a Memo document. On July 20, 2009, Barnes & Noble made an announcement implying that eReader would be the company's preferred format to deliver e-books. Exactly three months later, in a press release by Adobe, it was revealed Barnes & Noble would be joining forces with the software company to standardize the EPUB and PDF e-book formats. Barnes & Noble e-books are now sold mostly in EPUB format.


FictionBook (fb2)

FictionBook is an XML-based e-book format, supported by free readers such as PocketBook eReader, FBReader, Okular, CoolReader, BeBook and STDU Viewer. The FictionBook format does not specify the appearance of a document; instead, it describes its structure and semantics. All e-book metadata, such as the author name, title, and publisher, is also present in the file. Hence the format is convenient for automatic processing, indexing, and e-book collection management. This is also convenient for book storage for later automatic conversion into other formats.
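Because the metadata lives in the XML itself, it can be pulled out with a standard XML parser. The sketch below reads the title, author and publisher from a small hand-written sample using Python's `xml.etree.ElementTree`. The sample document and the `read_metadata` helper are hypothetical, and the element names and namespace follow the commonly used FictionBook 2.0 layout; real files contain many more fields.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used by FictionBook 2.0 files.
FB2_NS = {"fb": "http://www.gribuser.ru/xml/fictionbook/2.0"}

# Hypothetical, heavily trimmed sample document.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <description>
    <title-info>
      <author><first-name>Jane</first-name><last-name>Doe</last-name></author>
      <book-title>An Example Book</book-title>
    </title-info>
    <publish-info><publisher>Example Press</publisher></publish-info>
  </description>
  <body><section><p>Text of the book.</p></section></body>
</FictionBook>"""


def read_metadata(fb2_text: str) -> dict:
    """Extract a few descriptive fields from FictionBook XML text."""
    root = ET.fromstring(fb2_text.encode("utf-8"))
    info = "fb:description/fb:title-info/"

    def text(path: str) -> str:
        return root.findtext(path, default="", namespaces=FB2_NS)

    first = text(info + "fb:author/fb:first-name")
    last = text(info + "fb:author/fb:last-name")
    return {
        "title": text(info + "fb:book-title"),
        "author": f"{first} {last}".strip(),
        "publisher": text("fb:description/fb:publish-info/fb:publisher"),
    }


print(read_metadata(SAMPLE))
```

The same lookup could feed a catalogue or a converter, which is the kind of automatic processing the format is designed to make easy.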


Founder Electronics

APABI is a format devised by Founder Electronics. It is a popular format for Chinese e-books. It can be read using the Apabi Reader software, and produced using Apabi Publisher. Both .xeb and .ceb files are encoded binary files. The iLiad e-book device includes an Apabi viewer.


Hypertext Markup Language

HTML is the markup language used for most web pages. E-books using HTML can be read using a web browser. The specifications for the format are available without charge from the W3C. HTML adds specially marked meta-elements to otherwise plain text encoded using character sets like ASCII or UTF-8. As such, suitably formatted files can be, and sometimes are, generated by hand using a plain text editor or programmer's editor. Many HTML generator applications exist to ease this process and often require less intricate knowledge of the format details involved. HTML on its own is not a particularly efficient format to store information in, requiring more storage space for a given work than many other formats. However, several e-book formats, including Amazon Kindle, Open eBook, Compiled HTML, Mobipocket and EPUB, store each book chapter in HTML format, then compress the HTML data, images, metadata and style sheets into a single, significantly smaller file (EPUB, for instance, uses a ZIP container). HTML files encompass a wide range of standards, and displaying HTML files correctly can be complicated. Additionally, many of the features supported, such as forms, are not relevant to e-books.
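A rough illustration of the bundling idea, using ZIP packaging of the kind EPUB relies on: the sketch below packs a few HTML chapters into one in-memory archive and compares sizes. The chapter contents and the `bundle` helper are hypothetical, and the result is not a valid EPUB, which would also need its `mimetype`, container and package files.

```python
import io
import zipfile

# Hypothetical chapters; repetitive markup and text compress well.
chapters = {
    f"chapter{n}.html": (
        f"<html><body><h1>Chapter {n}</h1>"
        + "<p>It was a dark and stormy night.</p>" * 200
        + "</body></html>"
    )
    for n in range(1, 4)
}


def bundle(files: dict[str, str]) -> bytes:
    """Pack the given name -> HTML mapping into a single deflated ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, html in files.items():
            archive.writestr(name, html)
    return buffer.getvalue()


raw_size = sum(len(html.encode("utf-8")) for html in chapters.values())
packed = bundle(chapters)
print(f"raw HTML: {raw_size} bytes, single archive: {len(packed)} bytes")
```

Real books compress less dramatically than this artificial sample, but the principle is the same: one container file that holds every chapter and is smaller than the loose HTML.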


iBook (Apple)

The .ibooks format is created with the free iBooks Author e-book layout software from Apple Inc. This proprietary format is based on the EPUB standard, with some differences in the CSS tags used in an .ibooks file that make it incompatible with the EPUB specification. The End-User Licensing Agreement (EULA) included with iBooks Author states that "If you want to charge a fee for a work that includes files in the .ibooks format generated using iBooks Author, you may only sell or distribute such work through Apple". "Through Apple" will typically mean the Apple Books store. The EULA further states that "This restriction does not apply to the content of such works when distributed in a form that does not include files in the .ibooks format." Therefore, Apple has not included distribution restrictions in the iBooks Author EULA for .ibooks format e-books created in iBooks Author that are made available for free, and it does not prevent authors from re-purposing the content in other e-book formats to be sold outside the iBookstore. The software supports import and export functionality for three formats: .ibooks, plain text and PDF. Versions 2.3 and later of iBooks Author support importing EPUB and exporting EPUB 3.0.


IEC 62448

IEC 62448 is an international standard created by the International Electrotechnical Commission (IEC), Technical Committee 100, Technical Area 10 (Multimedia e-publishing and e-book). The current version of IEC 62448 is an umbrella standard that contains as appendices two concrete formats, XMDF of Sharp and BBeB of Sony. However, BBeB has been discontinued by Sony and the version of XMDF that is in the specification is out of date. The IEC TA10 group is discussing the next steps, and has invited the IDPF, the organization that standardized EPUB, to be a liaison. It is possible that the current version of EPUB or the EPUB3 revision may be added to IEC 62448. Meanwhile, a number of Japanese companies have proposed that IEC standardize a new Japanese-centric file format that is expected to unify DotBook of Voyager Japan and XMDF of Sharp. This new format had not been publicly disclosed as of November 2010, but it is supposed to cover basic representations for the Japanese language. Technically speaking, the revision is supposed to provide a Japanese minimum set, a Japanese extension set, and a stylesheet language. These issues were discussed in the TC100 meeting held in October 2010, but no decisions were taken besides offering the liaison status to IDPF.


INF (IBM)

IBM created this e-book format and used it extensively for OS/2 and its other operating systems. The INF files were often digital versions of printed books that came with some bundles of OS/2 and other products. Many newsletters and monthly publications (e.g. EDM/2) were available in the INF format too. The advantage of INF is that it is very compact and very fast. It also supports images, reflowed text, tables and various list formats. INF files are generated by compiling markup text files, written in the Information Presentation Facility (IPF) format, into binary files. Originally only IBM provided an INF viewer and compiler, but open source viewers like NewView, DocView and others appeared later. There is also an open source IPF compiler named WIPFC, created by the Open Watcom project.


Kindle (Amazon)

With the release of the Kindle Fire reader in late 2011, Amazon.com also released Kindle Format 8, also known as .azw3. The .azw3 file format supports a subset of HTML5 and CSS3 features, with some additional nonstandard features; the new data is stored within a container which can also be used to store a Mobi content document, allowing limited backwards compatibility. Older Kindle e-readers use the proprietary AZW format. It is based on the Mobipocket standard, with a slightly different serial number scheme (it uses an asterisk instead of a dollar sign) and its own DRM formatting. It also lacks some Mobipocket features such as JavaScript. ''.prc'' publications can be read directly on the Kindle. Because e-books bought on the Kindle are delivered over its wireless system called Whispernet, the user does not see the AZW files during the download process. The Kindle format is available on a variety of platforms, such as through the Kindle app for the various mobile device platforms.


Microsoft LIT

DRM-protected LIT files are only readable in the proprietary Microsoft Reader program, as the .LIT format, otherwise similar to Microsoft's CHM format, includes DRM features. Other third-party readers, such as Lexcycle Stanza, can read unprotected LIT files. The Microsoft Reader uses patented ClearType display technology. In Reader, navigation works with a keyboard, mouse, stylus, or through electronic bookmarks. The Catalog Library records reader books in a personalized home page, and books are displayed with ClearType to improve readability. A user can add annotations and notes to any page, create large-print e-books with a single command, or create free-form drawings on the reader pages. A built-in dictionary allows the user to look up words. In August 2011, Microsoft announced they were discontinuing both Microsoft Reader and the use of the .lit format for e-books at the end of August 2012, and ending sales of the format on November 8, 2011.


Mobipocket

The Mobipocket e-book format is based on the Open eBook standard using XHTML and can include JavaScript and frames. It also supports native SQL queries to be used with embedded databases. There is a corresponding e-book reader. The Mobipocket Reader has a home page library. Readers can add blank pages in any part of a book and add free-hand drawings. Annotations (highlights, bookmarks, corrections, notes, and drawings) can be applied, organized, and recalled from a single location. Images are converted to GIF format and have a maximum size of 64K, sufficient for mobile phones with small screens, but rather restrictive for newer gadgets. Mobipocket Reader has electronic bookmarks and a built-in dictionary. The reader has a full screen mode for reading and support for many PDAs, communicators, and smartphones. Mobipocket products support most Windows, Symbian, BlackBerry and Palm operating systems, but not the Android platform. Using WINE, the reader works under Linux or Mac OS X. Third-party applications like Okular, Calibre, and FBReader can also be used under Linux or Mac OS X, but they work only with unencrypted files. The Amazon Kindle can read unprotected .mobi files, as can Amazon's Kindle application for Windows and macOS. Amazon has also developed an .epub to .mobi converter called KindleGen, which supports the IDPF 1.0 and IDPF 2.0 EPUB formats.


Multimedia e-books

A multimedia e-book is media and book content that utilizes a combination of different book content formats. The term can be used as a noun (a medium with multiple content formats) or as an adjective describing a medium as having multiple content formats. The term ''multimedia e-book'' is used in contrast to media which only utilize traditional forms of printed or text books. Multimedia e-books include a combination of text, audio, images, video, or interactive content formats. Much like how a traditional book can contain images to help the text tell a story, a multimedia e-book can contain other elements not formerly possible to help tell the story. With the advent of more widespread tablet-like computers, such as the smartphone, some publishing houses, such as Penguin, were planning to make multimedia e-books.


Newton Digital Book

The format is commonly known as a Newton Book, but officially referred to as a Newton Digital Book. A single Newton package file can contain multiple books (for example, the three books of a trilogy might be packaged together). Newton Books are created using Newton Press, or, for more advanced content, Newton Book Maker and Newton Toolkit. All systems running the Newton operating system (the most common include the Newton MessagePads, eMates, Siemens Secretary Stations, Motorola Marcos, Digital Ocean Seahorses and Tarpons) have built-in support for viewing Newton books, through a system service known as Newton Book Reader. The Newton package format was released to the public by Newton, Inc. prior to that company's absorption into Apple Computer. The format is thus arguably open and various people have written readers for it (writing a Newton book converter has even been assigned as a university-level class project). Newton books have no support for DRM or encryption. They do support internal links, potentially multiple tables of contents and indexes, embedded grayscale images, and even some scripting capability using NewtonScript (for example, it is possible to make a book in which the reader can influence the outcome). Newton books utilize Unicode and are thus available in numerous languages. An individual Newton Book may actually contain multiple views representing the same content in different ways (such as for different screen resolutions).


Open Packaging Format

OPF is an XML-based e-book format created by E-Book Systems; it has been superseded by the EPUB electronic publication standard.


Portable Document Format

Invented by Adobe Systems and first released in 1993, PDF became ISO 32000 in 2008. The format was developed to provide a platform-independent means of exchanging fixed-layout documents. Derived from PostScript, but without language features like loops, PDF adds support for features such as compression, passwords, semantic structures and DRM. Because PDF documents can easily be viewed and printed by users on a variety of computer platforms, they are very common on the Internet
and in document management systems worldwide. The current PDF specification, ISO 32000-1:2008, is available from ISO's website and, under special arrangement, without charge from Adobe. Because the format is designed to reproduce fixed-layout pages, reflowing text to fit mobile device and e-book reader screens has traditionally been problematic. This limitation was addressed in 2001 with the release of PDF Reference 1.4 and Tagged PDF, but third-party support for this feature was limited until the release of PDF/UA in 2012. Many products support creating and reading PDF files, such as Adobe Acrobat, PDFCreator and
LibreOffice, as do several programming libraries such as iText and FOP. Third-party viewers such as xpdf and Nitro PDF are also available. Mac OS X has built-in PDF support, both for creation as part of the printing system and for display using the built-in Preview application. Older PDF files are supported by almost all modern e-book readers, tablets and smartphones. Newer PDF files may not display properly on older e-readers, may not open, or may crash them. However, PDF reflow based on Tagged PDF, as opposed to reflow based on the actual sequence of objects in the content stream, is not yet commonly supported on mobile devices. Such reflow options as may exist are usually found under "view" options, and may be called "word-wrap".


Plain text

The first e-books were in plain text (.txt) format, supplied for free by the Project Gutenberg community, but the format itself existed before e-books. The plain text format does not support DRM or formatting options (such as different fonts, graphics or colors). It has excellent portability as it is the simplest e-book encoding possible; a plain text file contains only ASCII or Unicode text (text files with UTF-8 or UTF-16 encoding are also popular for languages other than English). Almost all operating systems can read ASCII text files (e.g. Unix, Macintosh, Microsoft Windows, DOS and other systems), and newer operating systems support Unicode text files as well. The only potential portability problems with ASCII text files are that operating systems differ in their preferred line-ending convention and in their interpretation of values outside the ASCII range (their character encoding). Converting files from one line-ending convention to another is easy with free software. DOS and Windows use CRLF, Unix and Apple's macOS use LF, and Mac OS up to and including OS 9 uses CR. By convention, lines are often broken to fit into 80 characters, a legacy of older terminals and consoles. Alternatively, each paragraph may be a single line. When Unicode is not in use, the size in bytes of a text file is simply the number of characters, including spaces, with each line break counting as 1 or 2 bytes. For example, the Bible, which is approximately 800,000 words, is about 4 MB.
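The line-ending conversion and the size rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference tool: the function names are hypothetical, and the default of four letters per word in the estimate is an assumption chosen for illustration rather than a figure from the text.

```python
def normalize_line_endings(data: bytes, newline: bytes = b"\n") -> bytes:
    """Rewrite CRLF, lone CR and lone LF line endings to one convention."""
    # Collapse CRLF first so the lone-CR pass does not double the line breaks.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


def ascii_size_in_bytes(text: str, newline: bytes = b"\n") -> int:
    """Size of a pure-ASCII text once saved with the given line ending."""
    # One byte per character; encode() raises UnicodeEncodeError on non-ASCII input.
    encoded = text.encode("ascii")
    return len(normalize_line_endings(encoded, newline))


def estimate_plain_text_bytes(words: int, avg_letters_per_word: float = 4.0) -> int:
    """Rough estimate: the letters of each word plus one separator character.

    The default average word length is an assumption for illustration only.
    """
    return round(words * (avg_letters_per_word + 1))
```

Under these assumptions, the two-line text "one\ntwo\n" occupies 8 bytes with LF endings and 10 bytes with CRLF endings, and `estimate_plain_text_bytes(800_000)` gives 4,000,000 bytes, in line with the figure of about 4 MB quoted above.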


Plucker

Plucker is a free and open-source mobile and desktop e-book reader application with its own associated file format and software to automatically generate Plucker files from text, PDF, HTML, or other document format files, web sites or RSS feeds. The format is public and well-documented. Free readers are available for all kinds of desktop computers and many PDAs.


PostScript

PostScript is a page description language used in the electronic and desktop publishing areas for defining the contents and layout of a printed page, which can be used by a rendering program to assemble and create the actual output bitmap. Many office printers directly support interpreting PostScript and printing the result. As a result, the format also sees wide use in the Unix world.


RTF

Rich Text Format is a document file format that is supported by many e-book readers. Its advantages as an e-book format are that it is widely supported, can be reflowed, and is easy to edit. It can also be easily converted to other e-book formats, which further increases its support.


SSReader

SSReader is the digital book format used by the digital library company Chaoxing Digital Library in China. It is a proprietary raster image compression and binding format, with read-time OCR plug-in modules. The company scanned a large number of Chinese books from the National Library of China, which became the major stock of their service. The detailed format is not published. There are also other commercial e-book formats used in Chinese digital libraries.


Text Encoding Initiative

TEI Lite is the most popular of the TEI-based (and thus XML-based or SGML-based) electronic text formats.


TomeRaider

The TomeRaider e-book format is a proprietary format. There are versions of the format for Windows, Windows Mobile (aka Pocket PC), Palm, Symbian and iPhone. Capabilities of the TomeRaider3 e-book reader vary considerably per platform: the Windows and Windows Mobile editions support full HTML and CSS. The Palm edition supports limited HTML (e.g. no tables or fonts), and CSS support is missing. For Symbian there is only the older TomeRaider2 format, which does not render images or offer category search facilities. Despite these differences, any TomeRaider e-book can be browsed on all supported platforms. The TomeRaider website claims to have over 4000 e-books available, including free versions of the Internet Movie Database (IMDb) and Wikipedia.


Open XML Paper Specification

Open XML Paper Specification (also referred to as OpenXPS) is an open specification for a page description language and a fixed-document format. Microsoft developed it as the XML Paper Specification (XPS). In June 2009, Ecma International adopted it as international standard ECMA-388. The format is intentionally restricted to sequences of glyphs (fixed runs of text), paths (geometry that can be filled, or stroked, by a brush), and brushes (descriptions of shaped brushes used to render paths). This reduces the possibility of inadvertent introduction of malicious content and simplifies the implementation of compatible renderers.


Comparison


Features


Supporting platforms


See also

* Comparison of e-readers
* Comparison of Android e-reader software – includes software e-book readers for Android devices
* Comparison of iOS e-reader software – includes software e-book readers for iOS devices


Notes and references


Notes


References


Further reading

* Cope, B., & Mason, D. (2002). Markets for electronic book products. C-2-C series, bk. 3.2. Altona, Vic: Common Ground Pub.
* Hanttula, D. (2001). ''Pocket PC Handbook''.


External links


* Ebook reader articles at Mobile Read Wiki
* Daisy 3: A Standard for Accessible Multimedia Books (archive link)
* An E-Book Buyer's Guide to Privacy at the Electronic Frontier Foundation (EFF)