E-text (from "electronic text"; sometimes written as etext) is a general term for any document that is read in digital form, and especially a document that is mainly text. For example, a computer-based book of art with minimal text, or a set of photographs or scans of pages, would not usually be called an "e-text". An e-text may be a binary or a plain text file, viewed with any open source or proprietary software. An e-text may have markup or other formatting information, or not. An e-text may be an electronic edition of a work originally composed or published in other media, or may be created in electronic form originally. The term is usually synonymous with e-book.


E-text origins

E-texts, or electronic documents, have been around since long before the Internet, the Web, and specialized e-book reading hardware. Roberto Busa began developing an electronic edition of Aquinas in the 1940s, while large-scale electronic text editing, hypertext, and online reading platforms such as Augment and FRESS appeared in the 1960s. These early systems made extensive use of formatting, markup, automatic tables of contents, hyperlinks, and other information in their texts, and in some cases (such as FRESS) supported not just text but also graphics.


"Just plain text"

In some communities, "e-text" is used much more narrowly, to refer to electronic documents that are, so to speak, "plain vanilla ASCII". By this is meant not only that the document is a plain text file, but that it has no information beyond "the text itself": no representation of bold or italics, or of paragraph, page, chapter, or footnote boundaries, etc. Michael S. Hart, for example, argued that this "is the only text mode that is easy on both the eyes and the computer". Hart made the correct point that proprietary word-processor formats made texts grossly inaccessible; but that is irrelevant to standard, open data formats. The narrow sense of "e-text" is now uncommon, because the notion of "just vanilla ASCII", attractive at first glance, has turned out to have serious difficulties.

First, this narrow type of "e-text" is limited to the letters of the English alphabet. Not even Spanish ñ or the accented vowels used in many European languages can be represented (unless awkwardly and ambiguously as "~n" "a'"). Asian, Slavic, Greek, and other writing systems are impossible.

Second, diagrams and pictures cannot be accommodated, and many books have at least some such material; often it is essential to the book.

Third, "e-texts" in this narrow sense have no reliable way to distinguish "the text" from other things that occur in a work. For example, page numbers, page headers, and footnotes might be omitted, or might simply appear as additional lines of text, perhaps with blank lines before and after (or not). An ornate separator line might be represented instead by a line of asterisks (or not). Chapter and section titles, likewise, are just additional lines of text: they might be detectable by capitalization if they were all caps in the original (or not). Even discovering what conventions (if any) were used makes each book a new research or reverse-engineering project. In consequence, such texts cannot be reliably re-formatted. A program cannot reliably tell where footnotes, headers, or footers are, or perhaps even paragraphs, so it cannot re-arrange the text, for example to fit a narrower screen, or read it aloud for the visually impaired. Programs might apply heuristics to guess at the structure, but this can easily fail.

Fourth, and perhaps surprisingly important, a "plain-text" e-text affords no way to represent information about the work. For example, is it the first or the tenth edition? Who prepared it, and what rights do they reserve or grant to others? Is this the raw version straight off a scanner, or has it been proofread and corrected? Metadata relating to the text is sometimes included with an e-text, but there is by this definition no way to say whether or where it is present. At best, the text of the title page might be included (or not), perhaps with centering imitated by indentation.

Fifth, texts with more complicated information cannot really be handled at all: for example, a bilingual edition, a critical edition with footnotes, commentary, critical apparatus, and cross-references, or even the simplest tables. This leads to endless practical problems: for example, if the computer cannot reliably distinguish footnotes, it cannot find a phrase that a footnote interrupts.

Even raw scanner OCR output usually produces more information than this, such as the use of bold and italic. If this information is not kept, it is expensive and time-consuming to reconstruct; more sophisticated information, such as what edition you have, may not be recoverable at all.

In actuality, even "plain text" uses some kind of "markup", usually control characters, spaces, tabs, and the like: spaces between words, or two returns and five spaces to mark a paragraph. The main difference from more formal markup is that "plain texts" use implicit, usually undocumented conventions, which are therefore inconsistent and difficult to recognize.

The narrow sense of e-text as "plain vanilla ASCII" has fallen out of favor. Nevertheless, many such texts are freely available on the Web, perhaps as much because they are easily produced as because of any purported portability advantage. For many years Project Gutenberg strongly favored this model of text, but with time it has begun to develop and distribute more capable forms such as HTML.
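
Two of the difficulties described in this section, the limited character repertoire and the fragility of structure-guessing heuristics, can be illustrated with a minimal Python sketch. The function names, the all-caps heading rule, and the sample text are hypothetical and chosen only for illustration; they are not taken from any real e-text project or tool.

```python
def is_vanilla_ascii(text: str) -> bool:
    """Return True if every character fits in 7-bit ASCII."""
    return text.isascii()


def guess_headings(text: str) -> list:
    """Guess chapter titles: all-caps lines set off by blank lines.

    Hypothetical heuristic, for illustration only.
    """
    lines = text.split("\n")
    headings = []
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip blank lines and lines without letters (e.g. "* * *").
        if not stripped or not any(c.isalpha() for c in stripped):
            continue
        before_blank = i == 0 or not lines[i - 1].strip()
        after_blank = i == len(lines) - 1 or not lines[i + 1].strip()
        if stripped == stripped.upper() and before_blank and after_blank:
            headings.append(stripped)
    return headings


# Invented sample of a "plain vanilla" text.
sample = "\n".join([
    "CHAPTER I",
    "",
    "It was a dark night.",
    "",
    "THE COLLECTED TALES 12",  # running page header, not a chapter title
    "",
    "The rain fell steadily.",
    "",
    "Chapter II",              # real chapter title, but not in capitals
    "",
    "Morning came at last.",
])

headings = guess_headings(sample)
print(headings)

# "España" needs a non-ASCII letter; the workaround "Espa~na" does not.
print(is_vanilla_ascii("España"), is_vanilla_ascii("Espa~na"))
```

On this sample the heuristic treats the running page header as a heading and misses "Chapter II", because nothing in the file records which convention, if any, was used.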


See also

* Text file
* E-book
* Electronic paper
* Digital library
* Online Books Page
* Distributed Proofreaders
* L'Association des Bibliophiles Universels


External links


* Scholarly Electronic Publishing Bibliography