In computing, plain text is a loose term for data (e.g. file contents) that represent only characters of readable material but not its graphical representation nor other objects (floating-point numbers, images, etc.). It may also include a limited number of "whitespace" characters that affect simple arrangement of text, such as spaces, line breaks, or tabulation characters (although tab characters can "mean" many different things, so are hardly "plain"). Plain text is different from formatted text, where style information is included; from structured text, where structural parts of the document such as paragraphs, sections, and the like are identified; and from binary files, in which some portions must be interpreted as binary objects (encoded integers, real numbers, images, etc.).

The term is sometimes used quite loosely, to mean files that contain only "readable" content (or simply files free of whatever the speaker considers unwanted). For example, that could exclude any indication of fonts or layout (such as markup, markdown, or even tabs); characters such as curly quotes, non-breaking spaces, soft hyphens, em dashes, and/or ligatures; or other things.

In principle, plain text can be in any encoding, but occasionally the term is taken to imply ASCII. As Unicode-based encodings such as UTF-8 and UTF-16 become more common, that usage may be shrinking.

Plain text is also sometimes used only to exclude "binary" files: those in which at least some parts of the file cannot be correctly interpreted via the character encoding in effect. For example, a file or string consisting of "hello" (in whatever encoding), followed by 4 bytes that express a binary integer rather than characters, is a binary file, not plain text by even the loosest common usages. Put another way, translating a plain text file to a character encoding that uses entirely different numbers to represent characters does not change the meaning (so long as you know what encoding is in use), but for binary files such a conversion does change the meaning of at least some parts of the file.
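The "hello" example can be sketched in Python. This is an illustration only, under stated assumptions: the encodings (UTF-8, UTF-16 little-endian, Latin-1), the integer value, and the helper name `transcode` are all hypothetical choices made for the sketch, not part of any definition of plain text.

```python
import struct

text = "hello"

# Plain text: transcoding changes the bytes, not the meaning.
utf8_bytes = text.encode("utf-8")
utf16_bytes = utf8_bytes.decode("utf-8").encode("utf-16-le")
same_meaning = utf16_bytes.decode("utf-16-le") == text


def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Hypothetical helper: treat data as text in src, re-encode it in dst."""
    return data.decode(src).encode(dst)


# Binary file: "hello" followed by a 4-byte big-endian unsigned integer.
number = 1_000_000  # arbitrary value chosen for the illustration
blob = text.encode("utf-8") + struct.pack(">I", number)

# Latin-1 maps every byte to a character, so the conversion never fails,
# but the four integer bytes are rewritten as if they were characters.
converted = transcode(blob, "latin-1", "utf-16-le")

# The textual part survives the conversion...
text_part = converted[: 2 * len(text)].decode("utf-16-le")
# ...while the last four bytes no longer encode the original integer.
recovered = struct.unpack(">I", converted[-4:])[0]

print(same_meaning, text_part, recovered == number)
```

The text portion reads the same before and after conversion, while the integer does not, which is the sense in which such a file is binary rather than plain text.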


Plain text and rich text

According to The Unicode Standard:

* "Plain text is a pure sequence of character codes; plain Unicode-encoded text is therefore a sequence of Unicode character codes.
* In contrast, styled text, also known as rich text, is any text representation containing plain text plus added information such as a language identifier, font size, color, hypertext links, and so on. SGML, RTF, HTML, XML, and TeX are examples of rich text fully represented as plain text streams, interspersing plain text data with sequences of characters that represent the additional data structures." (The Unicode Standard, version 14.0, pages 18–19)

According to other definitions, however, files that contain markup or other metadata are generally considered plain text, so long as the markup is also in directly human-readable form (as in HTML, XML, and so on). Thus, representations such as SGML, RTF, HTML, XML, wiki markup, and TeX, as well as nearly all programming language source code files, are considered plain text. The particular content is irrelevant to whether a file is plain text. For example, an SVG file can express drawings or even bitmapped graphics, but is still plain text.

The use of plain text rather than binary files enables files to survive much better "in the wild", in part by making them largely immune to computer architecture incompatibilities. For example, all the problems of endianness can be avoided (with encodings such as UCS-2 rather than UTF-8, endianness matters, but uniformly for every character, rather than for potentially unknown subsets of the file).
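The endianness point can be illustrated with a short Python sketch. It makes some assumptions for the sake of illustration: UTF-16 stands in for UCS-2 (Python has no UCS-2 codec, and the two agree for the characters used here), and the sample string, the integer value, and the helper `swap_pairs` are hypothetical.

```python
import struct

text = "héllo"  # sample string; every character fits in one 16-bit unit

# UTF-8 is defined as a sequence of bytes, so byte order never arises.
utf8 = text.encode("utf-8")

# With 16-bit code units, byte order matters, but uniformly:
# every unit is stored big-endian or every unit is stored little-endian.
big = text.encode("utf-16-be")
little = text.encode("utf-16-le")


def swap_pairs(data: bytes) -> bytes:
    """Hypothetical helper: swap the two bytes of every 16-bit unit."""
    swapped = bytearray(data)
    swapped[0::2], swapped[1::2] = data[1::2], data[0::2]
    return bytes(swapped)


# One uniform rule converts the whole text from one byte order to the other.
uniform = swap_pairs(big) == little

# A binary integer read with the wrong byte order silently changes value.
value = 258  # arbitrary value chosen for the illustration
misread = struct.unpack("<I", struct.pack(">I", value))[0]

# The same number written as plain text is independent of byte order.
as_text = str(value).encode("utf-8")
round_trip = int(as_text.decode("utf-8"))

print(uniform, misread == value, round_trip == value)
```

In a binary file, a reader must know which bytes form multi-byte fields before any byte swapping can be applied, whereas in the text case the same rule applies to the entire file.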


Usage

The purpose of using plain text today is primarily independence from programs that require their own special encoding, formatting, or file format. Plain text files can be opened, read, and edited with ubiquitous text editors and utilities. A command-line interface allows people to give commands in plain text and get a response, also typically in plain text. Many other computer programs are also capable of processing or creating plain text, such as countless programs in DOS, Windows, ...