
In computing, the term text processing refers to the theory and practice of automating the creation or manipulation of electronic text.
''Text'' usually refers to all the alphanumeric characters specified on the keyboard of the person engaged in the practice, but in general ''text'' means the abstraction layer immediately above the standard character encoding of the target text.
The term ''processing'' refers to automated (or mechanized) processing, as opposed to the same manipulation done manually.
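The idea of text as a layer above the character encoding can be illustrated with a minimal Python sketch. The sample string and variable names are illustrative assumptions, not part of the original article: the same abstract text maps to different byte sequences under different encodings, yet decodes back to the same text.

```python
# The same abstract text, written with escapes so the code points are unambiguous.
text = "na\u00efve caf\u00e9"

# Two different standard encodings of that text.
utf8_bytes = text.encode("utf-8")      # non-ASCII characters take two bytes each
latin1_bytes = text.encode("latin-1")  # every character takes one byte

# The byte-level representations differ in length...
lengths = (len(text), len(utf8_bytes), len(latin1_bytes))

# ...but decoding each with its own encoding recovers the same text.
round_trip_ok = (
    utf8_bytes.decode("utf-8") == text
    and latin1_bytes.decode("latin-1") == text
)
```

Text processing in this sense operates on `text`, not on `utf8_bytes` or `latin1_bytes`.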
Text processing involves computer commands that invoke content, content changes, and cursor movement, for example to
* search and replace
* format
* generate a processed report of the content of a text file, or
* filter a file or a report of a text file.
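The four operations listed above can be sketched in Python using only the standard library. The function names and the sample log text are hypothetical, chosen only for illustration; this is one possible reading of each operation, not a canonical definition.

```python
import re
import textwrap
from collections import Counter

# Hypothetical sample input: each line is "<label>: <message>".
SAMPLE = (
    "error: disk full\n"
    "info: backup started\n"
    "error: disk full again\n"
    "info: backup finished\n"
)

def search_and_replace(text, pattern, replacement):
    """Replace every match of a regular expression in the text."""
    return re.sub(pattern, replacement, text)

def format_text(text, width):
    """Re-wrap the text so that no line exceeds the given width."""
    return textwrap.fill(text, width=width)

def report(text):
    """Count lines per label (the part before the first colon)."""
    return Counter(
        line.split(":", 1)[0] for line in text.splitlines() if ":" in line
    )

def filter_lines(text, pattern):
    """Keep only the lines that match a regular expression."""
    return "\n".join(
        line for line in text.splitlines() if re.search(pattern, line)
    )
```

For example, `filter_lines(SAMPLE, r"^error")` keeps the two error lines, and `report(SAMPLE)` summarizes how many lines carry each label.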
The text processing of a regular expression is a virtual editing machine, having a primitive programming language that has named registers (identifiers), and named positions in the sequence of characters comprising the text. Using these, the "text processor" can, for example, mark a region of text, and then move it. The text processing of a ''utility'' is a filter program, or ''filter''. These two mechanisms comprise text processing.
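Both mechanisms can be loosely illustrated in Python. In this sketch, which is an interpretation and not the article's own formalism, named groups of a regular expression stand in for named registers, match spans stand in for named positions, and a generator over lines stands in for a filter program. The names `marker`, `region`, `target`, and `line_filter` are hypothetical.

```python
import re

text = "alpha beta gamma"

# Named groups play the role of named registers holding marked text.
marker = re.compile(r"(?P<region>beta) (?P<target>gamma)")
match = marker.search(text)

# The span of a group is a position in the character sequence.
region_span = match.span("region")

# "Move" the marked region after the target by rewriting from the registers.
moved = marker.sub(r"\g<target> \g<region>", text)

def line_filter(lines, pattern):
    """A filter: consume lines, emit only those matching the pattern."""
    regex = re.compile(pattern)
    for line in lines:
        if regex.search(line):
            yield line
```

A filter written this way could be wired to standard input and output, for instance by passing `sys.stdin` as `lines` and writing each yielded line to `sys.stdout`, which is how utilities are typically chained in a pipeline.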
Definition
Since standardized markup such as ANSI escape codes is generally invisible to the editor, it comprises a set of transitory properties that become at times indistinguishable from word processing