
In formal language theory, the empty string, or empty word, is the unique string of length zero.


Formal theory

Formally, a string is a finite, ordered sequence of characters such as letters, digits or spaces. The empty string is the special case where the sequence has length zero, so there are no symbols in the string. There is only one empty string, because two strings are only different if they have different lengths or a different sequence of symbols. In formal treatments, the empty string is denoted with ε or sometimes Λ or λ.

The empty string should not be confused with the empty language ∅, which is a formal language (i.e. a set of strings) that contains no strings, not even the empty string.

The empty string has several properties:
* |ε| = 0. Its string length is zero.
* ε ⋅ s = s ⋅ ε = s. The empty string is the identity element of the concatenation operation. The set of all strings forms a free monoid with respect to ⋅ and ε.
* ε^R = ε. Reversal of the empty string produces the empty string, so the empty string is a palindrome.
* ∀c ∈ ε: P(c). Statements that are about all characters in the empty string are vacuously true.
* The empty string precedes any other string under lexicographical order, because it is a proper prefix of every other string and hence the shortest of all strings (CSE1002 Lecture Notes – Lexicographic).
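
The properties listed above can be checked informally in a programming language. The following is a minimal sketch in Python, assuming that Python's `""` stands in for ε, that `+` stands in for concatenation, and that slicing with `[::-1]` stands in for reversal; the names `EPSILON` and `s` are illustrative only.

```python
# Python's "" plays the role of ε; s is an arbitrary sample string.
EPSILON = ""
s = "snowball"

# |ε| = 0: the length of the empty string is zero.
assert len(EPSILON) == 0

# ε ⋅ s = s ⋅ ε = s: the empty string is the identity for concatenation.
assert EPSILON + s == s + EPSILON == s

# ε^R = ε: reversing the empty string yields the empty string (a palindrome).
assert EPSILON[::-1] == EPSILON

# ∀c ∈ ε: P(c): a universally quantified statement over its characters
# holds vacuously, whatever the predicate P is.
assert all(c.isdigit() for c in EPSILON)
assert all(c.isalpha() for c in EPSILON)

# The empty string precedes every non-empty string in lexicographic order.
assert EPSILON < s
assert sorted(["b", "", "aa"]) == ["", "aa", "b"]
```

Note that "b" sorts after "aa" although it is shorter, which is why the ordering of the empty string is best explained by its being a prefix of every string.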

In context-free grammars, a production rule that allows a symbol to produce the empty string is known as an ε-production, and the symbol is said to be "nullable".
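
As an illustration of nullability, the sketch below computes the set of nullable symbols of a grammar by a standard fixed-point iteration. The function name `nullable_symbols`, the dictionary encoding of productions, and the example grammar are all assumptions made for this example, not part of any particular library.

```python
def nullable_symbols(productions):
    """Return the set of nonterminals that can derive the empty string.

    `productions` maps each nonterminal to a list of right-hand sides,
    each a list of symbols; an empty list [] stands for an ε-production.
    """
    nullable = set()
    changed = True
    while changed:  # iterate until no new nullable symbol is found
        changed = False
        for head, bodies in productions.items():
            if head in nullable:
                continue
            # head is nullable if some body consists only of nullable symbols
            # (all() over an empty body is True, which covers ε-productions).
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: S -> A B | "a",  A -> ε | "a" A,  B -> A A | "b"
grammar = {
    "S": [["A", "B"], ["a"]],
    "A": [[], ["a", "A"]],
    "B": [["A", "A"], ["b"]],
}
print(sorted(nullable_symbols(grammar)))  # ['A', 'B', 'S']
```

Here A is nullable directly through its ε-production, B becomes nullable because A A can vanish, and S follows because both A and B are nullable.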


Use in programming languages

In most programming languages, strings are a data type. Strings are typically stored at distinct memory addresses (locations). Thus, the same string (e.g., the empty string) may be stored in two or more places in memory. In this way, there could be multiple empty strings in memory, in contrast with the formal theory definition, for which there is only one possible empty string. However, a string comparison function would indicate that all of these empty strings are equal to each other.

Even a string of length zero can require memory to store it, depending on the format being used. In most programming languages, the empty string is distinct from a null reference (or null pointer) because a null reference points to no string at all, not even the empty string. The empty string is a legitimate string, upon which most string operations should work. Some languages treat some or all of the following in similar ways: empty strings, null references, the integer 0, the floating point number 0, the Boolean value false, the ASCII character NUL, or other such values.

The empty string is usually represented similarly to other strings. In implementations with a string-terminating character (null-terminated strings or plain text lines), the empty string is indicated by the immediate use of this terminating character.

Different functions, methods, macros, or idioms exist for checking if a string is empty in different languages.
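
The distinctions described in this section can be sketched in Python, which is used here only as one example language; other languages draw these lines differently. The helper names `describe` and `is_empty` are hypothetical and introduced purely for illustration.

```python
from typing import Optional


def describe(value: Optional[str]) -> str:
    """Classify a value as a null reference, the empty string, or non-empty."""
    if value is None:          # null reference: no string at all
        return "null"
    if value == "":            # a legitimate string of length zero
        return "empty"
    return "non-empty"


def is_empty(s: str) -> bool:
    """A common Python emptiness idiom: the empty string is falsy."""
    return not s


# Empty strings built in different ways compare equal.
built = "".join([])
sliced = "abc"[3:]
assert built == sliced == ""

# The empty string supports ordinary string operations.
assert "".upper() == ""
assert "".split(",") == [""]

# Python treats several "empty" values alike in a Boolean context,
# even though they remain distinct values.
falsy_values = ["", None, 0, 0.0, False]
assert not any(bool(v) for v in falsy_values)
assert "" != None and "" != 0 and "" != False
```

Because `is_empty` relies on truthiness, it should only be given actual strings; a separate `is None` check, as in `describe`, keeps the null case apart from the empty case.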


Representations of the empty string

The empty string is a syntactically valid representation of zero in positional notation (in any base) that contains no leading zeros. Since the empty string does not have a standard visual representation outside of formal language theory, the number zero is traditionally represented by a single decimal digit 0 instead.

A zero-filled memory area, interpreted as a null-terminated string, is an empty string.

Empty lines of text represent the empty string. They arise from two consecutive end-of-line markers (EOLs), as often occur in text files. This is sometimes used in text processing to separate paragraphs, e.g. in MediaWiki.
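
The three representations above can be illustrated with a short Python sketch. The function `parse_digits` is a hypothetical helper written for this example; `ctypes.create_string_buffer` is used on the assumption that a zero-initialised byte buffer is an adequate stand-in for a zero-filled memory area.

```python
import ctypes


def parse_digits(digits: str, base: int = 10) -> int:
    """Evaluate a digit string in positional notation by folding left to right."""
    value = 0                      # value of the empty digit string
    for ch in digits:
        value = value * base + int(ch, base)
    return value


# With no digits to fold in, the result stays at zero in any base.
assert parse_digits("") == 0
assert parse_digits("", base=2) == 0
assert parse_digits("007") == 7    # leading zeros do not change the value

# Python's built-in parser, by contrast, rejects the empty string.
try:
    int("")
except ValueError:
    pass

# A zero-filled buffer read as a null-terminated string is empty.
buf = ctypes.create_string_buffer(8)   # 8 bytes, all initialised to NUL
assert buf.raw == b"\x00" * 8
assert buf.value == b""

# Two consecutive line endings enclose an empty line, i.e. the empty string.
text = "first paragraph\n\nsecond paragraph"
assert text.split("\n") == ["first paragraph", "", "second paragraph"]
assert text.split("\n\n") == ["first paragraph", "second paragraph"]
```

The contrast between `parse_digits("")` and `int("")` reflects the point made in the text: the empty string is a consistent notation for zero, but in practice zero is written with an explicit digit.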


See also

* Empty set
* Null-terminated string
* Concatenation theory
* String literal

