In

^{*} (using the

^{*}, that is, a set of words over that alphabet. Sometimes the sets of words are grouped into expressions, whereas rules and constraints may be formulated for the creation of 'well-formed expressions'.
In computer science and mathematics, which do not usually deal with

^{*}, the set of ''all'' words over Σ;
* = ^{*} = , where ''n'' ranges over the natural numbers and "a^{''n''}" means "a" repeated ''n'' times (this is the set of words consisting only of the symbol "a");
* the set of syntactically correct programs in a given programming language (the syntax of which is usually defined by a

the set .

formal grammar
In formal language theory
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt:λ ...

;
* those strings described or matched by a particular

logic
Logic is an interdisciplinary field which studies truth and reasoning. Informal logic seeks to characterize Validity (logic), valid arguments informally, for instance by listing varieties of fallacies. Formal logic represents statements and ar ...

, the set of possible formulas of a particular logic is a formal language, and an

Formal Language Definitions

* James Power

"Notes on Formal Language Theory and Parsing"

29 November 2002. * Drafts of some chapters in the "Handbook of Formal Language Theory", Vol. 1–3, G. Rozenberg and A. Salomaa (eds.), Springer Verlag, (1997): ** Alexandru Mateescu and Arto Salomaa

"Preface" in Vol.1, pp. v–viii, and "Formal Languages: An Introduction and a Synopsis", Chapter 1 in Vol. 1, pp.1–39

** Sheng Yu

"Regular Languages", Chapter 2 in Vol. 1

** Jean-Michel Autebert, Jean Berstel, Luc Boasson

** Christian Choffrut and Juhani Karhumäki

"Combinatorics of Words", Chapter 6 in Vol. 1

** Tero Harju and Juhani Karhumäki

"Morphisms", Chapter 7 in Vol. 1, pp. 439–510

** Jean-Eric Pin

"Syntactic semigroups", Chapter 10 in Vol. 1, pp. 679–746

** M. Crochemore and C. Hancart

"Automata for matching patterns", Chapter 9 in Vol. 2

** Dora Giammarresi, Antonio Restivo

"Two-dimensional Languages", Chapter 4 in Vol. 3, pp. 215–267

{{DEFAULTSORT:Formal Language Formal languages, Theoretical computer science Combinatorics on words

logic
Logic is an interdisciplinary field which studies truth and reasoning. Informal logic seeks to characterize Validity (logic), valid arguments informally, for instance by listing varieties of fallacies. Formal logic represents statements and ar ...

, mathematics
Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are contained (geometry), and quantities and their changes (cal ...

, computer science
Computer science deals with the theoretical foundations of information, algorithms and the architectures of its computation as well as practical techniques for their application.
Computer science is the study of computation, automation, a ...

, and linguistics
Linguistics is the scientific study of language
A language is a structured system of communication
Communication (from Latin
Latin (, or , ) is a classical language belonging to the Italic languages, Italic branch of the Indo ...

, a formal language consists of words
In linguistics
Linguistics is the scientific study of language
A language is a structured system of communication used by humans, including speech (spoken language), gestures (Signed language, sign language) and writing. Most lang ...

whose letters
Letter, letters, or literature may refer to:
Characters typeface
* Letter (alphabet)
A letter is a segmental symbol
A symbol is a mark, sign, or word that indicates, signifies, or is understood as representing an idea, Object (philosophy ...

are taken from an alphabet
An alphabet is a standardized set of basic written symbols
A symbol is a mark, sign, or word
In linguistics, a word of a spoken language can be defined as the smallest sequence of phonemes that can be uttered in isolation with semanti ...

and are well-formed
Well-formed or wellformed indicate syntactic correctness and may refer to:
* Well-formedness, quality of linguistic elements that conform to grammar rules
* Well-formed formula, a string that is generated by a formal grammar in logic
* Well-formed ...

according to a specific set of rules.
The alphabet
An alphabet is a standardized set of basic written symbols
A symbol is a mark, sign, or word
In linguistics, a word of a spoken language can be defined as the smallest sequence of phonemes that can be uttered in isolation with semanti ...

of a formal language consists of symbols, letters, or tokens that concatenate into strings of the language. Each string concatenated from symbols of this alphabet is called a word, and the words that belong to a particular formal language are sometimes called ''well-formed words'' or ''well-formed formula
In mathematical logic
Mathematical logic is the study of formal logic within mathematics. Major subareas include model theory, proof theory, set theory, and recursion theory. Research in mathematical logic commonly addresses the mathematical p ...

s''. A formal language is often defined by means of a formal grammar
In formal language theory
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt:λ ...

such as a regular grammar
In theoretical computer science and formal language theory, a regular grammar is a formal grammar such that all its production (computer science), production rules are of one of the following forms:
# ''A'' → ''a''
# ''A'' → ''aB''
# ''A'' → ...

or context-free grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose Production (computer science), production rules are of the form
:A\ \to\ \alpha
with A a ''single'' nonterminal symbol, and \alpha a string of Terminal and nonterm ...

, which consists of its formation ruleIn mathematical logic
Mathematical logic, also called formal logic, is a subfield of mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algeb ...

s.
The field of formal language theory studies primarily the purely syntactical
In linguistics, syntax () is the set of rules, principles, and processes that govern the structure of Sentence (linguistics), sentences (sentence structure) in a given Natural language, language, usually including word order. The term ''syntax'' ...

aspects of such languages—that is, their internal structural patterns. Formal language theory sprang out of linguistics, as a way of understanding the syntactic regularities of natural language
In neuropsychology
Neuropsychology is a branch of psychology. It is concerned with how a person's cognition and behavior are related to the brain and the rest of the nervous system. Professionals in this branch of psychology often focus on ...

s.
In computer science, formal languages are used among others as the basis for defining the grammar of programming language
A programming language is a formal language
In logic, mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science) ...

s and formalized versions of subsets of natural languages in which the words of the language represent concepts that are associated with particular meanings or semantics
Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference
Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another ...

. In computational complexity theory
Computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by ...

, decision problem
In computability theory and computational complexity theory, a decision problem is a problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding whether a given natural number is prime. Anot ...

s are typically defined as formal languages, and complexity class
Complexity characterises the behaviour of a system or model (disambiguation), model whose components interaction, interact in multiple ways and follow local rules, meaning there is no reasonable higher instruction to define the various possible i ...

es are defined as the sets of the formal languages that can be with limited computational power. In logic
Logic is an interdisciplinary field which studies truth and reasoning. Informal logic seeks to characterize Validity (logic), valid arguments informally, for instance by listing varieties of fallacies. Formal logic represents statements and ar ...

and the foundations of mathematics
Foundations of mathematics is the study of the philosophical
Philosophy (from , ) is the study of general and fundamental questions, such as those about existence
Existence is the ability of an entity to interact with physical or mental r ...

, formal languages are used to represent the syntax of axiomatic system
In mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...

s, and mathematical formalism is the philosophy that all of mathematics can be reduced to the syntactic manipulation of formal languages in this way.
History

The first use of formal language is thought to beGottlob Frege
Friedrich Ludwig Gottlob Frege (; ; 8 November 1848 – 26 July 1925) was a German philosopher, logician, and mathematician. He worked as a mathematics professor at the University of Jena, and is understood by many to be the father of analy ...

's 1879 ''Begriffsschrift
''Begriffsschrift'' (German for, roughly, "concept-script") is a book on logic by Gottlob Frege, published in 1879, and the formal system set out in that book.
''Begriffsschrift'' is usually translated as ''concept writing'' or ''concept notatio ...

'', meaning "concept writing", which described a "formal language, modeled upon that of arithmetic, for pure thought."
Axel Thue
Axel Thue (; 19 February 1863 – 7 March 1922), was a Norwegian
Norwegian, Norwayan, or Norsk may refer to:
*Something of, from, or related to Norway, a country in northwestern Europe
*Norwegians, both a nation and an ethnic group native to Nor ...

's early semi-Thue systemIn theoretical computer science
Image:Maquina.png, An artistic representation of a Turing machine. Turing machines are used to model general computing devices.
Theoretical computer science (TCS) is a subset of general computer science and mathemati ...

, which can be used for rewriting strings, was influential on formal grammar
In formal language theory
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt:λ ...

s.
Words over an alphabet

An alphabet, in the context of formal languages, can be any set, although it often makes sense to use analphabet
An alphabet is a standardized set of basic written symbols
A symbol is a mark, sign, or word
In linguistics, a word of a spoken language can be defined as the smallest sequence of phonemes that can be uttered in isolation with semanti ...

in the usual sense of the word, or more generally a character set
Character encoding is the process of assigning numbers to Graphics, graphical character (computing), characters, especially the written characters of Language, human language, allowing them to be Data storage, stored, Data communication, transmit ...

such as ASCII
ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding
Character encoding is the process of assigning numbers to graphical
Graphics (from Greek
Greek may refer to:
Greece
Anything of, ...

or Unicode
Unicode, formally the Unicode Standard, is an information technology Technical standard, standard for the consistent character encoding, encoding, representation, and handling of Character (computing), text expressed in most of the world's wri ...

. The elements of an alphabet are called its letters. An alphabet may contain an infinite
Infinite may refer to:
Mathematics
*Infinite set, a set that is not a finite set
*Infinity, an abstract concept describing something without any limit
Music
*Infinite (band), a South Korean boy band
*''Infinite'' (EP), debut EP of American musi ...

number of elements; however, most definitions in formal language theory specify alphabets with a finite number of elements, and most results apply only to them.
A word over an alphabet can be any finite sequence (i.e., string
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other objects
Arts, entertainment, and media Films
* Strings (1991 film), ''Strings'' (1991 fil ...

) of letters. The set of all words over an alphabet Σ is usually denoted by ΣKleene star
In mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation, either on Set (mathematics), sets of string (computer science), strings or on sets of symbols or characters. In mathematics
it ...

). The length of a word is the number of letters it is composed of. For any alphabet, there is only one word of length 0, the ''empty word'', which is often denoted by e, ε, λ or even Λ. By concatenation
In formal language theory
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt ...

one can combine two words to form a new word, whose length is the sum of the lengths of the original words. The result of concatenating a word with the empty word is the original word.
In some applications, especially in logic
Logic is an interdisciplinary field which studies truth and reasoning. Informal logic seeks to characterize Validity (logic), valid arguments informally, for instance by listing varieties of fallacies. Formal logic represents statements and ar ...

, the alphabet is also known as the ''vocabulary'' and words are known as ''formulas'' or ''sentences''; this breaks the letter/word metaphor and replaces it by a word/sentence metaphor.
Definition

A formal language ''L'' over an alphabet Σ is asubset
In mathematics
Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are contained (geometry), and quantities a ...

of Σnatural language
In neuropsychology
Neuropsychology is a branch of psychology. It is concerned with how a person's cognition and behavior are related to the brain and the rest of the nervous system. Professionals in this branch of psychology often focus on ...

s, the adjective "formal" is often omitted as redundant.
While formal language theory usually concerns itself with formal languages that are described by some syntactical rules, the actual definition of the concept "formal language" is only as above: a (possibly infinite) set of finite-length strings composed from a given alphabet, no more and no less. In practice, there are many languages that can be described by rules, such as regular language
In theoretical computer science
An artistic representation of a Turing machine. Turing machines are used to model general computing devices.
Theoretical computer science (TCS) is a subset of general computer science that focuses on mathematica ...

s or context-free languageIn formal language theory
In mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science), alphabet and are well-formedn ...

s. The notion of a formal grammar
In formal language theory
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt:λ ...

may be closer to the intuitive concept of a "language," one described by syntactic rules. By an abuse of the definition, a particular formal language is often thought of as being equipped with a formal grammar that describes it.
Examples

The following rules describe a formal language over the alphabet Σ = : * Every nonempty string that does not contain "+" or "=" and does not start with "0" is in . * The string "0" is in . * A string containing "=" is in if and only if there is exactly one "=", and it separates two valid strings of . * A string containing "+" but not "=" is in if and only if every "+" in the string separates two valid strings of . * No string is in other than those implied by the previous rules. Under these rules, the string "23+4=555" is in , but the string "=234=+" is not. This formal language expressesnatural number
In mathematics, the natural numbers are those numbers used for counting (as in "there are ''six'' coins on the table") and total order, ordering (as in "this is the ''third'' largest city in the country"). In common mathematical terminology, w ...

s, well-formed additions, and well-formed addition equalities, but it expresses only what they look like (their syntax
In linguistics
Linguistics is the scientific study of language, meaning that it is a comprehensive, systematic, objective, and precise study of language. Linguistics encompasses the analysis of every aspect of language, as well as the ...

), not what they mean (semantics
Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference
Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another ...

). For instance, nowhere in these rules is there any indication that "0" means the number zero, "+" means addition, "23+4=555" is false, etc.
Constructions

For finite languages, one can explicitly enumerate all well-formed words. For example, we can describe a language as just = . Thedegenerate
Degeneracy may refer to:
Science
* Codon degeneracy
* Degeneracy (biology), the ability of elements that are structurally different to perform the same function or yield the same output
* Degeneration (medical)
** Degenerative disease, a diseas ...

case of this construction is the empty language, which contains no words at all ( = ∅).
However, even over a finite (non-empty) alphabet such as Σ = there are an infinite number of finite-length words that can potentially be expressed: "a", "abb", "ababba", "aaababbbbaab", .... Therefore, formal languages are typically infinite, and describing an infinite formal language is not as simple as writing ''L'' = . Here are some examples of formal languages:
* = Σcontext-free grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose Production (computer science), production rules are of the form
:A\ \to\ \alpha
with A a ''single'' nonterminal symbol, and \alpha a string of Terminal and nonterm ...

);
* the set of inputs upon which a certain Turing machine
A Turing machine is a mathematical model of computation
Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are ...

halts; or
* the set of maximal strings of alphanumeric
Alphanumericals are a combination of alphabetical and numerical character
Character(s) may refer to:
Arts, entertainment, and media Literature
* ''Character'' (novel), a 1936 Dutch novel by Ferdinand Bordewijk
* ''Characters'' (Theophras ...

ASCII
ASCII ( ), abbreviated from American Standard Code for Information Interchange, is a character encoding
Character encoding is the process of assigning numbers to graphical
Graphics (from Greek
Greek may refer to:
Greece
Anything of, ...

characters on this line, i.e.,the set .

Language-specification formalisms

Formal languages are used as tools in multiple disciplines. However, formal language theory rarely concerns itself with particular languages (except as examples), but is mainly concerned with the study of various types of formalisms to describe languages. For instance, a language can be given as * those strings generated by someregular expression
A regular expression (shortened as regex or regexp; also referred to as rational expression) is a sequence of Character (computing), characters that specifies a ''search pattern matching, pattern''. Usually such patterns are used by string-se ...

;
* those strings accepted by some automaton
An automaton (; plural: automata or automatons) is a relatively self-operating machine
A machine is any physical system with ordered structural and functional properties. It may represent human-made or naturally occurring device molecu ...

, such as a Turing machine
A Turing machine is a mathematical model of computation
Mathematics (from Greek: ) includes the study of such topics as numbers (arithmetic and number theory), formulas and related structures (algebra), shapes and spaces in which they are ...

or finite-state automaton;
* those strings for which some decision procedure (an algorithm
In and , an algorithm () is a finite sequence of , computer-implementable instructions, typically to solve a class of problems or to perform a computation. Algorithms are always and are used as specifications for performing s, , , and other ...

that asks a sequence of related YES/NO questions) produces the answer YES.
Typical questions asked about such formalisms include:
* What is their expressive power? (Can formalism ''X'' describe every language that formalism ''Y'' can describe? Can it describe other languages?)
* What is their recognizability? (How difficult is it to decide whether a given word belongs to a language described by formalism ''X''?)
* What is their comparability? (How difficult is it to decide whether two languages, one described in formalism ''X'' and one in formalism ''Y'', or in ''X'' again, are actually the same language?).
Surprisingly often, the answer to these decision problems is "it cannot be done at all", or "it is extremely expensive" (with a characterization of how expensive). Therefore, formal language theory is a major application area of computability theory
Computability theory, also known as recursion theory, is a branch of mathematical logic
Mathematical logic is the study of formal logic within mathematics. Major subareas include model theory, proof theory, set theory, and recursion theory. ...

and complexity theory. Formal languages may be classified in the Chomsky hierarchy
In formal language, formal language theory, computer science and linguistics, the Chomsky hierarchy (also referred to as the Chomsky–Schützenberger hierarchy) is a containment hierarchy of classes of formal grammars.
This hierarchy of grammars ...

based on the expressive power of their generative grammar as well as the complexity of their recognizing automaton
An automaton (; plural: automata or automatons) is a relatively self-operating machine
A machine is any physical system with ordered structural and functional properties. It may represent human-made or naturally occurring device molecu ...

. Context-free grammar
In formal language theory, a context-free grammar (CFG) is a formal grammar whose Production (computer science), production rules are of the form
:A\ \to\ \alpha
with A a ''single'' nonterminal symbol, and \alpha a string of Terminal and nonterm ...

s and regular grammar
In theoretical computer science and formal language theory, a regular grammar is a formal grammar such that all its production (computer science), production rules are of one of the following forms:
# ''A'' → ''a''
# ''A'' → ''aB''
# ''A'' → ...

s provide a good compromise between expressivity and ease of parsing
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string
String or strings may refer to:
*String (structure), a long flexible structure made from threads twisted together, which is used to tie, bind, or hang other obj ...

, and are widely used in practical applications.
Operations on languages

Certain operations on languages are common. This includes the standard set operations, such as union, intersection, and complement. Another class of operation is the element-wise application of string operations. Examples: suppose $L\_1$ and $L\_2$ are languages over some common alphabet $\backslash Sigma$. * The ''concatenation
In formal language theory
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt ...

'' $L\_1\; \backslash cdot\; L\_2$ consists of all strings of the form $vw$ where $v$ is a string from $L\_1$ and $w$ is a string from $L\_2$.
* The ''intersection'' $L\_1\; \backslash cap\; L\_2$ of $L\_1$ and $L\_2$ consists of all strings that are contained in both languages
* The ''complement'' $\backslash neg\; L\_1$ of $L\_1$ with respect to $\backslash Sigma$ consists of all strings over $\backslash Sigma$ that are not in $L\_1$.
* The Kleene star
In mathematical logic and computer science, the Kleene star (or Kleene operator or Kleene closure) is a unary operation, either on Set (mathematics), sets of string (computer science), strings or on sets of symbols or characters. In mathematics
it ...

: the language consisting of all words that are concatenations of zero or more words in the original language;
* ''Reversal'':
** Let ''ε'' be the empty word, then $\backslash varepsilon^R\; =\; \backslash varepsilon$, and
** for each non-empty word $w\; =\; \backslash sigma\_1\; \backslash cdots\; \backslash sigma\_n$ (where $\backslash sigma\_1,\; \backslash ldots,\; \backslash sigma\_n$are elements of some alphabet), let $w^R\; =\; \backslash sigma\_n\; \backslash cdots\; \backslash sigma\_1$,
** then for a formal language $L$, $L^R\; =\; \backslash $.
* String homomorphism
Such string operationsIn computer science
Computer science deals with the theoretical foundations of information, algorithms and the architectures of its computation as well as practical techniques for their application.
Computer science is the study of Algorith ...

are used to investigate closure properties of classes of languages. A class of languages is closed under a particular operation when the operation, applied to languages in the class, always produces a language in the same class again. For instance, the context-free languageIn formal language theory
In mathematics, computer science, and linguistics, a formal language consists of string (computer science), words whose symbol (formal), letters are taken from an alphabet (computer science), alphabet and are well-formedn ...

s are known to be closed under union, concatenation, and intersection with regular language
In theoretical computer science
An artistic representation of a Turing machine. Turing machines are used to model general computing devices.
Theoretical computer science (TCS) is a subset of general computer science that focuses on mathematica ...

s, but not closed under intersection or complement. The theory of trios
Trio may refer to:
Music
Groups
* Trio (music), an ensemble of three performers, or a composition for such an ensemble
** Jazz trio, pianist, double bassist, drummer
** Minuet and trio, a form in classical music
** String trio, a group of three s ...

and abstract families of languages studies the most common closure properties of language families in their own right., Chapter 11: Closure properties of families of languages.
:
Applications

Programming languages

A compiler usually has two distinct components. A lexical analyzer, sometimes generated by a tool like`lex`

, identifies the tokens of the programming language grammar, e.g. identifier
An identifier is a name that identifies (that is, labels the identity of) either a unique object or a unique ''class'' of objects, where the "object" or class may be an idea, physical countable
In mathematics
Mathematics (from Greek: ) i ...

s or keywords, numeric and string literals, punctuation and operator symbols, which are themselves specified by a simpler formal language, usually by means of regular expressions
A regular expression (shortened as regex or regexp; also referred to as rational expression) is a sequence of characters that specifies a ''search pattern''. Usually such patterns are used by string-searching algorithms for "find" or "find ...

. At the most basic conceptual level, a parser
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a of , either in , or s, conforming to the rules of a . The term ''parsing'' comes from Latin ''pars'' (''orationis''), meaning .
The term has slightly different meani ...

, sometimes generated by a parser generator
In computer science, a compiler-compiler or compiler generator is a programming tool that creates a parsing, parser, interpreter (computer software), interpreter, or compiler from some form of formal description of a programming language and mac ...

like ```
yacc
Yacc (Yet Another Compiler-Compiler) is a computer program
A computer program is a collection of instructions that can be executed by a computer to perform a specific task.
A computer program is usually written by a computer programmer in a ...
```

, attempts to decide if the source program is syntactically valid, that is if it is well formed with respect to the programming language grammar for which the compiler was built.
Of course, compilers do more than just parse the source code – they usually translate it into some executable format. Because of this, a parser usually outputs more than a yes/no answer, typically an abstract syntax tree
In computer science
Computer science deals with the theoretical foundations of information, algorithms and the architectures of its computation as well as practical techniques for their application.
Computer science is the study of Algor ...

. This is used by subsequent stages of the compiler to eventually generate an executable
In computing
Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes and development of both computer hardware , hardware and ...

containing machine code
In computer programming
Computer programming is the process of designing and building an executable
In computing, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, c ...

that runs directly on the hardware, or some intermediate code
Bytecode, also termed portable code or p-code, is a form of instruction set designed for efficient execution by a software Interpreter (computing), interpreter. Unlike Human-readable code, human-readable source code, bytecodes are compact numeric ...

that requires a virtual machine
In computing, a virtual machine (VM) is the virtualization
In computing, virtualization or virtualisation (sometimes abbreviated v12n, a numeronym) is the act of creating a virtual (rather than actual) version of something, including virtual co ...

to execute.
Formal theories, systems, and proofs

Inmathematical logic
Mathematical logic is the study of formal logic within mathematics. Major subareas include model theory, proof theory, set theory, and recursion theory. Research in mathematical logic commonly addresses the mathematical properties of formal sys ...

, a ''formal theory'' is a set of sentences
''The Four Books of Sentences'' (''Libri Quattuor Sententiarum'') is a book of theology
Theology is the systematic study of the nature of the Divinity, divine and, more broadly, of religious belief. It is taught as an Discipline (academia), aca ...

expressed in a formal language.
A ''formal system'' (also called a ''logical calculus'', or a ''logical system'') consists of a formal language together with a deductive apparatus
A formal system is an abstract structure used for inferring theorems from axioms according to a set of rules. These rules, which are used for carrying out the inference of theorems from axioms, are the logical calculus of the formal system.
A form ...

(also called a ''deductive system''). The deductive apparatus may consist of a set of transformation rule
A rule of inference, inference rule or transformation rule is a logical form
In logic
Logic (from Ancient Greek, Greek: grc, wikt:λογική, λογική, label=none, lit=possessed of reason, intellectual, dialectical, argumentative, ...

s, which may be interpreted as valid rules of inference, or a set of axiom
An axiom, postulate or assumption is a statement that is taken to be truth, true, to serve as a premise or starting point for further reasoning and arguments. The word comes from the Greek ''axíōma'' () 'that which is thought worthy or fit' o ...

s, or have both. A formal system is used to derive
Derive may refer to:
*Derive (computer algebra system), a commercial system made by Texas Instruments
*Dérive (magazine), ''Dérive'' (magazine), an Austrian science magazine on urbanism
*Dérive, a psychogeographical concept
See also
*
*Deri ...

one expression from one or more other expressions. Although a formal language can be identified with its formulas, a formal system cannot be likewise identified by its theorems. Two formal systems $\backslash mathcal$ and $\backslash mathcal$ may have all the same theorems and yet differ in some significant proof-theoretic way (a formula A may be a syntactic consequence of a formula B in one but not another for instance).
A ''formal proof'' or ''derivation'' is a finite sequence of well-formed formulas (which may be interpreted as sentences, or proposition
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt:λογική, λογική, lab ...

s) each of which is an axiom or follows from the preceding formulas in the sequence by a rule of inference
In the philosophy of logic
Following the developments in formal logic with symbolic logic in the late nineteenth century and mathematical logic in the twentieth, topics traditionally treated by logic not being part of formal logic have tended to ...

. The last sentence in the sequence is a theorem of a formal system. Formal proofs are useful because their theorems can be interpreted as true propositions.
Interpretations and models

Formal languages are entirely syntactic in nature but may be givensemantics
Semantics (from grc, σημαντικός ''sēmantikós'', "significant") is the study of reference
Reference is a relationship between objects in which one object designates, or acts as a means by which to connect to or link to, another ...

that give meaning to the elements of the language. For instance, in mathematical interpretation
Interpretation may refer to:
Culture
* Aesthetic interpretation, an explanation of the meaning of a work of art
* Allegorical interpretation, an approach that assumes a text should not be interpreted literally
* Dramatic Interpretation, an event i ...

assigns a meaning to each of the formulas—usually, a truth value
In logic
Logic is an interdisciplinary field which studies truth and reasoning
Reason is the capacity of consciously making sense of things, applying logic
Logic (from Ancient Greek, Greek: grc, wikt:λογική, λογική, la ...

.
The study of interpretations of formal languages is called formal semantics. In mathematical logic, this is often done in terms of model theory
In mathematical logic
Mathematical logic is the study of formal logic within mathematics. Major subareas include model theory, proof theory, set theory, and recursion theory. Research in mathematical logic commonly addresses the mathematical ...

. In model theory, the terms that occur in a formula are interpreted as objects within mathematical structures
In mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, change (mathematical analysis, analysis). It ...

, and fixed compositional interpretation rules determine how the truth value of the formula can be derived from the interpretation of its terms; a ''model'' for a formula is an interpretation of terms such that the formula becomes true.
See also

*Combinatorics on words
Combinatorics on words is a fairly new field of mathematics
Mathematics (from Ancient Greek, Greek: ) includes the study of such topics as quantity (number theory), mathematical structure, structure (algebra), space (geometry), and calculus, ...

* Free monoid In abstract algebra
In algebra, which is a broad division of mathematics, abstract algebra (occasionally called modern algebra) is the study of algebraic structures. Algebraic structures include group (mathematics), groups, ring (mathematics), r ...

* Formal method
* Grammar framework
* Mathematical notation
Mathematical notation is a system of symbol
A symbol is a mark, sign, or word
In linguistics, a word of a spoken language can be defined as the smallest sequence of phonemes that can be uttered in isolation with semantic, objective o ...

* Associative array
In computer science
Computer science deals with the theoretical foundations of information, algorithms and the architectures of its computation as well as practical techniques for their application.
Computer science is the study of , ...

* String (computer science)
In computer programming, a string is traditionally a sequence of Character (computing), characters, either as a literal (computer programming), literal constant or as some kind of Variable (computer science), variable. The latter may allow its ...

Notes

References

Citations

Sources

; Works cited * ; General references * A. G. Hamilton, ''Logic for Mathematicians'',Cambridge University Press
Cambridge University Press (CUP) is the publishing business of the University of Cambridge
, mottoeng = Literal: From here, light and sacred draughts.
Non literal: From this place, we gain enlightenment and precious knowled ...

, 1978, .
* Seymour Ginsburg, ''Algebraic and automata theoretic properties of formal languages'', North-Holland, 1975, .
* Michael A. Harrison
Michael A. Harrison is a computer scientist, in particular a pioneer in the area of formal languages.
Biography
Michael A. Harrison (born in Philadelphia, Pennsylvania, U.S.) studied electrical engineering and computing for BS and MS at the Case ...

, ''Introduction to Formal Language Theory'', Addison-Wesley, 1978.
* .
* Grzegorz Rozenberg
Grzegorz Rozenberg (born 14 March 1942, Warsaw) is a Polish and Dutch computer scientist.
His primary research areas are natural computing,
formal languages, formal language and automata theory, graph rewriting, graph transformations, and pet ...

, Arto Salomaa, ''Handbook of Formal Languages: Volume I-III'', Springer, 1997, .
* Patrick Suppes, ''Introduction to Logic'', D. Van Nostrand, 1957, .
External links

* *University of Maryland, Baltimore, University of MarylandFormal Language Definitions

* James Power

"Notes on Formal Language Theory and Parsing"

29 November 2002. * Drafts of some chapters in the "Handbook of Formal Language Theory", Vol. 1–3, G. Rozenberg and A. Salomaa (eds.), Springer Verlag, (1997): ** Alexandru Mateescu and Arto Salomaa

"Preface" in Vol.1, pp. v–viii, and "Formal Languages: An Introduction and a Synopsis", Chapter 1 in Vol. 1, pp.1–39

** Sheng Yu

"Regular Languages", Chapter 2 in Vol. 1

** Jean-Michel Autebert, Jean Berstel, Luc Boasson

** Christian Choffrut and Juhani Karhumäki

"Combinatorics of Words", Chapter 6 in Vol. 1

** Tero Harju and Juhani Karhumäki

"Morphisms", Chapter 7 in Vol. 1, pp. 439–510

** Jean-Eric Pin

"Syntactic semigroups", Chapter 10 in Vol. 1, pp. 679–746

** M. Crochemore and C. Hancart

"Automata for matching patterns", Chapter 9 in Vol. 2

** Dora Giammarresi, Antonio Restivo

"Two-dimensional Languages", Chapter 4 in Vol. 3, pp. 215–267

{{DEFAULTSORT:Formal Language Formal languages, Theoretical computer science Combinatorics on words