computer programming Computer programming or coding is the composition of sequences of instructions, called computer program, programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of proc ...

, a type system is a

logical system A formal system is an abstract structure and formalization of an axiomatic system used for deducing, using rules of inference, theorems from axioms. In 1921, David Hilbert proposed to use formal systems as the foundation of knowledge in math ...

comprising a set of rules that assigns a property called a ''type'' (for example,

integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...

, floating point, string) to every '' term'' (a word, phrase, or other set of symbols). Usually the terms are various language constructs of a

computer program A computer program is a sequence or set of instructions in a programming language for a computer to Execution (computing), execute. It is one component of software, which also includes software documentation, documentation and other intangibl ...

, such as variables, expressions, functions, or modules. A type system dictates the operations that can be performed on a term. For variables, the type system determines the allowed values of that term. Type systems formalize and enforce the otherwise implicit categories the programmer uses for

algebraic data type In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite data type, i.e., a data type formed by combining other types. Two common classes of algebraic types are product ty ...

data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...

s, or other

data types In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...

, such as "string", "array of float", "function returning boolean". Type systems are often specified as part of

programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...

s and built into interpreters and

compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primaril ...

s, although the type system of a language can be extended by optional tools that perform added checks using the language's original type

syntax In linguistics, syntax ( ) is the study of how words and morphemes combine to form larger units such as phrases and sentences. Central concerns of syntax include word order, grammatical relations, hierarchical sentence structure (constituenc ...

and

grammar In linguistics, grammar is the set of rules for how a natural language is structured, as demonstrated by its speakers or writers. Grammar rules may concern the use of clauses, phrases, and words. The term may also refer to the study of such rul ...

. The main purpose of a type system in a programming language is to reduce possibilities for bugs in computer programs due to type errors. The given type system in question determines what constitutes a type error, but in general, the aim is to prevent operations expecting a certain kind of value from being used with values of which that operation does not make sense (validity errors). Type systems allow defining interfaces between different parts of a computer program, and then checking that the parts have been connected in a consistent way. This checking can happen statically (at

compile time In computer science, compile time (or compile-time) describes the time window during which a language's statements are converted into binary instructions for the processor to execute. The term is used as an adjective to describe concepts relat ...

), dynamically (at run time), or as a combination of both. Type systems have other purposes as well, such as expressing business rules, enabling certain compiler optimizations, allowing for

multiple dispatch Multiple dispatch or multimethods is a feature of some programming languages in which a Subroutine, function or Method (computer programming), method can be dynamic dispatch, dynamically dispatched based on the run time (program lifecycle phase), ...

, and providing a form of

documentation Documentation is any communicable material that is used to describe, explain or instruct regarding some attributes of an object, system or procedure, such as its parts, assembly, installation, maintenance, and use. As a form of knowledge managem ...

Usage overview

An example of a simple type system is that of the C language. The portions of a C program are the function definitions. One function is invoked by another function. The interface of a function states the name of the function and a list of parameters that are passed to the function's code. The code of an invoking function states the name of the invoked, along with the names of variables that hold values to pass to it. During a

's execution, the values are placed into temporary storage, then execution jumps to the code of the invoked function. The invoked function's code accesses the values and makes use of them. If the instructions inside the function are written with the assumption of receiving an

value, but the calling code passed a floating-point value, then the wrong result will be computed by the invoked function. The C compiler checks the types of the arguments passed to a function when it is called against the types of the parameters declared in the function's definition. If the types do not match, the compiler throws a compile-time error or warning. A

may also use the static type of a value to optimize the storage it needs and the choice of

algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...

s for operations on the value. In many C compilers the ''float''

data type In computer science and computer programming, a data type (or simply type) is a collection or grouping of data values, usually specified by a set of possible values, a set of allowed operations on these values, and/or a representation of these ...

, for example, is represented in 32 bits, in accord with the IEEE specification for single-precision floating point numbers. They will thus use floating-point-specific microprocessor operations on those values (floating-point addition, multiplication, etc.). The depth of type constraints and the manner of their evaluation affect the ''typing'' of the language. A

may further associate an operation with various resolutions for each type, in the case of type polymorphism.

Type theory In mathematics and theoretical computer science, a type theory is the formal presentation of a specific type system. Type theory is the academic study of type systems. Some type theories serve as alternatives to set theory as a foundation of ...

is the study of type systems. The concrete types of some programming languages, such as integers and strings, depend on practical issues of computer architecture, compiler implementation, and language design.

Fundamentals

Formally,

type theory In mathematics and theoretical computer science, a type theory is the formal presentation of a specific type system. Type theory is the academic study of type systems. Some type theories serve as alternatives to set theory as a foundation of ...

studies type systems. A programming language must have the opportunity to type check using the ''type system'' whether at compile time or runtime, manually annotated or automatically inferred. As Mark Manasse concisely put it: Assigning a data type, termed ''typing'', gives meaning to a sequence of bits such as a value in

memory Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembe ...

or some object such as a variable. The hardware of a general purpose computer is unable to discriminate between for example a memory address and an instruction code, or between a character, an

, or a floating-point number, because it makes no intrinsic distinction between any of the possible values that a sequence of bits might ''mean''. Associating a sequence of bits with a type conveys that meaning to the programmable hardware to form a ''

symbolic system In logic, mathematics, computer science, and linguistics, a formal language is a set of strings whose symbols are taken from a set called "alphabet". The alphabet of a formal language consists of symbols that concatenate into strings (also c ...

'' composed of that hardware and some program. A program associates each value with at least one specific type, but it also can occur that one value is associated with many subtypes. Other entities, such as objects, modules, communication channels, and dependencies can become associated with a type. Even a type can become associated with a type. An implementation of a ''type system'' could in theory associate identifications called ''

'' (a type of a value), ''

class Class, Classes, or The Class may refer to: Common uses not otherwise categorized * Class (biology), a taxonomic rank * Class (knowledge representation), a collection of individuals or objects * Class (philosophy), an analytical concept used d ...

'' (a type of an object), and '' kind'' (a ''type of a type'', or metatype). These are the abstractions that typing can go through, on a hierarchy of levels contained in a system. When a programming language evolves a more elaborate type system, it gains a more finely grained rule set than basic type checking, but this comes at a price when the type inferences (and other properties) become undecidable, and when more attention must be paid by the programmer to annotate code or to consider computer-related operations and functioning. It is challenging to find a sufficiently expressive type system that satisfies all programming practices in a type safe manner. A programming language compiler can also implement a '' dependent type'' or an '' effect system'', which enables even more program specifications to be verified by a type checker. Beyond simple value-type pairs, a virtual "region" of code is associated with an "effect" component describing ''what'' is being done ''with what'', and enabling for example to "throw" an error report. Thus the symbolic system may be a ''type and effect system'', which endows it with more safety checking than type checking alone. Whether automated by the compiler or specified by a programmer, a type system renders program behavior illegal if it falls outside the type-system rules. Advantages provided by programmer-specified type systems include: * ''Abstraction'' (or ''modularity'') – Types enable programmers to think at a higher level than the bit or byte, not bothering with low-level implementation. For example, programmers can begin to think of a string as a set of character values instead of as an array of bytes. Higher still, types enable programmers to think about and express interfaces between two of ''any''-sized subsystems. This enables more levels of localization so that the definitions required for interoperability of the subsystems remain consistent when those two subsystems communicate. * ''Documentation'' – In more expressive type systems, types can serve as a form of

clarifying the intent of the programmer. For example, if a programmer declares a function as returning a timestamp type, this documents the function when the timestamp type can be explicitly declared deeper in the code to be an integer type. Advantages provided by compiler-specified type systems include: * ''Optimization'' – Static type-checking may provide useful compile-time information. For example, if a type requires that a value must align in memory at a multiple of four bytes, the compiler may be able to use more efficient machine instructions. * ''Safety'' – A type system enables the

to detect meaningless or invalid code. For example, we can identify an expression 3 / "Hello, World" as invalid, when the rules do not specify how to divide an

by a string. Strong typing offers more safety, but cannot guarantee complete ''

type safety In computer science, type safety and type soundness are the extent to which a programming language discourages or prevents type errors. Type safety is sometimes alternatively considered to be a property of facilities of a computer language; that ...

''.

Type errors

A type error occurs when an operation receives a different type of data than it expected. For example, a type error would happen if a line of code divides two integers, and is passed a string of letters instead of an integer. It is an unintended condition which might manifest in multiple stages of a program's development. Thus a facility for detection of the error is needed in the type system. In some languages, such as

Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...

, for which

type inference Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some bran ...

is automated, lint might be available to its compiler to aid in the detection of error. Type safety contributes to program correctness, but might only guarantee correctness at the cost of making the type checking itself an

undecidable problem In computability theory and computational complexity theory, an undecidable problem is a decision problem for which it is proved to be impossible to construct an algorithm that always leads to a correct yes-or-no answer. The halting problem is an ...

(as in the

halting problem In computability theory (computer science), computability theory, the halting problem is the problem of determining, from a description of an arbitrary computer program and an input, whether the program will finish running, or continue to run for ...

). In a ''type system'' with automated type checking, a program may prove to run incorrectly yet produce no compiler errors.

Division by zero In mathematics, division by zero, division (mathematics), division where the divisor (denominator) is 0, zero, is a unique and problematic special case. Using fraction notation, the general example can be written as \tfrac a0, where a is the di ...

is an unsafe and incorrect operation, but a type checker which only runs at

does not scan for division by zero in most languages; that division would surface as a runtime error. To prove the absence of these defects, other kinds of

formal method Formal, formality, informal or informality imply the complying with, or not complying with, some set of requirements ( forms, in Ancient Greek). They may refer to: Dress code and events * Formal wear, attire for formal events * Semi-formal att ...

s, collectively known as program analyses, are in common use. Alternatively, a sufficiently expressive type system, such as in dependently typed languages, can prevent these kinds of errors (for example, expressing ''the type of non-zero numbers''). In addition,

software testing Software testing is the act of checking whether software satisfies expectations. Software testing can provide objective, independent information about the Quality (business), quality of software and the risk of its failure to a User (computin ...

is an

empirical Empirical evidence is evidence obtained through sense experience or experimental procedure. It is of central importance to the sciences and plays a role in various other fields, like epistemology and law. There is no general agreement on how t ...

method for finding errors that such a type checker would not detect.

Type checking

The process of verifying and enforcing the constraints of types—''type checking''—may occur at

(a static check) or at run-time (a dynamic check). If a language specification requires its typing rules strongly, more or less allowing only those automatic type conversions that do not lose information, one can refer to the process as ''strongly typed''; if not, as ''weakly typed''. The terms are not usually used in a strict sense.

Static type checking

Static type checking is the process of verifying the

of a program based on analysis of a program's text (

source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...

). If a program passes a static type checker, then the program is guaranteed to satisfy some set of type safety properties for all possible inputs. Static type checking can be considered a limited form of program verification (see

), and in a type-safe language, can also be considered an optimization. If a compiler can prove that a program is well-typed, then it does not need to emit dynamic safety checks, allowing the resulting compiled binary to run faster and to be smaller. Static type checking for Turing-complete languages is inherently conservative. That is, if a type system is both ''sound'' (meaning that it rejects all incorrect programs) and ''decidable'' (meaning that it is possible to write an algorithm that determines whether a program is well-typed), then it must be ''incomplete'' (meaning there are correct programs, which are also rejected, even though they do not encounter runtime errors). For example, consider a program containing the code: : if then else Even if the expression always evaluates to true at run-time, most type checkers will reject the program as ill-typed, because it is difficult (if not impossible) for a static analyzer to determine that the else branch will not be taken. Consequently, a static type checker will quickly detect type errors in rarely used code paths. Without static type checking, even

code coverage In software engineering, code coverage, also called test coverage, is a percentage measure of the degree to which the source code of a program is executed when a particular test suite is run. A program with high code coverage has more of its ...

tests with 100% coverage may be unable to find such type errors. The tests may fail to detect such type errors, because the combination of all places where values are created and all places where a certain value is used must be taken into account. A number of useful and common programming language features cannot be checked statically, such as downcasting. Thus, many languages will have both static and dynamic type checking; the static type checker verifies what it can, and dynamic checks verify the rest. Many languages with static type checking provide a way to bypass the type checker. Some languages allow programmers to choose between static and dynamic type safety. For example, historically C# declares variables statically, but C# 4.0 introduces the dynamic keyword, which is used to declare variables to be checked dynamically at runtime. Other languages allow writing code that is not type-safe; for example, in C, programmers can freely cast a value between any two types that have the same size, effectively subverting the type concept.

Dynamic type checking and runtime type information

Dynamic type checking is the process of verifying the type safety of a program at runtime. Implementations of dynamically type-checked languages generally associate each runtime object with a ''type tag'' (i.e., a reference to a type) containing its type information. This runtime type information (RTTI) can also be used to implement dynamic dispatch, late binding, downcasting,

reflective programming In computer science, reflective programming or reflection is the ability of a process to examine, introspect, and modify its own structure and behavior. Historical background The earliest computers were programmed in their native assembly lang ...

(reflection), and similar features. Most type-safe languages include some form of dynamic type checking, even if they also have a static type checker. The reason for this is that many useful features or properties are difficult or impossible to verify statically. For example, suppose that a program defines two types, A and B, where B is a subtype of A. If the program tries to convert a value of type A to type B, which is known as downcasting, then the operation is legal only if the value being converted is actually a value of type B. Thus, a dynamic check is needed to verify that the operation is safe. This requirement is one of the criticisms of downcasting. By definition, dynamic type checking may cause a program to fail at runtime. In some programming languages, it is possible to anticipate and recover from these failures. In others, type-checking errors are considered fatal. Programming languages that include dynamic type checking but not static type checking are often called "dynamically typed programming languages".

Combining static and dynamic type checking

Certain languages allow both static and dynamic typing. For example, Java and some other ostensibly statically typed languages support downcasting types to their subtypes, querying an object to discover its dynamic type and other type operations that depend on runtime type information. Another example is C++ RTTI. More generally, most programming languages include mechanisms for dispatching over different 'kinds' of data, such as

disjoint union In mathematics, the disjoint union (or discriminated union) A \sqcup B of the sets and is the set formed from the elements of and labelled (indexed) with the name of the set from which they come. So, an element belonging to both and appe ...

s, runtime polymorphism, and variant types. Even when not interacting with type annotations or type checking, such mechanisms are materially similar to dynamic typing implementations. Objects in object-oriented languages are usually accessed by a reference whose static target type (or manifest type) is equal to either the object's run-time type (its latent type) or a supertype thereof. This is conformant with the Liskov substitution principle, which states that all operations performed on an instance of a given type can also be performed on an instance of a subtype. This concept is also known as subsumption or subtype polymorphism. In some languages subtypes may also possess covariant or contravariant return types and argument types respectively. Certain languages, for example

Clojure Clojure (, like ''closure'') is a dynamic programming language, dynamic and functional programming, functional dialect (computing), dialect of the programming language Lisp (programming language), Lisp on the Java (software platform), Java platfo ...

Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ''ANSI INCITS 226-1994 (S2018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperli ...

, or Cython are dynamically type checked by default, but allow programs to opt into static type checking by providing optional annotations. One reason to use such hints would be to optimize the performance of critical sections of a program. This is formalized by gradual typing. The programming environment '' DrRacket'', a pedagogic environment based on Lisp, and a precursor of the language Racket is also soft-typed. Conversely, as of version 4.0, the C# language provides a way to indicate that a variable should not be statically type checked. A variable whose type is dynamic will not be subject to static type checking. Instead, the program relies on runtime type information to determine how the variable may be used. In Rust, the type provides dynamic typing of types.

Static and dynamic type checking in practice

The choice between static and dynamic typing requires certain trade-offs. Static typing can find type errors reliably at compile time, which increases the reliability of the delivered program. However, programmers disagree over how commonly type errors occur, resulting in further disagreements over the proportion of those bugs that are coded that would be caught by appropriately representing the designed types in code. Static typing advocates believe programs are more reliable when they have been well type-checked, whereas dynamic-typing advocates point to distributed code that has proven reliable and to small bug databases. The value of static typing increases as the strength of the type system is increased. Advocates of dependent typing, implemented in languages such as Dependent ML and

Epigram An epigram is a brief, interesting, memorable, sometimes surprising or satirical statement. The word derives from the Greek (, "inscription", from [], "to write on, to inscribe"). This literary device has been practiced for over two millennia ...

, have suggested that almost all bugs can be considered type errors, if the types used in a program are properly declared by the programmer or correctly inferred by the compiler. Static typing usually results in compiled code that executes faster. When the compiler knows the exact data types that are in use (which is necessary for static verification, either through declaration or inference) it can produce optimized machine code. Some dynamically typed languages such as

allow optional type declarations for optimization for this reason. By contrast, dynamic typing may allow compilers to run faster and interpreters to dynamically load new code, because changes to source code in dynamically typed languages may result in less checking to perform and less code to revisit. This too may reduce the edit-compile-test-debug cycle. Statically typed languages that lack

(such as C and

Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...

prior to version 10) require that programmers declare the types that a method or function must use. This can serve as added program documentation, that is active and dynamic, instead of static. This allows a compiler to prevent it from drifting out of synchrony, and from being ignored by programmers. However, a language can be statically typed without requiring type declarations (examples include

, Scala,

OCaml OCaml ( , formerly Objective Caml) is a General-purpose programming language, general-purpose, High-level programming language, high-level, Comparison of multi-paradigm programming languages, multi-paradigm programming language which extends the ...

, F#,

Swift Swift or SWIFT most commonly refers to: * SWIFT, an international organization facilitating transactions between banks ** SWIFT code * Swift (programming language) * Swift (bird), a family of birds It may also refer to: Organizations * SWIF ...

, and to a lesser extent C# and C++), so explicit type declaration is not a necessary requirement for static typing in all languages. Dynamic typing allows constructs that some (simple) static type checking would reject as illegal. For example, '' eval'' functions, which execute arbitrary data as code, become possible. An ''eval'' function is possible with static typing, but requires advanced uses of algebraic data types. Further, dynamic typing better accommodates transitional code and prototyping, such as allowing a placeholder data structure ( mock object) to be transparently used in place of a full data structure (usually for the purposes of experimentation and testing). Dynamic typing typically allows duck typing (which enables easier code reuse). Many languages with static typing also feature duck typing or other mechanisms like generic programming that also enable easier code reuse. Dynamic typing typically makes

metaprogramming Metaprogramming is a computer programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyse, or transform other programs, and even modi ...

easier to use. For example, C++ templates are typically more cumbersome to write than the equivalent

Ruby Ruby is a pinkish-red-to-blood-red-colored gemstone, a variety of the mineral corundum ( aluminium oxide). Ruby is one of the most popular traditional jewelry gems and is very durable. Other varieties of gem-quality corundum are called sapph ...

or Python code since C++ has stronger rules regarding type definitions (for both functions and variables). This forces a developer to write more boilerplate code for a template than a Python developer would need to. More advanced run-time constructs such as metaclasses and

introspection Introspection is the examination of one's own conscious thoughts and feelings. In psychology, the process of introspection relies on the observation of one's mental state, while in a spiritual context it may refer to the examination of one's s ...

are often harder to use in statically typed languages. In some languages, such features may also be used e.g. to generate new types and behaviors on the fly, based on run-time data. Such advanced constructs are often provided by

dynamic programming language A dynamic programming language is a type of programming language that allows various operations to be determined and executed at runtime. This is different from the compilation phase. Key decisions about variables, method calls, or data types are ...

s; many of these are dynamically typed, although ''dynamic typing'' need not be related to ''dynamic programming languages''.

Strong and weak type systems

Languages are often colloquially referred to as ''strongly typed'' or ''weakly typed''. In fact, there is no universally accepted definition of what these terms mean. In general, there are more precise terms to represent the differences between type systems that lead people to call them "strong" or "weak".

Type safety and memory safety

A third way of categorizing the type system of a programming language is by the safety of typed operations and conversions. Computer scientists use the term ''type-safe language'' to describe languages that do not allow operations or conversions that violate the rules of the type system. Computer scientists use the term ''memory-safe language'' (or just ''safe language'') to describe languages that do not allow programs to access memory that has not been assigned for their use. For example, a memory-safe language will check array bounds, or else statically guarantee (i.e., at compile time before execution) that array accesses out of the array boundaries will cause compile-time and perhaps runtime errors. Consider the following program of a language that is both type-safe and memory-safe: var x := 5; var y := "37"; var z := x + y; In this example, the variable will have the value 42. Although this may not be what the programmer anticipated, it is a well-defined result. If were a different string, one that could not be converted to a number (e.g. "Hello World"), the result would be well-defined as well. Note that a program can be type-safe or memory-safe and still crash on an invalid operation. This is for languages where the type system is not sufficiently advanced to precisely specify the validity of operations on all possible operands. But if a program encounters an operation that is not type-safe, terminating the program is often the only option. Now consider a similar example in C: int x = 5; char y[] = "37"; char* z = x + y; printf("%c\n", *z); In this example will point to a memory address five characters beyond , equivalent to three characters after the terminating zero character of the string pointed to by . This is memory that the program is not expected to access. In C terms this is simply undefined behaviour and the program may do anything; with a simple compiler it might actually print whatever byte is stored after the string "37". As this example shows, C is not memory-safe. As arbitrary data was assumed to be a character, it is also not a type-safe language. In general, type-safety and memory-safety go hand in hand. For example, a language that supports pointer arithmetic and number-to-pointer conversions (like C) is neither memory-safe nor type-safe, because it allows arbitrary memory to be accessed as if it were valid memory of any type.

Variable levels of type checking

Some languages allow different levels of checking to apply to different regions of code. Examples include: * The use strict directive in

JavaScript JavaScript (), often abbreviated as JS, is a programming language and core technology of the World Wide Web, alongside HTML and CSS. Ninety-nine percent of websites use JavaScript on the client side for webpage behavior. Web browsers have ...

and

Perl Perl is a high-level, general-purpose, interpreted, dynamic programming language. Though Perl is not officially an acronym, there are various backronyms in use, including "Practical Extraction and Reporting Language". Perl was developed ...

applies stronger checking. * The declare(strict_types=1) in PHP on a per-file basis allows only a variable of exact type of the type declaration will be accepted, or a TypeError will be thrown. * The Option Strict On in VB.NET allows the compiler to require a conversion between objects. Additional tools such as lint and

IBM Rational Purify PurifyPlus is a memory debugger program used by software developers to detect memory access errors in programs, especially those written in C (programming language), C or C++. It was originally written by Reed Hastings of Pure Software.

can also be used to achieve a higher level of strictness.

Optional type systems

It has been proposed, chiefly by Gilad Bracha, that the choice of type system be made independent of choice of language; that a type system should be a module that can be ''plugged'' into a language as needed. He believes this is advantageous, because what he calls mandatory type systems make languages less expressive and code more fragile. The requirement that the type system does not affect the semantics of the language is difficult to fulfill. Optional typing is related to, but distinct from, gradual typing. While both typing disciplines can be used to perform static analysis of code (

static typing In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a ''type'' (for example, integer, floating point, string) to every '' term'' (a word, phrase, or other set of symbols). Usu ...

), optional type systems do not enforce type safety at runtime ( dynamic typing).

Polymorphism and types

The term ''polymorphism'' refers to the ability of code (especially, functions or classes) to act on values of multiple types, or to the ability of different instances of the same data structure to contain elements of different types. Type systems that allow polymorphism generally do so in order to improve the potential for code re-use: in a language with polymorphism, programmers need only implement a data structure such as a list or an

associative array In computer science, an associative array, key-value store, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In math ...

once, rather than once for each type of element with which they plan to use it. For this reason computer scientists sometimes call the use of certain forms of polymorphism '' generic programming''. The type-theoretic foundations of polymorphism are closely related to those of

abstraction Abstraction is a process where general rules and concepts are derived from the use and classifying of specific examples, literal (reality, real or Abstract and concrete, concrete) signifiers, first principles, or other methods. "An abstraction" ...

modularity Modularity is the degree to which a system's components may be separated and recombined, often with the benefit of flexibility and variety in use. The concept of modularity is used primarily to reduce complexity by breaking a system into varying ...

and (in some cases)

subtyping In programming language theory, subtyping (also called subtype polymorphism or inclusion polymorphism) is a form of type polymorphism. A ''subtype'' is a datatype that is related to another datatype (the ''supertype'') by some notion of substi ...

Specialized type systems

Many type systems have been created that are specialized for use in certain environments with certain types of data, or for out-of-band static program analysis. Frequently, these are based on ideas from formal

and are only available as part of prototype research systems. The following table gives an overview over type theoretic concepts that are used in specialized type systems. The names range over terms and the names

\sigma, \tau

range over types. The following notation will be used: *

M:\sigma

means that

M

has type

\sigma

; *

M(N)

is that application of

M

N

; *

\tau alpha := \sigma /math> (resp. \tau := N /math>) describes the type which results from replacing all occurrences of the type variable  (resp. term variable ) in \tau by the type  (resp. term ).

Dependent types

Dependent types are based on the idea of using scalars or values to more precisely describe the type of some other value. For example,

\mathrm(3, 3)

might be the type of a

3 \times 3

matrix. We can then define typing rules such as the following rule for matrix multiplication: :

\mathrm_ : \mathrm(k, m) \times \mathrm(m, n) \to \mathrm(k, n)

where , , are arbitrary positive integer values. A variant of ML called Dependent ML has been created based on this type system, but because type checking for conventional dependent types is undecidable, not all programs using them can be type-checked without some kind of limits. Dependent ML limits the sort of equality it can decide to Presburger arithmetic. Other languages such as

make the value of all expressions in the language decidable so that type checking can be decidable. However, in general proof of decidability is undecidable, so many programs require hand-written annotations that may be very non-trivial. As this impedes the development process, many language implementations provide an easy way out in the form of an option to disable this condition. This, however, comes at the cost of making the type-checker run in an infinite loop when fed programs that do not type-check, causing the compilation to fail.

Linear types

Linear types, based on the theory of linear logic, and closely related to uniqueness types, are types assigned to values having the property that they have one and only one reference to them at all times. These are valuable for describing large immutable values such as files, strings, and so on, because any operation that simultaneously destroys a linear object and creates a similar object (such as str = str + "a") can be optimized "under the hood" into an in-place mutation. Normally this is not possible, as such mutations could cause side effects on parts of the program holding other references to the object, violating

referential transparency In analytic philosophy and computer science, referential transparency and referential opacity are properties of linguistic constructions, and by extension of languages. A linguistic construction is called ''referentially transparent'' when for an ...

. They are also used in the prototype operating system Singularity for interprocess communication, statically ensuring that processes cannot share objects in shared memory in order to prevent race conditions. The Clean language (a

-like language) uses this type system in order to gain a lot of speed (compared to performing a deep copy) while remaining safe.

Intersection types

Intersection types are types describing values that belong to ''both'' of two other given types with overlapping value sets. For example, in most implementations of C the signed char has range -128 to 127 and the unsigned char has range 0 to 255, so the intersection type of these two types would have range 0 to 127. Such an intersection type could be safely passed into functions expecting ''either'' signed or unsigned chars, because it is compatible with both types. Intersection types are useful for describing overloaded function types: for example, if " → " is the type of functions taking an integer argument and returning an integer, and " → " is the type of functions taking a float argument and returning a float, then the intersection of these two types can be used to describe functions that do one or the other, based on what type of input they are given. Such a function could be passed into another function expecting an " → " function safely; it simply would not use the " → " functionality. In a subclassing hierarchy, the intersection of a type and an ancestor type (such as its parent) is the most derived type. The intersection of sibling types is empty. The Forsythe language includes a general implementation of intersection types. A restricted form is refinement types.

Union types

Union types are types describing values that belong to ''either'' of two types. For example, in C, the signed char has a -128 to 127 range, and the unsigned char has a 0 to 255 range, so the union of these two types would have an overall "virtual" range of -128 to 255 that may be used partially depending on which union member is accessed. Any function handling this union type would have to deal with integers in this complete range. More generally, the only valid operations on a union type are operations that are valid on ''both'' types being unioned. C's "union" concept is similar to union types, but is not typesafe, as it permits operations that are valid on ''either'' type, rather than ''both''. Union types are important in program analysis, where they are used to represent symbolic values whose exact nature (e.g., value or type) is not known. In a subclassing hierarchy, the union of a type and an ancestor type (such as its parent) is the ancestor type. The union of sibling types is a subtype of their common ancestor (that is, all operations permitted on their common ancestor are permitted on the union type, but they may also have other valid operations in common).

Existential types

Existential types are frequently used in connection with record types to represent modules and abstract data types, due to their ability to separate implementation from interface. For example, the type "T = ∃X " describes a module interface that has a data member named ''a'' of type ''X'' and a function named ''f'' that takes a parameter of the ''same'' type ''X'' and returns an integer. This could be implemented in different ways; for example: * intT = * floatT = These types are both subtypes of the more general existential type T and correspond to concrete implementation types, so any value of one of these types is a value of type T. Given a value "t" of type "T", we know that "t.f(t.a)" is well-typed, regardless of what the abstract type ''X'' is. This gives flexibility for choosing types suited to a particular implementation, while clients that use only values of the interface type—the existential type—are isolated from these choices. In general it's impossible for the typechecker to infer which existential type a given module belongs to. In the above example intT could also have the type ∃X . The simplest solution is to annotate every module with its intended type, e.g.: * intT = as ∃X Although abstract data types and modules had been implemented in programming languages for quite some time, it wasn't until 1988 that John C. Mitchell and Gordon Plotkin established the formal theory under the slogan: "Abstract atatypes have existential type". The theory is a second-order typed lambda calculus similar to System F, but with existential instead of universal quantification.

Gradual typing

In a type system with Gradual typing, variables may be assigned a type either at compile-time (which is static typing), or at run-time (which is dynamic typing). This allows software developers to choose either type paradigm as appropriate, from within a single language. Gradual typing uses a special type named ''dynamic'' to represent statically unknown types; gradual typing replaces the notion of type equality with a new relation called ''consistency'' that relates the dynamic type to every other type. The consistency relation is symmetric but not transitive.

Explicit or implicit declaration and inference

Many static type systems, such as those of C and Java, require ''type declarations'': the programmer must explicitly associate each variable with a specific type. Others, such as Haskell's, perform ''

'': the compiler draws conclusions about the types of variables based on how programmers use those variables. For example, given a function that adds and together, the compiler can infer that and must be numbers—since addition is only defined for numbers. Thus, any call to elsewhere in the program that specifies a non-numeric type (such as a string or list) as an argument would signal an error. Numerical and string constants and expressions in code can and often do imply type in a particular context. For example, an expression might imply a type of

floating-point In computing, floating-point arithmetic (FP) is arithmetic on subsets of real numbers formed by a ''significand'' (a Sign (mathematics), signed sequence of a fixed number of digits in some Radix, base) multiplied by an integer power of that ba ...

, while might imply a list of integers—typically an array. Type inference is in general possible, if it is computable in the type system in question. Moreover, even if inference is not computable in general for a given type system, inference is often possible for a large subset of real-world programs. Haskell's type system, a version of Hindley–Milner, is a restriction of System Fω to so-called rank-1 polymorphic types, in which type inference is computable. Most Haskell compilers allow arbitrary-rank polymorphism as an extension, but this makes type inference not computable. (Type checking is decidable, however, and rank-1 programs still have type inference; higher rank polymorphic programs are rejected unless given explicit type annotations.)

Decision problems

A type system that assigns types to terms in type environments using

typing rule In type theory, a typing rule is an inference rule that describes how a type system assigns a type to a syntactic construction. These rules may be applied by the type system to determine if a program is well-typed and what type expressions ha ...

s is naturally associated with the

decision problem In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question on a set of input values. An example of a decision problem is deciding whether a given natura ...

s of

type checking In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a ''type'' (for example, integer, floating point, string) to every '' term'' (a word, phrase, or other set of symbols). Usu ...

, typability, and type inhabitation. * Given a type environment

\Gamma

, a term

e

, and a type

\tau

, decide whether the term

e

can be assigned the type

\tau

in the type environment. * Given a term

e

, decide whether there exists a type environment

\Gamma

and a type

\tau

such that the term

e

can be assigned the type

\tau

in the type environment

\Gamma

. * Given a type environment

\Gamma

and a type

\tau

, decide whether there exists a term

e

that can be assigned the type

\tau

in the type environment.

Unified type system

Some languages like C# or Scala have a unified type system. This means that all C# types including primitive types inherit from a single root object. Every type in C# inherits from the Object class. Some languages, like

and Raku, have a root type but also have primitive types that are not objects. Java provides wrapper object types that exist together with the primitive types so developers can use either the wrapper object types or the simpler non-object primitive types. Raku automatically converts primitive types to objects when their methods are accessed.

Compatibility: equivalence and subtyping

A type checker for a statically typed language must verify that the type of any expression is consistent with the type expected by the context in which that expression appears. For example, in an assignment statement of the form x := ''e'', the inferred type of the expression ''e'' must be consistent with the declared or inferred type of the variable x. This notion of consistency, called ''compatibility'', is specific to each programming language. If the type of ''e'' and the type of x are the same, and assignment is allowed for that type, then this is a valid expression. Thus, in the simplest type systems, the question of whether two types are compatible reduces to that of whether they are ''equal'' (or ''equivalent''). Different languages, however, have different criteria for when two type expressions are understood to denote the same type. These different ''equational theories'' of types vary widely, two extreme cases being ''

structural type system A structural type system (or property-based type system) is a major class of type systems in which type compatibility and equivalence are determined by the type's actual structure or definition and not by other characteristics such as its name ...

s'', in which any two types that describe values with the same structure are equivalent, and '' nominative type systems'', in which no two syntactically distinct type expressions denote the same type (''i.e.'', types must have the same "name" in order to be equal). In languages with

, the compatibility relation is more complex: If B is a subtype of A, then a value of type B can be used in a context where one of type A is expected ( covariant), even if the reverse is not true. Like equivalence, the subtype relation is defined differently for each programming language, with many variations possible. The presence of parametric or ad hoc polymorphism in a language may also have implications for type compatibility.

Notes

References

External links

* * {{DEFAULTSORT:Type System Data types Program analysis Type theory Articles with example C code