HOME

TheInfoList



OR:

In
functional programming In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...
, a generalized algebraic data type (GADT, also first-class phantom type, guarded recursive datatype, or equality-qualified type) is a generalization of a parametric
algebraic data type In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite data type, i.e., a data type formed by combining other types. Two common classes of algebraic types are product ty ...
(ADT).


Overview

In a GADT, the product constructors (called data constructors in
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
) can provide an explicit instantiation of the ADT as the type instantiation of their return value. This allows defining functions with a more advanced type behaviour. For a data constructor of Haskell 2010, the return value has the type instantiation implied by the instantiation of the ADT parameters at the constructor's application. -- A parametric ADT that is not a GADT data List a = Nil , Cons a (List a) integers :: List Int integers = Cons 12 (Cons 107 Nil) strings :: List String strings = Cons "boat" (Cons "dock" Nil) -- A GADT data Expr a where EBool :: Bool -> Expr Bool EInt :: Int -> Expr Int EEqual :: Expr Int -> Expr Int -> Expr Bool eval :: Expr a -> a eval e = case e of EBool a -> a EInt a -> a EEqual a b -> (eval a)

(eval b) expr1 :: Expr Bool expr1 = EEqual (EInt 2) (EInt 3) ret = eval expr1 -- False
They are currently implemented in the
Glasgow Haskell Compiler The Glasgow Haskell Compiler (GHC) is a native or machine code compiler for the functional programming language Haskell. It provides a cross-platform software environment for writing and testing Haskell code and supports many extensions, libra ...
(GHC) as a non-standard extension, used by, among others, Pugs and
Darcs Darcs is a distributed version control system created by David Roundy. Key features include the ability to choose which changes to accept from other repositories, interaction with either other local (on-disk) repositories or remote repositories v ...
.
OCaml OCaml ( , formerly Objective Caml) is a General-purpose programming language, general-purpose, High-level programming language, high-level, Comparison of multi-paradigm programming languages, multi-paradigm programming language which extends the ...
supports GADT natively since version 4.00. The GHC implementation provides support for existentially quantified type parameters and for local constraints.


History

An early version of generalized algebraic data types were described by and based on
pattern matching In computer science, pattern matching is the act of checking a given sequence of tokens for the presence of the constituents of some pattern. In contrast to pattern recognition, the match usually must be exact: "either it will or will not be a ...
in ALF. Generalized algebraic data types were introduced independently by and prior by as extensions to the
algebraic data type In computer programming, especially functional programming and type theory, an algebraic data type (ADT) is a kind of composite data type, i.e., a data type formed by combining other types. Two common classes of algebraic types are product ty ...
s of ML and
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
. Both are essentially equivalent to each other. They are similar to the '' inductive families of data types'' (or ''inductive datatypes'') found in
Coq Coenzyme Q10 (CoQ10 ), also known as ubiquinone, is a naturally occurring biochemical cofactor (coenzyme) and an antioxidant produced by the human body. It can also be obtained from dietary sources, such as meat, fish, seed oils, vegetables, ...
's Calculus of Inductive Constructions and other dependently typed languages, modulo the dependent types and except that the latter have an additional positivity restriction which is not enforced in GADTs. introduced ''extended algebraic data types'' which combine GADTs together with the existential data types and
type class In computer science, a type class is a type system construct that supports ad hoc polymorphism. This is achieved by adding constraints to type variables in parametrically polymorphic types. Such a constraint typically involves a type class T a ...
constraints.
Type inference Type inference, sometimes called type reconstruction, refers to the automatic detection of the type of an expression in a formal language. These include programming languages and mathematical type systems, but also natural languages in some bran ...
in the absence of any programmer supplied type annotation, is undecidable and functions defined over GADTs do not admit principal types in general. Type reconstruction requires several design trade-offs and is an area of active research (; . In spring 2021, Scala 3.0 was released. This major update of Scala introduced the possibility to write GADTs with the same syntax as algebraic data types, which is not the case in other
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
s according to Martin Odersky.


Applications

Applications of GADTs include
generic programming Generic programming is a style of computer programming in which algorithms are written in terms of data types ''to-be-specified-later'' that are then ''instantiated'' when needed for specific types provided as parameters. This approach, pioneer ...
, modelling programming languages (
higher-order abstract syntax In computer science, higher-order abstract syntax (abbreviated HOAS) is a technique for the representation of abstract syntax trees for languages with variable name binding, binders. Relation to first-order abstract syntax An abstract syntax is ' ...
), maintaining invariants in
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
s, expressing constraints in embedded domain-specific languages, and modelling objects.


Higher-order abstract syntax

An important application of GADTs is to embed
higher-order abstract syntax In computer science, higher-order abstract syntax (abbreviated HOAS) is a technique for the representation of abstract syntax trees for languages with variable name binding, binders. Relation to first-order abstract syntax An abstract syntax is ' ...
in a type safe fashion. Here is an embedding of the
simply typed lambda calculus The simply typed lambda calculus (), a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor () that builds function types. It is the canonical and simplest example of a typed lambda calculus. The ...
with an arbitrary collection of base types,
product type In programming languages and type theory, a product of ''types'' is another, compounded, type in a structure. The "operands" of the product are types, and the structure of a product type is determined by the fixed order of the operands in the produ ...
s (
tuple In mathematics, a tuple is a finite sequence or ''ordered list'' of numbers or, more generally, mathematical objects, which are called the ''elements'' of the tuple. An -tuple is a tuple of elements, where is a non-negative integer. There is o ...
s) and a
fixed point combinator In combinatory logic for computer science, a fixed-point combinator (or fixpoint combinator) is a higher-order function (i.e., a function which takes a function as argument) that returns some '' fixed point'' (a value that is mapped to itself) of ...
: data Lam :: * -> * where Lift :: a -> Lam a -- ^ lifted value Pair :: Lam a -> Lam b -> Lam (a, b) -- ^ product Lam :: (Lam a -> Lam b) -> Lam (a -> b) -- ^ lambda abstraction App :: Lam (a -> b) -> Lam a -> Lam b -- ^ function application Fix :: Lam (a -> a) -> Lam a -- ^ fixed point And a type safe evaluation function: eval :: Lam t -> t eval (Lift v) = v eval (Pair l r) = (eval l, eval r) eval (Lam f) = \x -> eval (f (Lift x)) eval (App f x) = (eval f) (eval x) eval (Fix f) = (eval f) (eval (Fix f)) The factorial function can now be written as: fact = Fix (Lam (\f -> Lam (\y -> Lift (if eval y

0 then 1 else eval y * (eval f) (eval y - 1))))) eval(fact)(10)
Problems would have occurred using regular algebraic data types. Dropping the type parameter would have made the lifted base types existentially quantified, making it impossible to write the evaluator. With a type parameter, it is still restricted to one base type. Further, ill-formed expressions such as App (Lam (\x -> Lam (\y -> App x y))) (Lift True) would have been possible to construct, while they are type incorrect using the GADT. A well-formed analogue is App (Lam (\x -> Lam (\y -> App x y))) (Lift (\z -> True)). This is because the type of x is Lam (a -> b), inferred from the type of the Lam data constructor.


See also

*
Type variable In type theory and programming languages, a type variable is a mathematical variable ranging over types. Even in programming languages that allow mutable variables, a type variable remains an abstraction, in the sense that it does not correspond ...


Notes


Further reading

; Applications * * * * ; Semantics * Patricia Johann and Neil Ghani (2008).
Foundations for Structured Programming with GADTs
. * Arie Middelkoop, Atze Dijkstra and S. Doaitse Swierstra (2011).
A lean specification for GADTs: system F with first-class equality proofs
. ''Higher-Order and Symbolic Computation''. ; Type reconstruction * * * * * * ; Other * Andrew Kennedy and Claudio V. Russo.
Generalized algebraic data types and object-oriented programming
. ''In Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications''. ACM Press, 2005.


External links


Generalised Algebraic Datatype Page
on the Haskell
wiki A wiki ( ) is a form of hypertext publication on the internet which is collaboratively edited and managed by its audience directly through a web browser. A typical wiki contains multiple pages that can either be edited by the public or l ...

Generalised Algebraic Data Types
in the GHC Users' Guide
Generalized Algebraic Data Types and Object-Oriented ProgrammingGADTs – Haskell Prime – TracPapers about type inference for GADTs
bibliography by
Simon Peyton Jones Simon Peyton Jones (born 18 January 1958) is a British computer scientist who researches the implementation and applications of functional programming languages, particularly lazy functional programming. Education Peyton Jones graduated fro ...

Type inference with constraints
bibliography by
Simon Peyton Jones Simon Peyton Jones (born 18 January 1958) is a British computer scientist who researches the implementation and applications of functional programming languages, particularly lazy functional programming. Education Peyton Jones graduated fro ...

Emulating GADTs in Java via the Yoneda lemma
{{Data types Functional programming Dependently typed programming Type theory Composite data types Data types