computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...

, a

programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...

is said to have first-class functions if it treats functions as

first-class citizen In a given programming language design, a first-class citizen is an entity which supports all the operations generally available to other entities. These operations typically include being passed as an argument, returned from a function, and ass ...

s. This means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures. Some programming language theorists require support for

anonymous function In computer programming, an anonymous function (function literal, expression or block) is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for const ...

s (function literals) as well.Programming language pragmatics
by Michael Lee Scott, section 11.2 "Functional Programming". In languages with first-class functions, the

names A name is a term used for identification by an external observer. They can identify a class or category of things, or a single thing, either uniquely, or within a given context. The entity identified by a name is called its referent. A person ...

of functions do not have any special status; they are treated like ordinary variables with a

function type In computer science and mathematical logic, a function type (or arrow type or exponential) is the type of a variable or parameter to which a function has or can be assigned, or an argument or result type of a higher-order function taking or return ...

. The term was coined by

Christopher Strachey Christopher S. Strachey (; 16 November 1916 – 18 May 1975) was a British computer scientist. He was one of the founders of denotational semantics, and a pioneer in programming language design and computer time-sharing.F. J. Corbató, et al., T ...

in the context of "functions as first-class citizens" in the mid-1960s. (also on 2010-02-16 First-class functions are a necessity for the

functional programming In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...

style, in which the use of

higher-order function In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following: * takes one or more functions as arguments (i.e. a procedural parameter, which is a parameter of a procedure that is itself ...

s is a standard practice. A simple example of a higher-ordered function is the ''

map A map is a symbolic depiction of interrelationships, commonly spatial, between things within a space. A map may be annotated with text and graphics. Like any graphic, a map may be fixed to paper or other durable media, or may be displayed on ...

'' function, which takes, as its arguments, a function and a list, and returns the list formed by applying the function to each member of the list. For a language to support ''map'', it must support passing a function as an argument. There are certain implementation difficulties in passing functions as arguments or returning them as results, especially in the presence of

non-local variable In programming language theory, a non-local variable is a variable that is not defined in the local Scope (computer science), scope. While the term can refer to global variables, it is primarily used in the context of nested function, nested and ...

s introduced in nested and

s. Historically, these were termed the '' funarg problems'', the name coming from ''function argument''. In early imperative languages these problems were avoided by either not supporting functions as result types (e.g.

ALGOL 60 ALGOL 60 (short for ''Algorithmic Language 1960'') is a member of the ALGOL family of computer programming languages. It followed on from ALGOL 58 which had introduced code blocks and the begin and end pairs for delimiting them, representing a ...

, Pascal) or omitting nested functions and thus non-local variables (e.g. C). The early functional language

Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...

took the approach of

dynamic scoping In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...

, where non-local variables refer to the closest definition of that variable at the point where the function is executed, instead of where it was defined. Proper support for

lexically scoped In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...

first-class functions was introduced in Scheme and requires handling references to functions as closures instead of bare

function pointer A function pointer, also called a subroutine pointer or procedure pointer, is a pointer referencing executable code, rather than data. Dereferencing the function pointer yields the referenced function, which can be invoked and passed arguments ...

s, which in turn makes

garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclable ...

a necessity.

Concepts

In this section, we compare how particular programming idioms are handled in a functional language with first-class functions (

Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...

) compared to an imperative language where functions are second-class citizens ( C).

Higher-order functions: passing functions as arguments

In languages where functions are first-class citizens, functions can be passed as arguments to other functions in the same way as other values (a function taking another function as argument is called a higher-order function). In the language

: map :: (a -> b) -> -> map f [] = [] map f (x:xs) = f x : map f xs Languages where functions are not first-class often still allow one to write higher-order functions through the use of features such as

s or delegates. In the language C: void map(int (*f)(int), int x[], size_t n) There are a number of differences between the two approaches that are ''not'' directly related to the support of first-class functions. The Haskell sample operates on

list A list is a Set (mathematics), set of discrete items of information collected and set forth in some format for utility, entertainment, or other purposes. A list may be memorialized in any number of ways, including existing only in the mind of t ...

s, while the C sample operates on

arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...

. Both are the most natural compound data structures in the respective languages and making the C sample operate on linked lists would have made it unnecessarily complex. This also accounts for the fact that the C function needs an additional parameter (giving the size of the array.) The C function updates the array

in-place In computer science, an in-place algorithm is an algorithm that operates directly on the input data structure without requiring extra space proportional to the input size. In other words, it modifies the input in place, without creating a separa ...

, returning no value, whereas in Haskell data structures are persistent (a new list is returned while the old is left intact.) The Haskell sample uses

recursion Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...

to traverse the list, while the C sample uses

iteration Iteration is the repetition of a process in order to generate a (possibly unbounded) sequence of outcomes. Each repetition of the process is a single iteration, and the outcome of each iteration is then the starting point of the next iteration. ...

. Again, this is the most natural way to express this function in both languages, but the Haskell sample could easily have been expressed in terms of a fold and the C sample in terms of recursion. Finally, the Haskell function has a polymorphic type, as this is not supported by C we have fixed all type variables to the type constant int.

Anonymous and nested functions

In languages supporting anonymous functions, we can pass such a function as an argument to a higher-order function: main = map (\x -> 3 * x + 1) , 2, 3, 4, 5 In a language which does not support anonymous functions, we have to bind it to a name instead: int f(int x) int main()

Non-local variables and closures

Once we have anonymous or nested functions, it becomes natural for them to refer to variables outside of their body (called ''non-local variables''): main = let a = 3 b = 1 in map (\x -> a * x + b) , 2, 3, 4, 5 If functions are represented with bare function pointers, we can not know anymore how the value that is outside of the function's body should be passed to it, and because of that a closure needs to be built manually. Therefore we can not speak of "first-class" functions here. typedef struct closure_t; void map(closure_t *closure, int x[], size_t n) int f(int a, int b, int x) void main() Also note that the map is now specialized to functions referring to two ints outside of their environment. This can be set up more generally, but requires more

boilerplate code In computer programming, boilerplate code, or simply boilerplate, are sections of code that are repeated in multiple places with little to no variation. When using languages that are considered ''verbose'', the programmer must write a lot of boile ...

. If f would have been a

nested function In computer programming, a nested function (or nested procedure or subroutine) is a named function that is defined within another, enclosing, block and is lexically scoped within the enclosing block meaning it is only callable by name within t ...

we would still have run into the same problem and this is the reason they are not supported in C.

Higher-order functions: returning functions as results

When returning a function, we are in fact returning its closure. In the C example any local variables captured by the closure will go out of scope once we return from the function that builds the closure. Forcing the closure at a later point will result in undefined behaviour, possibly corrupting the stack. This is known as the upwards funarg problem.

Assigning functions to variables

Assigning functions to variables and storing them inside (global) datastructures potentially suffers from the same difficulties as returning functions. f ::

Integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...

-> nteger f = let a = 3 b = 1 in ap (\x -> a * x + b), map (\x -> b * x + a)

Equality of functions

As one can test most literals and values for equality, it is natural to ask whether a programming language can support testing functions for equality. On further inspection, this question appears more difficult and one has to distinguish between several types of function equality: ; Extensional equality: Two functions ''f'' and ''g'' are considered extensionally equal if they agree on their outputs for all inputs (∀''x''. ''f''(''x'') = ''g''(''x'')). Under this definition of equality, for example, any two implementations of a stable sorting algorithm, such as

insertion sort Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. Howev ...

and

merge sort In computer science, merge sort (also commonly spelled as mergesort and as ) is an efficient, general-purpose, and comparison sort, comparison-based sorting algorithm. Most implementations of merge sort are Sorting algorithm#Stability, stable, wh ...

, would be considered equal. Deciding on extensional equality is undecidable in general and even for functions with finite domains often intractable. For this reason no programming language implements function equality as extensional equality. ; Intensional equality: Under intensional equality, two functions ''f'' and ''g'' are considered equal if they have the same "internal structure". This kind of equality could be implemented in

interpreted language In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An inter ...

s by comparing the

source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...

of the function bodies (such as in Interpreted Lisp 1.5) or the

object code In computing, object code or object module is the product of an assembler or compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...

compiled language Compiled language categorizes a programming language as used with a compiler and generally implies not used with an interpreter. But, since any language can theoretically be compiled or interpreted the term lacks clarity. In practice, for some lan ...

s. Intensional equality implies extensional equality (assuming the functions are deterministic and have no hidden inputs, such as the

program counter The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, ...

or a mutable

global variable In computer programming, a global variable is a variable with global scope, meaning that it is visible (hence accessible) throughout the program, unless shadowed. The set of all global variables is known as the ''global environment'' or ''global ...

.) ; Reference equality: Given the impracticality of implementing extensional and intensional equality, most languages supporting testing functions for equality use reference equality. All functions or closures are assigned a unique identifier (usually the address of the function body or the closure) and equality is decided based on equality of the identifier. Two separately defined, but otherwise identical function definitions will be considered unequal. Referential equality implies intensional and extensional equality. Referential equality breaks

referential transparency In analytic philosophy and computer science, referential transparency and referential opacity are properties of linguistic constructions, and by extension of languages. A linguistic construction is called ''referentially transparent'' when for an ...

and is therefore not supported in pure languages, such as Haskell.

Type theory

type theory In mathematics and theoretical computer science, a type theory is the formal presentation of a specific type system. Type theory is the academic study of type systems. Some type theories serve as alternatives to set theory as a foundation of ...

, the type of functions accepting values of type ''A'' and returning values of type ''B'' may be written as ''A'' → ''B'' or ''B''^''A''. In the

Curry–Howard correspondence In programming language theory and proof theory, the Curry–Howard correspondence is the direct relationship between computer programs and mathematical proofs. It is also known as the Curry–Howard isomorphism or equivalence, or the proofs-as-p ...

s are related to

logical implication Logic is the study of correct reasoning. It includes both formal and informal logic. Formal logic is the study of deductively valid inferences or logical truths. It examines how conclusions follow from premises based on the structure of ...

; lambda abstraction corresponds to discharging hypothetical assumptions and function application corresponds to the

modus ponens In propositional logic, (; MP), also known as (), implication elimination, or affirming the antecedent, is a deductive argument form and rule of inference. It can be summarized as "''P'' implies ''Q.'' ''P'' is true. Therefore, ''Q'' must ...

inference rule. Besides the usual case of programming functions, type theory also uses first-class functions to model

associative array In computer science, an associative array, key-value store, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In math ...

s and similar

data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...

s. In category-theoretical accounts of programming, the availability of first-class functions corresponds to the closed category assumption. For instance, the

simply typed lambda calculus The simply typed lambda calculus (), a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor () that builds function types. It is the canonical and simplest example of a typed lambda calculus. The ...

corresponds to the internal language of Cartesian closed categories.

Language support

Functional programming languages, such as Erlang, Scheme, ML,

, F#, and Scala, all have first-class functions. When

, one of the earliest functional languages, was designed, not all aspects of first-class functions were then properly understood, resulting in functions being dynamically scoped. The later Scheme and

Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ''ANSI INCITS 226-1994 (S2018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperli ...

dialects do have lexically scoped first-class functions. Many scripting languages, including

Perl Perl is a high-level, general-purpose, interpreted, dynamic programming language. Though Perl is not officially an acronym, there are various backronyms in use, including "Practical Extraction and Reporting Language". Perl was developed ...

, Python,

PHP PHP is a general-purpose scripting language geared towards web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementation is now produced by the PHP Group. ...

, Lua,

Tcl TCL or Tcl or TCLs may refer to: Business * TCL Technology, a Chinese consumer electronics and appliance company ** TCL Electronics, a subsidiary of TCL Technology * Texas Collegiate League, a collegiate baseball league * Trade Centre Limited ...

/Tk,

JavaScript JavaScript (), often abbreviated as JS, is a programming language and core technology of the World Wide Web, alongside HTML and CSS. Ninety-nine percent of websites use JavaScript on the client side for webpage behavior. Web browsers have ...

and Io, have first-class functions. For imperative languages, a distinction has to be made between Algol and its descendants such as Pascal, the traditional C family, and the modern garbage-collected variants. The Algol family has allowed nested functions and higher-order taking function as arguments, but not higher-order functions that return functions as results (except Algol 68, which allows this). The reason for this was that it was not known how to deal with non-local variables if a nested-function was returned as a result (and Algol 68 produces runtime errors in such cases). The C family allowed both passing functions as arguments and returning them as results, but avoided any problems by not supporting nested functions. (The gcc compiler allows them as an extension.) As the usefulness of returning functions primarily lies in the ability to return nested functions that have captured non-local variables, instead of top-level functions, these languages are generally not considered to have first-class functions. Modern imperative languages often support garbage-collection making the implementation of first-class functions feasible. First-class functions have often only been supported in later revisions of the language, including C# 2.0 and Apple's Blocks extension to C, C++, and Objective-C. C++11 has added support for anonymous functions and closures to the language, but because of the non-garbage collected nature of the language, special care has to be taken for non-local variables in functions to be returned as results (see below). ; C++:

C++11 C++11 is a version of a joint technical standard, ISO/IEC 14882, by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), for the C++ programming language. C++11 replaced the prior vers ...

closures can capture non-local variables by copy construction, by reference (without extending their lifetime), or by move construction (the variable lives as long as the closure does). The first option is safe if the closure is returned but requires a copy and cannot be used to modify the original variable (which might not exist any more at the time the closure is called). The second option potentially avoids an expensive copy and allows to modify the original variable but is unsafe in case the closure is returned (see

dangling reference Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references ar ...

s). The third option is safe if the closure is returned and does not require a copy but cannot be used to modify the original variable either. ; Java:

Java 8 The Java language has undergone several changes since JDK 1.0 as well as numerous additions of classes and packages to the standard library. Since J2SE 1.4, the evolution of the Java language has been governed by the Java Community P ...

closures can only capture final or "effectively final" non-local variables. Java's

s are represented as Classes. Anonymous functions take the type inferred from the context. Method references are limited. For more details, see . ; Lisp :

Lexically scoped In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...

Lisp variants support closures.

Dynamically scoped In computer programming, the scope of a name binding (an association of a name to an entity, such as a Variable (programming), variable) is the part of a Computer program, program where the name binding is valid; that is, where the name can be use ...

variants do not support closures or need a special construct to create closures.Closures in ZetaLisp
: In

, the identifier of a function in the function namespace cannot be used as a reference to a first-class value. The special operator function must be used to retrieve the function as a value: (function foo) evaluates to a function object. #'foo exists as a shorthand notation. To apply such a function object, one must use the funcall function: (funcall #'foo bar baz). ; Python : Explicit partial application with functools.partial
/code> since version 2.5, and
/code> since version 2.6. ; Ruby : The identifier of a regular "function" in Ruby (which is really a method) cannot be used as a value or passed. It must first be retrieved into a Method or Proc object to be used as first-class data. The syntax for calling such a function object differs from calling regular methods. : Nested method definitions do not actually nest the scope. : Explicit currying with
/code>.

Notes

References

* Leonidas Fegaras
"Functional Languages and Higher-Order Functions"
CSE5317/CSE4305: Design and Construction of Compilers. University of Texas at Arlington.

External links

First-class functions
on

Rosetta Code Rosetta Code is a wiki-based programming chrestomathy website with implementations of common algorithms and solutions to various computer programming, programming problems in many different programming languages. It is named for the Rosetta Stone ...

.
Higher order functions
at IBM developerWorks {{DEFAULTSORT:First-class function Articles with example C code Articles with example Haskell code Compiler construction Data types Functional programming Primitive types Programming language theory Subroutines