HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, information, and automation. Computer science spans Theoretical computer science, theoretical disciplines (such as algorithms, theory of computation, and information theory) to Applied science, ...
, a
programming language A programming language is a system of notation for writing computer programs. Programming languages are described in terms of their Syntax (programming languages), syntax (form) and semantics (computer science), semantics (meaning), usually def ...
is said to have first-class functions if it treats functions as
first-class citizen In a given programming language design, a first-class citizen is an entity which supports all the operations generally available to other entities. These operations typically include being passed as an argument, returned from a function, and ass ...
s. This means the language supports passing functions as arguments to other functions, returning them as the values from other functions, and assigning them to variables or storing them in data structures. Some programming language theorists require support for
anonymous function In computer programming, an anonymous function (function literal, expression or block) is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for const ...
s (function literals) as well.Programming language pragmatics
by Michael Lee Scott, section 11.2 "Functional Programming".
In languages with first-class functions, the
names A name is a term used for identification by an external observer. They can identify a class or category of things, or a single thing, either uniquely, or within a given context. The entity identified by a name is called its referent. A person ...
of functions do not have any special status; they are treated like ordinary variables with a
function type In computer science and mathematical logic, a function type (or arrow type or exponential) is the type of a variable or parameter to which a function has or can be assigned, or an argument or result type of a higher-order function taking or return ...
. The term was coined by
Christopher Strachey Christopher S. Strachey (; 16 November 1916 – 18 May 1975) was a British computer scientist. He was one of the founders of denotational semantics, and a pioneer in programming language design and computer time-sharing.F. J. Corbató, et al., T ...
in the context of "functions as first-class citizens" in the mid-1960s. (also on 2010-02-16 First-class functions are a necessity for the
functional programming In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declarat ...
style, in which the use of
higher-order function In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following: * takes one or more functions as arguments (i.e. a procedural parameter, which is a parameter of a procedure that is itself ...
s is a standard practice. A simple example of a higher-ordered function is the ''
map A map is a symbolic depiction of interrelationships, commonly spatial, between things within a space. A map may be annotated with text and graphics. Like any graphic, a map may be fixed to paper or other durable media, or may be displayed on ...
'' function, which takes, as its arguments, a function and a list, and returns the list formed by applying the function to each member of the list. For a language to support ''map'', it must support passing a function as an argument. There are certain implementation difficulties in passing functions as arguments or returning them as results, especially in the presence of
non-local variable In programming language theory, a non-local variable is a variable that is not defined in the local Scope (computer science), scope. While the term can refer to global variables, it is primarily used in the context of nested function, nested and ...
s introduced in nested and
anonymous function In computer programming, an anonymous function (function literal, expression or block) is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for const ...
s. Historically, these were termed the '' funarg problems'', the name coming from ''function argument''. In early imperative languages these problems were avoided by either not supporting functions as result types (e.g.
ALGOL 60 ALGOL 60 (short for ''Algorithmic Language 1960'') is a member of the ALGOL family of computer programming languages. It followed on from ALGOL 58 which had introduced code blocks and the begin and end pairs for delimiting them, representing a ...
, Pascal) or omitting nested functions and thus non-local variables (e.g. C). The early functional language
Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...
took the approach of
dynamic scoping In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...
, where non-local variables refer to the closest definition of that variable at the point where the function is executed, instead of where it was defined. Proper support for
lexically scoped In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...
first-class functions was introduced in Scheme and requires handling references to functions as closures instead of bare
function pointer A function pointer, also called a subroutine pointer or procedure pointer, is a pointer referencing executable code, rather than data. Dereferencing the function pointer yields the referenced function, which can be invoked and passed arguments ...
s, which in turn makes
garbage collection Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclable ...
a necessity.


Concepts

In this section, we compare how particular programming idioms are handled in a functional language with first-class functions (
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
) compared to an imperative language where functions are second-class citizens ( C).


Higher-order functions: passing functions as arguments

In languages where functions are first-class citizens, functions can be passed as arguments to other functions in the same way as other values (a function taking another function as argument is called a higher-order function). In the language
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
: map :: (a -> b) -> -> map f [] = [] map f (x:xs) = f x : map f xs Languages where functions are not first-class often still allow one to write higher-order functions through the use of features such as
function pointer A function pointer, also called a subroutine pointer or procedure pointer, is a pointer referencing executable code, rather than data. Dereferencing the function pointer yields the referenced function, which can be invoked and passed arguments ...
s or delegates. In the language C: void map(int (*f)(int), int x[], size_t n) There are a number of differences between the two approaches that are ''not'' directly related to the support of first-class functions. The Haskell sample operates on
list A list is a Set (mathematics), set of discrete items of information collected and set forth in some format for utility, entertainment, or other purposes. A list may be memorialized in any number of ways, including existing only in the mind of t ...
s, while the C sample operates on
arrays An array is a systematic arrangement of similar objects, usually in rows and columns. Things called an array include: {{TOC right Music * In twelve-tone and serial composition, the presentation of simultaneous twelve-tone sets such that the ...
. Both are the most natural compound data structures in the respective languages and making the C sample operate on linked lists would have made it unnecessarily complex. This also accounts for the fact that the C function needs an additional parameter (giving the size of the array.) The C function updates the array
in-place In computer science, an in-place algorithm is an algorithm that operates directly on the input data structure without requiring extra space proportional to the input size. In other words, it modifies the input in place, without creating a separa ...
, returning no value, whereas in Haskell data structures are persistent (a new list is returned while the old is left intact.) The Haskell sample uses
recursion Recursion occurs when the definition of a concept or process depends on a simpler or previous version of itself. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in m ...
to traverse the list, while the C sample uses
iteration Iteration is the repetition of a process in order to generate a (possibly unbounded) sequence of outcomes. Each repetition of the process is a single iteration, and the outcome of each iteration is then the starting point of the next iteration. ...
. Again, this is the most natural way to express this function in both languages, but the Haskell sample could easily have been expressed in terms of a fold and the C sample in terms of recursion. Finally, the Haskell function has a polymorphic type, as this is not supported by C we have fixed all type variables to the type constant int.


Anonymous and nested functions

In languages supporting anonymous functions, we can pass such a function as an argument to a higher-order function: main = map (\x -> 3 * x + 1) , 2, 3, 4, 5 In a language which does not support anonymous functions, we have to bind it to a name instead: int f(int x) int main()


Non-local variables and closures

Once we have anonymous or nested functions, it becomes natural for them to refer to variables outside of their body (called ''non-local variables''): main = let a = 3 b = 1 in map (\x -> a * x + b) , 2, 3, 4, 5 If functions are represented with bare function pointers, we can not know anymore how the value that is outside of the function's body should be passed to it, and because of that a closure needs to be built manually. Therefore we can not speak of "first-class" functions here. typedef struct closure_t; void map(closure_t *closure, int x[], size_t n) int f(int a, int b, int x) void main() Also note that the map is now specialized to functions referring to two ints outside of their environment. This can be set up more generally, but requires more
boilerplate code In computer programming, boilerplate code, or simply boilerplate, are sections of code that are repeated in multiple places with little to no variation. When using languages that are considered ''verbose'', the programmer must write a lot of boile ...
. If f would have been a
nested function In computer programming, a nested function (or nested procedure or subroutine) is a named function that is defined within another, enclosing, block and is lexically scoped within the enclosing block meaning it is only callable by name within t ...
we would still have run into the same problem and this is the reason they are not supported in C.


Higher-order functions: returning functions as results

When returning a function, we are in fact returning its closure. In the C example any local variables captured by the closure will go out of scope once we return from the function that builds the closure. Forcing the closure at a later point will result in undefined behaviour, possibly corrupting the stack. This is known as the upwards funarg problem.


Assigning functions to variables

Assigning functions to variables and storing them inside (global) datastructures potentially suffers from the same difficulties as returning functions. f ::
Integer An integer is the number zero (0), a positive natural number (1, 2, 3, ...), or the negation of a positive natural number (−1, −2, −3, ...). The negations or additive inverses of the positive natural numbers are referred to as negative in ...
-> nteger f = let a = 3 b = 1 in ap (\x -> a * x + b), map (\x -> b * x + a)


Equality of functions

As one can test most literals and values for equality, it is natural to ask whether a programming language can support testing functions for equality. On further inspection, this question appears more difficult and one has to distinguish between several types of function equality: ; Extensional equality: Two functions ''f'' and ''g'' are considered extensionally equal if they agree on their outputs for all inputs (∀''x''. ''f''(''x'') = ''g''(''x'')). Under this definition of equality, for example, any two implementations of a stable sorting algorithm, such as
insertion sort Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time by comparisons. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort. Howev ...
and
merge sort In computer science, merge sort (also commonly spelled as mergesort and as ) is an efficient, general-purpose, and comparison sort, comparison-based sorting algorithm. Most implementations of merge sort are Sorting algorithm#Stability, stable, wh ...
, would be considered equal. Deciding on extensional equality is undecidable in general and even for functions with finite domains often intractable. For this reason no programming language implements function equality as extensional equality. ; Intensional equality: Under intensional equality, two functions ''f'' and ''g'' are considered equal if they have the same "internal structure". This kind of equality could be implemented in
interpreted language In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An inter ...
s by comparing the
source code In computing, source code, or simply code or source, is a plain text computer program written in a programming language. A programmer writes the human readable source code to control the behavior of a computer. Since a computer, at base, only ...
of the function bodies (such as in Interpreted Lisp 1.5) or the
object code In computing, object code or object module is the product of an assembler or compiler In computing, a compiler is a computer program that Translator (computing), translates computer code written in one programming language (the ''source'' ...
in
compiled language Compiled language categorizes a programming language as used with a compiler and generally implies not used with an interpreter. But, since any language can theoretically be compiled or interpreted the term lacks clarity. In practice, for some lan ...
s. Intensional equality implies extensional equality (assuming the functions are deterministic and have no hidden inputs, such as the
program counter The program counter (PC), commonly called the instruction pointer (IP) in Intel x86 and Itanium microprocessors, and sometimes called the instruction address register (IAR), the instruction counter, or just part of the instruction sequencer, ...
or a mutable
global variable In computer programming, a global variable is a variable with global scope, meaning that it is visible (hence accessible) throughout the program, unless shadowed. The set of all global variables is known as the ''global environment'' or ''global ...
.) ; Reference equality: Given the impracticality of implementing extensional and intensional equality, most languages supporting testing functions for equality use reference equality. All functions or closures are assigned a unique identifier (usually the address of the function body or the closure) and equality is decided based on equality of the identifier. Two separately defined, but otherwise identical function definitions will be considered unequal. Referential equality implies intensional and extensional equality. Referential equality breaks
referential transparency In analytic philosophy and computer science, referential transparency and referential opacity are properties of linguistic constructions, and by extension of languages. A linguistic construction is called ''referentially transparent'' when for an ...
and is therefore not supported in pure languages, such as Haskell.


Type theory

In
type theory In mathematics and theoretical computer science, a type theory is the formal presentation of a specific type system. Type theory is the academic study of type systems. Some type theories serve as alternatives to set theory as a foundation of ...
, the type of functions accepting values of type ''A'' and returning values of type ''B'' may be written as ''A'' → ''B'' or ''B''''A''. In the
Curry–Howard correspondence In programming language theory and proof theory, the Curry–Howard correspondence is the direct relationship between computer programs and mathematical proofs. It is also known as the Curry–Howard isomorphism or equivalence, or the proofs-as-p ...
,
function type In computer science and mathematical logic, a function type (or arrow type or exponential) is the type of a variable or parameter to which a function has or can be assigned, or an argument or result type of a higher-order function taking or return ...
s are related to
logical implication Logic is the study of correct reasoning. It includes both formal and informal logic. Formal logic is the study of deductively valid inferences or logical truths. It examines how conclusions follow from premises based on the structure of ...
; lambda abstraction corresponds to discharging hypothetical assumptions and function application corresponds to the
modus ponens In propositional logic, (; MP), also known as (), implication elimination, or affirming the antecedent, is a deductive argument form and rule of inference. It can be summarized as "''P'' implies ''Q.'' ''P'' is true. Therefore, ''Q'' must ...
inference rule. Besides the usual case of programming functions, type theory also uses first-class functions to model
associative array In computer science, an associative array, key-value store, map, symbol table, or dictionary is an abstract data type that stores a collection of (key, value) pairs, such that each possible key appears at most once in the collection. In math ...
s and similar
data structure In computer science, a data structure is a data organization and storage format that is usually chosen for Efficiency, efficient Data access, access to data. More precisely, a data structure is a collection of data values, the relationships amo ...
s. In category-theoretical accounts of programming, the availability of first-class functions corresponds to the closed category assumption. For instance, the
simply typed lambda calculus The simply typed lambda calculus (), a form of type theory, is a typed interpretation of the lambda calculus with only one type constructor () that builds function types. It is the canonical and simplest example of a typed lambda calculus. The ...
corresponds to the internal language of Cartesian closed categories.


Language support

Functional programming languages, such as Erlang, Scheme, ML,
Haskell Haskell () is a general-purpose, statically typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research, and industrial applications, Haskell pioneered several programming language ...
, F#, and Scala, all have first-class functions. When
Lisp Lisp (historically LISP, an abbreviation of "list processing") is a family of programming languages with a long history and a distinctive, fully parenthesized Polish notation#Explanation, prefix notation. Originally specified in the late 1950s, ...
, one of the earliest functional languages, was designed, not all aspects of first-class functions were then properly understood, resulting in functions being dynamically scoped. The later Scheme and
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ''ANSI INCITS 226-1994 (S2018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperli ...
dialects do have lexically scoped first-class functions. Many scripting languages, including
Perl Perl is a high-level, general-purpose, interpreted, dynamic programming language. Though Perl is not officially an acronym, there are various backronyms in use, including "Practical Extraction and Reporting Language". Perl was developed ...
, Python,
PHP PHP is a general-purpose scripting language geared towards web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementation is now produced by the PHP Group. ...
, Lua,
Tcl TCL or Tcl or TCLs may refer to: Business * TCL Technology, a Chinese consumer electronics and appliance company ** TCL Electronics, a subsidiary of TCL Technology * Texas Collegiate League, a collegiate baseball league * Trade Centre Limited ...
/Tk,
JavaScript JavaScript (), often abbreviated as JS, is a programming language and core technology of the World Wide Web, alongside HTML and CSS. Ninety-nine percent of websites use JavaScript on the client side for webpage behavior. Web browsers have ...
and Io, have first-class functions. For imperative languages, a distinction has to be made between Algol and its descendants such as Pascal, the traditional C family, and the modern garbage-collected variants. The Algol family has allowed nested functions and higher-order taking function as arguments, but not higher-order functions that return functions as results (except Algol 68, which allows this). The reason for this was that it was not known how to deal with non-local variables if a nested-function was returned as a result (and Algol 68 produces runtime errors in such cases). The C family allowed both passing functions as arguments and returning them as results, but avoided any problems by not supporting nested functions. (The gcc compiler allows them as an extension.) As the usefulness of returning functions primarily lies in the ability to return nested functions that have captured non-local variables, instead of top-level functions, these languages are generally not considered to have first-class functions. Modern imperative languages often support garbage-collection making the implementation of first-class functions feasible. First-class functions have often only been supported in later revisions of the language, including C# 2.0 and Apple's Blocks extension to C, C++, and Objective-C. C++11 has added support for anonymous functions and closures to the language, but because of the non-garbage collected nature of the language, special care has to be taken for non-local variables in functions to be returned as results (see below). ; C++:
C++11 C++11 is a version of a joint technical standard, ISO/IEC 14882, by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), for the C++ programming language. C++11 replaced the prior vers ...
closures can capture non-local variables by copy construction, by reference (without extending their lifetime), or by move construction (the variable lives as long as the closure does). The first option is safe if the closure is returned but requires a copy and cannot be used to modify the original variable (which might not exist any more at the time the closure is called). The second option potentially avoids an expensive copy and allows to modify the original variable but is unsafe in case the closure is returned (see
dangling reference Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references ar ...
s). The third option is safe if the closure is returned and does not require a copy but cannot be used to modify the original variable either. ; Java:
Java 8 The Java language has undergone several changes since JDK 1.0 as well as numerous additions of classes and packages to the standard library. Since J2SE 1.4, the evolution of the Java language has been governed by the Java Community P ...
closures can only capture final or "effectively final" non-local variables. Java's
function type In computer science and mathematical logic, a function type (or arrow type or exponential) is the type of a variable or parameter to which a function has or can be assigned, or an argument or result type of a higher-order function taking or return ...
s are represented as Classes. Anonymous functions take the type inferred from the context. Method references are limited. For more details, see . ; Lisp :
Lexically scoped In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...
Lisp variants support closures.
Dynamically scoped In computer programming, the scope of a name binding (an association of a name to an entity, such as a Variable (programming), variable) is the part of a Computer program, program where the name binding is valid; that is, where the name can be use ...
variants do not support closures or need a special construct to create closures.Closures in ZetaLisp
: In
Common Lisp Common Lisp (CL) is a dialect of the Lisp programming language, published in American National Standards Institute (ANSI) standard document ''ANSI INCITS 226-1994 (S2018)'' (formerly ''X3.226-1994 (R1999)''). The Common Lisp HyperSpec, a hyperli ...
, the identifier of a function in the function namespace cannot be used as a reference to a first-class value. The special operator function must be used to retrieve the function as a value: (function foo) evaluates to a function object. #'foo exists as a shorthand notation. To apply such a function object, one must use the funcall function: (funcall #'foo bar baz). ; Python : Explicit partial application with functools.partial
/code> since version 2.5, and
/code> since version 2.6. ; Ruby : The identifier of a regular "function" in Ruby (which is really a method) cannot be used as a value or passed. It must first be retrieved into a Method or Proc object to be used as first-class data. The syntax for calling such a function object differs from calling regular methods. : Nested method definitions do not actually nest the scope. : Explicit currying with
/code>.


See also

* Defunctionalization *
eval In some programming languages, eval , short for evaluate, is a function which evaluates a string as though it were an expression in the language, and returns a result; in others, it executes multiple lines of code as though they had been incl ...
* First-class message * Kappa calculus – a formalism which excludes first-class functions *
Man or boy test The man or boy test was proposed by computer scientist Donald Knuth as a means of evaluating implementations of the ALGOL 60 programming language. The aim of the test was to distinguish compilers that correctly implemented "recursion and non-local ...
*
Partial application Partial may refer to: Mathematics *Partial derivative, derivative with respect to one of several variables of a function, with the other variables held constant ** ∂, a symbol that can denote a partial derivative, sometimes pronounced "partial ...


Notes


References

* Leonidas Fegaras
"Functional Languages and Higher-Order Functions"
CSE5317/CSE4305: Design and Construction of Compilers. University of Texas at Arlington.


External links


First-class functions
on
Rosetta Code Rosetta Code is a wiki-based programming chrestomathy website with implementations of common algorithms and solutions to various computer programming, programming problems in many different programming languages. It is named for the Rosetta Stone ...
.
Higher order functions
at IBM developerWorks {{DEFAULTSORT:First-class function Articles with example C code Articles with example Haskell code Compiler construction Data types Functional programming Primitive types Programming language theory Subroutines