In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, the funarg problem ''(function argument problem)'' refers to the difficulty in implementing
first-class function
In computer science, a programming language is said to have first-class functions if it treats functions as first-class citizens. This means the language supports passing functions as arguments to other functions, returning them as the values from ...
s (
function
Function or functionality may refer to:
Computing
* Function key, a type of key on computer keyboards
* Function model, a structured representation of processes in a system
* Function object or functor or functionoid, a concept of object-oriente ...
s as
first-class object
In programming language design, a first-class citizen (also type, object, entity, or value) in a given programming language is an entity which supports all the operations generally available to other entities. These operations typically include ...
s) in programming language implementations so as to use
stack-based memory allocation
Stack (abstract data type)#Hardware_stack, Stacks in computing architectures are regions of memory (computers), memory where data is added or removed in a LIFO (computing), last-in-first-out (LIFO) manner.
In most modern computer systems, each ...
of the functions.
The difficulty only arises if the body of a
nested function
In computer programming, a nested function (or nested procedure or subroutine) is a function which is defined within another function, the ''enclosing function''. Due to simple recursive scope rules, a nested function is itself invisible outside o ...
refers directly (i.e., not by argument passing) to identifiers defined in the environment in which the function is defined, but not in the environment of the function call. A standard resolution is either to forbid such references or to create
closures.
There are two subtly different versions of the funarg problem. The upwards funarg problem arises from returning (or otherwise transmitting "upwards") a function from a function call. The downwards funarg problem arises from passing a function as a parameter to another function call.
Upwards funarg problem
When one function calls another during a typical program's execution, the local state of the caller (including
parameters
A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...
and
local variable
In computer science, a local variable is a Variable (programming), variable that is given ''local scope (programming), scope''. A local variable reference in the subroutine, function or block (programming), block in which it is declared overrides ...
s) must be preserved in order for execution to proceed after the callee returns. In most compiled programs, this local state is stored on the
call stack
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or ma ...
in a data structure called a ''
stack frame
In computer science, a call stack is a stack data structure that stores information about the active subroutines of a computer program. This kind of stack is also known as an execution stack, program stack, control stack, run-time stack, or mac ...
'' or ''activation record''. This stack frame is pushed, or allocated, as prelude to calling another function, and is popped, or deallocated, when the other function returns to the function that did the call. The upwards funarg problem arises when the calling function refers to the called/exited function's state after that function has returned. Therefore, the stack frame containing the called function's state variables must not be deallocated when the function returns, violating the
stack-based function call paradigm.
One solution to the upwards funarg problem is to simply allocate all activation records from the
heap instead of the stack and rely on some form of
garbage collection
Waste collection is a part of the process of waste management. It is the transfer of solid waste from the point of use and disposal to the point of treatment or landfill. Waste collection also includes the curbside collection of recyclabl ...
or
reference counting to deallocate them when they are no longer needed. Managing activation records on the heap has historically been perceived to be less efficient than on the stack (although this is partially contradicted
[ Andrew W. Appel, Zhong Shao. An Empirical and Analytic Study of Stack vs. Heap Cost for Languages with Closures. tp://ftp.cs.princeton.edu/techreports/1994/450.ps.gz Princeton CS Tech Report TR-450-94 1994.]) and has been perceived to impose significant implementation complexity. Most functions in typical programs (less so for programs in
functional programming languages
In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that m ...
) do not create upwards funargs, adding to concerns about potential overhead associated with their implementation. Furthermore, this approach is genuinely difficult in languages that do not support garbage collection.
Some efficiency-minded compilers employ a hybrid approach in which the activation records for a function are allocated from the stack if the compiler is able to deduce, through
static program analysis
In computer science, static program analysis (or static analysis) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution.
The term ...
, that the function creates no upwards funargs. Otherwise, the activation records are allocated from the heap.
Another solution is to simply copy the value of the variables into the closure at the time the closure is created. This will cause a different behavior in the case of mutable variables, because the state will no longer be shared between closures. But if it is known that the variables are constant, then this approach will be equivalent. The
ML languages take this approach, since variables in those languages are bound to values—i.e. variables cannot be changed.
Java
Java (; id, Jawa, ; jv, ꦗꦮ; su, ) is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea to the north. With a population of 151.6 million people, Java is the world's List ...
also takes this approach with respect to anonymous classes, in that it only allows one to refer to variables in the enclosing scope that are
final
(i.e. constant).
Some languages allow the programmer to explicitly choose between the two behaviors.
PHP
PHP is a general-purpose scripting language geared toward web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementation is now produced by The PHP Group. ...
5.3's anonymous functions require one to specify which variables to include in the closure using the
use ()
clause; if the variable is listed by reference, it includes a reference to the original variable; otherwise, it passes the value. In Apple's
Blocks anonymous functions, captured local variables are by default captured by value; if one wants to share the state between closures or between the closure and the outside scope, the variable must be declared with the
__block
modifier, in which case that variable is allocated on the heap.
Example
The following
Haskell
Haskell () is a general-purpose, statically-typed, purely functional programming language with type inference and lazy evaluation. Designed for teaching, research and industrial applications, Haskell has pioneered a number of programming lan ...
-like
pseudocode
In computer science, pseudocode is a plain language description of the steps in an algorithm or another system. Pseudocode often uses structural conventions of a normal programming language, but is intended for human reading rather than machine re ...
defines
function composition
In mathematics, function composition is an operation that takes two functions and , and produces a function such that . In this operation, the function is applied to the result of applying the function to . That is, the functions and ...
:
compose f g = λx → f (g x)
λ
is the operator for constructing a new function, which in this case has one argument,
x
, and returns the result of first applying
g
to
x
, then applying
f
to that. This λ function carries the functions
f
and
g
(or pointers to them) as internal state.
The problem in this case exists if the compose function allocates the parameter variables
f
and
g
on the stack. When
compose
returns, the stack frame containing
f
and
g
is discarded. When the internal function
λx
attempts to access
g
, it will access a discarded memory area.
Downwards funarg problem
A downwards funarg may also refer to a function's state when that function is not actually executing. However, because, by definition, the existence of a downwards funarg is contained in the execution of the function that creates it, the stack frame for the function can usually still be stored on the stack. Nonetheless, the existence of downwards funargs implies a tree structure of closures and stack frames that can complicate human and machine reasoning about the program state.
The downwards funarg problem complicates the efficient compilation of
tail call
In computer science, a tail call is a subroutine call performed as the final action of a procedure. If the target of a tail is the same subroutine, the subroutine is said to be tail recursive, which is a special case of direct recursion. Tail recur ...
s and code written in
continuation-passing style In functional programming, continuation-passing style (CPS) is a style of programming in which control is passed explicitly in the form of a continuation. This is contrasted with direct style, which is the usual style of programming. Gerald Jay Suss ...
. In these special cases, the intent of the programmer is (usually) that the function run in limited stack space, so the "faster" behavior may actually be undesirable.
Practical implications
Historically, the upwards funarg problem has proven to be the more difficult. For example, the
Pascal programming language
Pascal is an Imperative programming, imperative and Procedural programming, procedural programming language, designed by Niklaus Wirth as a small, efficient language intended to encourage good programming practices using structured programming an ...
allows functions to be passed as arguments but not returned as results; thus implementations of Pascal are required to address the downwards funarg problem but not the upwards one. The
Modula-2
Modula-2 is a structured, procedural programming language developed between 1977 and 1985/8 by Niklaus Wirth at ETH Zurich. It was created as the language for the operating system and application software of the Lilith personal workstation. It w ...
and
Oberon
Oberon () is a king of the fairies in medieval and Renaissance literature. He is best known as a character in William Shakespeare's play ''A Midsummer Night's Dream'', in which he is King of the Fairies and spouse of Titania, Queen of the Fair ...
programming languages (descendants of Pascal) allow functions both as parameters and return values, but the assigned function may not be a nested function. The
C programming language
''The C Programming Language'' (sometimes termed ''K&R'', after its authors' initials) is a computer programming book written by Brian Kernighan and Dennis Ritchie, the latter of whom originally designed and implemented the language, as well as ...
historically avoids the main difficulty of the funarg problem by not allowing function definitions to be nested; because the environment of every function is the same, containing just the statically allocated global variables and functions, a pointer to a function's code describes the function completely.
Apple
An apple is an edible fruit produced by an apple tree (''Malus domestica''). Apple fruit tree, trees are agriculture, cultivated worldwide and are the most widely grown species in the genus ''Malus''. The tree originated in Central Asia, wh ...
has proposed and implemented a
closure syntax for C that solves the upwards funarg problem by dynamically moving closures from the stack to the heap as necessary. The
Java programming language
Java is a high-level, class-based, object-oriented programming language that is designed to have as few implementation dependencies as possible. It is a general-purpose programming language intended to let programmers ''write once, run anywh ...
deals with it by requiring that context used by nested functions in anonymous inner and local classes be declared
final
Final, Finals or The Final may refer to:
*Final (competition), the last or championship round of a sporting competition, match, game, or other contest which decides a winner for an event
** Another term for playoffs, describing a sequence of cont ...
, and context used by
lambda expressions Lambda expression may refer to:
*Lambda expression in computer programming, also called an anonymous function, is a defined function not bound to an identifier.
* Lambda expression in lambda calculus, a formal system in mathematical logic and ...
be effectively final.
C# and
D have lambdas (closures) that encapsulate a function pointer and related variables.
In
functional language
In computer science, functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that ...
s, functions are first-class values that can be passed anywhere. Thus, implementations of
Scheme A scheme is a systematic plan for the implementation of a certain idea.
Scheme or schemer may refer to:
Arts and entertainment
* ''The Scheme'' (TV series), a BBC Scotland documentary series
* The Scheme (band), an English pop band
* ''The Schem ...
or
Standard ML
Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of the ...
must address both the upwards and downwards funarg problems. This is usually accomplished by representing function values as
heap-allocated closures, as previously described. The
OCaml
OCaml ( , formerly Objective Caml) is a general-purpose programming language, general-purpose, multi-paradigm programming language which extends the Caml dialect of ML (programming language), ML with object-oriented programming, object-oriented ...
compiler employs a hybrid technique (based on
program analysis
In computer science, program analysis is the process of automatically analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness.
Program analysis focuses on two major areas: program op ...
) to maximize efficiency.
See also
*
Closure (computer science)
In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions. Operationally, a closure is a record storing a function toge ...
*
Functional programming
In computer science, functional programming is a programming paradigm where programs are constructed by Function application, applying and Function composition (computer science), composing Function (computer science), functions. It is a declar ...
*
Lambda calculus
Lambda calculus (also written as ''λ''-calculus) is a formal system in mathematical logic for expressing computation based on function abstraction and application using variable binding and substitution. It is a universal model of computation ...
*
Man or boy test
The man or boy test was proposed by computer scientist Donald Knuth as a means of evaluating implementations of the ALGOL 60 programming language. The aim of the test was to distinguish compilers that correctly implemented "recursion and non-loca ...
*
Name binding
In programming languages, name binding is the association of entities (data and/or code) with identifiers. An identifier bound to an object is said to reference that object. Machine languages have no built-in notion of identifiers, but name-objec ...
*
Referential transparency
In computer science, referential transparency and referential opacity are properties of parts of computer programs. An expression is called ''referentially transparent'' if it can be replaced with its corresponding value (and vice-versa) withou ...
*
Scope (programming)
In computer programming, the scope of a name binding (an association of a name to an entity, such as a variable) is the part of a program where the name binding is valid; that is, where the name can be used to refer to the entity. In other parts ...
*
Spaghetti stack
In computer science, an in-tree or parent pointer tree is an -ary tree data structure in which each node has a pointer to its parent node, but no pointers to child nodes. When used to implement a set of stacks, the structure is called a spaghett ...
References
{{Reflist
External links
*
Joseph Weizenbaum
Joseph Weizenbaum (8 January 1923 – 5 March 2008) was a German American computer scientist and a professor at MIT. The Weizenbaum Award is named after him. He is considered one of the fathers of modern artificial intelligence.
Life and caree ...
"The FUNARG Problem Explained" 1968.
*
Joel Moses
Joel Moses (24 November 1941 – 29 May 2022) was an Israeli-American mathematician, computer scientist, and Institute Professor at the Massachusetts Institute of Technology (MIT).
Biography
Joel Moses was born in Mandatory Palestine on 24 Nove ...
"The Function of FUNCTION in LISP, or Why the FUNARG Problem Should be Called the Environment Problem" MIT AI Memo 199, 1970.
Bindings, Procedures, Functions, Functional Programming, and the Lambda Calculus
Compiler construction
Programming language implementation