Upwards Funarg Problem

In computer science, the funarg problem (function argument problem) refers to the difficulty in implementing first-class functions (functions as first-class objects) in programming language implementations so as to use stack-based memory allocation of the functions. The difficulty only arises if the body of a nested function refers directly (i.e., not by argument passing) to identifiers defined in the environment in which the function is defined, but not in the environment of the function call. A standard resolution is either to forbid such references or to create closures.

There are two subtly different versions of the funarg problem. The upwards funarg problem arises from returning (or otherwise transmitting "upwards") a function from a function call. The downwards funarg problem arises from passing a function as a parameter to another function call.
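The two versions can be told apart in a few lines of Python. This is only an illustrative sketch: Python always builds closures, so nothing goes wrong here, and the names make_adder, apply_twice and shift_by are hypothetical.

```python
def make_adder(n):
    # `add` refers to `n`, which is defined in make_adder's environment,
    # not in the environment of whoever eventually calls `add`.
    def add(x):
        return x + n
    return add  # upwards funarg: `add` outlives this call


def apply_twice(f, x):
    # Receives a function as a parameter and calls it twice.
    return f(f(x))


def shift_by(n, x):
    def add(y):
        return y + n
    # Downwards funarg: `add` is passed into another call and is only
    # used while shift_by is still active.
    return apply_twice(add, x)


add_five = make_adder(5)
upward_result = add_five(10)       # make_adder has already returned
downward_result = shift_by(3, 10)  # `add` is used only during shift_by
```

In the first case the variable n must survive the return of make_adder; in the second case n is needed only while shift_by is still running.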


Upwards funarg problem

When one function calls another during a typical program's execution, the local state of the caller (including parameters and local variables) must be preserved in order for execution to proceed after the callee returns. In most compiled programs, this local state is stored on the call stack in a data structure called a stack frame or activation record. This stack frame is pushed, or allocated, as a prelude to calling another function, and is popped, or deallocated, when the other function returns to the function that did the call. The upwards funarg problem arises when the calling function refers to the called/exited function's state after that function has returned. Therefore, the stack frame containing the called function's state variables must not be deallocated when the function returns, violating the stack-based function call paradigm.

One solution to the upwards funarg problem is to simply allocate all activation records from the heap instead of the stack and rely on some form of garbage collection or reference counting to deallocate them when they are no longer needed. Managing activation records on the heap has historically been perceived to be less efficient than on the stack (although this is partially contradicted by Andrew W. Appel and Zhong Shao, "An Empirical and Analytic Study of Stack vs. Heap Cost for Languages with Closures", Princeton CS Tech Report TR-450-94, 1994, tp://ftp.cs.princeton.edu/techreports/1994/450.ps.gz) and has been perceived to impose significant implementation complexity. Most functions in typical programs (less so for programs in functional programming languages) do not create upwards funargs, adding to concerns about potential overhead associated with their implementation. Furthermore, this approach is genuinely difficult in languages that do not support garbage collection.

Some efficiency-minded compilers employ a hybrid approach in which the activation records for a function are allocated from the stack if the compiler is able to deduce, through static program analysis, that the function creates no upwards funargs. Otherwise, the activation records are allocated from the heap.

Another solution is to simply copy the value of the variables into the closure at the time the closure is created. This will cause a different behavior in the case of mutable variables, because the state will no longer be shared between closures. But if it is known that the variables are constant, then this approach will be equivalent. The ML languages take this approach, since variables in those languages are bound to values, i.e. variables cannot be changed.
Java also takes this approach with respect to anonymous classes, in that it only allows one to refer to variables in the enclosing scope that are final (i.e. constant).

Some languages allow the programmer to explicitly choose between the two behaviors. PHP 5.3's anonymous functions require one to specify which variables to include in the closure using the use () clause; if the variable is listed by reference, it includes a reference to the original variable; otherwise, it passes the value. In Apple's Blocks anonymous functions, captured local variables are by default captured by value; if one wants to share the state between closures or between the closure and the outside scope, the variable must be declared with the __block modifier, in which case that variable is allocated on the heap.
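The difference between sharing a variable and copying its value can be sketched in Python. Python closures share the enclosing variable; the default-argument idiom used below is only a stand-in for copy-at-creation capture and is an assumption of this sketch, not how ML, Java, PHP or Blocks implement it. The function names are hypothetical.

```python
def shared_capture():
    count = 0

    def read():
        return count  # looks up the shared variable when called

    count = 1  # a later assignment is visible to `read`
    return read


def copied_capture():
    count = 0

    # The default argument is evaluated once, when `read` is defined,
    # so `snapshot` holds a copy of the value 0.
    def read(snapshot=count):
        return snapshot

    count = 1  # not visible: `read` keeps its own copy
    return read


shared_value = shared_capture()()
copied_value = copied_capture()()
```

The two results differ only because count is reassigned after the closure is created. If count were never reassigned, both functions would return the same value, which is why copying is safe for immutable bindings.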


Example

The following Haskell-like pseudocode defines function composition:

    compose f g = λx → f (g x)

λ is the operator for constructing a new function, which in this case has one argument, x, and returns the result of first applying g to x, then applying f to that. This λ function carries the functions f and g (or pointers to them) as internal state.

The problem in this case exists if the compose function allocates the parameter variables f and g on the stack. When compose returns, the stack frame containing f and g is discarded. When the internal function λx attempts to access g, it will access a discarded memory area.
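For comparison, here is a rough Python rendering of the same compose function. Python stores captured variables in heap-allocated cells, so the returned function stays valid; the helper functions double and increment are hypothetical names used only for this sketch.

```python
def compose(f, g):
    # f and g are parameters of compose, yet the returned function
    # still needs them after compose has returned.
    return lambda x: f(g(x))


def double(x):
    return 2 * x


def increment(x):
    return x + 1


double_after_increment = compose(double, increment)
result = double_after_increment(5)  # double(increment(5))

# The "internal state" carried by the lambda can be inspected:
# co_freevars names the captured variables, and __closure__ holds
# the matching cells.
captured = dict(zip(
    double_after_increment.__code__.co_freevars,
    (cell.cell_contents for cell in double_after_increment.__closure__),
))
```

Here captured maps the names f and g to the two functions passed to compose. They are still reachable even though the call to compose finished long before result was computed.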


Downwards funarg problem

A downwards funarg may also refer to a function's state when that function is not actually executing. However, because, by definition, the existence of a downwards funarg is contained in the execution of the function that creates it, the stack frame for the function can usually still be stored on the stack. Nonetheless, the existence of downwards funargs implies a tree structure of closures and stack frames that can complicate human and machine reasoning about the program state.

The downwards funarg problem complicates the efficient compilation of tail calls and code written in continuation-passing style. In these special cases, the intent of the programmer is (usually) that the function run in limited stack space, so the "faster" behavior may actually be undesirable.
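The interaction with tail calls and continuation-passing style can be seen in a small sketch. The function factorial_cps is a hypothetical example, and Python does not eliminate tail calls, so the code only shows the shape of the problem rather than any particular compiler's behavior.

```python
def factorial_cps(n, k):
    # k is the continuation: a function that receives the result.
    if n == 0:
        return k(1)
    # The recursive call is a tail call, but the continuation passed
    # down to it is a closure over this call's `n` and `k`.
    return factorial_cps(n - 1, lambda result: k(n * result))


def identity(value):
    return value


factorial_of_five = factorial_cps(5, identity)
```

If the environment of each continuation lived in its creator's stack frame, that frame could not be reused for the tail call, and the stack would grow with n even though every call is in tail position.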


Practical implications

Historically, the upwards funarg problem has proven to be the more difficult. For example, the Pascal programming language allows functions to be passed as arguments but not returned as results; thus implementations of Pascal are required to address the downwards funarg problem but not the upwards one. The Modula-2 and Oberon programming languages (descendants of Pascal) allow functions both as parameters and return values, but the assigned function may not be a nested function. The C programming language historically avoids the main difficulty of the funarg problem by not allowing function definitions to be nested; because the environment of every function is the same, containing just the statically allocated global variables and functions, a pointer to a function's code describes the function completely.
Apple has proposed and implemented a closure syntax for C that solves the upwards funarg problem by dynamically moving closures from the stack to the heap as necessary. The Java programming language deals with it by requiring that context used by nested functions in anonymous inner and local classes be declared final, and context used by lambda expressions be effectively final. C# and D have lambdas (closures) that encapsulate a function pointer and related variables.

In functional languages, functions are first-class values that can be passed anywhere. Thus, implementations of Scheme or Standard ML must address both the upwards and downwards funarg problems. This is usually accomplished by representing function values as heap-allocated closures, as previously described. The OCaml compiler employs a hybrid technique (based on program analysis) to maximize efficiency.
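A closure that pairs a function pointer with its captured variables can be modelled explicitly. The following is a minimal sketch of that representation, not the layout used by any of the compilers mentioned; the Closure class, add_code and make_adder are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Closure:
    # Top-level code that takes the environment as an explicit argument.
    code: Callable[[Dict[str, Any], Any], Any]
    # Captured variables, stored in a heap object rather than a stack frame.
    env: Dict[str, Any]

    def call(self, arg: Any) -> Any:
        return self.code(self.env, arg)


def add_code(env, x):
    # No nested function: the free variable `n` is read from `env`.
    return x + env["n"]


def make_adder(n):
    # Returning the pair (code, environment) is safe even after this
    # call's frame is gone, because `env` does not live in the frame.
    return Closure(code=add_code, env={"n": n})


add_five = make_adder(5)
closure_result = add_five.call(10)
```

Because add_code is an ordinary top-level function, a plain code pointer describes it completely, as in C; everything that varies from one closure to another is held in env.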


See also

* Closure (computer science)
* Functional programming
* Lambda calculus
* Man or boy test
* Name binding
* Referential transparency
* Scope (programming)
* Spaghetti stack


External links

* Joseph Weizenbaum, "The FUNARG Problem Explained", 1968.
* Joel Moses, "The Function of FUNCTION in LISP, or Why the FUNARG Problem Should be Called the Environment Problem", MIT AI Memo 199, 1970.
* Bindings, Procedures, Functions, Functional Programming, and the Lambda Calculus