In computer programming, a thunk is a subroutine used to inject a calculation into another subroutine. Thunks are primarily used to delay a calculation until its result is needed, or to insert operations at the beginning or end of the other subroutine. They have many other applications in compiler code generation and modular programming.

The term originated as a whimsical irregular form of the verb ''think''. It refers to the original use of thunks in ALGOL 60 compilers, which required special analysis (thought) to determine what type of routine to generate.

Background

The early years of compiler research saw broad experimentation with different evaluation strategies. A key question was how to compile a subroutine call if the arguments can be arbitrary mathematical expressions rather than constants. One approach, known as "call by value", calculates all of the arguments before the call and then passes the resulting values to the subroutine. In the rival "call by name" approach, the subroutine receives the unevaluated argument expression and must evaluate it.

A simple implementation of "call by name" might substitute the code of an argument expression for each appearance of the corresponding parameter in the subroutine, but this can produce multiple versions of the subroutine and multiple copies of the expression code. As an improvement, the compiler can generate a helper subroutine, called a ''thunk'', that calculates the value of the argument. The address and environment of this helper subroutine are then passed to the original subroutine in place of the original argument, where it can be called as many times as needed. Peter Ingerman first described thunks in reference to the ALGOL 60 programming language, which supports call-by-name evaluation.

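The helper-subroutine idea can be sketched in C++, which does not itself offer call-by-name. The example below is only an illustration of what a compiler might generate for a call-by-name argument: the names `sum_by_name` and `sum_of_squares_1_to_4` are hypothetical, and a lambda that captures the caller's variable by reference stands in for the thunk's "address and environment".

```cpp
#include <cassert>
#include <functional>

// Hypothetical callee: sums `term` for i = lo..hi. The argument `term` is
// passed "by name", i.e. as a thunk that is re-evaluated on every use.
double sum_by_name(int& i, int lo, int hi, const std::function<double()>& term) {
    double total = 0.0;
    for (i = lo; i <= hi; ++i) {
        total += term();  // each call re-runs the caller's argument expression
    }
    return total;
}

// Roughly what might be generated for a call such as sum(i, 1, 4, i * i).
double sum_of_squares_1_to_4() {
    int i = 0;
    // The thunk captures the caller's environment (the variable i) by
    // reference, so it observes every update the callee makes to i.
    auto thunk = [&i]() { return static_cast<double>(i) * i; };
    return sum_by_name(i, 1, 4, thunk);
}
```

Calling `sum_of_squares_1_to_4()` evaluates the expression `i * i` four times, once per loop iteration, even though the expression appears only once in the caller.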
Applications

Functional programming

Although the software industry largely standardized on call-by-value and call-by-reference evaluation, active study of call-by-name continued in the functional programming community. This research produced a series of lazy evaluation programming languages in which some variant of call-by-name is the standard evaluation strategy. Compilers for these languages, such as the Glasgow Haskell Compiler, have relied heavily on thunks, with the added feature that the thunks save their initial result so that they can avoid recalculating it; this is known as memoization or call-by-need.

Functional programming languages have also allowed programmers to explicitly generate thunks. This is done in source code by wrapping an argument expression in an anonymous function that has no parameters of its own. This prevents the expression from being evaluated until a receiving function calls the anonymous function, thereby achieving the same effect as call-by-name. The adoption of anonymous functions into other programming languages has made this capability widely available.
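As a minimal sketch of an explicit, memoizing thunk, the C++ (C++17) code below wraps a parameterless lambda and caches its first result. The `Lazy` class, the `evaluations` counter, and the helper functions are all hypothetical names introduced for illustration; this is not how any particular lazy-language runtime represents its thunks.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical Lazy<T>: holds a parameterless function and caches its result.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> compute) : compute_(std::move(compute)) {}

    // Runs the wrapped expression on first use, then returns the saved value.
    const T& force() {
        if (!value_) {
            value_ = compute_();
        }
        return *value_;
    }

private:
    std::function<T()> compute_;
    std::optional<T> value_;
};

// Counts how many times the delayed expression actually ran.
int evaluations = 0;

int expensive_square(int x) {
    ++evaluations;
    return x * x;
}

// Forces the same thunk twice; the wrapped expression is evaluated only once.
int force_twice() {
    Lazy<int> lazy([] { return expensive_square(7); });
    return lazy.force() + lazy.force();
}
```

Constructing the `Lazy<int>` does not run `expensive_square`; the work happens on the first `force()` call, and the second call reuses the cached value.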

Object-oriented programming

Thunks are useful in object-oriented programming platforms that allow a class to inherit multiple interfaces, leading to situations where the same method might be called via any of several interfaces. The following outline illustrates such a situation in C++:

    class A { /* ... */ };
    class B { /* ... */ };
    class C : public A, public B { /* ... */ };
    int use(B *b) { /* ... */ }
    int main() { /* ... */ }

In this example, the code generated for each of the classes A, B and C will include a dispatch table that can be used to call a virtual method on an object of that type, via a reference that has the same type. Class C will have an additional dispatch table, used to call the method on an object of type C via a reference of type B. A virtual call made through b inside use will use B's own dispatch table or the additional C table, depending on the type of object b refers to. If it refers to an object of type C, the compiler must ensure that C's implementation receives an instance address for the entire C object, rather than the inherited B part of that object.

As a direct approach to this pointer adjustment problem, the compiler can include an integer offset in each dispatch table entry. This offset is the difference between the reference's address and the address required by the method implementation. The code generated for each call through these dispatch tables must then retrieve the offset and use it to adjust the instance address before calling the method.

The solution just described has problems similar to the naïve implementation of call-by-name described earlier: the compiler generates several copies of code to calculate an argument (the instance address), while also increasing the dispatch table sizes to hold the offsets. As an alternative, the compiler can generate an ''adjustor thunk'' along with C's implementation of the method; the thunk adjusts the instance address by the required amount and then calls the method. The thunk can appear in C's dispatch table for B, thereby eliminating the need for callers to adjust the address themselves.
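The pointer adjustment can be made visible with a hand-written stand-in for the thunk. In the sketch below, the virtual method name `Access`, the data members, and the function `C_Access_adjustor_thunk` are assumptions made for illustration; a real compiler emits the adjustor thunk itself and installs it in the dispatch table, so none of this is written by hand in practice.

```cpp
#include <cassert>

// Hypothetical class bodies; the method name Access is an assumption.
class A {
public:
    virtual ~A() = default;
    virtual int Access() const { return a_value_; }
private:
    int a_value_ = 1;
};

class B {
public:
    virtual ~B() = default;
    virtual int Access() const { return b_value_; }
private:
    int b_value_ = 2;
};

class C : public A, public B {
public:
    int Access() const override { return c_value_; }
private:
    int c_value_ = 3;
};

// Calls Access through a B reference; for a C object this goes through
// C's dispatch table for B.
int use(B* b) { return b->Access(); }

// Hand-written stand-in for the compiler-generated adjustor thunk: it takes
// the address of the B part, converts it back to the address of the whole
// C object, and then calls C's implementation directly.
int C_Access_adjustor_thunk(const B* b_part) {
    const C* whole = static_cast<const C*>(b_part);  // applies the fixed offset
    return whole->C::Access();
}
```

Given `C c;`, the call `use(&c)` reaches C's implementation even though `use` only sees a `B*`, and `C_Access_adjustor_thunk(&c)` performs the same adjustment explicitly. On typical implementations the B part of a C object does not start at the same address as the C object, which is why the adjustment is needed.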

Interoperability

Thunks have been widely used to provide interoperability between software modules whose routines cannot call each other directly. This may occur because the routines have different calling conventions, run in different CPU modes or address spaces, or at least one runs in a virtual machine. A compiler (or other tool) can solve this problem by generating a thunk that automates the additional steps needed to call the target routine, whether that is transforming arguments, copying them to another location, or switching the CPU mode. A successful thunk minimizes the extra work the caller must do compared to a normal call.

Much of the literature on interoperability thunks relates to various Wintel platforms, including MS-DOS, OS/2, Windows and .NET, and to the transition from 16-bit to 32-bit memory addressing. As customers have migrated from one platform to another, thunks have been essential to support legacy software written for the older platforms. The UEFI Compatibility Support Module (CSM) is another example, using thunks to support legacy boot loaders.

The transition from 32-bit to 64-bit code on x86 also uses a form of thunking (WoW64). However, because the x86-64 address space is larger than the one available to 32-bit code, the old "generic thunk" mechanism could not be used to call 64-bit code from 32-bit code. The only case of 32-bit code calling 64-bit code is WoW64's own thunking of Windows APIs for 32-bit code.
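The argument-transforming role of an interoperability thunk can be illustrated at a much smaller scale than a CPU-mode switch. The sketch below is an analogy only: it assumes a hypothetical C-style interface (`LegacyCallback`, `legacy_apply`) that can call only plain functions taking an opaque context pointer, and a hypothetical `Scaler` class whose member function uses a different convention. It does not model 16-bit, 32-bit, or WoW64 thunking.

```cpp
#include <cassert>

// Hypothetical "legacy" interface: callbacks are plain functions that
// receive an opaque context pointer and one int.
using LegacyCallback = int (*)(void* context, int value);

int legacy_apply(LegacyCallback callback, void* context, int value) {
    return callback(context, value);
}

// Target routine with a different convention: a member function that
// expects an object instance rather than a void pointer.
class Scaler {
public:
    explicit Scaler(int factor) : factor_(factor) {}
    int scale(int value) const { return factor_ * value; }
private:
    int factor_;
};

// The thunk: matches the legacy signature, converts the arguments, and
// forwards the call to the target routine.
int scaler_thunk(void* context, int value) {
    const Scaler* target = static_cast<const Scaler*>(context);
    return target->scale(value);
}

// The caller hands the thunk to the legacy interface as an ordinary callback.
int scale_through_legacy_api(int factor, int value) {
    Scaler scaler(factor);
    return legacy_apply(&scaler_thunk, &scaler, value);
}
```

The caller's extra work is limited to passing `scaler_thunk` and a context pointer; the conversion itself is hidden inside the thunk.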

Overlays and dynamic linking

On systems that lack automatic virtual memory hardware, thunks can implement a limited form of virtual memory known as overlays. With overlays, a developer divides a program's code into segments that can be loaded and unloaded independently, and identifies the entry points into each segment. A segment that calls into another segment must do so indirectly via a branch table. When a segment is in memory, its branch table entries jump into the segment. When a segment is unloaded, its entries are replaced with "reload thunks" that can reload it on demand.

Similarly, systems that dynamically link modules of a program together at run-time can use thunks to connect the modules. Each module can call the others through a table of thunks that the linker fills in when it loads the module. This way the modules can interact without prior knowledge of where they are located in memory.
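The reload-thunk idea can be sketched with a table of function pointers. Everything in the example below is hypothetical: `triple` stands for a routine in another segment, "loading" is reduced to patching a table entry, and the `loads` counter exists only to show that the thunk runs once. A real overlay manager or dynamic linker would also bring code into memory and resolve addresses.

```cpp
#include <cassert>

using Routine = int (*)(int);

// The routine that lives in the other segment or module (hypothetical).
int triple(int x) { return 3 * x; }

int loads = 0;          // how many times the reload thunk ran
int reload_thunk(int x);  // forward declaration for the table initializer

// One-entry branch table through which callers reach the routine. While
// the segment is "unloaded", the entry points at the reload thunk.
Routine branch_table[1] = { &reload_thunk };

// The first call lands here: "load" the segment, patch the table entry so
// later calls jump straight to the routine, then forward this call.
int reload_thunk(int x) {
    ++loads;
    branch_table[0] = &triple;
    return branch_table[0](x);
}

// Callers always go through the table and never name the routine directly.
int call_triple(int x) { return branch_table[0](x); }
```

The first `call_triple` goes through the thunk and later calls reach `triple` directly, so the caller's code is the same whether or not the target has been loaded yet.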
