In computer science, overhead is any combination of excess or indirect computation time, memory, bandwidth, or other resources that are required to perform a specific task. It is a special case of engineering overhead. Overhead can be a deciding factor in software design, with regard to structure, error correction, and feature inclusion. Examples of computing overhead may be found in object-oriented programming (OOP), functional programming, data transfer, and data structures.
Software design
Choice of implementation
A programmer or software engineer may have a choice of several algorithms, encodings, data types, or data structures, each of which has known characteristics. When choosing among them, their respective overhead should also be considered.
Tradeoffs
In software engineering, overhead can influence the decision whether or not to include features in new products, or indeed whether to fix bugs. A feature that has a high overhead may not be included, or may need a large financial incentive to justify its inclusion. Often, even though software providers are well aware of bugs in their products, the payoff of fixing them is not worth the cost, because of the overhead.
For example, an implicit data structure or succinct data structure may provide low space overhead, but at the cost of slow performance (a space/time tradeoff).
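The space/time tradeoff can be illustrated with a minimal sketch. The class below is not an implicit or succinct data structure in the formal sense; it is a hypothetical bit-packed flag array (the name PackedBits and its methods are assumptions made for this illustration) that stores eight flags per byte and pays for the saved space with extra shift and mask work on every access.

```python
class PackedBits:
    """Fixed-size array of boolean flags stored one bit per flag."""

    def __init__(self, size):
        self.size = size
        # One byte holds eight flags, so round the byte count up.
        self.data = bytearray((size + 7) // 8)

    def _check(self, i):
        if not 0 <= i < self.size:
            raise IndexError(i)

    def set(self, i, value):
        self._check(i)
        byte, bit = divmod(i, 8)
        if value:
            self.data[byte] |= 1 << bit
        else:
            self.data[byte] &= ~(1 << bit) & 0xFF

    def get(self, i):
        self._check(i)
        byte, bit = divmod(i, 8)
        # Extra work per access: locate the byte, shift, and mask.
        return bool((self.data[byte] >> bit) & 1)


if __name__ == "__main__":
    flags = PackedBits(1000)
    flags.set(3, True)
    flags.set(999, True)
    print(len(flags.data), "bytes hold", flags.size, "flags")
```

A plain list of booleans would make each access a single index operation but would use far more memory per flag; which side of the tradeoff matters more depends on the application.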
Run-time complexity of software
Algorithmic complexity is generally specified using Big O notation. This says nothing about how long something takes to run or how much memory it uses in absolute terms; it describes how that cost grows with the size of the input. Overhead is deliberately not part of this calculation, since it varies from one machine to another, whereas the fundamental running time of an algorithm does not.
This should be contrasted with algorithmic efficiency, which takes into account all kinds of resources: a combination (though not a trivial one) of complexity and overhead.
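The difference between complexity and efficiency can be sketched with a toy cost model. All constants below are hypothetical and chosen only for illustration: one algorithm is linear but carries a large fixed overhead, the other is quadratic with none. Big O notation ranks the linear one as better, yet for small inputs the quadratic one costs less.

```python
def cost_linear(n, per_item=5, fixed_overhead=1000):
    """Hypothetical cost of an O(n) algorithm with a large fixed overhead."""
    return per_item * n + fixed_overhead


def cost_quadratic(n, per_pair=1, fixed_overhead=0):
    """Hypothetical cost of an O(n^2) algorithm with no fixed overhead."""
    return per_pair * n * n + fixed_overhead


def crossover(max_n=10_000):
    """Smallest n at which the linear algorithm becomes cheaper, or None."""
    for n in range(1, max_n + 1):
        if cost_linear(n) < cost_quadratic(n):
            return n
    return None


if __name__ == "__main__":
    print("linear wins from n =", crossover())
```

Under these assumed constants the quadratic algorithm is cheaper for every input smaller than the crossover point, which is why efficiency has to account for overhead as well as growth rate.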
Examples
Computer programming (run-time and computational overhead)
Invoking a function introduces a small run-time overhead. Sometimes the compiler can minimize this overhead by inlining some of these function calls.
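A rough way to observe call overhead is to time the same arithmetic with and without a function call per iteration. The sketch below assumes a Python interpreter that does not inline calls, so the inlining is done by hand to stand in for what a compiler might do; the function names are hypothetical and the measured times will vary from machine to machine.

```python
import timeit


def add_one(x):
    """Trivial helper: the work is tiny, so the call cost is noticeable."""
    return x + 1


def sum_with_calls(n):
    total = 0
    for i in range(n):
        total += add_one(i)  # one function call per iteration
    return total


def sum_inlined(n):
    total = 0
    for i in range(n):
        total += i + 1  # same arithmetic, inlined by hand
    return total


if __name__ == "__main__":
    n = 100_000
    t_calls = timeit.timeit(lambda: sum_with_calls(n), number=20)
    t_inline = timeit.timeit(lambda: sum_inlined(n), number=20)
    print(f"with calls: {t_calls:.3f} s, inlined: {t_inline:.3f} s")
```

Both functions return the same result; only the time spent entering and leaving add_one differs.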
CPU caches
In a CPU cache, the "cache size" (or capacity) refers to how much data the cache stores. For instance, a "4 KB cache" is a cache that holds 4 KB of data. The "4 KB" in this example excludes overhead bits such as frame, address, and tag information.
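The size of these overhead bits can be estimated with a short calculation. The sketch below assumes a direct-mapped cache with power-of-two sizes, 32-bit addresses, 64-byte lines, and one valid bit per line; none of these parameters come from the text, and real caches store further state that is not modelled here.

```python
def cache_overhead_bits(capacity_bytes, line_bytes, address_bits, valid_bits=1):
    """Tag and valid bits stored alongside the data in a direct-mapped cache.

    Assumes capacity_bytes and line_bytes are powers of two.
    """
    num_lines = capacity_bytes // line_bytes
    offset_bits = (line_bytes - 1).bit_length()  # bits selecting a byte in a line
    index_bits = (num_lines - 1).bit_length()    # bits selecting a line
    tag_bits = address_bits - index_bits - offset_bits
    return num_lines * (tag_bits + valid_bits)


if __name__ == "__main__":
    capacity = 4 * 1024  # the "4 KB cache" from the text
    extra = cache_overhead_bits(capacity, line_bytes=64, address_bits=32)
    data_bits = capacity * 8
    print(f"{extra} overhead bits on top of {data_bits} data bits "
          f"({100 * extra / data_bits:.1f}%)")
```

Under these assumptions a 4 KB cache has 64 lines, each with a 20-bit tag and a valid bit, so it needs 1344 bits of storage beyond the 4 KB it is named for.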
Communications (data transfer overhead)
Reliably sending a payload of data over a communications network requires sending more than just the payload itself. It also involves sending various control and signalling data (such as that used by TCP) required to reach the destination. This creates a so-called protocol overhead, as the additional data does not contribute to the intrinsic meaning of the message (see "Protocol Overhead in IP/ATM Networks", Minnesota Supercomputer Center).
In telephony, number dialing and call set-up time are overheads. In two-way (but half-duplex) radios, the use of "over" and other signalling needed to avoid collisions is an overhead.
Protocol overhead can be expressed as a percentage: the number of non-application bytes (protocol and frame synchronization) divided by the total number of bytes in the message.
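The percentage described above can be computed directly. The sketch below uses a hypothetical helper, protocol_overhead_percent, and assumes 40 bytes of non-application data per message, which corresponds to a minimal IPv4 header (20 bytes) plus a minimal TCP header (20 bytes) with no options and no link-layer framing counted.

```python
def protocol_overhead_percent(payload_bytes, overhead_bytes):
    """Non-application bytes as a percentage of all bytes in the message."""
    total = payload_bytes + overhead_bytes
    if total <= 0:
        raise ValueError("message must contain at least one byte")
    return 100.0 * overhead_bytes / total


if __name__ == "__main__":
    headers = 20 + 20  # assumed: minimal IPv4 header + minimal TCP header
    for payload in (1460, 100, 1):
        pct = protocol_overhead_percent(payload, headers)
        print(f"payload {payload:>4} bytes: {pct:.2f}% overhead")
```

The same fixed header cost is a small fraction of a large message and nearly all of a one-byte message, which is why small payloads are relatively expensive to send.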
Encodings and data structures (size overhead)
The encoding of information and data introduces overhead too. The date and time "2011-07-12 07:18:47" can be expressed as Unix time with the 32-bit signed integer 1310447927, consuming only 4 bytes. Represented as the ISO 8601 formatted, UTF-8 encoded string 2011-07-12 07:18:47, the date would consume 19 bytes, a size overhead of 375% over the binary integer representation. As XML, this date can be written as follows with an overhead of 218 characters, while adding the semantic context that it is a CHANGEDATE with index 1.
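[XML listing; only the element values survive: 2011, 07, 12, 07, 18, 47, corresponding to year, month, day, hour, minute, and second]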
The 349 bytes resulting from the UTF-8 encoded XML correspond to a size overhead of 8625% over the original integer representation.
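The percentages in this example can be checked with a short calculation. The sketch below packs the integer with Python's struct module and encodes the string as UTF-8; the helper size_overhead_percent is a hypothetical name, and the 349-byte XML size is taken from the text as given, not recomputed.

```python
import struct


def size_overhead_percent(encoded_len, baseline_len):
    """Extra bytes relative to the baseline representation, as a percentage."""
    return 100.0 * (encoded_len - baseline_len) / baseline_len


unix_time = 1310447927
binary = struct.pack("<i", unix_time)         # 32-bit signed integer
text = "2011-07-12 07:18:47".encode("utf-8")  # date and time as a string

if __name__ == "__main__":
    print(len(binary), "bytes as a 32-bit integer")
    print(len(text), "bytes as a UTF-8 string:",
          size_overhead_percent(len(text), len(binary)), "% overhead")
    # 349 is the XML size stated in the text.
    print("349 bytes as XML:",
          size_overhead_percent(349, len(binary)), "% overhead")
```

The overhead is measured against the 4-byte integer in both cases, so 19 bytes gives (19 - 4) / 4 = 375% and 349 bytes gives (349 - 4) / 4 = 8625%.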
See also
* Rule of least power
* Universal Turing machine