In computing, computer performance is the amount of useful work accomplished by a computer system. Outside of specific contexts, computer performance is estimated in terms of accuracy, efficiency and speed of executing computer program instructions. When it comes to high computer performance, one or more of the following factors might be involved:
* Short response time for a given piece of work.
* High throughput (rate of processing work).
* Low utilization of computing resource(s).
** Fast (or highly compact) data compression and decompression.
* High availability of the computing system or application.
* High bandwidth.
* Short data transmission time.


Technical and non-technical definitions

The performance of any computer system can be evaluated in measurable, technical terms, using one or more of the metrics listed above. This way the performance can be
* Compared relative to other systems or the same system before/after changes
* In absolute terms, e.g. for fulfilling a contractual obligation
Whilst the above definition relates to a scientific, technical approach, the following definition given by Arnold Allen would be useful for a non-technical audience:

''The word ''performance'' in computer performance means the same thing that performance means in other contexts, that is, it means "How well is the computer doing the work it is supposed to do?"''


As an aspect of software quality

Computer software performance, particularly software application response time, is an aspect of software quality that is important in human–computer interactions.


Performance engineering

Performance engineering within systems engineering encompasses the set of roles, skills, activities, practices, tools, and deliverables applied at every phase of the systems development life cycle which ensures that a solution will be designed, implemented, and operationally supported to meet the performance requirements defined for the solution. Performance engineering continuously deals with trade-offs between types of performance. Occasionally a CPU designer can find a way to make a CPU with better overall performance by improving one of the aspects of performance, presented below, without sacrificing the CPU's performance in other areas, for example, by building the CPU out of better, faster transistors. However, sometimes pushing one type of performance to an extreme leads to a CPU with worse overall performance, because other important aspects were sacrificed to get one impressive-looking number, for example, the chip's clock rate (see the megahertz myth).


Application performance engineering

Application Performance Engineering (APE) is a specific methodology within performance engineering designed to meet the challenges associated with application performance in increasingly distributed mobile, cloud and terrestrial IT environments. It includes the roles, skills, activities, practices, tools and deliverables applied at every phase of the application lifecycle that ensure an application will be designed, implemented and operationally supported to meet non-functional performance requirements.


Aspects of performance

Computer performance metrics (things to measure) include availability, response time, channel capacity, latency, completion time, service time, bandwidth, throughput, relative efficiency, scalability, performance per watt, compression ratio, instruction path length and speedup. CPU benchmarks are available.


Availability

Availability of a system is typically measured as a factor of its reliability - as reliability increases, so does availability (that is, less downtime). Availability of a system may also be increased by the strategy of focusing on increasing testability and maintainability and not on reliability. Improving maintainability is generally easier than improving reliability. Maintainability estimates (repair rates) are also generally more accurate. However, because the uncertainties in the reliability estimates are in most cases very large, reliability is likely to dominate the availability (prediction uncertainty) problem, even while maintainability levels are very high.
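As a rough illustration, steady-state availability is often estimated from the mean time between failures (MTBF) and the mean time to repair (MTTR). A minimal sketch in Python, with figures invented purely for illustration:

    # Steady-state availability estimated from MTBF and MTTR.
    # The numbers below are illustrative, not from any real system.
    def availability(mtbf_hours: float, mttr_hours: float) -> float:
        """Fraction of time the system is expected to be operational."""
        return mtbf_hours / (mtbf_hours + mttr_hours)

    # Improving maintainability (lower MTTR) raises availability
    # without touching reliability (MTBF).
    print(availability(mtbf_hours=1000.0, mttr_hours=10.0))  # ~0.990
    print(availability(mtbf_hours=1000.0, mttr_hours=1.0))   # ~0.999

Note how reducing repair time improves availability even when the failure rate is unchanged, which is why focusing on maintainability can pay off.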


Response time

Response time is the total amount of time it takes to respond to a request for service. In computing, that service can be any unit of work from a simple disk IO to loading a complex web page. The response time is the sum of three numbers (see the sketch after this list):
* Service time - how long it takes to do the work requested.
* Wait time - how long the request has to wait for requests queued ahead of it before it gets to run.
* Transmission time - how long it takes to move the request to the computer doing the work and the response back to the requestor.
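The decomposition is plain arithmetic; a minimal sketch, with hypothetical component times:

    # Response time as the sum of its three components (all in seconds).
    # The component values are hypothetical, for illustration only.
    service_time = 0.040       # doing the requested work
    wait_time = 0.015          # queued behind earlier requests
    transmission_time = 0.005  # moving request and response over the wire

    response_time = service_time + wait_time + transmission_time
    print(f"response time: {response_time * 1000:.1f} ms")  # 60.0 ms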


Processing speed

Most consumers pick a computer architecture (normally Intel IA-32 architecture) to be able to run a large base of pre-existing, pre-compiled software. Being relatively uninformed on computer benchmarks, some of them pick a particular CPU based on operating frequency (see megahertz myth). Some system designers building parallel computers pick CPUs based on the speed per dollar.


Channel capacity

Channel capacity is the tightest upper bound on the rate of information that can be reliably transmitted over a communications channel. By the noisy-channel coding theorem, the channel capacity of a given channel is the limiting information rate (in units of information per unit time) that can be achieved with arbitrarily small error probability. Information theory, developed by Claude E. Shannon during World War II, defines the notion of channel capacity and provides a mathematical model by which one can compute it. The key result states that the capacity of the channel, as defined above, is given by the maximum of the mutual information between the input and output of the channel, where the maximization is with respect to the input distribution.
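For the special case of a bandlimited channel with additive white Gaussian noise, that maximization yields the Shannon-Hartley capacity C = B log2(1 + S/N). A minimal sketch, using illustrative bandwidth and signal-to-noise figures:

    import math

    # Shannon-Hartley capacity of an AWGN channel:
    #   C = B * log2(1 + S/N), in bits per second.
    # Bandwidth and SNR below are illustrative values only.
    def channel_capacity(bandwidth_hz: float, snr_linear: float) -> float:
        return bandwidth_hz * math.log2(1.0 + snr_linear)

    # A 3 kHz telephone-grade channel with a linear SNR of 1000 (30 dB):
    print(channel_capacity(3000.0, 1000.0))  # about 29,900 bits per second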


Latency

Latency is a time delay between the cause and the effect of some physical change in the system being observed. Latency is a result of the limited velocity with which any physical interaction can take place. This velocity is always lower than or equal to the speed of light. Therefore, every physical system that has non-zero spatial dimensions will experience some sort of latency. The precise definition of latency depends on the system being observed and the nature of stimulation.

In communications, the lower limit of latency is determined by the medium being used for communications. In reliable two-way communication systems, latency limits the maximum rate at which information can be transmitted, as there is often a limit on the amount of information that is "in-flight" at any one moment. In the field of human-machine interaction, perceptible latency (the delay between what the user commands and when the computer provides the results) has a strong effect on user satisfaction and usability.

Computers run sets of instructions called a process. In operating systems, the execution of the process can be postponed if other processes are also executing. In addition, the operating system can schedule when to perform the action that the process is commanding. For example, suppose a process commands that a computer card's voltage output be set high-low-high-low and so on at a rate of 1000 Hz. The operating system may choose to adjust the scheduling of each transition (high-low or low-high) based on an internal clock. The latency is the delay between the process instruction commanding the transition and the hardware actually transitioning the voltage from high to low or low to high.

System designers building real-time computing systems want to guarantee worst-case response. That is easier to do when the CPU has low interrupt latency and when it has deterministic response.
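One common way to observe latency from software is simply to timestamp the cause and the effect. A minimal sketch using a monotonic high-resolution clock, with an arbitrary placeholder workload:

    import time

    # Measure the latency of one request/response cycle. The "work"
    # stands in for any real operation (disk IO, a network round trip).
    def do_work() -> None:
        sum(range(100_000))  # arbitrary placeholder workload

    start = time.perf_counter()
    do_work()
    latency_s = time.perf_counter() - start
    print(f"observed latency: {latency_s * 1e3:.3f} ms")

A real-time system would repeat such a measurement many times and report the worst case, not the average, since it is the worst case that must be guaranteed.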


Bandwidth

In computer networking, bandwidth is a measurement of the bit-rate of available or consumed data communication resources, expressed in bits per second or multiples of it (bit/s, kbit/s, Mbit/s, Gbit/s, etc.). Bandwidth sometimes defines the net bit rate (also known as peak bit rate, information rate, or physical layer useful bit rate), channel capacity, or the maximum throughput of a logical or physical communication path in a digital communication system. For example, bandwidth tests measure the maximum throughput of a computer network. The reason for this usage is that according to Hartley's law, the maximum data rate of a physical communication link is proportional to its bandwidth in hertz, which is sometimes called frequency bandwidth, spectral bandwidth, RF bandwidth, signal bandwidth or analog bandwidth.
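Bandwidth figures translate directly into best-case transfer times. A minimal sketch, assuming an uncontended link and ignoring protocol overhead:

    # Best-case time to move a payload over a link of a given bandwidth.
    # Assumes an uncontended link and ignores protocol overhead.
    def transfer_time_s(payload_bytes: int, bandwidth_bit_per_s: float) -> float:
        return (payload_bytes * 8) / bandwidth_bit_per_s

    # A 100 MB file over a 100 Mbit/s link:
    print(transfer_time_s(100 * 10**6, 100e6))  # 8.0 seconds

Measured throughput is usually lower than this figure, which is one reason bandwidth and throughput are kept as distinct metrics.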


Throughput

In general terms, throughput is the rate of production or the rate at which something can be processed. In communication networks, throughput is essentially synonymous with digital bandwidth consumption. In wireless networks or cellular communication networks, the system spectral efficiency in bit/s/Hz/area unit, bit/s/Hz/site or bit/s/Hz/cell, is the maximum system throughput (aggregate throughput) divided by the analog bandwidth and some measure of the system coverage area. In integrated circuits, often a block in a data flow diagram has a single input and a single output, and operates on discrete packets of information. Examples of such blocks are FFT modules or binary multipliers. Because the units of throughput are the reciprocal of the unit for propagation delay, which is 'seconds per message' or 'seconds per output', throughput can be used to relate a computational device performing a dedicated function such as an ASIC or embedded processor to a communications channel, simplifying system analysis.
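The reciprocal relationship between per-item delay and throughput is easy to see numerically. A minimal sketch, with a hypothetical figure:

    # Throughput as the reciprocal of per-item delay. The figure is
    # hypothetical; a pipelined block can also accept a new item
    # before the previous one finishes, raising throughput further.
    seconds_per_message = 0.002

    throughput_msg_per_s = 1.0 / seconds_per_message
    print(f"throughput: {throughput_msg_per_s:.0f} messages/second")  # 500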


Relative efficiency


Scalability

Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner, or its ability to be enlarged to accommodate that growth.


Power consumption

The amount of electric power used by the computer (power consumption). This becomes especially important for systems with limited power sources such as solar, batteries, and human power.


Performance per watt

System designers building parallel computers, such as Google's hardware, pick CPUs based on their speed per watt of power, because the cost of powering the CPU outweighs the cost of the CPU itself. For spaceflight computers, the processing speed per watt ratio is a more useful performance criterion than raw processing speed.
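Comparing candidate CPUs on this criterion is straightforward arithmetic. A minimal sketch with invented figures:

    # Performance per watt for two hypothetical CPUs.
    # Raw speed and power draw are invented for illustration.
    cpus = {
        "cpu_a": {"gflops": 200.0, "watts": 100.0},
        "cpu_b": {"gflops": 150.0, "watts": 50.0},
    }

    for name, spec in cpus.items():
        print(name, spec["gflops"] / spec["watts"], "GFLOPS/W")
    # cpu_b delivers 3.0 GFLOPS/W versus 2.0 for cpu_a, so it wins on
    # performance per watt despite being slower in absolute terms.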


Compression ratio

Compression is useful because it helps reduce resource usage, such as data storage space or transmission capacity. Because compressed data must be decompressed to be used, this extra processing imposes computational or other costs through decompression; this situation is far from being a free lunch. Data compression is subject to a space–time complexity trade-off.
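The space–time trade-off is easy to observe with a general-purpose compressor. A minimal sketch using Python's standard zlib module on made-up, highly repetitive data:

    import zlib

    # Compression ratio = uncompressed size / compressed size.
    # Repetitive input compresses well; random-looking input does not.
    data = b"computer performance " * 1000

    compressed = zlib.compress(data, level=9)  # level 9: smallest, slowest
    ratio = len(data) / len(compressed)
    print(f"{len(data)} -> {len(compressed)} bytes, ratio {ratio:.1f}:1")

Lower compression levels run faster but leave the data larger, which is the space–time trade-off in miniature.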


Size and weight

This is an important performance feature of mobile systems, from the smartphones you keep in your pocket to the portable embedded systems in a spacecraft.


Environmental impact

The effect of a computer or computers on the environment, during manufacturing and recycling as well as during use. Measurements are taken with the objectives of reducing waste, reducing hazardous materials, and minimizing a computer's ecological footprint.


Transistor count

The transistor count is the number of transistors on an integrated circuit (IC). Transistor count is the most common measure of IC complexity.


Benchmarks

Because there are so many programs to test a CPU on all aspects of performance, benchmarks were developed. The most famous benchmarks are the SPECint and SPECfp benchmarks developed by the Standard Performance Evaluation Corporation (SPEC) and the Certification Mark benchmark developed by the Embedded Microprocessor Benchmark Consortium (EEMBC).
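Industry suites like SPECint and SPECfp aside, the core idea of any benchmark is timed, repeatable work. A minimal micro-benchmark sketch using Python's standard timeit module, with an arbitrary workload:

    import timeit

    # Time a fixed, repeatable workload many times and report the best
    # run, which is the one least disturbed by other processes.
    def workload() -> int:
        return sum(i * i for i in range(10_000))  # arbitrary fixed work

    times = timeit.repeat(workload, number=100, repeat=5)
    print(f"best of 5: {min(times) / 100 * 1e6:.1f} microseconds per run")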


Software performance testing

In software engineering, performance testing is in general testing performed to determine how a system performs in terms of responsiveness and stability under a particular workload. It can also serve to investigate, measure, validate or verify other quality attributes of the system, such as scalability, reliability and resource usage. Performance testing is a subset of performance engineering, an emerging computer science practice which strives to build performance into the implementation, design and architecture of a system.
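At its simplest, a performance test applies a defined workload and records latency statistics. A minimal sketch, with a placeholder standing in for the system under test:

    import statistics
    import time

    # Drive a system-under-test stand-in with a fixed workload and
    # summarize its responsiveness. Replace handle_request with a real call.
    def handle_request() -> None:
        sum(range(50_000))  # placeholder for the system under test

    latencies = []
    for _ in range(200):  # the "particular workload": 200 requests
        start = time.perf_counter()
        handle_request()
        latencies.append(time.perf_counter() - start)

    print(f"median: {statistics.median(latencies) * 1e3:.2f} ms")
    print(f"p95:    {statistics.quantiles(latencies, n=20)[-1] * 1e3:.2f} ms")

A fuller test would also vary the offered load to probe stability, scalability and resource usage, per the definition above.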


Profiling (performance analysis)

In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. The most common use of profiling information is to aid program optimization. Profiling is achieved by instrumenting either the program source code or its binary executable form using a tool called a ''profiler'' (or ''code profiler''). A number of different techniques may be used by profilers, such as event-based, statistical, instrumented, and simulation methods.
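As one concrete instance, Python ships an instrumenting profiler in its standard library. A minimal sketch profiling an arbitrary program:

    import cProfile

    # cProfile records the frequency and duration of function calls,
    # exactly the kind of data fed back into program optimization.
    def slow_part() -> list:
        return [i ** 2 for i in range(100_000)]

    def program() -> None:
        for _ in range(10):
            slow_part()

    cProfile.run("program()")  # prints per-function call counts and times

The report typically points straight at the hottest functions, which is where optimization effort pays off first.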


Performance tuning

Performance tuning is the improvement of system performance. This is typically a computer application, but the same methods can be applied to economic markets, bureaucracies or other complex systems. The motivation for such activity is called a performance problem, which can be real or anticipated. Most systems will respond to increased load with some degree of decreasing performance. A system's ability to accept a higher load is called scalability, and modifying a system to handle a higher load is synonymous with performance tuning. Systematic tuning follows these steps (a tiny harness for the measure-modify-measure loop is sketched after the list):
# Assess the problem and establish numeric values that categorize acceptable behavior.
# Measure the performance of the system before modification.
# Identify the part of the system that is critical for improving the performance. This is called the bottleneck.
# Modify that part of the system to remove the bottleneck.
# Measure the performance of the system after modification.
# If the modification makes the performance better, adopt it. If the modification makes the performance worse, put it back to the way it was.
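A minimal harness for the measure-modify-measure loop, with both implementations as placeholders:

    import time

    # Compare a baseline implementation against a proposed modification.
    # Both functions are placeholders for the real bottleneck component.
    def baseline() -> int:
        return sum([i for i in range(200_000)])  # builds a throwaway list

    def modified() -> int:
        return sum(range(200_000))               # avoids the extra list

    def measure(fn, runs: int = 20) -> float:
        """Average wall-clock seconds per run."""
        start = time.perf_counter()
        for _ in range(runs):
            fn()
        return (time.perf_counter() - start) / runs

    before, after = measure(baseline), measure(modified)
    # Step 6: adopt the change only if it actually measures better.
    print("adopt" if after < before else "revert",
          f"({before:.5f}s -> {after:.5f}s per run)")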


Perceived performance

Perceived performance, in computer engineering, refers to how quickly a software feature appears to perform its task. The concept applies mainly to user acceptance aspects. The amount of time an application takes to start up, or a file to download, is not made faster by showing a startup screen (see splash screen) or a file progress dialog box. However, such cues satisfy some human needs: the operation appears faster to the user, and the cues provide a visual signal letting users know the system is handling their request. In most cases, increasing real performance increases perceived performance, but when real performance cannot be increased due to physical limitations, techniques can be used to increase perceived performance.


Performance Equation

The total amount of time (t) required to execute a particular benchmark program is

: t = \tfrac{N \times C}{f}

or equivalently

: P = \tfrac{I \times f}{N}

(Paul DeMone, "The Incredible Shrinking CPU", 2004), where
* P = \tfrac{1}{t} is "the performance" in terms of time-to-execute.
* N is the number of instructions actually executed (the instruction path length). The code density of the instruction set strongly affects N. The value of N can either be determined exactly by using an instruction set simulator (if available) or by estimation, itself based partly on the estimated or actual frequency distribution of input variables and on examination of the machine code generated by an HLL compiler. It cannot be determined from the number of lines of HLL source code. N is not affected by other processes running on the same processor. The significant point here is that hardware normally does not keep track of (or at least make easily available) a value of N for executed programs. The value can therefore only be accurately determined by instruction set simulation, which is rarely practiced.
* f is the clock frequency in cycles per second.
* C = \tfrac{1}{I} is the average cycles per instruction (CPI) for this benchmark.
* I = \tfrac{1}{C} is the average instructions per cycle (IPC) for this benchmark.

Even on one machine, a different compiler or the same compiler with different compiler optimization switches can change N and CPI; the benchmark executes faster if the new compiler can improve N or C without making the other worse, but often there is a trade-off between them. Is it better, for example, to use a few complicated instructions that take a long time to execute, or to use instructions that execute very quickly, although it takes more of them to execute the benchmark?

A CPU designer is often required to implement a particular instruction set, and so cannot change N. Sometimes a designer focuses on improving performance by making significant improvements in f (with techniques such as deeper pipelines and faster caches), while (hopefully) not sacrificing too much C, leading to a speed-demon CPU design. Sometimes a designer focuses on improving performance by making significant improvements in CPI (with techniques such as out-of-order execution, superscalar CPUs, larger caches, caches with improved hit rates, improved branch prediction, speculative execution, etc.), while (hopefully) not sacrificing too much clock frequency, leading to a brainiac CPU design ("Brainiacs, Speed Demons, and Farewell" by Linley Gwennap). For a given instruction set (and therefore fixed N) and semiconductor process, the maximum single-thread performance (1/t) requires a balance between brainiac techniques and speed-demon techniques.
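Plugging in numbers makes the equation concrete. A minimal sketch with invented values for N, C, and f:

    # Performance equation: t = (N * C) / f, with P = 1 / t.
    # Instruction count, CPI, and clock frequency are invented figures.
    N = 2_000_000_000  # instructions executed (path length)
    C = 1.5            # average cycles per instruction (CPI)
    f = 3.0e9          # clock frequency in cycles per second (3 GHz)

    t = (N * C) / f
    print(f"execution time t = {t:.2f} s")  # 1.00 s

    # Halving CPI (say, via a wider superscalar core) halves t,
    # exactly as a doubled clock frequency would:
    print((N * (C / 2)) / f)  # 0.5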


See also

* Algorithmic efficiency
* Computer performance by orders of magnitude
* Network performance
* Latency oriented processor architecture
* Optimization (computer science)
* RAM update rate
* Complete instruction set
* Hardware acceleration
* Speedup
* Cache replacement policies


References