In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization, and more specifically, performance engineering. Profiling is achieved by instrumenting either the program source code or its binary executable form using a tool called a ''profiler'' (or ''code profiler''). Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.


Gathering program events

Profilers use a wide variety of techniques to collect data, including hardware interrupts, code instrumentation, instruction set simulation, operating system hooks, and performance counters.


Use of profilers

The output of a profiler may be:

* A statistical ''summary'' of the events observed (a profile)
: Summary profile information is often shown annotated against the source code statements where the events occur, so the size of measurement data is linear to the code size of the program.

 /* ------------ source------------------------- count */
 0001 IF X = "A"                                  0055
 0002 THEN DO
 0003   ADD 1 to XCOUNT                           0032
 0004 ELSE
 0005 IF X = "B"                                  0055

* A stream of recorded events (a trace)
: For sequential programs, a summary profile is usually sufficient, but performance problems in parallel programs (waiting for messages or synchronization issues) often depend on the time relationship of events, thus requiring a full trace to get an understanding of what is happening.
: The size of a (full) trace is linear to the program's instruction path length, making it somewhat impractical. A trace may therefore be initiated at one point in a program and terminated at another point to limit the output.
* An ongoing interaction with the hypervisor (continuous or periodic monitoring via on-screen display for instance)
: This provides the opportunity to switch a trace on or off at any desired point during execution in addition to viewing on-going metrics about the (still executing) program. It also provides the opportunity to suspend asynchronous processes at critical points to examine interactions with other parallel processes in more detail.

A profiler can be applied to an individual method or at the scale of a module or program, to identify performance bottlenecks by making long-running code obvious. A profiler can be used to understand code from a timing point of view, with the objective of optimizing it to handle various runtime conditions or various loads. Profiling results can be ingested by a compiler that provides profile-guided optimization. Profiling results can be used to guide the design and optimization of an individual algorithm; the Krauss matching wildcards algorithm is an example. Profilers are built into some application performance management systems that aggregate profiling data to provide insight into transaction workloads in distributed applications.
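
The difference between a summary profile and a trace can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not how any particular profiler is built: the `record` hook, the site names, and the `classify` workload are all hypothetical.

```python
from collections import Counter

trace = []           # full trace: one entry per event, in execution order
summary = Counter()  # summary profile: one counter per instrumented site


def record(site):
    """Hypothetical hook called wherever an event of interest occurs."""
    trace.append(site)
    summary[site] += 1


def classify(values):
    """Toy workload loosely modelled on the annotated listing above."""
    for x in values:
        record("test_a")
        if x == "A":
            record("count_a")
        else:
            record("test_b")


classify(["A", "B", "A", "C"])

# The summary holds one entry per site, so it is bounded by code size.
print(dict(summary))
# The trace holds one entry per executed event, so it grows with run length.
print(len(trace), trace[:4])
```

Doubling the input doubles the length of `trace` but leaves the number of keys in `summary` unchanged, which is why full traces are usually limited to a window of the run.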


History

Performance-analysis tools existed on IBM/360 and IBM/370 platforms from the early 1970s, usually based on timer interrupts which recorded the program status word (PSW) at set timer-intervals to detect "hot spots" in executing code. This was an early example of sampling (see below). In early 1974 instruction-set simulators permitted full trace and other performance-monitoring features.

Profiler-driven program analysis on Unix dates back to 1973 (Unix Programmer's Manual, 4th Edition), when Unix systems included a basic tool, prof, which listed each function and how much of program execution time it used. In 1982 gprof extended the concept to a complete call graph analysis (S.L. Graham, P.B. Kessler, and M.K. McKusick, ''gprof: a Call Graph Execution Profiler'', Proceedings of the SIGPLAN '82 Symposium on Compiler Construction, ''SIGPLAN Notices'', Vol. 17, No 6, pp. 120-126; doi:10.1145/800230.806987).

In 1994, Amitabh Srivastava and Alan Eustace of Digital Equipment Corporation published a paper describing ATOM (Analysis Tools with OM). The ATOM platform converts a program into its own profiler: at compile time, it inserts code into the program to be analyzed. That inserted code outputs analysis data. This technique, modifying a program to analyze itself, is known as "instrumentation".

In 2004 both the gprof and ATOM papers appeared on the list of the 50 most influential PLDI papers for the 20-year period ending in 1999.


Profiler types based on output


Flat profiler

Flat profilers compute the average call time of each function from its calls, and do not break the call times down by callee or by calling context.
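
As a rough illustration, the per-function listing printed by Python's standard `cProfile` and `pstats` modules can be read as a flat profile: one row per function with call counts and times, and no breakdown by caller. The `inner` and `outer` functions below are hypothetical workloads chosen for the example.

```python
import cProfile
import io
import pstats


def inner(n):
    """Hypothetical leaf function doing some arithmetic."""
    return sum(i * i for i in range(n))


def outer():
    """Hypothetical caller that invokes inner() repeatedly."""
    results = []
    for _ in range(50):
        results.append(inner(1000))
    return results


profiler = cProfile.Profile()
profiler.runcall(outer)

# Render the per-function table into a string, sorted by own time.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats("tottime").print_stats()
flat_report = buffer.getvalue()
print(flat_report)
```

The `percall` columns in the report are the averages mentioned above: total time divided by the number of calls, regardless of who made each call.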


Call-graph profiler

Call graph profilers show the call times and frequencies of the functions, and also the call chains involved, based on the callee. In some tools full context is not preserved.
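
A minimal sketch of the caller/callee view, again using Python's standard `cProfile` and `pstats` modules: `print_callers` lists, for a chosen function, which functions called it and how often. The functions `helper`, `parse`, `render`, and `main` are hypothetical. Only the immediate caller is shown, which is one example of full context not being preserved.

```python
import cProfile
import io
import pstats


def helper(text):
    """Hypothetical shared routine reached from two different callers."""
    return text.strip().lower()


def parse(lines):
    cleaned = []
    for line in lines:
        cleaned.append(helper(line))
    return cleaned


def render(title):
    return helper(title).title()


def main():
    parse([" A ", " B ", " C "])
    render(" report ")


profiler = cProfile.Profile()
profiler.runcall(main)

# Restrict the report to helper() and list the functions that called it.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).strip_dirs()
stats.print_callers(r"\(helper\)")
caller_report = buffer.getvalue()
print(caller_report)
```

A flat profile would report `helper` once with four calls; the caller view splits those calls between `parse` and `render`.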


Input-sensitive profiler

Input-sensitive profilers (E. Coppa, C. Demetrescu, and I. Finocchi, ''Profiling'', IEEE Trans. Software Eng. 40(12): 1185-1205 (2014); doi:10.1109/TSE.2014.2339825) add a further dimension to flat or call-graph profilers by relating performance measures to features of the input workloads, such as input size or input values. They generate charts that characterize how an application's performance scales as a function of its input.
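
The idea can be sketched by recording a cost measure against input size. The example below is an assumption-laden toy, not the method of the cited paper: it counts comparisons in a hypothetical `insertion_sort` on reverse-sorted inputs of several sizes, so the numbers are deterministic.

```python
def insertion_sort(values, counter):
    """Sort a copy of values, counting element comparisons in counter."""
    items = list(values)
    for i in range(1, len(items)):
        j = i
        while j > 0:
            counter["comparisons"] += 1
            if items[j - 1] <= items[j]:
                break
            items[j - 1], items[j] = items[j], items[j - 1]
            j -= 1
    return items


def profile_by_input_size(sizes):
    """Map each input size to the cost observed on a reverse-sorted input."""
    profile = {}
    for n in sizes:
        counter = {"comparisons": 0}
        insertion_sort(range(n, 0, -1), counter)
        profile[n] = counter["comparisons"]
    return profile


cost_by_size = profile_by_input_size([10, 20, 40, 80])
for n, cost in cost_by_size.items():
    print(f"n={n:3d}  comparisons={cost}")
```

Each doubling of the input size roughly quadruples the count, which is the kind of scaling trend such a profiler would present as a chart.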


Data granularity in profiler types

Profilers, which are also programs themselves, analyze target programs by collecting information on their execution. Based on their data granularity, that is, on how they collect information, they are classified into event-based or statistical profilers. Profilers interrupt program execution to collect information, which may limit the resolution of the time measurements; reported timings should therefore be treated as approximate. Basic block profilers report a number of machine clock cycles devoted to executing each line of code, or a timing based on adding these together; the timings reported per basic block may not reflect a difference between cache hits and misses.


Event-based profilers

The programming languages listed here have event-based profilers:

* Java: the JVMTI (JVM Tools Interface) API, formerly JVMPI (JVM Profiling Interface), provides hooks to profilers for trapping events such as calls, class load and unload, and thread enter and leave.
* .NET: a profiling agent can be attached as a ''COM'' server to the ''CLR'' using the Profiling ''API''. As with Java, the runtime then provides various callbacks into the agent for trapping events such as method JIT, enter, and leave, object creation, etc. This is particularly powerful in that the profiling agent can rewrite the target application's bytecode in arbitrary ways.
* Python: Python profiling includes the profile module, hotshot (which is call-graph based), and the 'sys.setprofile' function, which traps events such as c_call, c_return, and c_exception, as well as Python-level call and return.
* Ruby: Ruby uses an interface similar to Python's for profiling. A flat profiler in the profile.rb module and ruby-prof, a C extension, are available.
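
For the Python case, a minimal event-based profiler can be sketched with `sys.setprofile`, which asks the interpreter to invoke a callback on call and return events. The callback name `on_event` and the `square` and `workload` functions are hypothetical; exact event totals vary between interpreter versions, so the sketch only tallies them.

```python
import sys
from collections import Counter

event_counts = Counter()       # how many events of each kind were seen
calls_by_function = Counter()  # Python-level calls per function name


def on_event(frame, event, arg):
    """Profile callback: invoked by the interpreter for each event."""
    event_counts[event] += 1
    if event == "call":
        calls_by_function[frame.f_code.co_name] += 1


def square(x):
    return x * x


def workload():
    values = []
    for i in range(5):
        values.append(square(i))
    return len(values)  # len() is a C function, reported as c_call


sys.setprofile(on_event)
try:
    workload()
finally:
    sys.setprofile(None)  # always detach the hook

print(dict(event_counts))
print(dict(calls_by_function))
```

Because the callback runs on every event, the overhead grows with the number of calls, which is the usual trade-off of event-based profiling compared with sampling.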


Statistical profilers

Some profilers operate by sampling. A sampling profiler probes the target program's call stack at regular intervals using operating system interrupts. Sampling profiles are typically less numerically accurate and specific, but allow the target program to run at near full speed.

The resulting data are not exact, but a statistical approximation. "The actual amount of error is usually more than one sampling period. In fact, if a value is n times the sampling period, the expected error in it is the square-root of n sampling periods."

In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program, and thus don't have as many side effects (such as on memory caches or instruction decoding pipelines). Also, since they don't affect the execution speed as much, they can detect issues that would otherwise be hidden. They are also relatively immune to over-evaluating the cost of small, frequently called routines or 'tight' loops. They can show the relative amount of time spent in user mode versus interruptible kernel mode such as system call processing.

Still, kernel code to handle the interrupts entails a minor loss of CPU cycles and diverted cache usage, and sampling is unable to distinguish the various tasks occurring in uninterruptible kernel code (microsecond-range activity).

Dedicated hardware can go beyond this: the JTAG interfaces of ARM Cortex-M3 and some recent MIPS processors have a PCSAMPLE register, which samples the program counter in a truly undetectable manner, allowing non-intrusive collection of a flat profile.

Some commonly used statistical profilers for Java/managed code are SmartBear Software's AQtime and Microsoft's CLR Profiler. Those profilers also support native code profiling, along with Apple Inc.'s Shark (OSX), OProfile (Linux), Intel VTune and Parallel Amplifier (part of Intel Parallel Studio), and Oracle Performance Analyzer, among others.
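
The sampling approach can be approximated in pure Python. The sketch below is an assumption, not how the tools named above work: instead of operating system interrupts it uses a background thread that wakes at a fixed interval and inspects the target thread's current frame through `sys._current_frames()`. The `sampler` and `busy_loop` names and the 5 ms interval are illustrative choices.

```python
import sys
import threading
import time
from collections import Counter

samples = Counter()  # function name -> number of samples that landed in it


def sampler(target_thread_id, interval, stop_event):
    """Periodically record which function the target thread is executing."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        if frame is not None:
            samples[frame.f_code.co_name] += 1
        time.sleep(interval)


def busy_loop(seconds):
    """Hypothetical hot spot: spin until the deadline passes."""
    deadline = time.perf_counter() + seconds
    total = 0
    while time.perf_counter() < deadline:
        total += 1
    return total


stop = threading.Event()
thread = threading.Thread(
    target=sampler,
    args=(threading.get_ident(), 0.005, stop),
    daemon=True,
)
thread.start()
busy_loop(0.3)
stop.set()
thread.join()

total_samples = sum(samples.values())
for name, hits in samples.most_common():
    print(f"{name:12s} {hits:4d} samples  {100 * hits / total_samples:5.1f}%")
```

The counts differ from run to run, which is the statistical nature described above. By the quoted rule, a function observed in 100 samples carries an expected error of about 10 sampling periods, roughly 10% of its measured time.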


Instrumentation

This technique effectively adds instructions to the target program to collect the required information. Note that instrumenting a program can cause performance changes, and may in some cases lead to inaccurate results and/or heisenbugs. The effect will depend on what information is being collected, on the level of timing details reported, and on whether basic block profiling is used in conjunction with instrumentation. For example, adding code to count every procedure/routine call will probably have less effect than counting how many times each statement is obeyed. A few computers have special hardware to collect information; in this case the impact on the program is minimal.

Instrumentation is key to determining the level of control and amount of time resolution available to the profilers.

* Manual: Performed by the programmer, e.g. by adding instructions to explicitly calculate runtimes, simply count events or calls to measurement APIs such as the Application Response Measurement standard.
* Automatic source level: instrumentation added to the source code by an automatic tool according to an instrumentation policy.
* Intermediate language: instrumentation added to assembly or decompiled bytecodes, giving support for multiple higher-level source languages and avoiding (non-symbolic) binary offset re-writing issues.
* Compiler assisted
* Binary translation: The tool adds instrumentation to a compiled executable.
* Runtime instrumentation: Directly before execution the code is instrumented. The program run is fully supervised and controlled by the tool.
* Runtime injection: More lightweight than runtime instrumentation. Code is modified at runtime to have jumps to helper functions.
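
Manual instrumentation, the first item in the list above, can be sketched as a decorator that the programmer attaches to functions of interest. The `instrumented` decorator, the two dictionaries, and the `checksum` function are hypothetical names for this example.

```python
import functools
import time
from collections import defaultdict

call_counts = defaultdict(int)        # function name -> number of calls
elapsed_seconds = defaultdict(float)  # function name -> accumulated time


def instrumented(func):
    """Wrap func so that every call is counted and timed."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are measured too.
            call_counts[func.__name__] += 1
            elapsed_seconds[func.__name__] += time.perf_counter() - start
    return wrapper


@instrumented
def checksum(data):
    """Hypothetical function under measurement."""
    return sum(data) % 255


for _ in range(3):
    checksum(range(10_000))

for name in call_counts:
    print(f"{name}: {call_counts[name]} calls, {elapsed_seconds[name]:.6f} s total")
```

The wrapper itself adds time to every call, a small-scale instance of the performance changes that instrumentation can introduce.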


Interpreter instrumentation

* Interpreter debug options can enable the collection of performance metrics as the interpreter encounters each target statement. Bytecode, control table, and JIT interpreters are three examples that usually have complete control over execution of the target code, thus enabling extremely comprehensive data collection opportunities.
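
CPython's `sys.settrace` hook gives a small taste of interpreter-level collection: the interpreter calls back as it reaches each new source line. The sketch below counts line executions for one hypothetical function, `collatz_steps`; the `tracer` callback and the offset-based keys are choices made for this example.

```python
import sys
from collections import Counter

line_hits = Counter()  # line offset within the function -> times executed


def tracer(frame, event, arg):
    """Trace callback: count 'line' events inside collatz_steps only."""
    if frame.f_code.co_name != "collatz_steps":
        return None  # do not trace other functions
    if event == "line":
        offset = frame.f_lineno - frame.f_code.co_firstlineno
        line_hits[offset] += 1
    return tracer


def collatz_steps(n):
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


sys.settrace(tracer)
try:
    result = collatz_steps(6)
finally:
    sys.settrace(None)  # always detach the hook

print("steps:", result)
for offset in sorted(line_hits):
    print(f"line +{offset}: executed {line_hits[offset]} times")
```

The output is a per-statement summary much like the annotated listing shown earlier, obtained without modifying the target function.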


Hypervisor/Simulator

* Hypervisor: Data are collected by running the (usually) unmodified program under a hypervisor. Example: SIMMON


External links

* Article "Need for speed - Eliminating performance bottlenecks" on doing execution time analysis of Java applications using IBM Rational Application Developer.
* Profiling Runtime Generated and Interpreted Code using the VTune Performance Analyzer