software analysis
   HOME

TheInfoList



OR:

In
computer science Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, program analysis is the process of automatically analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness. Program analysis focuses on two major areas:
program optimization In computer science, program optimization, code optimization, or software optimization, is the process of modifying a software system to make some aspect of it work more efficiently or use fewer resources. In general, a computer program may be o ...
and
program correctness In theoretical computer science, an algorithm is correct with respect to a specification if it behaves as specified. Best explored is ''functional'' correctness, which refers to the input-output behavior of the algorithm (i.e., for each input it p ...
. The first focuses on improving the program’s performance while reducing the resource usage while the latter focuses on ensuring that the program does what it is supposed to do. Program analysis can be performed without executing the program (
static program analysis In computer science, static program analysis (or static analysis) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution. The term ...
), during runtime ( dynamic program analysis) or in a combination of both.


Static program analysis

In the context of program correctness, static analysis can discover vulnerabilities during the development phase of the program.Jovanovic, N., Kruegel, C., & Kirda, E. (2006, May). Pixy: A static analysis tool for detecting web application vulnerabilities. In Security and Privacy, 2006 IEEE Symposium on (pp. 6-pp). IEEE. These vulnerabilities are easier to correct than the ones found during the testing phase since static analysis leads to the root of the vulnerability. Due to many forms of static analysis being computationally undecidable, the mechanisms for doing it will not always terminate with the right answer either because they sometimes return a false negative ("no problems found" when the code does in fact have problems) or a false positive, or because they never return the wrong answer but sometimes never terminate. Despite their limitations, the first type of mechanism might reduce the number of vulnerabilities, while the second can sometimes give strong assurance of the lack of a certain class of vulnerabilities. Incorrect optimizations are highly undesirable. So, in the context of program optimization, there are two main strategies to handle computationally undecidable analysis: # An optimizer that is expected to complete in a relatively short amount of time, such as the optimizer in an
optimizing compiler In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power cons ...
, may use a truncated version of an analysis that is guaranteed to complete in a finite amount of time, and guaranteed to only find correct optimizations. # A third-party optimization tool may be implemented in such a way as to never produce an incorrect optimization, but also so that it can, in some situations, continue running indefinitely until it finds one (which may never happen). In this case, the developer using the tool would have to stop the tool and avoid running the tool on that piece of code again (or possibly modify the code to avoid tripping up the tool). However, there is also a third strategy that is sometimes applicable for languages that are not completely specified, such as C. An optimizing compiler is at liberty to generate code that does anything at runtime even crashes if it encounters source code whose semantics are unspecified by the language standard in use.


Control-flow

The purpose of control-flow analysis is to obtain information about which functions can be called at various points during the execution of a program. The collected information is represented by a
control-flow graph In computer science, a control-flow graph (CFG) is a representation, using graph notation, of all paths that might be traversed through a program during its execution. The control-flow graph was discovered by Frances E. Allen, who noted that ...
(CFG) where the nodes are instructions of the program and the edges represent the flow of control. By identifying code blocks and loops a CFG becomes a starting point for compiler-made optimizations.


Data-flow analysis

Data-flow analysis is a technique designed to gather information about the values at each point of the program and how they change over time. This technique is often used by compilers to optimize the code. One of the most well known examples of data-flow analysis is
taint checking Taint checking is a feature in some computer programming languages, such as Perl, Ruby or Ballerina designed to increase security by preventing malicious users from executing commands on a host computer. Taint checks highlight specific security ...
, which consists of considering all variables that contain user-supplied data which is considered "tainted", i.e. insecure and preventing those variables from being used until they have been sanitized. This technique is often used to prevent
SQL injection In computing, SQL injection is a code injection technique used to attack data-driven applications, in which malicious SQL statements are inserted into an entry field for execution (e.g. to dump the database contents to the attacker). SQL in ...
attacks. Taint checking can be done statically or dynamically.


Abstract interpretation

Abstract interpretation allows the extraction of information about a possible execution of a program without actually executing the program. This information can be used by compilers to look for possible optimizations or for certifying a program against certain classes of bugs.


Type systems

Type systems associate types to programs that fulfill certain requirements. Their purpose is to select a subset of programs of a language that are considered correct according to a property. *
Type checking In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every "term" (a word, phrase, or other set of symbols). Usually the terms are various constructs of a computer progra ...
– verify whether the program is accepted by the type system. Type checking is used in programming to limit how programming objects are used and what can they do. This is done by the compiler or interpreter. Type checking can also help prevent vulnerabilities by ensuring that a signed value isn't attributed to an unsigned variable. Type checking can be done statically (at compile time), dynamically (at runtime) or a combination of both. Static type information (either
inferred Inferences are steps in reasoning, moving from premises to logical consequences; etymologically, the word ''infer'' means to "carry forward". Inference is theoretically traditionally divided into deduction and induction, a distinction that i ...
, or explicitly provided by type annotations in the source code) can also be used to do optimizations, such as replacing boxed arrays with unboxed arrays.


Effect systems

Effect systems are formal systems designed to represent the effects that executing a function or method can have. An effect codifies what is being done and with what it is being done usually referred to as ''effect kind'' and ''effect region'', respectively.


Model checking

Model checking refers to strict, formal, and automated ways to check if a ''model'' (which in this context means a formal model of a piece of code, though in other contexts it can be a model of a piece of hardware) complies with a given specification. Due to the inherent finite-state nature of code, and both the specification and the code being convertible into logical formulae, it is possible to check if the system violates the specification using efficient algorithmic methods.


Dynamic program analysis

Dynamic analysis can use runtime knowledge of the program to increase the precision of the analysis, while also providing runtime protection, but it can only analyze a single execution of the problem and might degrade the program’s performance due to the runtime checks.


Testing

Software should be tested to ensure its quality and that it performs as it is supposed to in a reliable manner, and that it won’t create conflicts with other software that may function alongside it. The tests are performed by executing the program with an input and evaluating its behavior and the produced output. Even if no security requirements are specified, additional
security testing Security testing is a process intended to reveal flaws in the security mechanisms of an information system that protect data and maintain functionality as intended. Due to the logical limitations of security testing, passing the security testin ...
should be performed to ensure that an attacker can’t tamper with the software and steal information, disrupt the software’s normal operations, or use it as a pivot to attack its users.


Monitoring

Program monitoring records and logs different kinds of information about the program such as resource usage, events, and interactions, so that it can be reviewed to find or pinpoint causes of abnormal behavior. Furthermore, it can be used to perform security audits. Automated monitoring of programs is sometimes referred to as
runtime verification Runtime verification is a computing system analysis and execution approach based on extracting information from a running system and using it to detect and possibly react to observed behaviors satisfying or violating certain properties. Some very p ...
.


Program slicing

For a given subset of a program’s behavior, program slicing consists of reducing the program to the minimum form that still produces the selected behavior. The reduced program is called a “slice” and is a faithful representation of the original program within the domain of the specified behavior subset. Generally, finding a slice is an unsolvable problem, but by specifying the target behavior subset by the values of a set of variables, it is possible to obtain approximate slices using a data-flow algorithm. These slices are usually used by developers during debugging to locate the source of errors.


See also

*
Automated code review Automated code review software checks source code for compliance with a predefined set of rules or best practices. The use of analytical methods to inspect and review source code to detect bugs or security issues has been a standard development pr ...
*
Language-based security In computer science, language-based security (LBS) is a set of techniques that may be used to strengthen the security of applications on a high level by using the properties of programming languages. LBS is considered to enforce computer security o ...
*
Polyvariance In program analysis, polyvariance is an analysis in which functions are analyzed multiple times—typically once at each call site In programming, a spot of a function or subroutine is the location (line of code) where the function is called ...
*
Profiling (computer programming) In software engineering, profiling ("program profiling", "software profiling") is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the f ...
*
Program verification In the context of hardware and software systems, formal verification is the act of proving or disproving the correctness of intended algorithms underlying a system with respect to a certain formal specification or property, using formal metho ...
*
Termination analysis In computer science, termination analysis is program analysis which attempts to determine whether the evaluation of a given program halts for ''each'' input. This means to determine whether the input program computes a ''total'' function. It is c ...


References


Further reading

* * *


External links

*{{Commonscatinline