Concolic Execution
   HOME

TheInfoList



OR:

Concolic testing (a
portmanteau In linguistics, a blend—also known as a blend word, lexical blend, or portmanteau—is a word formed by combining the meanings, and parts of the sounds, of two or more words together.
of ''concrete'' and ''symbolic'', also known as dynamic symbolic execution) is a hybrid
software verification Software verification is a discipline of software engineering, programming languages, and theory of computation whose goal is to assure that software satisfies the expected requirements. Broad scope and classification A broad definition of verif ...
technique that performs
symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for i ...
, a classical technique that treats program variables as symbolic variables, along a ''concrete execution'' ( testing on particular inputs) path. Symbolic execution is used in conjunction with an
automated theorem prover Automated theorem proving (also known as ATP or automated deduction) is a subfield of automated reasoning and mathematical logic dealing with proving mathematical theorems by computer programs. Automated reasoning over mathematical proof was a ma ...
or constraint solver based on
constraint logic programming Constraint logic programming is a form of constraint programming, in which logic programming is extended to include concepts from constraint satisfaction. A constraint logic program is a logic program that contains constraints in the body of claus ...
to generate new concrete inputs (test cases) with the aim of maximizing
code coverage In software engineering, code coverage, also called test coverage, is a percentage measure of the degree to which the source code of a program is executed when a particular test suite is run. A program with high code coverage has more of its ...
. Its main focus is finding bugs in real-world software, rather than demonstrating program correctness. A description and discussion of the concept was introduced in "DART: Directed Automated Random Testing" by Patrice Godefroid, Nils Klarlund, and Koushik Sen. The paper "CUTE: A concolic unit testing engine for C", by Koushik Sen, Darko Marinov, and Gul Agha, further extended the idea to data structures, and first coined the term ''concolic testing''. Another tool, called EGT (renamed to EXE and later improved and renamed to KLEE), based on similar ideas was independently developed by Cristian Cadar and Dawson Engler in 2005, and published in 2005 and 2006. PathCrawler first proposed to perform symbolic execution along a concrete execution path, but unlike concolic testing PathCrawler does not simplify complex symbolic constraints using concrete values. These tools (DART and CUTE, EXE) applied concolic testing to unit testing of C programs and concolic testing was originally conceived as a white box improvement upon established
random testing Random testing is a black-box software testing technique where programs are tested by generating random, independent inputs. Results of the output are compared against software specifications to verify that the test output is pass or fail. In case ...
methodologies. The technique was later generalized to testing multithreaded
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
programs with , and unit testing programs from their executable codes (tool OSMOSE). It was also combined with
fuzz testing In programming and software development, fuzzing or fuzz testing is an automated software testing technique that involves providing invalid, unexpected, or random data as inputs to a computer program. The program is then monitored for exceptio ...
and extended to detect exploitable security issues in large-scale
x86 x86 (also known as 80x86 or the 8086 family) is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel, based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. Th ...
binaries by
Microsoft Research Microsoft Research (MSR) is the research subsidiary of Microsoft. It was created in 1991 by Richard Rashid, Bill Gates and Nathan Myhrvold with the intent to advance state-of-the-art computing and solve difficult world problems through technologi ...
's SAGE. The concolic approach is also applicable to
model checking In computer science, model checking or property checking is a method for checking whether a finite-state model of a system meets a given specification (also known as correctness). This is typically associated with hardware or software syst ...
. In a concolic model checker, the model checker traverses states of the model representing the software being checked, while storing both a concrete state and a symbolic state. The symbolic state is used for checking properties on the software, while the concrete state is used to avoid reaching unreachable states. One such tool is ExpliSAT by Sharon Barner, Cindy Eisner, Ziv Glazberg,
Daniel Kroening Daniel Kroening (born 6 November 1975) is a German computer scientist, Professor in computer science at the University of Oxford, and Chief Science Officer at the company he co-founded, Diffblue Ltd. He is a fellow of Magdalen College. Early lif ...
and Ishai Rabinovitz


Birth of concolic testing

Implementation of traditional symbolic execution based testing requires the implementation of a full-fledged symbolic interpreter for a programming language. Concolic testing implementors noticed that implementation of full-fledged symbolic execution can be avoided if symbolic execution can be piggy-backed with the normal execution of a program through
instrumentation Instrumentation is a collective term for measuring instruments, used for indicating, measuring, and recording physical quantities. It is also a field of study about the art and science about making measurement instruments, involving the related ...
. This idea of simplifying implementation of symbolic execution gave birth to concolic testing.


Development of SMT solvers

An important reason for the rise of concolic testing (and more generally, symbolic-execution based analysis of programs) in the decade since it was introduced in 2005 is the dramatic improvement in the efficiency and expressive power of SMT Solvers. The key technical developments that lead to the rapid development of SMT solvers include combination of theories, lazy solving, DPLL(T) and the huge improvements in the speed of
SAT solvers The SAT ( ) is a standardized test widely used for college admissions in the United States. Since its debut in 1926, its name and scoring have changed several times. For much of its history, it was called the Scholastic Aptitude Test and had two ...
. SMT solvers that are particularly tuned for concolic testing include Z3, STP, Z3str2, and
Boolector In computer science and mathematical logic, satisfiability modulo theories (SMT) is the problem of determining whether a mathematical formula is satisfiable. It generalizes the Boolean satisfiability problem (SAT) to more complex formulas involv ...
.


Example

Consider the following simple example, written in C: void f(int x, int y) Simple random testing, trying random values of ''x'' and ''y'', would require an impractically large number of tests to reproduce the failure. We begin with an arbitrary choice for ''x'' and ''y'', for example ''x'' = ''y'' = 1. In the concrete execution, line 2 sets ''z'' to 2, and the test in line 3 fails since 1 ≠ 100000. Concurrently, the symbolic execution follows the same path but treats ''x'' and ''y'' as symbolic variables. It sets ''z'' to the expression 2''y'' and notes that, because the test in line 3 failed, ''x'' ≠ 100000. This inequality is called a ''path condition'' and must be true for all executions following the same execution path as the current one. Since we'd like the program to follow a different execution path on the next run, we take the last path condition encountered, ''x'' ≠ 100000, and negate it, giving ''x'' = 100000. An automated theorem prover is then invoked to find values for the input variables ''x'' and ''y'' given the complete set of symbolic variable values and path conditions constructed during symbolic execution. In this case, a valid response from the theorem prover might be ''x'' = 100000, ''y'' = 0. Running the program on this input allows it to reach the inner branch on line 4, which is not taken since 100000 (''x'') is not less than 0 (''z'' = 2''y''). The path conditions are ''x'' = 100000 and ''x'' ≥ ''z''. The latter is negated, giving ''x'' < ''z''. The theorem prover then looks for ''x'', ''y'' satisfying ''x'' = 100000, ''x'' < ''z'', and ''z'' = 2''y''; for example, ''x'' = 100000, ''y'' = 50001. This input reaches the error.


Algorithm

Essentially, a concolic testing algorithm operates as follows: # Classify a particular set of variables as ''input variables''. These variables will be treated as symbolic variables during symbolic execution. All other variables will be treated as concrete values. # Instrument the program so that each operation which may affect a symbolic variable value or a path condition is logged to a trace file, as well as any error that occurs. # Choose an arbitrary input to begin with. # Execute the program. # Symbolically re-execute the program on the trace, generating a set of symbolic constraints (including path conditions). # Negate the last path condition not already negated in order to visit a new execution path. If there is no such path condition, the algorithm terminates. # Invoke an automated satisfiability solver on the new set of path conditions to generate a new input. If there is no input satisfying the constraints, return to step 6 to try the next execution path. # Return to step 4. There are a few complications to the above procedure: * The algorithm performs a
depth-first search Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. The algorithm starts at the root node (selecting some arbitrary node as the root node in the case of a graph) and explores as far as possible al ...
over an implicit
tree In botany, a tree is a perennial plant with an elongated stem, or trunk, usually supporting branches and leaves. In some usages, the definition of a tree may be narrower, e.g., including only woody plants with secondary growth, only ...
of possible execution paths. In practice programs may have very large or infinite path trees – a common example is testing data structures that have an unbounded size or length. To prevent spending too much time on one small area of the program, the search may be depth-limited (bounded). * Symbolic execution and automated theorem provers have limitations on the classes of constraints they can represent and solve. For example, a theorem prover based on linear arithmetic will be unable to cope with the nonlinear path condition ''xy'' = 6. Any time that such constraints arise, the symbolic execution may substitute the current concrete value of one of the variables to simplify the problem. An important part of the design of a concolic testing system is selecting a symbolic representation precise enough to represent the constraints of interest.


Commercial success

Symbolic-execution based analysis and testing, in general, has witnessed a significant level of interest from industry. Perhaps the most famous commercial tool that uses dynamic symbolic execution (aka concolic testing) is the SAGE tool from Microsoft. The KLEE and S2E tools (both of which are open-source tools, and use the STP constraint solver) are widely used in many companies including Micro Focus Fortify, NVIDIA, and IBM. Increasingly these technologies are being used by many security companies and hackers alike to find security vulnerabilities.


Limitations

Concolic testing has a number of limitations: * If the program exhibits nondeterministic behavior, it may follow a different path than the intended one. This can lead to nontermination of the search and poor coverage. * Even in a deterministic program, a number of factors may lead to poor coverage, including imprecise symbolic representations, incomplete theorem proving, and failure to search the most fruitful portion of a large or infinite path tree. * Programs which thoroughly mix the state of their variables, such as cryptographic primitives, generate very large symbolic representations that cannot be solved in practice. For example, the condition if(sha256_hash(input)

0x12345678)
requires the theorem prover to invert
SHA256 SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA) and first published in 2001. They are built using the Merkle–Damgård construction, from a one-way compression ...
, which is an open problem.


Tools


pathcrawler-online.com
is a restricted version of the current PathCrawler tool which is publicly available as an online test-case server for evaluation and education purposes.
jCUTE
is available as binary under a research-use only license by Urbana-Champaign for
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
.
CREST
is an open-source solution for C that replaced CUTE (
modified BSD license BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the use and distribution of covered software. This is in contrast to copyleft licenses, which have share-alike requirements. The original BSD licen ...
).
KLEE
is an open source solution built on-top of the
LLVM LLVM, also called LLVM Core, is a target-independent optimizer and code generator. It can be used to develop a Compiler#Front end, frontend for any programming language and a Compiler#Back end, backend for any instruction set architecture. LLVM i ...
infrastructure (
UIUC license The University of Illinois/NCSA Open Source License is a permissive free software license, based on the MIT/X11 license and the 3-clause BSD license. By combining parts of these two licenses, it attempts to be clearer and more concise than ...
).
CATG
is an open-source solution for
Java Java is one of the Greater Sunda Islands in Indonesia. It is bordered by the Indian Ocean to the south and the Java Sea (a part of Pacific Ocean) to the north. With a population of 156.9 million people (including Madura) in mid 2024, proje ...
(
BSD license BSD licenses are a family of permissive free software licenses, imposing minimal restrictions on the use and distribution of covered software. This is in contrast to copyleft licenses, which have share-alike requirements. The original BSD lic ...
).
Jalangi
is an open-source concolic testing and symbolic execution tool for JavaScript. Jalangi supports integers and strings.
Microsoft Pex
developed at Microsoft Rise, is publicly available as a
Microsoft Visual Studio 2010 Visual Studio is an integrated development environment (IDE) developed by Microsoft. It is used to develop computer programs including websites, web apps, web services and mobile apps. Visual Studio uses Microsoft software development platforms ...
Power Tool for the
NET Framework The .NET Framework (pronounced as "''dot net''") is a proprietary software framework developed by Microsoft that runs primarily on Microsoft Windows. It was the predominant implementation of the Common Language Infrastructure (CLI) until being ...
.
Triton
is an open-source concolic execution library for binary code.
CutEr
is an open-source concolic testing tool for the Erlang functional programming language.
Owi
is an open-source concolic engine for C, C++,
Rust Rust is an iron oxide, a usually reddish-brown oxide formed by the reaction of iron and oxygen in the catalytic presence of water or air moisture. Rust consists of hydrous iron(III) oxides (Fe2O3·nH2O) and iron(III) oxide-hydroxide (FeO(OH) ...
,
WebAssembly WebAssembly (Wasm) defines a portable binary-code format and a corresponding text format for executable programs as well as software interfaces for facilitating communication between such programs and their host environment. The main goal of ...
and Zig. Many tools, notably DART and SAGE, have not been made available to the public at large. Note however that for instance SAGE is "used daily" for internal security testing at Microsoft.


References

{{Software testing Automated theorem proving Software testing