In
programming and
software development
Software development is the process of conceiving, specifying, designing, programming, documenting, testing, and bug fixing involved in creating and maintaining applications, frameworks, or other software components. Software development invol ...
, fuzzing or fuzz testing is an automated
software testing
Software testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to apprecia ...
technique that involves providing invalid, unexpected, or
random data
In common usage, randomness is the apparent or actual lack of pattern or predictability in events. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual ran ...
as inputs to a
computer program
A computer program is a sequence or set of instructions in a programming language for a computer to execute. Computer programs are one component of software, which also includes documentation and other intangible components.
A computer program ...
. The program is then monitored for exceptions such as
crashes, failing built-in code
assertions, or potential
memory leak
In computer science, a memory leak is a type of resource leak that occurs when a computer program incorrectly manages memory allocations in a way that Computer memory, memory which is no longer needed is not released. A memory leak may also happe ...
s. Typically, fuzzers are used to test programs that take structured inputs. This structure is specified, e.g., in a
file format
A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free.
Some file formats ...
or
protocol
Protocol may refer to:
Sociology and politics
* Protocol (politics), a formal agreement between nation states
* Protocol (diplomacy), the etiquette of diplomacy and affairs of state
* Etiquette, a code of personal behavior
Science and technology
...
and distinguishes valid from invalid input. An effective fuzzer generates semi-valid inputs that are "valid enough" in that they are not directly rejected by the parser, but do create unexpected behaviors deeper in the program and are "invalid enough" to expose
corner case
In engineering, a corner case (or pathological case) involves a problem or situation that occurs only outside normal operating parameters—specifically one that manifests itself when multiple environmental variables or conditions are simultaneou ...
s that have not been properly dealt with.
For the purpose of security, input that crosses a
trust boundary
Trust boundary is a term used in computer science and computer security, security which describes a boundary where program data or execution changes its level of "trust," or where two principals with different capabilities exchange data or commands ...
is often the most useful.
For example, it is more important to fuzz code that handles the upload of a file by any user than it is to fuzz the code that parses a configuration file that is accessible only to a privileged user.
History
The term "fuzz" originates from a fall 1988 class project
in the graduate Advanced Operating Systems class (CS736), taught by Prof. Barton Miller at the University of Wisconsin, whose results were subsequently published in 1990.
To fuzz test a
UNIX
Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
utility meant to automatically generate random input and command-line parameters for the utility. The project was designed to test the reliability of UNIX command line programs by executing a large number of random inputs in quick succession until they crashed. Miller's team was able to crash 25 to 33 percent of the utilities that they tested. They then debugged each of the crashes to determine the cause and categorized each detected failure. To allow other researchers to conduct similar experiments with other software, the source code of the tools, the test procedures, and the raw result data were made publicly available.
This early fuzzing would now be called black box, generational, unstructured (dumb) fuzzing.
This 1990 fuzz paper also noted the relationship of reliability to security: "Second, one of the bugs that we found was caused by the same programming practice that provided one of the security holes to the Internet worm (the 'gets finger' bug). We have found additional bugs that might indicate future security holes." (Referring to the
Morris worm of November 1988.)
The original fuzz project went on to make contributions in 1995, 2000, 2006 and most recently in 2020:
* 1995:
The "fuzz revisited" paper consisted of four parts.
*# Reproduced the original command line study, including a wider variety of UNIX systems and more utilities. The study showed that, if anything, reliability had gotten worse. This was the first study that included open source
GNU
GNU () is an extensive collection of free software (383 packages as of January 2022), which can be used as an operating system or can be used in parts with other operating systems. The use of the completed GNU tools led to the family of operat ...
and
Linux
Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which ...
utilities that, interestingly, were significantly more reliable than those from the commercial UNIX systems.
*# Introduced the fuzz testing of GUI (window based) applications under X-Windows. This study used both unstructured and structured (valid sequences of mouse and keyboard events) input data. They were able to crash 25% of the X-Windows applications. In addition, they tested the X-Windows server and showed that it was resilient to crashes.
*# Introduced fuzz testing of network services, again based on structured test input. None of these services had crashed.
*# Introduced random testing of system library call return values, specifically randomly returning zero from the malloc family of functions. Almost half the standard UNIX programs failed to properly check such return values.
* 2000:
Applied fuzz testing to the recently released
Windows NT
Windows NT is a proprietary graphical operating system
An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs.
Time-sharing operating systems sc ...
operating system, testing applications that ran under the
Win32
The Windows API, informally WinAPI, is Microsoft's core set of application programming interfaces (APIs) available in the Microsoft Windows operating systems. The name Windows API collectively refers to several different platform implementations th ...
window system. They were able to crash 21% of the applications and hang an additional 24% of those tested. Again, applications were testing with both unstructured and structured (valid keyboard and mouse events) input, crashing almost half of those applications that they tested. They identified the causes of the failures and found them to be similar to previous studies.
* 2006:
Applied fuzz testing to
Mac OS X
macOS (; previously OS X and originally Mac OS X) is a Unix operating system developed and marketed by Apple Inc. since 2001. It is the primary operating system for Apple's Mac (computer), Mac computers. Within the market of ...
, for both command line and window based applications. They tested 135 command line utility programs, crashing 7% of those tested. In addition, they tested 30 applications that ran under the MacOS
Aqua
Aqua is the Latin word for water. It is used in many words which relate to water, such as aquatic life. In English, it may also refer to:
Arts
* Aqua (color), a greenish-blue color
Business
* Aqua (skyscraper), an 82-story residential skysc ...
window system, crashing 73% of those tested.
* 2020:
Most recently, they applied the classic generational, black box, unstructured testing to current UNIX systems, specially Linux,
FreeBSD
FreeBSD is a free and open-source Unix-like operating system descended from the Berkeley Software Distribution (BSD), which was based on Research Unix. The first version of FreeBSD was released in 1993. In 2005, FreeBSD was the most popular ...
and MacOS, to see if the original techniques were still relevant and if current utility programs were resistant to this type of testing. They tested approximately 75 utilities on each platform, with failure rates of 12% on Linux, 16% on MacOS and 19% on FreeBSD. (Note that these failure rates were ''worse'' than the results from earlier test of the same systems.) When they analyzed each failure and categorized them, they found that the classic categories of failures, such as pointer and array errors and not checking return codes, were still broadly present in the new results. In addition, new failure causes cropped up from complex program state and algorithms that did not scale with the input size or complexity. They also tested UNIX utilities written more recently in Rust and found that they were of similar reliability to the ones written in C, though (as expected) less likely to have memory errors.
In April 2012, Google announced ClusterFuzz, a cloud-based fuzzing infrastructure for security-critical components of the
Chromium web browser.
Security researchers can upload their own fuzzers and collect bug bounties if ClusterFuzz finds a crash with the uploaded fuzzer.
In September 2014,
Shellshock was disclosed as a family of
security bug
Security is protection from, or resilience against, potential harm (or other unwanted Coercion, coercive change) caused by others, by restraining the freedom of others to act. Beneficiaries (technically referents) of security may be of persons an ...
s in the widely used
UNIX
Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and ot ...
Bash
Bash or BASH may refer to:
Arts and entertainment
* ''Bash!'' (Rockapella album), 1992
* ''Bash!'' (Dave Bailey album), 1961
* '' Bash: Latter-Day Plays'', a dramatic triptych
* ''BASH!'' (role-playing game), a 2005 superhero game
* "Bash" ('' ...
shell
Shell may refer to:
Architecture and design
* Shell (structure), a thin structure
** Concrete shell, a thin shell of concrete, usually with no interior columns or exterior buttresses
** Thin-shell structure
Science Biology
* Seashell, a hard o ...
; most vulnerabilities of Shellshock were found using the fuzzer
AFL
AFL may refer to:
Sports
* American Football League (AFL), a name shared by several separate and unrelated professional American football leagues:
** American Football League (1926) (a.k.a. "AFL I"), first rival of the National Football Leagu ...
. (Many Internet-facing services, such as some web server deployments, use Bash to process certain requests, allowing an attacker to cause vulnerable versions of Bash to
execute arbitrary commands. This can allow an attacker to gain unauthorized access to a computer system.
)
In April 2015, Hanno Böck showed how the fuzzer AFL could have found the 2014 Heartbleed vulnerability. (The
Heartbleed
Heartbleed was a security bug in the OpenSSL cryptography library, which is a widely used implementation of the Transport Layer Security (TLS) protocol. It was introduced into the software in 2012 and publicly disclosed in April 2014. Heartble ...
vulnerability was disclosed in April 2014. It is a serious vulnerability that allows adversaries to decipher otherwise
encrypted communication
Secure communication is when two entities are communicating and do not want a third party to listen in. For this to be the case, the entities need to communicate in a way that is unsusceptible to eavesdropping or interception. Secure communication ...
. The vulnerability was accidentally introduced into
OpenSSL
OpenSSL is a software library for applications that provide secure communications over computer networks against eavesdropping or need to identify the party at the other end. It is widely used by Internet servers, including the majority of HTT ...
which implements
TLS and is used by the majority of the servers on the internet.
Shodan
SHODAN (Sentient Hyper-Optimized Data Access Network) is a fictional artificial intelligence and the main antagonist of the cyberpunk-horror themed video games ''System Shock'' and ''System Shock 2''.
Character design
SHODAN is an artificial in ...
reported 238,000 machines still vulnerable in April 2016; 200,000 in January 2017.)
In August 2016, the
Defense Advanced Research Projects Agency
The Defense Advanced Research Projects Agency (DARPA) is a research and development agency of the United States Department of Defense responsible for the development of emerging technologies for use by the military.
Originally known as the Adv ...
(DARPA) held the finals of the first
Cyber Grand Challenge, a fully automated
capture-the-flag competition that lasted 11 hours.
The objective was to develop automatic defense systems that can discover,
exploit
Exploit means to take advantage of something (a person, situation, etc.) for one's own end, especially unethically or unjustifiably.
Exploit can mean:
*Exploitation of natural resources
*Exploit (computer security)
* Video game exploit
*Exploitat ...
, and
correct software flaws in
real-time
Real-time or real time describes various operations in computing or other processes that must guarantee response times within a specified time (deadline), usually a relatively short time. A real-time process is generally one that happens in defined ...
. Fuzzing was used as an effective offense strategy to discover flaws in the software of the opponents. It showed tremendous potential in the automation of vulnerability detection. The winner was a system called "Mayhem"
developed by the team ForAllSecure led by
David Brumley
David Brumley is a professor at Carnegie Mellon University. He is a well-known researcher in software security, network security, and applied cryptography. Prof. Brumley also worked for 5 years as a Computer Security Officer for Stanford Unive ...
.
In September 2016, Microsoft announced Project Springfield, a cloud-based fuzz testing service for finding security critical bugs in software.
In December 2016, Google announced OSS-Fuzz which allows for continuous fuzzing of several security-critical open-source projects.
At Black Hat 2018, Christopher Domas demonstrated the use of fuzzing to expose the existence of a hidden
RISC
In computer engineering, a reduced instruction set computer (RISC) is a computer designed to simplify the individual instructions given to the computer to accomplish tasks. Compared to the instructions given to a complex instruction set comput ...
core in a processor.
This core was able to bypass existing security checks to execute
Ring 0 commands from Ring 3.
In September 2020,
Microsoft
Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washing ...
released
OneFuzz
OneFuzz is a cross-platform free and open source fuzz testing framework by Microsoft. The software enables continuous developer-driven fuzz testing to identify weaknesses in computer software prior to release.
Overview
OneFuzz is a self-hosted ...
, a
self-hosted fuzzing-as-a-service platform that automates the detection of
software bug
A software bug is an error, flaw or fault in the design, development, or operation of computer software that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The process of finding and correcting bugs i ...
s. It supports
Windows
Windows is a group of several proprietary graphical operating system families developed and marketed by Microsoft. Each family caters to a certain sector of the computing industry. For example, Windows NT for consumers, Windows Server for serv ...
and Linux.
Early random testing
Testing programs with random inputs dates back to the 1950s when data was still stored on
punched cards
A punched card (also punch card or punched-card) is a piece of stiff paper that holds digital data represented by the presence or absence of holes in predefined positions. Punched cards were once common in data processing applications or to d ...
.
Programmers would use punched cards that were pulled from the trash or card decks of random numbers as input to computer programs. If an execution revealed undesired behavior, a
bug had been detected.
The execution of random inputs is also called
random testing
Random testing is a black-box software testing technique where programs are tested by generating random, independent inputs. Results of the output are compared against software specifications to verify that the test output is pass or fail. In case ...
or
monkey testing
In software testing, monkey testing is a technique where the user tests the application or system by providing random inputs and checking the behavior, or seeing whether the application or system will crash. Monkey testing is usually implemented as ...
.
In 1981, Duran and Ntafos formally investigated the effectiveness of testing a program with random inputs.
While random testing had been widely perceived to be the worst means of testing a program, the authors could show that it is a cost-effective alternative to more systematic testing techniques.
In 1983,
Steve Capps
Steve Capps is an American computer programmer, who was one of the designers of the original Apple Macintosh computer.
Capps started working at the Xerox Corporation while still a computer science student at the Rochester Institute of Technology. ...
at Apple developed "The Monkey",
a tool that would generate random inputs for
classic Mac OS
Mac OS (originally System Software; retronym: Classic Mac OS) is the series of operating systems developed for the Macintosh family of personal computers by Apple Computer from 1984 to 2001, starting with System 1 and ending with Mac OS 9. The ...
applications, such as
MacPaint
MacPaint is a raster graphics editor developed by Apple Computer and released with the original Macintosh personal computer on January 24, 1984. It was sold separately for US$195 with its word processing counterpart, MacWrite. MacPaint was nota ...
.
The figurative "monkey" refers to the
infinite monkey theorem
The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, such as the complete works of William Shakespeare. In fact, the monkey would ...
which states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will eventually type out the entire works of Shakespeare. In the case of testing, the monkey would write the particular sequence of inputs that would trigger a crash.
In 1991, the crashme tool was released, which was intended to test the robustness of Unix and
Unix-like
A Unix-like (sometimes referred to as UN*X or *nix) operating system is one that behaves in a manner similar to a Unix system, although not necessarily conforming to or being certified to any version of the Single UNIX Specification. A Unix-li ...
operating systems
An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs.
Time-sharing operating systems schedule tasks for efficient use of the system and may also inc ...
by randomly executing systems calls with randomly chosen parameters.
Types
A fuzzer can be categorized in several ways:
# A fuzzer can be generation-based or mutation-based depending on whether inputs are generated from scratch or by modifying existing inputs.
# A fuzzer can be dumb (unstructured) or smart (structured) depending on whether it is aware of input structure.
# A fuzzer can be white-, grey-, or black-box, depending on whether it is aware of program structure.
Reuse of existing input seeds
A mutation-based fuzzer leverages an existing corpus of seed inputs during fuzzing. It generates inputs by modifying (or rather
mutating) the provided seeds. For example, when fuzzing the image library
libpng
libpng is the official Portable Network Graphics (PNG) reference library (originally called pnglib). It is a platform-independent library that contains C functions for handling PNG images. It supports almost all of PNG's features, is extensible ...
, the user would provide a set of valid
PNG image files as seeds while a mutation-based fuzzer would modify these seeds to produce semi-valid variants of each seed. The corpus of seed files may contain thousands of potentially similar inputs. Automated seed selection (or test suite reduction) allows users to pick the best seeds in order to maximize the total number of bugs found during a fuzz campaign.
A generation-based fuzzer generates inputs from scratch. For instance, a smart generation-based fuzzer
takes the input model that was provided by the user to generate new inputs. Unlike mutation-based fuzzers, a generation-based fuzzer does not depend on the existence or quality of a corpus of seed inputs.
Some fuzzers have the capability to do both, to generate inputs from scratch and to generate inputs by mutation of existing seeds.
Aware of input structure
Typically, fuzzers are used to generate inputs for programs that take structured inputs, such as a
file
File or filing may refer to:
Mechanical tools and processes
* File (tool), a tool used to ''remove'' fine amounts of material from a workpiece
**Filing (metalworking), a material removal process in manufacturing
** Nail file, a tool used to gent ...
, a sequence of keyboard or mouse
events
Event may refer to:
Gatherings of people
* Ceremony, an event of ritual significance, performed on a special occasion
* Convention (meeting), a gathering of individuals engaged in some common interest
* Event management, the organization of ev ...
, or a sequence of
messages. This structure distinguishes valid input that is accepted and processed by the program from invalid input that is quickly rejected by the program. What constitutes a valid input may be explicitly specified in an input model. Examples of input models are
formal grammar
In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
s,
file format
A file format is a standard way that information is encoded for storage in a computer file. It specifies how bits are used to encode information in a digital storage medium. File formats may be either proprietary or free.
Some file formats ...
s,
GUI
The GUI ( "UI" by itself is still usually pronounced . or ), graphical user interface, is a form of user interface that allows users to interact with electronic devices through graphical icons and audio indicator such as primary notation, inste ...
-models, and
network protocols
A communication protocol is a system of rules that allows two or more entities of a communications system to transmit information via any kind of variation of a physical quantity. The protocol defines the rules, syntax, semantics and synchroniza ...
. Even items not normally considered as input can be fuzzed, such as the contents of
database
In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases sp ...
s,
shared memory
In computer science, shared memory is memory that may be simultaneously accessed by multiple programs with an intent to provide communication among them or avoid redundant copies. Shared memory is an efficient means of passing data between progr ...
,
environment variable
An environment variable is a dynamic-named value that can affect the way running processes will behave on a computer. They are part of the environment in which a process runs. For example, a running process can query the value of the TEMP env ...
s or the precise interleaving of
threads. An effective fuzzer generates semi-valid inputs that are "valid enough" so that they are not directly rejected from the
parser
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from Lati ...
and "invalid enough" so that they might stress
corner case
In engineering, a corner case (or pathological case) involves a problem or situation that occurs only outside normal operating parameters—specifically one that manifests itself when multiple environmental variables or conditions are simultaneou ...
s and exercise interesting program behaviours.
A smart (model-based,
grammar-based,
or protocol-based) fuzzer leverages the input model to generate a greater proportion of valid inputs. For instance, if the input can be modelled as an
abstract syntax tree
In computer science, an abstract syntax tree (AST), or just syntax tree, is a tree representation of the abstract syntactic structure of text (often source code) written in a formal language. Each node of the tree denotes a construct occurring ...
, then a smart mutation-based fuzzer
would employ random
transformations to move complete subtrees from one node to another. If the input can be modelled by a
formal grammar
In formal language theory, a grammar (when the context is not given, often called a formal grammar for clarity) describes how to form strings from a language's alphabet that are valid according to the language's syntax. A grammar does not describe ...
, a smart generation-based fuzzer
would instantiate the
production rules to generate inputs that are valid with respect to the grammar. However, generally the input model must be explicitly provided, which is difficult to do when the model is proprietary, unknown, or very complex. If a large corpus of valid and invalid inputs is available, a
grammar induction
Grammar induction (or grammatical inference) is the process in machine learning of learning a formal grammar (usually as a collection of ''re-write rules'' or '' productions'' or alternatively as a finite state machine or automaton of some kind) fr ...
technique, such as
Angluin's L* algorithm, would be able to generate an input model.
A dumb fuzzer
does not require the input model and can thus be employed to fuzz a wider variety of programs. For instance,
AFL
AFL may refer to:
Sports
* American Football League (AFL), a name shared by several separate and unrelated professional American football leagues:
** American Football League (1926) (a.k.a. "AFL I"), first rival of the National Football Leagu ...
is a dumb mutation-based fuzzer that modifies a seed file by
flipping random bits, by substituting random bytes with "interesting" values, and by moving or deleting blocks of data. However, a dumb fuzzer might generate a lower proportion of valid inputs and stress the
parser
Parsing, syntax analysis, or syntactic analysis is the process of analyzing a string of symbols, either in natural language, computer languages or data structures, conforming to the rules of a formal grammar. The term ''parsing'' comes from Lati ...
code rather than the main components of a program. The disadvantage of dumb fuzzers can be illustrated by means of the construction of a valid
checksum
A checksum is a small-sized block of data derived from another block of digital data for the purpose of detecting errors that may have been introduced during its transmission or storage. By themselves, checksums are often used to verify data ...
for a
cyclic redundancy check
A cyclic redundancy check (CRC) is an error-detecting code commonly used in digital networks and storage devices to detect accidental changes to digital data. Blocks of data entering these systems get a short ''check value'' attached, based on t ...
(CRC). A CRC is an
error-detecting code
In information theory and coding theory with applications in computer science and telecommunication, error detection and correction (EDAC) or error control are techniques that enable reliable delivery of digital data over unreliable communica ...
that ensures that the
integrity
Integrity is the practice of being honest and showing a consistent and uncompromising adherence to strong moral and ethical principles and values.
In ethics, integrity is regarded as the honesty and truthfulness or accuracy of one's actions. Inte ...
of the data contained in the input file is preserved during
transmission
Transmission may refer to:
Medicine, science and technology
* Power transmission
** Electric power transmission
** Propulsion transmission, technology allowing controlled application of power
*** Automatic transmission
*** Manual transmission
*** ...
. A checksum is computed over the input data and recorded in the file. When the program processes the received file and the recorded checksum does not match the re-computed checksum, then the file is rejected as invalid. Now, a fuzzer that is unaware of the CRC is unlikely to generate the correct checksum. However, there are attempts to identify and re-compute a potential checksum in the mutated input, once a dumb mutation-based fuzzer has modified the protected data.
Aware of program structure
Typically, a fuzzer is considered more effective if it achieves a higher degree of
code coverage
In computer science, test coverage is a percentage measure of the degree to which the source code of a program is executed when a particular test suite is run. A program with high test coverage has more of its source code executed during testing, ...
. The rationale is, if a fuzzer does not exercise certain structural elements in the program, then it is also not able to reveal
bugs that are hiding in these elements. Some program elements are considered more critical than others. For instance, a division operator might cause a
division by zero
In mathematics, division by zero is division (mathematics), division where the divisor (denominator) is 0, zero. Such a division can be formally expression (mathematics), expressed as \tfrac, where is the dividend (numerator). In ordinary ari ...
error, or a
system call
In computing, a system call (commonly abbreviated to syscall) is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services (for example, acc ...
may crash the program.
A
black-box
In science, computing, and engineering, a black box is a system which can be viewed in terms of its inputs and outputs (or transfer characteristics), without any knowledge of its internal workings. Its implementation is "opaque" (black). The te ...
fuzzer
treats the program as a
black box
In science, computing, and engineering, a black box is a system which can be viewed in terms of its inputs and outputs (or transfer characteristics), without any knowledge of its internal workings. Its implementation is "opaque" (black). The te ...
and is unaware of internal program structure. For instance, a
random testing
Random testing is a black-box software testing technique where programs are tested by generating random, independent inputs. Results of the output are compared against software specifications to verify that the test output is pass or fail. In case ...
tool that generates inputs at random is considered a blackbox fuzzer. Hence, a blackbox fuzzer can execute several hundred inputs per second, can be easily parallelized, and can scale to programs of arbitrary size. However, blackbox fuzzers may only scratch the surface and expose "shallow" bugs. Hence, there are attempts to develop blackbox fuzzers that can incrementally learn about the internal structure (and behavior) of a program during fuzzing by observing the program's output given an input. For instance, LearnLib employs
active learning
Active learning is "a method of learning in which students are actively or experientially involved in the learning process and where there are different levels of active learning, depending on student involvement." states that "students partici ...
to generate an
automaton
An automaton (; plural: automata or automatons) is a relatively self-operating machine, or control mechanism designed to automatically follow a sequence of operations, or respond to predetermined instructions.Automaton – Definition and More ...
that represents the behavior of a web application.
A
white-box fuzzer
leverages
program analysis
In computer science, program analysis is the process of automatically analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness.
Program analysis focuses on two major areas: program op ...
to systematically increase
code coverage
In computer science, test coverage is a percentage measure of the degree to which the source code of a program is executed when a particular test suite is run. A program with high test coverage has more of its source code executed during testing, ...
or to reach certain critical program locations. For instance, SAGE
leverages
symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inp ...
to systematically explore different paths in the program.
If the
program's specification is available, a whitebox fuzzer might leverage techniques from
model-based testing
Model-based testing is an application of model-based design for designing and optionally also executing artifacts to perform software testing or system testing. Models can be used to represent the desired behavior of a system under test (SUT), or ...
to generate inputs and check the program outputs against the program specification.
A whitebox fuzzer can be very effective at exposing bugs that hide deep in the program. However, the time used for analysis (of the program or its specification) can become prohibitive. If the whitebox fuzzer takes relatively too long to generate an input, a blackbox fuzzer will be more efficient.
Hence, there are attempts to combine the efficiency of blackbox fuzzers and the effectiveness of whitebox fuzzers.
A
gray-box fuzzer leverages
instrumentation
Instrumentation a collective term for measuring instruments that are used for indicating, measuring and recording physical quantities. The term has its origins in the art and science of scientific instrument-making.
Instrumentation can refer to ...
rather than program analysis to glean information about the program. For instance, AFL and libFuzzer utilize lightweight instrumentation to trace
basic block
In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually deco ...
transitions exercised by an input. This leads to a reasonable performance overhead but informs the fuzzer about the increase in code coverage during fuzzing, which makes gray-box fuzzers extremely efficient vulnerability detection tools.
Uses
Fuzzing is used mostly as an automated technique to expose
vulnerabilities
Vulnerability refers to "the quality or state of being exposed to the possibility of being attacked or harmed, either physically or emotionally."
A window of vulnerability (WOV) is a time frame within which defensive measures are diminished, com ...
in security-critical programs that might be
exploited with malicious intent.
More generally, fuzzing is used to demonstrate the presence of bugs rather than their absence. Running a fuzzing campaign for several weeks without finding a bug does not prove the program correct.
After all, the program may still fail for an input that has not been executed, yet; executing a program for all inputs is prohibitively expensive. If the objective is to prove a program correct for all inputs, a
formal specification
In computer science, formal specifications are mathematically based techniques whose purpose are to help with the implementation of systems and software. They are used to describe a system, to analyze its behavior, and to aid in its design by verif ...
must exist and techniques from
formal methods
In computer science, formal methods are mathematically rigorous techniques for the specification, development, and verification of software and hardware systems. The use of formal methods for software and hardware design is motivated by the expec ...
must be used.
Exposing bugs
In order to expose bugs, a fuzzer must be able to distinguish expected (normal) from unexpected (buggy) program behavior. However, a machine cannot always distinguish a bug from a feature. In automated
software testing
Software testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to apprecia ...
, this is also called the
test oracle
In computing, software engineering, and software testing, a test oracle (or just oracle) is a mechanism for determining whether a test has passed or failed. The use of oracles involves comparing the output(s) of the system under test, for a given ...
problem.
Typically, a fuzzer distinguishes between crashing and non-crashing inputs in the absence of
specifications
A specification often refers to a set of documented requirements to be satisfied by a material, design, product, or service. A specification is often a type of technical standard.
There are different types of technical or engineering specificati ...
and to use a simple and objective measure.
Crashes can be easily identified and might indicate potential vulnerabilities (e.g.,
denial of service
In computing, a denial-of-service attack (DoS attack) is a cyber-attack in which the perpetrator seeks to make a machine or network resource unavailable to its intended users by temporarily or indefinitely disrupting services of a host connect ...
or
arbitrary code execution
In computer security, arbitrary code execution (ACE) is an attacker's ability to run any commands or code of the attacker's choice on a target machine or in a target process. An arbitrary code execution vulnerability is a security flaw in softw ...
). However, the absence of a crash does not indicate the absence of a vulnerability. For instance, a program written in
C may or may not crash when an input causes a
buffer overflow
In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations.
Buffers are areas of memory ...
. Rather the program's behavior is
undefined
Undefined may refer to:
Mathematics
* Undefined (mathematics), with several related meanings
** Indeterminate form, in calculus
Computing
* Undefined behavior, computer code whose behavior is not specified under certain conditions
* Undefined ...
.
To make a fuzzer more sensitive to failures other than crashes, sanitizers can be used to inject assertions that crash the program when a failure is detected. There are different sanitizers for different kinds of bugs:
*to detect memory related errors, such as
buffer overflow
In information security and programming, a buffer overflow, or buffer overrun, is an anomaly whereby a program, while writing data to a buffer, overruns the buffer's boundary and overwrites adjacent memory locations.
Buffers are areas of memory ...
s and
use-after-free
Dangling pointers and wild pointers in computer programming are pointers that do not point to a valid object of the appropriate type. These are special cases of memory safety violations. More generally, dangling references and wild references are ...
(using
memory debugger
Memory is the faculty of the mind by which data or information is encoded, stored, and retrieved when needed. It is the retention of information over time for the purpose of influencing future action. If past events could not be remembered, ...
s such as
AddressSanitizer
AddressSanitizer (or ASan) is an open source programming tool that detects memory corruption bugs such as buffer overflows or accesses to a dangling pointer (use-after-free). AddressSanitizer is based on compiler instrumentation and directly mappe ...
),
*to detect
race condition
A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of t ...
s and
deadlock
In concurrent computing, deadlock is any situation in which no member of some group of entities can proceed because each waits for another member, including itself, to take action, such as sending a message or, more commonly, releasing a lo ...
s (ThreadSanitizer),
*to detect
undefined behavior
In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior, ...
(UndefinedBehaviorSanitizer),
*to detect
memory leak
In computer science, a memory leak is a type of resource leak that occurs when a computer program incorrectly manages memory allocations in a way that Computer memory, memory which is no longer needed is not released. A memory leak may also happe ...
s (LeakSanitizer), or
*to check
control-flow integrity
Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program.
Techniques
Associated techniques include code-p ...
(CFISanitizer).
Fuzzing can also be used to detect "differential" bugs if a
reference implementation
In the software development process, a reference implementation (or, less frequently, sample implementation or model implementation) is a program that implements all requirements from a corresponding specification. The reference implementation o ...
is available. For automated
regression testing
Regression testing (rarely, ''non-regression testing'') is re-running functional and non-functional tests to ensure that previously developed and tested software still performs as expected after a change. If not, that would be called a '' regre ...
, the generated inputs are executed on two
versions of the same program. For automated
differential testing, the generated inputs are executed on two implementations of the same program (e.g.,
lighttpd
lighttpd (pronounced "lighty") is an open-source web server optimized for speed-critical environments while remaining standards-compliant, secure and flexible. It was originally written by Jan Kneschke as a proof-of-concept of the c10k problem †...
and
are both implementations of a web server). If the two variants produce different output for the same input, then one may be buggy and should be examined more closely.
Validating static analysis reports
Static program analysis
In computer science, static program analysis (or static analysis) is the analysis of computer programs performed without executing them, in contrast with dynamic program analysis, which is performed on programs during their execution.
The term i ...
analyzes a program without actually executing it. This might lead to
false positive
A false positive is an error in binary classification in which a test result incorrectly indicates the presence of a condition (such as a disease when the disease is not present), while a false negative is the opposite error, where the test result ...
s where the tool reports problems with the program that do not actually exist. Fuzzing in combination with
dynamic program analysis
Dynamic program analysis is the analysis of computer software that is performed by executing programs on a real or virtual processor. For dynamic program analysis to be effective, the target program must be executed with sufficient test inputs ...
can be used to try to generate an input that actually witnesses the reported problem.
Browser security
Modern web browsers undergo extensive fuzzing. The
Chromium
Chromium is a chemical element with the symbol Cr and atomic number 24. It is the first element in group 6. It is a steely-grey, lustrous, hard, and brittle transition metal.
Chromium metal is valued for its high corrosion resistance and hardne ...
code of
Google Chrome
Google Chrome is a cross-platform web browser developed by Google. It was first released in 2008 for Microsoft Windows, built with free software components from Apple WebKit and Mozilla Firefox. Versions were later released for Linux, macOS ...
is continuously fuzzed by the Chrome Security Team with 15,000 cores.
For
Microsoft Edge
Microsoft Edge is a proprietary, cross-platform web browser created by Microsoft. It was first released in 2015 as part of Windows 10 and Xbox One and later ported to other platforms as a fork of Google's Chromium open-source project: Android ...
and
Internet Explorer
Internet Explorer (formerly Microsoft Internet Explorer and Windows Internet Explorer, commonly abbreviated IE or MSIE) is a series of graphical user interface, graphical web browsers developed by Microsoft which was used in the Microsoft Wind ...
,
Microsoft
Microsoft Corporation is an American multinational technology corporation producing computer software, consumer electronics, personal computers, and related services headquartered at the Microsoft Redmond campus located in Redmond, Washing ...
performed fuzzed testing with 670 machine-years during product development, generating more than 400 billion DOM manipulations from 1 billion HTML files.
Toolchain
A fuzzer produces a large number of inputs in a relatively short time. For instance, in 2016 the Google OSS-fuzz project produced around 4
trillion
''Trillion'' is a number with two distinct definitions:
* 1,000,000,000,000, i.e. one million million, or (ten to the twelfth power), as defined on the short scale. This is now the meaning in both American and British English.
* 1,000,000,000,0 ...
inputs a week.
Hence, many fuzzers provide a
toolchain
In software, a toolchain is a set of programming tools that is used to perform a complex software development task or to create a software product, which is typically another computer program or a set of related programs. In general, the tools form ...
that automates otherwise manual and tedious tasks which follow the automated generation of failure-inducing inputs.
Automated bug triage
Automated bug triage is used to group a large number of failure-inducing inputs by
root cause Root cause may refer to:
* "Root Cause" (''Person of Interest''), a TV show episode
* Root cause analysis
In science
Science is a systematic endeavor that builds and organizes knowledge in the form of testable explanations and predic ...
and to prioritize each individual bug by severity. A fuzzer produces a large number of inputs, and many of the failure-inducing ones may effectively expose the same
software bug
A software bug is an error, flaw or fault in the design, development, or operation of computer software that causes it to produce an incorrect or unexpected result, or to behave in unintended ways. The process of finding and correcting bugs i ...
. Only some of these bugs are
security-critical and should be
patched
Patched (Ptc) is a conserved 12-pass transmembrane protein receptor that plays an obligate negative regulatory role in the Hedgehog signaling pathway in insects and vertebrates. Patched is an essential gene in embryogenesis for proper segm ...
with higher priority. For instance the
CERT Coordination Center
The CERT Coordination Center (CERT/CC) is the coordination center of the computer emergency response team (CERT) for the Software Engineering Institute (SEI), a non-profit United States federally funded research and development center. The CERT/C ...
provides the Linux triage tools which group crashing inputs by the produced
stack trace
In computing, a stack trace (also called stack backtrace or stack traceback) is a report of the active stack frames at a certain point in time during the execution of a program. When a program is run, memory is often dynamically allocated in two p ...
and lists each group according to their probability to be
exploitable. The Microsoft Security Research Centre (MSEC) developed the !exploitable tool which first creates a
hash for a crashing input to determine its uniqueness and then assigns an exploitability rating:
*Exploitable
*Probably Exploitable
*Probably Not Exploitable, or
*Unknown.
Previously unreported, triaged bugs might be automatically
reported to a
bug tracking system
A bug tracking system or defect tracking system is a software application that keeps track of reported software bugs in software development projects. It may be regarded as a type of issue tracking system.
Many bug tracking systems, such as those ...
. For instance, OSS-Fuzz runs large-scale, long-running fuzzing campaigns for several security-critical software projects where each previously unreported, distinct bug is reported directly to a bug tracker.
The OSS-Fuzz bug tracker automatically informs the
maintainer
Maintenance may refer to:
Biological science
* Maintenance of an organism
* Maintenance respiration
Non-technical maintenance
* Alimony, also called ''maintenance'' in British English
* Champerty and maintenance, two related legal doctr ...
of the vulnerable software and checks in regular intervals whether the bug has been fixed in the most recent
revision
Revision is the process of revising.
More specifically, it may refer to:
* Update, a modification of software or a database
* Revision control, the management of changes to sets of computer files
* ''ReVisions'', a 2004 anthology of alternate hi ...
using the uploaded minimized failure-inducing input.
Automated input minimization
Automated input minimization (or test case reduction) is an automated
debugging
In computer programming and software development, debugging is the process of finding and resolving '' bugs'' (defects or problems that prevent correct operation) within computer programs, software, or systems.
Debugging tactics can involve in ...
technique to isolate that part of the failure-inducing input that is actually inducing the failure.
If the failure-inducing input is large and mostly malformed, it might be difficult for a developer to understand what exactly is causing the bug. Given the failure-inducing input, an automated minimization tool would remove as many input bytes as possible while still reproducing the original bug. For instance,
Delta Debugging is an automated input minimization technique that employs an extended
binary search algorithm
In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the m ...
to find such a minimal input.
See also
*
American fuzzy lop (fuzzer)
The American Fuzzy Lop is a rabbit breed recognized by the American Rabbit Breeders Association (ARBA). It is similar in appearance to a Holland Lop. However, the American Fuzzy Lop is a wool breed and will have wool similar to the Angora breeds a ...
*
Concolic testing
Concolic testing (a portmanteau of ''concrete'' and ''symbolic'', also known as dynamic symbolic execution) is a hybrid software verification technique that performs symbolic execution, a classical technique that treats program variables as symbo ...
*
Glitch
A glitch is a short-lived fault in a system, such as a transient fault that corrects itself, making it difficult to troubleshoot. The term is particularly common in the computing and electronics industries, in circuit bending, as well as among ...
*
Glitching
Glitching is an activity in which a person finds and exploits flaws or glitches in video games to achieve something that was not intended by the game designers. Players who engage in this practice are known as glitchers. Some glitches can be ea ...
*
Monkey testing
In software testing, monkey testing is a technique where the user tests the application or system by providing random inputs and checking the behavior, or seeing whether the application or system will crash. Monkey testing is usually implemented as ...
*
Random testing
Random testing is a black-box software testing technique where programs are tested by generating random, independent inputs. Results of the output are compared against software specifications to verify that the test output is pass or fail. In case ...
*
Responsible disclosure
In computer security, coordinated vulnerability disclosure, or "CVD" (formerly known as responsible disclosure) is a vulnerability disclosure model in which a vulnerability or an issue is disclosed to the public only after the responsible partie ...
*
Runtime error detection
Runtime error detection is a software verification method that analyzes a software application as it executes and reports defects that are detected during that execution. It can be applied during unit testing, component testing, integration tes ...
*
Security testing
Security testing is a process intended to reveal flaws in the security mechanisms of an information system that protect data and maintain functionality as intended. Due to the logical limitations of security testing, passing the security testing ...
*
Smoke testing (software)
In computer programming and software testing, smoke testing (also confidence testing, sanity testing,ISTQB® Glossary for the International Software Testing Qualification Board® software testing qualification schemeISTQB GlossaryInternational So ...
*
Symbolic execution In computer science, symbolic execution (also symbolic evaluation or symbex) is a means of analyzing a program to determine what inputs cause each part of a program to execute. An interpreter follows the program, assuming symbolic values for inp ...
*
System testing
System testing is testing conducted on a complete integrated system to evaluate the system's compliance with its specified requirements.
System testing takes, as its input, all of the integrated components that have passed integration testing. ...
*
Test automation
In software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive bu ...
References
Further reading
*Ari Takanen, Jared D. DeMott, Charles Miller, ''Fuzzing for Software Security Testing and Quality Assurance'', 2008,
*Michael Sutton, Adam Greene, and Pedram Amini. ''Fuzzing: Brute Force Vulnerability Discovery'', 2007, .
*H. Pohl
''Cost-Effective Identification of Zero-Day Vulnerabilities with the Aid of Threat Modeling and Fuzzing'' 2011
*Fabien Duchene
Detection of Web Vulnerabilities via Model Inference assisted Evolutionary Fuzzing, 2014, PhD Thesis*Bratus, S., Darley, T., Locasto, M., Patterson, M.L., Shapiro, R.B., Shubina, A., ''Beyond Planted Bugs in "Trusting Trust": The Input-Processing Frontier''
IEEE Security & Privacy Vol 12, Issue 1, (Jan-Feb 2014), pp. 83–87€”Basically highlights why fuzzing works so well: because the input is the controlling program of the interpreter.
External links
Fuzzing Project includes tutorials, a list of security-critical
open-source
Open source is source code that is made freely available for possible modification and redistribution. Products include permission to use the source code, design documents, or content of the product. The open-source model is a decentralized sof ...
projects, and other resources.
University of Wisconsin Fuzz Testing (the original fuzz project)Source of papers and fuzz software.
Designing Inputs That Make Software Fail conference video including fuzzy testing
Link to the Oulu (Finland) University Secure Programming GroupBuilding 'Protocol Aware' Fuzzing Frameworks
{{Software testing
Software testing
Security testing