HOME

TheInfoList



OR:

Bisection is a method used in software development to identify change sets that result in a specific behavior change. It is mostly employed for finding the patch that introduced a bug. Another application area is finding the patch that indirectly fixed a bug.


Overview

The process of locating the
changeset In version control software, a changeset (also known as commit and revision) is a set of alterations packaged together, along with meta-information about the alterations. A changeset describes the exact differences between two successive versio ...
that introduced a specific regression was described as "source change isolation" in 1997 by Brian Ness and Viet Ngo of
Cray Research Cray Inc., a subsidiary of Hewlett Packard Enterprise, is an American supercomputer manufacturer headquartered in Seattle, Washington. It also manufactures systems for data storage and analytics. Several Cray supercomputer systems are listed ...
.
Regression testing Regression testing (rarely, ''non-regression testing'') is re-running functional and non-functional tests to ensure that previously developed and tested software still performs as expected after a change. If not, that would be called a '' regre ...
was performed on Cray's
compiler In computing, a compiler is a computer program that translates computer code written in one programming language (the ''source'' language) into another language (the ''target'' language). The name "compiler" is primarily used for programs tha ...
s in editions comprising one or more changesets. Editions with known regressions could not be validated until developers addressed the problem. Source change isolation narrowed the cause to a single changeset that could then be excluded from editions, unblocking them with respect to this problem, while the author of the change worked on a fix. Ness and Ngo outlined linear search and
binary search In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the ...
methods of performing this isolation. Code bisection has the goal of minimizing the effort to find a specific change set. It employs a
divide and conquer algorithm In computer science, divide and conquer is an algorithm design paradigm. A divide-and-conquer algorithm recursively breaks down a problem into two or more sub-problems of the same or related type, until these become simple enough to be solved direc ...
that depends on having access to the code history which is usually preserved by
revision control In software engineering, version control (also known as revision control, source control, or source code management) is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections o ...
in a code repository.


Bisection method


Code bisection algorithm

Code history has the structure of a
directed acyclic graph In mathematics, particularly graph theory, and computer science, a directed acyclic graph (DAG) is a directed graph with no directed cycles. That is, it consists of vertices and edges (also called ''arcs''), with each edge directed from one v ...
which can be topologically sorted. This makes it possible to use a divide and conquer search algorithm which: * splits up the search space of candidate revisions * tests for the behavior in question * reduces the search space depending on the test result * re-iterates the steps above until a range with at most one bisectable patch candidate remains


Algorithmic complexity

Bisection is in LSPACE having an
algorithmic complexity Algorithmic may refer to: *Algorithm, step-by-step instructions for a calculation **Algorithmic art, art made by an algorithm **Algorithmic composition, music made by an algorithm ** Algorithmic trading, trading decisions made by an algorithm **Alg ...
of O(\log N) with N denoting the number of revisions in the search space, and is similar to a
binary search In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the ...
.


Desirable repository properties

For code bisection it is desirable that each revision in the search space can be built and tested independently.


Monotonicity

For the bisection algorithm to identify a single changeset which caused the behavior being tested to change, the behavior must change monotonically across the search space. For a Boolean function such as a pass/fail test, this means that it only changes once across all changesets between the start and end of the search space. If there are multiple changesets across the search space where the behavior being tested changes between false and true, then the bisection algorithm will find one of them, but it will not necessarily be the root cause of the change in behavior between the start and the end of the search space. The root cause could be a different changeset, or a combination of two or more changesets across the search space. To help deal with this problem, automated tools allow specific changesets to be ignored during a bisection search.


Automation support

Although the bisection method can be completed manually, one of its main advantages is that it can be easily automated. It can thus fit into existing
test automation In software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive bu ...
processes: failures in exhaustive automated regression tests can trigger automated bisection to localize faults. Ness and Ngo focused on its potential in Cray's
continuous delivery Continuous delivery (CD) is a software engineering approach in which teams produce software in short cycles, ensuring that the software can be reliably released at any time and, following a pipeline through a "production-like environment", withou ...
-style environment in which the automatically isolated bad changeset could be automatically excluded from builds. The revision control systems
Fossil A fossil (from Classical Latin , ) is any preserved remains, impression, or trace of any once-living thing from a past geological age. Examples include bones, shells, exoskeletons, stone imprints of animals or microbes, objects preserved ...
,
Git Git () is a distributed version control system: tracking changes in any set of files, usually used for coordinating work among programmers collaboratively developing source code during software development. Its goals include speed, data in ...
and
Mercurial Mercurial is a distributed revision control tool for software developers. It is supported on Microsoft Windows and Unix-like systems, such as FreeBSD, macOS, and Linux. Mercurial's major design goals include high performance and scalability, d ...
have built-in functionality for code bisection. The user can start a bisection session with a specified range of revisions from which the revision control system proposes a revision to test, the user tells the system whether the revision tested as "good" or "bad", and the process repeats until the specific "bad" revision has been identified. Other revision control systems, such as Bazaar or Subversion, support bisection through plugins or external scripts.
Phoronix Test Suite Phoronix Test Suite (PTS) is a free and open-source benchmark software for Linux and other operating systems which is developed by Michael Larabel and Matthew Tippett. The Phoronix Test Suite has been endorsed by sites such as Linux.com, Linux ...
can do bisection automatically to find performance regressions.


See also

* Delta debugging (generalization of finding a minimal cause of a bug) * (determining changesets that edited a line in a file)


References

{{Reflist Version control Version control systems Software development process Algorithms