Betweenness
   HOME

TheInfoList



OR:

Betweenness is an
algorithmic problem In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific problems or to perform a computation. Algorithms are used as specifications for performing cal ...
in
order theory Order theory is a branch of mathematics that investigates the intuitive notion of order using binary relations. It provides a formal framework for describing statements such as "this is less than that" or "this precedes that". This article intr ...
about ordering a collection of items subject to constraints that some items must be placed between others.. It has applications in
bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
. and was shown to be NP-complete by .


Problem statement

The input to a betweenness problem is a collection of
ordered triple In mathematics, a tuple is a finite ordered list (sequence) of elements. An -tuple is a sequence (or ordered list) of elements, where is a non-negative integer. There is only one 0-tuple, referred to as ''the empty tuple''. An -tuple is defi ...
s of items. The items listed in these triples should be placed into a
total order In mathematics, a total or linear order is a partial order in which any two elements are comparable. That is, a total order is a binary relation \leq on some set X, which satisfies the following for all a, b and c in X: # a \leq a ( reflexive) ...
, with the property that for each of the given triples, the middle item in the triple appears in the output somewhere between the other two items. The items of each triple are not required to be consecutive in the output.


Examples

As an example, the collection of input triples :(2,1,3), (3,4,5), (1,4,5), (2,4,1), (5,2,3) is satisfied by the output ordering :3, 1, 4, 2, 5 but not by :3, 1, 2, 4, 5. In the first of these output orderings, for all five of the input triples, the middle item of the triple appears between the other two items However, for the second output ordering, item 4 is not between items 1 and 2, contradicting the requirement given by the triple (2,4,1). If an input contains two triples like (1,2,3) and (2,3,1) with the same three items but a different choice of the middle item, then there is no valid solution. However, there are more complicated ways of forming a set of triples with no valid solution, that do not contain such a pair of contradictory triples.


Complexity

showed that the decision version of the betweenness problem (in which an algorithm must decide whether or not there exists a valid solution) is NP-complete in two ways, by a reduction from
3-satisfiability In logic and computer science, the Boolean satisfiability problem (sometimes called propositional satisfiability problem and abbreviated SATISFIABILITY, SAT or B-SAT) is the problem of determining if there exists an interpretation that satisfie ...
and also by a different reduction from hypergraph 2-coloring.. However, it can easily be solved when all unordered triples of items are represented by an ordered triple of the input, by choosing one of the two items that are not between any others to be the start of the ordering and then using the triples involving this item to compare the relative positions of each pair of remaining items. The related problem of finding an ordering that maximizes the number of satisfied triples is MAXSNP-hard, implying that it is impossible to achieve an
approximation ratio An approximation is anything that is intentionally similar but not exactly equal to something else. Etymology and usage The word ''approximation'' is derived from Latin ''approximatus'', from ''proximus'' meaning ''very near'' and the prefix ' ...
arbitrarily close to 1 in
polynomial time In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
unless P = NP. It remains hard to solve or approximate even for dense instances that include an ordered triple for each possible unordered triple of items. The minimum version of the problem restricted to the tournaments was proven to have polynomial time approximation schemes (PTAS). One can achieve an approximation ratio of 1/3 (in expectation) by ordering the items randomly, and this simple strategy gives the best possible polynomial-time approximation if the
unique games conjecture In computational complexity theory, the unique games conjecture (often referred to as UGC) is a conjecture made by Subhash Khot in 2002. The conjecture postulates that the problem of determining the approximate ''value'' of a certain type of gam ...
is true. It is also possible to use semidefinite programming or combinatorial methods to find an ordering that satisfies at least half of the triples of any satisfiable instance, in polynomial time. In parameterized complexity, the problem of satisfying as many constraints as possible from a set ''C'' of constraints is fixed-parameter tractable when parameterized by the difference ''q'' − , ''C'', /3 between the solution quality ''q'' found by the parameterized algorithm and the , ''C'', /3 quality guaranteed in expectation by a random ordering. Although not guaranteed to succeed, a
greedy heuristic A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally ...
can find solutions to many instances of the betweenness problem arising in practice.


Applications

One application of betweenness arises in
bioinformatics Bioinformatics () is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combi ...
, as part of the process of gene mapping. Certain types of genetic experiments can be used to determine the ordering of triples of genetic markers, but do not distinguish a genetic sequence from its reversal, so the information yielded from such an experiment determines only which one out of three markers is the middle one. The betweenness problem is an abstraction of the problem of assembling a collection of markers into a single sequence given experimental data of this type. The betweenness problem has also been used to model theories of
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
,
causality Causality (also referred to as causation, or cause and effect) is influence by which one event, process, state, or object (''a'' ''cause'') contributes to the production of another event, process, state, or object (an ''effect'') where the cau ...
, and
time Time is the continued sequence of existence and events that occurs in an apparently irreversible succession from the past, through the present, into the future. It is a component quantity of various measurements used to sequence events, to ...
..


References

{{reflist NP-complete problems Order theory