graph theory
In mathematics, graph theory is the study of ''graphs'', which are mathematical structures used to model pairwise relations between objects. A graph in this context is made up of '' vertices'' (also called ''nodes'' or ''points'') which are conne ...
, a vertex cover (sometimes node cover) of a
graph
Graph may refer to:
Mathematics
*Graph (discrete mathematics), a structure made of vertices and edges
**Graph theory, the study of such graphs and their properties
*Graph (topology), a topological space resembling a graph in the sense of discre ...
is a set of vertices that includes at least one endpoint of every
edge
Edge or EDGE may refer to:
Technology Computing
* Edge computing, a network load-balancing system
* Edge device, an entry point to a computer network
* Adobe Edge, a graphical development application
* Microsoft Edge, a web browser developed by ...
of the graph.
In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, the problem of finding a minimum vertex cover is a classical
optimization problem
In mathematics, computer science and economics, an optimization problem is the problem of finding the ''best'' solution from all feasible solutions.
Optimization problems can be divided into two categories, depending on whether the variables ...
. It is
NP-hard
In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
P ≠ NP
The P versus NP problem is a major unsolved problem in theoretical computer science. In informal terms, it asks whether every problem whose solution can be quickly verified can also be quickly solved.
The informal term ''quickly'', used above ...
. Moreover, it is
hard to approximate In computer science, hardness of approximation is a field that studies the Computational complexity theory, algorithmic complexity of finding near-optimal solutions to optimization problems.
Scope
Hardness of approximation complements the study of ...
– it cannot be approximated up to a factor smaller than 2 if the
unique games conjecture
In computational complexity theory, the unique games conjecture (often referred to as UGC) is a conjecture made by Subhash Khot in 2002.
The conjecture postulates that the problem of determining the approximate ''value'' of a certain type of ga ...
is true. On the other hand, it has several simple 2-factor approximations. It is a typical example of an NP-hard optimization problem that has an
approximation algorithm
In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solu ...
Karp's 21 NP-complete problems
In computational complexity theory, Karp's 21 NP-complete problems are a set of computational problems which are NP-complete. In his 1972 paper, "Reducibility Among Combinatorial Problems", Richard Karp used Stephen Cook's 1971 theorem that the b ...
and is therefore a classical
NP-complete
In computational complexity theory, a problem is NP-complete when:
# it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryi ...
problem in
computational complexity theory
In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by ...
. Furthermore, the vertex cover problem is
fixed-parameter tractable
In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to ''multiple'' parameters of the input or output. T ...
linear program
Linear programming (LP), also called linear optimization, is a method to achieve the best outcome (such as maximum profit or lowest cost) in a mathematical model whose requirements are represented by linear relationships. Linear programming i ...
whose
dual linear program The dual of a given linear program (LP) is another LP that is derived from the original (the primal) LP in the following schematic way:
* Each variable in the primal LP becomes a constraint in the dual LP;
* Each constraint in the primal LP becomes ...
is the
maximum matching problem
In the mathematical discipline of graph theory, a matching or independent edge set in an undirected Graph (discrete mathematics), graph is a set of Edge (graph theory), edges without common vertex (graph theory), vertices. Finding a matching in a ...
.
Vertex cover problems have been generalized to
hypergraph
In mathematics, a hypergraph is a generalization of a graph in which an edge can join any number of vertices. In contrast, in an ordinary graph, an edge connects exactly two vertices.
Formally, an undirected hypergraph H is a pair H = (X,E) wh ...
s, see ''
Vertex cover in hypergraphs
In graph theory, a vertex cover in a hypergraph is a set of vertices, such that every hyperedge of the hypergraph contains at least one vertex of that set. It is an extension of the notion of vertex cover in a graph.
An equivalent term is a hit ...
''.
Definition
Formally, a vertex cover of an undirected graph is a subset of such that , that is to say it is a set of vertices where every edge has at least one endpoint in the vertex cover . Such a set is said to ''cover'' the edges of . The upper figure shows two examples of vertex covers, with some vertex cover marked in red.
A ''minimum vertex cover'' is a vertex cover of smallest possible size. The vertex cover number is the size of a minimum vertex cover, i.e. . The lower figure shows examples of minimum vertex covers in the previous graphs.
Examples
* The set of all vertices is a vertex cover.
* The endpoints of any
maximal matching
In the mathematical discipline of graph theory, a matching or independent edge set in an undirected graph is a set of edges without common vertices. Finding a matching in a bipartite graph can be treated as a network flow problem.
Definit ...
form a vertex cover.
* The
complete bipartite graph
In the mathematical field of graph theory, a complete bipartite graph or biclique is a special kind of bipartite graph where every vertex of the first set is connected to every vertex of the second set..Electronic edition page 17.
Graph theory i ...
has a minimum vertex cover of size .
Properties
* A set of vertices is a vertex cover if and only if its
complement
A complement is something that completes something else.
Complement may refer specifically to:
The arts
* Complement (music), an interval that, when added to another, spans an octave
** Aggregate complementation, the separation of pitch-clas ...
is an independent set.
* Consequently, the number of vertices of a graph is equal to its minimum vertex cover number plus the size of a maximum independent set ( Gallai 1959).
Computational problem
The minimum vertex cover problem is the
optimization problem
In mathematics, computer science and economics, an optimization problem is the problem of finding the ''best'' solution from all feasible solutions.
Optimization problems can be divided into two categories, depending on whether the variables ...
of finding a smallest vertex cover in a given graph.
:INSTANCE: Graph
:OUTPUT: Smallest number such that has a vertex cover of size .
If the problem is stated as a
decision problem
In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm wheth ...
, it is called the vertex cover problem:
:INSTANCE: Graph and positive integer .
:QUESTION: Does have a vertex cover of size at most ?
The vertex cover problem is an
NP-complete
In computational complexity theory, a problem is NP-complete when:
# it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryi ...
problem: it was one of
Karp's 21 NP-complete problems
In computational complexity theory, Karp's 21 NP-complete problems are a set of computational problems which are NP-complete. In his 1972 paper, "Reducibility Among Combinatorial Problems", Richard Karp used Stephen Cook's 1971 theorem that the b ...
. It is often used in
computational complexity theory
In theoretical computer science and mathematics, computational complexity theory focuses on classifying computational problems according to their resource usage, and relating these classes to each other. A computational problem is a task solved by ...
as a starting point for
NP-hard
In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
ness proofs.
ILP formulation
Assume that every vertex has an associated cost of .
The (weighted) minimum vertex cover problem can be formulated as the following
integer linear program
An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objectiv ...
(ILP).
:
This ILP belongs to the more general class of ILPs for
covering problems
In combinatorics and computer science, covering problems are computational problems that ask whether a certain combinatorial structure 'covers' another, or how large the structure has to be to do that. Covering problems are Optimization (mathematic ...
.
The
integrality gap
In mathematics, the relaxation of a (mixed) integer linear program is the problem that arises by removing the integrality constraint of each variable.
For example, in a 0–1 integer program, all constraints are of the form
:x_i\in\.
The relax ...
of this ILP is , so its relaxation (allowing each variable to be in the interval from 0 to 1, rather than requiring the variables to be only 0 or 1) gives a factor-
approximation algorithm
In computer science and operations research, approximation algorithms are efficient algorithms that find approximate solutions to optimization problems (in particular NP-hard problems) with provable guarantees on the distance of the returned solu ...
for the minimum vertex cover problem.
Furthermore, the linear programming relaxation of that ILP is ''half-integral'', that is, there exists an optimal solution for which each entry is either 0, 1/2, or 1. A 2-approximate vertex cover can be obtained from this fractional solution by selecting the subset of vertices whose variables are nonzero.
Exact evaluation
The decision variant of the vertex cover problem is
NP-complete
In computational complexity theory, a problem is NP-complete when:
# it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by tryi ...
, which means it is unlikely that there is an efficient algorithm to solve it exactly for arbitrary graphs. NP-completeness can be proven by reduction from 3-satisfiability or, as Karp did, by reduction from the
clique problem
In computer science, the clique problem is the computational problem of finding cliques (subsets of vertices, all adjacent to each other, also called complete subgraphs) in a graph. It has several different formulations depending on which cliq ...
. Vertex cover remains NP-complete even in
cubic graph
In the mathematical field of graph theory, a cubic graph is a graph in which all vertices have degree three. In other words, a cubic graph is a 3-regular graph. Cubic graphs are also called trivalent graphs.
A bicubic graph is a cubic bi ...
s and even in planar graphs of degree at most 3.
For bipartite graphs, the equivalence between vertex cover and maximum matching described by Kőnig's theorem allows the bipartite vertex cover problem to be solved in
polynomial time
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
.
For
tree graph
In graph theory, a tree is an undirected graph in which any two Vertex (graph theory), vertices are connected by ''exactly one'' Path (graph theory), path, or equivalently a Connected graph, connected Cycle (graph theory), acyclic undirected grap ...
s, an algorithm finds a minimal vertex cover in polynomial time by finding the first leaf in the tree and adding its parent to the minimal vertex cover, then deleting the leaf and parent and all associated edges and continuing repeatedly until no edges remain in the tree.
Fixed-parameter tractability
An
exhaustive search
In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
algorithm can solve the problem in time 2''k''''n''''O''(1), where ''k'' is the size of the vertex cover. Vertex cover is therefore
fixed-parameter tractable
In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to ''multiple'' parameters of the input or output. T ...
, and if we are only interested in small ''k'', we can solve the problem in
polynomial time
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
. One algorithmic technique that works here is called ''bounded search tree algorithm'', and its idea is to repeatedly choose some vertex and recursively branch, with two cases at each step: place either the current vertex or all its neighbours into the vertex cover.
The algorithm for solving vertex cover that achieves the best asymptotic dependence on the parameter runs in time . The
klam value In the parameterized complexity of algorithms, the klam value of a parameterized algorithm is a number that bounds the parameter values for which the algorithm might reasonably be expected to be practical.. An algorithm with a higher klam value can ...
of this time bound (an estimate for the largest parameter value that could be solved in a reasonable amount of time) is approximately 190. That is, unless additional algorithmic improvements can be found, this algorithm is suitable only for instances whose vertex cover number is 190 or less. Under reasonable complexity-theoretic assumptions, namely the
exponential time hypothesis
In computational complexity theory, the exponential time hypothesis is an unproven computational hardness assumption that was formulated by . It states that satisfiability of 3-CNF Boolean formulas cannot be solved more quickly than exponential t ...
, the running time cannot be improved to 2''o''(''k''), even when is .
However, for planar graphs, and more generally, for graphs excluding some fixed graph as a minor, a vertex cover of size ''k'' can be found in time '''', i.e., the problem is subexponential
fixed-parameter tractable
In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to ''multiple'' parameters of the input or output. T ...
. This algorithm is again optimal, in the sense that, under the
exponential time hypothesis
In computational complexity theory, the exponential time hypothesis is an unproven computational hardness assumption that was formulated by . It states that satisfiability of 3-CNF Boolean formulas cannot be solved more quickly than exponential t ...
, no algorithm can solve vertex cover on planar graphs in time ''''.
Approximate evaluation
One can find a factor-2
approximation
An approximation is anything that is intentionally similar but not exactly equality (mathematics), equal to something else.
Etymology and usage
The word ''approximation'' is derived from Latin ''approximatus'', from ''proximus'' meaning ''very ...
by repeatedly taking ''both'' endpoints of an edge into the vertex cover, then removing them from the graph. Put otherwise, we find a
maximal matching
In the mathematical discipline of graph theory, a matching or independent edge set in an undirected graph is a set of edges without common vertices. Finding a matching in a bipartite graph can be treated as a network flow problem.
Definit ...
''M'' with a greedy algorithm and construct a vertex cover ''C'' that consists of all endpoints of the edges in ''M''. In the following figure, a maximal matching ''M'' is marked with red, and the vertex cover ''C'' is marked with blue.
:
The set ''C'' constructed this way is a vertex cover: suppose that an edge ''e'' is not covered by ''C''; then ''M'' ∪ is a matching and ''e'' ∉ ''M'', which is a contradiction with the assumption that ''M'' is maximal. Furthermore, if ''e'' = ∈ ''M'', then any vertex cover – including an optimal vertex cover – must contain ''u'' or ''v'' (or both); otherwise the edge ''e'' is not covered. That is, an optimal cover contains at least ''one'' endpoint of each edge in ''M''; in total, the set ''C'' is at most 2 times as large as the optimal vertex cover.
This simple algorithm was discovered independently by Fanica Gavril and
Mihalis Yannakakis
Mihalis Yannakakis ( el, Μιχάλης Γιαννακάκης; born 13 September 1953 in Athens, Greece)2 - \Theta \left( 1 / \sqrt \right) is known. The problem can be approximated with an approximation factor in - dense graphs.
Inapproximability
No better
constant-factor approximation algorithm
In computational complexity theory, the class APX (an abbreviation of "approximable") is the set of NP optimization problems that allow polynomial-time approximation algorithms with approximation ratio bounded by a constant (or constant-factor ap ...
than the above one is known.
The minimum vertex cover problem is APX-complete, that is, it cannot be approximated arbitrarily well unless P = NP.
Using techniques from the PCP theorem, Dinur and Safra proved in 2005 that minimum vertex cover cannot be approximated within a factor of 1.3606 for any sufficiently large vertex degree unless P = NP.
Later, the factor was improved to for any .
Moreover, if the
unique games conjecture
In computational complexity theory, the unique games conjecture (often referred to as UGC) is a conjecture made by Subhash Khot in 2002.
The conjecture postulates that the problem of determining the approximate ''value'' of a certain type of ga ...
is true then minimum vertex cover cannot be approximated within any constant factor better than 2.
Although finding the minimum-size vertex cover is equivalent to finding the maximum-size independent set, as described above, the two problems are not equivalent in an approximation-preserving way: The Independent Set problem has ''no'' constant-factor approximation unless P = NP.
Pseudocode
APPROXIMATION-VERTEX-COVER(G)
C = ∅
E'= G.E
while E' ≠ ∅:
let (u, v) be an arbitrary edge of E'
C = C ∪
remove from E' every edge incident on either u or v
return C
Applications
Vertex cover optimization serves as a
model
A model is an informative representation of an object, person or system. The term originally denoted the plans of a building in late 16th-century English, and derived via French and Italian ultimately from Latin ''modulus'', a measure.
Models c ...
for many real-world and
theoretical
A theory is a rational type of abstract thinking about a phenomenon, or the results of such thinking. The process of contemplative and rational thinking is often associated with such processes as observational study or research. Theories may be ...
problems. For example, a commercial establishment interested in installing the fewest possible closed circuit cameras covering all hallways (edges) connecting all rooms (nodes) on a floor might model the objective as a vertex cover minimization problem. The problem has also been used to model the elimination of
repetitive DNA sequences
Repetition may refer to:
*Repetition (rhetorical device), repeating a word within a short space of words
* Repetition (bodybuilding), a single cycle of lifting and lowering a weight in strength training
*Working title for the 1985 slasher film '' ...
for
synthetic biology
Synthetic biology (SynBio) is a multidisciplinary area of research that seeks to create new biological parts, devices, and systems, or to redesign systems that are already found in nature.
It is a branch of science that encompasses a broad ran ...
and
metabolic engineering
Metabolic engineering is the practice of optimizing genetic and regulatory processes within cells to increase the cell's production of a certain substance. These processes are chemical networks that use a series of biochemical reactions and enz ...