In
computer science
Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, information theory, and automation) to Applied science, practical discipli ...
, a kernelization is a technique for designing efficient
algorithm
In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
s that achieve their efficiency by a preprocessing stage in which inputs to the algorithm are replaced by a smaller input, called a "kernel". The result of solving the problem on the kernel should either be the same as on the original input, or it should be easy to transform the output on the kernel to the desired output for the original problem.
Kernelization is often achieved by applying a set of reduction rules that cut away parts of the instance that are easy to handle. In
parameterized complexity theory, it is often possible to prove that a kernel with guaranteed bounds on the size of a kernel (as a function of some parameter associated to the problem) can be found in
polynomial time
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
. When this is possible, it results in a
fixed-parameter tractable
In computer science, parameterized complexity is a branch of computational complexity theory that focuses on classifying computational problems according to their inherent difficulty with respect to ''multiple'' parameters of the input or output. T ...
algorithm whose running time is the sum of the (polynomial time) kernelization step and the (non-polynomial but bounded by the parameter) time to solve the kernel. Indeed, every problem that can be solved by a fixed-parameter tractable algorithm can be solved by a kernelization algorithm of this type.
Example: vertex cover
A standard example for a kernelization algorithm is the kernelization of the
vertex cover problem
In graph theory, a vertex cover (sometimes node cover) of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.
In computer science, the problem of finding a minimum vertex cover is a classical optimiza ...
by S. Buss.
In this problem, the input is an
undirected graph
In discrete mathematics, and more specifically in graph theory, a graph is a structure amounting to a set of objects in which some pairs of the objects are in some sense "related". The objects correspond to mathematical abstractions called '' v ...
together with a number
. The output is a set of at most
vertices that includes an endpoint of every edge in the graph, if such a set exists, or a failure exception if no such set exists. This problem is
NP-hard
In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
. However, the following reduction rules may be used to kernelize it:
# If
and
is a vertex of degree greater than
, remove
from the graph and decrease
by one. Every vertex cover of size
must contain
since otherwise too many of its neighbors would have to be picked to cover the incident edges. Thus, an optimal vertex cover for the original graph may be formed from a cover of the reduced problem by adding
back to the cover.
# If
is an isolated vertex, remove it. An isolated vertex cannot cover any edges, so in this case
cannot be part of any minimal cover.
# If more than
edges remain in the graph, and neither of the previous two rules can be applied, then the graph cannot contain a vertex cover of size
. For, after eliminating all vertices of degree greater than
, each remaining vertex can only cover at most
edges and a set of
vertices could only cover at most
edges. In this case, the instance may be replaced by an instance with two vertices, one edge, and
, which also has no solution.
An algorithm that applies these rules repeatedly until no more reductions can be made necessarily terminates with a kernel that has at most
edges and (because each edge has at most two endpoints and there are no isolated vertices) at most
vertices. This kernelization may be implemented in
linear time
In computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by ...
. Once the kernel has been constructed, the vertex cover problem may be solved by a
brute force search
In computer science, brute-force search or exhaustive search, also known as generate and test, is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the soluti ...
algorithm that tests whether each subset of the kernel is a cover of the kernel.
Thus, the vertex cover problem can be solved in time
for a graph with
vertices and
edges, allowing it to be solved efficiently when
is small even if
and
are both large.
Although this bound is fixed-parameter tractable, its dependence on the parameter is higher than might be desired. More complex kernelization procedures can improve this bound, by finding smaller kernels, at the expense of greater running time in the kernelization step. In the vertex cover example, kernelization algorithms are known that produce kernels with at most
vertices.
One algorithm that achieves this improved bound exploits the half-integrality of the
linear program relaxation of vertex cover due to
Nemhauser and Trotter. Another kernelization algorithm achieving that bound is based on what is known as the crown reduction rule and uses
alternating path
Alternating may refer to:
Mathematics
* Alternating algebra, an algebra in which odd-grade elements square to zero
* Alternating form, a function formula in algebra
* Alternating group, the group of even permutations of a finite set
* Alternati ...
arguments.
[ give a kernel based on the crown reduction that has vertices. The vertex bound is a bit more involved and folklore.] The currently best known kernelization algorithm in terms of the number of vertices is due to and achieves
vertices for any fixed constant
.
It is not possible, in this problem, to find a kernel of size
, unless P = NP, for such a kernel would lead to a polynomial-time algorithm for the NP-hard vertex cover problem. However, much stronger bounds on the kernel size can be proven in this case: unless
coNP
In computational complexity theory, co-NP is a complexity class. A decision problem X is a member of co-NP if and only if its complement (complexity), complement is in the complexity class NP (complexity), NP. The class can be defined as follows: ...
NP/poly (believed unlikely by
complexity theorists), for every
it is impossible in polynomial time to find kernels with
edges.
It is unknown for vertex cover whether kernels with
vertices for some
would have any unlikely complexity-theoretic consequences.
Definition
In the literature, there is no clear consensus on how kernelization should be formally defined and there are subtle differences in the uses of that expression.
Downey–Fellows notation
In the notation of , a ''parameterized problem'' is a subset
describing a
decision problem
In computability theory and computational complexity theory, a decision problem is a computational problem that can be posed as a yes–no question of the input values. An example of a decision problem is deciding by means of an algorithm wheth ...
.
A kernelization for a parameterized problem
is an algorithm that takes an instance
and maps it in time polynomial in
and
to an instance
such that
*
is in
if and only if
is in
,
* the size of
is bounded by a computable function
in
, and
*
is bounded by a function in
.
The output
of kernelization is called a kernel. In this general context, the ''size'' of the string
just refers to its length.
Some authors prefer to use the number of vertices or the number of edges as the size measure in the context of graph problems.
Flum–Grohe notation
In the notation of , a ''parameterized problem'' consists of a decision problem
and a function
, the parameterization. The ''parameter'' of an instance
is the number
.
A kernelization for a parameterized problem
is an algorithm that takes an instance
with parameter
and maps it in polynomial time to an instance
such that
*
is in
if and only if
is in
and
* the size of
is bounded by a computable function
in
.
Note that in this notation, the bound on the size of
implies that the parameter of
is also bounded by a function in
.
The function
is often referred to as the size of
the kernel. If
, it is said that
admits a polynomial kernel. Similarly, for
, the problem admits linear kernel.
Kernelizability and fixed-parameter tractability are equivalent
A problem is fixed-parameter tractable if and only if it is kernelizable and
decidable.
That a kernelizable and decidable problem is fixed-parameter tractable can be seen from the definition above:
First the kernelization algorithm, which runs in time
for some c, is invoked to generate a kernel of size
.
The kernel is then solved by the algorithm that proves that the problem is decidable.
The total running time of this procedure is
, where
is the running time for the algorithm used to solve the kernels.
Since
is computable, e.g. by using the assumption that
is computable and testing all possible inputs of length
, this implies that the problem is fixed-parameter tractable.
The other direction, that a fixed-parameter tractable problem is kernelizable and decidable is a bit more involved. Assume that the question is non-trivial, meaning that there is at least one instance that is in the language, called
, and at least one instance that is not in the language, called
; otherwise, replacing any instance by the empty string is a valid kernelization. Assume also that the problem is fixed-parameter tractable, i.e., it has an algorithm that runs in at most
steps on instances
, for some constant
and some function
. To kernelize an input, run this algorithm on the given input for at most
steps. If it terminates with an answer, use that answer to select either
or
as the kernel. If, instead, it exceeds the
bound on the number of steps without terminating, then return
itself as the kernel. Because
is only returned as a kernel for inputs with
, it follows that the size of the kernel produced in this way is at most
. This size bound is computable, by the assumption from fixed-parameter tractability that
is computable.
More examples
* Vertex cover parametrized by the size of the vertex cover: The
vertex cover
In graph theory, a vertex cover (sometimes node cover) of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.
In computer science, the problem of finding a minimum vertex cover is a classical optimizat ...
problem has kernels with at most
vertices and
edges.
Furthermore, for any
, vertex cover does not have kernels with
edges unless
.
The vertex cover problems in
-uniform hypergraphs has kernels with
edges using the
sunflower lemma
In the mathematical fields of set theory and extremal combinatorics, a sunflower or \Delta-system is a collection of sets whose pairwise intersection is constant. This constant intersection is called the kernel of the sunflower.
The main rese ...
, and it does not have kernels of size
unless
.
* Feedback vertex set parametrized by the size of the feedback vertex set: The
feedback vertex set In the mathematical discipline of graph theory, a feedback vertex set (FVS) of a graph is a set of vertices whose removal leaves a graph without cycles ("removal" means deleting the vertex and all edges adjacent to it). Equivalently, each FVS conta ...
problem has kernels with
vertices and
edges.
Furthermore, it does not have kernels with
edges unless
.
*
-path: The
-path problem is to decide whether a given graph has a
path
A path is a route for physical travel – see Trail.
Path or PATH may also refer to:
Physical paths of different types
* Bicycle path
* Bridle path, used by people on horseback
* Course (navigation), the intended path of a vehicle
* Desire p ...
of length at least
. This problem has kernels of size exponential in
, and it does not have kernels of size polynomial in
unless
.
* Bidimensional problems: Many parameterized versions of
bidimensional problems have linear kernels on planar graphs, and more generally, on graphs excluding some fixed graph as a
minor.
Kernelization for structural parameterizations
While the parameter
in the
examples
Example may refer to:
* '' exempli gratia'' (e.g.), usually read out in English as "for example"
* .example, reserved as a domain name that may not be installed as a top-level domain of the Internet
** example.com, example.net, example.org, e ...
above is chosen as the size of the desired solution, this is not necessary. It is also possible to choose a structural complexity measure of the input as the parameter value, leading to so-called structural parameterizations. This approach is fruitful for instances whose solution size is large, but for which some other complexity measure is bounded. For example, the ''feedback vertex number'' of an undirected graph
is defined as the minimum cardinality of a set of vertices whose removal makes
acyclic. The
vertex cover
In graph theory, a vertex cover (sometimes node cover) of a graph is a set of vertices that includes at least one endpoint of every edge of the graph.
In computer science, the problem of finding a minimum vertex cover is a classical optimizat ...
problem parameterized by the feedback vertex number of the input graph has a polynomial kernelization:
There is a polynomial-time algorithm that, given a graph
whose feedback vertex number is
, outputs a graph
on
vertices such that a minimum vertex cover in
can be transformed into a minimum vertex cover for
in polynomial time. The kernelization algorithm therefore guarantees that instances with a small feedback vertex number
are reduced to small instances.
See also
*
Iterative compression In computer science, iterative compression is an algorithmic technique for the design of fixed-parameter tractable algorithms, in which one element (such as a vertex of a graph) is added to the problem in each step, and a small solution for the pr ...
, a different design technique for fixed-parameter tractable algorithms
Notes
References
* .
* .
* .
* .
* .
*.
* .
* .
* ,
* .
* .
* .
Further reading
*
*
*{{citation
, last1=Cygan, first1=Marek
, last2=Fomin, first2=Fedor V.
, last3=Kowalik, first3=Lukasz
, last4=Lokshtanov, first4=Daniel
, last5=Marx, first5=Daniel
, last6=Pilipczuk, first6=Marcin
, last7=Pilipczuk, first7=Michal
, last8=Saurabh, first8=Saket
, year=2015
, at=Chapters 2 and 9
, title=Parameterized Algorithms
, publisher=Springer
, isbn=978-3-319-21274-6
Parameterized complexity
Analysis of algorithms