In computational complexity theory, a problem is NP-complete when:

1. it is a problem for which the correctness of each solution can be verified quickly (namely, in polynomial time) and a brute-force search algorithm can find a solution by trying all possible solutions.
2. the problem can be used to simulate every other problem for which we can verify quickly that a solution is correct.

In this sense, NP-complete problems are the hardest of the problems to which solutions can be verified quickly. If we could find solutions of some NP-complete problem quickly, we could quickly find the solutions of every other problem to which a given solution can be easily verified.

The name "NP-complete" is short for "nondeterministic polynomial-time complete". In this name, "nondeterministic" refers to nondeterministic Turing machines, a way of mathematically formalizing the idea of a brute-force search algorithm. "Polynomial time" refers to an amount of time that is considered "quick" for a deterministic algorithm to check a single solution, or for a nondeterministic Turing machine to perform the whole search. "Complete" refers to the property of being able to simulate everything in the same complexity class.

More precisely, each input to the problem should be associated with a set of solutions of polynomial length, whose validity can be tested quickly (in polynomial time), such that the output for any input is "yes" if the solution set is non-empty and "no" if it is empty. The complexity class of problems of this form is called NP, an abbreviation for "nondeterministic polynomial time". A problem is said to be NP-hard if everything in NP can be transformed in polynomial time into it, even though it may not be in NP. A problem is NP-complete if it is both in NP and NP-hard. The NP-complete problems represent the hardest problems in NP. If some NP-complete problem has a polynomial time algorithm, all problems in NP do. The set of NP-complete problems is often denoted by NP-C or NPC.

Although a solution to an NP-complete problem can be ''verified'' "quickly", there is no known way to ''find'' a solution quickly. That is, the time required to solve the problem using any currently known algorithm increases rapidly as the size of the problem grows. As a consequence, determining whether it is possible to solve these problems quickly, called the P versus NP problem, is one of the fundamental unsolved problems in computer science today.

While a method for computing the solutions to NP-complete problems quickly remains undiscovered, computer scientists and programmers still frequently encounter NP-complete problems. NP-complete problems are often addressed by using heuristic methods and approximation algorithms.

Overview

NP-complete problems are in NP, the set of all decision problems whose solutions can be verified in polynomial time; ''NP'' may be equivalently defined as the set of decision problems that can be solved in polynomial time on a non-deterministic Turing machine. A problem ''p'' in NP is NP-complete if every other problem in NP can be transformed (or reduced) into ''p'' in polynomial time.

It is not known whether every problem in NP can be quickly solved; this is called the P versus NP problem. But if ''any NP-complete problem'' can be solved quickly, then ''every problem in NP'' can, because the definition of an NP-complete problem states that every problem in NP must be quickly reducible to every NP-complete problem (that is, it can be reduced in polynomial time). Because of this, it is often said that NP-complete problems are ''harder'' or ''more difficult'' than NP problems in general.
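The gap between verifying and finding a solution can be made concrete with Boolean satisfiability. The following is a minimal sketch, not a practical solver; the clause encoding (lists of signed integers) and the function names `verify` and `brute_force_sat` are assumptions chosen for illustration. Checking one candidate assignment takes time linear in the formula size, while the search tries up to 2^n assignments.

```python
from itertools import product

# Assumed encoding: a CNF formula is a list of clauses, each clause a list of
# nonzero ints. Literal k means "variable k is true", -k means "variable k is false".

def verify(formula, assignment):
    """Check one candidate assignment in time linear in the formula size."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

def brute_force_sat(formula, num_vars):
    """Try all 2**num_vars assignments; return a satisfying one or None."""
    for bits in product([False, True], repeat=num_vars):
        assignment = {i + 1: bits[i] for i in range(num_vars)}
        if verify(formula, assignment):
            return assignment
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
solution = brute_force_sat(formula, 3)
```

Here `verify` plays the role of the polynomial-time check, and `brute_force_sat` is the exhaustive search whose running time doubles with each added variable.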

Formal definition

A decision problem $C$ is NP-complete if:

1. $C$ is in NP, and
2. every problem in NP is reducible to $C$ in polynomial time.

$C$ can be shown to be in NP by demonstrating that a candidate solution to $C$ can be verified in polynomial time.

Note that a problem satisfying condition 2 is said to be NP-hard, whether or not it satisfies condition 1.

A consequence of this definition is that if we had a polynomial time algorithm (on a universal Turing machine, or any other Turing-equivalent abstract machine) for $C$, we could solve all problems in NP in polynomial time.
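The two conditions and their consequence can also be written symbolically. The notation $L \le_p C$ for "L is polynomial-time many-one reducible to C" is an assumption introduced here for illustration; it does not appear in the definition above.

```latex
% Assumed notation: L \le_p C means L is polynomial-time many-one reducible to C.
\[
  L \le_p C \iff \exists f \text{ computable in polynomial time such that }
  \forall x:\; x \in L \Leftrightarrow f(x) \in C
\]
\[
  C \text{ is NP-complete} \iff C \in \mathrm{NP} \;\wedge\; \forall L \in \mathrm{NP}:\; L \le_p C
\]
\[
  C \text{ is NP-complete and } C \in \mathrm{P} \implies \mathrm{NP} \subseteq \mathrm{P}
  \text{, hence } \mathrm{P} = \mathrm{NP}
\]
```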

Background

The concept of NP-completeness was introduced in 1971 (see Cook–Levin theorem), though the term ''NP-complete'' was introduced later. At the 1971 STOC conference, there was a fierce debate between the computer scientists about whether NP-complete problems could be solved in polynomial time on a deterministic Turing machine. John Hopcroft brought everyone at the conference to a consensus that the question of whether NP-complete problems are solvable in polynomial time should be put off to be solved at some later date, since nobody had any formal proofs for their claims one way or the other. This is known as "the question of whether P=NP".

Nobody has yet been able to determine conclusively whether NP-complete problems are in fact solvable in polynomial time, making this one of the great unsolved problems of mathematics. The Clay Mathematics Institute is offering a US$1 million reward to anyone who has a formal proof that P=NP or that P≠NP.

The existence of NP-complete problems is not obvious. The Cook–Levin theorem states that the Boolean satisfiability problem is NP-complete, thus establishing that such problems do exist. In 1972, Richard Karp proved that several other problems were also NP-complete (see Karp's 21 NP-complete problems); thus, there is a class of NP-complete problems (besides the Boolean satisfiability problem). Since the original results, thousands of other problems have been shown to be NP-complete by reductions from other problems previously shown to be NP-complete; many of these problems are collected in Garey and Johnson's 1979 book ''Computers and Intractability: A Guide to the Theory of NP-Completeness''.

NP-complete problems

The easiest way to prove that some new problem is NP-complete is first to prove that it is in NP, and then to reduce some known NP-complete problem to it. Therefore, it is useful to know a variety of NP-complete problems. The list below contains some well-known problems that are NP-complete when expressed as decision problems.

* Boolean satisfiability problem (SAT)
* Knapsack problem
* Hamiltonian path problem
* Travelling salesman problem (decision version)
* Subgraph isomorphism problem
* Subset sum problem
* Clique problem
* Vertex cover problem
* Independent set problem
* Dominating set problem
* Graph coloring problem

The accompanying diagram shows some of the problems and the reductions typically used to prove their NP-completeness. In this diagram, problems are reduced from bottom to top. Note that this diagram is misleading as a description of the mathematical relationship between these problems, as there exists a polynomial-time reduction between any two NP-complete problems; but it indicates where demonstrating this polynomial-time reduction has been easiest.

There is often only a small difference between a problem in P and an NP-complete problem. For example, the 3-satisfiability problem, a restriction of the Boolean satisfiability problem, remains NP-complete, whereas the slightly more restricted 2-satisfiability problem is in P (specifically, it is NL-complete), but the slightly more general MAX 2-SAT problem is again NP-complete. Determining whether a graph can be colored with 2 colors is in P, but with 3 colors is NP-complete, even when restricted to planar graphs. Determining if a graph is a cycle or is bipartite is very easy (in L), but finding a maximum bipartite or a maximum cycle subgraph is NP-complete. A solution of the knapsack problem within any fixed percentage of the optimal solution can be computed in polynomial time, but finding the optimal solution is NP-complete.
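As an illustration of reducing one listed problem to another, the sketch below maps Independent Set instances to Vertex Cover instances. It relies on the standard fact that a vertex set is independent exactly when its complement touches every edge. The function names and the graph representation (vertex count plus edge list) are assumptions made for this example, and the two brute-force deciders exist only to check the mapping on small graphs.

```python
from itertools import combinations

def has_independent_set(n, edges, k):
    """Brute force: is there a set of k vertices with no edge inside it?"""
    for subset in combinations(range(n), k):
        chosen = set(subset)
        if not any(u in chosen and v in chosen for u, v in edges):
            return True
    return False

def has_vertex_cover(n, edges, k):
    """Brute force: is there a set of k vertices touching every edge?"""
    for subset in combinations(range(n), k):
        chosen = set(subset)
        if all(u in chosen or v in chosen for u, v in edges):
            return True
    return False

def reduce_is_to_vc(n, edges, k):
    """Map an Independent Set instance to a Vertex Cover instance.

    (G, k) is a yes-instance of Independent Set exactly when (G, n - k)
    is a yes-instance of Vertex Cover. The mapping only rewrites k.
    """
    return n, edges, n - k

# Path graph 0-1-2-3
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
answers_agree = all(
    has_independent_set(n, edges, k) == has_vertex_cover(*reduce_is_to_vc(n, edges, k))
    for k in range(n + 1)
)
```

The reduction itself (`reduce_is_to_vc`) runs in constant time beyond copying the input, which is what makes it a valid polynomial-time transformation; the exponential cost sits entirely in the deciders.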

Intermediate problems

An interesting example is the graph isomorphism problem, the graph theory problem of determining whether a graph isomorphism exists between two graphs. Two graphs are isomorphic if one can be transformed into the other simply by renaming vertices. Consider these two problems:

* Graph Isomorphism: Is graph G₁ isomorphic to graph G₂?
* Subgraph Isomorphism: Is graph G₁ isomorphic to a subgraph of graph G₂?

The Subgraph Isomorphism problem is NP-complete. The graph isomorphism problem is suspected to be neither in P nor NP-complete, though it is in NP. This is an example of a problem that is thought to be ''hard'', but is not thought to be NP-complete. Problems in NP that are neither in P nor NP-complete are called ''NP-intermediate''; such problems exist if and only if P≠NP.
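The "renaming vertices" definition translates directly into a brute-force check. The sketch below is illustrative only: the function name `are_isomorphic` and the edge-list representation are assumptions, and trying all n! renamings is far slower than the algorithms used in practice. A successful renaming also serves as a certificate that can be checked quickly, which is why the problem is in NP.

```python
from itertools import permutations

def are_isomorphic(n, edges_a, edges_b):
    """Brute force over all n! renamings of the vertices 0..n-1.

    Both graphs are assumed undirected, with each edge given as a pair
    of vertex labels.
    """
    target = {frozenset(e) for e in edges_b}
    if len({frozenset(e) for e in edges_a}) != len(target):
        return False
    for perm in permutations(range(n)):
        renamed = {frozenset((perm[u], perm[v])) for u, v in edges_a}
        if renamed == target:
            return True
    return False

path = [(0, 1), (1, 2), (2, 3)]
relabeled_path = [(2, 0), (0, 3), (3, 1)]
star = [(0, 1), (0, 2), (0, 3)]
```

With these inputs, `are_isomorphic(4, path, relabeled_path)` succeeds because both are four-vertex paths, while `are_isomorphic(4, path, star)` fails because no renaming turns a path into a star.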

Solving NP-complete problems

At present, all known algorithms for NP-complete problems require time that is superpolynomial in the input size, in fact exponential in $O(n^k)$ for some $k > 0$, and it is unknown whether there are any faster algorithms.

The following techniques can be applied to solve computational problems in general, and they often give rise to substantially faster algorithms:

* Approximation: Instead of searching for an optimal solution, search for a solution that is within a guaranteed factor of an optimal one.
* Randomization: Use randomness to get a faster average running time, and allow the algorithm to fail with some small probability. Note: The Monte Carlo method is not an example of an efficient algorithm in this specific sense, although evolutionary approaches like genetic algorithms may be.
* Restriction: By restricting the structure of the input (e.g., to planar graphs), faster algorithms are usually possible.
* Parameterization: Often there are fast algorithms if certain parameters of the input are fixed.
* Heuristic: An algorithm that works "reasonably well" in many cases, but for which there is no proof that it is both always fast and always produces a good result.
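The approximation technique can be illustrated with the well-known matching-based 2-approximation for minimum vertex cover. This is a minimal sketch with an assumed function name and edge-list input, not a tuned implementation. The edges it picks form a matching, and any cover must contain at least one endpoint of each matched edge, so the returned cover is at most twice the optimal size.

```python
def approx_vertex_cover(edges):
    """Return a vertex cover at most twice the size of a minimum one.

    Scan the edges once; whenever an edge has both endpoints uncovered,
    add both endpoints to the cover.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

# Optimal cover here is {0, 3}; the approximation may return up to 4 vertices.
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
cover = approx_vertex_cover(edges)
```

The algorithm runs in time linear in the number of edges, trading optimality for a provable bound on how far the answer can be from the best one.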

Metaheuristic approaches are often used. One example of a heuristic algorithm is a suboptimal $O(n \log n)$ greedy coloring algorithm used for graph coloring during the register allocation phase of some compilers, a technique called graph-coloring global register allocation. Each vertex is a variable, edges are drawn between variables which are being used at the same time, and colors indicate the register assigned to each variable. Because most RISC machines have a fairly large number of general-purpose registers, even a heuristic approach is effective for this application.
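A greedy coloring of an interference graph might look like the sketch below. It is a simplified illustration under stated assumptions: the dictionary-of-sets graph representation, the degree-based ordering, and the name `greedy_coloring` are all choices made here, it ignores spilling when registers run out, and it is not claimed to match the $O(n \log n)$ bound or any particular compiler's allocator.

```python
def greedy_coloring(interference):
    """Give each variable the lowest-numbered register unused by its neighbors.

    `interference` maps each variable to the set of variables that are live
    at the same time. Variables are visited by decreasing degree, which is
    one common ordering heuristic. The result is a valid assignment but not
    necessarily one that uses the fewest registers.
    """
    order = sorted(interference, key=lambda v: len(interference[v]), reverse=True)
    register = {}
    for var in order:
        taken = {register[n] for n in interference[var] if n in register}
        reg = 0
        while reg in taken:
            reg += 1
        register[var] = reg
    return register

# Hypothetical interference graph: a, b, c are mutually live; d overlaps only c.
interference = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
registers = greedy_coloring(interference)
```

In this example three registers are needed because `a`, `b`, and `c` interfere pairwise, and `d` can reuse a register that `c` does not hold.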

Completeness under different types of reduction

In the definition of NP-complete given above, the term ''reduction'' was used in the technical meaning of a polynomial-time many-one reduction.

Another type of reduction is polynomial-time Turing reduction. A problem $X$ is polynomial-time Turing-reducible to a problem $Y$ if, given a subroutine that solves $Y$ in polynomial time, one could write a program that calls this subroutine and solves $X$ in polynomial time. This contrasts with many-one reducibility, which has the restriction that the program can only call the subroutine once, and the return value of the subroutine must be the return value of the program.

If one defines the analogue to NP-complete with Turing reductions instead of many-one reductions, the resulting set of problems will not be smaller than NP-complete; it is an open question whether it will be any larger.

Another type of reduction that is also often used to define NP-completeness is the logarithmic-space many-one reduction, which is a many-one reduction that can be computed with only a logarithmic amount of space. Since every computation that can be done in logarithmic space can also be done in polynomial time, it follows that if there is a logarithmic-space many-one reduction then there is also a polynomial-time many-one reduction. This type of reduction is more refined than the more usual polynomial-time many-one reductions and it allows us to distinguish more classes such as P-complete. Whether under these types of reductions the definition of NP-complete changes is still an open problem. All currently known NP-complete problems are NP-complete under log space reductions. All currently known NP-complete problems remain NP-complete even under much weaker reductions such as AC⁰ reductions and NC⁰ reductions. Some NP-complete problems such as SAT are known to be complete even under polylogarithmic time projections. It is known, however, that AC⁰ reductions define a strictly smaller class than polynomial-time reductions.
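The difference between a Turing reduction and a many-one reduction shows up in how often the subroutine is called. The sketch below takes the search version of SAT as $X$ and the decision version as $Y$, and finds a satisfying assignment using one oracle call per variable plus one initial call. The names `sat_decision_oracle` and `find_assignment` and the signed-integer clause encoding are assumptions for illustration, and the oracle is a brute-force placeholder standing in for the hypothetical polynomial-time subroutine.

```python
from itertools import product

def sat_decision_oracle(formula, num_vars):
    """Placeholder for the assumed subroutine that decides satisfiability.

    Implemented here by exhaustive search, so it is exponential; only the
    number of calls made by the reduction below is the point.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in formula):
            return True
    return False

def find_assignment(formula, num_vars, oracle=sat_decision_oracle):
    """Turing-reduce SAT search to SAT decision with num_vars + 1 oracle calls."""
    if not oracle(formula, num_vars):
        return None
    fixed = list(formula)
    assignment = {}
    for var in range(1, num_vars + 1):
        # Try forcing var to True with a unit clause. If the formula stays
        # satisfiable keep that choice, otherwise var must be False.
        if oracle(fixed + [[var]], num_vars):
            fixed.append([var])
            assignment[var] = True
        else:
            fixed.append([-var])
            assignment[var] = False
    return assignment

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
assignment = find_assignment(formula, 3)
```

A many-one reduction would be limited to a single oracle call whose answer is returned unchanged, so this multi-call, answer-dependent pattern is only available under Turing reductions.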

Naming

According to Donald Knuth, the name "NP-complete" was popularized by Alfred Aho, John Hopcroft and Jeffrey Ullman in their celebrated textbook "The Design and Analysis of Computer Algorithms". He reports that they introduced the change in the galley proofs for the book (from "polynomially-complete"), in accordance with the results of a poll he had conducted of the theoretical computer science community. Other suggestions made in the poll included "Herculean", "formidable", Steiglitz's "hard-boiled" in honor of Cook, and Shen Lin's acronym "PET", which stood for "probably exponential time", but depending on which way the P versus NP problem went, could stand for "provably exponential time" or "previously exponential time".

Common misconceptions

The following misconceptions are frequent.

* ''"NP-complete problems are the most difficult known problems."'' Since NP-complete problems are in NP, their running time is at most exponential. However, some problems have been proven to require more time, for example Presburger arithmetic. Of some problems, it has even been proven that they can never be solved at all, for example the halting problem.
* ''"NP-complete problems are difficult because there are so many different solutions."'' On the one hand, there are many problems that have a solution space just as large, but can be solved in polynomial time (for example minimum spanning tree). On the other hand, there are NP-problems with at most one solution that are NP-hard under randomized polynomial-time reduction (see Valiant–Vazirani theorem).
* ''"Solving NP-complete problems requires exponential time."'' First, this would imply P≠NP, which is still an unsolved question. Further, some NP-complete problems actually have algorithms running in superpolynomial, but subexponential time such as $O(2^{\sqrt{n}} n)$. For example, the independent set and dominating set problems for planar graphs are NP-complete, but can be solved in subexponential time using the planar separator theorem.
* ''"Each instance of an NP-complete problem is difficult."'' Often some instances, or even most instances, may be easy to solve within polynomial time. However, unless P=NP, any polynomial-time algorithm must asymptotically be wrong on more than polynomially many of the exponentially many inputs of a certain size.
* ''"If P=NP, all cryptographic ciphers can be broken."'' A polynomial-time problem can be very difficult to solve in practice if the polynomial's degree or constants are large enough. In addition, information-theoretic security provides cryptographic methods that cannot be broken even with unlimited computing power.
* ''"A large-scale quantum computer would be able to efficiently solve NP-complete problems."'' The class of decision problems that can be efficiently solved (in principle) by a fault-tolerant quantum computer is known as BQP. However, BQP is not believed to contain all of NP, and if it does not, then it cannot contain any NP-complete problem.

Properties

Viewing a decision problem as a formal language in some fixed encoding, the set NPC of all NP-complete problems is not closed under:

* union
* intersection
* concatenation
* Kleene star

It is not known whether NPC is closed under complementation, since NPC=co-NPC if and only if NP=co-NP, and whether NP=co-NP is an open question.
