
The travelling salesman problem (also called the travelling salesperson problem or TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" It is an NP-hard problem in combinatorial optimization, important in theoretical computer science and operations research. The travelling purchaser problem and the vehicle routing problem are both generalizations of TSP.

In the theory of computational complexity, the decision version of the TSP (where, given a length ''L'', the task is to decide whether the graph has a tour of length at most ''L'') belongs to the class of NP-complete problems. Thus, it is possible that the worst-case running time for any algorithm for the TSP increases superpolynomially (but no more than exponentially) with the number of cities.

The problem was first formulated in 1930 and is one of the most intensively studied problems in optimization. It is used as a benchmark for many optimization methods. Even though the problem is computationally difficult, many heuristics and exact algorithms are known, so that some instances with tens of thousands of cities can be solved completely and even problems with millions of cities can be approximated within a small fraction of 1%.

The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing. In these applications, the concept ''city'' represents, for example, customers, soldering points, or DNA fragments, and the concept ''distance'' represents travelling times or cost, or a similarity measure between DNA fragments. The TSP also appears in astronomy, as astronomers observing many sources will want to minimize the time spent moving the telescope between the sources; in such problems, the TSP can be embedded inside an optimal control problem. In many applications, additional constraints such as limited resources or time windows may be imposed.


History

The origins of the travelling salesman problem are unclear. A handbook for travelling salesmen from 1832 mentions the problem and includes example tours through Germany and Switzerland, but contains no mathematical treatment.

The TSP was mathematically formulated in the 19th century by the Irish mathematician William Rowan Hamilton and by the British mathematician Thomas Kirkman. Hamilton's icosian game was a recreational puzzle based on finding a Hamiltonian cycle. The general form of the TSP appears to have been first studied by mathematicians during the 1930s in Vienna and at Harvard, notably by Karl Menger, who defined the problem, considered the obvious brute-force algorithm, and observed the non-optimality of the nearest neighbour heuristic.

It was first considered mathematically in the 1930s by Merrill M. Flood, who was looking to solve a school bus routing problem. Hassler Whitney at Princeton University generated interest in the problem, which he called the "48 states problem". The earliest publication using the phrase "travelling salesman problem" was the 1949 RAND Corporation report by Julia Robinson, "On the Hamiltonian game (a traveling salesman problem)." A detailed treatment of the connection between Menger and Whitney as well as the growth in the study of TSP can be found in .

In the 1950s and 1960s, the problem became increasingly popular in scientific circles in Europe and the United States after the RAND Corporation in Santa Monica offered prizes for steps in solving the problem. Notable contributions were made by George Dantzig, Delbert Ray Fulkerson and Selmer M. Johnson from the RAND Corporation, who expressed the problem as an integer linear program and developed the cutting plane method for its solution. They wrote what is considered the seminal paper on the subject, in which, with these new methods, they solved an instance with 49 cities to optimality by constructing a tour and proving that no other tour could be shorter. Dantzig, Fulkerson and Johnson speculated that, given a near-optimal solution, one may be able to find an optimal solution or prove optimality by adding a small number of extra inequalities (cuts). They used this idea to solve their initial 49-city problem using a string model, and found that they needed only 26 cuts. While this paper did not give an algorithmic approach to TSP problems, its ideas were indispensable to the later creation of exact solution methods for the TSP, though it would take 15 years to find an algorithmic approach to creating these cuts. As well as cutting plane methods, Dantzig, Fulkerson and Johnson used branch and bound algorithms, perhaps for the first time.

In 1959, Jillian Beardwood, J. H. Halton and John Hammersley published an article entitled "The Shortest Path Through Many Points" in the journal of the Cambridge Philosophical Society. The Beardwood–Halton–Hammersley theorem provides a practical estimate for the travelling salesman problem: the authors derived an asymptotic formula for the length of the shortest route for a salesman who starts at a home or office and visits a fixed number of locations before returning to the start.

In the following decades, the problem was studied by many researchers from mathematics, computer science, chemistry, physics, and other sciences. In the 1960s, however, a new approach was created that, instead of seeking optimal solutions, would produce a solution whose length is provably bounded by a multiple of the optimal length, and in doing so would create lower bounds for the problem; these lower bounds would then be used with branch and bound approaches. One method of doing this was to create a minimum spanning tree of the graph and then double all its edges, which produces the bound that the length of an optimal tour is at most twice the weight of a minimum spanning tree.

In 1976, Christofides and Serdyukov, independently of each other, made a big advance in this direction: the Christofides–Serdyukov algorithm yields a solution that, in the worst case, is at most 1.5 times as long as the optimal solution. As the algorithm was simple and quick, many hoped it would give way to a near-optimal solution method. However, this hope for improvement did not immediately materialize, and Christofides–Serdyukov remained the method with the best worst-case guarantee until 2011, when a (very) slightly improved approximation algorithm was developed for the subset of "graphical" TSPs. In 2020, this tiny improvement was extended to the full (metric) TSP.
Richard M. Karp showed in 1972 that the Hamiltonian cycle problem was NP-complete, which implies the NP-hardness of TSP. This supplied a mathematical explanation for the apparent computational difficulty of finding optimal tours.

Great progress was made in the late 1970s and 1980s, when Grötschel, Padberg, Rinaldi and others managed to exactly solve instances with up to 2,392 cities, using cutting planes and branch and bound.

In the 1990s, Applegate, Bixby, Chvátal, and Cook developed the program ''Concorde'' that has been used in many recent record solutions. Gerhard Reinelt published the TSPLIB in 1991, a collection of benchmark instances of varying difficulty, which has been used by many research groups for comparing results. In 2006, Cook and others computed an optimal tour through an 85,900-city instance given by a microchip layout problem, currently the largest solved TSPLIB instance. For many other instances with millions of cities, solutions can be found that are guaranteed to be within 2–3% of an optimal tour.


Description


As a graph problem

TSP can be modelled as an undirected weighted graph, such that cities are the graph's vertices, paths are the graph's edges, and a path's distance is the edge's weight. It is a minimization problem starting and finishing at a specified vertex after having visited each other vertex exactly once. Often, the model is a complete graph (i.e., each pair of vertices is connected by an edge). If no path exists between two cities, adding a sufficiently long edge will complete the graph without affecting the optimal tour.
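
The graph model can be made concrete with a small sketch. The following Python example is illustrative only: the distance matrix, the constant `BIG`, and the function names are assumptions chosen for this example, and the exhaustive search is practical only for a handful of cities. It stores the weighted graph as a distance matrix, replaces a missing road between cities 0 and 2 with a very long edge, and checks every tour that starts and ends at city 0.

```python
from itertools import permutations

def tour_length(dist, tour):
    """Sum the edge weights along the closed tour (it returns to tour[0])."""
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))

def brute_force_tsp(dist, start=0):
    """Try every ordering of the remaining vertices; viable only for tiny n."""
    others = [v for v in range(len(dist)) if v != start]
    candidates = ((start,) + p for p in permutations(others))
    best = min(candidates, key=lambda t: tour_length(dist, t))
    return best, tour_length(dist, best)

# Hypothetical instance: no road exists between cities 0 and 2, so a
# "sufficiently long" edge stands in for it and completes the graph.
BIG = 10**6
dist = [
    [0,   2,   BIG, 3],
    [2,   0,   4,   6],
    [BIG, 4,   0,   5],
    [3,   6,   5,   0],
]

best_tour, best_len = brute_force_tsp(dist)
print(best_tour, best_len)
```

Because the artificial edge is longer than any tour that avoids it, the optimal tour never uses it, which is why completing the graph this way leaves the optimum unchanged.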


Asymmetric and symmetric

In the ''symmetric TSP'', the distance between two cities is the same in each opposite direction, forming an undirected graph. This symmetry halves the number of possible solutions. In the ''asymmetric TSP'', paths may not exist in both directions or the distances might be different, forming a directed graph. Traffic collisions, one-way streets, and airfares for cities with different departure and arrival fees are examples of how this symmetry could break down.
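
As a rough illustration of the halving, the sketch below counts the distinct tours of a complete graph from a fixed starting city: (n-1)! orderings in the asymmetric case, and half as many in the symmetric case, where a tour and its reverse have the same length. The function names are hypothetical and the count assumes at least three cities.

```python
from math import factorial

def is_symmetric(dist):
    """True if the distance from i to j equals the distance from j to i for all pairs."""
    n = len(dist)
    return all(dist[i][j] == dist[j][i] for i in range(n) for j in range(i + 1, n))

def count_tours(n, symmetric):
    """Distinct tours through n >= 3 cities on a complete graph, start city fixed."""
    total = factorial(n - 1)          # orderings of the remaining n - 1 cities
    return total // 2 if symmetric else total

# Hypothetical one-way-street effect: going 0 -> 1 costs more than 1 -> 0.
asymmetric = [[0, 9, 4], [2, 0, 5], [4, 5, 0]]
symmetric = [[0, 2, 4], [2, 0, 5], [4, 5, 0]]

print(is_symmetric(asymmetric), count_tours(5, symmetric=False))
print(is_symmetric(symmetric), count_tours(5, symmetric=True))
```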


Related problems

* An equivalent formulation in terms of graph theory is: Given a complete weighted graph (where the vertices would represent the cities, the edges would represent the roads, and the weights would be the cost or distance of that road), find a Hamiltonian cycle with the least weight.
* The requirement of returning to the starting city does not change the computational complexity of the problem; see Hamiltonian path problem.
* Another related problem is the bottleneck travelling salesman problem (bottleneck TSP): Find a Hamiltonian cycle in a weighted graph with the minimal weight of the weightiest edge. For example, avoiding narrow streets with big buses. The problem is of considerable practical importance, apart from evident transportation and logistics areas. A classic example is in printed circuit manufacturing: scheduling of a route of the drill machine to drill holes in a PCB. In robotic machining or drilling applications, the "cities" are parts to machine or holes (of different sizes) to drill, and the "cost of travel" includes time for retooling the robot (single machine job sequencing problem).
* The generalized travelling salesman problem, also known as the "travelling politician problem", deals with "states" that have (one or more) "cities", and the salesman has to visit exactly one "city" from each "state". One application is encountered in ordering a solution to the cutting stock problem in order to minimize knife changes. Another is concerned with drilling in semiconductor manufacturing. Noon and Bean demonstrated that the generalized travelling salesman problem can be transformed into a standard TSP with the same number of cities, but a modified distance matrix.
* The sequential ordering problem deals with the problem of visiting a set of cities where precedence relations between the cities exist.
* A common interview question at Google is how to route data among data processing nodes; routes vary by time to transfer the data, but nodes also differ by their computing power and storage, compounding the problem of where to send data.
* The travelling purchaser problem deals with a purchaser who is charged with purchasing a set of products. The purchaser can buy these products in several cities, but at different prices, and not all cities offer the same products. The objective is to find a route between a subset of the cities that minimizes total cost (travel cost + purchasing cost) and enables the purchase of all required products.
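
To illustrate how the bottleneck variant listed above differs from the standard problem, the following sketch runs the same exhaustive search with two objectives: total tour length (`sum`) and heaviest edge (`max`). The four-city instance and all function names are made up for this example, and the search is suitable only for very small inputs.

```python
from itertools import permutations

def closed_edges(tour):
    """Consecutive city pairs along the tour, including the edge back to the start."""
    n = len(tour)
    return [(tour[k], tour[(k + 1) % n]) for k in range(n)]

def best_tour(dist, objective):
    """Exhaustive search from city 0; objective is sum (TSP) or max (bottleneck TSP)."""
    n = len(dist)
    tours = ((0,) + p for p in permutations(range(1, n)))
    return min(tours, key=lambda t: objective(dist[a][b] for a, b in closed_edges(t)))

# Hypothetical symmetric instance in which the two objectives disagree.
dist = [
    [0, 1, 5, 7],
    [1, 0, 1, 4],
    [5, 1, 0, 1],
    [7, 4, 1, 0],
]

shortest = best_tour(dist, sum)    # minimizes total length
smoothest = best_tour(dist, max)   # minimizes the heaviest edge used
print(shortest, smoothest)
```

In this instance the shortest tour has total length 10 but uses an edge of weight 7, while the bottleneck-optimal tour has total length 11 and no edge heavier than 5.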


Integer linear programming formulations

The TSP can be formulated as an integer linear program. Several formulations are known. Two notable formulations are the Miller–Tucker–Zemlin (MTZ) formulation and the Dantzig–Fulkerson–Johnson (DFJ) formulation. The DFJ formulation is stronger, though the MTZ formulation is still useful in certain settings.

Common to both these formulations is that one labels the cities with the numbers 1,\ldots,n and takes c_{ij} > 0 to be the distance from city i to city j. The main variables in the formulations are:
: x_{ij} = \begin{cases} 1 & \text{if the path goes from city } i \text{ to city } j \\ 0 & \text{otherwise} \end{cases}
It is because these are 0/1 variables that the formulations become integer programs; all other constraints are purely linear. In particular, the objective in the program is to
: minimize the tour length \sum_{i=1}^n \sum_{j=1, j \ne i}^n c_{ij} x_{ij} .
Without further constraints, the \{x_{ij}\}_{i,j} will however effectively range over all subsets of the set of edges, which is very far from the sets of edges in a tour, and allows for a trivial minimum where all x_{ij} = 0 . Therefore, both formulations also have the constraints that at each vertex there is exactly one incoming edge and one outgoing edge, which may be expressed as the 2n linear equations
: \sum_{i=1, i \ne j}^n x_{ij} = 1 for j=1, \ldots, n and \sum_{j=1, j \ne i}^n x_{ij} = 1 for i=1, \ldots, n .
These ensure that the chosen set of edges locally looks like that of a tour, but still allow for solutions violating the global requirement that there is ''one'' tour which visits all vertices, as the edges chosen could make up several tours each visiting only a subset of the vertices; arguably it is this global requirement that makes TSP a hard problem. The MTZ and DFJ formulations differ in how they express this final requirement as linear constraints.
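
The gap between the degree constraints and a genuine tour can be checked numerically. The sketch below is a minimal illustration, not a solver: it uses 0-based city indices rather than the labels 1, …, n, a made-up cost matrix c_{ij} = |i - j|, and hypothetical helper names. It evaluates the objective and the 2n degree equations for three candidate assignments of the x_{ij} variables.

```python
def objective(c, x):
    """Tour-length objective: sum of c[i][j] * x[i][j] over all i != j."""
    n = len(c)
    return sum(c[i][j] * x[i][j] for i in range(n) for j in range(n) if i != j)

def degree_constraints_hold(x):
    """Exactly one incoming and one outgoing edge at every city."""
    n = len(x)
    one_in = all(sum(x[i][j] for i in range(n) if i != j) == 1 for j in range(n))
    one_out = all(sum(x[i][j] for j in range(n) if j != i) == 1 for i in range(n))
    return one_in and one_out

def from_successors(succ):
    """Build the 0/1 matrix x from a list where succ[i] is the city visited after i."""
    n = len(succ)
    x = [[0] * n for _ in range(n)]
    for i, j in enumerate(succ):
        x[i][j] = 1
    return x

def cities_on_cycle_through(x, start=0):
    """Count the cities on the cycle containing start (x must satisfy the degree constraints)."""
    n = len(x)
    count, v = 0, start
    while True:
        v = next(j for j in range(n) if j != v and x[v][j] == 1)
        count += 1
        if v == start:
            return count

n = 6
c = [[abs(i - j) for j in range(n)] for i in range(n)]   # illustrative costs only
all_zero = [[0] * n for _ in range(n)]
two_subtours = from_successors([1, 2, 0, 4, 5, 3])       # 0->1->2->0 and 3->4->5->3
one_tour = from_successors([1, 2, 3, 4, 5, 0])           # 0->1->2->3->4->5->0

for name, x in [("all zero", all_zero), ("two subtours", two_subtours), ("one tour", one_tour)]:
    print(name, objective(c, x), degree_constraints_hold(x))
```

The all-zero assignment has objective 0 but violates the degree equations. The two-subtour assignment satisfies them and is cheaper than the single tour in this instance, yet the cycle through city 0 covers only three of the six cities, which is exactly the situation the MTZ and DFJ constraints are meant to exclude.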


Miller–Tucker–Zemlin formulation

In addition to the x_ variables as above, there is for each i=2,\ldots,n a dummy variable u_i that keeps track of the order in which the cities are visited, counting from city 1; the interpretation is that u_i < u_j implies city i is visited before city j. For a given tour (as encoded into values of the x_ variables), one may find satisfying values for the u_i variables by making u_i equal to the number of edges along that tour, when going from city 1 to city i. Because linear programming favours non-strict inequalities ( \ge ) over strict ( > ), we would like to impose constraints to the effect that : u_j \ge u_i + 1 if x_ = 1 . Merely requiring u_j \geq u_i + x_ would ''not'' achieve that, because this also requires u_j \geq u_i when x_ = 0 , which is not correct. Instead MTZ use the (n-1)(n-2) linear constraints : u_j + (n-2) \ge u_i + (n-1) x_ for all distinct i,j \in \ where the constant term n-2 provides sufficient slack that x_ = 0 does not impose a relation between u_j and u_i . The way that the u_i variables then enforce that a single tour visits all cities is that they increase by (at least) 1 for each step along a tour, with a decrease only allowed where the tour passes through city  1 . That constraint would be violated by every tour which does not pass through city  1 , so the only way to satisfy it is that the tour passing city  1 also passes through all other cities. The MTZ formulation of TSP is thus the following integer linear programming problem: :\begin \min \sum_^n \sum_^nc_x_&\colon && \\ x_ \in& \ && i,j=1, \ldots, n; \\ u_ \in& \mathbf && i=2, \ldots, n; \\ \sum_^n x_ =& 1 && j=1, \ldots, n; \\ \sum_^n x_ =& 1 && i=1, \ldots, n; \\ u_i-u_j +(n-1)x_ \le& n-2 && 2 \le i \ne j \le n; \\ 1 \le u_i \le& n-1 && 2 \le i \le n. \end The first set of equalities requires that each city is arrived at from exactly one other city, and the second set of equalities requires that from each city there is a departure to exactly one other city. 
The last constraints enforce that there is only a single tour covering all cities, and not two or more disjoint tours that only collectively cover all cities. To prove this, it is shown below (1) that every feasible solution contains only one closed sequence of cities, and (2) that for every single tour covering all cities, there are values for the dummy variables $u_i$ that satisfy the constraints.

To prove that every feasible solution contains only one closed sequence of cities, it suffices to show that every subtour in a feasible solution passes through city 1 (noting that the equalities ensure there can only be one such tour). For if we sum all the inequalities corresponding to $x_{ij}=1$ for any subtour of ''k'' steps not passing through city 1, the $u_i$ terms cancel around the cycle and we obtain
: $(n-1)k \leq (n-2)k,$
which is a contradiction.

It now must be shown that for every single tour covering all cities, there are values for the dummy variables $u_i$ that satisfy the constraints. Without loss of generality, define the tour as originating (and ending) at city 1. Choose $u_i = t-1$ if city $i$ is visited in step $t$ ($t=2,3,\ldots,n$), so that $1 \le u_i \le n-1$. Then
: $u_i-u_j\le n-2,$
since $u_i$ can be no greater than $n-1$ and $u_j$ can be no less than 1; hence the constraints are satisfied whenever $x_{ij}=0$. For $x_{ij}=1$, city $j$ is visited one step after city $i$, so we have
: $u_i - u_j + (n-1)x_{ij} = (t-1) - t + (n-1) = n-2,$
satisfying the constraint.
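The two halves of this argument can be checked numerically on a small instance. The following Python sketch is only an illustration: the helper names (`successor_map`, `mtz_order_values`, `mtz_feasible`, `exists_feasible_u`) are hypothetical, and the exhaustive search over all $u$ values is viable only for very small $n$. It builds $u_i$ as the number of edges from city 1 to city $i$ and verifies every MTZ inequality, then confirms that an assignment made of two subtours admits no feasible $u$ at all.

```python
from itertools import product

def successor_map(cycles):
    """Encode x as a dict: succ[i] == j means x_ij = 1."""
    succ = {}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            succ[a] = b
    return succ

def mtz_order_values(tour):
    """u_i = number of edges from city 1 to city i (tour must start at city 1)."""
    return {city: pos for pos, city in enumerate(tour) if city != 1}

def mtz_feasible(succ, u, n):
    """Check 1 <= u_i <= n-1 and u_i - u_j + (n-1) x_ij <= n-2 for 2 <= i != j <= n."""
    for i in range(2, n + 1):
        if not 1 <= u[i] <= n - 1:
            return False
        for j in range(2, n + 1):
            if i != j:
                x_ij = 1 if succ[i] == j else 0
                if u[i] - u[j] + (n - 1) * x_ij > n - 2:
                    return False
    return True

def exists_feasible_u(succ, n):
    """Try every integer vector u in [1, n-1]^(n-1); exponential, tiny n only."""
    cities = range(2, n + 1)
    for values in product(range(1, n), repeat=n - 1):
        if mtz_feasible(succ, dict(zip(cities, values)), n):
            return True
    return False

n = 5
tour = [1, 3, 5, 2, 4]
single = successor_map([tour])
split = successor_map([[1, 2], [3, 4, 5]])  # subtour 3-4-5 avoids city 1

print(mtz_feasible(single, mtz_order_values(tour), n))  # True
print(exists_feasible_u(split, n))                      # False
```

The second result reflects the contradiction $(n-1)k \le (n-2)k$: the subtour 3, 4, 5 never passes through city 1, so no choice of $u$ can satisfy its three inequalities simultaneously.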


Dantzig–Fulkerson–Johnson formulation

Label the cities with the numbers 1, …, ''n'' and define:
: $x_{ij} = \begin{cases} 1 & \text{the path goes from city } i \text{ to city } j \\ 0 & \text{otherwise} \end{cases}$
Take $c_{ij} > 0$ to be the distance from city ''i'' to city ''j''. Then TSP can be written as the following integer linear programming problem:
\begin{align}
\min &\sum_{i=1}^n \sum_{j\ne i,j=1}^n c_{ij}x_{ij}\colon && \\
& \sum_{i=1,i\ne j}^n x_{ij} = 1 && j=1, \ldots, n; \\
& \sum_{j=1,j\ne i}^n x_{ij} = 1 && i=1, \ldots, n; \\
& \sum_{i \in Q}\sum_{j \ne i, j \in Q} x_{ij} \leq |Q| - 1 && \forall Q \subsetneq \{1, \ldots, n\}, |Q| \geq 2
\end{align}
The last constraint of the DFJ formulation, called a ''subtour elimination'' constraint, ensures no proper subset ''Q'' can form a sub-tour, so the solution returned is a single tour and not the union of smaller tours. Because this leads to an exponential number of possible constraints, in practice it is solved with row generation.
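Row generation needs a way to find a violated subtour elimination constraint for the current solution. For an integral solution this is simple, as the following Python sketch suggests; it assumes the solution is given as a successor map (`succ[i] == j` meaning $x_{ij} = 1$, with no city mapped to itself), and the function names are illustrative rather than taken from any solver. Each cycle that does not cover all cities is a set ''Q'' whose internal edges sum to $|Q|$, exceeding the allowed $|Q| - 1$.

```python
def find_subtours(succ):
    """Split a successor map (city -> next city) into its cycles."""
    unvisited = set(succ)
    cycles = []
    while unvisited:
        city = min(unvisited)          # deterministic starting point
        cycle = []
        while city in unvisited:
            unvisited.remove(city)
            cycle.append(city)
            city = succ[city]
        cycles.append(cycle)
    return cycles

def violated_dfj_sets(succ):
    """Return the city sets Q whose subtour elimination constraint is violated."""
    cycles = find_subtours(succ)
    if len(cycles) == 1:
        return []                      # a single tour: nothing to cut off
    return [set(cycle) for cycle in cycles]

two_loops = {1: 2, 2: 1, 3: 4, 4: 5, 5: 3}
print(violated_dfj_sets(two_loops))                # [{1, 2}, {3, 4, 5}]
print(violated_dfj_sets({1: 2, 2: 3, 3: 1}))       # []
```

In a row-generation loop, the constraints for the returned sets would be added to the relaxed problem, which is then solved again until no violated set remains. Separating fractional solutions requires a different technique (for example a minimum cut computation), which this sketch does not attempt.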


Computing a solution

The traditional lines of attack for NP-hard problems are the following:
* Devising exact algorithms, which work reasonably fast only for small problem sizes.
* Devising "suboptimal" or heuristic algorithms, i.e., algorithms that deliver approximate solutions in a reasonable time.
* Finding special cases for the problem ("subproblems") for which either better or exact heuristics are possible.


Exact algorithms

The most direct solution would be to try all permutations (ordered combinations) and see which one is cheapest (using brute-force search). The running time for this approach lies within a polynomial factor of O(n!), the factorial of the number of cities, so this solution becomes impractical even for only 20 cities.

One of the earliest applications of dynamic programming is the Held–Karp algorithm, which solves the problem in time O(n^2 2^n). This bound has also been reached by inclusion–exclusion in an attempt preceding the dynamic programming approach. Improving these time bounds seems to be difficult. For example, it has not been determined whether a classical exact algorithm for TSP that runs in time O(1.9999^n) exists. The currently best quantum exact algorithm for TSP, due to Ambainis et al., runs in time O(1.728^n).

Other approaches include:
* Various branch-and-bound algorithms, which can be used to process TSPs containing 40–60 cities.
* Progressive improvement algorithms, which use techniques reminiscent of linear programming. These work well for up to 200 cities.
* Implementations of branch-and-bound and problem-specific cut generation (branch-and-cut); this is the method of choice for solving large instances. This approach holds the current record, solving an instance with 85,900 cities.

An exact solution for 15,112 German towns from TSPLIB was found in 2001 using the
cutting-plane method proposed by George Dantzig, Ray Fulkerson, and Selmer M. Johnson in 1954, based on linear programming. The computations were performed on a network of 110 processors located at Rice University and Princeton University. The total computation time was equivalent to 22.6 years on a single 500 MHz Alpha processor. In May 2004, the travelling salesman problem of visiting all 24,978 towns in Sweden was solved: a tour of length approximately 72,500 kilometres was found and it was proven that no shorter tour exists. In March 2005, the travelling salesman problem of visiting all 33,810 points in a circuit board was solved using ''Concorde TSP Solver'': a tour of length 66,048,945 units was found and it was proven that no shorter tour exists. The computation took approximately 15.7 CPU-years (Cook et al. 2006). In April 2006 an instance with 85,900 points was solved using ''Concorde TSP Solver'', taking over 136 CPU-years.
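To make the contrast between brute-force search and the Held–Karp dynamic programme concrete, here is a minimal Python sketch of both. It assumes the instance is given as a square matrix `dist` of pairwise distances with city 0 as the fixed start; the function names and the sample matrix are illustrative only. The table `best` stores, for each set of visited cities and each end city, the length of the shortest path from city 0 through exactly that set.

```python
from itertools import combinations, permutations

def held_karp(dist):
    """Length of an optimal tour, in O(n^2 2^n) time and O(n 2^n) space."""
    n = len(dist)
    if n == 1:
        return 0
    # best[(mask, j)]: shortest path from city 0 that visits exactly the
    # cities whose bits are set in mask (cities 1..n-1) and ends at city j.
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask & ~(1 << j)
                best[(mask, j)] = min(
                    best[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2                      # bits 1..n-1 set
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))

def brute_force(dist):
    """Try every permutation of cities 1..n-1; O(n!) and only for tiny n."""
    n = len(dist)
    return min(
        sum(dist[a][b] for a, b in zip((0,) + p, p + (0,)))
        for p in permutations(range(1, n))
    )

dist = [
    [0, 2, 9, 10],
    [1, 0, 6, 4],
    [15, 7, 0, 8],
    [6, 3, 12, 0],
]
print(held_karp(dist), brute_force(dist))    # 21 21
```

Both functions return only the optimal length; recovering the tour itself would require storing the minimising predecessor for each table entry. The exponential memory use of the table is what limits this approach in practice long before the running time does.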


Heuristic and approximation algorithms

Various heuristics and approximation algorithms, which quickly yield good solutions, have been devised. These include the multi-fragment algorithm. Modern methods can find solutions for extremely large problems (millions of cities) within a reasonable time that are, with high probability, just 2–3% away from the optimal solution. Several categories of heuristics are recognized.


Constructive heuristics

The nearest neighbour (NN) algorithm (a greedy algorithm) lets the salesman choose the nearest unvisited city as his next move. This algorithm quickly yields an effectively short route. For N cities randomly distributed on a plane, the algorithm on average yields a path 25% longer than the shortest possible path. However, there exist many specially arranged city distributions which make the NN algorithm give the worst route. This is true for both asymmetric and symmetric TSPs. Rosenkrantz et al. showed that the NN algorithm has the approximation factor $\Theta(\log |V|)$ for instances satisfying the triangle inequality. A variation of the NN algorithm, called the nearest fragment (NF) operator, which connects a group (fragment) of nearest unvisited cities, can find shorter routes with successive iterations. The NF operator can also be applied on an initial solution obtained by the NN algorithm for further improvement in an elitist model, where only better solutions are accepted.

The bitonic tour of a set of points is the minimum-perimeter monotone polygon that has the points as its vertices; it can be computed efficiently by dynamic programming.

Another constructive heuristic, Match Twice and Stitch (MTS), performs two sequential matchings, where the second matching is executed after deleting all the edges of the first matching, to yield a set of cycles. The cycles are then stitched to produce the final tour.
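The nearest neighbour rule described at the start of this section is short enough to state in full. The sketch below is a minimal Python version, assuming cities are points in the plane and distances are Euclidean; the function names, the `start` parameter and the tie-breaking by city index are choices made for this illustration.

```python
import math

def tour_length(points, tour):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[a], points[b])
        for a, b in zip(tour, tour[1:] + tour[:1])
    )

def nearest_neighbour(points, start=0):
    """Greedy construction: always move to the closest unvisited city."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        here = points[tour[-1]]
        # Ties are broken by the smaller city index to keep runs reproducible.
        nxt = min(unvisited, key=lambda c: (math.dist(here, points[c]), c))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

points = [(0, 0), (3, 0), (1, 0), (6, 0)]
tour = nearest_neighbour(points)
print(tour, tour_length(points, tour))   # [0, 2, 1, 3] 12.0
```

Each step scans all unvisited cities, so this version takes O(n^2) time overall. The result depends on the starting city, which is why implementations often try several starts and keep the best tour.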


The Algorithm of Christofides and Serdyukov

The algorithm of Christofides and Serdyukov follows a similar outline but combines the minimum spanning tree with a solution of another problem, minimum-weight perfect matching. This gives a TSP tour which is at most 1.5 times the optimal. It was one of the first approximation algorithms, and was in part responsible for drawing attention to approximation algorithms as a practical approach to intractable problems. As a matter of fact, the term "algorithm" was not commonly extended to approximation algorithms until later; the Christofides algorithm was initially referred to as the Christofides heuristic.

This algorithm looks at things differently by using a result from graph theory which helps improve on the bound for the TSP that comes from doubling the cost of the minimum spanning tree. Given an Eulerian graph, we can find an Eulerian tour in $O(n)$ time. So if we had an Eulerian graph with cities from a TSP as vertices, then we could use such a method for finding an Eulerian tour to find a TSP solution. By the triangle inequality we know that the TSP tour can be no longer than the Eulerian tour, and as such the Eulerian tour gives an upper bound on the length of the TSP tour obtained. Such a method is described below.
# Find a minimum spanning tree for the problem.
# Create duplicates for every edge to create an Eulerian graph.
# Find an Eulerian tour for this graph.
# Convert to TSP: if a city is visited twice, create a shortcut from the city before this in the tour to the one after this.

To improve the bound, a better way of creating an Eulerian graph is needed. By the triangle inequality, the best Eulerian graph must have the same cost as the best travelling salesman tour; hence finding optimal Eulerian graphs is at least as hard as TSP. One way of doing this is by minimum weight matching, using algorithms with a complexity of O(n^3).

Making a graph into an Eulerian graph starts with the minimum spanning tree. Then all the vertices of odd degree must be made even. So a matching for the odd-degree vertices must be added, which increases the degree of every odd-degree vertex by one. This leaves us with a graph where every vertex has even degree, which is thus Eulerian. Adapting the above method gives the algorithm of Christofides and Serdyukov.
# Find a minimum spanning tree for the problem.
# Create a minimum-weight matching on the set of cities that have odd degree in the tree.
# Find an Eulerian tour for this graph.
# Convert to TSP using shortcuts.
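The first of the two procedures above (doubling every tree edge, then shortcutting) can be sketched compactly, because walking the doubled tree and skipping cities already seen is the same as listing the tree's vertices in depth-first preorder. The Python below is an illustration of that simpler method only, not of the Christofides and Serdyukov algorithm: the matching step is omitted, and the function names are hypothetical. It assumes a symmetric distance matrix satisfying the triangle inequality.

```python
import math

def minimum_spanning_tree(dist):
    """Prim's algorithm on a complete graph; returns tree adjacency lists."""
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge into the tree
    parent = [-1] * n
    best[0] = 0
    adj = {v: [] for v in range(n)}
    for _ in range(n):
        v = min((u for u in range(n) if not in_tree[u]), key=lambda u: best[u])
        in_tree[v] = True
        if parent[v] >= 0:
            adj[parent[v]].append(v)
            adj[v].append(parent[v])
        for w in range(n):
            if not in_tree[w] and dist[v][w] < best[w]:
                best[w], parent[w] = dist[v][w], v
    return adj

def double_tree_tour(dist):
    """MST, doubled edges, Euler tour, shortcuts: realised as a DFS preorder."""
    adj = minimum_spanning_tree(dist)
    tour, seen, stack = [], set(), [0]
    while stack:
        v = stack.pop()
        if v in seen:
            continue           # shortcut: skip a city that was already visited
        seen.add(v)
        tour.append(v)
        stack.extend(reversed(adj[v]))
    return tour

pts = [(0, 0), (0, 2), (3, 1), (5, 0), (4, 4), (1, 5)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
print(double_tree_tour(dist))
```

Since the Eulerian tour of the doubled tree has length exactly twice the tree weight, and shortcuts cannot lengthen it under the triangle inequality, the returned tour is at most twice the weight of the minimum spanning tree. Replacing the doubled edges by a minimum-weight matching on the odd-degree vertices is what tightens this to the factor 1.5 stated above.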


Pairwise exchange

The pairwise exchange or ''2-opt'' technique involves iteratively removing two edges and replacing them with two different edges that reconnect the fragments created by edge removal into a new and shorter tour. Similarly, the 3-opt technique removes 3 edges and reconnects them to form a shorter tour. These are special cases of the ''k''-opt method. The label ''Lin–Kernighan'' is an often heard misnomer for 2-opt; Lin–Kernighan is actually the more general ''k''-opt method.

For Euclidean instances, 2-opt heuristics give on average solutions that are about 5% better than those of Christofides' algorithm. If we start with an initial solution made with a greedy algorithm, the average number of moves greatly decreases again and is O(n). For random starts, however, the average number of moves is O(n log n). Although this is a small increase in order, the initial number of moves for small problems is 10 times as big for a random start compared to one made from a greedy heuristic. This is because such 2-opt heuristics exploit 'bad' parts of a solution such as crossings. These types of heuristics are often used within vehicle routing problem heuristics to reoptimize route solutions.
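A 2-opt move can be implemented by reversing the part of the tour that lies between the two removed edges. The following Python sketch assumes Euclidean points and applies every improving move as soon as it is found; that first-improvement strategy, the small tolerance and the function names are choices made for this example rather than part of the technique itself.

```python
import math

def tour_length(points, tour):
    """Length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[a], points[b])
        for a, b in zip(tour, tour[1:] + tour[:1])
    )

def two_opt(points, tour):
    """Repeat improving 2-opt moves until no pair of edges can be exchanged."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue   # these two edges share city tour[0]
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Replace edges (a, b) and (c, d) by (a, c) and (b, d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

square = [(0, 0), (1, 1), (1, 0), (0, 1)]
crossing = [0, 1, 2, 3]                  # both diagonals are used
fixed = two_opt(square, crossing)
print(fixed, tour_length(square, fixed))
```

On the unit square the starting tour uses both diagonals and therefore crosses itself; one exchange removes the crossing and leaves the perimeter, of length 4. The result is only a local optimum in general, which is the motivation for the stronger ''k''-opt and variable-opt moves.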


''k''-opt heuristic, or Lin–Kernighan heuristics

The Lin–Kernighan heuristic is a special case of the ''V''-opt or variable-opt technique. It involves the following steps:
# Given a tour, delete ''k'' mutually disjoint edges.
# Reassemble the remaining fragments into a tour, leaving no disjoint subtours (that is, don't connect a fragment's endpoints together). This in effect simplifies the TSP under consideration into a much simpler problem.
# Each fragment endpoint can be connected to 2''k'' − 2 other possibilities: of 2''k'' total fragment endpoints available, the two endpoints of the fragment under consideration are disallowed. Such a constrained 2''k''-city TSP can then be solved with brute-force methods to find the least-cost recombination of the original fragments.

The most popular of the ''k''-opt methods is 3-opt, as introduced by Shen Lin of Bell Labs in 1965. A special case of 3-opt is where the edges are not disjoint (two of the edges are adjacent to one another). In practice, it is often possible to achieve substantial improvement over 2-opt without the combinatorial cost of the general 3-opt by restricting the 3-changes to this special subset where two of the removed edges are adjacent. This so-called two-and-a-half-opt typically falls roughly midway between 2-opt and 3-opt, both in terms of the quality of tours achieved and the time required to achieve those tours.


''V''-opt heuristic

The variable-opt method is related to, and a generalization of, the ''k''-opt method. Whereas the ''k''-opt methods remove a fixed number (''k'') of edges from the original tour, the variable-opt methods do not fix the size of the edge set to remove. Instead, they grow the set as the search process continues. The best-known method in this family is the Lin–Kernighan method (mentioned above as a misnomer for 2-opt). Shen Lin and Brian Kernighan first published their method in 1972, and it was the most reliable heuristic for solving travelling salesman problems for nearly two decades. More advanced variable-opt methods were developed at Bell Labs in the late 1980s by David Johnson and his research team. These methods (sometimes called Lin–Kernighan–Johnson) build on the Lin–Kernighan method, adding ideas from tabu search and evolutionary computing.

The basic Lin–Kernighan technique gives results that are guaranteed to be at least 3-opt. The Lin–Kernighan–Johnson methods compute a Lin–Kernighan tour, and then perturb the tour by what has been described as a mutation that removes at least four edges and reconnects the tour in a different way, before applying ''V''-opt to the new tour. The mutation is often enough to move the tour from the local minimum identified by Lin–Kernighan. ''V''-opt methods are widely considered the most powerful heuristics for the problem, and are able to address special cases, such as the Hamiltonian cycle problem and other non-metric TSPs that other heuristics fail on. For many years Lin–Kernighan–Johnson had identified optimal solutions for all TSPs where an optimal solution was known and had identified the best-known solutions for all other TSPs on which the method had been tried.
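The text does not say which four-edge mutation is used. One common choice, assumed here purely for illustration, is a segment reordering often called a double bridge: the tour is cut into four consecutive segments A, B, C, D, which are rejoined in the order A, D, C, B without reversing any of them. The Python sketch below uses hypothetical names and requires every segment to hold at least two cities, so that exactly four edges are replaced.

```python
import random

def double_bridge(tour, rng=random):
    """Cut the tour into segments A B C D and reconnect them as A D C B."""
    n = len(tour)
    if n < 8:
        raise ValueError("need at least 8 cities for four segments of length 2")
    # Cut points i < j < k chosen so that each segment has at least 2 cities.
    x, y, z = sorted(rng.randint(0, n - 8) for _ in range(3))
    i, j, k = 2 + x, 4 + y, 6 + z
    a, b, c, d = tour[:i], tour[i:j], tour[j:k], tour[k:]
    return a + d + c + b

def edge_set(tour):
    """Undirected edges of the closed tour."""
    return {frozenset(e) for e in zip(tour, tour[1:] + tour[:1])}

rng = random.Random(0)
original = list(range(12))
kicked = double_bridge(original, rng)
print(kicked, len(edge_set(original) - edge_set(kicked)))
```

Because no segment is reversed, a move of this shape cannot be undone by a single 2-opt or 3-opt step that relies on reversal, which is one reason such perturbations are useful for escaping a local minimum before the local search is run again.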


Randomized improvement

Optimized Markov chain algorithms which use local searching heuristic sub-algorithms can find a route extremely close to the optimal route for 700 to 800 cities. TSP is a touchstone for many general heuristics devised for combinatorial optimization such as genetic algorithms, simulated annealing, tabu search, ant colony optimization, river formation dynamics (see swarm intelligence) and the cross entropy method.


Ant colony optimization

Artificial intelligence researcher Marco Dorigo described in 1993 a method of heuristically generating "good solutions" to the TSP using a simulation of an ant colony called ''ACS'' (''ant colony system''). It models behaviour observed in real ants to find short paths between food sources and their nest, an emergent behaviour resulting from each ant's preference to follow trail pheromones deposited by other ants.

ACS sends out a large number of virtual ant agents to explore many possible routes on the map. Each ant probabilistically chooses the next city to visit based on a heuristic combining the distance to the city and the amount of virtual pheromone deposited on the edge to the city. The ants explore, depositing pheromone on each edge that they cross, until they have all completed a tour. At this point the ant which completed the shortest tour deposits virtual pheromone along its complete tour route (''global trail updating''). The amount of pheromone deposited is inversely proportional to the tour length: the shorter the tour, the more it deposits.
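The following Python sketch shows the general shape of such a procedure. It is a simplified illustration and not Dorigo's ACS as published: it keeps only the probabilistic city choice and the global trail update, adds a plain evaporation step, rewards the best tour found so far rather than the best of each round, and assumes a symmetric instance with strictly positive distances. All parameter names and default values (`n_ants`, `n_iters`, `alpha`, `beta`, `evaporation`, `seed`) are assumptions made for this example.

```python
import math
import random

def ant_colony_tour(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
                    evaporation=0.1, seed=0):
    """Return (tour, length) of the best tour found by the simulated ants."""
    rng = random.Random(seed)
    n = len(dist)
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, math.inf
    for _ in range(n_iters):
        for _ in range(n_ants):
            tour = [rng.randrange(n)]
            unvisited = set(range(n)) - {tour[0]}
            while unvisited:
                here = tour[-1]
                choices = sorted(unvisited)
                # Attractiveness combines pheromone and inverse distance.
                weights = [
                    pheromone[here][c] ** alpha * (1.0 / dist[here][c]) ** beta
                    for c in choices
                ]
                nxt = rng.choices(choices, weights=weights)[0]
                unvisited.remove(nxt)
                tour.append(nxt)
            length = sum(dist[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))
            if length < best_len:
                best_tour, best_len = tour, length
        # Evaporate everywhere, then reinforce the best tour (global update).
        for row in pheromone:
            for j in range(n):
                row[j] *= 1.0 - evaporation
        for a, b in zip(best_tour, best_tour[1:] + best_tour[:1]):
            pheromone[a][b] += 1.0 / best_len   # shorter tour, larger deposit
            pheromone[b][a] += 1.0 / best_len
    return best_tour, best_len

pts = [(0, 0), (0, 1), (1, 1), (1, 0)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
print(ant_colony_tour(dist, n_ants=5, n_iters=20, seed=1))
```

The exponents `alpha` and `beta` control how strongly an ant weighs pheromone against distance, and the deposit of `1.0 / best_len` reflects the rule that the amount of pheromone is inversely proportional to the tour length. Being randomized, the method gives no guarantee on tour quality for any single run.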


Special cases


Metric

In the ''metric TSP'', also known as ''delta-TSP'' or Δ-TSP, the intercity distances satisfy the triangle inequality. A very natural restriction of the TSP is to require that the distances between cities form a metric and so satisfy the triangle inequality; that is, the direct connection from ''A'' to ''B'' is never farther than the route via intermediate ''C'':
: $d_{AB} \le d_{AC} + d_{CB}$.
The edge spans then build a metric on the set of vertices. When the cities are viewed as points in the plane, many natural distance functions are metrics, and so many natural instances of TSP satisfy this constraint.

The following are some examples of metric TSPs for various metrics.
*In the Euclidean TSP (see below) the distance between two cities is the Euclidean distance between the corresponding points.
*In the rectilinear TSP the distance between two cities is the sum of the absolute values of the differences of their ''x''- and ''y''-coordinates. This metric is often called the Manhattan distance or city-block metric.
*In the maximum metric, the distance between two points is the maximum of the absolute values of differences of their ''x''- and ''y''-coordinates.
The last two metrics appear, for example, in routing a machine that drills a given set of holes in a printed circuit board. The Manhattan metric corresponds to a machine that adjusts first one co-ordinate, and then the other, so the time to move to a new point is the sum of both movements. The maximum metric corresponds to a machine that adjusts both co-ordinates simultaneously, so the time to move to a new point is the slower of the two movements.

In its definition, the TSP does not allow cities to be visited twice, but many applications do not need this constraint. In such cases, a symmetric, non-metric instance can be reduced to a metric one. This replaces the original graph with a complete graph in which the inter-city distance $d_{AB}$ is replaced by the shortest path length between ''A'' and ''B'' in the original graph.
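The reduction in the last paragraph amounts to an all-pairs shortest path computation. A minimal Python sketch follows, using the Floyd–Warshall algorithm as one possible way to obtain the shortest path lengths; the function names and the three-city matrix are illustrative assumptions, and the input is taken to be a complete symmetric matrix with non-negative entries.

```python
def metric_closure(dist):
    """Replace each distance by the shortest path length (Floyd-Warshall)."""
    n = len(dist)
    d = [row[:] for row in dist]           # leave the input untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def satisfies_triangle_inequality(d):
    """Check d[i][j] <= d[i][k] + d[k][j] for every triple of cities."""
    n = len(d)
    return all(
        d[i][j] <= d[i][k] + d[k][j]
        for i in range(n) for j in range(n) for k in range(n)
    )

raw = [
    [0, 1, 10],
    [1, 0, 2],
    [10, 2, 0],
]
closed = metric_closure(raw)
print(satisfies_triangle_inequality(raw))      # False: 10 > 1 + 2
print(closed)                                  # [[0, 1, 3], [1, 0, 2], [3, 2, 0]]
print(satisfies_triangle_inequality(closed))   # True
```

A tour in the closed instance corresponds to a walk in the original graph that may pass through some cities more than once, which is exactly the relaxation the paragraph describes.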


Euclidean

For points in the Euclidean plane, the optimal solution to the travelling salesman problem forms a simple polygon through all of the points, a polygonalization of the points. Any non-optimal solution with crossings can be made into a shorter solution without crossings by local optimizations. The Euclidean distance obeys the triangle inequality, so the Euclidean TSP forms a special case of metric TSP. However, even when the input points have integer coordinates, their distances generally take the form of square roots, and the length of a tour is a sum of radicals, making it difficult to perform the symbolic computation needed to perform exact comparisons of the lengths of different tours.

Like the general TSP, the exact Euclidean TSP is NP-hard, but the issue with sums of radicals is an obstacle to proving that its decision version is in NP, and therefore NP-complete. A discretized version of the problem with distances rounded to integers is NP-complete. With rational coordinates and the actual Euclidean metric, Euclidean TSP is known to be in the Counting Hierarchy, a subclass of PSPACE. With arbitrary real coordinates, Euclidean TSP cannot be in such classes, since there are uncountably many possible inputs.

Despite these complications, Euclidean TSP is much easier than the general metric case for approximation. For example, the minimum spanning tree of the graph associated with an instance of the Euclidean TSP is a Euclidean minimum spanning tree, and so can be computed in expected O(''n'' log ''n'') time for ''n'' points (considerably less than the number of edges). This enables the simple 2-approximation algorithm for TSP with triangle inequality above to operate more quickly. In general, for any ''c'' > 0, where ''d'' is the number of dimensions in the Euclidean space, there is a polynomial-time algorithm that finds a tour of length at most (1 + 1/''c'') times the optimal for geometric instances of TSP in O\left(n (\log n)^{(O(c \sqrt{d}))^{d-1}}\right) time; this is called a polynomial-time approximation scheme (PTAS). Sanjeev Arora and Joseph S. B. Mitchell were awarded the Gödel Prize in 2010 for their concurrent discovery of a PTAS for the Euclidean TSP. In practice, simpler heuristics with weaker guarantees continue to be used.
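
The spanning-tree-based 2-approximation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: it builds the minimum spanning tree with a plain O(''n''^2) version of Prim's algorithm rather than the faster Euclidean minimum spanning tree construction, and the names `mst_preorder_tour` and `tour_length` are hypothetical helpers, not part of any library.

```python
import math
import random


def mst_preorder_tour(points):
    """Visit the points in preorder of a minimum spanning tree rooted at 0.

    Returns (tour, mst_weight). By the triangle inequality the closed tour
    is at most twice the tree weight, hence at most twice the optimum.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known edge linking each point to the tree
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    for _ in range(n):
        # Prim's algorithm, O(n^2): add the closest point not yet in the tree.
        u = min((v for v in range(n) if not in_tree[v]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    mst_weight = sum(best)     # best[v] is now the weight of the tree edge into v
    tour, stack = [], [0]
    while stack:               # iterative preorder walk of the tree
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour, mst_weight


def tour_length(points, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(math.dist(points[tour[k]], points[tour[(k + 1) % len(tour)]])
               for k in range(len(tour)))


rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(200)]
tour, w = mst_preorder_tour(pts)
print(tour_length(pts, tour), 2 * w)
```

The preorder walk is a shortcut of the walk that traverses every tree edge twice, which is where the factor of 2 comes from.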


Asymmetric

In most cases, the distance between two nodes in the TSP network is the same in both directions. The case where the distance from ''A'' to ''B'' is not equal to the distance from ''B'' to ''A'' is called asymmetric TSP. A practical application of an asymmetric TSP is route optimization using street-level routing (which is made asymmetric by one-way streets, slip-roads, motorways, etc.).


Conversion to symmetric

Solving an asymmetric TSP graph can be somewhat complex. The following is a 3×3 matrix containing all possible path weights between the nodes ''A'', ''B'' and ''C''. One option is to turn an asymmetric matrix of size ''N'' into a symmetric matrix of size 2''N''.

[Table: asymmetric path weights between ''A'', ''B'' and ''C'']

To double the size, each of the nodes in the graph is duplicated, creating a second ''ghost node'', linked to the original node with a "ghost" edge of very low (possibly negative) weight, here denoted −''w''. (Alternatively, the ghost edges have weight 0, and weight ''w'' is added to all other edges.) The original 3×3 matrix shown above is visible in the bottom left and the transpose of the original in the top-right. Both copies of the matrix have had their diagonals replaced by the low-cost hop paths, represented by −''w''. In the new graph, no edge directly links original nodes and no edge directly links ghost nodes.

[Table: symmetric path weights between ''A'', ''B'', ''C'' and their ghost nodes ''A′'', ''B′'', ''C′'']

The weight −''w'' of the "ghost" edges linking the ghost nodes to the corresponding original nodes must be low enough to ensure that all ghost edges belong to any optimal symmetric TSP solution on the new graph (''w'' = 0 is not always low enough). As a consequence, in the optimal symmetric tour, each original node appears next to its ghost node (e.g. a possible path is \mathrm{A \to A' \to C \to C' \to B \to B' \to A}), and by merging the original and ghost nodes again we get an (optimal) solution of the original asymmetric problem (in our example, \mathrm{A \to C \to B \to A}).
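
The construction can be checked on a small instance. The sketch below is illustrative only: the 3×3 weights are made-up values (not taken from the tables of this article), nodes 0..''N''−1 stand for the originals and ''N''..2''N''−1 for their ghosts, forbidden edges get weight `math.inf`, and ''w'' is chosen larger than the sum of all weights, which is one sufficient (not necessary) choice when the weights are non-negative. Both problems are solved by brute force, which is only feasible for tiny inputs.

```python
import itertools
import math


def to_symmetric(asym, w):
    """Build the 2N x 2N symmetric matrix from an N x N asymmetric one."""
    n = len(asym)
    sym = [[math.inf] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        sym[i][n + i] = sym[n + i][i] = -w        # ghost edge i <-> i'
        for j in range(n):
            if i != j:
                # ghost i' <-> original j carries the weight of the arc i -> j
                sym[n + i][j] = sym[j][n + i] = asym[i][j]
    return sym


def brute_force_tour(dist):
    """Cheapest closed tour starting at node 0, by full enumeration."""
    n = len(dist)
    best_cost, best_tour = math.inf, None
    for perm in itertools.permutations(range(1, n)):
        tour = (0,) + perm
        cost = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour


def merge_ghosts(sym_tour, n):
    """Drop ghost nodes and orient the result as a directed tour."""
    originals = [v for v in sym_tour if v < n]
    # If the start node is not followed by its own ghost, the symmetric tour
    # was found in the reverse direction, so the order must be flipped.
    if sym_tour[1] != sym_tour[0] + n:
        originals = [originals[0]] + originals[:0:-1]
    return originals


asym = [[0, 1, 2],      # hypothetical weights: asym[i][j] is the cost of i -> j
        [6, 0, 3],
        [5, 4, 0]]
n = len(asym)
w = sum(map(sum, asym)) + 1
asym_cost, asym_tour = brute_force_tour(asym)
sym_cost, sym_tour = brute_force_tour(to_symmetric(asym, w))
print(asym_cost, sym_cost + n * w, merge_ghosts(sym_tour, n))
```

Adding ''N''·''w'' back to the symmetric optimum recovers the asymmetric optimum, because every ghost edge is used exactly once.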


Analyst's problem

There is an analogous problem in geometric measure theory which asks the following: under what conditions may a subset ''E'' of Euclidean space be contained in a rectifiable curve (that is, when is there a curve with finite length that visits every point in ''E'')? This problem is known as the analyst's travelling salesman problem.


Path length for random sets of points in a square

Suppose X_1,\ldots,X_n are n independent random variables with uniform distribution in the square [0,1]^2, and let L^*_n be the shortest path length (i.e. TSP solution) for this set of points, according to the usual Euclidean distance. It is known that, almost surely,
::\frac{L^*_n}{\sqrt n}\rightarrow \beta\qquad\text{when }n\to\infty,
where \beta is a positive constant that is not known explicitly. Since L^*_n\le2\sqrt n+2 (see below), it follows from the bounded convergence theorem that \beta=\lim_{n\to\infty} \mathbb E[L^*_n]/\sqrt n, hence lower and upper bounds on \beta follow from bounds on \mathbb E[L^*_n]. The almost sure limit \frac{L^*_n}{\sqrt n}\rightarrow \beta as n\to\infty may not exist if the independent locations X_1,\ldots,X_n are replaced with observations from a stationary ergodic process with uniform marginals.


Upper bound

* One has L^*_n\le 2\sqrt{n}+2, and therefore \beta\leq 2, by using a naive path which visits monotonically the points inside each of \sqrt n slices of width 1/\sqrt{n} in the square.
* Few proved L^*_n\le\sqrt{2n}+1.75, hence \beta\le\sqrt 2, later improved by Karloff (1987): \beta\le0.984\sqrt2.
* Fiechter showed an upper bound of \beta\le 0.73\dots.
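
The naive slice path behind the first bound can be sketched as follows. This is an illustration under assumptions, not the construction from any particular paper: the number of slices is taken as the ceiling of \sqrt n, slices are traversed alternately upwards and downwards, and the tour is closed by returning to the first point, so the additive constant it achieves may differ slightly from the 2 quoted above. The names `strip_tour` and `tour_length` are hypothetical.

```python
import math
import random


def strip_tour(points):
    """Order points in the unit square by vertical slices, boustrophedon style."""
    n = len(points)
    k = max(1, math.ceil(math.sqrt(n)))          # number of slices, each of width 1/k
    strips = [[] for _ in range(k)]
    for idx, (x, _) in enumerate(points):
        strips[min(int(x * k), k - 1)].append(idx)   # clamp x == 1.0 into the last slice
    tour = []
    for s, members in enumerate(strips):
        # even slices are walked bottom-to-top, odd slices top-to-bottom
        members.sort(key=lambda i: points[i][1], reverse=(s % 2 == 1))
        tour.extend(members)
    return tour


def tour_length(points, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(math.dist(points[tour[k]], points[tour[(k + 1) % len(tour)]])
               for k in range(len(tour)))


rng = random.Random(1)
n = 400
pts = [(rng.random(), rng.random()) for _ in range(n)]
tour = strip_tour(pts)
# Compare the heuristic tour with the leading term 2*sqrt(n) of the bound.
print(tour_length(pts, tour), 2 * math.sqrt(n))
```

The vertical travel is at most about one unit per slice and the horizontal travel at most one slice width per point, which gives the two \sqrt n terms.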


Lower bound

* By observing that \mathbb E[L^*_n] is greater than n times the expected distance between X_0 and the closest point X_i\ne X_0, one gets (after a short computation)
::\mathbb E[L^*_n]\ge\tfrac{1}{2}\sqrt{n}.
* A better lower bound is obtained by observing that \mathbb E[L^*_n] is greater than \tfrac12n times the expected sum of the distances between X_0 and the closest and second closest points X_i,X_j\ne X_0, which gives
::\mathbb E[L^*_n]\ge \left( \tfrac{1}{4} + \tfrac{3}{8} \right)\sqrt{n} = \tfrac{5}{8}\sqrt{n}.
* The currently best lower bound is
::\mathbb E[L^*_n]\ge \left(\tfrac{5}{8} + \tfrac{19}{5184}\right)\sqrt{n}.
* Held and Karp gave a polynomial-time algorithm that provides numerical lower bounds for L^*_n, and thus for \beta(\simeq L^*_n/\sqrt{n}), which seem to be good up to more or less 1%. In particular, David S. Johnson obtained a lower bound by computer experiment:
::L^*_n\gtrsim 0.7080\sqrt{n}+0.522,
where 0.522 comes from the points near the square boundary which have fewer neighbours, and Christine L. Valenzuela and Antonia J. Jones obtained the following other numerical lower bound:
::L^*_n\gtrsim 0.7078\sqrt{n}+0.551.
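
The "short computation" behind the first two bounds can be sketched as follows. This is a heuristic outline rather than a rigorous proof: it assumes that, for large n, the points near X_0 behave like a Poisson process of intensity n and it ignores boundary effects; R_1 and R_2 are notation introduced here for the distances from X_0 to its closest and second closest points.

```latex
% Probability that no other point lies within distance r of X_0:
\Pr[R_1 > r] \approx e^{-n\pi r^2}
\quad\Longrightarrow\quad
\mathbb{E}[R_1] \approx \int_0^\infty e^{-n\pi r^2}\,dr = \frac{1}{2\sqrt{n}} .

% Each city has two tour edges, each of length at least R_1, and each edge
% is shared by two cities, so the tour is at least the sum of the R_1 values:
\mathbb{E}[L^*_n] \ge n\,\mathbb{E}[R_1] \approx \tfrac{1}{2}\sqrt{n} .

% Second closest point, under the same assumptions:
\mathbb{E}[R_2] \approx \frac{3}{4\sqrt{n}} .

% The two tour edges at a city have total length at least R_1 + R_2:
\mathbb{E}[L^*_n] \ge \tfrac{1}{2}\,n\,\bigl(\mathbb{E}[R_1] + \mathbb{E}[R_2]\bigr)
\approx \left(\tfrac{1}{4} + \tfrac{3}{8}\right)\sqrt{n} = \tfrac{5}{8}\sqrt{n} .
```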


Computational complexity

The problem has been shown to be NP-hard (more precisely, it is complete for the complexity class FP^{NP}; see function problem), and the decision problem version ("given the costs and a number ''x'', decide whether there is a round-trip route cheaper than ''x''") is NP-complete. The bottleneck travelling salesman problem is also NP-hard. The problem remains NP-hard even for the case when the cities are in the plane with Euclidean distances, as well as in a number of other restrictive cases. Removing the condition of visiting each city "only once" does not remove the NP-hardness, since in the planar case there is an optimal tour that visits each city only once (otherwise, by the triangle inequality, a shortcut that skips a repeated visit would not increase the tour length).
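
Membership of the decision version in NP rests on the fact that a proposed route can be checked quickly. The sketch below illustrates such a check for integer costs; the function name `verify_certificate` and the cost matrix are hypothetical, and the code verifies a given route rather than searching for one.

```python
def verify_certificate(costs, tour, x):
    """Return True if `tour` is a round trip through every city cheaper than x."""
    n = len(costs)
    if sorted(tour) != list(range(n)):       # every city exactly once
        return False
    total = sum(costs[tour[k]][tour[(k + 1) % n]] for k in range(n))
    return total < x


# Illustrative symmetric cost matrix for four cities (values are made up).
costs = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
print(verify_certificate(costs, [0, 1, 3, 2], 19))
```

The check runs in time polynomial in the number of cities, whereas no polynomial-time method is known for finding a route that passes it.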


Complexity of approximation

In the general case, finding a shortest travelling salesman tour is NPO-complete. If the distance measure is a metric (and thus symmetric), the problem becomes APX-complete and the algorithm of Christofides and Serdyukov approximates it within 1.5. If the distances are restricted to 1 and 2 (but still are a metric) the approximation ratio becomes 8/7. In the asymmetric case with triangle inequality, up until recently only logarithmic performance guarantees were known. In 2018, a constant factor approximation was developed by Svensson, Tarnawski and Végh. The best current algorithm, by Traub and Vygen, achieves a performance ratio of 22+\varepsilon. The best known inapproximability bound is 75/74. The corresponding maximization problem of finding the ''longest'' travelling salesman tour is approximable within 63/38. If the distance function is symmetric, the longest tour can be approximated within 4/3 by a deterministic algorithm and within \tfrac{1}{25}(33+\varepsilon) by a randomized algorithm.


Human and animal performance

The TSP, in particular the Euclidean variant of the problem, has attracted the attention of researchers in cognitive psychology. It has been observed that humans are able to produce near-optimal solutions quickly, in a close-to-linear fashion, with performance that ranges from 1% less efficient than optimal, for graphs with 10–20 nodes, to 11% less efficient for graphs with 120 nodes. The apparent ease with which humans accurately generate near-optimal solutions to the problem has led researchers to hypothesize that humans use one or more heuristics, with the two most popular theories arguably being the convex-hull hypothesis and the crossing-avoidance heuristic. However, additional evidence suggests that human performance is quite varied, and individual differences as well as graph geometry appear to affect performance in the task. Nevertheless, results suggest that computer performance on the TSP may be improved by understanding and emulating the methods used by humans for these problems, and have also led to new insights into the mechanisms of human thought. The first issue of the ''Journal of Problem Solving'' was devoted to the topic of human performance on TSP, and a 2011 review listed dozens of papers on the subject.

A 2011 study in animal cognition titled "Let the Pigeon Drive the Bus," named after the children's book ''Don't Let the Pigeon Drive the Bus!'', examined spatial cognition in pigeons by studying their flight patterns between multiple feeders in a laboratory in relation to the travelling salesman problem. In the first experiment, pigeons were placed in the corner of a lab room and allowed to fly to nearby feeders containing peas. The researchers found that pigeons largely used proximity to determine which feeder they would select next. In the second experiment, the feeders were arranged in such a way that flying to the nearest feeder at every opportunity would be largely inefficient if the pigeons needed to visit every feeder. The results of the second experiment indicate that pigeons, while still favoring proximity-based solutions, "can plan several steps ahead along the route when the differences in travel costs between efficient and less efficient routes based on proximity become larger." These results are consistent with other experiments done with non-primates, which have proven that some non-primates were able to plan complex travel routes. This suggests non-primates may possess a relatively sophisticated spatial cognitive ability.


Natural computation

When presented with a spatial configuration of food sources, the amoeboid ''Physarum polycephalum'' adapts its morphology to create an efficient path between the food sources, which can also be viewed as an approximate solution to TSP.


Benchmarks

For benchmarking of TSP algorithms, TSPLIB, a library of sample instances of the TSP and related problems, is maintained; see the TSPLIB external reference. Many of the instances are lists of actual cities and layouts of actual printed circuits.


Popular culture

* ''Travelling Salesman'', by director Timothy Lanzone, is the story of four mathematicians hired by the U.S. government to solve the most elusive problem in computer-science history: P vs. NP.
* Solutions to the problem are used by mathematician Bob Bosch in a subgenre called TSP art.


See also

* Canadian traveller problem
* Exact algorithm
* Route inspection problem (also known as "Chinese postman problem")
* Set TSP problem
* Seven Bridges of Königsberg
* Steiner travelling salesman problem
* Subway Challenge
* Tube Challenge
* Vehicle routing problem
* Graph exploration
* Mixed Chinese postman problem
* Arc routing
* Snow plow routing problem


External links

* at University of Waterloo
* TSPLIB at the University of Heidelberg
* ''Traveling Salesman Problem'' by Jon McLoone at the Wolfram Demonstrations Project
* TSP visualization tool