A greedy algorithm is any

algorithm In mathematics and computer science, an algorithm () is a finite sequence of Rigour#Mathematics, mathematically rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algo ...

that follows the problem-solving

heuristic A heuristic or heuristic technique (''problem solving'', '' mental shortcut'', ''rule of thumb'') is any approach to problem solving that employs a pragmatic method that is not fully optimized, perfected, or rationalized, but is nevertheless ...

of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally optimal solutions that approximate a globally optimal solution in a reasonable amount of time. For example, a greedy strategy for the

travelling salesman problem In the Computational complexity theory, theory of computational complexity, the travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible ...

(which is of high

computational complexity In computer science, the computational complexity or simply complexity of an algorithm is the amount of resources required to run it. Particular focus is given to computation time (generally measured by the number of needed elementary operations ...

) is the following heuristic: "At each step of the journey, visit the nearest unvisited city." This heuristic does not intend to find the best solution, but it terminates in a reasonable number of steps; finding an optimal solution to such a complex problem typically requires unreasonably many steps. In

mathematical optimization Mathematical optimization (alternatively spelled ''optimisation'') or mathematical programming is the selection of a best element, with regard to some criteria, from some set of available alternatives. It is generally divided into two subfiel ...

, greedy algorithms optimally solve

combinatorial Combinatorics is an area of mathematics primarily concerned with counting, both as a means and as an end to obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many ...

problems having the properties of

matroid In combinatorics, a matroid is a structure that abstracts and generalizes the notion of linear independence in vector spaces. There are many equivalent ways to define a matroid Axiomatic system, axiomatically, the most significant being in terms ...

s and give constant-factor approximations to optimization problems with the submodular structure.

Specifics

Greedy algorithms produce good solutions on some

mathematical problem A mathematical problem is a problem that can be represented, analyzed, and possibly solved, with the methods of mathematics. This can be a real-world problem, such as computing the orbits of the planets in the Solar System, or a problem of a more ...

s, but not on others. Most problems for which they work will have two properties: ; Greedy choice property: Whichever choice seems best at a given moment can be made and then (recursively) solve the remaining sub-problems. The choice made by a greedy algorithm may depend on choices made so far, but not on future choices or all the solutions to the subproblem. It iteratively makes one greedy choice after another, reducing each given problem into a smaller one. In other words, a greedy algorithm never reconsiders its choices. This is the main difference from dynamic programming, which is exhaustive and is guaranteed to find the solution. After every stage, dynamic programming makes decisions based on all the decisions made in the previous stage and may reconsider the previous stage's algorithmic path to the solution. ;Optimal substructure: "A problem exhibits optimal substructure if an optimal solution to the problem contains optimal solutions to the sub-problems."

Correctness Proofs

A common technique for proving the correctness of greedy algorithms uses an inductive exchange argument. The exchange argument demonstrates that any solution different from the greedy solution can be transformed into the greedy solution without degrading its quality. This proof pattern typically follows these steps: This proof pattern typically follows these steps (by contradiction): # Assume there exists an optimal solution different from the greedy solution # Identify the first point where the optimal and greedy solutions differ # Prove that exchanging the optimal choice for the greedy choice at this point cannot worsen the solution # Conclude by induction that there must exist an optimal solution identical to the greedy solution In some cases, an additional step may be needed to prove that no optimal solution can strictly improve upon the greedy solution.

Cases of failure

Greedy algorithms fail to produce the optimal solution for many other problems and may even produce the ''unique worst possible'' solution. One example is the

mentioned above: for each number of cities, there is an assignment of distances between the cities for which the nearest-neighbour heuristic produces the unique worst possible tour. For other possible examples, see horizon effect.

Types

Greedy algorithms can be characterized as being 'short sighted', and also as 'non-recoverable'. They are ideal only for problems that have an 'optimal substructure'. Despite this, for many simple problems, the best-suited algorithms are greedy. It is important, however, to note that the greedy algorithm can be used as a selection algorithm to prioritize options within a search, or branch-and-bound algorithm. There are a few variations to the greedy algorithm: * Pure greedy algorithms * Orthogonal greedy algorithms * Relaxed greedy algorithms

Theory

Greedy algorithms have a long history of study in

combinatorial optimization Combinatorial optimization is a subfield of mathematical optimization that consists of finding an optimal object from a finite set of objects, where the set of feasible solutions is discrete or can be reduced to a discrete set. Typical combina ...

and

theoretical computer science Theoretical computer science is a subfield of computer science and mathematics that focuses on the Abstraction, abstract and mathematical foundations of computation. It is difficult to circumscribe the theoretical areas precisely. The Associati ...

. Greedy heuristics are known to produce suboptimal results on many problems, and so natural questions are: * For which problems do greedy algorithms perform optimally? * For which problems do greedy algorithms guarantee an approximately optimal solution? * For which problems are greedy algorithms guaranteed ''not'' to produce an optimal solution? A large body of literature exists answering these questions for general classes of problems, such as

s, as well as for specific problems, such as

set cover The set cover problem is a classical question in combinatorics, computer science, operations research, and Computational complexity theory, complexity theory. Given a Set (mathematics), set of elements (henceforth referred to as the Universe ( ...

Matroids

is a mathematical structure that generalizes the notion of

linear independence In the theory of vector spaces, a set (mathematics), set of vector (mathematics), vectors is said to be if there exists no nontrivial linear combination of the vectors that equals the zero vector. If such a linear combination exists, then th ...

from

vector spaces In mathematics and physics, a vector space (also called a linear space) is a set whose elements, often called ''vectors'', can be added together and multiplied ("scaled") by numbers called ''scalars''. The operations of vector addition and sc ...

to arbitrary sets. If an optimization problem has the structure of a matroid, then the appropriate greedy algorithm will solve it optimally.

Submodular functions

A function

f

defined on subsets of a set

\Omega

is called

submodular In mathematics, a submodular set function (also known as a submodular function) is a set function that, informally, describes the relationship between a set of inputs and an output, where adding more of one input has a decreasing additional benefi ...

if for every

S, T \subseteq \Omega

we have that

f(S)+f(T)\geq f(S\cup T)+f(S\cap T)

. Suppose one wants to find a set

S

which maximizes

f

. The greedy algorithm, which builds up a set

S

by incrementally adding the element which increases

f

the most at each step, produces as output a set that is at least

(1 - 1/e) \max_ f(X)

. That is, greedy performs within a constant factor of

(1 - 1/e) \approx 0.63

as good as the optimal solution. Similar guarantees are provable when additional constraints, such as cardinality constraints, are imposed on the output, though often slight variations on the greedy algorithm are required. See for an overview.

Applications

Greedy algorithms typically (but not always) fail to find the globally optimal solution because they usually do not operate exhaustively on all the data. They can make commitments to certain choices too early, preventing them from finding the best overall solution later. For example, all known

greedy coloring In the study of graph coloring problems in mathematics and computer science, a greedy coloring or sequential coloring is a coloring of the vertices of a graph formed by a greedy algorithm that considers the vertices of the graph in sequence an ...

algorithms for the

graph coloring problem In graph theory, graph coloring is a methodic assignment of labels traditionally called "colors" to elements of a graph. The assignment is subject to certain constraints, such as that no two adjacent elements have the same color. Graph coloring i ...

and all other

NP-complete In computational complexity theory, NP-complete problems are the hardest of the problems to which ''solutions'' can be verified ''quickly''. Somewhat more precisely, a problem is NP-complete when: # It is a decision problem, meaning that for any ...

problems do not consistently find optimum solutions. Nevertheless, they are useful because they are quick to think up and often give good approximations to the optimum. If a greedy algorithm can be proven to yield the global optimum for a given problem class, it typically becomes the method of choice because it is faster than other optimization methods like dynamic programming. Examples of such greedy algorithms are

Kruskal's algorithm Kruskal's algorithm finds a minimum spanning forest of an undirected edge-weighted graph. If the graph is connected, it finds a minimum spanning tree. It is a greedy algorithm that in each step adds to the forest the lowest-weight edge that ...

and

Prim's algorithm In computer science, Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a Weighted graph, weighted undirected graph. This means it finds a subset of the edge (graph theory), edges that forms a Tree (graph theory), tree ...

for finding

minimum spanning tree A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. ...

s and the algorithm for finding optimum

Huffman tree In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed by ...

s. Greedy algorithms appear in network

routing Routing is the process of selecting a path for traffic in a Network theory, network or between or across multiple networks. Broadly, routing is performed in many types of networks, including circuit-switched networks, such as the public switched ...

as well. Using greedy routing, a message is forwarded to the neighbouring node which is "closest" to the destination. The notion of a node's location (and hence "closeness") may be determined by its physical location, as in geographic routing used by ad hoc networks. Location may also be an entirely artificial construct as in small world routing and

distributed hash table A distributed hash table (DHT) is a Distributed computing, distributed system that provides a lookup service similar to a hash table. Key–value pairs are stored in a DHT, and any participating node (networking), node can efficiently retrieve the ...

Examples

* The activity selection problem is characteristic of this class of problems, where the goal is to pick the maximum number of activities that do not clash with each other. * In the Macintosh computer game '' Crystal Quest'' the objective is to collect crystals, in a fashion similar to the

. The game has a demo mode, where the game uses a greedy algorithm to go to every crystal. The

artificial intelligence Artificial intelligence (AI) is the capability of computer, computational systems to perform tasks typically associated with human intelligence, such as learning, reasoning, problem-solving, perception, and decision-making. It is a field of re ...

does not account for obstacles, so the demo mode often ends quickly. * The matching pursuit is an example of a greedy algorithm applied on signal approximation. * A greedy algorithm finds the optimal solution to Malfatti's problem of finding three disjoint circles within a given triangle that maximize the total area of the circles; it is conjectured that the same greedy algorithm is optimal for any number of circles. * A greedy algorithm is used to construct a Huffman tree during

Huffman coding In computer science and information theory, a Huffman code is a particular type of optimal prefix code that is commonly used for lossless data compression. The process of finding or using such a code is Huffman coding, an algorithm developed by ...

where it finds an optimal solution. * In

decision tree learning Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of obser ...

, greedy algorithms are commonly used, however they are not guaranteed to find the optimal solution. **One popular such algorithm is the ID3 algorithm for decision tree construction. *

Dijkstra's algorithm Dijkstra's algorithm ( ) is an algorithm for finding the shortest paths between nodes in a weighted graph, which may represent, for example, a road network. It was conceived by computer scientist Edsger W. Dijkstra in 1956 and published three ...

and the related A* search algorithm are verifiably optimal greedy algorithms for graph search and shortest path finding. **A* search is conditionally optimal, requiring an " admissible heuristic" that will not overestimate path costs. *

and

are greedy algorithms for constructing

s of a given connected graph. They always find an optimal solution, which may not be unique in general. *The Sequitur and Lempel-Ziv-Welch algorithms are greedy algorithms for grammar induction.

References

Sources

* * * * * * * * *

External links

* * {{Authority control Optimization algorithms and methods Combinatorial algorithms Matroid theory Exchange algorithms Greedy algorithms