HOME

TheInfoList



OR:

Alpha–beta pruning is a
search algorithm In computer science, a search algorithm is an algorithm designed to solve a search problem. Search algorithms work to retrieve information stored within particular data structure, or calculated in the search space of a problem domain, with eith ...
that seeks to decrease the number of nodes that are evaluated by the minimax algorithm in its
search tree In computer science, a search tree is a tree data structure used for locating specific keys from within a set. In order for a tree to function as a search tree, the key for each node must be greater than any keys in subtrees on the left, and less ...
. It is an adversarial search algorithm used commonly for machine playing of two-player games (
Tic-tac-toe Tic-tac-toe (American English), noughts and crosses (Commonwealth English), or Xs and Os (Canadian or Irish English) is a paper-and-pencil game for two players who take turns marking the spaces in a three-by-three grid with ''X'' or ''O''. ...
,
Chess Chess is a board game for two players, called White and Black, each controlling an army of chess pieces in their color, with the objective to checkmate the opponent's king. It is sometimes called international chess or Western chess to dist ...
, Connect 4, etc.). It stops evaluating a move when at least one possibility has been found that proves the move to be worse than a previously examined move. Such moves need not be evaluated further. When applied to a standard minimax tree, it returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision.


History

Allen Newell Allen Newell (March 19, 1927 – July 19, 1992) was a researcher in computer science and cognitive psychology at the RAND Corporation and at Carnegie Mellon University’s School of Computer Science, Tepper School of Business, and Depart ...
and Herbert A. Simon who used what John McCarthy calls an "approximation" in 1958 wrote that alpha–beta "appears to have been reinvented a number of times". Arthur Samuel had an early version for a checkers simulation. Richards, Timothy Hart,
Michael Levin Michael Levin (; born 21 May 1943) is an American philosopher and writer. He is professor emeritus of philosophy at City University of New York. He has published on metaphysics, epistemology, race, homosexuality, animal rights, the philosophy ...
and/or Daniel Edwards also invented alpha–beta independently in the
United States The United States of America (U.S.A. or USA), commonly known as the United States (U.S. or US) or America, is a country Continental United States, primarily located in North America. It consists of 50 U.S. state, states, a Washington, D.C., ...
. McCarthy proposed similar ideas during the
Dartmouth workshop The Dartmouth Summer Research Project on Artificial Intelligence was a 1956 summer workshop widely consideredKline, Ronald R., Cybernetics, Automata Studies and the Dartmouth Conference on Artificial Intelligence, IEEE Annals of the History of ...
in 1956 and suggested it to a group of his students including
Alan Kotok Alan Kotok (November 9, 1941 – May 26, 2006) was an American computer scientist known for his work at Digital Equipment Corporation (Digital, or DEC) and at the World Wide Web Consortium (W3C). Steven Levy, in his book '' Hackers: Heroes of t ...
at MIT in 1961.
Alexander Brudno Alexander L'vovich Brudno (russian: Александр Львович Брудно) (10 January 1918 – 1 December 2009)
Donald Knuth Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist, mathematician, and professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of computer sc ...
and Ronald W. Moore refined the algorithm in 1975.
Judea Pearl Judea Pearl (born September 4, 1936) is an Israeli-American computer scientist and philosopher, best known for championing the probabilistic approach to artificial intelligence and the development of Bayesian networks (see the article on belief ...
proved its optimality in terms of the expected running time for trees with randomly assigned leaf values in two papers. The optimality of the randomized version of alpha–beta was shown by Michael Saks and Avi Wigderson in 1986.


Core idea

A game tree can represent many two-player
zero-sum game Zero-sum game is a mathematical representation in game theory and economic theory of a situation which involves two sides, where the result is an advantage for one side and an equivalent loss for the other. In other words, player one's gain is e ...
s, such as chess, checkers, and reversi. Each node in the tree represents a possible situation in the game. Each terminal node (outcome) of a branch is assigned a numeric score that determines the value of the outcome to the player with the next move. The algorithm maintains two values, alpha and beta, which respectively represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of. Initially, alpha is negative infinity and beta is positive infinity, i.e. both players start with their worst possible score. Whenever the maximum score that the minimizing player (i.e. the "beta" player) is assured of becomes less than the minimum score that the maximizing player (i.e., the "alpha" player) is assured of (i.e. beta < alpha), the maximizing player need not consider further descendants of this node, as they will never be reached in the actual play. To illustrate this with a real-life example, suppose somebody is playing chess, and it is their turn. Move "A" will improve the player's position. The player continues to look for moves to make sure a better one hasn't been missed. Move "B" is also a good move, but the player then realizes that it will allow the opponent to force checkmate in two moves. Thus, other outcomes from playing move B no longer need to be considered since the opponent can force a win. The maximum score that the opponent could force after move "B" is negative infinity: a loss for the player. This is less than the minimum position that was previously found; move "A" does not result in a forced loss in two moves.


Improvements over naive minimax

The benefit of alpha–beta pruning lies in the fact that branches of the search tree can be eliminated. This way, the search time can be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Like its predecessor, it belongs to the
branch and bound Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solut ...
class of algorithms. The optimization reduces the effective depth to slightly more than half that of simple minimax if the nodes are evaluated in an optimal or near optimal order (best choice for side on move ordered first at each node). With an (average or constant) branching factor of ''b'', and a search depth of ''d'' plies, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is ''O''(''b''×''b''×...×''b'') = ''O''(''b''''d'') – the same as a simple minimax search. If the move ordering for the search is optimal (meaning the best moves are always searched first), the number of leaf node positions evaluated is about ''O''(''b''×1×''b''×1×...×''b'') for odd depth and ''O''(''b''×1×''b''×1×...×1) for even depth, or O(b^) = O(\sqrt). In the latter case, where the ply of a search is even, the effective branching factor is reduced to its
square root In mathematics, a square root of a number is a number such that ; in other words, a number whose '' square'' (the result of multiplying the number by itself, or  ⋅ ) is . For example, 4 and −4 are square roots of 16, because . ...
, or, equivalently, the search can go twice as deep with the same amount of computation. The explanation of ''b''×1×''b''×1×... is that all the first player's moves must be studied to find the best one, but for each, only the second player's best move is needed to refute all but the first (and best) first player move—alpha–beta ensures no other second player moves need be considered. When nodes are considered in a random order (i.e., the algorithm randomizes), asymptotically, the expected number of nodes evaluated in uniform trees with binary leaf-values is \Theta( ((b-1+\sqrt)/4 )^d ) . For the same trees, when the values are assigned to the leaf values independently of each other and say zero and one are both equally probable, the expected number of nodes evaluated is \Theta( (b/2)^ ), which is much smaller than the work done by the randomized algorithm, mentioned above, and is again optimal for such random trees. When the leaf values are chosen independently of each other but from the ,1/math> interval uniformly at random, the expected number of nodes evaluated increases to \Theta( b^ ) in the d\to\infty limit, which is again optimal for these kind random trees. Note that the actual work for "small" values of d is better approximated using 0.925 d^. A chess program that searches four plies with an average of 36 branches per node evaluates more than one million terminal nodes. An optimal alpha-beta prune would eliminate all but about 2,000 terminal nodes, a reduction of 99.8%. Normally during alpha–beta, the subtrees are temporarily dominated by either a first player advantage (when many first player moves are good, and at each search depth the first move checked by the first player is adequate, but all second player responses are required to try to find a refutation), or vice versa. This advantage can switch sides many times during the search if the move ordering is incorrect, each time leading to inefficiency. As the number of positions searched decreases exponentially each move nearer the current position, it is worth spending considerable effort on sorting early moves. An improved sort at any depth will exponentially reduce the total number of positions searched, but sorting all positions at depths near the root node is relatively cheap as there are so few of them. In practice, the move ordering is often determined by the results of earlier, smaller searches, such as through
iterative deepening In computer science, iterative deepening search or more specifically iterative deepening depth-first search (IDS or IDDFS) is a state space/graph search strategy in which a depth-limited version of depth-first search is run repeatedly with inc ...
. Additionally, this algorithm can be trivially modified to return an entire principal variation in addition to the score. Some more aggressive algorithms such as MTD(f) do not easily permit such a modification.


Pseudocode

The pseudo-code for depth limited minimax with alpha–beta pruning is as follows: Implementations of alpha–beta pruning can often be delineated by whether they are "fail-soft," or "fail-hard". With fail-soft alpha–beta, the alphabeta function may return values (v) that exceed (v < α or v > β) the α and β bounds set by its function call arguments. In comparison, fail-hard alpha–beta limits its function return value into the inclusive range of α and β. The main difference between fail-soft and fail-hard implementations is whether α and β are updated before or after the cutoff check. If they are updated before the check, then they can exceed initial bounds and the algorithm is fail-soft. The following pseudo-code illustrates the fail-hard variation. function alphabeta(node, depth, α, β, maximizingPlayer) is if depth = 0 or node is a terminal node then return the heuristic value of node if maximizingPlayer then value := −∞ for each child of node do value := max(value, alphabeta(child, depth − 1, α, β, FALSE)) if value > β then break ''(* β cutoff *)'' α := max(α, value) return value else value := +∞ for each child of node do value := min(value, alphabeta(child, depth − 1, α, β, TRUE)) if value < α then break ''(* α cutoff *)'' β := min(β, value) return value ''(* Initial call *)'' alphabeta(origin, depth, − , + , TRUE) The following pseudocode illustrates fail-soft alpha-beta. function alphabeta(node, depth, α, β, maximizingPlayer) is if depth = 0 or node is a terminal node then return the heuristic value of node if maximizingPlayer then value := −∞ for each child of node do value := max(value, alphabeta(child, depth − 1, α, β, FALSE)) α := max(α, value) if value ≥ β then break ''(* β cutoff *)'' return value else value := +∞ for each child of node do value := min(value, alphabeta(child, depth − 1, α, β, TRUE)) β := min(β, value) if value ≤ α then break ''(* α cutoff *)'' return value ''(* Initial call *)'' alphabeta(origin, depth, − , + , TRUE)


Heuristic improvements

Further improvement can be achieved without sacrificing accuracy by using ordering
heuristic A heuristic (; ), or heuristic technique, is any approach to problem solving or self-discovery that employs a practical method that is not guaranteed to be optimal, perfect, or rational, but is nevertheless sufficient for reaching an immediate ...
s to search earlier parts of the tree that are likely to force alpha–beta cutoffs. For example, in chess, moves that capture pieces may be examined before moves that do not, and moves that have scored highly in earlier passes through the game-tree analysis may be evaluated before others. Another common, and very cheap, heuristic is the
killer heuristic In competitive two-player games, the killer heuristic is a move-ordering method based on the observation that a strong move or small set of such moves in a particular position may be equally strong in similar positions at the same move (ply) in t ...
, where the last move that caused a beta-cutoff at the same tree level in the tree search is always examined first. This idea can also be generalized into a set of
refutation table {{no footnotes, date=November 2017 A transposition table is a cache of previously seen positions, and associated evaluations, in a game tree generated by a computer game playing program. If a position recurs via a different sequence of moves, the ...
s. Alpha–beta search can be made even faster by considering only a narrow search window (generally determined by guesswork based on experience). This is known as ''aspiration search''. In the extreme case, the search is performed with alpha and beta equal; a technique known as '' zero-window search'', ''null-window search'', or ''scout search''. This is particularly useful for win/loss searches near the end of a game where the extra depth gained from the narrow window and a simple win/loss evaluation function may lead to a conclusive result. If an aspiration search fails, it is straightforward to detect whether it failed ''high'' (high edge of window was too low) or ''low'' (lower edge of window was too high). This gives information about what window values might be useful in a re-search of the position. Over time, other improvements have been suggested, and indeed the Falphabeta (fail-soft alpha–beta) idea of John Fishburn is nearly universal and is already incorporated above in a slightly modified form. Fishburn also suggested a combination of the killer heuristic and zero-window search under the name Lalphabeta ("last move with minimal window alpha–beta search").


Other algorithms

Since the
minimax Minimax (sometimes MinMax, MM or saddle point) is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for ''mini''mizing the possible loss for a worst case (''max''imum loss) scenario. When ...
algorithm and its variants are inherently depth-first, a strategy such as
iterative deepening In computer science, iterative deepening search or more specifically iterative deepening depth-first search (IDS or IDDFS) is a state space/graph search strategy in which a depth-limited version of depth-first search is run repeatedly with inc ...
is usually used in conjunction with alpha–beta so that a reasonably good move can be returned even if the algorithm is interrupted before it has finished execution. Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints, as well as shallow alpha and beta estimates, that both can help produce cutoffs for higher depth searches much earlier than would otherwise be possible. Algorithms like
SSS* SSS* is a search algorithm, introduced by George Stockman in 1979, that conducts a state space search traversing a game tree in a best-first fashion similar to that of the A* search algorithm. SSS* is based on the notion of solution trees. Info ...
, on the other hand, use the best-first strategy. This can potentially make them more time-efficient, but typically at a heavy cost in space-efficiency.


See also

*
Minimax Minimax (sometimes MinMax, MM or saddle point) is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for ''mini''mizing the possible loss for a worst case (''max''imum loss) scenario. When ...
* Expectiminimax * Negamax * Pruning (algorithm) *
Branch and bound Branch and bound (BB, B&B, or BnB) is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solut ...
* Combinatorial optimization * Principal variation search * Transposition table


References


Bibliography

* * * {{DEFAULTSORT:Alpha-Beta Pruning Game artificial intelligence Graph algorithms Optimization algorithms and methods Search algorithms Articles with example pseudocode