Maximin (decision theory)

Minimax (sometimes MinMax, MM or saddle point) is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for ''mini''mizing the possible loss for a worst case (''max''imum loss) scenario. When dealing with gains, it is referred to as "maximin": to maximize the minimum gain. Originally formulated for several-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision-making in the presence of uncertainty.


Game theory


In general games

The maximin value is the highest value that the player can be sure to get without knowing the actions of the other players; equivalently, it is the lowest value the other players can force the player to receive when they know the player's action. Its formal definition is:

:\underline{v_i} = \max_{a_i} \min_{a_{-i}} v_i(a_i, a_{-i})

Where:
* i is the index of the player of interest.
* -i denotes all other players except player i.
* a_i is the action taken by player i.
* a_{-i} denotes the actions taken by all other players.
* v_i is the value function of player i.

Calculating the maximin value of a player is done in a worst-case approach: for each possible action of the player, we check all possible actions of the other players and determine the worst possible combination of actions, the one that gives player i the smallest value. Then, we determine which action player i can take in order to make sure that this smallest value is the highest possible.

For example, consider the following game for two players, where the first player ("row player") may choose any of three moves, labelled T, M, or B, and the second player ("column player") may choose either of two moves, L or R. The result of the combination of both moves is expressed in a payoff table:

:\begin{array}{c|c|c} \hline & L & R \\ \hline T & 3,1 & 2,-20 \\ M & 5,0 & -10,1 \\ B & -100,2 & 4,4 \\ \hline \end{array}

(where the first number in each cell is the pay-out of the row player and the second number is the pay-out of the column player).

For the sake of example, we consider only pure strategies. Check each player in turn:
* The row player can play T, which guarantees them a payoff of at least 2 (playing B is risky since it can lead to payoff -100, and playing M can result in a payoff of -10). Hence: \underline{v_{row}} = 2.
* The column player can play L and secure a payoff of at least 0 (playing R puts them at risk of getting -20). Hence: \underline{v_{col}} = 0.

If both players play their respective maximin strategies (T,L), the payoff vector is (3,1).
The minimax value of a player is the smallest value that the other players can force the player to receive, without knowing the player's actions; equivalently, it is the largest value the player can be sure to get when they ''know'' the actions of the other players. Its formal definition is:

:\overline{v_i} = \min_{a_{-i}} \max_{a_i} v_i(a_i, a_{-i})

The definition is very similar to that of the maximin value; only the order of the maximum and minimum operators is inverted. In the above example:
* The row player can get a maximum value of 4 (if the other player plays R) or 5 (if the other player plays L), so: \overline{v_{row}} = 4.
* The column player can get a maximum value of 1 (if the other player plays T), 1 (if M) or 4 (if B). Hence: \overline{v_{col}} = 1.

For every player i, the maximin is at most the minimax:

:\underline{v_i} \leq \overline{v_i}

Intuitively, in maximin the maximization comes after the minimization, so player i tries to maximize their value before knowing what the others will do; in minimax the maximization comes before the minimization, so player i is in a much better position: they maximize their value knowing what the others did.

Another way to understand the ''notation'' is by reading from right to left. When we write

:\overline{v_i} = \min_{a_{-i}} \max_{a_i} v_i(a_i, a_{-i}) = \min_{a_{-i}} \Big( \max_{a_i} v_i(a_i, a_{-i}) \Big)

the initial set of outcomes v_i(a_i, a_{-i}) depends on both a_i and a_{-i}. We first ''marginalize away'' a_i from v_i(a_i, a_{-i}), by maximizing over a_i (for every possible value of a_{-i}), to yield a set of marginal outcomes v'_i(a_{-i}), which depends only on a_{-i}. We then minimize over a_{-i} over these outcomes. (Conversely for maximin.)

Although it is always the case that \underline{v_{row}} \leq \overline{v_{row}} and \underline{v_{col}} \leq \overline{v_{col}}, the payoff vector resulting from both players playing their minimax strategies, (2,-20) in the case of (T,R) or (-10,1) in the case of (M,R), cannot similarly be ranked against the payoff vector (3,1) resulting from both players playing their maximin strategy.
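The pure-strategy values quoted for the payoff table above can be checked mechanically. The following is a minimal Python sketch, not part of the original article; the names `PAYOFFS`, `maximin_row`, `minimax_row`, `maximin_col` and `minimax_col` are illustrative choices.

```python
# Payoff table from the example: (row move, column move) -> (row payoff, column payoff).
ROWS = ["T", "M", "B"]
COLS = ["L", "R"]
PAYOFFS = {
    ("T", "L"): (3, 1),    ("T", "R"): (2, -20),
    ("M", "L"): (5, 0),    ("M", "R"): (-10, 1),
    ("B", "L"): (-100, 2), ("B", "R"): (4, 4),
}


def row_value(r, c):
    """Payoff to the row player."""
    return PAYOFFS[(r, c)][0]


def col_value(r, c):
    """Payoff to the column player."""
    return PAYOFFS[(r, c)][1]


def maximin_row():
    # Row player commits first: best of the worst cases over the column moves.
    return max(min(row_value(r, c) for c in COLS) for r in ROWS)


def minimax_row():
    # Column player commits first: row player best-responds, column picks the least favorable.
    return min(max(row_value(r, c) for r in ROWS) for c in COLS)


def maximin_col():
    return max(min(col_value(r, c) for r in ROWS) for c in COLS)


def minimax_col():
    return min(max(col_value(r, c) for c in COLS) for r in ROWS)


if __name__ == "__main__":
    print(maximin_row(), minimax_row())  # row player: maximin, minimax
    print(maximin_col(), minimax_col())  # column player: maximin, minimax
```

The inner operator in each function is the one that is evaluated first, which mirrors the right-to-left reading of the notation described above.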


In zero-sum games

In two-player zero-sum games, the minimax solution is the same as the Nash equilibrium. In the context of zero-sum games, the minimax theorem is equivalent to:

For every two-person, zero-sum game with finitely many strategies, there exists a value V and a mixed strategy for each player, such that
:(a) Given Player 2's strategy, the best payoff possible for Player 1 is V, and
:(b) Given Player 1's strategy, the best payoff possible for Player 2 is −V.

Equivalently, Player 1's strategy guarantees them a payoff of V regardless of Player 2's strategy, and similarly Player 2 can guarantee themselves a payoff of −V. The name ''minimax'' arises because each player minimizes the maximum payoff possible for the other; since the game is zero-sum, they also minimize their own maximum loss (i.e. maximize their minimum payoff). See also example of a game without a value.


Example

The following example of a zero-sum game, where A and B make simultaneous moves, illustrates ''maximin'' solutions. Suppose each player has three choices and consider the payoff matrix for A displayed on the table ("Payoff matrix for player A").

Payoff matrix for player A

          B1    B2    B3
    A1    +3    −2    +2
    A2    −1     0    +4
    A3    −4    −3    +1

Assume the payoff matrix for B is the same matrix with the signs reversed (i.e. if the choices are A1 and B1 then B pays 3 to A). Then, the maximin choice for A is A2 since the worst possible result is then having to pay 1, while the simple maximin choice for B is B2 since the worst possible result is then no payment. However, this solution is not stable, since if B believes A will choose A2 then B will choose B1 to gain 1; then if A believes B will choose B1 then A will choose A1 to gain 3; and then B will choose B2; and eventually both players will realize the difficulty of making a choice. So a more stable strategy is needed.

Some choices are ''dominated'' by others and can be eliminated: A will not choose A3 since either A1 or A2 will produce a better result, no matter what B chooses; B will not choose B3 since some mixtures of B1 and B2 will produce a better result, no matter what A chooses.

Player A can avoid having to make an expected payment of more than 1/3 by choosing A1 with probability 1/6 and A2 with probability 5/6. The expected payoff for A would be 3 × (1/6) − 1 × (5/6) = −1/3 in case B chose B1 and −2 × (1/6) + 0 × (5/6) = −1/3 in case B chose B2. Similarly, B can ensure an expected gain of at least 1/3, no matter what A chooses, by using a randomized strategy of choosing B1 with probability 1/3 and B2 with probability 2/3. These mixed minimax strategies cannot be improved and are now stable.
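The arithmetic behind the mixed strategies can be checked with exact fractions. The sketch below is illustrative only. It assumes that the payoff matrix for A has rows (3, −2, 2), (−1, 0, 4) and (−4, −3, 1), which agrees with every statement made in the example. It also assumes the mixed strategies A = (1/6, 5/6, 0) and B = (1/3, 2/3, 0), which are the probabilities that equalize each player's expected payoff across the opponent's undominated choices. The names `A_PAYOFF`, `expected_payoff`, `p_star` and `q_star` are hypothetical.

```python
from fractions import Fraction as F

# Assumed payoff matrix for player A (rows A1..A3, columns B1..B3).
# Player B receives the negation of each entry.
A_PAYOFF = [
    [F(3), F(-2), F(2)],
    [F(-1), F(0), F(4)],
    [F(-4), F(-3), F(1)],
]


def expected_payoff(p, q):
    """Expected payoff to A when A mixes with p and B mixes with q."""
    return sum(p[i] * q[j] * A_PAYOFF[i][j] for i in range(3) for j in range(3))


def pure(k):
    """Probability vector that plays choice k with certainty."""
    return [F(1) if i == k else F(0) for i in range(3)]


# Assumed mixed strategies: equalizing probabilities over the undominated choices.
p_star = [F(1, 6), F(5, 6), F(0)]  # A plays A1, A2, A3
q_star = [F(1, 3), F(2, 3), F(0)]  # B plays B1, B2, B3

# Worst case for A over B's pure replies: what A can guarantee.
a_guarantee = min(expected_payoff(p_star, pure(j)) for j in range(3))
# Best case for A over A's pure replies: the most B can be made to concede.
b_cap = max(expected_payoff(pure(i), q_star) for i in range(3))

if __name__ == "__main__":
    print(a_guarantee, b_cap)
```

Under these assumptions both quantities come out equal, which is what makes the pair of mixed strategies stable: neither player can improve by deviating unilaterally.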


Maximin

Frequently, in game theory, maximin is distinct from minimax. Minimax is used in zero-sum games to denote minimizing the opponent's maximum payoff. In a zero-sum game, this is identical to minimizing one's own maximum loss, and to maximizing one's own minimum gain. "Maximin" is a term commonly used for non-zero-sum games to describe the strategy which maximizes one's own minimum payoff. In non-zero-sum games, this is not generally the same as minimizing the opponent's maximum gain, nor the same as the Nash equilibrium strategy.


In repeated games

The minimax values are very important in the theory of repeated games. One of the central theorems in this theory, the folk theorem, relies on the minimax values.


Combinatorial game theory

In combinatorial game theory, there is a minimax algorithm for game solutions.

A simple version of the minimax ''algorithm'', stated below, deals with games such as tic-tac-toe, where each player can win, lose, or draw. If player A ''can'' win in one move, their best move is that winning move. If player B knows that one move will lead to the situation where player A ''can'' win in one move, while another move will lead to the situation where player A can, at best, draw, then player B's best move is the one leading to a draw. Late in the game, it's easy to see what the "best" move is. The minimax algorithm helps find the best move, by working backwards from the end of the game. At each step it assumes that player A is trying to maximize the chances of A winning, while on the next turn player B is trying to minimize the chances of A winning (i.e., to maximize B's own chances of winning).


Minimax algorithm with alternate moves

A minimax algorithm is a recursive algorithm for choosing the next move in an n-player game, usually a two-player game. A value is associated with each position or state of the game. This value is computed by means of a position evaluation function and it indicates how good it would be for a player to reach that position. The player then makes the move that maximizes the minimum value of the position resulting from the opponent's possible following moves. If it is A's turn to move, A gives a value to each of their legal moves.

A possible allocation method consists in assigning a certain win for A as +1 and for B as −1. This leads to combinatorial game theory as developed by J.H. Conway. An alternative is using a rule that if the result of a move is an immediate win for A it is assigned positive infinity and if it is an immediate win for B, negative infinity. The value to A of any other move is the minimum of the values resulting from each of B's possible replies. For this reason, A is called the ''maximizing player'' and B is called the ''minimizing player'', hence the name ''minimax algorithm''. The above algorithm will assign a value of positive or negative infinity to any position since the value of every position will be the value of some final winning or losing position. In practice this is possible only near the very end of complicated games such as chess or go, since it is not computationally feasible to look ahead as far as the completion of the game; instead, positions are given finite values as estimates of the degree of belief that they will lead to a win for one player or another.

This can be extended if we can supply a heuristic evaluation function which gives values to non-final game states without considering all possible following complete sequences. We can then limit the minimax algorithm to look only at a certain number of moves ahead. This number is called the "look-ahead", measured in "plies". For example, the chess computer Deep Blue (the first one to beat a reigning world champion, Garry Kasparov at that time) looked ahead at least 12 plies, then applied a heuristic evaluation function.

The algorithm can be thought of as exploring the nodes of a ''game tree''. The ''effective branching factor'' of the tree is the average number of children of each node (i.e., the average number of legal moves in a position). The number of nodes to be explored usually increases exponentially with the number of plies (it is less than exponential if evaluating forced moves or repeated positions). The number of nodes to be explored for the analysis of a game is therefore approximately the branching factor raised to the power of the number of plies. It is therefore impractical to completely analyze games such as chess using the minimax algorithm.

The performance of the naïve minimax algorithm may be improved dramatically, without affecting the result, by the use of alpha–beta pruning. Other heuristic pruning methods can also be used, but not all of them are guaranteed to give the same result as the unpruned search.

A naïve minimax algorithm may be trivially modified to additionally return an entire principal variation along with a minimax score.
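To illustrate the claim that alpha–beta pruning reduces the number of explored nodes without changing the result, here is a small Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the game tree is a hypothetical nested list whose leaves are already-evaluated scores, and the `counter` argument exists only to count visited nodes.

```python
import math

# Hypothetical two-ply tree: the root is a maximizing node, its children are minimizing nodes.
TREE = [[3, 5], [2, 9], [0, 1]]


def minimax_plain(node, maximizing, counter):
    """Unpruned minimax; counter[0] records how many nodes are visited."""
    counter[0] += 1
    if not isinstance(node, list):  # leaf: the score is the node itself
        return node
    values = [minimax_plain(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, counter, alpha=-math.inf, beta=math.inf):
    """Minimax with alpha-beta cut-offs; returns the same value as minimax_plain."""
    counter[0] += 1
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, counter, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:  # the minimizer above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, counter, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:  # the maximizer above already has something at least as good
            break
    return value


if __name__ == "__main__":
    plain_count, pruned_count = [0], [0]
    print(minimax_plain(TREE, True, plain_count), plain_count[0])
    print(alphabeta(TREE, True, pruned_count), pruned_count[0])
```

On this tree both searches return the same root value, while the pruned search skips the leaves that cannot affect it. How much is saved in general depends on move ordering.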


Pseudocode

The pseudocode for the depth-limited minimax algorithm is given below.

    function minimax(node, depth, maximizingPlayer) is
        if depth = 0 or node is a terminal node then
            return the heuristic value of node
        if maximizingPlayer then
            value := −∞
            for each child of node do
                value := max(value, minimax(child, depth − 1, FALSE))
            return value
        else (* minimizing player *)
            value := +∞
            for each child of node do
                value := min(value, minimax(child, depth − 1, TRUE))
            return value

    (* Initial call *)
    minimax(origin, depth, TRUE)

The minimax function returns a heuristic value for leaf nodes (terminal nodes and nodes at the maximum search depth). Non-leaf nodes inherit their value from a descendant leaf node. The heuristic value is a score measuring the favorability of the node for the maximizing player. Hence nodes resulting in a favorable outcome, such as a win, for the maximizing player have higher scores than nodes more favorable for the minimizing player. The heuristic values for terminal (game ending) leaf nodes are scores corresponding to win, loss, or draw, for the maximizing player. For non-terminal leaf nodes at the maximum search depth, an evaluation function estimates a heuristic value for the node. The quality of this estimate and the search depth determine the quality and accuracy of the final minimax result.

Minimax treats the two players (the maximizing player and the minimizing player) separately in its code. Based on the observation that \max(a,b) = -\min(-a,-b), minimax may often be simplified into the negamax algorithm.
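The pseudocode above translates almost line for line into Python. The sketch below is one possible rendering, shown next to a negamax variant so that the simplification based on \max(a,b) = -\min(-a,-b) can be seen in code. The callbacks `children` and `evaluate`, and the toy data in `TREE` and `SCORES`, are hypothetical stand-ins for a real game's move generator and evaluation function.

```python
def minimax(node, depth, maximizing_player, children, evaluate):
    """Depth-limited minimax; evaluate() scores a node for the maximizing player."""
    kids = children(node)
    if depth == 0 or not kids:  # search horizon reached or terminal node
        return evaluate(node)
    if maximizing_player:
        return max(minimax(k, depth - 1, False, children, evaluate) for k in kids)
    return min(minimax(k, depth - 1, True, children, evaluate) for k in kids)


def negamax(node, depth, color, children, evaluate):
    """Negamax variant: color is +1 when the maximizing player is to move, else -1."""
    kids = children(node)
    if depth == 0 or not kids:
        return color * evaluate(node)  # score from the point of view of the side to move
    return max(-negamax(k, depth - 1, -color, children, evaluate) for k in kids)


# Hypothetical toy game tree and heuristic scores, used only for demonstration.
TREE = {"root": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
SCORES = {"root": 0, "a": 4, "b": 6, "a1": 3, "a2": 5, "b1": 2, "b2": 9}


def toy_children(node):
    return TREE.get(node, [])


def toy_evaluate(node):
    return SCORES[node]


if __name__ == "__main__":
    print(minimax("root", 2, True, toy_children, toy_evaluate))
    print(negamax("root", 2, 1, toy_children, toy_evaluate))
```

With a look-ahead of 2 the toy search reaches the leaves, whereas with a look-ahead of 1 it stops at the intermediate nodes and relies on their heuristic scores. This is why the quality of the evaluation function matters at the search horizon.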


Example

Suppose the game being played only has a maximum of two possible moves per player each turn. The algorithm generates the tree on the right, where the circles represent the moves of the player running the algorithm (''maximizing player''), and squares represent the moves of the opponent (''minimizing player''). Because of the limitation of computation resources, as explained above, the tree is limited to a ''look-ahead'' of 4 moves.

The algorithm evaluates each ''leaf node'' using a heuristic evaluation function, obtaining the values shown. The moves where the ''maximizing player'' wins are assigned positive infinity, while the moves that lead to a win of the ''minimizing player'' are assigned negative infinity. At level 3, the algorithm will choose, for each node, the smallest of the ''child node'' values, and assign it to that same node (e.g. the node on the left will choose the minimum between "10" and "+∞", therefore assigning the value "10" to itself). The next step, in level 2, consists of choosing for each node the largest of the ''child node'' values. Once again, the values are assigned to each ''parent node''. The algorithm continues evaluating the maximum and minimum values of the child nodes alternately until it reaches the ''root node'', where it chooses the move with the largest value (represented in the figure with a blue arrow). This is the move that the player should make in order to ''minimize'' the ''maximum'' possible loss.


Minimax for individual decisions


Minimax in the face of uncertainty

Minimax theory has been extended to decisions where there is no other player, but where the consequences of decisions depend on unknown facts. For example, deciding to prospect for minerals entails a cost, which will be wasted if the minerals are not present, but will bring major rewards if they are. One approach is to treat this as a game against ''nature'' (see move by nature), and using a similar mindset as Murphy's law or resistentialism, take an approach which minimizes the maximum expected loss, using the same techniques as in the two-person zero-sum games. In addition, expectiminimax trees have been developed, for two-player games in which chance (for example, dice) is a factor.
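A game against nature can be sketched as a table of losses indexed by decision and by state of the world. In the minimal Python example below, the loss figures for the prospecting decision are hypothetical numbers chosen only for illustration, and `minimax_decision` is an assumed helper name. The rule picks the decision whose worst-case loss is smallest, without using any probabilities.

```python
# Hypothetical losses (negative numbers are gains): decision -> state of nature -> loss.
LOSSES = {
    "prospect": {"minerals present": -100, "minerals absent": 20},
    "do not prospect": {"minerals present": 0, "minerals absent": 0},
}


def minimax_decision(losses):
    """Return the decision with the smallest worst-case loss, and that loss."""
    worst_case = {decision: max(by_state.values()) for decision, by_state in losses.items()}
    best = min(worst_case, key=worst_case.get)
    return best, worst_case[best]


if __name__ == "__main__":
    print(minimax_decision(LOSSES))
```

With these assumed numbers the rule declines to prospect, because the wasted cost is the worst case of prospecting, however large the possible reward. This illustrates the pessimistic character of the criterion.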


Minimax criterion in statistical decision theory

In classical statistical decision theory, we have an estimator \delta that is used to estimate a parameter \theta \in \Theta. We also assume a risk function R(\theta,\delta), usually specified as the integral of a loss function. In this framework, \tilde{\delta} is called minimax if it satisfies

:\sup_\theta R(\theta,\tilde{\delta}) = \inf_\delta \sup_\theta R(\theta,\delta).

An alternative criterion in the decision theoretic framework is the Bayes estimator in the presence of a prior distribution \Pi. An estimator is Bayes if it minimizes the ''average'' risk

:\int_\Theta R(\theta,\delta) \, \operatorname{d}\Pi(\theta).
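For a finite parameter set and a finite set of candidate estimators, the supremum and the integral above reduce to a maximum and a weighted sum, so the two criteria can be compared directly. The Python sketch below uses a hypothetical risk table and a hypothetical prior. The names `RISK`, `PRIOR`, `minimax_estimator` and `bayes_estimator` are illustrative and do not come from any library.

```python
# Hypothetical risk table: estimator -> parameter value -> risk R(theta, delta).
RISK = {
    "delta_1": {"theta_1": 1.0, "theta_2": 4.0},
    "delta_2": {"theta_1": 3.0, "theta_2": 3.0},
}

# Hypothetical prior distribution over the parameter values.
PRIOR = {"theta_1": 0.9, "theta_2": 0.1}


def minimax_estimator(risk):
    """Estimator minimizing the maximum risk over the parameter values."""
    return min(risk, key=lambda d: max(risk[d].values()))


def bayes_estimator(risk, prior):
    """Estimator minimizing the prior-weighted average risk."""
    return min(risk, key=lambda d: sum(prior[t] * risk[d][t] for t in prior))


if __name__ == "__main__":
    print(minimax_estimator(RISK))
    print(bayes_estimator(RISK, PRIOR))
```

In this assumed table the two criteria disagree. The minimax choice guards against the worst parameter value, while the Bayes choice accepts a high risk in a case that the prior considers unlikely.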


Non-probabilistic decision theory

A key feature of minimax decision making is being non-probabilistic: in contrast to decisions using expected value or expected utility, it makes no assumptions about the probabilities of various outcomes, just scenario analysis of what the possible outcomes are. It is thus robust to changes in the assumptions, in contrast to these other decision techniques. Various extensions of this non-probabilistic approach exist, notably minimax regret and Info-gap decision theory.

Further, minimax only requires ordinal measurement (that outcomes be compared and ranked), not ''interval'' measurements (that outcomes include "how much better or worse"), and returns ordinal data, using only the modeled outcomes: the conclusion of a minimax analysis is: "this strategy is minimax, as the worst case is (outcome), which is less bad than any other strategy". Compare to expected value analysis, whose conclusion is of the form: "This strategy yields E(X) = n." Minimax thus can be used on ordinal data, and can be more transparent.


Maximin in philosophy

In philosophy, the term "maximin" is often used in the context of John Rawls's ''A Theory of Justice'', where he refers to it in the context of the difference principle. Rawls defined this principle as the rule which states that social and economic inequalities should be arranged so that "they are to be of the greatest benefit to the least-advantaged members of society".


See also

* Alpha–beta pruning
* Expectiminimax
* Computer chess
* Horizon effect
* Lesser of two evils principle
* Minimax Condorcet
* Minimax regret
* Monte Carlo tree search
* Negamax
* Negascout
* Sion's minimax theorem
* Tit for Tat
* Transposition table
* Wald's maximin model

