The Prisoner's Dilemma is an example of a game analyzed in

game theory

. It is also a thought experiment that challenges two completely

rational

agents to a dilemma: cooperate with their partner for mutual reward, or betray their partner ("defect") for individual reward. This dilemma was originally framed by

Merrill Flood

and

Melvin Dresher

while working at

RAND

in 1950. Albert W. Tucker appropriated the game and formalized it by structuring the rewards in terms of prison sentences and named it "prisoner's dilemma".

William Poundstone

in his 1993 book ''Prisoner's Dilemma'' writes the following version:

Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of speaking to or exchanging messages with the other. The police admit they don't have enough evidence to convict the pair on the principal charge. They plan to sentence both to two years in prison on a lesser charge. Simultaneously, the police offer each prisoner a
Faustian bargain
.

The possible outcomes are:
* A: If A and B each betray the other, each of them serves 5 years in prison
* B: If A betrays B but B remains silent, A will be set free and B will serve 10 years in prison
* C: If A remains silent but B betrays A, A will serve 10 years in prison and B will be set free
* D: If A and B both remain silent, both of them will serve 2 years in prison (on the lesser charge).

As a projection of rational behaviour in terms of loyalty to one's partner in crime, the Prisoner's Dilemma suggests that criminals who are offered a greater reward will betray their partner for the reward. Accepting an offer that leads to outcome B shows that loyalty to one's partner is, in this game, irrational. Rationality in a system that profits from others governs this behaviour. Alternative ideas governing behaviour have been proposed; see, for example,

Elinor Ostrom,

a Nobel Laureate in Economics. An assumption of the Prisoner's Dilemma is that all purely rational behaviour is self-interested, and as such people will betray each other for self-interest. This implies the only possible outcome for two purely rational prisoners is for them to betray each other, even though mutual cooperation would yield a greater reward. In this case, "to betray" is the

dominant strategy

for both players, meaning it is the player's best response in all circumstances, and it is aligned with the

sure-thing principle

. The prisoner's dilemma also illustrates that the decisions made under collective rationality may not necessarily be the same as those made under individual rationality, and this conflict can also be witnessed in a situation called the "

Tragedy of the Commons

". This case illustrates that shared resources are prone to over-use. In reality, a

systemic bias

towards cooperative behavior is observed despite what is predicted by simple models of "rational" self-interested action. This bias towards cooperation has been known since the test was first conducted at RAND; the secretaries involved trusted each other and worked together for the best common outcome. The prisoner's dilemma became the focus of extensive experimental research. This research usually takes one of three forms, each with a different purpose: single play, iterated play, and iterated play against a programmed player. Taken together, the results of these experiments have been read as supporting the

categorical imperative

raised by

Immanuel Kant

, which states that a rational agent is expected to "act in the way you wish others to act." This principle matters in situations where several players each act in their own best interest and must take the others' actions into consideration when forming their own choice. It underlines the interconnectedness of players in such a game: to be successful, a strategy has to account for others' reactions, including their responsiveness and their tendency to imitate. An extended "iterated" version of the game also exists. In this version, the classic game is played repeatedly between the same prisoners, who continuously have the opportunity to penalize the other for previous decisions. If the number of times the game will be played is known to the players, then by

backward induction

two classically rational players will betray each other repeatedly, for the same reasons as in the single-shot variant. In an infinite or unknown length game there is no fixed optimum strategy, and prisoner's dilemma tournaments have been held to compete and test algorithms for such cases.

The iterated version of the prisoner's dilemma is of particular interest to researchers. Because the game is repeated, researchers have observed that how often players cooperate can change with the outcome of each iteration. Specifically, a player may become less willing to cooperate after their counterpart has repeatedly failed to cooperate, out of disappointment. Conversely, cooperation can increase over time, mainly because a "tacit agreement" forms between the players. Another interesting aspect of the iterated experiments is that this tacit agreement has often been established even when the number of iterations is made public to both sides.

The prisoner's dilemma game can be used as a model for many real world situations involving cooperative behavior. In casual usage, the label "prisoner's dilemma" may be applied to situations not strictly matching the formal criteria of the classic or iterative games: for instance, those in which two entities could gain important benefits from cooperating or suffer from the failure to do so, but find it difficult or expensive (not necessarily impossible) to coordinate their activities.

Strategy for the prisoner's dilemma

Two prisoners are separated into individual rooms and cannot communicate with each other. The normal game is shown below, with each cell giving the sentence served by A and by B:

{| class="wikitable"
! !! B stays silent (cooperates) !! B betrays (defects)
|-
! A stays silent (cooperates)
| A: 2 years, B: 2 years || A: 10 years, B: goes free
|-
! A betrays (defects)
| A: goes free, B: 10 years || A: 5 years, B: 5 years
|}

It is assumed that both prisoners understand the nature of the game, have no loyalty to each other, and will have no opportunity for retribution or reward outside the game. Regardless of what the other decides, each prisoner gets a higher reward by betraying the other ("defecting"). The reasoning involves analyzing both players' best responses: B will either cooperate or defect. If B cooperates, A should defect, because going free is better than serving 2 years. If B defects, A should also defect, because serving 5 years is better than serving 10. So either way, A should defect since defecting is A's best response regardless of B's strategy. Parallel reasoning will show that B should defect. Because defection always results in a better payoff than cooperation regardless of the other player's choice, it is a strictly dominant strategy for both A and B. Mutual defection is the only strong Nash equilibrium in the game (i.e. the only outcome from which each player could only do worse by unilaterally changing strategy). The dilemma, then, is that mutual cooperation yields a better outcome than mutual defection but is not the rational outcome because the choice to cooperate, from a self-interested perspective, is irrational. Thus, the prisoner's dilemma is a game where the Nash equilibrium is not Pareto efficient.
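The best-response argument can be checked mechanically. The following is a minimal sketch in Python, assuming the sentences of the version quoted above (2, 5 and 10 years, or going free); the names `YEARS`, `best_response_for_a` and `is_nash_equilibrium` are hypothetical helpers introduced only for this illustration.

```python
# Sentences in years for (A's move, B's move); a lower number is better.
# Assumed values: 2, 5 and 10 years, or 0 when a prisoner goes free.
YEARS = {
    ("silent", "silent"): (2, 2),
    ("silent", "betray"): (10, 0),
    ("betray", "silent"): (0, 10),
    ("betray", "betray"): (5, 5),
}
MOVES = ("silent", "betray")


def best_response_for_a(b_move):
    """Return the move that minimizes A's own sentence, given B's move."""
    return min(MOVES, key=lambda a_move: YEARS[(a_move, b_move)][0])


def is_nash_equilibrium(a_move, b_move):
    """True if neither player can shorten their own sentence by deviating alone."""
    a_years, b_years = YEARS[(a_move, b_move)]
    a_can_improve = any(YEARS[(alt, b_move)][0] < a_years for alt in MOVES)
    b_can_improve = any(YEARS[(a_move, alt)][1] < b_years for alt in MOVES)
    return not (a_can_improve or b_can_improve)


responses = {b_move: best_response_for_a(b_move) for b_move in MOVES}
equilibria = [(a, b) for a in MOVES for b in MOVES if is_nash_equilibrium(a, b)]
print(responses)   # A's best response to each of B's moves
print(equilibria)  # outcomes from which nobody gains by deviating alone
```

Under these assumed sentences, betraying is A's best response to either move by B, and mutual betrayal is the only outcome that survives the unilateral-deviation test.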

Generalized form

The structure of the traditional prisoner's dilemma can be generalized from its original prisoner setting. Suppose that the two players are represented by the colors red and blue and that each player chooses to either "cooperate" (stay silent) or "defect" (betray). If both players cooperate, they both receive the reward ''R'' for cooperating. If both players defect, they both receive the punishment payoff ''P''. If Blue defects while Red cooperates, then Blue receives the temptation payoff ''T'', while Red receives the "sucker's" payoff, ''S''. Similarly, if Blue cooperates while Red defects, then Blue receives the sucker's payoff ''S'', while Red receives the temptation payoff ''T''. This can be expressed in normal form, with each cell listing Red's payoff followed by Blue's payoff:

{| class="wikitable"
! !! Blue cooperates !! Blue defects
|-
! Red cooperates
| ''R'', ''R'' || ''S'', ''T''
|-
! Red defects
| ''T'', ''S'' || ''P'', ''P''
|}

and to be a prisoner's dilemma game in the strong sense, the following condition must hold for the payoffs:

: ''T'' > ''R'' > ''P'' > ''S''

The payoff relationship ''R'' > ''P'' implies that mutual cooperation is superior to mutual defection, while the payoff relationships ''T'' > ''R'' and ''P'' > ''S'' imply that defection is the dominant strategy for both agents.
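As a small illustration, the ordering condition can be written as a predicate on the four payoffs. This is only a sketch: the function names are hypothetical, the first example negates the prison sentences of the classic story so that larger numbers are better, and the second set of payoffs is invented to show a game that fails the test.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the strong-sense ordering T > R > P > S."""
    return T > R > P > S


def defection_dominates(T, R, P, S):
    """Defecting beats cooperating against either move: T > R and P > S."""
    return T > R and P > S


# Prison sentences from the classic story, negated so that larger is better.
prison = dict(T=0, R=-2, P=-5, S=-10)
# Hypothetical payoffs in which mutual cooperation beats temptation (R > T).
not_a_dilemma = dict(T=3, R=4, P=1, S=0)

print(is_prisoners_dilemma(**prison))         # True
print(is_prisoners_dilemma(**not_a_dilemma))  # False
print(defection_dominates(**not_a_dilemma))   # False, since T > R fails
```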

Special case: donation game

The "donation game" is a form of prisoner's dilemma in which cooperation corresponds to offering the other player a benefit ''b'' at a personal cost ''c'' with ''b'' > ''c''. Defection means offering nothing. The payoff matrix is thus (row player's payoff first, column player's payoff second):

{| class="wikitable"
! !! Cooperate !! Defect
|-
! Cooperate
| ''b'' − ''c'', ''b'' − ''c'' || −''c'', ''b''
|-
! Defect
| ''b'', −''c'' || 0, 0
|}

Note that 2''R'' > ''T'' + ''S'' (i.e. 2(''b'' − ''c'') > ''b'' − ''c''), which qualifies the donation game to be an iterated game (see next section).

The donation game may be applied to markets. Suppose X grows oranges, Y grows apples. The marginal utility of an apple to the orange-grower X is ''b'', which is higher than the marginal utility (''c'') of an orange, since X has a surplus of oranges and no apples. Similarly, for apple-grower Y, the marginal utility of an orange is ''b'' while the marginal utility of an apple is ''c''. If X and Y contract to exchange an apple and an orange, and each fulfills their end of the deal, then each receives a payoff of ''b'' − ''c''. If one "defects" and does not deliver as promised, the defector will receive a payoff of ''b'', while the cooperator will lose ''c''. If both defect, then neither one gains or loses anything.
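To make the mapping concrete, the donation game's payoffs can be derived from ''b'' and ''c'' and tested against both conditions. This is a hedged sketch: `donation_game` is a hypothetical helper, the values ''b'' = 3 and ''c'' = 1 are arbitrary, and the requirement ''c'' > 0 is an added assumption (a cost of zero would make ''T'' equal to ''R'').

```python
def donation_game(b, c):
    """Return the payoffs T, R, P, S of a donation game with benefit b and cost c."""
    if not b > c > 0:
        # Assumption of this sketch: the cost is strictly positive.
        raise ValueError("expected b > c > 0")
    return {"T": b, "R": b - c, "P": 0, "S": -c}


payoffs = donation_game(b=3, c=1)

# Strong-sense prisoner's dilemma ordering.
is_dilemma = payoffs["T"] > payoffs["R"] > payoffs["P"] > payoffs["S"]
# Extra condition used for the iterated game: 2R > T + S.
favours_mutual_cooperation = 2 * payoffs["R"] > payoffs["T"] + payoffs["S"]

print(payoffs)
print(is_dilemma, favours_mutual_cooperation)
```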

The iterated prisoner's dilemma

If two players play prisoner's dilemma more than once in succession and they remember previous actions of their opponent and change their strategy accordingly, the game is called iterated prisoner's dilemma. In addition to the general form above, the iterative version also requires that 2''R'' > ''T'' + ''S'', to prevent alternating cooperation and defection giving a greater reward than mutual cooperation. The iterated prisoner's dilemma game is fundamental to some theories of human cooperation and trust. On the assumption that the game can model transactions between two people requiring trust, cooperative behavior in populations may be modeled by a multi-player, iterated version of the game. It has, consequently, fascinated many scholars over the years. In 1975, Grofman and Pool estimated the count of scholarly articles devoted to it at over 2,000. The iterated prisoner's dilemma has also been referred to as the "peace-war game".

If the game is played exactly ''N'' times and both players know this, then it is optimal to defect in all rounds. The only possible Nash equilibrium is to always defect. The proof is inductive: one might as well defect on the last turn, since the opponent will not have a chance to later retaliate. Therefore, both will defect on the last turn. Thus, the player might as well defect on the second-to-last turn, since the opponent will defect on the last no matter what is done, and so on. The same applies if the game length is unknown but has a known upper limit.

Unlike the standard prisoner's dilemma, in the iterated prisoner's dilemma the defection strategy is counter-intuitive and fails badly to predict the behavior of human players. Within standard economic theory, though, this is the only correct answer. The superrational strategy in the iterated prisoner's dilemma with fixed ''N'' is to cooperate against a superrational opponent, and in the limit of large ''N'', experimental results on strategies agree with the superrational version, not the game-theoretic rational one. For

cooperation

to emerge between game theoretic rational players, the total number of rounds ''N'' must be unknown to the players. In this case "always defect" may no longer be a strictly dominant strategy, only a Nash equilibrium. Amongst results shown by

Robert Aumann

in a 1959 paper, rational players repeatedly interacting for indefinitely long games can sustain the cooperative outcome. According to a 2019 experimental study in the ''American Economic Review'' that tested which strategies real-life subjects used in iterated prisoner's dilemma situations with perfect monitoring, the majority of chosen strategies were always defect,

tit-for-tat

, and

grim trigger

. Which strategy the subjects chose depended on the parameters of the game.
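The three strategies named in that study can be sketched as small functions of the move history. The code below is illustrative only: the payoffs ''T'' = 5, ''R'' = 3, ''P'' = 1, ''S'' = 0 are an assumed choice that satisfies ''T'' > ''R'' > ''P'' > ''S'' and 2''R'' > ''T'' + ''S'', and the function names and the `play` helper are hypothetical.

```python
C, D = "C", "D"
# Assumed payoffs (X's score, Y's score) for each pair of moves.
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def always_defect(own_history, other_history):
    return D


def tit_for_tat(own_history, other_history):
    # Cooperate first, then copy the opponent's previous move.
    return other_history[-1] if other_history else C


def grim_trigger(own_history, other_history):
    # Cooperate until the opponent defects once, then defect forever.
    return D if D in other_history else C


def play(strategy_x, strategy_y, rounds):
    """Play a fixed number of rounds and return the two total scores."""
    history_x, history_y = [], []
    score_x = score_y = 0
    for _ in range(rounds):
        move_x = strategy_x(history_x, history_y)
        move_y = strategy_y(history_y, history_x)
        gain_x, gain_y = PAYOFF[(move_x, move_y)]
        score_x += gain_x
        score_y += gain_y
        history_x.append(move_x)
        history_y.append(move_y)
    return score_x, score_y


print(play(tit_for_tat, always_defect, 10))  # (9, 14)
print(play(tit_for_tat, grim_trigger, 10))   # (30, 30)
```

Against always defect, tit-for-tat loses only the first round and then matches defection; against grim trigger, neither strategy ever defects, so both collect the reward in every round.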

Strategy for the iterated prisoner's dilemma

Interest in the iterated prisoner's dilemma (IPD) was kindled by

Robert Axelrod

in his book ''

The Evolution of Cooperation

'' (1984). In it he reports on a tournament he organized of the ''N'' step prisoner's dilemma (with ''N'' fixed) in which participants have to choose their mutual strategy again and again, and have memory of their previous encounters. Axelrod invited academic colleagues all over the world to devise computer strategies to compete in an IPD tournament. The programs that were entered varied widely in algorithmic complexity, initial hostility, capacity for forgiveness, and so forth. Axelrod discovered that when these encounters were repeated over a long period of time with many players, each with different strategies, greedy strategies tended to do very poorly in the long run while more

altruistic

strategies did better, as judged purely by self-interest. He used this to show a possible mechanism for the evolution of altruistic behavior from mechanisms that are initially purely selfish, by

natural selection

. The winning

deterministic

strategy was

tit for tat

, which

Anatol Rapoport

developed and entered into the tournament. It was the simplest of any program entered, containing only four lines of

BASIC

, and won the contest. The strategy is simply to cooperate on the first iteration of the game; after that, the player does what his or her opponent did on the previous move. Depending on the situation, a slightly better strategy can be "tit for tat with forgiveness": when the opponent defects, on the next move the player sometimes cooperates anyway, with a small probability (around 1–5%). This allows for occasional recovery from getting trapped in a cycle of defections. The exact probability depends on the line-up of opponents.

By analysing the top-scoring strategies, Axelrod stated several conditions necessary for a strategy to be successful.
; Nice: The most important condition is that the strategy must be "nice", that is, it will not defect before its opponent does (this is sometimes referred to as an "optimistic" algorithm). Almost all of the top-scoring strategies were nice; even a purely selfish strategy therefore has self-interested reasons not to be the first to "cheat" on its opponent.
; Retaliating: However, Axelrod contended, the successful strategy must not be a blind optimist. It must sometimes retaliate. An example of a non-retaliating strategy is Always Cooperate. This is a very bad choice, as "nasty" strategies will ruthlessly exploit such players.
; Forgiving: Successful strategies must also be forgiving. Though players will retaliate, they will once again fall back to cooperating if the opponent does not continue to defect. This stops long runs of revenge and counter-revenge, maximizing points.
; Non-envious: The last quality is being non-envious, that is, not striving to score more than the opponent.

The optimal (points-maximizing) strategy for the one-time PD game is simply defection; as explained above, this is true whatever the composition of opponents may be. However, in the iterated-PD game the optimal strategy depends upon the strategies of likely opponents, and how they will react to defections and cooperations.

For example, consider a population where everyone defects every time, except for a single individual following the tit for tat strategy. That individual is at a slight disadvantage because of the loss on the first turn. In such a population, the optimal strategy for that individual is to defect every time. In a population with a certain percentage of always-defectors and the rest being tit for tat players, the optimal strategy for an individual depends on the percentage, and on the length of the game. In the strategy called Pavlov (win-stay, lose-switch), a player faced with a failure to cooperate switches strategy on the next turn. In certain circumstances, Pavlov beats all other strategies by giving preferential treatment to co-players using a similar strategy.

Deriving the optimal strategy is generally done in two ways:
* Bayesian Nash equilibrium: If the statistical distribution of opposing strategies can be determined (e.g. 50% tit for tat, 50% always cooperate) an optimal counter-strategy can be derived analytically.
* Monte Carlo simulations of populations have been made, where individuals with low scores die off, and those with high scores reproduce (a genetic algorithm for finding an optimal strategy). The mix of algorithms in the final population generally depends on the mix in the initial population. The introduction of mutation (random variation during reproduction) lessens the dependency on the initial population; empirical experiments with such systems tend to produce tit for tat players (see for instance Chess 1988), but no analytic proof exists that this will always occur.

Although tit for tat is considered to be the most

robust

basic strategy, a team from

Southampton University

in England introduced a new strategy at the 20th-anniversary iterated prisoner's dilemma competition, which proved to be more successful than tit for tat. This strategy relied on collusion between programs to achieve the highest number of points for a single program. The university submitted 60 programs to the competition, which were designed to recognize each other through a series of five to ten moves at the start. Once this recognition was made, one program would always cooperate and the other would always defect, assuring the maximum number of points for the defector. If the program realized that it was playing a non-Southampton player, it would continuously defect in an attempt to minimize the score of the competing program. As a result, the 2004 Prisoners' Dilemma Tournament results show

University of Southampton

's strategies in the first three places, despite having fewer wins and many more losses than the GRIM strategy. (In a PD tournament, the aim of the game is not to "win" matches – that can easily be achieved by frequent defection.) The Southampton entries also took a number of positions towards the bottom. The Southampton strategy takes advantage of the fact that multiple entries were allowed in this particular competition and that the performance of a team was measured by that of the highest-scoring player (meaning that the use of self-sacrificing players was a form of

minmaxing

). In a competition where one has control of only a single player, tit for tat is certainly a better strategy. Because of this new rule, this competition also has little theoretical significance when analyzing single agent strategies as compared to Axelrod's seminal tournament. However, it provided a basis for analysing how to achieve cooperative strategies in multi-agent frameworks, especially in the presence of noise. In fact, long before this new-rules tournament was played, Dawkins, in his book ''

The Selfish Gene

'', pointed out the possibility of such strategies winning if multiple entries were allowed, but he remarked that Axelrod would most probably not have allowed them if they had been submitted. The Southampton strategy also relies on circumventing the rule that no communication is allowed between the two players, which the Southampton programs arguably did with their preprogrammed "ten move dance" to recognize one another; this only reinforces how valuable communication can be in shifting the balance of the game.

Even without implicit collusion between software strategies (exploited by the Southampton team), tit for tat is not always the absolute winner of any given tournament; it would be more precise to say that its long run results over a series of tournaments outperform its rivals. (In any one event a given strategy can be slightly better adjusted to the competition than tit for tat, but tit for tat is more robust.) The same applies to the tit for tat with forgiveness variant, and other optimal strategies: on any given day they might not "win" against a specific mix of counter-strategies. An alternative way of putting it is using the Darwinian

ESS

simulation. In such a simulation, tit for tat will almost always come to dominate, though nasty strategies will drift in and out of the population because a tit for tat population is penetrable by non-retaliating nice strategies, which in turn are easy prey for the nasty strategies.

Richard Dawkins

showed that here, no static mix of strategies forms a stable equilibrium, and the system will always oscillate between bounds.
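The population simulations described in this section, in which low scorers die off and high scorers reproduce, can be approximated with a very small model. The sketch below is not a Monte Carlo simulation or a genetic algorithm: it swaps in a deterministic, replicator-style update in which each strategy's population share grows in proportion to its average score. The payoffs (5, 3, 1, 0), the three-strategy population, the 10-round matches and all function names are assumptions made for illustration.

```python
C, D = "C", "D"
# Assumed payoffs (first player's score, second player's score).
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}


def always_cooperate(own_history, other_history):
    return C


def always_defect(own_history, other_history):
    return D


def tit_for_tat(own_history, other_history):
    return other_history[-1] if other_history else C


STRATEGIES = {
    "always_cooperate": always_cooperate,
    "always_defect": always_defect,
    "tit_for_tat": tit_for_tat,
}


def match_score(strategy_x, strategy_y, rounds):
    """Total payoff earned by strategy_x in one match against strategy_y."""
    history_x, history_y, total = [], [], 0
    for _ in range(rounds):
        move_x = strategy_x(history_x, history_y)
        move_y = strategy_y(history_y, history_x)
        total += PAYOFF[(move_x, move_y)][0]
        history_x.append(move_x)
        history_y.append(move_y)
    return total


def evolve(shares, rounds=10, generations=100):
    """Rescale each strategy's share by its score relative to the population mean."""
    names = list(shares)
    score = {a: {b: match_score(STRATEGIES[a], STRATEGIES[b], rounds)
                 for b in names} for a in names}
    for _ in range(generations):
        fitness = {a: sum(shares[b] * score[a][b] for b in names) for a in names}
        mean = sum(shares[a] * fitness[a] for a in names)
        shares = {a: shares[a] * fitness[a] / mean for a in names}
    return shares


final = evolve({name: 1 / 3 for name in STRATEGIES})
for name, share in final.items():
    print(f"{name}: {share:.3f}")
```

In this assumed setup always defect gains ground at first by exploiting always cooperate, then collapses once its victims become scarce, leaving a population dominated by tit for tat with a remnant of unconditional cooperators. Different payoffs, match lengths, or starting mixes can give different results.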

Stochastic iterated prisoner's dilemma

In a stochastic iterated prisoner's dilemma game, strategies are specified by in terms of "cooperation probabilities". In an encounter between player ''X'' and player ''Y'', ''X''s strategy is specified by a set of probabilities ''P'' of cooperating with ''Y''. ''P'' is a function of the outcomes of their previous encounters or some subset thereof. If ''P'' is a function of only their most recent ''n'' encounters, it is called a "memory-n" strategy. A memory-1 strategy is then specified by four cooperation probabilities:

P=\

, where

P_

is the probability that ''X'' will cooperate in the present encounter given that the previous encounter was characterized by (ab). For example, if the previous encounter was one in which ''X'' cooperated and ''Y'' defected, then

P_

is the probability that ''X'' will cooperate in the present encounter. If each of the probabilities are either 1 or 0, the strategy is called deterministic. An example of a deterministic strategy is the tit-for-tat strategy written as ''P''=, in which ''X'' responds as ''Y'' did in the previous encounter. Another is the

win–stay, lose–switch In psychology, game theory, statistics, and machine learning, win–stay, lose–switch (also win–stay, lose–shift) is a heuristic learning strategy used to model learning in decision situations. It was first invented as an improvement over r ...

strategy written as ''P''=, in which ''X'' responds as in the previous encounter, if it was a "win" (i.e., cc or dc) but changes strategy if it was a loss (i.e., cd or dd). It has been shown that for any memory-n strategy there is a corresponding memory-1 strategy that gives the same statistical results, so that only memory-1 strategies need be considered. If we define ''P'' as the above 4-element strategy vector of ''X'' and

Q=\

as the 4-element strategy vector of ''Y'', a transition matrix ''M'' may be defined for ''X'' whose ''ij'' th entry is the probability that the outcome of a particular encounter between ''X'' and ''Y'' will be ''j'' given that the previous encounter was ''i'', where ''i'' and ''j'' are one of the four outcome indices: ''cc'', ''cd'', ''dc'', or ''dd''. For example, from ''X''s point of view, the probability that the outcome of the present encounter is ''cd'' given that the previous encounter was ''cd'' is equal to

M_{cd,cd}=P_{cd}(1-Q_{dc})

. (The indices for ''Q'' are from ''Y''s point of view: a ''cd'' outcome for ''X'' is a ''dc'' outcome for ''Y''.) Under these definitions, the iterated prisoner's dilemma qualifies as a

stochastic process

and ''M'' is a

stochastic matrix

, allowing all of the theory of stochastic processes to be applied. One result of stochastic theory is that there exists a stationary vector ''v'' for the matrix ''M'' such that

v\cdot M=v

. Without loss of generality, it may be specified that ''v'' is normalized so that the sum of its four components is unity. The ''ij''th entry in

M^n

will give the probability that the outcome of an encounter between ''X'' and ''Y'' will be ''j'' given that the encounter ''n'' steps previous was ''i''. In the limit as ''n'' approaches infinity, the powers of ''M'' will converge to a matrix with fixed values, giving the long-term probabilities of an encounter producing ''j'', which will be independent of ''i''. In other words, the rows of

M^\infty

will be identical, giving the long-term equilibrium result probabilities of the iterated prisoner's dilemma without the need to explicitly evaluate a large number of interactions. It can be seen that ''v'' is a stationary vector for

M^n

and particularly

M^\infty

, so that each row of

M^\infty

will be equal to ''v''. Thus the stationary vector specifies the equilibrium outcome probabilities for ''X''. Defining

S_x=\{R,S,T,P\}

and

S_y=\{R,T,S,P\}

as the short-term payoff vectors for the outcomes (from ''X''s point of view), the equilibrium payoffs for ''X'' and ''Y'' can now be specified as

s_x=v\cdot S_x

and

s_y=v\cdot S_y

, allowing the two strategies ''P'' and ''Q'' to be compared for their long-term payoffs.
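
The construction above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the payoff values T=5, R=3, P=1, S=0, the two example strategies, and the helper names `transition_matrix`, `stationary_vector` and `dot` are all assumptions introduced here, and the stationary vector is found by simple power iteration, which assumes the chain settles to a unique limit.

```python
# Outcome order follows the text: cc, cd, dc, dd (from X's point of view).
SWAP = [0, 2, 1, 3]  # X's outcome index -> the same outcome as seen by Y


def transition_matrix(P, Q):
    """Row i, column j: probability of outcome j given previous outcome i."""
    M = []
    for i in range(4):
        p = P[i]        # X cooperates, given X's view of the last round
        q = Q[SWAP[i]]  # Y cooperates, given Y's view of the last round
        M.append([p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)])
    return M


def stationary_vector(M, tol=1e-13, max_iter=100_000):
    """Power iteration v <- v.M; assumes the chain has a unique limit."""
    v = [0.25] * 4
    for _ in range(max_iter):
        nxt = [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Illustrative payoffs only (an assumption): T=5, R=3, P=1, S=0.
S_x = [3, 0, 5, 1]  # {R, S, T, P}
S_y = [3, 5, 0, 1]  # {R, T, S, P}

P = [0.9, 0.1, 0.9, 0.1]  # a noisy tit-for-tat for X
Q = [0.9, 0.1, 0.1, 0.9]  # a noisy win-stay, lose-switch for Y

M = transition_matrix(P, Q)
v = stationary_vector(M)
s_x, s_y = dot(v, S_x), dot(v, S_y)
print("stationary vector:", [round(x, 4) for x in v])
print("long-term payoffs:", round(s_x, 4), round(s_y, 4))
```

Power iteration is used only because it mirrors the limit of the powers of ''M'' described in the text; solving the linear system for ''v'' directly would work equally well.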

Zero-determinant strategies

In 2012, William H. Press and

Freeman Dyson

published a new class of strategies for the stochastic iterated prisoner's dilemma called "zero-determinant" (ZD) strategies. The long term payoffs for encounters between ''X'' and ''Y'' can be expressed as the determinant of a matrix which is a function of the two strategies and the short term payoff vectors:

s_x=D(P,Q,S_x)

and

s_y=D(P,Q,S_y)

, which do not involve the stationary vector ''v''. Since the determinant function

D(P,Q,f)

is linear in ''f'', it follows that

\alpha s_x+\beta s_y+\gamma=D(P,Q,\alpha S_x+\beta S_y+\gamma U)

(where ''U''={1,1,1,1}). Any strategy for which

D(P,Q,\alpha S_x+\beta S_y+\gamma U)=0

is by definition a ZD strategy, and the long term payoffs obey the relation

\alpha s_x+\beta s_y+\gamma=0

. Tit-for-tat is a ZD strategy which is "fair" in the sense of not gaining advantage over the other player. However, the ZD space also contains strategies that, in the case of two players, can allow one player to unilaterally set the other player's score or alternatively, force an evolutionary player to achieve a payoff some percentage lower than his own. The extorted player could defect but would thereby hurt himself by getting a lower payoff. Thus, extortion solutions turn the iterated prisoner's dilemma into a sort of

ultimatum game

. Specifically, ''X'' is able to choose a strategy for which

D(P,Q,\beta S_y+\gamma U)=0

, unilaterally setting

s_y

to a specific value within a particular range of values, independent of ''Y''s strategy, offering an opportunity for ''X'' to "extort" player ''Y'' (and vice versa). (It turns out that if ''X'' tries to set

s_x

to a particular value, the range of possibilities is much smaller, only consisting of complete cooperation or complete defection.) An extension of the IPD is an evolutionary stochastic IPD, in which the relative abundance of particular strategies is allowed to change, with more successful strategies relatively increasing. This process may be accomplished by having less successful players imitate the more successful strategies, or by eliminating less successful players from the game, while multiplying the more successful ones. It has been shown that unfair ZD strategies are not evolutionarily stable. The key intuition is that an evolutionarily stable strategy must not only be able to invade another population (which extortionary ZD strategies can do) but must also perform well against other players of the same type (which extortionary ZD players do poorly because they reduce each other's surplus). Theory and simulations confirm that beyond a critical population size, ZD extortion loses out in evolutionary competition against more cooperative strategies, and as a result, the average payoff in the population increases when the population is larger. In addition, there are some cases in which extortioners may even catalyze cooperation by helping to break out of a face-off between uniform defectors and

win–stay, lose–switch

agents. While extortionary ZD strategies are not stable in large populations, another ZD class called "generous" strategies ''is'' both stable and robust. In fact, when the population is not too small, these strategies can supplant any other ZD strategy and even perform well against a broad array of generic strategies for iterated prisoner's dilemma, including win–stay, lose–switch. This was proven specifically for the donation game by Alexander Stewart and Joshua Plotkin in 2013. Generous strategies will cooperate with other cooperative players, and in the face of defection, the generous player loses more utility than its rival. Generous strategies are the intersection of ZD strategies and so-called "good" strategies, which were defined by Akin (2013) to be those for which the player responds to past mutual cooperation with future cooperation and splits expected payoffs equally if he receives at least the cooperative expected payoff. Among good strategies, the generous (ZD) subset performs well when the population is not too small. If the population is very small, defection strategies tend to dominate.
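
A numerical sketch can make the "unilaterally setting" claim concrete. It relies on Press and Dyson's construction, in which the determinant vanishes when ''X''s strategy vector minus (1,1,0,0) equals \beta S_y+\gamma U. The payoffs T=5, R=3, P=1, S=0, the particular values of \beta and \gamma, the random opponents, and the helper name `stationary_vector` are assumptions chosen for illustration.

```python
import random

SWAP = [0, 2, 1, 3]  # outcome order cc, cd, dc, dd; maps X's view to Y's view


def stationary_vector(P, Q, tol=1e-13, max_iter=200_000):
    """Stationary outcome distribution by power iteration (assumes ergodicity)."""
    M = []
    for i in range(4):
        p, q = P[i], Q[SWAP[i]]
        M.append([p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)])
    v = [0.25] * 4
    for _ in range(max_iter):
        nxt = [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    return v


# Illustrative payoffs (assumed): T=5, R=3, P=1, S=0.
S_x = [3, 0, 5, 1]
S_y = [3, 5, 0, 1]

# Choose P so that P - (1, 1, 0, 0) = beta*S_y + gamma*U with U = (1, 1, 1, 1).
beta, gamma = -1 / 6, 1 / 3
P = [1 + beta * S_y[0] + gamma,
     1 + beta * S_y[1] + gamma,
     beta * S_y[2] + gamma,
     beta * S_y[3] + gamma]  # works out to (5/6, 1/2, 1/3, 1/6)

rng = random.Random(0)
scores = []
for _ in range(5):
    Q = [rng.uniform(0.05, 0.95) for _ in range(4)]  # an arbitrary opponent
    v = stationary_vector(P, Q)
    s_x = sum(a * b for a, b in zip(v, S_x))
    s_y = sum(a * b for a, b in zip(v, S_y))
    scores.append((s_x, s_y))
    print(f"s_x = {s_x:.4f}   s_y = {s_y:.4f}")  # s_y stays at -gamma/beta
```

With these assumed numbers, ''Y''s long-term payoff is pinned at -\gamma/\beta = 2 whatever ''Y'' does, while ''X''s own payoff still varies with ''Y''s strategy.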

Continuous iterated prisoner's dilemma

Most work on the iterated prisoner's dilemma has focused on the discrete case, in which players either cooperate or defect, because this model is relatively simple to analyze. However, some researchers have looked at models of the continuous iterated prisoner's dilemma, in which players are able to make a variable contribution to the other player. Le and Boyd found that in such situations, cooperation is much harder to evolve than in the discrete iterated prisoner's dilemma. The basic intuition for this result is straightforward: in a continuous prisoner's dilemma, if a population starts off in a non-cooperative equilibrium, players who are only marginally more cooperative than non-cooperators get little benefit from assorting with one another. By contrast, in a discrete prisoner's dilemma, tit-for-tat cooperators get a big payoff boost from assorting with one another in a non-cooperative equilibrium, relative to non-cooperators. Since nature arguably offers more opportunities for variable cooperation than a strict dichotomy of cooperation or defection, the continuous prisoner's dilemma may help explain why real-life examples of tit-for-tat-like cooperation are extremely rare in nature (e.g., Hammerstein) even though tit-for-tat seems robust in theoretical models.

Emergence of stable strategies

Players often fail to coordinate on mutual cooperation and so get locked into the inferior yet stable strategy of defection. In this way, iterated rounds facilitate the evolution of stable strategies. Iterated rounds often produce novel strategies, which have implications for complex social interaction. One such strategy is win-stay, lose-shift: if you can get away with cheating, repeat that behavior; if you get caught, switch. This strategy outperforms a simple tit-for-tat strategy. The main problem with tit-for-tat is that it is vulnerable to signal error. The problem arises when one individual cheats in retaliation but the other interprets it as unprovoked cheating. As a result, the second individual now cheats, which starts a see-saw pattern of cheating in a chain reaction. Even without repeated games, strong

enlightened self-interest

can result in a stable and efficient outcome.
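
The effect of signal error on tit-for-tat and on win-stay, lose-shift can be illustrated with a small calculation. The sketch below assumes payoffs T=5, R=3, P=1, S=0, models the error as a hypothetical 1% chance that an intended move is flipped, and uses a made-up helper `long_run_payoff` that repeatedly applies the memory-1 transition probabilities.

```python
SWAP = [0, 2, 1, 3]  # outcome order cc, cd, dc, dd; X's view -> Y's view
S_X = [3, 0, 5, 1]   # assumed payoffs to X: R=3, S=0, T=5, P=1


def long_run_payoff(P, Q, steps=20_000):
    """X's expected per-round payoff after many rounds, starting from cc."""
    M = []
    for i in range(4):
        p, q = P[i], Q[SWAP[i]]
        M.append([p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)])
    v = [1.0, 0.0, 0.0, 0.0]  # both players start by cooperating
    for _ in range(steps):    # v <- v.M, repeated
        v = [sum(v[i] * M[i][j] for i in range(4)) for j in range(4)]
    return sum(a * b for a, b in zip(v, S_X))


eps = 0.01  # hypothetical probability that an intended move is flipped
noisy_tft = [1 - eps, eps, 1 - eps, eps]
noisy_wsls = [1 - eps, eps, eps, 1 - eps]

tft_payoff = long_run_payoff(noisy_tft, noisy_tft)
wsls_payoff = long_run_payoff(noisy_wsls, noisy_wsls)
print(f"tit-for-tat vs itself:          {tft_payoff:.3f}")
print(f"win-stay, lose-shift vs itself: {wsls_payoff:.3f}")
```

Under these assumptions, two tit-for-tat players drift toward 2.25, the average of the four payoffs, because a single error echoes back and forth, whereas two win-stay, lose-shift players recover after an error and stay close to the mutual-cooperation payoff of 3.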

Real-life examples

The prisoner setting may seem contrived, but there are in fact many examples in human interaction as well as interactions in nature that have the same payoff matrix. The prisoner's dilemma is therefore of interest to the

social sciences such as

economics,

politics

, and

sociology

, as well as to the biological sciences such as

ethology

and

evolutionary biology

. Many natural processes have been abstracted into models in which living beings are engaged in endless games of prisoner's dilemma. This wide applicability of the PD gives the game its substantial importance.

Environmental studies

In environmental studies

, the PD is evident in crises such as global

climate change

. It is argued that all countries will benefit from a stable climate, but any single country is often hesitant to curb emissions. The immediate benefit to any one country from maintaining current behavior is perceived to be greater than the purported eventual benefit to that country if all countries' behavior were changed, which explains the impasse concerning climate change in 2007. An important difference between climate-change politics and the prisoner's dilemma is uncertainty; the extent and pace at which pollution can change climate is not known. The dilemma faced by governments is therefore different from the prisoner's dilemma in that the payoffs of cooperation are unknown. This difference suggests that states will cooperate much less than in a real iterated prisoner's dilemma, so that the probability of avoiding a possible climate catastrophe is much smaller than that suggested by a game-theoretical analysis of the situation using a real iterated prisoner's dilemma. Osang and Nandy (2003) provide a theoretical explanation with proofs for a regulation-driven win-win situation along the lines of

Michael Porter

's hypothesis, in which government regulation of competing firms is substantial.

Animals

Cooperative behavior of many animals can be understood as an example of the prisoner's dilemma. Often animals engage in long-term partnerships, which can be more specifically modeled as iterated prisoner's dilemma. For example,

guppies

inspect predators cooperatively in groups, and they are thought to punish non-cooperative inspectors.

Vampire bats

are social animals that engage in reciprocal food exchange. Applying the payoffs from the prisoner's dilemma can help explain this behavior:
* Cooperate/Cooperate: "Reward: I get blood on my unlucky nights, which saves me from starving. I have to give blood on my lucky nights, which doesn't cost me too much."
* Defect/Cooperate: "Temptation: You save my life on my poor night. But then I get the added benefit of not having to pay the slight cost of feeding you on my good night."
* Cooperate/Defect: "Sucker's Payoff: I pay the cost of saving your life on my good night. But on my bad night you don't feed me and I run a real risk of starving to death."
* Defect/Defect: "Punishment: I don't have to pay the slight costs of feeding you on my good nights. But I run a real risk of starving on my poor nights."

Psychology

In addiction

research and behavioral economics, George Ainslie points out that addiction can be cast as an intertemporal PD problem between the present and future selves of the addict. In this case, ''defecting'' means ''relapsing'', and it is easy to see that not defecting both today and in the future is by far the best outcome. The case where one abstains today but relapses in the future is the worst outcome – in some sense the discipline and self-sacrifice involved in abstaining today have been "wasted" because the future relapse means that the addict is right back where they started and will have to start over (which is quite demoralizing, and makes starting over more difficult). Relapsing today and tomorrow is a slightly "better" outcome, because while the addict is still addicted, they haven't put the effort into trying to stop. The final case, where one engages in the addictive behavior today while abstaining "tomorrow", will be familiar to anyone who has struggled with an addiction. The problem here is that (as in other PDs) there is an obvious benefit to defecting "today", but tomorrow one will face the same PD, and the same obvious benefit will be present then, ultimately leading to an endless string of defections.

John Gottman

, in his research described in "The Science of Trust", defines good relationships as those where partners know not to enter the (D,D) cell or at least not to get dynamically stuck there in a loop. In

cognitive neuroscience

, fast brain signaling associated with processing different rounds may indicate choices at the next round. Mutual cooperation outcomes entail brain activity changes predictive of how quickly a person will cooperate in kind at the next opportunity; this activity may be linked to basic homeostatic and motivational processes, possibly increasing the likelihood to short-cut into the (C,C) cell of the game.

Economics

The prisoner's dilemma has been called the ''E. coli'' of social psychology, and it has been used widely to research various topics such as

oligopolistic

competition and collective action to produce a collective good. Advertising is sometimes cited as a real-life example of the prisoner's dilemma. When

cigarette advertising

was legal in the United States, competing cigarette manufacturers had to decide how much money to spend on advertising. The effectiveness of Firm A's advertising was partially determined by the advertising conducted by Firm B. Likewise, the profit derived from advertising for Firm B is affected by the advertising conducted by Firm A. If both Firm A and Firm B chose to advertise during a given period, then the advertisement from each firm negates the other's, receipts remain constant, and expenses increase due to the cost of advertising. Both firms would benefit from a reduction in advertising. However, should Firm B choose not to advertise, Firm A could benefit greatly by advertising. Nevertheless, the optimal amount of advertising by one firm depends on how much advertising the other undertakes. As the best strategy is dependent on what the other firm chooses there is no dominant strategy, which makes it slightly different from a prisoner's dilemma. The outcome is similar, though, in that both firms would be better off were they to advertise less than in the equilibrium. Sometimes cooperative behaviors do emerge in business situations. For instance, cigarette manufacturers endorsed the making of laws banning cigarette advertising, understanding that this would reduce costs and increase profits across the industry. This analysis is likely to be pertinent in many other business situations involving advertising. Without enforceable agreements, members of a

cartel

are also involved in a (multi-player) prisoner's dilemma. 'Cooperating' typically means keeping prices at a pre-agreed minimum level. 'Defecting' means selling under this minimum level, instantly taking business (and profits) from other cartel members. Anti-trust authorities want potential cartel members to mutually defect, ensuring the lowest possible prices for

consumers.

Sport

Doping in sport

has been cited as an example of a prisoner's dilemma. Two competing athletes have the option to use an illegal and/or dangerous drug to boost their performance. If neither athlete takes the drug, then neither gains an advantage. If only one does, then that athlete gains a significant advantage over their competitor, reduced by the legal and/or medical dangers of having taken the drug. If both athletes take the drug, however, the benefits cancel out and only the dangers remain, putting them both in a worse position than if neither had used doping. In a conversation with Ken Griffey Jr. after the 1998

MLB

season,

Barry Bonds

expressed his frustration with other players' use of steroids. Bonds stated "I had a helluva season last year, and nobody gave a crap. Nobody. As much as I've complained about McGwire and Canseco and all of the bull with steroids, I'm tired of fighting it. I turn 35 this year. I've got three or four good seasons left, and I wanna get paid. I'm just gonna start using some hard-core stuff, and hopefully it won't hurt my body. Then I'll get out of the game and be done with it." Bonds found himself in the prisoner's dilemma that is doping in baseball, the feeling that he has to use steroids so that his competitors don't have such a significant advantage over him, putting him on an even playing field, though everyone is worse off than if no one had used steroids at all.

International politics

In international political theory, the Prisoner's Dilemma is often used to demonstrate the coherence of strategic realism, which holds that in international relations, all states (regardless of their internal policies or professed ideology), will act in their rational self-interest given international anarchy. A classic example is an arms race like the

Cold War

and similar conflicts. During the Cold War the opposing alliances of

NATO

and the

Warsaw Pact

both had the choice to arm or disarm. From each side's point of view, disarming whilst their opponent continued to arm would have led to military inferiority and possible annihilation. Conversely, arming whilst their opponent disarmed would have led to superiority. If both sides chose to arm, neither could afford to attack the other, but both incurred the high cost of developing and maintaining a nuclear arsenal. If both sides chose to disarm, war would be avoided and there would be no costs. Although the 'best' overall outcome is for both sides to disarm, the rational course for both sides is to arm, and this is indeed what happened. Both sides poured enormous resources into military research and armament in a

war of attrition

for the next thirty years until the Soviet Union could not withstand the economic cost. The same logic could be applied in any similar scenario, be it economic or technological competition between sovereign states.
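
The arms-race reasoning can be checked mechanically. The ordinal payoffs below are assumptions that only encode the ranking described in the text (superiority, then mutual disarmament, then a costly stalemate, then inferiority), and `strictly_dominant` is a hypothetical helper.

```python
# Ordinal payoffs (assumed for illustration; higher is better):
# 4 = superiority, 3 = mutual disarmament, 2 = costly stalemate, 1 = inferiority
payoff = {
    ("arm", "arm"): 2,
    ("arm", "disarm"): 4,
    ("disarm", "arm"): 1,
    ("disarm", "disarm"): 3,
}  # keyed by (own move, opponent's move)

MOVES = ("arm", "disarm")


def strictly_dominant(move, other):
    """True if `move` beats `other` whatever the opponent chooses."""
    return all(payoff[(move, opp)] > payoff[(other, opp)] for opp in MOVES)


arm_dominates = strictly_dominant("arm", "disarm")
mutual_disarm_better = payoff[("disarm", "disarm")] > payoff[("arm", "arm")]
print(arm_dominates, mutual_disarm_better)  # True True
```

Arming is the better reply to either choice by the opponent, yet both sides rank mutual disarmament above mutual armament, which is the defining tension of the prisoner's dilemma.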

Multiplayer dilemmas

Many real-life dilemmas involve multiple players. Although metaphorical, Hardin's

tragedy of the commons

may be viewed as an example of a multi-player generalization of the PD: Each villager makes a choice for personal gain or restraint. The collective reward for unanimous (or even frequent) defection is very low payoffs (representing the destruction of the "commons"). A commons dilemma most people can relate to is washing the dishes in a shared house. By not washing dishes an individual can gain by saving his time, but if that behavior is adopted by every resident, the collective cost is no clean plates for anyone. The commons are not always exploited:

William Poundstone, in a book about the prisoner's dilemma, describes a situation in New Zealand where newspaper boxes are left unlocked. It is possible for people to take a paper without paying (''defecting''), but very few do, feeling that if they do not pay then neither will others, destroying the system. Subsequent research by

Elinor Ostrom, winner of the 2009

Nobel Memorial Prize in Economic Sciences

, hypothesized that the tragedy of the commons is oversimplified, with the negative outcome influenced by outside influences. Without complicating pressures, groups communicate and manage the commons among themselves for their mutual benefit, enforcing social norms to preserve the resource and achieve the maximum good for the group, an example of effecting the best case outcome for PD.

Related games

Closed-bag exchange

Prisoner's Dilemma briefcase exchange (colorized)

Douglas Hofstadter once suggested that people often find problems such as the PD problem easier to understand when it is illustrated in the form of a simple game, or trade-off. One of several examples he used was "closed bag exchange":

''Friend or Foe?''

''Friend or Foe?'' is a game show that aired from 2002 to 2003 on the

Game Show Network

in the US. It is an example of the prisoner's dilemma game tested on real people, but in an artificial setting. On the game show, three pairs of people compete. When a pair is eliminated, they play a game similar to the prisoner's dilemma to determine how the winnings are split. If they both cooperate (Friend), they share the winnings 50–50. If one cooperates and the other defects (Foe), the defector gets all the winnings and the cooperator gets nothing. If both defect, both leave with nothing. Notice that the reward matrix is slightly different from the standard one given above, as the rewards for the "both defect" and the "cooperate while the opponent defects" cases are identical. This makes the "both defect" case a weak equilibrium, compared with being a strict equilibrium in the standard prisoner's dilemma. If a contestant knows that their opponent is going to vote "Foe", then their own choice does not affect their own winnings. In a specific sense, ''Friend or Foe'' has a rewards model between the prisoner's dilemma and the

game of Chicken

. Expressed as shares of the winnings (own share, opponent's share), the rewards matrix is (50%, 50%) for Friend/Friend, (0%, 100%) for Friend/Foe, (100%, 0%) for Foe/Friend, and (0%, 0%) for Foe/Foe. This payoff matrix has also been used on the

British television programmes ''Trust Me'', ''Shafted'', ''The Bank Job'' and ''Golden Balls'', and on the American

game show ''Take It All'', as well as for the winning couple on the reality shows ''Bachelor Pad'' and ''Love Island''. Game data from the ''Golden Balls'' series has been analyzed by a team of economists, who found that cooperation was "surprisingly high" for amounts of money that would seem consequential in the real world but were comparatively low in the context of the game.
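
The difference between a weak and a strict equilibrium can be made explicit with a short check of pure-strategy equilibria. This is a sketch under stated assumptions: the ''Friend or Foe?'' payoffs are written in units of half the pot on the assumption that a lone Foe takes everything, the standard prisoner's dilemma uses illustrative values T=5, R=3, P=1, S=0, and `pure_equilibria` is a hypothetical helper.

```python
def pure_equilibria(payoffs, moves):
    """Map each pure Nash equilibrium (row move, column move) to 'strict' or 'weak'.

    payoffs[(r, c)] = (row player's payoff, column player's payoff).
    """
    found = {}
    for r in moves:
        for c in moves:
            u_row, u_col = payoffs[(r, c)]
            # Loss each player would take by unilaterally switching moves.
            losses = [u_row - payoffs[(r2, c)][0] for r2 in moves if r2 != r]
            losses += [u_col - payoffs[(r, c2)][1] for c2 in moves if c2 != c]
            if min(losses) > 0:
                found[(r, c)] = "strict"  # every deviation strictly hurts
            elif min(losses) >= 0:
                found[(r, c)] = "weak"    # some deviation costs nothing
    return found


# Assumed payoffs, in units of half the pot.
friend_or_foe = {
    ("Friend", "Friend"): (1, 1), ("Friend", "Foe"): (0, 2),
    ("Foe", "Friend"): (2, 0), ("Foe", "Foe"): (0, 0),
}
# Illustrative standard prisoner's dilemma (T=5, R=3, P=1, S=0).
standard_pd = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

fof_eq = pure_equilibria(friend_or_foe, ("Friend", "Foe"))
pd_eq = pure_equilibria(standard_pd, ("C", "D"))
print(fof_eq)
print(pd_eq)
```

With these numbers, mutual defection is the only equilibrium of the standard game and it is strict, whereas in ''Friend or Foe?'' it is only weak; the same check also flags the two outcomes with a single Foe as weak equilibria, since the exploited Friend gains nothing by switching.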

Iterated snowdrift

Researchers from the

University of Lausanne

and the

University of Edinburgh

have suggested that the "Iterated Snowdrift Game" may more closely reflect real-world social situations. Although this model is actually a chicken game, it will be described here. In this model, the risk of being exploited through defection is lower, and individuals always gain from taking the cooperative choice. The snowdrift game imagines two drivers who are stuck on opposite sides of a

snowdrift A snowdrift is a deposit of snow sculpted by wind into a mound during a snowstorm. Snowdrifts resemble sand dunes and are formed in a similar manner, namely, by wind moving light snow and depositing it when the wind has virtually stopped, u ...

, each of whom is given the option of shoveling snow to clear a path, or remaining in their car. A player's highest payoff comes from leaving the opponent to clear all the snow by themselves, but the opponent is still nominally rewarded for their work. This may better reflect real-world scenarios, the researchers giving the example of two scientists collaborating on a report, both of whom would benefit if the other worked harder. "But when your collaborator doesn't do any work, it's probably better for you to do all the work yourself. You'll still end up with a completed project."
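The difference from the prisoner's dilemma can be made concrete with a small sketch. The parameterization below (a benefit `b` from a cleared road and a total shoveling cost `c`, split when both shovel) and the prisoner's dilemma values 5, 3, 1, 0 are illustrative assumptions, not figures taken from the researchers' work. The sketch only shows that, against a defector, shoveling is still the better reply in the snowdrift game, whereas defecting is the better reply in the prisoner's dilemma.

```python
# Illustrative sketch; all payoff numbers are assumptions chosen for the example.
# Moves: "C" = cooperate (shovel), "D" = defect (stay in the car).
# Each entry maps (my_move, opponent_move) to (my_payoff, opponent_payoff).

def snowdrift_payoffs(b=4.0, c=2.0):
    """Snowdrift payoffs with benefit b of a cleared road and total cost c."""
    if not b > c > 0:
        raise ValueError("the snowdrift game assumes b > c > 0")
    return {
        ("C", "C"): (b - c / 2, b - c / 2),  # both shovel and share the cost
        ("C", "D"): (b - c, b),              # I shovel alone, opponent free-rides
        ("D", "C"): (b, b - c),              # I free-ride on the opponent
        ("D", "D"): (0.0, 0.0),              # nobody shovels, both stay stuck
    }

# Assumed prisoner's dilemma payoffs for contrast.
PRISONERS_DILEMMA = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def best_response(payoffs, opponent_move):
    """Return the move that maximizes my payoff against a fixed opponent move."""
    return max(("C", "D"), key=lambda move: payoffs[(move, opponent_move)][0])

if __name__ == "__main__":
    snow = snowdrift_payoffs()
    for opponent in ("C", "D"):
        print("snowdrift, opponent plays", opponent, "->", best_response(snow, opponent))
        print("dilemma,   opponent plays", opponent, "->",
              best_response(PRISONERS_DILEMMA, opponent))
```

Under these assumed numbers the best reply in the snowdrift game is always the opposite of the opponent's move, which is the structure of a chicken game.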

Coordination games

In coordination games, players must coordinate their strategies for a good outcome. An example is two cars that abruptly meet in a blizzard; each must choose whether to swerve left or right. If both swerve left, or both right, the cars do not collide. The local left- and right-hand traffic convention helps to coordinate their actions. Symmetrical coordination games include stag hunt and Bach or Stravinsky.
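A coordination game typically has more than one stable outcome, which is why a convention is needed to pick between them. The following sketch enumerates pure-strategy Nash equilibria for two small games. The payoff numbers for the swerving example and for the stag hunt are assumptions chosen only to respect the ordering described in the text.

```python
from itertools import product

def pure_nash_equilibria(payoffs, row_moves, col_moves):
    """List move pairs from which neither player gains by deviating alone.

    payoffs maps (row_move, col_move) to (row_payoff, col_payoff).
    """
    equilibria = []
    for row, col in product(row_moves, col_moves):
        row_payoff, col_payoff = payoffs[(row, col)]
        # The row player cannot do better by switching while col stays fixed.
        row_ok = all(payoffs[(alt, col)][0] <= row_payoff for alt in row_moves)
        # The column player cannot do better by switching while row stays fixed.
        col_ok = all(payoffs[(row, alt)][1] <= col_payoff for alt in col_moves)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria

# Assumed payoffs: 1 for passing safely, -10 for a collision.
SWERVE = {
    ("left", "left"): (1, 1),
    ("left", "right"): (-10, -10),
    ("right", "left"): (-10, -10),
    ("right", "right"): (1, 1),
}

# Assumed stag hunt payoffs: the stag needs both hunters, the hare is safe.
STAG_HUNT = {
    ("stag", "stag"): (4, 4),
    ("stag", "hare"): (0, 3),
    ("hare", "stag"): (3, 0),
    ("hare", "hare"): (3, 3),
}

if __name__ == "__main__":
    print(pure_nash_equilibria(SWERVE, ["left", "right"], ["left", "right"]))
    print(pure_nash_equilibria(STAG_HUNT, ["stag", "hare"], ["stag", "hare"]))
```

Both games have two pure equilibria under these assumed payoffs; the code does not say which one players will reach, which is the coordination problem itself.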

Asymmetric prisoner's dilemmas

A more general set of games is asymmetric. As in the prisoner's dilemma, the best outcome is cooperation, and there are motives for defection. Unlike the symmetric prisoner's dilemma, though, one player has more to lose and/or more to gain than the other. Some such games have been described as a prisoner's dilemma in which one prisoner has an alibi, whence the term "alibi game". In experiments, players getting unequal payoffs in repeated games may seek to maximize profits, but only under the condition that both players receive equal payoffs; this may lead to a stable equilibrium strategy in which the disadvantaged player defects every X games, while the other always cooperates. Such behaviour may depend on the experiment's social norms around fairness.
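The "defects every X games" pattern can be illustrated with simple arithmetic. In the sketch below, every payoff value is hypothetical and was picked so that an integer X exists. It is not drawn from any particular experiment. The advantaged player always cooperates, the disadvantaged player defects once per cycle of `x` rounds, and the code searches for the cycle length at which both totals match.

```python
# Hypothetical per-round payoffs for an asymmetric repeated game.
R_ADV = 5  # advantaged player's payoff under mutual cooperation
R_DIS = 3  # disadvantaged player's payoff under mutual cooperation
T_DIS = 9  # disadvantaged player's payoff when defecting against a cooperator
S_ADV = 1  # advantaged player's payoff in the round where they are exploited

def cycle_totals(x):
    """Totals over one cycle of x rounds with a single defection.

    The advantaged player cooperates in every round; the disadvantaged
    player cooperates in x - 1 rounds and defects in one.
    Returns (advantaged_total, disadvantaged_total).
    """
    if x < 1:
        raise ValueError("a cycle needs at least one round")
    advantaged = (x - 1) * R_ADV + S_ADV
    disadvantaged = (x - 1) * R_DIS + T_DIS
    return advantaged, disadvantaged

def equalizing_period(max_period=100):
    """Return the smallest cycle length with equal totals, or None."""
    for x in range(1, max_period + 1):
        advantaged, disadvantaged = cycle_totals(x)
        if advantaged == disadvantaged:
            return x
    return None

if __name__ == "__main__":
    x = equalizing_period()
    print("equalizing cycle length:", x)
    print("totals per cycle:", cycle_totals(x))
```

With other payoff values an exact integer cycle length may not exist, in which case `equalizing_period` returns `None`.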

Guardian's Dilemma

It is not only prisoners who face dilemmas. Guardians also confront situations in which there are only unattractive options to choose from. Examples can easily be found in cases where one agent must smooth tensions between its own partners: two colleagues jockeying for career advancement and the trouble this causes their company's managing director; two officials competing for promotion and the tension this causes for the head of their bureau; or, in parenting, two siblings vying for attention and the anxiety this causes their parents. If the behaviour of the guardian satisfies one side, the other side feels exposed and alienated.

From an international relations perspective, Dr Spyros Katsoulas introduces the concept of the guardian's dilemma. The guardian's dilemma is defined as the condition in which two states maintain their enmity towards one another despite sharing a stronger common ally. By default, a dilemma is a situation with unsatisfactory choices. The guardian's dilemma lies in the fact that the stronger state can neither stay out of a crisis between its allies nor get actively involved without affecting the fragile equilibrium. If the guardian abstains, the situation may spin out of control; if the guardian gets involved, any tilt against one side may be seen as a victory or a window of opportunity for the other. Expanding on Glenn Snyder's concept of the alliance security dilemma (Glenn H. Snyder, "The Security Dilemma in Alliance Politics", ''World Politics'', Volume 36, Issue 4, July 1984, pp. 461–495, https://doi.org/10.2307/2010183), the outcomes of the interaction between the guardian and the two smaller partners are described as abandonment, entrapment, and emboldening.

Software

Several software packages have been created to run prisoner's dilemma simulations and tournaments, some of which have available source code.
* The source code for the second tournament run by Robert Axelrod (written by Axelrod and many contributors in Fortran) is available online
* A library written in Java, last updated in 1998
* Axelrod-Python, written in Python
* Evoplex, a fast agent-based modeling program released in 2018 by Marcos Cardinot
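As a rough idea of what such tournament software does, here is a minimal round-robin sketch using only the Python standard library. It does not use or reproduce the API of any of the packages listed above. The payoff values (5, 3, 1, 0), the three strategies and all function names are assumptions made for this illustration.

```python
from itertools import combinations

C, D = "C", "D"

# Assumed payoffs: maps (move_1, move_2) to (payoff_1, payoff_2).
PAYOFF = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def tit_for_tat(own_history, other_history):
    """Cooperate first, then copy the opponent's previous move."""
    return other_history[-1] if other_history else C

def always_defect(own_history, other_history):
    return D

def always_cooperate(own_history, other_history):
    return C

def play_match(strategy_1, strategy_2, rounds=10):
    """Play a fixed number of rounds and return both total scores."""
    history_1, history_2 = [], []
    total_1 = total_2 = 0
    for _ in range(rounds):
        move_1 = strategy_1(history_1, history_2)
        move_2 = strategy_2(history_2, history_1)
        payoff_1, payoff_2 = PAYOFF[(move_1, move_2)]
        history_1.append(move_1)
        history_2.append(move_2)
        total_1 += payoff_1
        total_2 += payoff_2
    return total_1, total_2

def round_robin(strategies, rounds=10):
    """Pair every strategy with every other once and sum the scores."""
    totals = {name: 0 for name in strategies}
    for name_a, name_b in combinations(strategies, 2):
        score_a, score_b = play_match(strategies[name_a], strategies[name_b], rounds)
        totals[name_a] += score_a
        totals[name_b] += score_b
    return totals

if __name__ == "__main__":
    entrants = {
        "tit_for_tat": tit_for_tat,
        "always_defect": always_defect,
        "always_cooperate": always_cooperate,
    }
    print(round_robin(entrants))
```

In a field this small, with an unconditional cooperator to exploit, unconditional defection scores highest; results depend strongly on which strategies are entered, so this toy run should not be read as a general ranking.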

In fiction

Hannu Rajaniemi set the opening scene of his ''The Quantum Thief'' trilogy in a "dilemma prison". The main theme of the series has been described as the "inadequacy of a binary universe" and the ultimate antagonist is a character called the All-Defector. Rajaniemi is particularly interesting as an artist treating this subject in that he is a Cambridge-trained mathematician and holds a Ph.D. in mathematical physics; the interchangeability of matter and information is a major feature of the books, which take place in a "post-singularity" future. The first book in the series was published in 2010, with the two sequels, ''The Fractal Prince'' and ''The Causal Angel'', published in 2012 and 2014, respectively.

A game modeled after the (iterated) prisoner's dilemma is a central focus of the 2012 video game ''Zero Escape: Virtue's Last Reward'' and a minor part of its 2016 sequel ''Zero Escape: Zero Time Dilemma''.

In ''The Mysterious Benedict Society and the Prisoner's Dilemma'' by Trenton Lee Stewart, the main characters start by playing a version of the game and escaping from the "prison" altogether. Later they become actual prisoners and escape once again.

In ''The Adventure Zone: Balance'', during ''The Suffering Game'' subarc, the player characters are twice presented with the prisoner's dilemma during their time in two liches' domain, once cooperating and once defecting.

In ''Tiamat's Wrath'', the eighth novel by James S. A. Corey, Winston Duarte explains the prisoner's dilemma to his 14-year-old daughter, Teresa, to train her in strategic thinking.

An extreme version of the prisoner's dilemma is featured in the 2008 film ''The Dark Knight'', in which the Joker rigs two ferries, one containing prisoners and the other containing civilians, arming both groups with the means to detonate the bomb on the other's ferry. Ultimately, the two sides decide not to act.

External links

* Prisoner's Dilemma (''Stanford Encyclopedia of Philosophy'')
* The Prisoner's Dilemma in ornithology, a mathematical cartoon by Larry Gonick
* The Prisoner's Dilemma with Lego minifigures
* Dawkins: Nice Guys Finish First
* Axelrod Iterated Prisoner's Dilemma library
* Play Prisoner's Dilemma on ''oTree'' (N/A 11-5-17)
* Nicky Case's Evolution of Trust, an example of the donation game
* Iterated Prisoner's Dilemma online game by Wayne Davis