In game theory, a stochastic game (or Markov game) is a repeated game with probabilistic transitions played by one or more players. The game is played in a sequence of stages. At the beginning of each stage the game is in some state. The players select actions and each player receives a payoff that depends on the current state and the chosen actions. The game then moves to a new random state whose distribution depends on the previous state and the actions chosen by the players. The procedure is repeated at the new state, and play continues for a finite or infinite number of stages. The total payoff to a player is often taken to be the discounted sum of the stage payoffs or the limit inferior of the averages of the stage payoffs. Stochastic games were introduced by Lloyd Shapley in the early 1950s. They generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic situations in which the environment changes in response to the players' choices.
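The stage-by-stage dynamics described above can be sketched in a few lines of code. The following is a minimal illustration, not taken from the literature: a two-state, two-action game with invented payoffs and transition probabilities, in which both players choose actions uniformly at random and the normalized discounted sum \lambda \sum_t (1-\lambda)^{t-1} g_t is accumulated along one play.

```python
import random

# Purely illustrative two-state, two-action stochastic game.
STATES = ("s0", "s1")
ACTIONS = ("a", "b")

def transition(rng, state, profile):
    """Sample the next state from the (made-up) kernel P(. | s, a)."""
    p_s0 = 0.9 if (state, profile) == ("s0", ("a", "a")) else 0.3
    return "s0" if rng.random() < p_s0 else "s1"

def payoff(state, profile):
    """Stage payoff g(s, a) to player 1 (player 2 gets -g in the zero-sum case)."""
    return 1.0 if state == "s0" and profile == ("a", "a") else 0.0

def discounted_payoff(horizon=2000, lam=0.1, seed=0):
    """Approximate lam * sum_{t>=1} (1-lam)^(t-1) * g_t along one play,
    with both players choosing actions uniformly at random."""
    rng = random.Random(seed)
    state, total = "s0", 0.0
    for t in range(1, horizon + 1):
        profile = (rng.choice(ACTIONS), rng.choice(ACTIONS))
        total += lam * (1 - lam) ** (t - 1) * payoff(state, profile)
        state = transition(rng, state, profile)
    return total

value = discounted_payoff()
print(round(value, 4))
```

Because the stage payoffs lie in [0, 1] and the discount weights sum to at most 1, the resulting total also lies in [0, 1].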


Two-player games

Stochastic two-player games on directed graphs are widely used for the modeling and analysis of discrete systems operating in an unknown (adversarial) environment. Possible configurations of a system and its environment are represented as vertices, and the transitions correspond to actions of the system, its environment, or "nature". A run of the system then corresponds to an infinite path in the graph. Thus, a system and its environment can be seen as two players with antagonistic objectives, where one player (the system) aims at maximizing the probability of "good" runs, while the other player (the environment) aims at the opposite. In many cases, an equilibrium value of this probability exists, but optimal strategies for both players may not exist.
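For the special case of non-stochastic (deterministic) game graphs, the vertices from which the system can force a visit to a set of "good" target vertices can be computed by the classical attractor construction: a vertex joins the attractor if the system owns it and some successor is already in, or the environment owns it and all successors are in. The graph and ownership labeling below are invented for illustration.

```python
def attractor(vertices, edges, owner, target):
    """Vertices from which the 'sys' player forces reaching `target`.
    owner[v] is 'sys' or 'env'; edges[v] lists the successors of v."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices:
            if v in attr:
                continue
            succs = edges[v]
            if (owner[v] == "sys" and any(s in attr for s in succs)) or \
               (owner[v] == "env" and succs and all(s in attr for s in succs)):
                attr.add(v)
                changed = True
    return attr

# Tiny example: the system owns v0 and v2, the environment owns v1.
vertices = ["v0", "v1", "v2"]
edges = {"v0": ["v1", "v2"], "v1": ["v0"], "v2": ["v2"]}
owner = {"v0": "sys", "v1": "env", "v2": "sys"}
print(sorted(attractor(vertices, edges, owner, {"v2"})))  # → ['v0', 'v1', 'v2']
```

Here v0 joins the attractor because the system can move v0 → v2, and then v1 joins because its only move leads into the attractor.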


Theory

The ingredients of a stochastic game are: a finite set of players I; a state space S (either a finite set or a measurable space (S,\mathcal S)); for each player i\in I, an action set A^i (either a finite set or a measurable space (A^i,\mathcal A^i)); a transition probability P from S\times A, where A=\times_{i\in I}A^i is the set of action profiles, to S, where P(B \mid s, a) is the probability that the next state belongs to the set B\subseteq S given the current state s and the current action profile a; and a payoff function g from S\times A to \mathbb R^I, where the i-th coordinate g^i of g is the payoff to player i as a function of the state s and the action profile a. The game starts at some initial state s_1. At stage t, players first observe s_t, then simultaneously choose actions a^i_t\in A^i, then observe the action profile a_t=(a^i_t)_{i\in I}, and then nature selects s_{t+1} according to the probability P(\cdot\mid s_t,a_t). A play of the stochastic game, s_1,a_1,\ldots,s_t,a_t,\ldots, defines a stream of payoffs g_1,g_2,\ldots, where g_t=g(s_t,a_t). The discounted game \Gamma_\lambda with discount factor \lambda (0<\lambda \leq 1) is the game where the payoff to player i is \lambda \sum_{t=1}^{\infty}(1-\lambda)^{t-1}g^i_t. The n-stage game \Gamma_n is the game where the payoff to player i is \bar g^i_n:=\frac{1}{n}\sum_{t=1}^{n}g^i_t. The value v_n(s_1), respectively v_\lambda(s_1), of a two-person zero-sum stochastic game \Gamma_n, respectively \Gamma_\lambda, with finitely many states and actions exists, and
Truman Bewley
and Elon Kohlberg (1976) proved that v_n(s_1) converges to a limit as n goes to infinity and that v_\lambda(s_1) converges to the same limit as \lambda goes to 0. The "undiscounted" game \Gamma_\infty is the game where the payoff to player i is the "limit" of the averages of the stage payoffs. Some precautions are needed in defining the value of a two-person zero-sum \Gamma_\infty and in defining equilibrium payoffs of a non-zero-sum \Gamma_\infty. The uniform value v_\infty of a two-person zero-sum stochastic game \Gamma_\infty exists if for every \varepsilon>0 there is a positive integer N and a strategy pair \sigma_\varepsilon of player 1 and \tau_\varepsilon of player 2 such that for every \sigma and \tau and every n\geq N the expectation of \bar g^i_n with respect to the probability on plays defined by \sigma_\varepsilon and \tau is at least v_\infty-\varepsilon, and the expectation of \bar g^i_n with respect to the probability on plays defined by \sigma and \tau_\varepsilon is at most v_\infty+\varepsilon. Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a uniform value. If there is a finite number of players and the action sets and the set of states are finite, then a stochastic game with a finite number of stages always has a
Nash equilibrium
. The same is true for a game with infinitely many stages if the total payoff is the discounted sum. The non-zero-sum stochastic game \Gamma_\infty has a uniform equilibrium payoff v_\infty if for every \varepsilon>0 there is a positive integer N and a strategy profile \sigma such that for every unilateral deviation by a player i, i.e., a strategy profile \tau with \sigma^j=\tau^j for all j\neq i, and every n\geq N, the expectation of \bar g^i_n with respect to the probability on plays defined by \sigma is at least v^i_\infty-\varepsilon, and the expectation of \bar g^i_n with respect to the probability on plays defined by \tau is at most v^i_\infty+\varepsilon. Nicolas Vieille has shown that all two-person stochastic games with finite state and action spaces have a uniform equilibrium payoff. The non-zero-sum stochastic game \Gamma_\infty has a limiting-average equilibrium payoff v_\infty if for every \varepsilon>0 there is a strategy profile \sigma such that for every unilateral deviation by a player i, the expectation of the limit inferior of the averages of the stage payoffs with respect to the probability on plays defined by \sigma is at least v^i_\infty-\varepsilon, and the expectation of the limit superior of the averages of the stage payoffs with respect to the probability on plays defined by \tau is at most v^i_\infty+\varepsilon. Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a limiting-average value, and Nicolas Vieille has shown that all two-person stochastic games with finite state and action spaces have a limiting-average equilibrium payoff. In particular, these results imply that these games have a value and an approximate equilibrium payoff, called the liminf-average (respectively, the limsup-average) equilibrium payoff, when the total payoff is the limit inferior (respectively, the limit superior) of the averages of the stage payoffs.
Whether every stochastic game with finitely many players, states, and actions, has a uniform equilibrium payoff, or a limiting-average equilibrium payoff, or even a liminf-average equilibrium payoff, is a challenging open question. A Markov perfect equilibrium is a refinement of the concept of sub-game perfect Nash equilibrium to stochastic games. Stochastic games have been combined with
Bayesian games to model uncertainty over player strategies. The resulting stochastic Bayesian game model is solved via a recursive combination of the Bayesian Nash equilibrium equation and the Bellman optimality equation.
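For two-person zero-sum discounted games with finitely many states and actions, the value can be approximated by iterating the dynamic-programming operator going back to Shapley: in each state, replace the continuation values by the value of the induced one-shot matrix game. The sketch below uses the article's normalization v(s) = val[\lambda g(s,a,b) + (1-\lambda)\sum_{s'}P(s'|s,a,b)v(s')] on an invented two-state, two-action example; the 2x2 matrix-game value is computed in closed form.

```python
def matrix_value(m):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:                      # pure saddle point
        return maximin
    return (a * d - b * c) / (a + d - b - c)    # fully mixed value

# Invented example: g[s][i][j] is the stage payoff,
# P[s][i][j] the distribution of the next state.
g = [[[1.0, 0.0], [0.0, 0.5]],
     [[0.0, 0.0], [0.0, 0.0]]]
P = [[[[0.8, 0.2], [0.5, 0.5]], [[0.5, 0.5], [0.2, 0.8]]],
     [[[0.1, 0.9], [0.1, 0.9]], [[0.1, 0.9], [0.1, 0.9]]]]

def shapley_iteration(lam=0.2, iters=500):
    """Iterate v(s) = val[ lam*g(s,a,b) + (1-lam)*sum_s' P(s'|s,a,b)*v(s') ]."""
    v = [0.0, 0.0]
    for _ in range(iters):
        v = [matrix_value([[lam * g[s][i][j]
                            + (1 - lam) * sum(P[s][i][j][t] * v[t] for t in (0, 1))
                            for j in (0, 1)] for i in (0, 1)])
             for s in (0, 1)]
    return v

v = shapley_iteration()
print([round(x, 4) for x in v])
```

The iteration is a contraction with modulus 1-\lambda, so it converges geometrically; with stage payoffs in [0, 1] the iterates stay in [0, 1] as well.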


Stopping games

E. B. Dynkin presented the following problem in game theory related to the stopping of stochastic processes. Let \mathcal F_n, n=0,1,\ldots, be an increasing sequence of σ-algebras in some probability space (\Omega,\mathcal F,\mathsf P), where \mathcal F contains all the \mathcal F_n. Two players observe stochastic sequences \{X_n\}_{n=0}^{\infty} and \{\varPhi_n\}_{n=0}^{\infty}, i.e. functions measurable with respect to \{\mathcal F_n\}_{n=0}^{\infty}. The game can be stopped at time n by the first player if \varPhi_n\geq 0 and by the second player if \varPhi_n<0. If the game is stopped at time n, the first player receives X_n from the second player. Player 1 seeks to maximize the expected payoff, and player 2 seeks to minimize it. Let \tau be a Markov time (stopping time) with respect to the filtration \{\mathcal F_n\}_{n=0}^{\infty} and \chi_A be the characteristic function of the event A. Denote by \mathcal T the set of all stopping times with respect to the filtration \{\mathcal F_n\}_{n=0}^{\infty}, and let \Lambda=\{\lambda\in\mathcal T : \varPhi_\lambda\geq 0\} and \mathrm M=\{\mu\in\mathcal T : \varPhi_\mu<0\}. Let \lambda be the stopping time selected by the first player (\mu, respectively, by the second). The payoff, the payment of the second player to the first, is then defined as R(\lambda,\mu)=\mathsf E\big(X_\lambda\chi_{\{\lambda\leq\mu\}}+X_\mu\chi_{\{\mu<\lambda\}}\big)=\mathsf E X_{\lambda\wedge\mu}. Under the condition that \mathsf E(\sup_n|X_n|)<\infty, Dynkin proved that the value of the game v=\sup_{\lambda\in\Lambda}\inf_{\mu\in\mathrm M}R(\lambda,\mu) exists. He constructed ε-optimal strategies, and introduced several conditions for the existence of optimal strategies (for extensions see Neveu and Yasuda).
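Over a finite horizon, a Dynkin game can be solved by backward induction: at each time, the player who is allowed to stop compares the stopping payoff with the continuation value. The sketch below uses deterministic (fully observed) sequences standing in for X_n and \varPhi_n; all numbers are illustrative.

```python
def dynkin_value(X, Phi):
    """Value to player 1 of a finite-horizon Dynkin game: at time n,
    player 1 (maximizer) may stop if Phi[n] >= 0, player 2 (minimizer)
    may stop if Phi[n] < 0; stopping at n pays X[n] from player 2 to
    player 1, and the game must stop by the final time."""
    n = len(X) - 1
    v = X[n]                          # forced to stop at the horizon
    for k in range(n - 1, -1, -1):
        if Phi[k] >= 0:               # player 1 decides: stop or continue
            v = max(X[k], v)
        else:                         # player 2 decides: stop or continue
            v = min(X[k], v)
    return v

X   = [3, 5, 1, 4, 2]
Phi = [1, -1, 1, -1, 0]
print(dynkin_value(X, Phi))           # → 3
```

In this example, player 1 stops immediately at time 0: continuing would let player 2 later stop the game at a payoff of at most 2.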


Applications

Stochastic games have applications in economics, evolutionary biology and computer networks. They are generalizations of repeated games, which correspond to the special case where there is only one state.


See also

* Stochastic process


Notes


Further reading

* * * (suitable for undergraduates; main results, no proofs)


External links


Lecture on Stochastic Two-Player Games by Antonin Kucera