Stochastic control or stochastic optimal control is a subfield of control theory that deals with the existence of uncertainty either in observations or in the noise that drives the evolution of the system. The system designer assumes, in a Bayesian probability-driven fashion, that random noise with a known probability distribution affects the evolution and observation of the state variables. Stochastic control aims to design the time path of the controlled variables that performs the desired control task with minimum cost, somehow defined, despite the presence of this noise. The context may be either discrete time or continuous time.

Certainty equivalence

An extremely well-studied formulation in stochastic control is that of linear quadratic Gaussian control. Here the model is linear, the objective function is the expected value of a quadratic form, and the disturbances are purely additive. A basic result for discrete-time centralized systems with only additive uncertainty is the certainty equivalence property: the optimal control solution in this case is the same as would be obtained in the absence of the additive disturbances. This property applies to all centralized systems with linear equations of evolution, a quadratic cost function, and noise entering the model only additively; the quadratic assumption allows the optimal control laws, which follow the certainty-equivalence property, to be linear functions of the observations of the controllers.

Any deviation from the above assumptions (a nonlinear state equation, a non-quadratic objective function, noise in the multiplicative parameters of the model, or decentralization of control) causes the certainty equivalence property not to hold. For example, its failure to hold for decentralized control was demonstrated in Witsenhausen's counterexample.
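The certainty equivalence property can be checked numerically in the simplest possible setting. The following is a minimal sketch, not a general proof: it assumes a scalar one-period problem with state equation x' = a·x + b·u + w and cost q·x'² + r·u², and all parameter values, function names, and the grid search itself are illustrative choices rather than anything taken from the text. The additive shocks are centred so that their sample mean is zero.

```python
import random

def expected_cost(u, x, a, b, q, r, shocks):
    """Sample average of q*(a*x + b*u + w)**2 + r*u**2 over additive shocks w."""
    total = sum(q * (a * x + b * u + w) ** 2 + r * u ** 2 for w in shocks)
    return total / len(shocks)

def best_control(x, a, b, q, r, shocks, grid):
    """Grid search for the control that minimizes the sample-average cost."""
    return min(grid, key=lambda u: expected_cost(u, x, a, b, q, r, shocks))

# Hypothetical parameter values chosen only for illustration.
x, a, b, q, r = 1.5, 0.9, 0.5, 1.0, 0.2

random.seed(0)
raw = [random.gauss(0.0, 2.0) for _ in range(500)]
mean = sum(raw) / len(raw)
shocks = [w - mean for w in raw]           # centre so the sample mean is zero

grid = [i / 100 - 3 for i in range(601)]   # candidate controls in [-3, 3], step 0.01

u_noisy = best_control(x, a, b, q, r, shocks, grid)    # with additive noise
u_certain = best_control(x, a, b, q, r, [0.0], grid)   # noise switched off
u_formula = -q * a * b * x / (q * b * b + r)           # deterministic optimum

print(u_noisy, u_certain, u_formula)
```

In this sketch the additive noise raises the attainable cost but leaves the minimizing control unchanged, which is what the property asserts for the linear, quadratic, additive-noise case.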

Discrete time

In a discrete-time context, the decision-maker observes the state variable, possibly with observational noise, in each time period. The objective may be to optimize the sum of expected values of a nonlinear (possibly quadratic) objective function over all the time periods from the present to the final period of concern, or to optimize the value of the objective function as of the final period only. At each time period new observations are made, and the control variables are to be adjusted optimally. Finding the optimal solution for the present time may involve iterating a matrix Riccati equation backwards in time from the last period to the present period.

In the discrete-time case with uncertainty about the parameter values in the transition matrix (giving the effect of current values of the state variables on their own evolution) and/or the control response matrix of the state equation, but still with a linear state equation and quadratic objective function, a Riccati equation can still be obtained for iterating backward to each period's solution even though certainty equivalence does not apply.^ch.13 The discrete-time case of a non-quadratic loss function but only additive disturbances can also be handled, albeit with more complications.

Example

A typical specification of the discrete-time stochastic linear quadratic control problem is to minimize

\mathrm{E}_1 \sum_{t=1}^{S} \left[ y_t^\mathsf{T} Q y_t + u_t^\mathsf{T} R u_t \right]

where E₁ is the expected value operator conditional on ''y''₀, superscript T indicates a matrix transpose, and ''S'' is the time horizon, subject to the state equation

y_t = A_t y_{t-1} + B_t u_t,

where ''y'' is an ''n'' × 1 vector of observable state variables, ''u'' is a ''k'' × 1 vector of control variables, ''A''_''t'' is the time ''t'' realization of the stochastic ''n'' × ''n'' state transition matrix, ''B''_''t'' is the time ''t'' realization of the stochastic ''n'' × ''k'' matrix of control multipliers, and ''Q'' (''n'' × ''n'') and ''R'' (''k'' × ''k'') are known symmetric positive definite cost matrices. We assume that each element of ''A'' and ''B'' is jointly independently and identically distributed through time, so the expected value operations need not be time-conditional. Induction backwards in time can be used to obtain the optimal control solution at each time,

u_t^* = -\left[\mathrm{E}\left(B^\mathsf{T} X_t B + R\right)\right]^{-1} \mathrm{E}\left(B^\mathsf{T} X_t A\right) y_{t-1},

with the symmetric positive definite cost-to-go matrix ''X'' evolving backwards in time from

X_S = Q

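according to

X_{t-1} = Q + \mathrm{E}\left(A^\mathsf{T} X_t A\right) - \mathrm{E}\left(A^\mathsf{T} X_t B\right) \left[\mathrm{E}\left(B^\mathsf{T} X_t B + R\right)\right]^{-1} \mathrm{E}\left(B^\mathsf{T} X_t A\right),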

which is known as the discrete-time dynamic Riccati equation of this problem. The only information needed regarding the unknown parameters in the ''A'' and ''B'' matrices is the expected value and variance of each element of each matrix and the covariances among elements of the same matrix and among elements across matrices. The optimal control solution is unaffected if zero-mean, i.i.d. additive shocks also appear in the state equation, so long as they are uncorrelated with the parameters in the ''A'' and ''B'' matrices. But if they are so correlated, then the optimal control solution for each period contains an additional additive constant vector. If an additive constant vector appears in the state equation, then again the optimal control solution for each period contains an additional additive constant vector. The steady-state characterization of ''X'' (if it exists), relevant for the infinite-horizon problem in which ''S'' goes to infinity, can be found by iterating the dynamic equation for ''X'' repeatedly until it converges; then ''X'' is characterized by removing the time subscripts from its dynamic equation.
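The backward induction described in this example can be sketched for the scalar case (''n'' = ''k'' = 1), where every matrix expectation reduces to a second moment. This is an illustrative sketch under stated assumptions: the function name, its parameters, and the numerical values are hypothetical, and the parameters ''a'' and ''b'' are taken to be i.i.d. through time with the given means, variances, and covariance.

```python
def stochastic_lq_gains(mean_a, var_a, mean_b, var_b, q, r, horizon, cov_ab=0.0):
    """Backward recursion for scalar y_t = a_t*y_{t-1} + b_t*u_t with i.i.d. (a_t, b_t).

    Returns (gains, x0): the control is u_t = gains[t-1] * y_{t-1} for
    t = 1..horizon, and x0 is the cost-to-go coefficient at the end of
    the backward pass.
    """
    # Only first and second moments of the random parameters are needed.
    e_aa = mean_a ** 2 + var_a          # E[a^2]
    e_bb = mean_b ** 2 + var_b          # E[b^2]
    e_ab = mean_a * mean_b + cov_ab     # E[a*b]

    x = q                               # terminal condition X_S = Q
    gains = [0.0] * horizon
    for t in range(horizon, 0, -1):
        denom = e_bb * x + r            # E[b*X_t*b] + R
        gains[t - 1] = -e_ab * x / denom
        x = q + e_aa * x - (e_ab * x) ** 2 / denom   # X_{t-1}
    return gains, x

# Known parameters (zero variance) versus an uncertain control multiplier b.
ce_gains, _ = stochastic_lq_gains(1.0, 0.0, 1.0, 0.0, q=1.0, r=1.0, horizon=2)
risky_gains, _ = stochastic_lq_gains(1.0, 0.0, 1.0, 1.0, q=1.0, r=1.0, horizon=2)
print(ce_gains)
print(risky_gains)
```

In this example the feedback gains computed with an uncertain ''b'' are smaller in magnitude than those computed with ''b'' fixed at its mean, illustrating that the solution is not the certainty-equivalent one when noise enters multiplicatively.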

Continuous time

If the model is in continuous time, the controller knows the state of the system at each instant of time. The objective is to maximize either an integral of, for example, a concave function of a state variable over a horizon from time zero (the present) to a terminal time ''T'', or a concave function of a state variable at some future date ''T''. As time evolves, new observations are continuously made and the control variables are continuously adjusted in optimal fashion.

Stochastic model predictive control

In the literature, there are two types of model predictive control (MPC) for stochastic systems: robust model predictive control and stochastic model predictive control (SMPC). Robust model predictive control is a more conservative method that considers the worst-case scenario in the optimization procedure. However, this method, like other robust control approaches, degrades overall controller performance and is applicable only to systems with bounded uncertainties. The alternative method, SMPC, considers soft constraints that limit the risk of violation by a probabilistic inequality.
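The difference between the two constraint types can be illustrated for a single scalar constraint x ≤ x_max. The sketch below rests on assumptions that are not part of the text above: the robust case assumes a disturbance bounded by w_max, and the stochastic case assumes a Gaussian prediction error with standard deviation sigma, so that the probabilistic inequality P(x ≤ x_max) ≥ 1 − eps becomes a tightened bound on the predicted mean. The function names and numbers are hypothetical.

```python
from statistics import NormalDist

def robust_bound(x_max, w_max):
    """Bound on the nominal prediction so that x <= x_max for every |w| <= w_max."""
    return x_max - w_max

def chance_constrained_bound(x_max, sigma, eps):
    """Largest predicted mean mu with P(x <= x_max) >= 1 - eps for x ~ N(mu, sigma^2)."""
    z = NormalDist().inv_cdf(1.0 - eps)   # standard normal quantile
    return x_max - z * sigma

# Hypothetical numbers: allow a 5% risk of violating x <= 10.
print(robust_bound(10.0, w_max=6.0))
print(chance_constrained_bound(10.0, sigma=2.0, eps=0.05))
```

Lowering eps tightens the chance-constrained bound, trading performance for a smaller risk of violation, whereas the robust bound depends only on the worst-case disturbance.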

In finance

In a continuous-time approach in a finance context, the state variable in the stochastic differential equation is usually wealth or net worth, and the controls are the shares placed at each time in the various assets. Given the asset allocation chosen at any time, the determinants of the change in wealth are usually the stochastic returns to assets and the interest rate on the risk-free asset. The field of stochastic control has developed greatly since the 1970s, particularly in its applications to finance. Robert Merton used stochastic control to study optimal portfolios of safe and risky assets. His work and that of Black–Scholes changed the nature of the finance literature. Influential mathematical textbook treatments were by Fleming and Rishel, and by Fleming and Soner. These techniques were applied by Stein to the financial crisis of 2007–08.

The maximization, say of the expected logarithm of net worth at a terminal date ''T'', is subject to stochastic processes on the components of wealth. In this case, in continuous time Itô's equation is the main tool of analysis. In the case where the maximization is an integral of a concave function of utility over a horizon (0,''T''), dynamic programming is used. There is no certainty equivalence as in the older literature, because the coefficients of the control variables (that is, the returns received by the chosen shares of assets) are stochastic.
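As a worked illustration of the log-utility case, the following sketch uses assumptions that are not in the text above: a single risky asset with constant drift μ and volatility σ driven by a Brownian motion B, a constant risk-free rate r, and a bounded share π_t of wealth W_t held in the risky asset. Applying Itô's lemma to ln W_t and taking expectations (the stochastic integral has zero mean) reduces the problem to a pointwise maximization over π_t.

```latex
\begin{aligned}
dW_t &= W_t\left[\bigl(r + \pi_t(\mu - r)\bigr)\,dt + \pi_t \sigma\, dB_t\right], \\
d\ln W_t &= \left[r + \pi_t(\mu - r) - \tfrac{1}{2}\pi_t^2\sigma^2\right]dt + \pi_t \sigma\, dB_t, \\
\mathrm{E}\,\ln W_T &= \ln W_0 + \mathrm{E}\int_0^T \left[r + \pi_t(\mu - r) - \tfrac{1}{2}\pi_t^2\sigma^2\right]dt, \\
\pi_t^* &= \frac{\mu - r}{\sigma^2}.
\end{aligned}
```

Under these assumptions the volatility σ enters the optimal share directly, so the solution is not the one obtained by replacing the random return with its mean.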
