Dynamic programming is both a
mathematical optimization
method and a computer programming method. The method was developed by
Richard Bellman
in the 1950s and has found applications in numerous fields, from
aerospace engineering
to
economics
.
In both contexts it refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a
recursive
manner. While some decision problems cannot be taken apart this way, decisions that span several points in time do often break apart recursively. Likewise, in computer science, if a problem can be solved optimally by breaking it into sub-problems and then recursively finding the optimal solutions to the sub-problems, then it is said to have ''
optimal substructure
''.
If sub-problems can be nested recursively inside larger problems, so that dynamic programming methods are applicable, then there is a relation between the value of the larger problem and the values of the sub-problems.
[Cormen, T. H.; Leiserson, C. E.; Rivest, R. L.; Stein, C. (2001), ''Introduction to Algorithms'' (2nd ed.), MIT Press & McGraw–Hill, p. 344.] In the optimization literature this relationship is called the
Bellman equation
.
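This relation between the value of a problem and the values of its sub-problems can be made concrete with the classic rod-cutting problem: the best revenue from a rod of length ''n'' is the best over all first cuts of the piece's price plus the best revenue from the remainder. A minimal sketch in Python, where the price table is an illustrative assumption:

```python
from functools import lru_cache

# Hypothetical prices for rod pieces of length 1..4 (an assumption for
# illustration; any price table works).
PRICES = {1: 1, 2: 5, 3: 8, 4: 9}

@lru_cache(maxsize=None)   # memoize: each sub-problem is solved once
def best_revenue(n: int) -> int:
    """Maximum revenue from cutting a rod of length n.

    Optimal substructure: an optimal cutting of length n is a first
    piece of some length i plus an optimal cutting of length n - i.
    """
    if n == 0:
        return 0
    return max(PRICES[i] + best_revenue(n - i)
               for i in range(1, min(n, max(PRICES)) + 1))

print(best_revenue(4))  # best of 1+8, 5+5, 8+1, 9+0
```

The memoization cache is what distinguishes this from plain recursion: each sub-problem's value is computed once and reused.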
Overview
Mathematical optimization
In terms of mathematical optimization, dynamic programming usually refers to simplifying a decision by breaking it down into a sequence of decision steps over time. This is done by defining a sequence of value functions ''V''1, ''V''2, ..., ''V''''n'' taking ''y'' as an argument representing the state of the system at times ''i'' from 1 to ''n''. The definition of ''V''''n''(''y'') is the value obtained in state ''y'' at the last time ''n''. The values ''V''''i'' at earlier times ''i'' = ''n'' − 1, ''n'' − 2, ..., 2, 1 can be found by working backwards, using a recursive relationship called the Bellman equation. For ''i'' = 2, ..., ''n'', ''V''''i''−1 at any state ''y'' is calculated from ''V''''i'' by maximizing a simple function (usually the sum) of the gain from a decision at time ''i'' − 1 and the function ''V''''i'' at the new state of the system if this decision is made. Since ''V''''i'' has already been calculated for the needed states, the above operation yields ''V''''i''−1 for those states. Finally, ''V''1 at the initial state of the system is the value of the optimal solution. The optimal values of the decision variables can be recovered, one by one, by tracking back the calculations already performed.
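The backward pass can be sketched in Python. The states, gains, terminal values, and transition rule below are illustrative assumptions, not anything fixed by the text above:

```python
STATES = range(5)     # possible states y (assumed: integers 0..4)
DECISIONS = (0, 1)    # assumed: two decisions per step
N = 3                 # last time n

def gain(y, d):
    # Assumed gain from taking decision d in state y.
    return y if d == 0 else 2 * (4 - y)

def step(y, d):
    # Assumed new state of the system if decision d is made.
    return min(4, y + 1) if d == 1 else max(0, y - 1)

# V_n(y): the value obtained in state y at the last time n
# (assumed here to be y itself, e.g. a salvage value).
V = {N: {y: y for y in STATES}}

# Work backwards: V_{i-1}(y) = max over d of gain(y, d) + V_i(step(y, d)).
for i in range(N, 1, -1):
    V[i - 1] = {y: max(gain(y, d) + V[i][step(y, d)] for d in DECISIONS)
                for y in STATES}

print(V[1])  # V_1: value of the optimal solution from each initial state
```

Tracking which decision attains each maximum (not shown) recovers the optimal decisions themselves, as the paragraph above describes.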
Control theory
In control theory, a typical problem is to find an admissible control u^{\ast}(t) which causes the system \dot{x}(t) = g(x(t), u(t), t) to follow an admissible trajectory x^{\ast}(t) on a continuous time interval t_0 \le t \le t_1 that minimizes a cost function

: J = b(x(t_1), t_1) + \int_{t_0}^{t_1} f(x(t), u(t), t) \, dt

The solution to this problem is an optimal control law or policy u^{\ast} = h(x(t), t), which produces an optimal trajectory x^{\ast}(t) and a cost-to-go function J^{\ast}. The latter obeys the fundamental equation of dynamic programming:

: -J_t^{\ast} = \min_{u} \left\{ f(x(t), u(t), t) + J_x^{\ast\mathsf{T}} \, g(x(t), u(t), t) \right\}

a partial differential equation known as the Hamilton–Jacobi–Bellman equation, in which J_x^{\ast} = \partial J^{\ast} / \partial x and J_t^{\ast} = \partial J^{\ast} / \partial t. One minimizes f + J_x^{\ast\mathsf{T}} g in terms of u, t, and the unknown function J_x^{\ast}, and then substitutes the result into the Hamilton–Jacobi–Bellman equation to get the partial differential equation to be solved with boundary condition J(t_1) = b(x(t_1), t_1). In practice, this generally requires numerical techniques for some discrete approximation to the exact optimization relationship.
Alternatively, the continuous process can be approximated by a discrete system, which leads to the following recurrence relation, analogous to the Hamilton–Jacobi–Bellman equation:

: J_k^{\ast}(x_{n-k}) = \min_{u_{n-k}} \left\{ \hat{f}(x_{n-k}, u_{n-k}) + J_{k-1}^{\ast}\big(\hat{g}(x_{n-k}, u_{n-k})\big) \right\}

at the k-th stage of n equally spaced discrete time intervals, and where \hat{f} and \hat{g} denote discrete approximations to f and g. This functional equation is known as the Bellman equation, which can be solved for an exact solution of the discrete approximation of the optimization equation.
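This discrete recurrence can be sketched on a grid. The toy problem below (dynamics \dot{x} = u, running cost x^2 + u^2, horizon 1 split into n equal steps) and the grids and nearest-grid-point snapping are all numerical assumptions made for illustration:

```python
import numpy as np

n = 50                                   # number of equal time steps
dt = 1.0 / n
xs = np.linspace(-2.0, 2.0, 81)          # state grid (assumed range)
us = np.linspace(-2.0, 2.0, 41)          # control grid (assumed range)
dx = xs[1] - xs[0]

J = np.zeros(len(xs))                    # J*_0: no cost remains at the end
for _ in range(n):                       # k = 1, ..., n, working backwards
    new_J = np.empty_like(J)
    for i, x in enumerate(xs):
        nxt = x + us * dt                # Euler step of x' = u (g-hat)
        # Snap each successor state to the nearest grid point.
        idx = np.clip(np.rint((nxt - xs[0]) / dx).astype(int),
                      0, len(xs) - 1)
        cost = (x**2 + us**2) * dt + J[idx]   # f-hat + J*_{k-1}
        new_J[i] = cost.min()            # minimize over the control
    J = new_J

# J now approximates the cost-to-go from each initial state x;
# it is smallest near x = 0, where the control u = 0 incurs no cost.
```

Finer grids (or interpolation instead of snapping) tighten the approximation, at the usual price set by the curse of dimensionality.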
Example from economics: Ramsey's problem of optimal saving
In economics, the objective is generally to maximize (rather than minimize) some dynamic
social welfare function
. In Ramsey's problem, this function relates amounts of consumption to levels of
utility
. Loosely speaking, the planner faces the trade-off between contemporaneous consumption and future consumption (via investment in
capital stock
that is used in production), known as
intertemporal choice
. Future consumption is discounted at a constant rate \beta \in (0, 1). A discrete approximation to the transition equation of capital is given by

: k_{t+1} = \hat{g}(k_t, c_t) = f(k_t) - c_t

where ''c'' is consumption, ''k'' is capital, and ''f'' is a production function satisfying the Inada conditions. An initial capital stock k_0 > 0 is assumed.
Let c_t be consumption in period ''t'', and assume consumption yields utility u(c_t) = \ln(c_t) as long as the consumer lives. Assume the consumer is impatient, so that he discounts future utility by a factor ''b'' each period, where 0 < ''b'' < 1.
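The planner's trade-off between consuming today and investing in capital can be sketched by backward induction on a capital grid. The finite horizon, the grid, and the Cobb–Douglas technology below are illustrative assumptions, not part of Ramsey's original formulation:

```python
import numpy as np

b = 0.9                            # discount factor, 0 < b < 1
T = 10                             # assumed finite horizon
ks = np.linspace(0.05, 2.0, 200)   # capital grid; also next-period choices

def f(k):
    return k ** 0.3                # assumed production function

V = np.zeros_like(ks)              # no utility after the last period
for _ in range(T):                 # work backwards from the last period
    new_V = np.empty_like(ks)
    for i, k in enumerate(ks):
        c = f(k) - ks              # consumption implied by each k_{t+1}
        feasible = c > 0           # cannot consume more than output
        new_V[i] = (np.log(c[feasible]) + b * V[feasible]).max()
    V = new_V

# V[i] approximates the planner's value of starting with capital ks[i];
# it is increasing in the initial capital stock.
```

Recording the maximizing choice of k_{t+1} at each grid point (not shown) yields the optimal consumption plan itself.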