The value function of an optimization problem gives the value attained by the objective function at a solution, while depending only on the parameters of the problem. In a controlled dynamical system, the value function represents the optimal payoff of the system over the interval $[t, t_1]$ when started at the time-$t$ state variable $x(t) = x$. If the objective function represents some cost that is to be minimized, the value function can be interpreted as the cost to finish the optimal program, and is thus referred to as the "cost-to-go function". In an economic context, where the objective function usually represents utility, the value function is conceptually equivalent to the indirect utility function.

In a problem of optimal control, the value function is defined as the supremum
of the objective function taken over the set of admissible controls. Given $(t_0, x_0) \in [0, t_1] \times \mathbb{R}^d$, a typical optimal control problem is to

$$\text{maximize} \quad J(t_0, x_0; u) = \int_{t_0}^{t_1} I(t, x(t), u(t)) \, \mathrm{d}t + \phi(x(t_1))$$

subject to

$$\frac{\mathrm{d}x(t)}{\mathrm{d}t} = f(t, x(t), u(t))$$

with initial state variable $x(t_0) = x_0$. The objective function $J(t_0, x_0; u)$ is to be maximized over all admissible controls $u \in U[t_0, t_1]$, where $u$ is a Lebesgue measurable function from $[t_0, t_1]$ to some prescribed arbitrary set in $\mathbb{R}^m$. The value function is then defined as

$$V(t, x(t)) = \sup_{u \in U[t, t_1]} \left\{ \int_{t}^{t_1} I(\tau, x(\tau), u(\tau)) \, \mathrm{d}\tau + \phi(x(t_1)) \right\}$$

with $V(t_1, x(t_1)) = \phi(x(t_1))$, where $\phi(x(t_1))$ is the "scrap value". If the optimal pair of control and state trajectories is $(x^\ast, u^\ast)$, then $V(t_0, x_0) = J(t_0, x_0; u^\ast)$. The function $h$ that gives the optimal control $u^\ast$ based on the current state $x$ is called a feedback control policy, or simply a policy function.

Bellman's principle of optimality roughly states that any optimal policy at time $t$, $t_0 \leq t \leq t_1$, taking the current state $x(t)$ as the "new" initial condition must be optimal for the remaining problem. If the value function happens to be continuously differentiable, this gives rise to an important
partial differential equation known as the Hamilton–Jacobi–Bellman equation,

$$-\frac{\partial V(t,x)}{\partial t} = \max_u \left\{ I(t,x,u) + \frac{\partial V(t,x)}{\partial x} f(t,x,u) \right\}$$

where the maximand on the right-hand side can also be rewritten as the Hamiltonian, $H(t, x, u, \lambda) = I(t,x,u) + \lambda(t) f(t,x,u)$, as

$$-\frac{\partial V(t,x)}{\partial t} = \max_u H(t,x,u,\lambda)$$

with $\partial V(t,x)/\partial x = \lambda(t)$ playing the role of the costate variables. Given this definition, we further have $\mathrm{d}\lambda(t)/\mathrm{d}t = \partial^2 V(t,x)/\partial x \partial t + \partial^2 V(t,x)/\partial x^2 \cdot f(x)$ (with $f(x)$ as shorthand for $f(t,x,u)$ at the maximizing control), and after differentiating both sides of the HJB equation with respect to $x$,

$$-\frac{\partial^2 V(t,x)}{\partial t \, \partial x} = \frac{\partial I}{\partial x} + \frac{\partial^2 V(t,x)}{\partial x^2} f(x) + \frac{\partial V(t,x)}{\partial x} \frac{\partial f(x)}{\partial x}$$

which after replacing the appropriate terms recovers the costate equation

$$-\dot{\lambda}(t) = \underbrace{\frac{\partial I}{\partial x} + \lambda(t) \frac{\partial f(x)}{\partial x}}_{= \, \partial H / \partial x}$$

where $\dot{\lambda}(t)$ is Newton notation for the derivative with respect to time.

The value function is the unique viscosity solution to the Hamilton–Jacobi–Bellman equation. In an
online closed-loop approximate optimal control, the value function is also a Lyapunov function that establishes global asymptotic stability of the closed-loop system.
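
The link between the value function, the policy function, and Bellman's principle can be illustrated numerically. The sketch below is a discrete-time analogue, not the continuous-time problem treated above, and everything in it is an assumption made for illustration: a scalar linear-quadratic problem with dynamics x_{k+1} = a*x_k + b*u_k, the hypothetical helper names `riccati_backward` and `payoff`, and the parameter names `a`, `b`, `q`, `r`, `q_terminal`. For this assumed problem the value function is quadratic, V_k(x) = -p_k x^2, and backward induction from the scrap value yields both p_k and a linear feedback policy.

```python
def riccati_backward(horizon, a=1.0, b=1.0, q=1.0, r=1.0, q_terminal=1.0):
    """Backward induction for the assumed scalar problem

        maximize  -sum_{k=0}^{N-1} (q*x_k**2 + r*u_k**2) - q_terminal*x_N**2
        subject to x_{k+1} = a*x_k + b*u_k.

    Returns lists p and gain such that V_k(x) = -p[k]*x**2 and the
    optimal feedback policy is u_k = -gain[k]*x.
    """
    p = [0.0] * (horizon + 1)
    gain = [0.0] * horizon
    p[horizon] = q_terminal  # terminal condition: V_N(x) equals the scrap value
    for k in range(horizon - 1, -1, -1):
        p_next = p[k + 1]
        denom = r + b * b * p_next
        # First-order condition of the one-step maximization.
        gain[k] = a * b * p_next / denom
        # Substitute the maximizer back in to get the value one period earlier.
        p[k] = q + a * a * p_next - (a * b * p_next) ** 2 / denom
    return p, gain


def payoff(x, policy, horizon, start=0, a=1.0, b=1.0, q=1.0, r=1.0, q_terminal=1.0):
    """Payoff accumulated from period `start` to the horizon under policy(k, x)."""
    total = 0.0
    for k in range(start, horizon):
        u = policy(k, x)
        total -= q * x * x + r * u * u
        x = a * x + b * u
    return total - q_terminal * x * x


if __name__ == "__main__":
    N, x0 = 5, 2.0
    p, gain = riccati_backward(N)
    optimal = lambda k, x: -gain[k] * x
    print("V_0(x0)            =", -p[0] * x0 ** 2)
    print("J under optimal u  =", payoff(x0, optimal, N))
    print("J under u = 0      =", payoff(x0, lambda k, x: 0.0, N))
```

Under these assumptions, the payoff obtained by following the feedback policy should coincide with V_0(x0), any other policy should do no better, and restarting the simulation at a later period from any state should reproduce the value function at that period, which is the content of Bellman's principle.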

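As a worked illustration of the Hamilton–Jacobi–Bellman equation, consider an assumed scalar example that is not taken from the text above: $I(t,x,u) = -(x^2 + u^2)$, $f(t,x,u) = u$, and scrap value $\phi \equiv 0$. Guessing a quadratic value function $V(t,x) = -p(t)\,x^2$ reduces the partial differential equation to an ordinary differential equation for $p$:

```latex
\begin{align*}
-\frac{\partial V}{\partial t}
  &= \max_u \left\{ -(x^2 + u^2) + \frac{\partial V}{\partial x}\, u \right\}
  && \text{(HJB with } I = -(x^2 + u^2),\ f = u\text{)} \\
u^\ast &= \frac{1}{2}\,\frac{\partial V}{\partial x}
  && \text{(first-order condition)} \\
-\frac{\partial V}{\partial t}
  &= -x^2 + \frac{1}{4}\left(\frac{\partial V}{\partial x}\right)^{2}
  && \text{(maximized right-hand side)} \\
\dot{p}(t) &= p(t)^2 - 1, \qquad p(t_1) = 0
  && \text{(substituting } V = -p(t)\,x^2\text{)} \\
p(t) &= \tanh(t_1 - t), \qquad
V(t,x) = -\tanh(t_1 - t)\,x^2, \qquad
u^\ast = -\tanh(t_1 - t)\,x
\end{align*}
```

In this example the costate is $\lambda(t) = \partial V/\partial x = -2\tanh(t_1 - t)\,x$, and along the optimal trajectory it satisfies $-\dot{\lambda}(t) = \partial I/\partial x = -2x$, in agreement with the costate equation since $\partial f/\partial x = 0$.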

Further reading

* Stengel, Robert F. (1994). "Conditions for Optimality". Optimal Control and Estimation. New York: Dover. pp. 201–222. ISBN 0-486-68200-5. https://books.google.com/books?id=jDjPxqm7Lw0C&pg=PA201