In mathematics, Itô's lemma or Itô's formula (also called the Itô–Döblin formula) is an identity used in Itô calculus to find the differential of a time-dependent function of a stochastic process. It serves as the stochastic calculus counterpart of the chain rule. It can be heuristically derived by forming the Taylor series expansion of the function up to its second derivatives and retaining terms up to first order in the time increment and second order in the Wiener process increment. The lemma is widely employed in mathematical finance, and its best known application is in the derivation of the Black–Scholes equation for option values.
This result was discovered by Japanese mathematician Kiyoshi Itô in 1951.
Motivation
Suppose we are given the stochastic differential equation

$$dX_t = \mu_t\,dt + \sigma_t\,dB_t,$$

where $B_t$ is a Wiener process and the functions $\mu_t, \sigma_t$ are deterministic (not stochastic) functions of time. In general, it is not possible to write a solution $X_t$ directly in terms of $B_t$. However, taking $X_0 = 0$, we can formally write an integral solution

$$X_t = \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dB_s.$$

This expression lets us easily read off the mean and variance of $X_t$ (which, being Gaussian, is fully characterized by these two moments). First, notice that every $dB_t$ individually has mean 0, so the expected value of $X_t$ is simply the integral of the drift function:

$$\operatorname{E}[X_t] = \int_0^t \mu_s\,ds.$$

Similarly, because the increments $dB_t$ have variance $dt$ and no correlation with one another, the variance of $X_t$ is simply the integral of the variance of each infinitesimal step in the random walk:

$$\operatorname{Var}[X_t] = \int_0^t \sigma_s^2\,ds.$$
However, sometimes we are faced with a stochastic differential equation for a more complex process $Y_t$, in which the process appears on both sides of the differential equation. That is, say

$$dY_t = a_1(Y_t, t)\,dt + a_2(Y_t, t)\,dB_t,$$

for some functions $a_1$ and $a_2$. In this case, we cannot immediately write a formal solution as we did for the simpler case above. Instead, we hope to write the process $Y_t$ as a function of a simpler process $X_t$ taking the form above. That is, we want to identify three functions $f(t,x)$, $\mu_t$, and $\sigma_t$ such that $Y_t = f(t, X_t)$ and $dX_t = \mu_t\,dt + \sigma_t\,dB_t$. In practice, Itô's lemma is used to find this transformation. Finally, once we have transformed the problem into the simpler type of problem, we can determine the mean and higher moments of the process.
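The mean and variance formulas for the simple process with deterministic drift and volatility can be checked numerically. The following is a minimal sketch, not a reference implementation: it uses a plain Euler–Maruyama discretization, and the coefficient functions `mu` and `sigma`, the horizon, the step count and the seed are all hypothetical choices made only for illustration. With drift $2t$ and volatility $1+t$ on $[0,1]$, the integrals of the drift and of the squared volatility are $1$ and $7/3$.

```python
import math
import random


def simulate_terminal_values(mu, sigma, T=1.0, n_steps=200, n_paths=4000, seed=0):
    """Euler-Maruyama samples of X_T for dX_t = mu(t) dt + sigma(t) dB_t, X_0 = 0."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    samples = []
    for _ in range(n_paths):
        x = 0.0
        for k in range(n_steps):
            t = k * dt
            dB = rng.gauss(0.0, sqrt_dt)  # Wiener increment: mean 0, variance dt
            x += mu(t) * dt + sigma(t) * dB
        samples.append(x)
    return samples


def mean_and_variance(samples):
    """Sample mean and unbiased sample variance."""
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m, v


# Hypothetical coefficients, chosen only for illustration.
mu = lambda t: 2.0 * t
sigma = lambda t: 1.0 + t

mean, var = mean_and_variance(simulate_terminal_values(mu, sigma))
# Compare with the integrals of mu and sigma^2 over [0, 1]: 1 and 7/3.
print(mean, var)
```

The sample estimates should agree with the two integrals up to Monte Carlo and discretization error; more paths or finer steps tighten the agreement.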
Derivation
We derive Itô's lemma by expanding a Taylor series and applying the rules of stochastic calculus.
Suppose $X_t$ is an Itô drift-diffusion process that satisfies the stochastic differential equation

$$dX_t = \mu_t\,dt + \sigma_t\,dB_t,$$

where $B_t$ is a Wiener process.
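If $f(t,x)$ is a twice-differentiable scalar function, its expansion in a Taylor series in each argument is

$$f(t+dt, x) - f(t,x) = \frac{\partial f}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,(dt)^2 + \cdots,$$

$$f(t, x+dx) - f(t,x) = \frac{\partial f}{\partial x}\,dx + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\,(dx)^2 + \cdots.$$

Then use the total derivative and the definition of the partial derivative to combine the two expansions:

$$df = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\,dx + \frac{1}{2}\left(\frac{\partial^2 f}{\partial t^2}\,(dt)^2 + 2\,\frac{\partial^2 f}{\partial t\,\partial x}\,dt\,dx + \frac{\partial^2 f}{\partial x^2}\,(dx)^2\right) + \cdots.$$

Substituting $x = X_t$ and therefore $dx = dX_t = \mu_t\,dt + \sigma_t\,dB_t$, we get

$$df = \frac{\partial f}{\partial t}\,dt + \frac{\partial f}{\partial x}\left(\mu_t\,dt + \sigma_t\,dB_t\right) + \frac{1}{2}\frac{\partial^2 f}{\partial t^2}\,(dt)^2 + \frac{\partial^2 f}{\partial t\,\partial x}\left(\mu_t\,(dt)^2 + \sigma_t\,dt\,dB_t\right) + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\left(\mu_t^2\,(dt)^2 + 2\mu_t\sigma_t\,dt\,dB_t + \sigma_t^2\,(dB_t)^2\right) + \cdots.$$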
In the limit $dt \to 0$, the terms $(dt)^2$ and $dt\,dB_t$ tend to zero faster than $dt$. The term $(dB_t)^2$ is $O(dt)$ (due to the quadratic variation of a Wiener process, which says that the squared increments $(dB_s)^2$ sum to $t$ over $[0,t]$), so setting the $(dt)^2$ and $dt\,dB_t$ terms to zero and substituting $dt$ for $(dB_t)^2$, and then collecting the $dt$ terms, we obtain

$$df = \left(\frac{\partial f}{\partial t} + \mu_t\frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dB_t$$

as required.
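The order-of-magnitude claims used in this derivation can be illustrated numerically. The sketch below, under the assumption of a uniform time grid on $[0,T]$ (the grid size and seed are arbitrary illustrative choices), accumulates the three second-order products along one simulated path: the sum of $(dB)^2$ should be close to $T$, while the sums of $dt\,dB$ and $(dt)^2$ should be close to zero.

```python
import math
import random


def increment_sums(T=1.0, n_steps=100_000, seed=1):
    """Accumulate (dB)^2, dt*dB and (dt)^2 over a uniform grid on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    sum_dB2 = 0.0
    sum_dt_dB = 0.0
    sum_dt2 = 0.0
    for _ in range(n_steps):
        dB = rng.gauss(0.0, sqrt_dt)  # Wiener increment with variance dt
        sum_dB2 += dB * dB            # quadratic variation: tends to T
        sum_dt_dB += dt * dB          # vanishes as dt -> 0
        sum_dt2 += dt * dt            # equals T * dt, vanishes as dt -> 0
    return sum_dB2, sum_dt_dB, sum_dt2


sum_dB2, sum_dt_dB, sum_dt2 = increment_sums()
print(sum_dB2, sum_dt_dB, sum_dt2)
```

Only the $(dB)^2$ sum survives refinement of the grid, which is why it is replaced by $dt$ while the other two products are dropped.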
Geometric intuition

Suppose we know that $X_t, X_{t+dt}$ are two jointly Gaussian random variables, and $f$ is nonlinear but has a continuous second derivative. Then in general, neither of $f(X_t), f(X_{t+dt})$ is Gaussian, and their joint distribution is also not Gaussian. However, since $X_{t+dt}$ conditional on $X_t$ is Gaussian, we might still find that $f(X_{t+dt})$ conditional on $f(X_t)$ is Gaussian. This is not true when $dt$ is finite, but when $dt$ becomes infinitesimal, this becomes true.
The key idea is that $X_{t+dt} = X_t + \mu_t\,dt + \sigma_t\,dB_t$ has a deterministic part and a noisy part. When $f$ is nonlinear, the noisy part has a deterministic contribution. If $f$ is convex, then the deterministic contribution is positive (by Jensen's inequality).
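To find out how large the contribution is, we write $X_{t+dt} = X_t + \mu_t\,dt + \sigma_t\sqrt{dt}\,z$, where $z$ is a standard Gaussian, then perform Taylor expansion:

$$f(X_{t+dt}) = f(X_t) + f'(X_t)\,\mu_t\,dt + f'(X_t)\,\sigma_t\sqrt{dt}\,z + \frac{1}{2}f''(X_t)\left(\sigma_t^2 z^2\,dt + 2\mu_t\sigma_t z\,dt^{3/2} + \mu_t^2\,dt^2\right) + o(dt)$$

$$= \left(f(X_t) + f'(X_t)\,\mu_t\,dt + \frac{1}{2}f''(X_t)\,\sigma_t^2\,dt + o(dt)\right) + \left(f'(X_t)\,\sigma_t\sqrt{dt}\,z + \frac{1}{2}f''(X_t)\,\sigma_t^2\,(z^2 - 1)\,dt + o(dt)\right).$$

We have split it into two parts, a deterministic part, and a random part with mean zero. The random part is non-Gaussian, but the non-Gaussian parts decay faster than the Gaussian part, and in the $dt \to 0$ limit, only the Gaussian part remains. The deterministic part has the expected $f(X_t) + f'(X_t)\,\mu_t\,dt$, but also a part contributed by the convexity: $\frac{1}{2}f''(X_t)\,\sigma_t^2\,dt$.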
To understand why there should be a contribution due to convexity, consider the simplest case of a geometric Brownian walk (of the stock market): $S_{t+dt} = S_t\,e^{dB_t}$. In other words, $d(\ln S_t) = dB_t$. Let $X_t = \ln S_t$, then $S_t = e^{X_t}$, and $X_t$ is a Brownian walk. However, although the expectation of $X_t$ remains constant, the expectation of $S_t$ grows. Intuitively, this is because the downside is limited at zero, but the upside is unlimited. That is, while $X_t$ is normally distributed, $S_t$ is log-normally distributed.
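The asymmetry between the walk and its exponential can be seen in a small experiment. The sketch below assumes the walk is a standard Brownian motion started at zero, so that its value at time $T$ is normal with mean $0$ and variance $T$, and that the price is its exponential; the sample size and seed are arbitrary illustrative choices. For a normal variable with mean $0$ and variance $T$, the exponential has mean $e^{T/2}$.

```python
import math
import random


def sample_means(T=1.0, n_samples=200_000, seed=3):
    """Monte Carlo means of X_T ~ N(0, T) and of S_T = exp(X_T)."""
    rng = random.Random(seed)
    sd = math.sqrt(T)
    sum_x = 0.0
    sum_s = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sd)  # Brownian walk at time T, started at 0
        sum_x += x
        sum_s += math.exp(x)    # log-normally distributed price
    return sum_x / n_samples, sum_s / n_samples


mean_x, mean_s = sample_means()
# mean_x should be near 0, mean_s near exp(T / 2) with T = 1.
print(mean_x, mean_s, math.exp(0.5))
```

The mean of the walk stays near zero while the mean of its exponential exceeds one, which is the convexity contribution in its simplest form.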
Mathematical formulation of Itô's lemma
In the following subsections we discuss versions of Itô's lemma for different types of stochastic processes.
Itô drift-diffusion processes (due to: Kunita–Watanabe)
In its simplest form, Itô's lemma states the following: for an Itô drift-diffusion process

$$dX_t = \mu_t\,dt + \sigma_t\,dB_t$$

and any twice differentiable scalar function $f(t,x)$ of two real variables $t$ and $x$, one has

$$df(t, X_t) = \left(\frac{\partial f}{\partial t} + \mu_t\frac{\partial f}{\partial x} + \frac{\sigma_t^2}{2}\frac{\partial^2 f}{\partial x^2}\right)dt + \sigma_t\frac{\partial f}{\partial x}\,dB_t.$$

This immediately implies that $f(t, X_t)$ is itself an Itô drift-diffusion process.
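As a concrete check of the one-dimensional statement, the sketch below takes the square function, for which the time derivative is zero, the first spatial derivative is $2x$ and the second is $2$, so Itô's lemma gives a drift of $2\mu x + \sigma^2$ and a diffusion coefficient of $2\sigma x$. It integrates this formula along one Euler–Maruyama path and compares the result with the square of the simulated process. Constant coefficients, the starting value, the step count and the seed are hypothetical choices for illustration; the "naive" variant omits the second-order term to show what the ordinary chain rule would miss.

```python
import math
import random


def ito_check_square(mu=0.5, sigma=0.8, x0=1.0, T=1.0, n_steps=10_000, seed=2):
    """Compare X_T^2 with Ito's formula for f(x) = x^2 integrated along one path."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    x = x0
    y_ito = x0 ** 2    # integrates df with the sigma^2 dt correction
    y_naive = x0 ** 2  # integrates the ordinary chain rule, no correction
    for _ in range(n_steps):
        dB = rng.gauss(0.0, sqrt_dt)
        # Ito: d(X^2) = (2*mu*X + sigma^2) dt + 2*sigma*X dB
        y_ito += (2.0 * mu * x + sigma ** 2) * dt + 2.0 * sigma * x * dB
        # Ordinary chain rule: d(X^2) = 2*X dX
        y_naive += 2.0 * x * (mu * dt + sigma * dB)
        x += mu * dt + sigma * dB
    return x * x, y_ito, y_naive


x_squared, y_ito, y_naive = ito_check_square()
print(x_squared, y_ito, y_naive)
```

The Itô-integrated value tracks the squared process closely, while the naive value falls short by $\sigma^2 T$, the accumulated second-order term.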
In higher dimensions, if $\mathbf{X}_t = (X_t^1, X_t^2, \ldots, X_t^n)^T$ is a vector of Itô processes such that

$$d\mathbf{X}_t = \boldsymbol{\mu}_t\,dt + \mathbf{G}_t\,d\mathbf{B}_t$$

for a vector $\boldsymbol{\mu}_t$ and matrix $\mathbf{G}_t$, Itô's lemma then states that

$$df(t, \mathbf{X}_t) = \frac{\partial f}{\partial t}\,dt + (\nabla_{\mathbf{X}} f)^T\,d\mathbf{X}_t + \frac{1}{2}\,(d\mathbf{X}_t)^T\,(H_{\mathbf{X}} f)\,d\mathbf{X}_t$$

$$= \left\{\frac{\partial f}{\partial t} + (\nabla_{\mathbf{X}} f)^T\boldsymbol{\mu}_t + \frac{1}{2}\operatorname{Tr}\left[\mathbf{G}_t^T\,(H_{\mathbf{X}} f)\,\mathbf{G}_t\right]\right\}dt + (\nabla_{\mathbf{X}} f)^T\,\mathbf{G}_t\,d\mathbf{B}_t,$$

where $\nabla_{\mathbf{X}} f$ is the gradient of $f$ w.r.t. $\mathbf{X}$, $H_{\mathbf{X}} f$ is the Hessian matrix of $f$ w.r.t. $\mathbf{X}$, and $\operatorname{Tr}$ is the trace operator.
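The multidimensional drift and diffusion coefficients can be evaluated directly once the gradient, Hessian, drift vector and diffusion matrix are known at a point. The following is a small sketch using plain Python lists; the helper names `ito_drift` and `ito_diffusion` and the numerical values are hypothetical and serve only as an illustration. The example uses the product of two coordinates, whose gradient is $(y, x)$ and whose Hessian has ones off the diagonal, with two processes of unit volatility and correlation `rho`, so the trace term reduces to `rho`.

```python
import math


def ito_drift(f_t, grad, hess, mu, G):
    """dt coefficient: f_t + grad^T mu + 0.5 * Tr(G^T H G)."""
    n = len(mu)
    m = len(G[0])
    first_order = sum(grad[i] * mu[i] for i in range(n))
    # Tr(G^T H G) = sum over i, j, k of G[i][k] * H[i][j] * G[j][k]
    trace = sum(
        G[i][k] * hess[i][j] * G[j][k]
        for i in range(n)
        for j in range(n)
        for k in range(m)
    )
    return f_t + first_order + 0.5 * trace


def ito_diffusion(grad, G):
    """Row vector grad^T G that multiplies dB_t."""
    n = len(grad)
    m = len(G[0])
    return [sum(grad[i] * G[i][k] for i in range(n)) for k in range(m)]


# Hypothetical example: f(x, y) = x * y evaluated at (x, y) = (2, 3).
x, y = 2.0, 3.0
rho = 0.5
mu = [0.1, 0.2]
G = [[1.0, 0.0], [rho, math.sqrt(1.0 - rho ** 2)]]  # unit volatilities, correlation rho
grad = [y, x]
hess = [[0.0, 1.0], [1.0, 0.0]]

drift = ito_drift(0.0, grad, hess, mu, G)
diffusion = ito_diffusion(grad, G)
print(drift, diffusion)
```

Here the drift is $y\mu_1 + x\mu_2 + \rho$: the first two terms come from the ordinary product rule, and the extra $\rho$ is the Itô correction from the trace term.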
Poisson jump processes
We may also define functions on discontinuous stochastic processes.
Let $h$ be the jump intensity. The Poisson process model for jumps is that the probability of one jump in the interval $[t, t+\Delta t]$ is $h\,\Delta t$ plus higher order terms. The intensity $h$ could be a constant, a deterministic function of time, or a stochastic process. The survival probability $p_s(t)$ is the probability that no jump has occurred in the interval $[0,t]$. The change in the survival probability is

$$dp_s(t) = -p_s(t)\,h(t)\,dt.$$

So

$$p_s(t) = \exp\left(-\int_0^t h(u)\,du\right).$$
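The relation between the change in the survival probability and its exponential form can be illustrated for a deterministic intensity. The sketch below steps the survival probability forward, reducing it in each small interval by the probability of a jump in that interval, and compares the result with the exponential of minus the integrated intensity. The intensity $0.5 + t$, the horizon and the step count are hypothetical choices for illustration; over $[0,2]$ the integrated intensity is $3$.

```python
import math


def survival_euler(h, T, n_steps=100_000):
    """Step dp = -p * h(t) * dt forward from p(0) = 1 with explicit Euler."""
    dt = T / n_steps
    p = 1.0
    for k in range(n_steps):
        p -= p * h(k * dt) * dt  # lose the fraction h*dt that jumps in this step
    return p


# Hypothetical deterministic intensity, chosen only for illustration.
h = lambda t: 0.5 + t
T = 2.0

p_numeric = survival_euler(h, T)
p_closed = math.exp(-(0.5 * T + 0.5 * T ** 2))  # exp(-integral of h over [0, T])
print(p_numeric, p_closed)
```

The two values agree up to the discretization error of the Euler scheme, which shrinks as the step count grows.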
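Let $S(t)$ be a discontinuous stochastic process. Write $S(t^-)$ for the value of $S$ as we approach $t$ from the left. Write $d_j S(t)$ for the non-infinitesimal change in $S(t)$ as a result of a jump. Then

$$d_j S(t) = \lim_{\Delta t \to 0}\left(S(t + \Delta t) - S(t^-)\right).$$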