Kingman's Subadditive Ergodic Theorem
In mathematics, Kingman's subadditive ergodic theorem is one of several ergodic theorems. It can be seen as a generalization of Birkhoff's ergodic theorem.S. Lalley, ''Kingman's subadditive ergodic theorem'', lecture notes, http://galton.uchicago.edu/~lalley/Courses/Graz/Kingman.pdf Intuitively, the subadditive ergodic theorem is a kind of random-variable version of Fekete's subadditive lemma (hence the name ergodic). As a result, it can be rephrased in the language of probability, e.g. using a sequence of random variables and expected values. The theorem is named after John Kingman.


Statement of theorem

Let T be a measure-preserving transformation on the probability space (\Omega,\Sigma,\mu), and let \{g_n\}_{n\geq 1} be a sequence of L^1 functions satisfying the subadditivity relation g_{n+m}(x)\le g_n(x)+g_m(T^nx). Then the limit : \lim_{n\to\infty}\frac{g_n(x)}{n}=:g(x)\ge-\infty exists for \mu-a.e. ''x'', where ''g''(''x'') is ''T''-invariant. In particular, if ''T'' is ergodic, then ''g''(''x'') is a constant.


Equivalent statement

Given a family of real random variables X(m, n), with 0 \leq m < n \in \N, that are subadditive in the sense that\begin{aligned} & X(m+1, n+1)=X(m, n) \circ T \\ & X(0, n) \leq X(0, m)+X(m, n), \end{aligned}there exists a random variable Y such that Y \in [-\infty, +\infty), Y is invariant with respect to T, and \lim_n \frac 1n X(0, n) = Y a.s. The two statements are equivalent by setting * g_n = X(0, n) with n \geq 1; * X(m, m+n) = g_n \circ T^m with m \geq 0.
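As a numerical illustration (a sketch with an assumed concrete example, not part of the theorem statement), take T to be the shift on an i.i.d. sequence and X(m, n) = \max_{m \leq k \leq n}(S_k - S_m), the running maximum of partial sums over the window. This family is subadditive and shift-stationary, and with negative-mean steps its limit Y is 0:

```python
import random

# Sketch: the running maximum of partial sums is a subadditive,
# shift-stationary family.  The underlying system (i.i.d. steps, shift
# map) and the specific family X are illustrative assumptions.

random.seed(0)
omega = [random.uniform(-1.0, 0.5) for _ in range(20000)]  # negative-mean steps

def X(m, n):
    """X(m, n) = max over m <= k <= n of (S_k - S_m)."""
    best, s = 0.0, 0.0
    for k in range(m, n):
        s += omega[k]
        best = max(best, s)
    return best

# Subadditivity X(0, n) <= X(0, m) + X(m, n) at a few sample points.
for m, n in [(10, 50), (100, 400), (300, 1000)]:
    assert X(0, n) <= X(0, m) + X(m, n) + 1e-12

# Kingman's limit: X(0, n)/n converges; here the drift is negative, so
# the running maximum stays bounded and the limit is 0.
print(X(0, 20000) / 20000)  # small relative to n
```

The subadditivity here is exact: any partial sum S_k with k > m decomposes as S_m + (S_k - S_m), each piece bounded by the corresponding window maximum.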


Proof

This proof is due to J. Michael Steele (1989).


Subadditivity by partition

Fix some n\geq 1. By subadditivity, for any l\in 1:n-1, g_n \leq g_{n-l} + g_l \circ T^{n-l} We can picture this as starting with the set 0:n-1, and then removing its length-l tail. Repeating this construction until the set 0:n-1 is all gone, we have a one-to-one correspondence between upper bounds of g_n of this form and partitions of 0:n-1 into intervals. Specifically, let \{k_i : k_i + l_i - 1\}_i be a partition of 0:n-1 into intervals with heads k_i and lengths l_i; then we have g_n \leq \sum_i g_{l_i}\circ T^{k_i}
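The partition inequality can be checked numerically. Below is a sketch (the subadditive family, a running maximum of partial sums under the shift, is an illustrative assumption, not from the article): any interval partition of 0:n-1 must yield a valid upper bound on g_n.

```python
import random

# Sketch: verify g_n <= sum_i g_{l_i} o T^{k_i} for an interval
# partition of {0, ..., n-1}.  The system (shift on a fixed sample) and
# the family g_n (running maximum of partial sums) are assumptions.

random.seed(1)
omega = [random.uniform(-1.0, 1.0) for _ in range(2000)]

def g(n, offset=0):
    """g_n evaluated at T^offset x: max of partial sums over n steps."""
    best, s = 0.0, 0.0
    for k in range(offset, offset + n):
        s += omega[k]
        best = max(best, s)
    return best

n = 1000
# Intervals [k_i, k_i + l_i - 1] partitioning {0, ..., n-1}.
partition = [(0, 300), (300, 150), (450, 550)]  # (k_i, l_i) pairs
assert sum(l for _, l in partition) == n

bound = sum(g(l, offset=k) for k, l in partition)
assert g(n) <= bound + 1e-12
print(g(n), "<=", bound)
```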


Constructing ''g''

Let g := \liminf_n g_n/n; then g is T-invariant, as follows. By subadditivity, \frac{g_{n+1}(x)}{n} \leq \frac{g_1(x)}{n} + \frac{g_n(Tx)}{n} Taking the n\to \infty limit, we have g \leq g\circ T We can visualize T as hill-climbing on the graph of g. If T actually caused a nontrivial amount of hill-climbing, then we would get a spatial contraction, and so T would not preserve measure. Therefore g = g\circ T a.e. In detail, let c\in \R; then \{g \geq c\} \subset \{g\circ T \geq c\} = T^{-1}(\{g \geq c\}) and since both sides have the same measure, by squeezing, they are equal a.e. That is, g(x) \geq c \iff g(Tx) \geq c, a.e. Now apply this for all rational c.


Reducing to the case of ''g_n ≤ 0''

By subadditivity, using the partition of 0:n-1 into singletons, \begin{aligned} g_1 &\leq g_1 \\ g_2 &\leq g_1 + g_1 \circ T \\ g_3 &\leq g_1 + g_1 \circ T + g_1 \circ T^2 \\ & \cdots \end{aligned} Now, construct the sequence \begin{aligned} f_1 &= g_1 - g_1 \\ f_2 &= g_2 - (g_1 + g_1 \circ T) \\ f_3 &= g_3 - (g_1 + g_1 \circ T + g_1 \circ T^2) \\ & \cdots \end{aligned} which is still subadditive and satisfies f_n \leq 0 for all n. By the special case, f_n/n converges a.e. to a T-invariant function. By Birkhoff's pointwise ergodic theorem, the running average \frac 1n (g_1 + g_1 \circ T + \cdots + g_1 \circ T^{n-1}) converges a.e. to a T-invariant function. Therefore, their sum, g_n/n, does as well.
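The reduction can be sketched numerically (the concrete subadditive family below, a running maximum of partial sums under the shift, is an illustrative assumption): f_n = g_n - (g_1 + g_1∘T + ... + g_1∘T^{n-1}) is forced to be nonpositive, because the singleton partition bounds g_n by the Birkhoff sum of g_1.

```python
import random

# Sketch: check that f_n = g_n - sum_{j<n} g_1 o T^j satisfies
# f_n <= 0 for a concrete subadditive family g_n (an assumption for
# illustration: running maximum of partial sums under the shift).

random.seed(2)
omega = [random.uniform(-1.0, 1.0) for _ in range(500)]

def g(n, offset=0):
    best, s = 0.0, 0.0
    for k in range(offset, offset + n):
        s += omega[k]
        best = max(best, s)
    return best

def f(n):
    birkhoff_sum = sum(g(1, offset=j) for j in range(n))  # g_1 + g_1 o T + ...
    return g(n) - birkhoff_sum

for n in range(1, 100):
    assert f(n) <= 1e-12  # f_n <= 0 for every n
print("f_n <= 0 verified for n = 1..99")
```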


Bounding the truncation

Fix arbitrary \epsilon, M > 0, and construct the truncated function, still T-invariant: g' := \max(g, -M) With these, it suffices to prove an a.e. upper bound\limsup_n g_n/n \leq g' + \epsilonsince it would allow us to take the limit \epsilon = 1/1, 1/2, 1/3, \dots, then the limit M = 1, 2, 3, \dots, giving us a.e. \limsup_n g_n/n \leq \liminf_n g_n/n =: g And by squeezing, we have g_n/n converging a.e. to g. Define two families of sets, one shrinking to the empty set, and one growing to the full set. For each "length" L = 1, 2, 3, \dots, defineB_L := \{x : g_l(x)/l > g'(x) + \epsilon \text{ for all } l \in 1:L\} A_L := B_L^c = \{x : g_l(x)/l \leq g'(x) + \epsilon \text{ for some } l \in 1:L\}Since g' \geq \liminf_n g_n/n, the B family shrinks to a null set as L \to \infty. Fix x \in X. Fix L \in \N. Fix n > L. The ordering of these qualifiers is vitally important, because we will be removing the qualifiers one by one in the reverse order. To prove the a.e. upper bound, we must use the subadditivity, which means we must construct a partition of the set 0:n-1. We do this inductively:
Take the smallest k not already covered by a partition piece.
If T^k x \in A_L, then g_l(T^k x)/l \leq g'(T^k x) + \epsilon = g'(x) + \epsilon for some l\in 1, 2, \dots, L (using the T-invariance of g'). Take one such l – the choice does not matter.
If k+l-1 \leq n-1, then we cut out the interval k:(k+l-1). Call these pieces "type 1". Else, we cut out k:(n-1). Call these pieces "type 2".
Else (that is, T^k x \in B_L), we cut out the singleton \{k\}. Call these pieces "type 3".
Now convert this partition into an inequality: g_n(x) \leq \sum_i g_{l_i}(T^{k_i}x) where k_i are the heads of the pieces, and l_i are the lengths. Since all g_n \leq 0, we can discard the type 2 and type 3 terms: g_n(x) \leq \sum_{i \in \text{type 1}} g_{l_i}(T^{k_i}x) By construction, each g_{l_i}(T^{k_i}x) \leq l_i(g'(x) + \epsilon), thus \frac 1n g_n(x) \leq g'(x) \frac 1n \sum_{i \in \text{type 1}} l_i + \epsilon Now it would be tempting to continue with g'(x) \frac 1n \sum_{i \in \text{type 1}} l_i \leq g'(x), but unfortunately g' may be negative, in which case the inequality points the exact opposite way. We must instead ''lower'' bound the sum \sum_{i \in \text{type 1}} l_i. The number of type 3 elements is at most \sum_{k \in 0:n-1} 1_{B_L}(T^k x) If a number k heads a type 2 piece, then it must lie inside the last L-1 elements of 0:n-1. Thus the number of elements covered by type 2 pieces is at most L-1. Together, we have the ''lower'' bound: \frac 1n \sum_{i \in \text{type 1}} l_i \geq 1 - \frac{L-1}{n} - \frac 1n \sum_{k \in 0:n-1} 1_{B_L}(T^k x)
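The inductive construction of the partition can be sketched as code. The helper good(k) below is a hypothetical stand-in for the membership test "T^k x \in A_L, with witness length l"; it is an assumption for illustration, not part of the proof.

```python
# Sketch of the inductive partition of {0, ..., n-1} into type-1/2/3
# pieces.  good(k) is a hypothetical oracle: it returns a length l in
# {1, ..., L} when T^k x lies in A_L (some g_l(T^k x)/l <= g'(x) + eps),
# and None when T^k x lies in B_L.

def partition_indices(n, L, good):
    pieces = []  # (head k_i, length l_i, type) triples
    k = 0
    while k < n:
        l = good(k)
        if l is not None:                      # T^k x in A_L
            if k + l - 1 <= n - 1:
                pieces.append((k, l, 1))       # type 1: full block of length l
                k += l
            else:
                pieces.append((k, n - k, 2))   # type 2: truncated tail block
                k = n
        else:
            pieces.append((k, 1, 3))           # type 3: singleton, T^k x in B_L
            k += 1
    return pieces

# Toy oracle: indices divisible by 3 are "good" with witness length 2.
pieces = partition_indices(10, 2, lambda k: 2 if k % 3 == 0 else None)
assert sum(l for _, l, _ in pieces) == 10  # the pieces cover {0, ..., 9}
print(pieces)
```

Note that only type-2 pieces can reach past index n-1, which is why they are confined to the last L-1 slots.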


Peeling off the first qualifier

Remove the n>L qualifier by taking the n\to \infty limit. By Birkhoff's pointwise ergodic theorem, there exists an a.e. pointwise limit \lim_n \frac 1n \sum_{k \in 0:n-1} 1_{B_L}(T^k x) = \bar 1_{B_L}(x) satisfying
\int \bar 1_{B_L} \, d\mu = \mu(B_L); \quad \bar 1_{B_L}(x) \in [0, 1]
At the limit, we find that for a.e. x\in X and every L \in \N, \limsup_n \frac{g_n(x)}{n} \leq g'(x) (1- \bar 1_{B_L}(x) ) + \epsilon


Peeling off the second qualifier

Remove the L \in \N qualifier by taking the L\to \infty limit. Since we have \int \bar 1_{B_L} \, d\mu = \mu(B_L) \to 0 and \bar 1_{B_1} \geq \bar 1_{B_2} \geq \cdots as 1_{B_1} \geq 1_{B_2} \geq \cdots, we can apply the same argument used for proving Markov's inequality, to obtain
\limsup_n \frac{g_n(x)}{n} \leq g'(x) + \epsilon for a.e. x\in X. In detail, the argument is as follows: since \bar 1_{B_1} \geq \bar 1_{B_2} \geq \cdots \geq 0, and \int \bar 1_{B_L} \, d\mu \to 0, we know that for any small \delta, \delta' > 0, every large enough L satisfies \bar 1_{B_L}(x) < \delta everywhere except on a set of measure at most \delta'. Thus (recalling that g' \leq 0 after the earlier reduction), \limsup_n \frac{g_n(x)}{n} \leq g'(x)(1-\delta) + \epsilon with probability \geq 1-\delta'. Now take both \delta, \delta' \to 0.


Applications

Taking g_n(x):=\sum_{j=0}^{n-1}f(T^jx) recovers Birkhoff's pointwise ergodic theorem. Taking all g_n to be constant functions, we recover Fekete's subadditive lemma. Kingman's subadditive ergodic theorem can be used to prove statements about Lyapunov exponents. It also has applications to percolation theory and to the longest increasing subsequence problem.Pitman, ''Lecture 12: Subadditive ergodic theory'', http://www.stat.berkeley.edu/~pitman/s205s03/lecture12.pdf
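The Fekete special case can be seen directly. As a sketch (the particular sequence below is an illustrative assumption), take the subadditive sequence a_n = n + \sqrt n; Fekete's lemma predicts a_n/n \to \inf_n a_n/n = 1:

```python
import math

# Sketch: Fekete's subadditive lemma as the deterministic special case
# of Kingman's theorem.  The sequence a_n = n + sqrt(n) is an assumed
# example; it is subadditive because sqrt is concave with sqrt(0) = 0.

def a(n):
    return n + math.sqrt(n)

# Subadditivity: a_{m+n} <= a_m + a_n.
for m in range(1, 50):
    for n in range(1, 50):
        assert a(m + n) <= a(m) + a(n) + 1e-12

ratios = [a(n) / n for n in (10, 100, 10_000, 1_000_000)]
print(ratios)  # decreasing toward the infimum, which is 1
assert ratios == sorted(ratios, reverse=True)
```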


Longest increasing subsequence

To study the longest increasing subsequence of a random permutation \pi, we generate it in an equivalent way: a random permutation of 1:n is equivalently generated by uniformly sampling n points in a square and taking the longest increasing subsequence of the resulting point set. Now, define the Poisson point process with density 1 on [0, \infty)^2, and define the random variables M^*_k to be the length of the longest increasing subsequence of the points in the square [0, k)^2. Define the measure-preserving transform T by shifting the plane by (-1, -1), then chopping off the parts that have fallen out of [0, \infty)^2. The process is superadditive, that is, M_{k+m}^* \geq M_{k}^* + M_{m}^* \circ T^k. To see this, notice that the right side constructs an increasing subsequence first in the square [0, k)^2, then in the square [k, k+m)^2, and finally concatenates them together. This produces an increasing subsequence in [0, k+m)^2, but not necessarily the longest one. Also, T is ergodic, so applying Kingman's theorem to the subadditive family -M^*_k, M_k^* /k converges to a constant almost surely. Since at the limit, there are n = k^2 points in the square, we have L_n^* / \sqrt n converging to a constant almost surely, where L_n^* is the length of the longest increasing subsequence of a uniform random permutation of 1:n.
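The constant can be estimated by simulation. Below is a sketch: patience sorting computes the longest increasing subsequence length in O(n log n), and the ratio L_n^*/\sqrt n approaches a constant (known from the Vershik–Kerov / Logan–Shepp theorem to be 2, though Kingman's theorem alone only gives existence):

```python
import random
from bisect import bisect_left

# Sketch: Monte Carlo estimate of the constant in L_n^*/sqrt(n).
# Patience sorting: pile tops form the smallest possible tails of
# increasing subsequences; the number of piles is the LIS length.

def lis_length(seq):
    """Length of the longest strictly increasing subsequence of seq."""
    piles = []
    for x in seq:
        i = bisect_left(piles, x)
        if i == len(piles):
            piles.append(x)  # start a new pile
        else:
            piles[i] = x     # place on the leftmost feasible pile
    return len(piles)

random.seed(3)
n = 100_000
perm = list(range(n))
random.shuffle(perm)
print(lis_length(perm) / n ** 0.5)  # approaches the limiting constant
```

For finite n the estimate sits somewhat below 2, since the known finite-size correction to E[L_n^*] is negative.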


References

* S. Lalley, ''Kingman's subadditive ergodic theorem'', lecture notes, http://galton.uchicago.edu/~lalley/Courses/Graz/Kingman.pdf
* J. Pitman, ''Lecture 12: Subadditive ergodic theory'', http://www.stat.berkeley.edu/~pitman/s205s03/lecture12.pdf
* J. M. Steele, "Kingman's subadditive ergodic theorem", ''Annales de l'Institut Henri Poincaré, Probabilités et Statistiques'' 25 (1989), 93–98.