The system size expansion, also known as van Kampen's expansion or the Ω-expansion, is a technique pioneered by Nico van Kampen [van Kampen, N. G. (2007), "Stochastic Processes in Physics and Chemistry", North-Holland Personal Library] and used in the analysis of stochastic processes. Specifically, it allows one to find an approximation to the solution of a master equation with nonlinear transition rates. The leading order term of the expansion is given by the linear noise approximation, in which the master equation is approximated by a Fokker–Planck equation with linear coefficients determined by the transition rates and stoichiometry of the system.

Less formally, it is normally straightforward to write down a mathematical description of a system where processes happen randomly (for example, radioactive atoms randomly decay in a physical system, or genes are expressed stochastically in a cell). However, these mathematical descriptions are often too difficult to solve when one wants the system's statistics (for example, the mean and variance of the number of atoms or proteins as a function of time). The system size expansion allows one to obtain an approximate statistical description that can be solved much more easily than the master equation.

Preliminaries

Systems that admit a treatment with the system size expansion may be described by a probability distribution $P(X, t)$, giving the probability of observing the system in state $X$ at time $t$. $X$ may be, for example, a vector with elements corresponding to the number of molecules of different chemical species in a system. In a system of size $\Omega$ (intuitively interpreted as the volume), we will adopt the following nomenclature: $\mathbf{X}$ is a vector of macroscopic copy numbers, $\mathbf{x} = \mathbf{X}/\Omega$ is a vector of concentrations, and $\boldsymbol{\phi}$ is a vector of deterministic concentrations, as they would appear according to the rate equation in an infinite system. $\mathbf{x}$ and $\mathbf{X}$ are thus quantities subject to stochastic effects.

A master equation describes the time evolution of this probability. Henceforth, a system of chemical reactions [Elf, J. and Ehrenberg, M. (2003), "Fast Evaluation of Fluctuations in Biochemical Networks With the Linear Noise Approximation", ''Genome Research'', 13:2475–2484] will be discussed to provide a concrete example, although the nomenclature of "species" and "reactions" is generalisable. A system involving $N$ species and $R$ reactions can be described with the master equation:

$$\frac{\partial P(\mathbf{X}, t)}{\partial t} = \Omega \sum_{j=1}^R \left( \prod_{i=1}^{N} \mathbb{E}^{-S_{ij}} - 1 \right) f_j(\mathbf{x}, \Omega) P(\mathbf{X}, t).$$

Here, $\Omega$ is the system size, $\mathbb{E}$ is a step operator (described below), $S_{ij}$ is the stoichiometric matrix for the system (in which element $S_{ij}$ gives the stoichiometric coefficient for species $i$ in reaction $j$), and $f_j$ is the rate of reaction $j$ given a state $\mathbf{x}$ and system size $\Omega$.

$\mathbb{E}^{-S_{ij}}$ is a step operator, removing $S_{ij}$ from the $i$th element of its argument. For example, $\mathbb{E}^{-S_{2j}} f(x_1, x_2, x_3) = f(x_1, x_2 - S_{2j}, x_3)$. This formalism will be useful later.

The above equation can be interpreted as follows. The initial sum on the right-hand side is over all reactions. For each reaction $j$, the brackets immediately following the sum give two terms. The term with the simple coefficient −1 gives the probability flux away from a given state $\mathbf{X}$ due to reaction $j$ changing the state. The term preceded by the product of step operators gives the probability flux due to reaction $j$ changing a different state $\mathbf{X}'$ into state $\mathbf{X}$. The product of step operators constructs this state $\mathbf{X}'$.
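The action of the step operators described above can be illustrated with a minimal Python sketch. The helper names `step_operator` and `step_product`, and the test function `f`, are hypothetical and chosen only for illustration; states are represented as tuples of copy numbers.

```python
from typing import Callable, Sequence, Tuple

State = Tuple[int, ...]


def step_operator(f: Callable[[State], float], i: int, s: int) -> Callable[[State], float]:
    """Return E_i^{-s} f: the function that evaluates f with element i reduced by s."""
    def shifted(state: State) -> float:
        moved = list(state)
        moved[i] -= s
        return f(tuple(moved))
    return shifted


def step_product(f: Callable[[State], float], column: Sequence[int]) -> Callable[[State], float]:
    """Apply prod_i E_i^{-S_ij} for one reaction j, given column j of S."""
    g = f
    for i, s in enumerate(column):
        g = step_operator(g, i, s)
    return g


# Hypothetical test function that makes the shifted state easy to read off.
def f(state: State) -> float:
    return 100 * state[0] + state[1]


# For a reaction with stoichiometry column (-1, 1), the product of step
# operators evaluates f at the precursor state X' = X - S_j = (n1 + 1, n2 - 1).
shifted_f = step_product(f, (-1, 1))
value = shifted_f((5, 3))  # f evaluated at (6, 2)
```

Representing the operator as a function that returns a new function mirrors the way $\mathbb{E}$ acts on everything to its right in the master equation.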

Example

For example, consider the (linear) chemical system involving two chemical species $X_1$ and $X_2$ and the reaction $X_1 \rightarrow X_2$. In this system, $N = 2$ (species) and $R = 1$ (reactions). A state of the system is a vector $\mathbf{X} = \{n_1, n_2\}$, where $n_1, n_2$ are the numbers of molecules of $X_1$ and $X_2$ respectively. Let $f_1(\mathbf{x}, \Omega) = \frac{n_1}{\Omega} = x_1$, so that the rate of reaction 1 (the only reaction) depends on the concentration of $X_1$. The stoichiometry matrix is $(-1, 1)^T$. Then the master equation reads:

\begin{align}
\frac{\partial P(\mathbf{X}, t)}{\partial t} & = \Omega \left( \mathbb{E}^{-S_{11}} \mathbb{E}^{-S_{21}} - 1 \right) f_1 \left( \frac{\mathbf{X}}{\Omega} \right) P(\mathbf{X}, t) \\
& = \Omega \left( f_1 \left( \frac{\mathbf{X} + \Delta\mathbf{X}}{\Omega} \right) P \left( \mathbf{X} + \Delta\mathbf{X}, t \right) - f_1 \left( \frac{\mathbf{X}}{\Omega} \right) P \left( \mathbf{X}, t \right) \right),
\end{align}

where $\Delta\mathbf{X} = \{1, -1\}$ is the shift caused by the action of the product of step operators, required to change state $\mathbf{X}$ to a precursor state $\mathbf{X}'$.
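Because this example is linear and conserves $n_1 + n_2$, its master equation can be solved numerically on a small state space. The following is a minimal sketch under stated assumptions: the total molecule number `n_total = 20`, the unit rate constant implied by $f_1 = x_1$, and the fixed-step Runge-Kutta integrator are illustrative choices, and the function names are hypothetical.

```python
import numpy as np


def master_equation_generator(n_total: int) -> np.ndarray:
    """Generator matrix M with dP/dt = M @ P for the reaction X1 -> X2.

    P is indexed by n1 = 0..n_total (n2 = n_total - n1 by conservation).
    The propensity is Omega * f_1 = Omega * (n1 / Omega) = n1.
    """
    M = np.zeros((n_total + 1, n_total + 1))
    for n1 in range(n_total + 1):
        M[n1, n1] -= n1                  # flux out of state n1
        if n1 + 1 <= n_total:
            M[n1, n1 + 1] += n1 + 1      # flux in from precursor state n1 + 1
    return M


def integrate(M: np.ndarray, p0: np.ndarray, t_end: float, steps: int = 2000) -> np.ndarray:
    """Fixed-step fourth-order Runge-Kutta integration of dP/dt = M @ P."""
    p = p0.copy()
    dt = t_end / steps
    for _ in range(steps):
        k1 = M @ p
        k2 = M @ (p + 0.5 * dt * k1)
        k3 = M @ (p + 0.5 * dt * k2)
        k4 = M @ (p + dt * k3)
        p = p + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return p


n_total = 20                             # illustrative number of molecules
p0 = np.zeros(n_total + 1)
p0[n_total] = 1.0                        # all molecules start as X1
p = integrate(master_equation_generator(n_total), p0, t_end=1.0)

n1_values = np.arange(n_total + 1)
mean_n1 = float(n1_values @ p)
var_n1 = float(n1_values**2 @ p) - mean_n1**2
```

For this first-order reaction each molecule survives independently with probability $e^{-t}$, so the mean and variance of $n_1$ should match those of a binomial distribution, which gives a simple check on the numerical solution.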

Linear noise approximation

If the master equation possesses nonlinear transition rates, it may be impossible to solve it analytically. The system size expansion utilises the ansatz that the variance of the steady-state probability distribution of constituent numbers in a population scales like the system size. This ansatz is used to expand the master equation in terms of a small parameter given by the inverse system size.

Specifically, let us write $X_i$, the copy number of component $i$, as a sum of its "deterministic" value (a scaled-up concentration) and a random variable $\xi_i$, scaled by $\Omega^{1/2}$:

$$X_i = \Omega \phi_i + \Omega^{1/2} \xi_i.$$

The probability distribution of $\mathbf{X}$ can then be rewritten in terms of the vector of random variables $\boldsymbol{\xi}$:

$$P(\mathbf{X}, t) = P(\Omega \boldsymbol{\phi} + \Omega^{1/2} \boldsymbol{\xi}, t) = \Pi(\boldsymbol{\xi}, t).$$

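Consider how to write the reaction rates $f$ and the step operator $\mathbb{E}$ in terms of this new random variable. Taylor expansion of the transition rates gives:

$$f_j(\mathbf{x}) = f_j(\boldsymbol{\phi} + \Omega^{-1/2} \boldsymbol{\xi}) = f_j(\boldsymbol{\phi}) + \Omega^{-1/2} \sum_{i=1}^N \frac{\partial f_j(\boldsymbol{\phi})}{\partial \phi_i} \xi_i + O(\Omega^{-1}).$$

The step operator has the effect $\mathbb{E} f(n) \rightarrow f(n+1)$ and hence $\mathbb{E} f(\xi) \rightarrow f(\xi + \Omega^{-1/2})$, so that to second order in the shift:

$$\prod_{i=1}^{N} \mathbb{E}^{-S_{ij}} \simeq 1 - \Omega^{-1/2} \sum_i S_{ij} \frac{\partial}{\partial \xi_i} + \frac{\Omega^{-1}}{2} \sum_i \sum_k S_{ij} S_{kj} \frac{\partial^2}{\partial \xi_i \, \partial \xi_k} + O(\Omega^{-3/2}).$$

We are now in a position to recast the master equation. The time derivative of $P$ is taken at fixed $\mathbf{X}$, so $\boldsymbol{\xi}$ changes with $t$ through $\boldsymbol{\phi}(t)$; this produces the second term on the left-hand side:

\begin{align}
& \quad \frac{\partial \Pi(\boldsymbol{\xi}, t)}{\partial t} - \Omega^{1/2} \sum_{i=1}^N \frac{\partial \phi_i}{\partial t} \frac{\partial \Pi(\boldsymbol{\xi}, t)}{\partial \xi_i} \\
& = \Omega \sum_{j=1}^R \left( -\Omega^{-1/2} \sum_i S_{ij} \frac{\partial}{\partial \xi_i} + \frac{\Omega^{-1}}{2} \sum_i \sum_k S_{ij} S_{kj} \frac{\partial^2}{\partial \xi_i \, \partial \xi_k} + O(\Omega^{-3/2}) \right) \\
& \qquad \times \left( f_j(\boldsymbol{\phi}) + \Omega^{-1/2} \sum_i \frac{\partial f_j(\boldsymbol{\phi})}{\partial \phi_i} \xi_i + O(\Omega^{-1}) \right) \Pi(\boldsymbol{\xi}, t).
\end{align}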

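This lengthy expression becomes clearer when we gather terms in different powers of $\Omega$. First, terms of order $\Omega^{1/2}$ give

$$\sum_{i=1}^N \frac{\partial \phi_i}{\partial t} \frac{\partial \Pi(\boldsymbol{\xi}, t)}{\partial \xi_i} = \sum_{i=1}^N \sum_{j=1}^R S_{ij} f_j(\boldsymbol{\phi}) \frac{\partial \Pi(\boldsymbol{\xi}, t)}{\partial \xi_i}.$$

These terms cancel, due to the macroscopic reaction equation

$$\frac{\partial \phi_i}{\partial t} = \sum_{j=1}^R S_{ij} f_j(\boldsymbol{\phi}).$$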
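The macroscopic reaction equation above is an ordinary differential equation for the deterministic concentrations and can be integrated numerically. The following is a minimal sketch applied to the reaction $X_1 \rightarrow X_2$ with $f_1 = \phi_1$; the function names, the initial concentrations and the fixed-step Runge-Kutta scheme are assumptions made for illustration.

```python
import numpy as np


def rate_equation_rhs(S, f, phi):
    """Right-hand side d(phi)/dt = S @ f(phi) of the macroscopic rate equation."""
    return S @ f(phi)


def integrate_rate_equation(S, f, phi0, t_end, steps=1000):
    """Fixed-step fourth-order Runge-Kutta integration; returns phi at t_end."""
    phi = np.asarray(phi0, dtype=float)
    dt = t_end / steps
    for _ in range(steps):
        k1 = rate_equation_rhs(S, f, phi)
        k2 = rate_equation_rhs(S, f, phi + 0.5 * dt * k1)
        k3 = rate_equation_rhs(S, f, phi + 0.5 * dt * k2)
        k4 = rate_equation_rhs(S, f, phi + dt * k3)
        phi = phi + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return phi


# Stoichiometric matrix with N = 2 species (rows) and R = 1 reaction (columns).
S = np.array([[-1.0], [1.0]])


def f(phi):
    """Rate vector: a single reaction with f_1(phi) = phi_1."""
    return np.array([phi[0]])


# Illustrative initial condition: unit concentration of X1, none of X2.
phi_end = integrate_rate_equation(S, f, phi0=[1.0, 0.0], t_end=1.0)
```

For this linear example the result can be compared with the exact solution $\phi_1(t) = e^{-t}$, $\phi_2(t) = 1 - e^{-t}$.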

The terms of order $\Omega^0$ are more interesting:

$$\frac{\partial \Pi(\boldsymbol{\xi}, t)}{\partial t} = \sum_j \left( -\sum_{ik} S_{ij} \frac{\partial f_j}{\partial \phi_k} \frac{\partial (\xi_k \Pi(\boldsymbol{\xi}, t))}{\partial \xi_i} + \frac{1}{2} f_j \sum_{ik} S_{ij} S_{kj} \frac{\partial^2 \Pi(\boldsymbol{\xi}, t)}{\partial \xi_i \, \partial \xi_k} \right),$$

which can be written as

$$\frac{\partial \Pi(\boldsymbol{\xi}, t)}{\partial t} = -\sum_{ik} A_{ik} \frac{\partial (\xi_k \Pi)}{\partial \xi_i} + \frac{1}{2} \sum_{ik} [\mathbf{BB}^T]_{ik} \frac{\partial^2 \Pi}{\partial \xi_i \, \partial \xi_k},$$

where

$$A_{ik} = \sum_{j=1}^R S_{ij} \frac{\partial f_j}{\partial \phi_k} = \frac{\partial (\mathbf{S}_i \cdot \mathbf{f})}{\partial \phi_k},$$

and

$$[\mathbf{BB}^T]_{ik} = \sum_{j=1}^R S_{ij} S_{kj} f_j(\boldsymbol{\phi}) = [\mathbf{S} \, \mbox{diag}(f(\boldsymbol{\phi})) \, \mathbf{S}^T]_{ik}.$$

The time evolution of $\Pi$ is then governed by the linear Fokker–Planck equation with coefficient matrices $\mathbf{A}$ and $\mathbf{BB}^T$ (in the large-$\Omega$ limit, terms of $O(\Omega^{-1/2})$ may be neglected, which is termed the linear noise approximation). With knowledge of the reaction rates $\mathbf{f}$ and stoichiometry $S$, the moments of $\Pi$ can then be calculated.

The approximation implies that fluctuations around the mean are Gaussian distributed. Non-Gaussian features of the distributions can be computed by taking into account higher order terms in the expansion.
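As a rough illustration of how the moments can be obtained from $\mathbf{A}$ and $\mathbf{BB}^T$, the sketch below computes the stationary covariance $\mathbf{C}$ of $\boldsymbol{\xi}$. It relies on a standard result for linear Fokker–Planck equations that is not derived in the text above, namely that the stationary covariance satisfies the Lyapunov equation $\mathbf{A}\mathbf{C} + \mathbf{C}\mathbf{A}^T + \mathbf{BB}^T = 0$. The birth-death system (production at rate `k`, degradation at rate `gamma * phi`), the parameter values, the finite-difference Jacobian and all function names are assumptions chosen for illustration.

```python
import numpy as np


def jacobian(f, phi, h=1e-6):
    """Central-difference estimate of d f_j / d phi_k, with shape (R, N)."""
    phi = np.asarray(phi, dtype=float)
    cols = []
    for k in range(phi.size):
        e = np.zeros_like(phi)
        e[k] = h
        cols.append((f(phi + e) - f(phi - e)) / (2 * h))
    return np.stack(cols, axis=1)


def lna_matrices(S, f, phi):
    """Return A = S @ (df/dphi) and BB^T = S @ diag(f(phi)) @ S.T."""
    A = S @ jacobian(f, phi)
    BBt = S @ np.diag(f(phi)) @ S.T
    return A, BBt


def stationary_covariance(A, BBt):
    """Solve A C + C A^T + BB^T = 0 for C using Kronecker products."""
    n = A.shape[0]
    identity = np.eye(n)
    lhs = np.kron(A, identity) + np.kron(identity, A)
    return np.linalg.solve(lhs, -BBt.reshape(-1)).reshape(n, n)


# Hypothetical birth-death system: 0 -> X at rate k, X -> 0 at rate gamma * phi.
k, gamma = 2.0, 0.5
S = np.array([[1.0, -1.0]])              # N = 1 species, R = 2 reactions


def f(phi):
    return np.array([k, gamma * phi[0]])


phi_ss = np.array([k / gamma])           # steady state of the rate equation
A, BBt = lna_matrices(S, f, phi_ss)
C = stationary_covariance(A, BBt)
```

For this first-order system the variance of $\xi$ equals the steady-state concentration, consistent with Poissonian copy-number fluctuations. The Kronecker-product solve is adequate for small systems; a dedicated Lyapunov solver would be preferable for large ones.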

Software

The linear noise approximation has become a popular technique for estimating the size of intrinsic noise in terms of coefficients of variation and Fano factors for molecular species in intracellular pathways. The second moments obtained from the linear noise approximation (on which the noise measures are based) are exact only if the pathway is composed of first-order reactions. However, bimolecular reactions such as enzyme-substrate, protein-protein and protein-DNA interactions are ubiquitous elements of all known pathways; for such cases, the linear noise approximation gives estimates that are accurate in the limit of large reaction volumes. Since this limit is taken at constant concentrations, the linear noise approximation gives accurate results in the limit of large molecule numbers and becomes less reliable for pathways characterized by many species with low copy numbers of molecules.

A number of studies have elucidated cases of the insufficiency of the linear noise approximation in biological contexts by comparing its predictions with those of stochastic simulations [Hayot, F. and Jayaprakash, C. (2004), "The linear noise approximation for molecular fluctuations within cells", ''Physical Biology'', 1:205; Ferm, L., Lötstedt, P. and Hellander, A. (2008), "A Hierarchy of Approximations of the Master Equation Scaled by a Size Parameter", ''Journal of Scientific Computing'', 34:127]. This has led to the investigation of higher order terms of the system size expansion that go beyond the linear approximation. These terms have been used to obtain more accurate moment estimates for the mean concentrations and for the variances of the concentration fluctuations in intracellular pathways. In particular, the leading order corrections to the linear noise approximation yield corrections of the conventional rate equations [Grima, R. (2010), "An effective rate equation approach to reaction kinetics in small volumes: Theory and application to biochemical reactions in nonequilibrium steady-state conditions", ''The Journal of Chemical Physics'', 132:035101]. Terms of higher order have also been used to obtain corrections to the variance and covariance estimates of the linear noise approximation [Grima, R., Thomas, P. and Straube, A. V. (2011), "How accurate are the nonlinear chemical Fokker-Planck and chemical Langevin equations?", ''The Journal of Chemical Physics'', 135:084103; Grima, R. (2012), "A study of the accuracy of moment-closure approximations for stochastic chemical kinetics", ''The Journal of Chemical Physics'', 136:154105]. The linear noise approximation and corrections to it can be computed using the open source software intrinsic Noise Analyzer. The corrections have been shown to be particularly considerable for allosteric and non-allosteric enzyme-mediated reactions in intracellular compartments.
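The noise measures mentioned above follow directly from the linear noise approximation once the deterministic concentration and the variance of $\xi$ are known. The sketch below assumes $X = \Omega \phi + \Omega^{1/2} \xi$ with $\xi$ of zero mean, so that the mean copy number is $\Omega \phi$ and its variance is $\Omega \, \mathrm{var}(\xi)$; the function name and the numerical values are hypothetical.

```python
import math


def noise_measures(phi: float, var_xi: float, omega: float):
    """Coefficient of variation and Fano factor of a copy number X.

    Uses X = omega * phi + sqrt(omega) * xi with zero-mean xi, so that
    mean(X) = omega * phi and var(X) = omega * var_xi within the
    linear noise approximation.
    """
    mean_x = omega * phi
    var_x = omega * var_xi
    cv = math.sqrt(var_x) / mean_x       # standard deviation / mean
    fano = var_x / mean_x                # variance / mean
    return cv, fano


# Hypothetical values: phi = var_xi = 4 (Poisson-like) at two system sizes.
cv_small, fano_small = noise_measures(4.0, 4.0, omega=1.0)
cv_large, fano_large = noise_measures(4.0, 4.0, omega=100.0)
```

At fixed concentration the coefficient of variation falls as $\Omega^{-1/2}$ while the Fano factor is independent of $\Omega$, which is one way to see why the approximation is expected to work best at large molecule numbers.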
