A stochastic simulation is a simulation of a system that has variables that can change stochastically (randomly) with individual probabilities (Dlouhý, M.; Fábry, J.; Kuncová, M., Simulace pro ekonomy, Praha: VŠE, 2005). Realizations of these random variables are generated and inserted into a model of the system. Outputs of the model are recorded, and then the process is repeated with a new set of random values. These steps are repeated until a sufficient amount of data is gathered. In the end, the distribution of the outputs shows the most probable estimates as well as a frame of expectations regarding what ranges of values the variables are more or less likely to fall in.

Often random variables inserted into the model are created on a computer with a random number generator (RNG). The U(0,1) uniform distribution outputs of the random number generator are then transformed into random variables with probability distributions that are used in the system model.
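The generate, run, record, and repeat cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model `run_model`, the demand range U(50,150), and the unit margin are hypothetical and not taken from the cited source.

```python
import random
import statistics


def run_model(demand: float, unit_margin: float = 2.0) -> float:
    """Hypothetical system model: profit for one realization of random demand."""
    return unit_margin * demand


def simulate(n_runs: int, seed: int = 0) -> list:
    """Repeat: draw a random input, run the model, record the output."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_runs):
        u = rng.random()            # U(0,1) output of the RNG
        demand = 50.0 + 100.0 * u   # transformed into a U(50,150) model input
        outputs.append(run_model(demand))
    return outputs


outputs = simulate(10_000)
cuts = statistics.quantiles(outputs, n=20)  # 19 cut points: 5%, 10%, ..., 95%
print("mean:", statistics.mean(outputs))
print("5% and 95% quantiles:", cuts[0], cuts[-1])
```

The recorded outputs form an empirical distribution; its mean serves as the point estimate and its quantiles give the range in which values are likely to fall.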

Etymology

''Stochastic'' originally meant "pertaining to conjecture"; from Greek stokhastikos "able to guess, conjecturing": from stokhazesthai "guess"; from stokhos "a guess, aim, target, mark". The sense of "randomly determined" was first recorded in 1934, from German Stochastik.

Discrete-event simulation

In order to determine the next event in a stochastic simulation, the rates of all possible changes to the state of the model are computed, and then ordered in an array. Next, the cumulative sum of the array is taken, and the final cell contains the number R, where R is the total event rate. This cumulative array is now a discrete cumulative distribution, and can be used to choose the next event by picking a random number z~U(0,R) and choosing the first event such that z is less than the cumulative rate associated with that event.
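The selection step can be sketched as follows. The function name `choose_event` and the three example rates are hypothetical; the sketch assumes at least one rate is positive.

```python
import itertools
import random


def choose_event(rates, rng):
    """Return the index of the next event, chosen with probability rate / R."""
    cumulative = list(itertools.accumulate(rates))  # last cell holds R
    total_rate = cumulative[-1]
    z = rng.random() * total_rate                   # z ~ U(0, R)
    for index, upper in enumerate(cumulative):      # linear search
        if z < upper:
            return index
    return len(rates) - 1  # guard against floating-point round-off


rng = random.Random(1)
rates = [1.0, 3.0, 6.0]  # hypothetical event rates
counts = [0, 0, 0]
for _ in range(10_000):
    counts[choose_event(rates, rng)] += 1
print(counts)  # roughly in the ratio 1 : 3 : 6
```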

Probability distributions

A probability distribution is used to describe the potential outcomes of a random variable. A discrete distribution limits the outcomes to those where the variable can only take on discrete values (Rachev, Svetlozar T.; Stoyanov, Stoyan V.; Fabozzi, Frank J., "Chapter 1 Concepts of Probability" in Advanced Stochastic Models, Risk Assessment, and Portfolio Optimization: The Ideal Risk, Uncertainty, and Performance Measures, Hoboken, NJ, USA: Wiley, 2008).

Bernoulli distribution

A random variable X is Bernoulli-distributed with parameter p if it has two possible outcomes, usually encoded 1 (success or default) or 0 (failure or survival), where the probabilities of success and failure are

P(X = 1) = p

and

P(X = 0) = 1 - p

where 0 \leq p \leq 1. To produce a random variable X with a Bernoulli distribution from a U(0,1) uniform distribution made by a random number generator, we define

X = \begin{cases} 1, & \text{if } 0 \leq U < p \\ 0, & \text{if } 1 \geq U \geq p \end{cases}

such that

P(X = 1) = P(0 \leq U < p) = p

and

P(X = 0) = P(1 \geq U \geq p) = 1 - p
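A minimal sketch of this construction in Python, using the standard library generator as the U(0,1) source. The value p = 0.3 is an arbitrary choice made here for illustration.

```python
import random


def bernoulli(p: float, rng: random.Random) -> int:
    """Return 1 if U < p and 0 otherwise, with U drawn from U(0,1)."""
    u = rng.random()
    return 1 if u < p else 0


rng = random.Random(42)
p = 0.3  # hypothetical success probability
samples = [bernoulli(p, rng) for _ in range(100_000)]
print(sum(samples) / len(samples))  # relative frequency of X = 1, close to p
```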

Example: Toss of coin

Define

X = \begin{cases}
1 & \text{if heads comes up} \\
0 & \text{if tails comes up}
\end{cases}

For a fair coin, both realizations are equally likely. We can generate realizations of this random variable X from a U(0,1) uniform distribution provided by a random number generator (RNG) by having X = 1 if the RNG outputs a value between 0 and 0.5 and X = 0 if the RNG outputs a value between 0.5 and 1.

\begin{align}
 P (X = 1) &= P(0 \leq U < 1/2) = 1/2 \\
 P (X = 0) &= P(1 \geq U \geq 1/2) = 1/2
\end{align}

Of course, the two outcomes may not be equally likely (e.g. success of medical treatment) (Bernoulli Distribution, The University of Chicago, Department of Statistics, [online] available at http://galton.uchicago.edu/~eichler/stat22000/Handouts/l12.pdf).

Binomial distribution

A binomially distributed random variable Y with parameters ''n'' and ''p'' is obtained as the sum of ''n'' independent and identically Bernoulli-distributed random variables ''X''₁, ''X''₂, ..., ''X''ₙ.

Example: A coin is tossed three times. Find the probability of getting exactly two heads. This problem can be solved by looking at the sample space. There are three ways to get two heads (HHT, HTH, THH) out of eight equally likely outcomes. The answer is 3/8 (= 0.375).
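The coin example can be checked by enumerating the sample space, as in the sketch below. The encoding of heads as 1 and tails as 0 is a choice made here, and the comparison with the standard binomial probability mass function is added for illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Sample space of three tosses; 1 = heads, 0 = tails (an encoding chosen here).
outcomes = list(product([0, 1], repeat=3))
favourable = [o for o in outcomes if sum(o) == 2]
p_enumerated = Fraction(len(favourable), len(outcomes))

# Same value from the binomial probability mass function with n = 3, p = 1/2.
n, k, p = 3, 2, Fraction(1, 2)
p_formula = comb(n, k) * p**k * (1 - p) ** (n - k)

print(p_enumerated, p_formula)  # 3/8 3/8
```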

Poisson distribution

A Poisson process is a process where events occur randomly in an interval of time or space. The probability distribution for Poisson processes with constant rate ''λ'' per time interval is given by the following equation.

P(k \text{ events in interval}) = \frac{\lambda^k e^{-\lambda}}{k!}

Defining N(t) as the number of events that occur in the time interval t,

P(N(t) = k) = \frac{(\lambda t)^k}{k!} e^{-\lambda t}

It can be shown that inter-arrival times for events are exponentially distributed with a cumulative distribution function (CDF) of F(t) = 1 - e^{-\lambda t}. The inverse of the exponential CDF is given by

t = -\frac{1}{\lambda}\ln(u)

where u is a U(0,1) uniformly distributed random variable (since 1 - u is then also U(0,1) distributed, \ln(u) can be used in place of \ln(1 - u)). Simulating a Poisson process with a constant rate \lambda for the number of events N that occur in the interval [t_\text{start}, t_\text{end}] can be carried out with the following algorithm.

# Begin with N = 0 and t = t_\text{start}
# Generate random variable u from U(0,1) uniform distribution
# Update the time with t = t - \ln(u) / \lambda
# If t > t_\text{end}, then stop. Else continue to step 5.
# N = N + 1
# Continue to step 2
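The numbered steps translate almost line by line into Python. In this sketch the function name and the parameter values (rate 2 on the interval from 0 to 5) are hypothetical; drawing u as one minus the generator output is a small deviation that keeps the logarithm defined.

```python
import math
import random


def simulate_poisson_count(rate, t_start, t_end, rng):
    """Count events of a constant-rate Poisson process in [t_start, t_end]."""
    n = 0
    t = t_start
    while True:
        u = 1.0 - rng.random()    # in (0, 1], so log(u) is always defined
        t -= math.log(u) / rate   # add an exponential inter-arrival time
        if t > t_end:
            return n
        n += 1


rng = random.Random(7)
counts = [simulate_poisson_count(2.0, 0.0, 5.0, rng) for _ in range(20_000)]
print(sum(counts) / len(counts))  # sample mean, close to rate * interval length = 10
```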

Methods

Direct and first reaction methods

Published by Dan Gillespie in 1977; the direct method is a linear search on the cumulative array. See Gillespie algorithm. Gillespie’s Stochastic Simulation Algorithm (SSA) is essentially an exact procedure for numerically simulating the time evolution of a well-stirred chemically reacting system by taking proper account of the randomness inherent in such a system (Stephen Gilmore, An Introduction to Stochastic Simulation - Stochastic Simulation Algorithms, University of Edinburgh, [online] available at http://www.doc.ic.ac.uk/~jb/conferences/pasta2006/slides/stochastic-simulation-introduction.pdf). It is rigorously based on the same microphysical premise that underlies the chemical master equation and gives a more realistic representation of a system’s evolution than the deterministic reaction rate equation (RRE) represented mathematically by ODEs. As with the chemical master equation, the SSA converges, in the limit of large numbers of reactants, to the same solution as the law of mass action.
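A minimal sketch of the direct method for a toy system is given below. The two reactions (A to B and B to A), the rate constants, and the function name are assumptions made for illustration; this is not a general-purpose implementation.

```python
import math
import random


def gillespie_direct(x_a, x_b, k_f, k_b, t_end, rng):
    """Direct-method sketch for the hypothetical reactions A -> B and B -> A."""
    t = 0.0
    trajectory = [(t, x_a, x_b)]
    while True:
        propensities = [k_f * x_a, k_b * x_b]
        total = sum(propensities)
        if total == 0.0:
            break                                    # no reaction can fire
        t += -math.log(1.0 - rng.random()) / total   # exponential waiting time
        if t > t_end:
            break
        z = rng.random() * total                     # z ~ U(0, R)
        # Linear search over the two-cell cumulative array; the x_b check
        # guards against round-off selecting a reaction that cannot fire.
        if z < propensities[0] or x_b == 0:
            x_a, x_b = x_a - 1, x_b + 1              # A -> B fires
        else:
            x_a, x_b = x_a + 1, x_b - 1              # B -> A fires
        trajectory.append((t, x_a, x_b))
    return trajectory


trajectory = gillespie_direct(100, 0, 1.0, 0.5, 5.0, random.Random(0))
print(len(trajectory), trajectory[-1])
```

Each pass draws one waiting time from the total rate and then one reaction index, so every individual transition is simulated.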

Next reaction method

Published in 2000 by Gibson and Bruck. This is an improvement over the first reaction method in which the unused reaction times are reused. To make the sampling of reactions more efficient, an indexed priority queue is used to store the reaction times. To make the recomputation of propensities more efficient, a dependency graph is used. This dependency graph tells which reaction propensities to update after a particular reaction has fired.

Optimised and sorting direct methods

Published in 2004 and 2005, respectively. These methods sort the cumulative array to reduce the average search depth of the algorithm. The former runs a presimulation to estimate the firing frequency of reactions, whereas the latter sorts the cumulative array on-the-fly.

Logarithmic direct method

Published in 2006. This is a binary search on the cumulative array, thus reducing the worst-case time complexity of reaction sampling to O(log M), where M is the number of reactions.
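The search step alone can be sketched with the standard library `bisect` module. The function name and example rates are hypothetical, and the sketch covers only the sampling of the reaction index, not how the cumulative array is kept up to date after a reaction fires.

```python
import bisect
import itertools
import random


def choose_event_logarithmic(cumulative, rng):
    """Pick an event index by binary search on a precomputed cumulative array."""
    z = rng.random() * cumulative[-1]            # z ~ U(0, R)
    index = bisect.bisect_right(cumulative, z)   # first cell with value > z
    return min(index, len(cumulative) - 1)       # guard against round-off


rng = random.Random(5)
rates = [2.0, 2.0, 4.0, 8.0]                     # hypothetical reaction rates
cumulative = list(itertools.accumulate(rates))
counts = [0] * len(rates)
for _ in range(16_000):
    counts[choose_event_logarithmic(cumulative, rng)] += 1
print(counts)  # roughly in the ratio 1 : 1 : 2 : 4
```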

Partial-propensity methods

Published in 2009, 2010, and 2011 (Ramaswamy 2009, 2010, 2011). These methods use factored-out, partial reaction propensities to reduce the computational cost to scale with the number of species in the network, rather than the (larger) number of reactions. Four variants exist:
* PDM, the partial-propensity direct method. Has a computational cost that scales linearly with the number of different species in the reaction network, independent of the coupling class of the network (Ramaswamy 2009).
* SPDM, the sorting partial-propensity direct method. Uses dynamic bubble sort to reduce the pre-factor of the computational cost in multi-scale reaction networks where the reaction rates span several orders of magnitude (Ramaswamy 2009).
* PSSA-CR, the partial-propensity SSA with composition-rejection sampling. Reduces the computational cost to constant time (i.e., independent of network size) for weakly coupled networks (Ramaswamy 2010) using composition-rejection sampling (Slepoy 2008).
* dPDM, the delay partial-propensity direct method. Extends PDM to reaction networks that incur time delays (Ramaswamy 2011) by providing a partial-propensity variant of the delay-SSA method (Bratsun 2005, Cai 2007).
The use of partial-propensity methods is limited to elementary chemical reactions, i.e., reactions with at most two different reactants. Every non-elementary chemical reaction can be equivalently decomposed into a set of elementary ones, at the expense of a linear (in the order of the reaction) increase in network size.

Approximate Methods

A general drawback of stochastic simulations is that in big systems too many events happen for all of them to be taken into account individually. The following methods can dramatically improve simulation speed by making some approximations.

τ leaping method

Since the SSA method keeps track of each transition, it would be impractical to implement for certain applications due to high time complexity. Gillespie proposed an approximation procedure, the tau-leaping method, which decreases computational time with minimal loss of accuracy. Instead of taking incremental steps in time, keeping track of ''X''(''t'') at each time step as in the SSA method, the tau-leaping method leaps from one subinterval to the next, approximating how many transitions take place during a given subinterval. It is assumed that the value of the leap, τ, is small enough that there is no significant change in the value of the transition rates along the subinterval [''t'', ''t'' + ''τ'']. This condition is known as the leap condition. The tau-leaping method thus has the advantage of simulating many transitions in one leap while not losing significant accuracy, resulting in a speedup in computational time.
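A minimal fixed-step sketch of the idea for a single decay reaction follows. The reaction (A decaying with propensity c·x), the fixed value of τ, and the helper `poisson_sample` are assumptions made for illustration; in particular, no adaptive choice of τ is attempted.

```python
import math
import random


def poisson_sample(mean, rng):
    """Draw a Poisson variate by counting unit-rate arrivals in [0, mean)."""
    count, t = 0, 0.0
    while True:
        t -= math.log(1.0 - rng.random())   # exponential inter-arrival time
        if t >= mean:
            return count
        count += 1


def tau_leap_decay(x0, c, tau, t_end, rng):
    """Tau-leaping sketch for the hypothetical decay reaction A -> (nothing)."""
    x = x0
    path = [(0.0, x)]
    n_leaps = int(round(t_end / tau))
    for i in range(1, n_leaps + 1):
        firings = poisson_sample(c * x * tau, rng)  # rate held fixed over the leap
        x = max(x - firings, 0)                     # crude guard against negatives
        path.append((i * tau, x))
    return path


path = tau_leap_decay(1000, 1.0, 0.01, 1.0, random.Random(11))
print(path[-1])  # final count, in the vicinity of 1000 * exp(-1)
```

Each leap replaces many individual SSA steps with one Poisson draw, which is where the speedup comes from; the approximation is only reasonable while the leap condition holds.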

Conditional Difference Method

This method approximates reversible processes (which include random walk/diffusion processes) by taking only the net rates of the opposing events of a reversible process into account. The main advantage of this method is that it can be implemented with a simple if-statement replacing the previous transition rates of the model with new, effective rates. The model with the replaced transition rates can thus be solved, for instance, with the conventional SSA.

Continuous simulation

While in a discrete state space it is possible to clearly distinguish between particular states (values), in a continuous space this is not possible because of continuity. The system usually changes over time, and the variables of the model then change continuously as well. Continuous simulation thereby simulates the system over time, given differential equations determining the rates of change of the state variables. Examples of continuous systems are ''the predator/prey model'' and cart-pole balancing.
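As an illustration of stepping state variables forward from their rates of change, the sketch below integrates a predator/prey model with the explicit Euler method. The specific equations (the Lotka-Volterra form), the parameter names, and all numerical values are assumptions made here; the text above does not specify them.

```python
def lotka_volterra_euler(prey, predators, alpha, beta, delta, gamma, dt, n_steps):
    """Explicit Euler steps for d(prey)/dt and d(predators)/dt."""
    states = [(0.0, prey, predators)]
    for step in range(1, n_steps + 1):
        d_prey = (alpha * prey - beta * prey * predators) * dt
        d_pred = (delta * prey * predators - gamma * predators) * dt
        prey, predators = prey + d_prey, predators + d_pred
        states.append((step * dt, prey, predators))
    return states


# Hypothetical parameters and initial populations.
states = lotka_volterra_euler(10.0, 5.0, alpha=1.1, beta=0.4, delta=0.1,
                              gamma=0.4, dt=0.001, n_steps=20_000)
print(states[-1])
```

Explicit Euler is the simplest choice and accumulates error over long runs; a higher-order integrator would normally be preferred in practice.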

Probability distributions

Normal distribution

The random variable X is said to be normally distributed with parameters \mu and \sigma, abbreviated by X \sim N(\mu, \sigma^2), if the density of the random variable is given by the formula

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad x \in \Reals.

Many things actually are normally distributed, or very close to it. For example, height and intelligence are approximately normally distributed; measurement errors also often have a normal distribution (University of Notre Dame, Normal Distribution, [online] available at http://www3.nd.edu/~rwilliam/stats1/x21.pdf).
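One common way to turn U(0,1) outputs into normally distributed values is the Box-Muller transform, sketched below. The technique is not described in the text above and is named here as an illustrative choice; the parameters 170 and 10 are hypothetical.

```python
import math
import random
import statistics


def box_muller(mu, sigma, rng):
    """Transform two U(0,1) draws into one N(mu, sigma^2) draw."""
    u1 = 1.0 - rng.random()   # in (0, 1], avoids log(0)
    u2 = rng.random()
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    return mu + sigma * z


rng = random.Random(2)
heights = [box_muller(170.0, 10.0, rng) for _ in range(50_000)]
print(statistics.mean(heights), statistics.stdev(heights))  # near 170 and 10
```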

Exponential distribution

The exponential distribution describes the time between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate. The exponential distribution is popular, for example, in queuing theory when we want to model the time we have to wait until a certain event takes place. Examples include the time until the next client enters the store, the time until a certain company defaults or the time until some machine has a defect.

Student's t-distribution

Student's t-distributions are used in finance as probabilistic models of asset returns. The density function of the t-distribution is given by the following equation:

f(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1+\frac{t^2}{\nu} \right)^{-\frac{\nu+1}{2}},

where \nu is the number of ''degrees of freedom'' and \Gamma is the gamma function. For large values of ''ν'', the t-distribution doesn't significantly differ from a standard normal distribution. Usually, for values ''ν'' > 30, the t-distribution is considered as equal to the standard normal distribution.
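The convergence towards the standard normal density can be checked numerically. The sketch below evaluates both densities at t = 0 for a few degrees of freedom; the function names are chosen here, and the log-gamma form is used only to avoid overflow for large degrees of freedom.

```python
import math


def student_t_pdf(t, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))


def standard_normal_pdf(t):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * t * t) / math.sqrt(2 * math.pi)


for nu in (1, 5, 30, 1000):
    gap = abs(student_t_pdf(0.0, nu) - standard_normal_pdf(0.0))
    print(nu, gap)  # the gap shrinks as nu grows
```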

Other distributions

Generalized extreme value distribution

Combined simulation

It is often possible to model one and the same system by use of completely different world views. Discrete event simulation of a problem as well as continuous event simulation of it (continuous simulation with the discrete events that disrupt the continuous flow) may lead eventually to the same answers. Sometimes, however, the techniques can answer different questions about a system. If we need to answer all the questions, or if we don't know what purposes the model is going to be used for, it is convenient to apply a combined continuous/discrete methodology (Francois E. Cellier, Combined Continuous/Discrete Simulation Applications, Techniques, and Tools). Similar techniques can change from a discrete, stochastic description to a deterministic, continuum description in a time- and space-dependent manner. The use of this technique enables the capturing of noise due to small copy numbers, while being much faster to simulate than the conventional Gillespie algorithm. Furthermore, the use of the deterministic continuum description enables the simulation of arbitrarily large systems.

Monte Carlo simulation

Monte Carlo is an estimation procedure. The main idea is that if it is necessary to know the average value of some random variable and its distribution cannot be stated, and if it is possible to take samples from the distribution, we can estimate it by taking the samples, independently, and averaging them. If there are sufficient samples, then the law of large numbers says the average must be close to the true value. The central limit theorem says that the average has an approximately Gaussian distribution around the true value (Cosma Rohilla Shalizi, Monte Carlo, and Other Kinds of Stochastic Simulation, [online] available at http://bactra.org/notebooks/monte-carlo.html).

As a simple example, suppose we need to measure the area of a shape with a complicated, irregular outline. The Monte Carlo approach is to draw a square around the shape and measure the square. Then we throw darts into the square, as uniformly as possible. The fraction of darts falling on the shape gives the ratio of the area of the shape to the area of the square. In fact, it is possible to cast almost any integral problem, or any averaging problem, into this form. It is necessary to have a good way to tell if you're inside the outline, and a good way to figure out how many darts to throw. Last but not least, we need to throw the darts uniformly, i.e., using a good random number generator.
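The dart-throwing estimate can be sketched as follows. The unit disc is used as a stand-in for the irregular shape because its true area (π) is known; the function names and the number of darts are hypothetical.

```python
import random


def estimate_area(inside, x_range, y_range, n_darts, rng):
    """Estimate a shape's area from the fraction of uniform darts that hit it."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    hits = 0
    for _ in range(n_darts):
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if inside(x, y):
            hits += 1
    box_area = (x_max - x_min) * (y_max - y_min)
    return box_area * hits / n_darts


def in_unit_disc(x, y):
    """Membership test: the 'good way to tell if you're inside the outline'."""
    return x * x + y * y <= 1.0


rng = random.Random(3)
area = estimate_area(in_unit_disc, (-1.0, 1.0), (-1.0, 1.0), 200_000, rng)
print(area)  # close to pi for the unit disc
```

The statistical error of the estimate shrinks in proportion to one over the square root of the number of darts, which is one way to decide how many darts to throw.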

Application

There are wide possibilities for use of the Monte Carlo method:
* Statistical experiments using generation of random variables (e.g. dice)
* Sampling methods
* Mathematics (e.g. numerical integration, multiple integrals)
* Reliability engineering
* Project management (Six Sigma)
* Experimental particle physics
* Simulations
* Risk measurement/risk management (e.g. portfolio value estimation)
* Economics (e.g. finding the best fitting demand curve)
* Process simulation
* Operations research

Random number generators

For simulation experiments (including Monte Carlo) it is necessary to generate random numbers (as values of variables). The problem is that the computer is a highly deterministic machine; basically, behind each process there is always an algorithm, a deterministic computation changing inputs to outputs. It is therefore not easy to generate uniformly spread random numbers over a defined interval or set. A random number generator is a device capable of producing a sequence of numbers which cannot be "easily" identified with deterministic properties. This sequence is then called a ''sequence of stochastic numbers'' (Donald E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Chapter 3: Random Numbers, Addison-Wesley, Boston, 1998).

The algorithms typically rely on pseudorandom numbers, computer generated numbers mimicking true random numbers, to generate a realization, one possible outcome of a process (Andreas Hellander, Stochastic Simulation and Monte Carlo Methods, [online] available at http://www.it.uu.se/edu/course/homepage/bervet2/MCkompendium/mc.pdf). Methods for obtaining random numbers have existed for a long time and are used in many different fields (such as gaming). However, these numbers suffer from a certain bias. Currently the best methods expected to produce truly random sequences are natural methods that take advantage of the random nature of quantum phenomena.
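To make the deterministic nature of pseudorandom numbers concrete, the sketch below implements a linear congruential generator, one of the simplest such algorithms. It is an illustration only: the class name is hypothetical, the default constants are one widely quoted parameter set, and generators of this kind are not suitable for serious simulation work.

```python
class LinearCongruentialGenerator:
    """Minimal pseudorandom generator: x_{k+1} = (a * x_k + c) mod m."""

    def __init__(self, seed, a=1664525, c=1013904223, m=2**32):
        self.state = seed % m
        self.a, self.c, self.m = a, c, m

    def next_uniform(self):
        """Advance the state and return a value in [0, 1)."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m


first = LinearCongruentialGenerator(seed=12345)
second = LinearCongruentialGenerator(seed=12345)
run_a = [first.next_uniform() for _ in range(10_000)]
run_b = [second.next_uniform() for _ in range(10_000)]
print(run_a == run_b)              # True: same seed, same "random" sequence
print(sum(run_a) / len(run_a))     # sample mean of the U(0,1)-like values
```

The same seed always reproduces the same sequence, which is useful for repeatable experiments but also shows why such numbers are only pseudorandom.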

References

* (Slepoy 2008):
* (Bratsun 2005):
* (Cai 2007):
* (Ramaswamy 2009):
* (Ramaswamy 2010):
* (Ramaswamy 2011): R. Ramaswamy, I. F. Sbalzarini (2011). "A partial-propensity formulation of the stochastic simulation algorithm for chemical reaction networks with delays". J. Chem. Phys. 134 (1): 014106. doi:10.1063/1.3521496. PMID 21218996. Bibcode:2011JChPh.134a4106R. S2CID 4949530. http://www.zora.uzh.ch/id/eprint/79206/8/1.3521496.pdf

External links

;Software
* cayenne - Fast, easy to use Python package for stochastic simulations. Implementations of direct, tau-leaping, and tau-adaptive algorithms.
* StochSS - Stochastic Simulation Service - A Cloud Computing Framework for Modeling and Simulation of Stochastic Biochemical Systems.
* ResAssure - Stochastic reservoir simulation software - solves fully implicit, dynamic three-phase fluid flow equations for every geological realisation.
* Cain - Stochastic simulation of chemical kinetics. Direct, next reaction, tau-leaping, hybrid, etc.
* pSSAlib - C++ implementations of all partial-propensity methods.
* StochPy - Stochastic modelling in Python.
* STEPS - STochastic Engine for Pathway Simulation using swig to create Python interface to C/C++ code.