
In
mathematics
Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
, more specifically in the theory of
Monte Carlo method
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be ...
s, variance reduction is a procedure used to increase the
precision of the
estimates obtained for a given simulation or computational effort.
Every output random variable from the simulation is associated with a
variance
In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion ...
which limits the precision of the simulation results. In order to make a simulation statistically efficient, i.e., to obtain a greater precision and smaller
confidence intervals for the output random variable of interest, variance reduction techniques can be used.
The main variance reduction methods are
* common random numbers
*
antithetic variates
*
control variates
*
importance sampling
*
stratified sampling
*
moment matching
*
conditional Monte Carlo
* and
quasi random variables (in
Quasi-Monte Carlo method
In numerical analysis, the quasi-Monte Carlo method is a method for numerical integration and solving some other problems using low-discrepancy sequences (also called quasi-random sequences or sub-random sequences) to achieve variance reduction. ...
)
For simulation with
black-box models
subset simulation and
line sampling can also be used. Under these headings are a variety of specialized techniques; for example, particle transport simulations make extensive use of "weight windows" and "splitting/Russian roulette" techniques, which are a form of importance sampling.
Crude Monte Carlo simulation
Suppose one wants to compute
with the random variable
defined on the
probability space
In probability theory, a probability space or a probability triple (\Omega, \mathcal, P) is a mathematical construct that provides a formal model of a random process or "experiment". For example, one can define a probability space which models ...
. Monte Carlo does this by sampling
i.i.d. copies
of
and then to estimate
via the sample-mean estimator
:
Under further mild conditions such as
, a
central limit theorem
In probability theory, the central limit theorem (CLT) states that, under appropriate conditions, the Probability distribution, distribution of a normalized version of the sample mean converges to a Normal distribution#Standard normal distributi ...
will apply such that for large
, the distribution of
converges to a normal distribution with mean
and standard error
. Because the standard deviation only converges towards
at the rate
, implying one needs to increase the number of simulations (
) by a factor of
to halve the standard deviation of
, variance reduction methods are often useful for obtaining more precise estimates for
without needing very large numbers of simulations.
Common Random Numbers (CRN)
The common random numbers variance reduction technique is a popular and useful variance reduction technique which applies when we are comparing two or more alternative configurations (of a system) instead of investigating a single configuration. CRN has also been called ''correlated sampling'', ''matched streams'' or ''matched pairs''.
CRN requires synchronization of the random number streams, which ensures that in addition to using the same random numbers to simulate all configurations, a specific random number used for a specific purpose in one configuration is used for exactly the same purpose in all other configurations. For example, in queueing theory, if we are comparing two different configurations of tellers in a bank, we would want the (random) time of arrival of the ''N-''th customer to be generated using the same draw from a random number stream for both configurations.
Underlying principle of the CRN technique
Suppose
and
are the observations from the first and second configurations on the ''j-''th independent replication.
We want to estimate
:
If we perform ''n'' replications of each configuration and let
:
then
and
is an unbiased estimator of
.
And since the
's are independent identically distributed random variables,
:
In case of independent sampling, i.e., no common random numbers used then Cov(''X''
1''j'', ''X''
2''j'') = 0. But if we succeed to induce an element of positive correlation between ''X''
1 and ''X''
2 such that Cov(''X''
1''j'', ''X''
2''j'') > 0, it can be seen from the equation above that the variance is reduced.
It can also be observed that if the CRN induces a negative correlation, i.e., Cov(''X''
1''j'', ''X''
2''j'') < 0, this technique can actually backfire, where the variance is increased and not decreased (as intended).
See also
*
Explained variance
*
Regularization (mathematics)
In mathematics, statistics, Mathematical finance, finance, and computer science, particularly in machine learning and inverse problems, regularization is a process that converts the Problem solving, answer to a problem to a simpler one. It is ofte ...
References
*
*{{cite journal , last=Kahn , first=H. , last2=Marshall , first2=A. W. , year=1953 , title=Methods of Reducing Sample Size in Monte Carlo Computations , journal=
Journal of the Operations Research Society of America, volume=1 , issue=5 , pages=263–271 , doi=10.1287/opre.1.5.263
*MCNP — A General Monte Carlo N-Particle Transport Code, Version 5 Los Alamos Report LA-UR-03-1987
Monte Carlo methods