In
mathematics
Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...
, a low-discrepancy sequence is a
sequence
In mathematics, a sequence is an enumerated collection of objects in which repetitions are allowed and order matters. Like a set, it contains members (also called ''elements'', or ''terms''). The number of elements (possibly infinite) is cal ...
with the property that for all values of
, its subsequence
has a low
discrepancy.
Roughly speaking, the discrepancy of a sequence is low if the proportion of points in the sequence falling into an arbitrary set ''B'' is close to proportional to the
measure of ''B'', as would happen on average (but not for particular samples) in the case of an
equidistributed sequence In mathematics, a sequence (''s''1, ''s''2, ''s''3, ...) of real numbers is said to be equidistributed, or uniformly distributed, if the proportion of terms falling in a subinterval is proportional to the length of that subinterval. Such sequences ...
. Specific definitions of discrepancy differ regarding the choice of ''B'' (
hyperspheres,
hypercubes
In geometry, a hypercube is an N-dimensional space, ''n''-dimensional analogue of a Square (geometry), square (two-dimensional, ) and a cube (Three-dimensional, ); the special case for Four-dimensional space, is known as a ''tesseract''. It is ...
, etc.) and how the discrepancy for every B is computed (usually normalized) and combined (usually by taking the worst value).
Low-discrepancy sequences are also called quasirandom sequences, due to their common use as a replacement of uniformly distributed
random numbers.
The "quasi" modifier is used to denote more clearly that the values of a low-discrepancy sequence are neither random nor
pseudorandom
A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process. Pseudorandom number generators are often used in computer programming, as tradi ...
, but such sequences share some properties of random variables and in certain applications such as the
quasi-Monte Carlo method
In numerical analysis, the quasi-Monte Carlo method is a method for numerical integration and solving some other problems using low-discrepancy sequences (also called quasi-random sequences or sub-random sequences) to achieve variance reduction. ...
their lower discrepancy is an important advantage.
Applications
Quasirandom numbers have an advantage over pure random numbers in that they cover the domain of interest quickly and evenly.
Two useful applications are in finding the
characteristic function In mathematics, the term "characteristic function" can refer to any of several distinct concepts:
* The indicator function of a subset, that is the function
\mathbf_A\colon X \to \,
which for a given subset ''A'' of ''X'', has value 1 at points ...
of a
probability density function
In probability theory, a probability density function (PDF), density function, or density of an absolutely continuous random variable, is a Function (mathematics), function whose value at any given sample (or point) in the sample space (the s ...
, and in finding the
derivative
In mathematics, the derivative is a fundamental tool that quantifies the sensitivity to change of a function's output with respect to its input. The derivative of a function of a single variable at a chosen input value, when it exists, is t ...
function of a deterministic function with a small amount of noise. Quasirandom numbers allow higher-order
moments to be calculated to high accuracy very quickly.
Applications that don't involve sorting would be in finding the
mean
A mean is a quantity representing the "center" of a collection of numbers and is intermediate to the extreme values of the set of numbers. There are several kinds of means (or "measures of central tendency") in mathematics, especially in statist ...
,
standard deviation
In statistics, the standard deviation is a measure of the amount of variation of the values of a variable about its Expected value, mean. A low standard Deviation (statistics), deviation indicates that the values tend to be close to the mean ( ...
,
skewness
In probability theory and statistics, skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive, zero, negative, or undefined.
For a unimodal ...
and
kurtosis
In probability theory and statistics, kurtosis (from , ''kyrtos'' or ''kurtos'', meaning "curved, arching") refers to the degree of “tailedness” in the probability distribution of a real-valued random variable. Similar to skewness, kurtos ...
of a statistical distribution, and in finding the
integral
In mathematics, an integral is the continuous analog of a Summation, sum, which is used to calculate area, areas, volume, volumes, and their generalizations. Integration, the process of computing an integral, is one of the two fundamental oper ...
and global
maxima and minima
In mathematical analysis, the maximum and minimum of a function are, respectively, the greatest and least value taken by the function. Known generically as extremum, they may be defined either within a given range (the ''local'' or ''relative ...
of difficult deterministic functions. Quasirandom numbers can also be used for providing starting points for deterministic algorithms that only work locally, such as
Newton–Raphson iteration.
Quasirandom numbers can also be combined with search algorithms. With a search algorithm, quasirandom numbers can be used to find the
mode,
median
The median of a set of numbers is the value separating the higher half from the lower half of a Sample (statistics), data sample, a statistical population, population, or a probability distribution. For a data set, it may be thought of as the “ ...
,
confidence intervals and
cumulative distribution of a statistical distribution, and all
local minima and all solutions of deterministic functions.
Low-discrepancy sequences in numerical integration
Various methods of
numerical integration
In analysis, numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral.
The term numerical quadrature (often abbreviated to quadrature) is more or less a synonym for "numerical integr ...
can be phrased as approximating the integral of a function
in some interval, e.g.
,1/nowiki>, as the average of the function evaluated at a set in that interval:
If the points are chosen as , this is the ''rectangle rule''.
If the points are chosen to be randomly (or pseudorandom
A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process. Pseudorandom number generators are often used in computer programming, as tradi ...
ly) distributed, this is the ''Monte Carlo method
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be ...
''.
If the points are chosen as elements of a low-discrepancy sequence, this is the ''quasi-Monte Carlo method''.
A remarkable result, the Koksma–Hlawka inequality (stated below), shows that the error of such a method can be bounded by the product of two terms, one of which depends only on , and the other one is the discrepancy of the set .
It is convenient to construct the set in such a way that if a set with elements is constructed, the previous elements need not be recomputed.
The rectangle rule uses points set which have low discrepancy, but in general the elements must be recomputed if is increased.
Elements need not be recomputed in the random Monte Carlo method if is increased,
but the point sets do not have minimal discrepancy.
By using low-discrepancy sequences we aim for low discrepancy and no need for recomputations, but actually low-discrepancy sequences can only be incrementally good on discrepancy if we allow no recomputation.
Definition of discrepancy
The ''discrepancy'' of a set is defined, using Niederreiter's notation, as
where is the -dimensional Lebesgue measure
In measure theory, a branch of mathematics, the Lebesgue measure, named after French mathematician Henri Lebesgue, is the standard way of assigning a measure to subsets of higher dimensional Euclidean '-spaces. For lower dimensions or , it c ...
,
is the number of points in that fall into ,
and is the set of -dimensional intervals or boxes of the form
where .
The ''star-discrepancy'' is defined similarly, except that the supremum is taken over the set of rectangular boxes of the form
where is in the half-open interval [0, 1).
The two are related by
''Note'': With these definitions, discrepancy represents the worst-case or maximum point density deviation of a uniform set. However, also other error measures are meaningful, leading to other definitions and variation measures. For instance, -discrepancy or modified centered -discrepancy are also used intensively to compare the quality of uniform point sets. Both are much easier to calculate for large and .
The Koksma–Hlawka inequality
Let be the -dimensional unit cube, .
Let have bounded variation on in the sense of G. H. Hardy, Hardy and Krause.
Then for any in