In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.
There are three important subclasses of heavy-tailed distributions: the fat-tailed distributions, the long-tailed distributions and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.
There is still some discrepancy over the use of the term heavy-tailed. Two other definitions are in use. Some authors use the term for distributions that do not have all their power moments finite, and others for distributions that do not have a finite variance. The definition given in this article is the most general in use: it includes all distributions encompassed by the alternative definitions, as well as distributions such as the log-normal that possess all their power moments yet are generally considered to be heavy-tailed. (Occasionally, heavy-tailed is used for any distribution that has heavier tails than the normal distribution.)
Definitions
Definition of heavy-tailed distribution
The distribution of a random variable ''X'' with distribution function ''F'' is said to have a heavy (right) tail if the moment generating function of ''X'', ''M_X''(''t''), is infinite for all ''t'' > 0.
[Rolski, Schmidli, Schmidt, Teugels, ''Stochastic Processes for Insurance and Finance'', 1999]
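That means
: \int_{-\infty}^{\infty} e^{tx} \,dF(x) = \infty \quad \text{for all } t > 0.
This is also written in terms of the tail distribution function
: \overline{F}(x) \equiv \Pr[X > x]
as
: \lim_{x \to \infty} e^{tx} \overline{F}(x) = \infty \quad \text{for all } t > 0.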
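The tail criterion can be explored numerically. The following is a minimal sketch, not a proof: it evaluates the logarithm of e^{tx} times the tail function for an exponential and a Pareto distribution. The function names, the rate 1, the Pareto exponent 2 and the value t = 0.5 are all illustrative assumptions. Working on the log scale avoids overflow for large x.

```python
import math

def log_scaled_tail(log_survival, x, t):
    """Return log(e^{t x} * F_bar(x)) = t * x + log F_bar(x)."""
    return t * x + log_survival(x)

def exp_log_survival(x, rate=1.0):
    # Exponential(rate): F_bar(x) = exp(-rate * x), so log F_bar(x) = -rate * x.
    return -rate * x

def pareto_log_survival(x, alpha=2.0, x_min=1.0):
    # Pareto(alpha, x_min): F_bar(x) = (x_min / x)^alpha for x >= x_min.
    return alpha * (math.log(x_min) - math.log(x))

t = 0.5  # illustrative choice, smaller than the exponential rate
for x in (10.0, 100.0, 1000.0):
    print(x,
          log_scaled_tail(exp_log_survival, x, t),      # decreases without bound
          log_scaled_tail(pareto_log_survival, x, t))   # increases without bound
```

For the exponential distribution the scaled tail tends to zero whenever t is below the rate, so the criterion fails for those t. For the Pareto distribution the power-law decay cannot offset e^{tx} for any t > 0.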
Definition of long-tailed distribution
The distribution of a random variable ''X'' with distribution function ''F'' is said to have a long right tail if for all ''t'' > 0,
: \lim_{x \to \infty} \Pr[X > x + t \mid X > x] = 1,
or equivalently
: \overline{F}(x + t) \sim \overline{F}(x) \quad \text{as } x \to \infty.
Intuitively, if a quantity with a long right tail exceeds some high level ''x'', then the probability that it also exceeds any fixed higher level ''x'' + ''t'' approaches 1 as ''x'' grows.
All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
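As a hedged illustration of the long-tail property, the sketch below computes the conditional probability that a quantity exceeds x + t given that it exceeds x, as a ratio of tail probabilities. The Pareto exponent 2, the exponential rate 1 and the function names are assumptions chosen only for the example.

```python
import math

def conditional_exceedance(survival, x, t):
    """Pr[X > x + t | X > x] expressed as F_bar(x + t) / F_bar(x)."""
    return survival(x + t) / survival(x)

def pareto_survival(x, alpha=2.0, x_min=1.0):
    # Tail of a Pareto(alpha, x_min) distribution.
    return (x_min / x) ** alpha if x >= x_min else 1.0

def exponential_survival(x, rate=1.0):
    # Tail of an Exponential(rate) distribution.
    return math.exp(-rate * x)

t = 5.0
for x in (1.0, 10.0, 100.0):
    print(x,
          conditional_exceedance(pareto_survival, x, t),       # climbs towards 1
          conditional_exceedance(exponential_survival, x, t))  # stays at exp(-t)
```

The Pareto ratio approaches 1 as x grows, while the exponential ratio is the same for every x (memorylessness), so the exponential distribution is not long-tailed.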
Subexponential distributions
Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables X_1, X_2 with a common distribution function F, the convolution of F with itself, written F^{*2} and called the convolution square, is defined using Lebesgue–Stieltjes integration by:
: \Pr[X_1 + X_2 \leq x] = F^{*2}(x) = \int_0^x F(x - y) \,dF(y),
and the ''n''-fold convolution F^{*n} is defined inductively by the rule:
: F^{*n}(x) = \int_0^x F(x - y) \,dF^{*n-1}(y).
The tail distribution function \overline{F} is defined as \overline{F}(x) = 1 - F(x).
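A distribution F on the positive half-line is subexponential if
: \overline{F^{*2}}(x) \sim 2\overline{F}(x) \quad \text{as } x \to \infty.
This implies that, for any n \geq 1,
: \overline{F^{*n}}(x) \sim n\overline{F}(x) \quad \text{as } x \to \infty.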
The probabilistic interpretation of this is that, for a sum of n independent random variables X_1, \ldots, X_n with common distribution F,
: \Pr[X_1 + \cdots + X_n > x] \sim \Pr[\max(X_1, \ldots, X_n) > x] \quad \text{as } x \to \infty.
This is often known as the principle of the single big jump or catastrophe principle.
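The single big jump principle can be illustrated with a small Monte Carlo experiment. This is a rough sketch under stated assumptions: the summands are Pareto with exponent 1.5 and minimum 1 (drawn with the standard library's `random.paretovariate`), and n = 5, the level x = 200, the number of trials and the seed are arbitrary illustrative choices. It counts how often the sum and the maximum exceed the level.

```python
import random

def simulate_tail_counts(n=5, alpha=1.5, x=200.0, trials=100_000, seed=0):
    """Count trials in which the sum, and the maximum, of n Pareto draws exceed x."""
    rng = random.Random(seed)
    sum_exceeds = 0
    max_exceeds = 0
    for _ in range(trials):
        draws = [rng.paretovariate(alpha) for _ in range(n)]
        if sum(draws) > x:
            sum_exceeds += 1
        if max(draws) > x:
            max_exceeds += 1
    return sum_exceeds, max_exceeds

sum_exceeds, max_exceeds = simulate_tail_counts()
# For a subexponential distribution the ratio should be close to 1 at high levels.
print(sum_exceeds, max_exceeds, sum_exceeds / max_exceeds)
```

Because the draws are positive, every trial in which the maximum exceeds x is also one in which the sum exceeds x, so the ratio is at least 1. It is expected to move closer to 1 as the level x is raised, at the cost of fewer observed exceedances.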
A distribution F on the whole real line is subexponential if the distribution F I([0,\infty)) is. Here I([0,\infty)) is the indicator function of the positive half-line. Alternatively, a random variable X supported on the real line is subexponential if and only if X^+ = \max(0,X) is subexponential.
All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
Common heavy-tailed distributions
All commonly used heavy-tailed distributions are subexponential.
Those that are one-tailed include:
*the Pareto distribution;
*the log-normal distribution;
*the Lévy distribution;
*the Weibull distribution with shape parameter greater than 0 but less than 1;
*the Burr distribution;
*the log-logistic distribution;
*the log-gamma distribution;
*the Fréchet distribution;
*the q-Gaussian distribution;
*the log-Cauchy distribution, sometimes described as having a "super-heavy tail" because it exhibits logarithmic decay producing a heavier tail than the Pareto distribution.
Those that are two-tailed include:
*The Cauchy distribution, itself a special case of both the stable distribution and the t-distribution;
*The family of stable distributions, excepting the special case of the normal distribution within that family. Some stable distributions are one-sided (or supported by a half-line), see e.g. the Lévy distribution.
*The t-distribution.
*The skew lognormal cascade distribution.
Relationship to fat-tailed distributions
A fat-tailed distribution is a distribution for which the probability density function, for large x, goes to zero as a power x^{-a}. Since such a power is always bounded below by the probability density function of an exponential distribution, fat-tailed distributions are always heavy-tailed. Some distributions, however, have a tail which goes to zero slower than an exponential function (meaning they are heavy-tailed), but faster than a power (meaning they are not fat-tailed). An example is the log-normal distribution. Many other heavy-tailed distributions such as the log-logistic and Pareto distributions are, however, also fat-tailed.
Estimating the tail-index
There are parametric and non-parametric approaches to the problem of tail-index estimation.
To estimate the tail-index using the parametric approach, some authors employ the GEV distribution or the Pareto distribution; they may apply the maximum-likelihood estimator (MLE).
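As a sketch of the parametric route, consider the simplest case: a Pareto model whose minimum x_min is assumed known. The maximum-likelihood estimate of the exponent then has the closed form n divided by the sum of \ln(x_i / x_min). The function name, the true exponent 2.5, the sample size and the seed below are illustrative assumptions, and GEV fitting is not covered here.

```python
import math
import random

def pareto_alpha_mle(sample, x_min=1.0):
    """Closed-form MLE of the Pareto exponent, with x_min assumed known."""
    logs = [math.log(x / x_min) for x in sample if x >= x_min]
    if not logs or sum(logs) <= 0.0:
        raise ValueError("need observations strictly above x_min")
    return len(logs) / sum(logs)

rng = random.Random(0)
true_alpha = 2.5  # illustrative value
sample = [rng.paretovariate(true_alpha) for _ in range(20_000)]
print(pareto_alpha_mle(sample))  # expected to land near true_alpha
```

In practice x_min is rarely known and only the upper part of the data follows a power law, which is one motivation for the non-parametric estimators.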
Pickands' tail-index estimator
Let (X_n , n \geq 1) be a sequence of independent random variables with common distribution function F \in D(H(\xi)), the maximum domain of attraction of the generalized extreme value distribution H, where \xi \in \mathbb{R}. If \lim_{n\to\infty} k(n) = \infty and \lim_{n\to\infty} \frac{k(n)}{n} = 0, then the ''Pickands'' tail-index estimator is
: \xi^\text{Pickands}_{(k(n),n)} = \frac{1}{\ln 2} \ln \left( \frac{X_{(n-k(n)+1,n)} - X_{(n-2k(n)+1,n)}}{X_{(n-2k(n)+1,n)} - X_{(n-4k(n)+1,n)}} \right),
where X_{(n-k(n)+1,n)} = \max \left(X_{n-k(n)+1}, \ldots, X_n\right). This estimator converges in probability to \xi.
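A minimal sketch of the Pickands estimator follows. It assumes the common order-statistic reading of the notation, in which X_{(i,n)} is the i-th smallest of the n observations. That reading, the function name and the Pareto test data with exponent 2 (for which \xi = 1/2 in the usual generalized extreme value parameterization) are assumptions of this example.

```python
import math
import random

def pickands_estimator(sample, k):
    """Pickands estimate from ascending order statistics; requires 4 * k <= n."""
    xs = sorted(sample)
    n = len(xs)
    if k < 1 or 4 * k > n:
        raise ValueError("k must satisfy 1 <= k and 4 * k <= n")
    upper = xs[n - k]        # X_(n-k+1,n) in 1-based notation
    middle = xs[n - 2 * k]   # X_(n-2k+1,n)
    lower = xs[n - 4 * k]    # X_(n-4k+1,n)
    return math.log((upper - middle) / (middle - lower)) / math.log(2.0)

rng = random.Random(0)
sample = [rng.paretovariate(2.0) for _ in range(50_000)]
print(pickands_estimator(sample, k=2_000))  # expected to be in the vicinity of 0.5
```

The estimate is sensitive to the choice of k: small values give noisy results, while large values reach into the body of the distribution where the tail approximation no longer holds.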
Hill's tail-index estimator
Let (X_t , t \geq 1) be a sequence of independent and identically distributed random variables with distribution function F \in D(H(\xi)), the maximum domain of attraction of the generalized extreme value distribution H, where \xi \in \mathbb{R}. The sample path is \{X_t : 1 \leq t \leq n\}, where n is the sample size. If \{k(n)\} is an intermediate order sequence, i.e. k(n) \in \{1, \ldots, n-1\}, k(n) \to \infty and k(n)/n \to 0, then the Hill tail-index estimator is
: \xi^\text{Hill}_{(k(n),n)} = \left(\frac{1}{k(n)} \sum_{i=n-k(n)+1}^{n} \ln(X_{(i,n)}) - \ln(X_{(n-k(n)+1,n)})\right)^{-1},
where X_{(i,n)} is the i-th order statistic of X_1, \dots, X_n.
This estimator converges in probability to \xi, and is asymptotically normal provided the growth of k(n) \to \infty is restricted based on a higher order regular variation property. Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences, irrespective of whether X_t is observed, or is a computed residual or filtered data from a large class of models and estimators, including mis-specified models and models with errors that are dependent. Note that both Pickands' and Hill's tail-index estimators commonly make use of the logarithm of the order statistics.
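The Hill expression can be sketched in a few lines. The code below computes the reciprocal of the mean log-excess of the top k order statistics over the threshold X_{(n-k+1,n)}. The function name, the requirement of positive data, the Pareto test sample with exponent 2 and the choice k = 2000 are assumptions of this example. For such a sample the returned value is close to the Pareto exponent; how that maps to \xi depends on the parameterization of H, which this sketch does not fix.

```python
import math
import random

def hill_estimator(sample, k):
    """Reciprocal of the mean log-excess of the k largest observations."""
    xs = sorted(sample)
    n = len(xs)
    if k < 2 or k > n - 1:
        raise ValueError("k must satisfy 2 <= k <= n - 1")
    threshold = xs[n - k]  # X_(n-k+1,n) in 1-based notation
    if threshold <= 0.0:
        raise ValueError("the top k observations must be positive")
    mean_log = sum(math.log(x) for x in xs[n - k:]) / k
    mean_log_excess = mean_log - math.log(threshold)
    if mean_log_excess <= 0.0:
        raise ValueError("top k observations are all tied")
    return 1.0 / mean_log_excess

rng = random.Random(0)
sample = [rng.paretovariate(2.0) for _ in range(50_000)]
print(hill_estimator(sample, k=2_000))  # expected to be near the exponent 2
```

A common diagnostic is to evaluate the estimator over a range of k and look for a region where the values are roughly stable.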
Ratio estimator of the tail-index
The ratio estimator (RE-estimator) of the tail-index was introduced by Goldie and Smith. It is constructed similarly to Hill's estimator but uses a non-random "tuning parameter".
A comparison of Hill-type and RE-type estimators can be found in Novak.
Software
*aest, a C tool for estimating the heavy-tail index.
Estimation of heavy-tailed density
Nonparametric approaches to estimating heavy- and superheavy-tailed probability density functions were given in Markovich.
These include approaches based on variable bandwidth and long-tailed kernel estimators; approaches based on a preliminary transform of the data to a new random variable on a finite or infinite interval, which is more convenient for estimation, followed by an inverse transform of the resulting density estimate; and the "piecing-together approach", which uses a parametric model for the tail of the density and a non-parametric model to approximate the mode of the density. Nonparametric estimators require an appropriate selection of tuning (smoothing) parameters, such as the bandwidth of kernel estimators and the bin width of a histogram. Well-known data-driven methods for this selection are cross-validation and its modifications, and methods based on minimizing the mean squared error (MSE), its asymptotic form, or their upper bounds.
A discrepancy method has also been proposed, which uses well-known nonparametric statistics such as the Kolmogorov–Smirnov, von Mises and Anderson–Darling statistics as a metric in the space of distribution functions (dfs), and quantiles of the latter statistics as a known uncertainty or discrepancy value.
The bootstrap is another tool for finding smoothing parameters, using approximations of the unknown MSE by different schemes of re-sample selection, see e.g.
[Hall, P., ''The Bootstrap and Edgeworth Expansion'', Springer, 1992, ISBN 9780387945088]
See also
*Leptokurtic distribution
*Generalized extreme value distribution
*Generalized Pareto distribution
*Outlier
*Long tail
*Power law
*Seven states of randomness
*Fat-tailed distribution
**Taleb distribution and Holy grail distribution