In probability theory, heavy-tailed distributions are probability distributions whose tails are not exponentially bounded: that is, they have heavier tails than the exponential distribution. In many applications it is the right tail of the distribution that is of interest, but a distribution may have a heavy left tail, or both tails may be heavy.
There are three important subclasses of heavy-tailed distributions: the fat-tailed distributions, the long-tailed distributions and the subexponential distributions. In practice, all commonly used heavy-tailed distributions belong to the subexponential class.
The term heavy-tailed is not used consistently, and two other definitions are in use. Some authors use the term for distributions that do not have all their power moments finite, and others for distributions that do not have a finite variance. The definition given in this article is the most general in use, and includes all distributions encompassed by the alternative definitions, as well as distributions such as the log-normal that possess all their power moments, yet which are generally considered to be heavy-tailed. (Occasionally, heavy-tailed is used for any distribution that has heavier tails than the normal distribution.)
Definitions
Definition of heavy-tailed distribution
The distribution of a random variable ''X'' with distribution function ''F'' is said to have a heavy (right) tail if the moment generating function of ''X'', M_X(t), is infinite for all ''t'' > 0.
[Rolski, Schmidli, Schmidt, Teugels, ''Stochastic Processes for Insurance and Finance'', 1999]
That means
:
\int_{-\infty}^\infty e^{tx} \,dF(x) = \infty \quad \text{for all } t>0.
This is also written in terms of the tail distribution function
:
\overline{F}(x) \equiv \Pr[X>x]
as
:
\limsup_{x \to \infty} e^{tx}\overline{F}(x) = \infty \quad \text{for all } t >0.
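The tail condition can be checked numerically on the log scale, where e^{tx} times the tail probability becomes tx plus the log of the tail probability. The following is a minimal sketch, assuming closed-form tails for an exponential distribution (rate 1) and a Pareto distribution (shape 2, minimum 1); the function names and parameter values are illustrative and not taken from the article.

```python
import math

def log_scaled_tail_exponential(x, t, rate=1.0):
    """log(e^{t x} * P[X > x]) for an exponential distribution with the given rate."""
    return t * x - rate * x

def log_scaled_tail_pareto(x, t, alpha=2.0, x_min=1.0):
    """log(e^{t x} * P[X > x]) for a Pareto distribution, valid for x >= x_min."""
    return t * x - alpha * math.log(x / x_min)

if __name__ == "__main__":
    t = 0.5  # illustrative choice; any 0 < t < rate shows the exponential tail is bounded
    for x in (10.0, 100.0, 1000.0):
        print(x, log_scaled_tail_exponential(x, t), log_scaled_tail_pareto(x, t))
```

For the exponential case the log-scaled tail decreases without bound when t is smaller than the rate, so the tail is exponentially bounded; for the Pareto case it grows without bound for every t > 0.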
Definition of long-tailed distribution
The distribution of a random variable ''X'' with distribution function ''F'' is said to have a long right tail if for all ''t'' > 0,
:
\lim_{x \to \infty} \Pr[X>x+t\mid X>x] =1,
or equivalently
:
\overline{F}(x+t) \sim \overline{F}(x) \quad \text{as } x \to \infty.
Intuitively, if a quantity with a long right tail exceeds some high level, the probability approaches 1 that it also exceeds that level by any fixed additional amount.
All long-tailed distributions are heavy-tailed, but the converse is false, and it is possible to construct heavy-tailed distributions that are not long-tailed.
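The long-tail condition compares the tail at x + t with the tail at x. A minimal sketch, assuming closed-form tails for a Pareto distribution and an exponential distribution (the function names and default parameters are hypothetical), illustrates that the ratio tends to 1 in the first case and stays fixed below 1 in the second:

```python
import math

def tail_ratio_pareto(x, t, alpha=2.0):
    """P[X > x + t] / P[X > x] for a Pareto tail x^{-alpha}, valid for x >= 1."""
    return ((x + t) / x) ** (-alpha)

def tail_ratio_exponential(x, t, rate=1.0):
    """P[X > x + t] / P[X > x] for an exponential tail; does not depend on x."""
    return math.exp(-rate * t)

if __name__ == "__main__":
    t = 5.0
    for x in (1e1, 1e3, 1e6):
        print(x, tail_ratio_pareto(x, t), tail_ratio_exponential(x, t))
```

The exponential ratio equals e^{-rate t} at every level x, which is the memoryless property and the reason the exponential distribution is not long-tailed.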
Subexponential distributions
Subexponentiality is defined in terms of convolutions of probability distributions. For two independent, identically distributed random variables X_1, X_2 with a common distribution function F, the convolution of F with itself, written F^{*2} and called the convolution square, is defined using Lebesgue–Stieltjes integration by:
:
\Pr[X_1+X_2 \leq x] = F^{*2}(x) = \int_{-\infty}^\infty F(x-y)\,dF(y),
and the ''n''-fold convolution F^{*n} is defined inductively by the rule:
:
F^{*n}(x) = \int_{-\infty}^\infty F(x-y)\,dF^{*(n-1)}(y).
The tail distribution function \overline{F} is defined as \overline{F}(x) = 1-F(x).
A distribution F on the positive half-line is subexponential if
:
\overline{F^{*2}}(x) \sim 2\overline{F}(x) \quad \text{as } x \to \infty.
This implies that, for any n \geq 1,
:
\overline{F^{*n}}(x) \sim n\overline{F}(x) \quad \text{as } x \to \infty.
The probabilistic interpretation of this is that, for a sum of n independent random variables X_1,\ldots,X_n with common distribution F,
:
\Pr[X_1+ \cdots +X_n>x] \sim \Pr[\max(X_1, \ldots,X_n)>x] \quad \text{as } x \to \infty.
This is often known as the principle of the single big jump or catastrophe principle.
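The single big jump principle can be illustrated by simulation. The sketch below is a Monte Carlo estimate under stated assumptions: the summands are Pareto with shape 1.5 and minimum 1, and the helper names, level, trial count and seed are all illustrative choices rather than anything prescribed by the article.

```python
import random

def pareto_sample(rng, alpha=1.5):
    """Inverse-CDF draw with P[X > x] = x^{-alpha} for x >= 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def big_jump_ratio(n_terms, level, n_trials=200_000, alpha=1.5, seed=0):
    """Monte Carlo estimate of P[sum > level] / P[max > level]."""
    rng = random.Random(seed)
    sum_hits = 0
    max_hits = 0
    for _ in range(n_trials):
        xs = [pareto_sample(rng, alpha) for _ in range(n_terms)]
        sum_hits += sum(xs) > level   # the sum exceeds the level
        max_hits += max(xs) > level   # a single term exceeds the level
    return sum_hits / max_hits

if __name__ == "__main__":
    for level in (10.0, 50.0):
        print(level, big_jump_ratio(2, level))
```

Because the summands are positive, the ratio is never below 1; for a subexponential distribution it should move towards 1 as the level increases, although the Monte Carlo estimate becomes noisier at high levels because fewer trials exceed them.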
A distribution F on the whole real line is subexponential if the distribution F I([0,\infty)) is. Here I([0,\infty)) is the indicator function of the positive half-line. Alternatively, a random variable X supported on the real line is subexponential if and only if X^+ = \max(0,X) is subexponential.
All subexponential distributions are long-tailed, but examples can be constructed of long-tailed distributions that are not subexponential.
Common heavy-tailed distributions
All commonly used heavy-tailed distributions are subexponential.
Those that are one-tailed include:
*the Pareto distribution;
*the log-normal distribution;
*the Lévy distribution;
*the Weibull distribution with shape parameter greater than 0 but less than 1;
*the Burr distribution;
*the log-logistic distribution;
*the log-gamma distribution;
*the Fréchet distribution;
*the q-Gaussian distribution;
*the log-Cauchy distribution, sometimes described as having a "super-heavy tail" because it exhibits logarithmic decay producing a heavier tail than the Pareto distribution.
Those that are two-tailed include:
*The Cauchy distribution, itself a special case of both the stable distribution and the t-distribution;
*The family of stable distributions, excepting the special case of the normal distribution within that family. Some stable distributions are one-sided (or supported by a half-line), see e.g. Lévy distribution. See also ''financial models with long-tailed distributions and volatility clustering''.
*The t-distribution.
*The skew lognormal cascade distribution.
Relationship to fat-tailed distributions
A fat-tailed distribution is a distribution for which the probability density function, for large x, goes to zero as a power x^{-a} for some a > 0. Since such a power is always bounded below by the probability density function of an exponential distribution, fat-tailed distributions are always heavy-tailed. Some distributions, however, have a tail which goes to zero slower than an exponential function (meaning they are heavy-tailed), but faster than a power (meaning they are not fat-tailed). An example is the log-normal distribution. Many other heavy-tailed distributions, such as the log-logistic and Pareto distributions, are however also fat-tailed.
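The log-normal case can be verified directly from its density. The following worked comparison is a sketch using the standard log-normal density with parameters \mu and \sigma; the symbols \lambda (an arbitrary exponential rate) and a (an arbitrary power) are introduced here only for the comparison.

```latex
\begin{align*}
f(x) &= \frac{1}{x\sigma\sqrt{2\pi}}
        \exp\!\left(-\frac{(\ln x-\mu)^2}{2\sigma^2}\right), \qquad x>0,\\[4pt]
% Compared with an exponential density \lambda e^{-\lambda x}, for every \lambda > 0:
\ln\frac{f(x)}{\lambda e^{-\lambda x}}
     &= \lambda x - \frac{(\ln x-\mu)^2}{2\sigma^2} - \ln x + O(1)
        \;\longrightarrow\; +\infty \quad (x\to\infty),\\[4pt]
% Compared with a power x^{-a}, for every a > 0:
\ln\frac{f(x)}{x^{-a}}
     &= -\frac{(\ln x-\mu)^2}{2\sigma^2} + (a-1)\ln x + O(1)
        \;\longrightarrow\; -\infty \quad (x\to\infty).
\end{align*}
```

The first limit holds because \lambda x dominates any power of \ln x, so the density decays more slowly than every exponential; the second holds because (\ln x)^2 dominates \ln x, so the density decays faster than every power.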
Estimating the tail-index
There are parametric and non-parametric approaches to the problem of tail-index estimation.
To estimate the tail-index using the parametric approach, some authors fit a generalized extreme value (GEV) distribution or a Pareto distribution, typically with the maximum-likelihood estimator (MLE).
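For the Pareto case the maximum-likelihood estimate of the shape has a closed form: the number of observations divided by the sum of their log-ratios to the minimum. The sketch below assumes the minimum x_min is known and the data are exactly Pareto; the function names, seed and sample size are illustrative.

```python
import math
import random

def pareto_mle_alpha(data, x_min):
    """Closed-form MLE of the Pareto shape alpha, assuming x_min is known."""
    logs = [math.log(x / x_min) for x in data if x > x_min]
    if not logs:
        raise ValueError("no observations above x_min")
    return len(logs) / sum(logs)

def sample_pareto(n, alpha, x_min=1.0, seed=0):
    """Synthetic Pareto sample by inverse-CDF sampling (for illustration only)."""
    rng = random.Random(seed)
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

if __name__ == "__main__":
    data = sample_pareto(20_000, alpha=2.5)
    print(pareto_mle_alpha(data, x_min=1.0))
```

With real data the minimum is usually unknown and only the upper tail is approximately Pareto, so in practice the fit is applied to exceedances over a chosen threshold.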
Pickands' tail-index estimator
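Let (X_n , n \geq 1) be a sequence of independent random variables with common distribution function F \in D(H(\xi)), the maximum domain of attraction of the generalized extreme value distribution H, where \xi \in \mathbb{R}. If \lim_{n\to\infty} k(n) = \infty and \lim_{n\to\infty} \frac{k(n)}{n}= 0, then the ''Pickands'' tail-index estimator is
:
\xi^\text{Pickands}_{(k(n),n)} =\frac{1}{\ln 2} \ln \left( \frac{X_{(n-k(n)+1,n)} - X_{(n-2k(n)+1,n)}}{X_{(n-2k(n)+1,n)} - X_{(n-4k(n)+1,n)}}\right),
where X_{(i,n)} is the i-th order statistic of X_1,\ldots ,X_n, so that X_{(n,n)}=\max \left(X_1,\ldots ,X_n\right). This estimator converges in probability to \xi.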
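A minimal sketch of the Pickands estimator follows, assuming that X_(i,n) denotes the i-th smallest of the n observations and that k is supplied by the caller with 4k not exceeding n. The function names and the synthetic Pareto test data are illustrative.

```python
import math
import random

def pickands_estimator(data, k):
    """Pickands estimate of xi from the order statistics at depths k, 2k and 4k."""
    xs = sorted(data)              # ascending, so xs[i - 1] is X_(i,n)
    n = len(xs)
    if k < 1 or 4 * k > n:
        raise ValueError("need 1 <= k and 4 * k <= n")
    upper = xs[n - k]              # X_(n-k+1, n)
    middle = xs[n - 2 * k]         # X_(n-2k+1, n)
    lower = xs[n - 4 * k]          # X_(n-4k+1, n)
    return math.log((upper - middle) / (middle - lower)) / math.log(2.0)

def sample_pareto(n, alpha, seed=0):
    """Synthetic Pareto sample with P[X > x] = x^{-alpha}, x >= 1 (illustration only)."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

if __name__ == "__main__":
    data = sample_pareto(100_000, alpha=2.0)   # corresponds to xi = 1 / alpha = 0.5
    print(pickands_estimator(data, k=2_000))
```

The estimate is sensitive to the choice of k: small values give a noisy estimate, while large values reach into the body of the distribution where the tail approximation may not hold.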
Hill's tail-index estimator
Let (X_t , t \geq 1) be a sequence of independent and identically distributed random variables with distribution function F \in D(H(\xi)), the maximum domain of attraction of the generalized extreme value distribution H, where \xi \in \mathbb{R}. The sample path is \{X_t : 1 \leq t \leq n\}, where n is the sample size. If \{k(n)\} is an intermediate order sequence, i.e. k(n) \in \{1,\ldots,n-1\}, k(n) \to \infty and k(n)/n \to 0, then the Hill tail-index estimator is
:
\xi^\text{Hill}_{(k(n),n)} = \left(\frac{1}{k(n)} \sum_{i=n-k(n)+1}^n \left(\ln X_{(i,n)} - \ln X_{(n-k(n)+1,n)}\right)\right)^{-1},
where X_{(i,n)} is the i-th order statistic of X_1, \dots, X_n.
For \xi > 0, the average of log-spacings inside the outer parentheses converges in probability to \xi, so the estimator as written, with the outer inverse, converges in probability to the tail index 1/\xi. It is asymptotically normal provided the growth of k(n) is restricted based on a higher order regular variation property. Consistency and asymptotic normality extend to a large class of dependent and heterogeneous sequences, irrespective of whether X_t is observed, or a computed residual or filtered data from a large class of models and estimators, including mis-specified models and models with errors that are dependent. Note that both Pickands' and Hill's tail-index estimators make use of the logarithm of the order statistics.
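A minimal sketch of the Hill computation follows, assuming positive data and the same threshold convention as the formula above (the k-th largest observation). The function returns the average log-excess, which for Pareto-type tails with shape alpha estimates 1/alpha; its reciprocal is the form written with an outer inverse. The function names and synthetic data are illustrative.

```python
import math
import random

def hill_mean_log_excess(data, k):
    """Average of ln X_(i,n) - ln X_(n-k+1,n) over the k largest observations."""
    xs = sorted(data)                  # ascending, so xs[i - 1] is X_(i,n)
    n = len(xs)
    if not 2 <= k < n:
        raise ValueError("need 2 <= k < n")
    threshold = xs[n - k]              # X_(n-k+1, n)
    if threshold <= 0:
        raise ValueError("the k largest observations must be positive")
    return sum(math.log(x / threshold) for x in xs[n - k:]) / k

def sample_pareto(n, alpha, seed=0):
    """Synthetic Pareto sample with P[X > x] = x^{-alpha}, x >= 1 (illustration only)."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

if __name__ == "__main__":
    data = sample_pareto(50_000, alpha=2.0)
    mean_excess = hill_mean_log_excess(data, k=2_000)
    print(mean_excess, 1.0 / mean_excess)   # estimates of 1/alpha and alpha
```

A common diagnostic is to plot the estimate against k (a Hill plot) and look for a range of k over which it is roughly stable.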
Ratio estimator of the tail-index
The ratio estimator (RE-estimator) of the tail-index was introduced by Goldie and Smith. It is constructed similarly to Hill's estimator but uses a non-random "tuning parameter". A comparison of Hill-type and RE-type estimators can be found in Novak.
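The contrast with Hill's estimator can be sketched in code. The version below is an assumption about one common form of the ratio estimator: the average of ln(X_i / x) over the observations that exceed a fixed, non-random threshold x. The threshold plays the role of the tuning parameter, and the function names and synthetic data are illustrative.

```python
import math
import random

def ratio_estimator(data, threshold):
    """Average log-excess over a fixed (non-random) threshold; assumed form of the RE."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    excesses = [math.log(x / threshold) for x in data if x > threshold]
    if not excesses:
        raise ValueError("no observations above the threshold")
    return sum(excesses) / len(excesses)

def sample_pareto(n, alpha, seed=0):
    """Synthetic Pareto sample with P[X > x] = x^{-alpha}, x >= 1 (illustration only)."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

if __name__ == "__main__":
    data = sample_pareto(100_000, alpha=2.0)
    print(ratio_estimator(data, threshold=5.0))   # estimates 1 / alpha for Pareto data
```

Here the number of exceedances is random while the threshold is fixed, whereas in Hill's estimator the number of upper order statistics is fixed and the threshold is random.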
Software
aest, a C tool for estimating the heavy-tail index.
Estimation of heavy-tailed density
Nonparametric approaches to estimate heavy- and superheavy-tailed probability density functions were given in Markovich.
These approaches are based on variable-bandwidth and long-tailed kernel estimators; on a preliminary transform of the data to a new random variable on a finite or infinite interval that is more convenient for estimation, followed by an inverse transform of the resulting density estimate; and on a "piecing-together" approach, which combines a parametric model for the tail of the density with a non-parametric model for its mode. Nonparametric estimators require an appropriate choice of tuning (smoothing) parameters, such as the bandwidth of a kernel estimator or the bin width of a histogram. Well-known data-driven methods for this choice are cross-validation and its modifications, and methods based on minimizing the mean squared error (MSE), its asymptotic form, or their upper bounds.
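The transform-then-invert idea can be illustrated with a simple case. The sketch below is not the method of the cited work: it assumes a logarithmic transform Y = ln X, a Gaussian kernel, and a rule-of-thumb bandwidth, all chosen here for illustration. The density of Y is estimated and then mapped back using f_X(x) = f_Y(ln x) / x.

```python
import math
import random
import statistics

def log_transform_kde(data, x):
    """Density estimate at x > 0 via a Gaussian KDE on ln(data), mapped back to x."""
    if x <= 0:
        return 0.0
    ys = [math.log(v) for v in data]                 # transformed sample Y = ln X
    n = len(ys)
    h = 1.06 * statistics.stdev(ys) * n ** (-0.2)    # rule-of-thumb bandwidth (assumption)
    y = math.log(x)
    kernel_sum = sum(math.exp(-0.5 * ((y - yi) / h) ** 2) for yi in ys)
    density_y = kernel_sum / (n * h * math.sqrt(2.0 * math.pi))
    return density_y / x                             # change of variables back to X

if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.lognormvariate(0.0, 1.0) for _ in range(5_000)]
    for x in (0.5, 1.0, 5.0, 50.0):
        print(x, log_transform_kde(data, x))
```

Working on the log scale compresses the long right tail, so a single bandwidth behaves more reasonably than it would on the original scale; the bandwidth itself remains a tuning parameter that would need to be selected by one of the data-driven methods mentioned above.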
A discrepancy method uses well-known nonparametric statistics, such as the Kolmogorov–Smirnov, von Mises and Anderson–Darling statistics, as a metric in the space of distribution functions (dfs), and takes quantiles of these statistics as a known uncertainty or discrepancy value.
Bootstrap is another tool to find smoothing parameters using approximations of the unknown MSE by different schemes of re-sample selection, see e.g.
[Hall P., ''The Bootstrap and Edgeworth Expansion'', Springer, 1992, ISBN 9780387945088]
See also
* Leptokurtic distribution
* Generalized extreme value distribution
* Generalized Pareto distribution
* Outlier
* Long tail
* Power law
* Seven states of randomness
* Fat-tailed distribution
** Taleb distribution and Holy grail distribution