Generalized Pareto Distribution

In statistics, the generalized Pareto distribution (GPD) is a family of continuous probability distributions. It is often used to model the tails of another distribution. It is specified by three parameters: location \mu, scale \sigma, and shape \xi. Sometimes it is specified by only scale and shape and sometimes only by its shape parameter. Some references give the shape parameter as \kappa = -\xi.


Definition

The standard cumulative distribution function (cdf) of the GPD is defined by
: F_{\xi}(z) = \begin{cases} 1 - \left(1 + \xi z\right)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - e^{-z} & \text{for } \xi = 0, \end{cases}
where the support is z \geq 0 for \xi \geq 0 and 0 \leq z \leq -1/\xi for \xi < 0. The corresponding probability density function (pdf) is
: f_{\xi}(z) = \begin{cases} (1 + \xi z)^{-\frac{\xi + 1}{\xi}} & \text{for } \xi \neq 0, \\ e^{-z} & \text{for } \xi = 0. \end{cases}


Characterization

The related location-scale family of distributions is obtained by replacing the argument ''z'' by \frac{x - \mu}{\sigma} and adjusting the support accordingly. The cumulative distribution function of X \sim GPD(\mu, \sigma, \xi) (\mu \in \mathbb{R}, \sigma > 0, and \xi \in \mathbb{R}) is
: F_{(\mu,\sigma,\xi)}(x) = \begin{cases} 1 - \left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-1/\xi} & \text{for } \xi \neq 0, \\ 1 - \exp\left(-\frac{x - \mu}{\sigma}\right) & \text{for } \xi = 0, \end{cases}
where the support of X is x \geqslant \mu when \xi \geqslant 0, and \mu \leqslant x \leqslant \mu - \sigma/\xi when \xi < 0. The probability density function (pdf) of X \sim GPD(\mu, \sigma, \xi) is
: f_{(\mu,\sigma,\xi)}(x) = \frac{1}{\sigma}\left(1 + \frac{\xi (x - \mu)}{\sigma}\right)^{-\frac{1}{\xi} - 1},
again for x \geqslant \mu when \xi \geqslant 0, and \mu \leqslant x \leqslant \mu - \sigma/\xi when \xi < 0. The pdf is a solution of the following differential equation, which follows from differentiating the expression above:
: f'(x)\,\bigl(\sigma + \xi (x - \mu)\bigr) + (\xi + 1)\, f(x) = 0, \qquad f(\mu) = \frac{1}{\sigma}.
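The location-scale cdf and pdf described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function names `gpd_cdf` and `gpd_pdf` are hypothetical, and the pdf is set to 0 at the upper endpoint for \xi < 0 purely as a convention to avoid dividing by zero.

```python
import math


def gpd_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of GPD(mu, sigma, xi) at x (illustrative helper, not a library API)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0                      # below the lower endpoint mu
    if xi == 0:
        return 1.0 - math.exp(-z)       # exponential limit
    if xi < 0 and z >= -1.0 / xi:
        return 1.0                      # at or beyond the upper endpoint mu - sigma/xi
    return 1.0 - (1.0 + xi * z) ** (-1.0 / xi)


def gpd_pdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """PDF of GPD(mu, sigma, xi) at x (illustrative helper, not a library API)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if z < 0:
        return 0.0
    if xi == 0:
        return math.exp(-z) / sigma
    if xi < 0 and z >= -1.0 / xi:
        return 0.0                      # convention: density taken as 0 at/after the endpoint
    return (1.0 + xi * z) ** (-1.0 / xi - 1.0) / sigma
```

A quick sanity check is that a central finite difference of `gpd_cdf` should agree closely with `gpd_pdf` at interior points of the support.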


Special cases

* If the shape \xi and location \mu are both zero, the GPD is equivalent to the exponential distribution.
* With shape \xi = -1 and location \mu = 0, the GPD is equivalent to the continuous uniform distribution U(0, \sigma). Castillo, Enrique, and Ali S. Hadi. "Fitting the generalized Pareto distribution to data." Journal of the American Statistical Association 92.440 (1997): 1609-1620.
* With shape \xi > 0 and location \mu = \sigma/\xi, the GPD is equivalent to the Pareto distribution with scale x_m = \sigma/\xi and shape \alpha = 1/\xi.
* If X \sim GPD(\mu = 0, \sigma, \xi), then Y = \log(X) \sim exGPD(\sigma, \xi). (exGPD stands for the exponentiated generalized Pareto distribution.)
* GPD is similar to the Burr distribution.
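The first three special cases can be checked numerically by comparing cdfs. The sketch below assumes the location-scale cdf from the Characterization section; the helper names (`gpd_cdf`, `max_abs_gap`) and the test points are illustrative choices, not part of any library.

```python
import math


def gpd_cdf(x, mu, sigma, xi):
    """CDF of GPD(mu, sigma, xi); hypothetical helper for this check."""
    z = (x - mu) / sigma
    if z <= 0:
        return 0.0
    if xi == 0:
        return 1.0 - math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0:                       # beyond the upper endpoint when xi < 0
        return 1.0
    return 1.0 - base ** (-1.0 / xi)


def max_abs_gap(f, g, points):
    """Largest absolute difference between two functions on the given points."""
    return max(abs(f(x) - g(x)) for x in points)


sigma = 2.0

# xi = 0, mu = 0: exponential distribution with mean sigma
exp_gap = max_abs_gap(lambda x: gpd_cdf(x, 0.0, sigma, 0.0),
                      lambda x: 1.0 - math.exp(-x / sigma),
                      [0.1, 1.0, 5.0])

# xi = -1, mu = 0: uniform distribution on (0, sigma)
unif_gap = max_abs_gap(lambda x: gpd_cdf(x, 0.0, sigma, -1.0),
                       lambda x: min(max(x / sigma, 0.0), 1.0),
                       [0.5, 1.0, 1.9, 3.0])

# xi > 0, mu = sigma/xi: Pareto with scale x_m = sigma/xi and shape alpha = 1/xi
xi = 0.5
x_m, alpha = sigma / xi, 1.0 / xi
pareto_gap = max_abs_gap(lambda x: gpd_cdf(x, x_m, sigma, xi),
                         lambda x: 1.0 - (x_m / x) ** alpha if x >= x_m else 0.0,
                         [4.5, 6.0, 20.0])
```

All three gaps should be at the level of floating-point rounding error.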


Generating generalized Pareto random variables


Generating GPD random variables

If ''U'' is uniformly distributed on (0, 1], then
: X = \mu + \frac{\sigma (U^{-\xi} - 1)}{\xi} \sim GPD(\mu, \sigma, \xi \neq 0)
and
: X = \mu - \sigma \ln(U) \sim GPD(\mu, \sigma, \xi = 0).
Both formulas are obtained by inversion of the cdf. In the Matlab Statistics Toolbox, the "gprnd" command generates generalized Pareto random numbers.
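The inversion formulas above translate directly into a sampler. The following is a minimal sketch using only the Python standard library; the function name `gpd_sample` is hypothetical, and `1.0 - rng.random()` is used so that the uniform draw lies in (0, 1] as the formulas assume.

```python
import math
import random


def gpd_sample(mu, sigma, xi, rng=random):
    """Draw one GPD(mu, sigma, xi) variate by inverting the cdf (illustrative helper)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    u = 1.0 - rng.random()              # rng.random() is in [0, 1), so u is in (0, 1]
    if xi == 0:
        return mu - sigma * math.log(u)
    return mu + sigma * (u ** (-xi) - 1.0) / xi


# Example: 100000 draws with a fixed seed, heavy-tailed case xi = 0.2
rng = random.Random(12345)
draws = [gpd_sample(0.0, 1.0, 0.2, rng) for _ in range(100000)]

# Fraction of draws at or below 1, to compare with the cdf value 1 - 1.2**(-5)
frac_below_one = sum(x <= 1.0 for x in draws) / len(draws)
```

Comparing empirical fractions such as `frac_below_one` with the cdf is one simple way to confirm that the sampler and the cdf agree.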


GPD as an Exponential-Gamma Mixture

A GPD random variable can also be expressed as an exponential random variable with a Gamma-distributed rate parameter. If
: X \mid \Lambda \sim \operatorname{Exp}(\Lambda)
and
: \Lambda \sim \operatorname{Gamma}(\alpha, \beta),
then
: X \sim \operatorname{GPD}(\xi = 1/\alpha, \ \sigma = \beta/\alpha).
Here \beta is the rate (inverse scale) of the Gamma distribution, which is the parameterization under which the stated identity holds. Note, however, that since the parameters of the Gamma distribution must be greater than zero, we obtain the additional restriction that \xi must be positive.
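The mixture representation can be checked by simulation. The sketch below assumes \beta is the rate of the Gamma distribution; since Python's `random.gammavariate` takes a shape and a scale, the scale passed is `1/beta`. The names `mixture_sample` and `gpd_survival` and the parameter values are illustrative.

```python
import random


def mixture_sample(alpha, beta, rng):
    """Draw X | Lambda ~ Exp(Lambda) with Lambda ~ Gamma(shape=alpha, rate=beta)."""
    lam = rng.gammavariate(alpha, 1.0 / beta)   # second argument is the scale, i.e. 1/rate
    return rng.expovariate(lam)                 # expovariate takes the rate


def gpd_survival(x, sigma, xi):
    """P(X > x) for GPD(mu=0, sigma, xi) with xi > 0."""
    return (1.0 + xi * x / sigma) ** (-1.0 / xi)


alpha, beta = 3.0, 2.0
xi, sigma = 1.0 / alpha, beta / alpha           # parameters implied by the mixture

rng = random.Random(2024)
draws = [mixture_sample(alpha, beta, rng) for _ in range(100000)]

empirical_tail = sum(x > 1.0 for x in draws) / len(draws)
theoretical_tail = gpd_survival(1.0, sigma, xi)  # equals (1 + 1/beta)**(-alpha)
```

With these values the theoretical tail probability is (3/2)^{-3}, and the empirical fraction should be close to it up to Monte Carlo error.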


Exponentiated generalized Pareto distribution


The exponentiated generalized Pareto distribution (exGPD)

If X \sim GPD(\mu = 0, \sigma, \xi), then Y = \log(X) is distributed according to the exponentiated generalized Pareto distribution, denoted by Y \sim exGPD(\sigma, \xi). The probability density function (pdf) of Y \sim exGPD(\sigma, \xi) (\sigma > 0) is
: g_{(\sigma,\xi)}(y) = \begin{cases} \frac{e^y}{\sigma}\bigg(1 + \frac{\xi e^y}{\sigma}\bigg)^{-1/\xi - 1} & \text{for } \xi \neq 0, \\ \frac{1}{\sigma} e^{y - e^y/\sigma} & \text{for } \xi = 0, \end{cases}
where the support is -\infty < y < \infty for \xi \geq 0, and -\infty < y \leq \log(-\sigma/\xi) for \xi < 0. For all \xi, \log \sigma becomes the location parameter. See the right panel for the pdf when the shape \xi is positive. The exGPD has finite moments of all orders for all \sigma > 0 and -\infty < \xi < \infty. The
moment-generating function of Y \sim exGPD(\sigma, \xi) is
: M_Y(s) = E[e^{sY}] = \begin{cases} -\frac{1}{\xi}\bigg(-\frac{\sigma}{\xi}\bigg)^{s} B(s+1, -1/\xi) & \text{for } s \in (-1, \infty), \ \xi < 0, \\ \frac{1}{\xi}\bigg(\frac{\sigma}{\xi}\bigg)^{s} B(s+1, 1/\xi - s) & \text{for } s \in (-1, 1/\xi), \ \xi > 0, \\ \sigma^{s} \Gamma(1+s) & \text{for } s \in (-1, \infty), \ \xi = 0, \end{cases}
where B(a,b) and \Gamma(a) denote the beta function and gamma function, respectively. The
expected value of Y \sim exGPD(\sigma, \xi) depends on both the scale \sigma and the shape \xi, with \xi entering through the digamma function:
: E[Y] = \begin{cases} \log\bigg(-\frac{\sigma}{\xi}\bigg) + \psi(1) - \psi(-1/\xi + 1) & \text{for } \xi < 0, \\ \log\bigg(\frac{\sigma}{\xi}\bigg) + \psi(1) - \psi(1/\xi) & \text{for } \xi > 0, \\ \log \sigma + \psi(1) & \text{for } \xi = 0. \end{cases}
Note that for a fixed value of \xi \in (-\infty, \infty), \log \sigma acts as the location parameter under the exponentiated generalized Pareto distribution. The
variance of Y \sim exGPD(\sigma, \xi) depends only on the shape parameter \xi, through the polygamma function of order 1 (also called the trigamma function):
: \operatorname{Var}[Y] = \begin{cases} \psi'(1) - \psi'(-1/\xi + 1) & \text{for } \xi < 0, \\ \psi'(1) + \psi'(1/\xi) & \text{for } \xi > 0, \\ \psi'(1) & \text{for } \xi = 0. \end{cases}
See the right panel for the variance as a function of \xi. Note that \psi'(1) = \pi^2/6 \approx 1.644934. Note also that the roles of the scale parameter \sigma and the shape parameter \xi under Y \sim exGPD(\sigma, \xi) are separately interpretable, which may lead to more robust and efficient estimation of \xi than working with X \sim GPD(\sigma, \xi). Under X \sim GPD(\mu = 0, \sigma, \xi) the roles of the two parameters are entangled (at least up to the second central moment); see the formula for the variance Var(X), in which both parameters appear.
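The mean and variance formulas for the exGPD can be evaluated with only the standard library if the digamma and trigamma functions are approximated. The sketch below uses the usual recurrence plus asymptotic-series approach for both; this approximation scheme and all function names are choices made here for illustration, not something prescribed by the text.

```python
import math


def digamma(x):
    """Approximate psi(x) for x > 0: shift x upward, then use the asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    while x < 10.0:
        total -= 1.0 / x                # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return total + math.log(x) - 0.5 / x - inv2 * (1/12 - inv2 * (1/120 - inv2 / 252))


def trigamma(x):
    """Approximate psi'(x) for x > 0: shift x upward, then use the asymptotic series."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    while x < 10.0:
        total += 1.0 / (x * x)          # psi'(x) = psi'(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return total + inv + inv2 / 2 + inv * inv2 * (1/6 - inv2 * (1/30 - inv2 / 42))


def exgpd_mean(sigma, xi):
    """E[Y] for Y ~ exGPD(sigma, xi)."""
    if xi < 0:
        return math.log(-sigma / xi) + digamma(1.0) - digamma(1.0 - 1.0 / xi)
    if xi > 0:
        return math.log(sigma / xi) + digamma(1.0) - digamma(1.0 / xi)
    return math.log(sigma) + digamma(1.0)


def exgpd_variance(xi):
    """Var[Y] for Y ~ exGPD(sigma, xi); it does not depend on sigma."""
    if xi < 0:
        return trigamma(1.0) - trigamma(1.0 - 1.0 / xi)
    if xi > 0:
        return trigamma(1.0) + trigamma(1.0 / xi)
    return trigamma(1.0)
```

A useful cross-check is the case \xi = -1, where X is uniform on (0, \sigma): then E[\log X] = \log \sigma - 1 and Var[\log X] = 1, which is what the formulas give because \psi(2) = \psi(1) + 1 and \psi'(2) = \psi'(1) - 1.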


The Hill's estimator

Assume that X_{1:n} = (X_1, \cdots, X_n) are n observations (not necessarily i.i.d.) from an unknown heavy-tailed distribution F whose tail distribution is regularly varying with tail index 1/\xi (hence, the corresponding shape parameter is \xi). Specifically, the tail distribution is described as
: \bar{F}(x) = 1 - F(x) = L(x) \cdot x^{-1/\xi}, \quad \text{for some } \xi > 0, \text{ where } L \text{ is a slowly varying function.}
It is of particular interest in
extreme value theory to estimate the shape parameter \xi, especially when \xi is positive (the so-called heavy-tailed case). Let F_u be the conditional excess distribution function over a threshold u. The Pickands–Balkema–de Haan theorem (Pickands, 1975; Balkema and de Haan, 1974) states that for a large class of underlying distribution functions F, and large u, F_u is well approximated by the generalized Pareto distribution (GPD). This motivated Peak Over Threshold (POT) methods for estimating \xi: ''the GPD plays the key role in the POT approach.'' A renowned estimator using the POT methodology is Hill's estimator. Its technical formulation is as follows. For 1 \leq i \leq n, write X_{(i)} for the i-th largest value of X_1, \cdots, X_n. Then, with this notation, Hill's estimator (see page 190 of Reference 5 by Embrechts et al.) based on the k upper order statistics is defined as
: \widehat{\xi}_{k}^{\text{Hill}} = \widehat{\xi}_{k}^{\text{Hill}}(X_{1:n}) = \frac{1}{k-1} \sum_{j=1}^{k-1} \log \bigg(\frac{X_{(j)}}{X_{(k)}}\bigg), \quad \text{for } 2 \leq k \leq n.
In practice, the Hill estimator is used as follows. First, calculate the estimator \widehat{\xi}_{k}^{\text{Hill}} at each integer k \in \{2, \cdots, n\}, and then plot the ordered pairs \{(k, \widehat{\xi}_{k}^{\text{Hill}})\}_{k=2}^{n}. Then, select from the set of Hill estimators \{\widehat{\xi}_{k}^{\text{Hill}}\}_{k=2}^{n} those that are roughly constant with respect to k: these stable values are regarded as reasonable estimates for the shape parameter \xi. If X_1, \cdots, X_n are i.i.d., then Hill's estimator is a consistent estimator for the shape parameter \xi.
Note that the Hill estimator \widehat{\xi}_{k}^{\text{Hill}} makes use of the log-transformation of the observations X_{1:n} = (X_1, \cdots, X_n). (Pickands's estimator \widehat{\xi}_{k}^{\text{Pickand}} also employs the log-transformation, but in a slightly different way.)
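The Hill estimator and the plotting procedure described above can be sketched as follows. The function names `hill_estimator` and `hill_curve`, the seed, and the sample sizes are illustrative assumptions; the demonstration data are exact Pareto draws with \xi = 0.5, generated by inversion.

```python
import math
import random


def hill_estimator(data, k):
    """Hill estimate of xi from the k upper order statistics, 2 <= k <= n."""
    n = len(data)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    desc = sorted(data, reverse=True)           # desc[i - 1] is X_(i), the i-th largest
    x_k = desc[k - 1]
    if x_k <= 0:
        raise ValueError("X_(k) must be positive for the log-ratio to be defined")
    return sum(math.log(desc[j] / x_k) for j in range(k - 1)) / (k - 1)


def hill_curve(data, k_max):
    """Pairs (k, Hill estimate) for k = 2..k_max, computed with a running log-sum."""
    desc = sorted(data, reverse=True)
    if not 2 <= k_max <= len(desc):
        raise ValueError("k_max must satisfy 2 <= k_max <= n")
    if desc[k_max - 1] <= 0:
        raise ValueError("the k_max largest observations must be positive")
    points = []
    log_sum = 0.0
    for k in range(2, k_max + 1):
        log_sum += math.log(desc[k - 2])        # add log X_(k-1) to the running sum
        points.append((k, log_sum / (k - 1) - math.log(desc[k - 1])))
    return points


# Demonstration on exact Pareto data with tail index 1/xi = 2 (xi = 0.5)
rng = random.Random(0)
sample = [(1.0 - rng.random()) ** (-0.5) for _ in range(20000)]
curve = hill_curve(sample, 2000)                # points to inspect for a stable region
```

Plotting the pairs in `curve` and looking for a range of k where the estimates are roughly flat corresponds to the selection step described in the text; how to choose that range is left to the analyst.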


See also

* Burr distribution
* Pareto distribution
* Generalized extreme value distribution
* Exponentiated generalized Pareto distribution
* Pickands–Balkema–de Haan theorem


References


Further reading

* Chapter 20, Section 12: Generalized Pareto Distributions.


External links


Mathworks: Generalized Pareto distribution