In probability and statistics, the generalized beta distribution [McDonald, James B. & Xu, Yexiao J. (1995) "A generalization of the beta distribution with applications," ''Journal of Econometrics'', 66(1–2), 133–152] is a continuous probability distribution with four shape parameters. It is customary to make the scale parameter explicit as a fifth parameter, while the location parameter is usually left implicit. The family includes more than thirty named distributions as limiting or special cases. It has been used in modeling income distribution and stock returns, as well as in regression analysis. The exponential generalized beta (EGB) distribution follows directly from the GB and generalizes other common distributions.


Definition

A generalized beta random variable, ''Y'', is defined by the following probability density function:
: GB(y;a,b,c,p,q) = \frac{|a| y^{ap-1} (1-(1-c)(y/b)^a)^{q-1}}{b^{ap} B(p,q) (1+c(y/b)^a)^{p+q}} \quad \quad \text{for } 0 < y^a < \frac{b^a}{1-c},
and zero otherwise. Here the parameters satisfy a \ne 0, 0 \le c \le 1, and b, p, and q positive. The function ''B''(''p,q'') is the beta function. The parameter b is the scale parameter and can thus be set to 1 without loss of generality, but it is usually made explicit as in the function above (while the location parameter is usually left implicit and set to 0 as in the function above).
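As a minimal sketch, the density can be evaluated on the log scale and checked numerically. The function names and the parameter values below are illustrative assumptions, not taken from the cited papers; only the Python standard library is used.

```python
import math


def log_beta(p, q):
    """Logarithm of the beta function B(p, q), computed via log-gamma."""
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def gb_pdf(y, a, b, c, p, q):
    """Density GB(y; a, b, c, p, q); returns 0.0 outside the support."""
    if y <= 0:
        return 0.0
    u = (y / b) ** a                      # u = (y/b)^a
    if (1.0 - c) * u >= 1.0:              # support: 0 < y^a < b^a / (1 - c)
        return 0.0
    log_f = (math.log(abs(a)) + (a * p - 1.0) * math.log(y)
             + (q - 1.0) * math.log1p(-(1.0 - c) * u)
             - a * p * math.log(b) - log_beta(p, q)
             - (p + q) * math.log1p(c * u))
    return math.exp(log_f)


def midpoint_integral(f, lo, hi, n=20000):
    """Composite midpoint rule; adequate for smooth integrands."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


# Illustrative parameters (assumed): with a = 2 the support is 0 < y < b / sqrt(1 - c).
a, b, c, p, q = 2.0, 1.5, 0.5, 2.0, 3.0
upper = b / math.sqrt(1.0 - c)
total = midpoint_integral(lambda y: gb_pdf(y, a, b, c, p, q), 0.0, upper)
print(total)  # numerical check that the density integrates to about 1
```

Working with logarithms avoids overflow in the beta function when p or q is large; the midpoint rule is only a convenience for this smooth example and would need replacing for parameter values that make the density unbounded at an endpoint.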


Properties


Moments

It can be shown that the ''h''th moment can be expressed as follows:
: \operatorname{E}_{GB}(Y^h) = \frac{b^h B(p+h/a,q)}{B(p,q)} \, {}_2F_1 \begin{bmatrix} p + h/a, h/a; c \\ p + q + h/a; \end{bmatrix},
where {}_2F_1 denotes the hypergeometric series (which converges for all ''h'' if ''c'' < 1, or for all ''h''/''a'' < ''q'' if ''c'' = 1).
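The moment expression, E(Y^h) = b^h B(p+h/a, q)/B(p, q) times the hypergeometric series with upper parameters p+h/a and h/a, lower parameter p+q+h/a and argument c, can be evaluated by summing the series directly when c < 1. The sketch below does that and compares the result with a plain quadrature of the same moment. All function names are hypothetical, a > 0 and p + h/a > 0 are assumed, and a library routine such as SciPy's `scipy.special.hyp2f1` could replace the hand-written series.

```python
import math


def hyp2f1_series(a1, a2, b1, x, tol=1e-14, max_terms=10000):
    """Gauss hypergeometric series 2F1(a1, a2; b1; x), summed term by term for |x| < 1."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a1 + n) * (a2 + n) / ((b1 + n) * (n + 1.0)) * x
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def beta_fn(p, q):
    """Beta function B(p, q) via log-gamma."""
    return math.exp(math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q))


def gb_moment(h, a, b, c, p, q):
    """h-th GB moment from the series formula (this sketch requires c < 1)."""
    s = h / a
    return (b ** h * beta_fn(p + s, q) / beta_fn(p, q)
            * hyp2f1_series(p + s, s, p + q + s, c))


def gb_moment_numeric(h, a, b, c, p, q, n=20000):
    """Same moment by midpoint quadrature in u = (y/b)^a (assumes a > 0, c < 1)."""
    s = h / a
    upper = 1.0 / (1.0 - c)
    du = upper / n
    acc = 0.0
    for k in range(n):
        u = (k + 0.5) * du
        acc += (u ** (p + s - 1.0) * (1.0 - (1.0 - c) * u) ** (q - 1.0)
                / (1.0 + c * u) ** (p + q))
    return b ** h * acc * du / beta_fn(p, q)


# Illustrative parameters (assumed, not from the source).
print(gb_moment(1.0, 2.0, 1.5, 0.5, 2.0, 3.0))
print(gb_moment_numeric(1.0, 2.0, 1.5, 0.5, 2.0, 3.0))
```

For c = 0 the series collapses to 1 and the function returns the ratio of beta functions alone.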


Related distributions

The generalized beta encompasses many distributions as limiting or special cases. These are depicted in the GB distribution tree shown above. Listed below are its three direct descendants, or sub-families.


Generalized beta of first kind (GB1)

The generalized beta of the first kind is defined by the following pdf:
: GB1(y;a,b,p,q) = \frac{|a| y^{ap-1} (1-(y/b)^a)^{q-1}}{b^{ap} B(p,q)}
for 0 < y^a < b^a, where b, p, and q are positive. It is easily verified that
: GB1(y;a,b,p,q) = GB(y;a,b,c=0,p,q).
The moments of the GB1 are given by
: \operatorname{E}_{GB1}(Y^h) = \frac{b^h B(p+h/a,q)}{B(p,q)}.
The GB1 includes the beta of the first kind (B1), generalized gamma (GG), and Pareto as special cases:
: B1(y;b,p,q) = GB1(y;a=1,b,p,q),
: GG(y;a,\beta,p) = \lim_{q \to \infty} GB1(y;a,b=q^{1/a}\beta,p,q),
: PARETO(y;b,p) = GB1(y;a=-1,b,p,q=1).


Generalized beta of the second kind (GB2)

The GB2 is defined by the following pdf:
: GB2(y;a,b,p,q) = \frac{|a| y^{ap-1}}{b^{ap} B(p,q) (1+(y/b)^a)^{p+q}}
for 0 < y < \infty and zero otherwise. One can verify that
: GB2(y;a,b,p,q) = GB(y;a,b,c=1,p,q).
The moments of the GB2 are given by
: \operatorname{E}_{GB2}(Y^h) = \frac{b^h B(p+h/a,q-h/a)}{B(p,q)}.
The GB2 is also known as the Generalized Beta Prime (Patil, Boswell, and Ratnaparkhi, 1984) [Patil, G.P., Boswell, M.T., and Ratnaparkhi, M.V., Dictionary and Classified Bibliography of Statistical Distributions in Scientific Work Series, editor G.P. Patil, International Co-operative Publishing House, Burtonsville, Maryland, 1984], the transformed beta (Venter, 1983) [Venter, G., Transformed beta and gamma distributions and aggregate losses, Proceedings of the Casualty Actuarial Society, 1983], the generalized F (Kalbfleisch and Prentice, 1980) [Kalbfleisch, J.D. and R.L. Prentice, The Statistical Analysis of Failure Time Data, New York: J. Wiley, 1980], and is a special case (μ≡0) of the Feller-Pareto distribution (Arnold, 1983) [Arnold, B.C., Pareto Distributions, Volume 5 in Statistical Distributions in Scientific Work Series, International Co-operative Publishing House, Burtonsville, Md. 1983]. The GB2 nests common distributions such as the generalized gamma (GG), Burr type 3, Burr type 12, Dagum, lognormal, Weibull, gamma, Lomax, F statistic, Fisk or Rayleigh, chi-square, half-normal, half-Student's t, exponential, asymmetric log-Laplace, log-Laplace, power function, and the log-logistic [McDonald, J.B. (1984) "Some generalized functions for the size distributions of income", ''Econometrica'' 52, 647–663].
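One way to get a feel for the GB2 is to simulate it. The sketch below relies on a standard construction that is not stated in the text above: if U follows a Beta(p, q) distribution, then b (U/(1-U))^{1/a} has the GB2 density with parameters (a, b, p, q). It compares the sample mean with the moment E(Y^h) = b^h B(p+h/a, q-h/a)/B(p, q) at h = 1. Function names, the seed, and the parameter values are illustrative assumptions.

```python
import math
import random


def sample_gb2(rng, a, b, p, q):
    """Draw one GB2(a, b, p, q) variate by transforming a Beta(p, q) variate."""
    u = rng.betavariate(p, q)                 # U ~ Beta(p, q)
    return b * (u / (1.0 - u)) ** (1.0 / a)   # U/(1-U) is beta prime; the power and scale give GB2


def gb2_moment(h, a, b, p, q):
    """h-th GB2 moment b^h B(p + h/a, q - h/a) / B(p, q); finite only for -p < h/a < q."""
    s = h / a
    log_ratio = (math.lgamma(p + s) + math.lgamma(q - s)
                 - math.lgamma(p) - math.lgamma(q))
    return b ** h * math.exp(log_ratio)


rng = random.Random(12345)                    # fixed seed so the run is repeatable
a, b, p, q = 3.0, 2.0, 2.0, 3.0               # illustrative parameters (assumed)
draws = [sample_gb2(rng, a, b, p, q) for _ in range(50000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean, gb2_moment(1.0, a, b, p, q))
```

The two printed numbers should agree up to Monte Carlo error; agreement deteriorates as h/a approaches q, where the moment ceases to exist.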


Beta

The beta family of distributions (B) is defined by:
: B(y;b,c,p,q) = \frac{y^{p-1} (1-(1-c)(y/b))^{q-1}}{b^{p} B(p,q) (1+c(y/b))^{p+q}}
for 0 < y < b/(1-c) and zero otherwise. Its relation to the GB is seen below:
: B(y;b,c,p,q) = GB(y;a=1,b,c,p,q).
The beta family includes the beta of the first and second kind (B1 and B2, where the B2 is also referred to as the beta prime), which correspond to ''c'' = 0 and ''c'' = 1, respectively. Setting c = 0, b = 1 yields the standard two-parameter beta distribution.


Generalized Gamma

The generalized gamma distribution (GG) is a limiting case of the GB2. Its pdf is defined by:
: GG(y;a,\beta,p) = \lim_{q \to \infty} GB2(y;a,b=q^{1/a}\beta,p,q) = \frac{|a| y^{ap-1} e^{-(y/\beta)^a}}{\beta^{ap} \Gamma(p)}
with the ''h''th moments given by
: \operatorname{E}(Y_{GG}^h) = \frac{\beta^h \Gamma(p+h/a)}{\Gamma(p)}.
As noted earlier, the GB distribution family tree visually depicts the special and limiting cases (see McDonald and Xu (1995)).
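The limit can be observed numerically. This sketch evaluates the GB2 density with b = q^{1/a} β for growing q and compares it with the generalized gamma density |a| y^{ap-1} exp(-(y/β)^a) / (β^{ap} Γ(p)). The function names and the evaluation point are illustrative assumptions.

```python
import math


def gb2_logpdf(y, a, b, p, q):
    """Log of the GB2 density, kept on the log scale so very large q does not overflow."""
    u = (y / b) ** a
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return (math.log(abs(a)) + (a * p - 1.0) * math.log(y)
            - a * p * math.log(b) - log_beta - (p + q) * math.log1p(u))


def gg_pdf(y, a, beta, p):
    """Generalized gamma density GG(y; a, beta, p)."""
    return (abs(a) * y ** (a * p - 1.0) * math.exp(-(y / beta) ** a)
            / (beta ** (a * p) * math.gamma(p)))


a, beta, p, y = 2.0, 1.5, 2.5, 1.2            # illustrative values (assumed)
for q in (1e1, 1e3, 1e5):
    b = q ** (1.0 / a) * beta                 # the substitution b = q^(1/a) * beta
    print(q, math.exp(gb2_logpdf(y, a, b, p, q)), gg_pdf(y, a, beta, p))
```

The gap between the two columns shrinks roughly in proportion to 1/q.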


Pareto

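The Pareto (PA) distribution is the following limiting case of the generalized gamma:
: PA(y;\beta,\theta) = \lim_{a \to -\infty} GG(y;a,\beta,p=-\theta/a) = \lim_{a \to -\infty}\left(\frac{\theta y^{-\theta-1} e^{-(y/\beta)^a}}{\beta^{-\theta} (-\theta/a) \Gamma(-\theta/a)}\right)
: = \lim_{a \to -\infty}\left(\frac{\theta y^{-\theta-1} e^{-(y/\beta)^a}}{\beta^{-\theta} \Gamma(1-\theta/a)}\right) = \frac{\theta \beta^\theta}{y^{\theta+1}}
for \beta < y and 0 otherwise.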


Power

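The power (P) distribution is the following limiting case of the generalized gamma:
: P(y;\beta,\theta) = \lim_{p \to 0} GG(y;a=\theta/p,\beta,p) = \lim_{p \to 0}\frac{\left|\frac{\theta}{p}\right| y^{\theta-1}}{\beta^\theta \Gamma(p)} e^{-(y/\beta)^{\theta/p}} = \lim_{p \to 0}\frac{\theta y^{\theta-1}}{p \Gamma(p) \beta^\theta} e^{-(y/\beta)^{\theta/p}}
: = \lim_{p \to 0}\frac{\theta y^{\theta-1}}{\Gamma(p+1) \beta^\theta} e^{-(y/\beta)^{\theta/p}} = \lim_{p \to 0}\frac{\theta y^{\theta-1}}{\beta^\theta} e^{-(y/\beta)^{\theta/p}} = \frac{\theta y^{\theta-1}}{\beta^\theta},
which is equivalent to the power function distribution for 0 \leq y \leq \beta and \theta > 0.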


Asymmetric Log-Laplace

The asymmetric log-Laplace distribution (also referred to as the double Pareto distribution) is defined by:
: ALL(y;b,\lambda_1,\lambda_2) = \lim_{a \to \infty} GB2(y;a,b,p=\lambda_1/a,q=\lambda_2/a) = \frac{\lambda_1 \lambda_2}{y(\lambda_1+\lambda_2)} \begin{cases} (\frac{y}{b})^{\lambda_1} & \mbox{for } 0 < y < b \\ (\frac{b}{y})^{\lambda_2} & \mbox{for } y \ge b \end{cases}
where the ''h''th moments are given by
: \operatorname{E}(Y_{ALL}^h) = \frac{b^h \lambda_1 \lambda_2}{(\lambda_1+h)(\lambda_2-h)}.
When \lambda_1 = \lambda_2, this is equivalent to the log-Laplace distribution.
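As a quick consistency check, assuming \lambda_1 and \lambda_2 are both positive, the two branches of the asymmetric log-Laplace density \lambda_1\lambda_2/(y(\lambda_1+\lambda_2)) times (y/b)^{\lambda_1} below b, or (b/y)^{\lambda_2} above b, carry complementary probability mass:

```latex
\int_0^{b} \frac{\lambda_1\lambda_2}{y(\lambda_1+\lambda_2)}
    \left(\frac{y}{b}\right)^{\lambda_1} dy
  = \frac{\lambda_2}{\lambda_1+\lambda_2},
\qquad
\int_b^{\infty} \frac{\lambda_1\lambda_2}{y(\lambda_1+\lambda_2)}
    \left(\frac{b}{y}\right)^{\lambda_2} dy
  = \frac{\lambda_1}{\lambda_1+\lambda_2},
```

so the density integrates to one, and b splits the distribution into a lower part with mass \lambda_2/(\lambda_1+\lambda_2) and an upper part with mass \lambda_1/(\lambda_1+\lambda_2). When \lambda_1 = \lambda_2 the two masses are equal and b is the median.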


Exponential generalized beta distribution

Letting Y \sim GB(y;a,b,c,p,q) (without location parameter), the random variable Z = \ln(Y), with re-parametrization \delta = \ln(b) and \sigma = 1/a, is distributed as an exponential generalized beta (EGB), with the following pdf:
: EGB(z;\delta,\sigma,c,p,q) = \frac{e^{p(z-\delta)/\sigma} (1-(1-c)e^{(z-\delta)/\sigma})^{q-1}}{|\sigma| B(p,q) (1+ce^{(z-\delta)/\sigma})^{p+q}}
for -\infty < \frac{z-\delta}{\sigma} < \ln(\frac{1}{1-c}), and zero otherwise. The EGB includes generalizations of the Gompertz, Gumbel, extreme value type I, logistic, Burr-2, exponential, and normal distributions. The parameter \delta = \ln(b) is the location parameter of the EGB (while b is the scale parameter of the GB), and \sigma = 1/a is the scale parameter of the EGB (while a is a shape parameter of the GB); the EGB thus has three shape parameters. Included is a figure showing the relationship between the EGB and its special and limiting cases.
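Because Z = ln(Y), the EGB density must equal the GB density evaluated at y = e^z multiplied by the Jacobian e^z, with a = 1/σ and b = e^δ. The sketch below codes both sides and compares them at one point; the function names and test values are illustrative assumptions.

```python
import math


def log_beta(p, q):
    return math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)


def gb_pdf(y, a, b, c, p, q):
    """Generalized beta density GB(y; a, b, c, p, q)."""
    if y <= 0:
        return 0.0
    u = (y / b) ** a
    if (1.0 - c) * u >= 1.0:
        return 0.0
    return math.exp(math.log(abs(a)) + (a * p - 1.0) * math.log(y)
                    + (q - 1.0) * math.log1p(-(1.0 - c) * u)
                    - a * p * math.log(b) - log_beta(p, q)
                    - (p + q) * math.log1p(c * u))


def egb_pdf(z, delta, sigma, c, p, q):
    """EGB density written directly in terms of w = exp((z - delta) / sigma)."""
    w = math.exp((z - delta) / sigma)
    if (1.0 - c) * w >= 1.0:                  # outside (z - delta)/sigma < ln(1/(1 - c))
        return 0.0
    return (w ** p * (1.0 - (1.0 - c) * w) ** (q - 1.0)
            / (abs(sigma) * math.exp(log_beta(p, q)) * (1.0 + c * w) ** (p + q)))


# Illustrative values (assumed).
z, delta, sigma, c, p, q = 0.1, 0.3, 0.5, 0.4, 1.5, 2.5
direct = egb_pdf(z, delta, sigma, c, p, q)
via_gb = gb_pdf(math.exp(z), 1.0 / sigma, math.exp(delta), c, p, q) * math.exp(z)
print(direct, via_gb)  # the two values should agree to rounding error
```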


Moment generating function

Using similar notation as above, the moment-generating function of the EGB can be expressed as follows:
: M_{EGB}(Z) = \frac{e^{\delta t} B(p+t\sigma,q)}{B(p,q)} \, {}_2F_1 \begin{bmatrix} p + t\sigma, t\sigma; c \\ p + q + t\sigma; \end{bmatrix}.
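The connection to the GB moments can be made explicit. Since Z = \ln(Y), the moment-generating function of Z at t is the t-th moment of Y, so the expression follows from the GB moment formula by substituting h = t together with the re-parametrization \delta = \ln(b), \sigma = 1/a:

```latex
M_Z(t) = \operatorname{E}\left(e^{tZ}\right)
       = \operatorname{E}\left(Y^{t}\right)
       = \left.\operatorname{E}_{GB}\left(Y^{h}\right)\right|_{h=t},
\qquad
b^{h} = e^{\delta t}, \quad \frac{h}{a} = t\sigma .
```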


Multivariate generalized beta distribution

A multivariate generalized beta pdf extends the univariate distributions listed above. For n variables y = (y_1, ..., y_n), define 1 × n parameter vectors by a = (a_1, ..., a_n), b = (b_1, ..., b_n), c = (c_1, ..., c_n), and p = (p_1, ..., p_n), where each b_i and p_i is positive, and 0 \le c_i \le 1. The parameter q is assumed to be positive, and define the function
: B(p_1, ..., p_n, q) = \frac{\Gamma(p_1) \cdots \Gamma(p_n) \Gamma(q)}{\Gamma(\bar{p}+q)} for \bar{p} = \sum_{i=1}^{n} p_i.
The pdf of the multivariate generalized beta (MGB) may be written as follows:
: MGB(y; a, b, p, q, c) = \frac{(\prod_{i=1}^{n} |a_i| y_i^{a_i p_i - 1}) (1 - \sum_{i=1}^{n} (1-c_i)(\frac{y_i}{b_i})^{a_i})^{q-1}}{(\prod_{i=1}^{n} b_i^{a_i p_i}) B(p_1, ..., p_n, q) (1 + \sum_{i=1}^{n} c_i (\frac{y_i}{b_i})^{a_i})^{\bar{p}+q}}
where 0 < \sum_{i=1}^{n} (1-c_i)(\frac{y_i}{b_i})^{a_i} < 1 for 0 \le c_i < 1 and 0 < y_i when c_i = 1. Like the univariate generalized beta distribution, the multivariate generalized beta includes several distributions in its family as special cases. By imposing certain constraints on the parameter vectors, the following distributions can be easily derived [William M. Cockriel & James B. McDonald (2017): Two multivariate generalized beta families, Communications in Statistics - Theory and Methods].
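A minimal sketch of the MGB density follows, assuming the form in which the numerator is the product of |a_i| y_i^{a_i p_i - 1} times (1 - Σ(1-c_i)(y_i/b_i)^{a_i})^{q-1} and the denominator is the product of b_i^{a_i p_i} times B(p_1, ..., p_n, q) times (1 + Σ c_i (y_i/b_i)^{a_i})^{p̄+q}. The function name, the argument order, and the example values are hypothetical. With n = 1 the function should reduce to the univariate GB density, which the usage lines check.

```python
import math


def mgb_pdf(y, a, b, c, p, q):
    """MGB density for equal-length sequences y, a, b, c, p and scalar q."""
    if any(yi <= 0 for yi in y):
        return 0.0
    u = [(yi / bi) ** ai for yi, ai, bi in zip(y, a, b)]    # (y_i / b_i)^{a_i}
    s1 = sum((1.0 - ci) * ui for ci, ui in zip(c, u))       # enters the numerator
    s2 = sum(ci * ui for ci, ui in zip(c, u))               # enters the denominator
    if s1 >= 1.0:                                           # outside the support
        return 0.0
    p_bar = sum(p)
    # log B(p_1, ..., p_n, q) = sum_i log Gamma(p_i) + log Gamma(q) - log Gamma(p_bar + q)
    log_b = sum(math.lgamma(pi) for pi in p) + math.lgamma(q) - math.lgamma(p_bar + q)
    log_f = sum(math.log(abs(ai)) + (ai * pi - 1.0) * math.log(yi)
                - ai * pi * math.log(bi)
                for yi, ai, bi, pi in zip(y, a, b, p))
    log_f += (q - 1.0) * math.log1p(-s1) - log_b - (p_bar + q) * math.log1p(s2)
    return math.exp(log_f)


# Usage: with one variable the MGB should match the univariate GB formula.
y1, a1, b1, c1, p1, q1 = 1.0, 2.0, 1.5, 0.5, 2.0, 3.0      # illustrative values (assumed)
u1 = (y1 / b1) ** a1
beta_pq = math.exp(math.lgamma(p1) + math.lgamma(q1) - math.lgamma(p1 + q1))
gb_value = (abs(a1) * y1 ** (a1 * p1 - 1.0) * (1.0 - (1.0 - c1) * u1) ** (q1 - 1.0)
            / (b1 ** (a1 * p1) * beta_pq * (1.0 + c1 * u1) ** (p1 + q1)))
print(mgb_pdf([y1], [a1], [b1], [c1], [p1], q1), gb_value)
```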


Multivariate generalized beta of the first kind (MGB1)

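When each c_i is equal to 0, the MGB function simplifies to the multivariate generalized beta of the first kind (MGB1), which is defined by:
: MGB1(y; a, b, p, q) = \frac{(\prod_{i=1}^{n} |a_i| y_i^{a_i p_i - 1}) (1 - \sum_{i=1}^{n} (\frac{y_i}{b_i})^{a_i})^{q-1}}{(\prod_{i=1}^{n} b_i^{a_i p_i}) B(p_1, ..., p_n, q)}
where 0 < \sum_{i=1}^{n} (\frac{y_i}{b_i})^{a_i} < 1.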


Multivariate generalized beta of the second kind (MGB2)

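In the case where each c_i is equal to 1, the MGB simplifies to the multivariate generalized beta of the second kind (MGB2), with the pdf defined below:
: MGB2(y; a, b, p, q) = \frac{\prod_{i=1}^{n} |a_i| y_i^{a_i p_i - 1}}{(\prod_{i=1}^{n} b_i^{a_i p_i}) B(p_1, ..., p_n, q) (1 + \sum_{i=1}^{n} (\frac{y_i}{b_i})^{a_i})^{\bar{p}+q}}
when 0 < y_i for all y_i.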


Multivariate generalized gamma

The multivariate generalized gamma (MGG) pdf can be derived from the MGB pdf by substituting b_i = \beta_i q^{1/a_i} and taking the limit as q \to \infty, with Stirling's approximation for the gamma function, yielding the following function:
: MGG(y; a, \beta, p) = \left(\prod_{i=1}^{n} \frac{|a_i| y_i^{a_i p_i - 1}}{\beta_i^{a_i p_i} \Gamma(p_i)}\right) e^{-\sum_{i=1}^{n} (\frac{y_i}{\beta_i})^{a_i}} = \prod_{i=1}^{n} GG(y_i; a_i, \beta_i, p_i)
which is the joint density of independently, but not necessarily identically, distributed generalized gamma random variables.


Other multivariate distributions

Similar pdfs can be constructed for other variables in the family tree shown above, simply by placing an M in front of each pdf name and finding the appropriate limiting and special cases of the MGB as indicated by the constraints and limits of the univariate distribution. Additional multivariate pdfs in the literature include the Dirichlet distribution (standard form) given by MGB1(y; a = 1, b = 1, p, q), the multivariate inverted beta and inverted Dirichlet (Dirichlet type 2) distribution given by MGB2(y; a = 1, b = 1, p, q), and the multivariate Burr distribution given by MGB2(y; a, b, p, q = 1).


Marginal density functions

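The marginal density functions of the MGB1 and MGB2, respectively, are the generalized beta distributions of the first and second kind, and are given as follows:
: GB1(y_i; a_i, b_i, p_i, \bar{p} - p_i + q) = \frac{|a_i| y_i^{a_i p_i - 1} (1 - (\frac{y_i}{b_i})^{a_i})^{\bar{p} - p_i + q - 1}}{b_i^{a_i p_i} B(p_i, \bar{p} - p_i + q)}
: GB2(y_i; a_i, b_i, p_i, q) = \frac{|a_i| y_i^{a_i p_i - 1}}{b_i^{a_i p_i} B(p_i, q) (1 + (\frac{y_i}{b_i})^{a_i})^{p_i + q}}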


Applications

The flexibility provided by the GB family is used in modeling:
* the distribution of income
* hazard functions
* stock returns
* insurance losses
Applications involving members of the EGB family include:
* partially adaptive estimation of regression models
* time series models
* (G)ARCH models


Distribution of Income

The GB2 and several of its special and limiting cases have been widely used as models for the distribution of income. For some early examples see Thurow (1970) [Thurow, L.C. (1970) "Analyzing the American Income Distribution," ''Papers and Proceedings, American Economics Association'', 60, 261-269], Dagum (1977) [Dagum, C. (1977) "A New Model for Personal Income Distribution: Specification and Estimation," ''Économie Appliquée'', 30, 413-437], Singh and Maddala (1976), and McDonald (1984). Maximum likelihood estimation using individual, grouped, or top-coded data is straightforward with these distributions. Measures of inequality, such as the Gini index (G), Pietra index (P), and Theil index (T), can be expressed in terms of the distributional parameters, as given by McDonald and Ransom (2008) [McDonald, J.B. and Ransom, M. (2008) "The Generalized Beta Distribution as a Model for the Distribution of Income: Estimation of Related Measures of Inequality", ''Modeling the Distributions and Lorenz Curves'', "Economic Studies in Inequality: Social Exclusion and Well-Being", Springer: New York, editor Jacques Silber, 5, 147-166]:
: \begin{align} G &= \left(\frac{1}{2\mu}\right) \operatorname{E}(|Y-X|) = \left(\frac{1}{2\mu}\right) \int_0^\infty \int_0^\infty |x-y| f(x) f(y) \, dx \, dy \\ &= 1 - \frac{\int_0^\infty (1-F(y))^2 \, dy}{\int_0^\infty (1-F(y)) \, dy} \\ P &= \left(\frac{1}{2\mu}\right) \operatorname{E}(|Y-\mu|) = \left(\frac{1}{2\mu}\right) \int_0^\infty |y-\mu| f(y) \, dy \\ T &= \operatorname{E}(\ln (Y/\mu)^{Y/\mu}) = \int_0^\infty (y/\mu) \ln (y/\mu) f(y) \, dy \end{align}
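To make the three measures concrete, the sketch below evaluates them by numerical quadrature for the exponential distribution, one of the members of the family, using G = 1 - ∫(1-F)² dy / ∫(1-F) dy, P = E|Y-μ|/(2μ), and T = E[(Y/μ) ln(Y/μ)]. The helper names, the scale value, and the truncation point of the integrals are assumptions made for illustration. For the exponential these quantities have known closed forms (1/2, 1/e, and one minus the Euler–Mascheroni constant), which gives something to compare against.

```python
import math


def midpoint_integral(f, lo, hi, n=200000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))


scale = 2.0                                   # exponential scale (illustrative)
upper = 60.0 * scale                          # truncation point; the tail beyond is negligible


def pdf(y):
    return math.exp(-y / scale) / scale


def survival(y):
    return math.exp(-y / scale)               # 1 - F(y)


mu = midpoint_integral(survival, 0.0, upper)  # mean as the integral of the survival function
gini = 1.0 - midpoint_integral(lambda y: survival(y) ** 2, 0.0, upper) / mu
pietra = midpoint_integral(lambda y: abs(y - mu) * pdf(y), 0.0, upper) / (2.0 * mu)
theil = midpoint_integral(lambda y: (y / mu) * math.log(y / mu) * pdf(y), 0.0, upper)
print(gini, pietra, theil)
```

All three measures are scale free, so changing `scale` should leave the printed values unchanged up to quadrature error.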


Hazard Functions

The hazard function, h(s), where f(s) is a pdf and F(s) the corresponding cdf, is defined by
: h(s) = \frac{f(s)}{1-F(s)}
Hazard functions are useful in many applications, such as modeling unemployment duration, the failure time of products, or life expectancy. Taking a specific example, if s denotes the length of life, then h(s) is the rate of death at age s, given that an individual has lived up to age s. The shape of the hazard function for human mortality data might appear as follows: decreasing mortality in the first few months of life, then a period of relatively constant mortality, and finally an increasing probability of death at older ages. Special cases of the generalized beta distribution offer more flexibility in modeling the shape of the hazard function, which can call for "∪" or "∩" shapes or strictly increasing (denoted by I) or decreasing (denoted by D) lines. The generalized gamma is "∪"-shaped for a>1 and p<1/a, "∩"-shaped for a<1 and p>1/a, I-shaped for a>1 and p>1/a, and D-shaped for a<1 and p<1/a. This is summarized in the figure below. [McDonald, James B. (1987) "A general methodology for determining distributional forms with applications in reliability," ''Journal of Statistical Planning and Inference'', 16, 365-376; McDonald, J.B. and Richards, D.O. (1987) "Hazard Functions and Generalized Beta Distributions", ''IEEE Transactions on Reliability'', 36, 463-466]
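A simple case where the hazard is available in closed form is the Weibull distribution, which is the generalized gamma with p = 1 (assuming a > 0). Its survival function is exp(-(s/β)^a), so the ratio f(s)/(1-F(s)) can be computed directly. The sketch below, with hypothetical names and an arbitrary grid, shows a decreasing hazard for a < 1, a constant one for a = 1, and an increasing one for a > 1.

```python
import math


def weibull_hazard(s, a, beta):
    """h(s) = f(s) / (1 - F(s)) for the Weibull, i.e. the generalized gamma with p = 1."""
    pdf = (a / beta) * (s / beta) ** (a - 1.0) * math.exp(-(s / beta) ** a)
    survival = math.exp(-(s / beta) ** a)     # 1 - F(s)
    return pdf / survival


grid = [0.25 * k for k in range(1, 13)]       # s = 0.25, 0.50, ..., 3.00
for a in (0.5, 1.0, 2.0):
    print(a, [round(weibull_hazard(s, a, 1.0), 3) for s in grid])
```

Non-monotone ("∪" or "∩") hazards need p different from 1, where the generalized gamma cdf involves the incomplete gamma function and is not covered by this sketch.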


References


Bibliography

* Kleiber, C. and S. Kotz (2003) ''Statistical Size Distributions in Economics and Actuarial Sciences''. New York: Wiley
* Johnson, N. L., S. Kotz, and N. Balakrishnan (1994) ''Continuous Univariate Distributions''. Vol. 2, Hoboken, NJ: Wiley-Interscience.