Logistic-beta Distribution

The term generalized logistic distribution is used as the name for several different families of probability distributions. For example, Johnson et al. (Johnson, N.L., Kotz, S., Balakrishnan, N. (1995) ''Continuous Univariate Distributions, Volume 2'', Wiley, pages 140–142) list four forms, which are listed below. Type I has also been called the skew-logistic distribution.

Type IV subsumes the other types and is obtained when applying the logit transform to beta random variates. Following the same convention as for the log-normal distribution, type IV may be referred to as the logistic-beta distribution, with reference to the standard logistic function, which is the inverse of the logit transform.

For other families of distributions that have also been called generalized logistic distributions, see the shifted log-logistic distribution, which is a generalization of the log-logistic distribution; and the metalog ("meta-logistic") distribution, which is highly shape-and-bounds flexible and can be fit to data with linear least squares.


Definitions

The following definitions are for standardized versions of the families, which can be expanded to the full form as a location-scale family. Each is defined using either the cumulative distribution function (''F'') or the probability density function (''ƒ''), and is defined on (−∞,∞).


Type I

:F(x;\alpha)=\frac{1}{(1+e^{-x})^\alpha} \equiv (1+e^{-x})^{-\alpha}, \quad \alpha > 0 .

The corresponding probability density function is:

:f(x;\alpha)=\frac{\alpha e^{-x}}{(1+e^{-x})^{\alpha+1}}, \quad \alpha > 0 .

This type has also been called the "skew-logistic" distribution.


Type II

:F(x;\alpha)=1-\frac{e^{-\alpha x}}{(1+e^{-x})^\alpha}, \quad \alpha > 0 .

The corresponding probability density function is:

:f(x;\alpha)=\frac{\alpha e^{-\alpha x}}{(1+e^{-x})^{\alpha+1}}, \quad \alpha > 0 .


Type III

:f(x;\alpha)=\frac{1}{B(\alpha,\alpha)}\frac{e^{-\alpha x}}{(1+e^{-x})^{2\alpha}}, \quad \alpha > 0 .

Here ''B'' is the beta function. The moment generating function for this type is

:M(t)=\frac{\Gamma(\alpha+t)\,\Gamma(\alpha-t)}{(\Gamma(\alpha))^2}, \quad -\alpha < t < \alpha .

The corresponding cumulative distribution function is:

:F(x;\alpha)= I_{\sigma(x)}(\alpha,\alpha), \quad \alpha > 0 ,

where I_z(\cdot,\cdot) denotes the regularized incomplete beta function and \sigma(x)=1/(1+e^{-x}) is the standard logistic function.


Type IV

: \begin{align} f(x;\alpha,\beta)&=\frac{1}{B(\alpha,\beta)}\frac{e^{-\beta x}}{(1+e^{-x})^{\alpha+\beta}}, \quad \alpha,\beta > 0 \\ &= \frac{\sigma(x)^\alpha\,\sigma(-x)^\beta}{B(\alpha,\beta)} , \end{align}

where ''B'' is the beta function and \sigma(x)=1/(1+e^{-x}) is the standard logistic function. The moment generating function for this type is

:M(t)=\frac{\Gamma(\alpha+t)\,\Gamma(\beta-t)}{\Gamma(\alpha)\,\Gamma(\beta)}, \quad -\alpha < t < \beta .

This type is also called the "exponential generalized beta of the second type". The corresponding cumulative distribution function is:

:F(x;\alpha,\beta)= I_{\sigma(x)}(\alpha,\beta), \quad \alpha,\beta > 0 ,

where I_z(\alpha,\beta) is the regularized incomplete beta function.
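
The density and distribution function above can be evaluated directly from the logistic and beta functions. The following is a minimal sketch in Python/SciPy (the helper names `type4_pdf` and `type4_cdf` are illustrative, not from any standard library):

```python
import numpy as np
from scipy.special import expit, betaln, betainc  # expit is the standard logistic function

def type4_pdf(x, alpha, beta):
    # f(x) = sigma(x)^alpha * sigma(-x)^beta / B(alpha, beta), evaluated in log space
    log_pdf = alpha * np.log(expit(x)) + beta * np.log(expit(-x)) - betaln(alpha, beta)
    return np.exp(log_pdf)

def type4_cdf(x, alpha, beta):
    # F(x) = I_{sigma(x)}(alpha, beta), the regularized incomplete beta function
    return betainc(alpha, beta, expit(x))

x = np.linspace(-10.0, 10.0, 2001)
p = type4_pdf(x, 2.0, 3.0)
print(p.sum() * (x[1] - x[0]))      # ~1: the density integrates to one
print(type4_cdf(0.0, 1.0, 1.0))     # 0.5: alpha = beta = 1 is the standard logistic
```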


Relationship between types

Type IV is the most general form of the distribution. The Type III distribution can be obtained from Type IV by fixing \beta = \alpha. The Type II distribution can be obtained from Type IV by fixing \alpha = 1 (and renaming \beta to \alpha). The Type I distribution can be obtained from Type IV by fixing \beta = 1. Fixing \alpha=\beta=1 gives the standard logistic distribution.
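
These reductions can be checked numerically; a brief sketch (reusing the illustrative `type4_cdf` helper defined above):

```python
import numpy as np
from scipy.special import expit, betainc

def type4_cdf(x, alpha, beta):
    return betainc(alpha, beta, expit(x))

x, alpha = 1.3, 2.5
print(type4_cdf(x, alpha, 1.0), (1 + np.exp(-x)) ** -alpha)  # beta = 1 recovers the Type I CDF
print(type4_cdf(x, 1.0, 1.0), expit(x))                      # alpha = beta = 1 recovers the standard logistic CDF
```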


Type IV (logistic-beta) properties

The Type IV generalized logistic, or logistic-beta distribution, with support x\in\mathbb{R} and shape parameters \alpha,\beta>0, has (as shown above) the probability density function (pdf):

: f(x;\alpha,\beta)= \frac{1}{B(\alpha,\beta)}\frac{e^{-\beta x}}{(1+e^{-x})^{\alpha+\beta}} = \frac{\sigma(x)^\alpha\,\sigma(-x)^\beta}{B(\alpha,\beta)},

where \sigma(x)=1/(1+e^{-x}) is the standard logistic function. The probability density functions for three different sets of shape parameters are shown in the plot, where the distributions have been scaled and shifted to give zero means and unit variances, in order to facilitate comparison of the shapes.

In what follows, the notation B_\sigma(\alpha,\beta) is used to denote the Type IV distribution.


Relationship with Gamma Distribution

This distribution can be obtained in terms of the gamma distribution as follows. Let y\sim\text{Gamma}(\alpha,\gamma) and, independently, z\sim\text{Gamma}(\beta,\gamma), and let x=\ln y - \ln z. Then x\sim B_\sigma(\alpha,\beta).
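
A Monte Carlo check of this construction (an illustrative sketch; the seed and sample size are arbitrary):

```python
import numpy as np
from scipy.special import expit, betainc

rng = np.random.default_rng(0)
alpha, beta, scale = 2.5, 1.5, 3.0              # the common scale cancels in the log-ratio
y = rng.gamma(alpha, scale, size=200_000)
z = rng.gamma(beta, scale, size=200_000)
x = np.log(y) - np.log(z)

for t in (-1.0, 0.0, 1.0):
    # empirical CDF vs. the closed-form Type IV CDF I_{sigma(t)}(alpha, beta)
    print((x <= t).mean(), betainc(alpha, beta, expit(t)))
```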


Symmetry

If x\sim B_\sigma(\alpha,\beta), then -x\sim B_\sigma(\beta,\alpha).


Mean and variance

By using the logarithmic expectations of the gamma distribution, the mean and variance can be derived as:

: \begin{align} \text{mean} &= \psi(\alpha) - \psi(\beta) \\ \text{variance} &= \psi'(\alpha) + \psi'(\beta) \end{align}

where \psi is the digamma function, while \psi'=\psi^{(1)} is its first derivative, also known as the trigamma function, or the first polygamma function.

Since \psi is strictly increasing, the sign of the mean is the same as the sign of \alpha-\beta. Since \psi' is strictly decreasing, the shape parameters can also be interpreted as concentration parameters. Indeed, as shown below, the left and right tails respectively become thinner as \alpha or \beta is increased. The two terms of the variance represent the contributions to the variance of the left and right parts of the distribution.


Cumulants and skewness

The cumulant generating function is K(t)=\ln M(t), where the moment generating function M(t) is given above. The ''cumulants'', \kappa_n, are the n-th derivatives of K(t), evaluated at t=0:

: \kappa_n = K^{(n)}(0) = \psi^{(n-1)}(\alpha) + (-1)^n\, \psi^{(n-1)}(\beta)

where \psi^{(0)}=\psi and \psi^{(n\ge 1)} are the digamma and polygamma functions. In agreement with the derivation above, the first cumulant, \kappa_1, is the mean and the second, \kappa_2, is the variance.

The third cumulant, \kappa_3, is the third central moment E\left[(x-E[x])^3\right], which when scaled by the third power of the standard deviation gives the skewness:

: \text{skewness} = \frac{\psi''(\alpha) - \psi''(\beta)}{\left(\psi'(\alpha)+\psi'(\beta)\right)^{3/2}}

The sign (and therefore the handedness) of the skewness is the same as the sign of \alpha-\beta.
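
The cumulant and skewness formulas translate directly into code; a brief sketch (the function names are illustrative):

```python
from scipy.special import polygamma

def type4_cumulant(n, alpha, beta):
    # kappa_n = psi^(n-1)(alpha) + (-1)^n psi^(n-1)(beta); polygamma(0, .) is the digamma function
    return polygamma(n - 1, alpha) + (-1) ** n * polygamma(n - 1, beta)

def type4_skewness(alpha, beta):
    return type4_cumulant(3, alpha, beta) / type4_cumulant(2, alpha, beta) ** 1.5

print(type4_skewness(3.0, 0.7))   # positive, since alpha > beta
print(type4_skewness(0.7, 3.0))   # mirrored parameters give the opposite sign
```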


Mode

The mode (pdf maximum) can be derived by finding x where the log-pdf derivative is zero:

: \frac{d}{dx}\ln f(x;\alpha,\beta) = \alpha\sigma(-x) -\beta\sigma(x) = 0 .

This simplifies to \alpha/\beta=e^x, so that:

: \text{mode} = \ln\frac{\alpha}{\beta}
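
The mode formula can be confirmed by numerically maximizing the log pdf (a sketch; the choice of optimizer is arbitrary):

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import expit, betaln

def neg_log_pdf(x, alpha, beta):
    return -(alpha * np.log(expit(x)) + beta * np.log(expit(-x)) - betaln(alpha, beta))

alpha, beta = 4.0, 1.5
res = minimize_scalar(neg_log_pdf, args=(alpha, beta))
print(res.x, np.log(alpha / beta))   # numerical maximizer vs. ln(alpha / beta)
```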


Tail behaviour

In each of the left and right tails, one of the sigmoids in the pdf saturates to one, so that the tail is formed by the other sigmoid. For large negative x, the left tail of the pdf is proportional to \sigma(x)^\alpha\approx e^{\alpha x}, while the right tail (large positive x) is proportional to \sigma(-x)^\beta\approx e^{-\beta x}. This means the tails are independently controlled by \alpha and \beta. Although the type IV tails are heavier than those of the normal distribution (e^{-x^2/2v}, for variance v), the type IV means and variances remain finite for all \alpha,\beta>0. This is in contrast with the Cauchy distribution, for which the mean and variance do not exist. In the log-pdf plots shown here, the type IV tails are linear, the normal distribution tails are quadratic and the Cauchy tails are logarithmic.


Exponential family properties

B_\sigma(\alpha,\beta) forms an exponential family with natural parameters \alpha and \beta and sufficient statistics \log\sigma(x) and \log\sigma(-x). The expected values of the sufficient statistics can be found by differentiation of the log-normalizer (C.M. Bishop, ''Pattern Recognition and Machine Learning'', Springer, 2006):

: \begin{align} E[\log\sigma(x)] &= \frac{\partial \log B(\alpha,\beta)}{\partial\alpha} = \psi(\alpha) - \psi(\alpha+\beta) \\ E[\log\sigma(-x)] &= \frac{\partial \log B(\alpha,\beta)}{\partial\beta} = \psi(\beta) - \psi(\alpha+\beta) \end{align}

Given a data set x_1,\ldots,x_n assumed to have been generated IID from B_\sigma(\alpha,\beta), the maximum-likelihood parameter estimate is:

: \begin{align} \hat\alpha,\hat\beta = \arg\max_{\alpha,\beta} &\;\frac1n\sum_{i=1}^n \log f(x_i;\alpha,\beta) \\ =\arg\max_{\alpha,\beta} &\;\alpha\Bigl(\frac1n\sum_i\log\sigma(x_i)\Bigr) + \beta\Bigl(\frac1n\sum_i\log\sigma(-x_i)\Bigr) -\log B(\alpha,\beta)\\ =\arg\max_{\alpha,\beta}&\;\alpha\,\overline{\log\sigma(x)} + \beta\,\overline{\log\sigma(-x)} -\log B(\alpha,\beta) \end{align}

where the overlines denote the averages of the sufficient statistics. The maximum-likelihood estimate depends on the data only via these average statistics. Indeed, at the maximum-likelihood estimate the expected values and averages agree:

: \begin{align} \psi(\hat\alpha) - \psi(\hat\alpha+\hat\beta) &= \overline{\log\sigma(x)} \\ \psi(\hat\beta) - \psi(\hat\alpha+\hat\beta) &= \overline{\log\sigma(-x)} \end{align}

which is also where the partial derivatives of the above maximand vanish.
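
The expected values of the sufficient statistics can be verified by simulation (a sketch using the gamma construction above; the seed and sample size are arbitrary):

```python
import numpy as np
from scipy.special import digamma, expit

rng = np.random.default_rng(2)
alpha, beta = 2.0, 5.0
x = np.log(rng.gamma(alpha, 1.0, 500_000)) - np.log(rng.gamma(beta, 1.0, 500_000))

# sample averages of the sufficient statistics vs. their expected values
print(np.log(expit(x)).mean(), digamma(alpha) - digamma(alpha + beta))
print(np.log(expit(-x)).mean(), digamma(beta) - digamma(alpha + beta))
```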


Relationships with other distributions

Relationships with other distributions include:
* The log-ratio of gamma variates is of type IV as detailed above.
* If y\sim\text{BetaPrime}(\alpha,\beta), then x=\ln y has a type IV distribution, with parameters \alpha and \beta. See beta prime distribution.
* If z\sim\text{Gamma}(\beta,1) and y\mid z\sim\text{Gamma}(\alpha,z), where z is used as the rate parameter of the second gamma distribution, then y has a compound gamma distribution, which is the same as \text{BetaPrime}(\alpha,\beta), so that x=\ln y has a type IV distribution.
* If p\sim\text{Beta}(\alpha,\beta), then x=\operatorname{logit}(p) has a type IV distribution, with parameters \alpha and \beta. See beta distribution. The logit function, \operatorname{logit}(p) = \log\frac{p}{1-p}, is the inverse of the logistic function. This relationship explains the name logistic-beta for this distribution: if the logistic function is applied to logistic-beta variates, the transformed distribution is beta. A short numerical check of this beta–logit relationship is sketched below.
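
A numerical illustration of the beta–logit relationship in the last item (a sketch; the seed and sample size are arbitrary):

```python
import numpy as np
from scipy.special import logit, expit, betainc

rng = np.random.default_rng(3)
alpha, beta = 1.8, 0.9
p = rng.beta(alpha, beta, size=200_000)
x = logit(p)                                     # log(p / (1 - p))

for t in (-2.0, 0.0, 2.0):
    # empirical CDF of the logit-transformed sample vs. the Type IV CDF
    print((x <= t).mean(), betainc(alpha, beta, expit(t)))
```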


Large shape parameters

For large values of the shape parameters, \alpha,\beta\gg1, the distribution becomes more Gaussian, with:

: \begin{align} E[x] &\approx\ln\frac{\alpha}{\beta} \\ \text{var}[x] &\approx\frac1\alpha + \frac1\beta \end{align}

This is demonstrated in the pdf and log-pdf plots here.
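
The quality of this approximation improves as the shape parameters grow, as the following sketch indicates:

```python
import numpy as np
from scipy.special import digamma, polygamma

for a, b in [(5.0, 10.0), (50.0, 100.0), (500.0, 1000.0)]:
    exact_mean = digamma(a) - digamma(b)
    exact_var = polygamma(1, a) + polygamma(1, b)
    # exact moments vs. the approximations ln(a/b) and 1/a + 1/b
    print(exact_mean, np.log(a / b), exact_var, 1 / a + 1 / b)
```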


Random variate generation

Since random sampling from the gamma and beta distributions is readily available on many software platforms, the above relationships with those distributions can be used to generate variates from the type IV distribution.
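
A reusable sampler based on the gamma construction might look as follows (an illustrative sketch; the name `sample_type4` is not from any library):

```python
import numpy as np

def sample_type4(alpha, beta, size, rng=None):
    """Draw variates from B_sigma(alpha, beta) as the log-ratio of independent gamma variates."""
    rng = np.random.default_rng() if rng is None else rng
    return np.log(rng.gamma(alpha, 1.0, size)) - np.log(rng.gamma(beta, 1.0, size))

x = sample_type4(2.0, 3.0, 100_000, np.random.default_rng(4))
print(x.mean(), x.var())   # compare with psi(2) - psi(3) and psi'(2) + psi'(3)
```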


Generalization with location and scale parameters

A flexible, four-parameter family can be obtained by adding location and scale parameters. One way to do this is as follows: if x\sim B_\sigma(\alpha,\beta), then let y=kx+\delta, where k>0 is the scale parameter and \delta\in\mathbb{R} is the location parameter. The four-parameter family obtained thus has the desired additional flexibility, but the new parameters may be hard to interpret because \delta\ne E[y] and k^2\ne \text{var}[y]. Moreover, maximum-likelihood estimation with this parametrization is hard. These problems can be addressed as follows.

Recall that the mean and variance of x are:

: \begin{align} \tilde\mu&=\psi(\alpha)-\psi(\beta), &\tilde s^2&=\psi'(\alpha)+\psi'(\beta) \end{align}

Now expand the family with location parameter \mu\in\mathbb{R} and scale parameter s>0, via the transformation:

: \begin{align} y&=\mu + \frac{s}{\tilde s}(x-\tilde\mu) &&\iff& x&=\tilde\mu + \frac{\tilde s}{s}(y-\mu) \end{align}

so that \mu=E[y] and s^2=\text{var}[y] are now interpretable. It may be noted that allowing s to be either positive or negative does not generalize this family, because of the above-noted symmetry property. We adopt the notation y\sim\bar B_\sigma(\alpha,\beta,\mu,s^2) for this family.

If the pdf for x\sim B_\sigma(\alpha,\beta) is f(x;\alpha,\beta), then the pdf for y\sim \bar B_\sigma(\alpha,\beta,\mu,s^2) is:

: \bar f(y;\alpha,\beta,\mu,s^2) = \frac{\tilde s}{s}\, f(x;\alpha,\beta)

where it is understood that x is computed as detailed above, as a function of y,\alpha,\beta,\mu,s. The pdf and log-pdf plots above, where the captions contain ''(means=0, variances=1)'', are for \bar B_\sigma(\alpha,\beta,0,1).
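
The standardizing transformation and the corresponding pdf can be coded directly from the expressions above (a sketch; `type4_ls_log_pdf` is an illustrative name):

```python
import numpy as np
from scipy.special import digamma, polygamma, expit, betaln

def type4_ls_log_pdf(y, alpha, beta, mu, s):
    """Log pdf of the four-parameter family with mean mu and standard deviation s."""
    mu0 = digamma(alpha) - digamma(beta)                     # mean of the standard variate
    s0 = np.sqrt(polygamma(1, alpha) + polygamma(1, beta))   # its standard deviation
    x = mu0 + (s0 / s) * (y - mu)                            # map back to the standard variate
    log_f = alpha * np.log(expit(x)) + beta * np.log(expit(-x)) - betaln(alpha, beta)
    return np.log(s0 / s) + log_f                            # Jacobian factor times f(x)

# pdf of the member standardized to zero mean and unit variance, evaluated at 0
print(np.exp(type4_ls_log_pdf(0.0, 2.0, 3.0, 0.0, 1.0)))
```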


Maximum likelihood parameter estimation

In this section, maximum-likelihood estimation of the distribution parameters, given a dataset x_1,\ldots,x_n, is discussed in turn for the families B_\sigma(\alpha,\beta) and \bar B_\sigma(\alpha,\beta,\mu,s^2).


Maximum likelihood for standard Type IV

As noted above, B_\sigma(\alpha,\beta) is an exponential family with natural parameters \alpha,\beta, the maximum-likelihood estimates of which depend only on averaged sufficient statistics:

: \begin{align} \overline{\log\sigma(x)}&=\frac1n\sum_i\log\sigma(x_i) &&\text{and}& \overline{\log\sigma(-x)}&=\frac1n\sum_i\log\sigma(-x_i) \end{align}

Once these statistics have been accumulated, the maximum-likelihood estimate is given by:

: \hat\alpha,\hat\beta =\arg\max_{\alpha,\beta}\;\alpha\,\overline{\log\sigma(x)} + \beta\,\overline{\log\sigma(-x)} -\log B(\alpha,\beta)

By using the parametrization \theta_1=\log\alpha and \theta_2=\log\beta, an unconstrained numerical optimization algorithm like BFGS can be used, as sketched below. Optimization iterations are fast, because they are independent of the size of the data set.

An alternative is to use an EM algorithm based on the composition: x-\log(\gamma\delta)\sim B_\sigma(\alpha,\beta) if z\sim\text{Gamma}(\beta,\gamma) and e^x\mid z\sim\text{Gamma}(\alpha,z/\delta). Because of the self-conjugacy of the gamma distribution, the posterior expectations \left\langle z\right\rangle and \left\langle\log z\right\rangle that are required for the E-step can be computed in closed form. The M-step parameter update can be solved analogously to maximum likelihood for the gamma distribution.
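
A sketch of the BFGS route described above (the helper name `fit_type4` is illustrative; the EM alternative is not shown):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, betaln

def fit_type4(x):
    s1 = np.log(expit(x)).mean()        # average of log sigma(x_i)
    s2 = np.log(expit(-x)).mean()       # average of log sigma(-x_i)

    def neg_avg_loglik(theta):
        a, b = np.exp(theta)            # theta = (log alpha, log beta) is unconstrained
        return -(a * s1 + b * s2 - betaln(a, b))

    res = minimize(neg_avg_loglik, x0=np.zeros(2), method="BFGS")
    return np.exp(res.x)

rng = np.random.default_rng(5)
a, b = 2.0, 0.8
x = np.log(rng.gamma(a, 1.0, 50_000)) - np.log(rng.gamma(b, 1.0, 50_000))
print(fit_type4(x))                     # should be close to (2.0, 0.8)
```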


Maximum likelihood for the four-parameter family

The maximum-likelihood problem for \bar B_\sigma(\alpha,\beta,\mu,s^2), having pdf \bar f, is:

: \hat\alpha,\hat\beta,\hat\mu,\hat s = \arg\max_{\alpha,\beta,\mu,s}\; \frac1n\sum_i \log\bar f(x_i;\alpha,\beta,\mu,s^2)

This is no longer an exponential family, so that each optimization iteration has to traverse the whole data set. Moreover, the computation of the partial derivatives (as required for example by BFGS) is considerably more complex than for the above two-parameter case. However, all the component functions are readily available in software packages with automatic differentiation. Again, the positive parameters can be parametrized in terms of their logarithms to obtain an unconstrained numerical optimization problem.

For this problem, numerical optimization may fail unless the initial location and scale parameters are chosen appropriately. However, the above-mentioned interpretability of these parameters in the parametrization of \bar B_\sigma can be used to do this. Specifically, the initial values for \mu and s^2 can be set to the empirical mean and variance of the data. A sketch of this procedure is given below.
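
A sketch of this four-parameter fit, with the positive parameters log-parametrized and the location and scale initialized at the empirical mean and standard deviation (all helper names are illustrative):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import digamma, polygamma, expit, betaln

def type4_ls_log_pdf(y, alpha, beta, mu, s):
    mu0 = digamma(alpha) - digamma(beta)
    s0 = np.sqrt(polygamma(1, alpha) + polygamma(1, beta))
    x = mu0 + (s0 / s) * (y - mu)
    return (np.log(s0 / s) + alpha * np.log(expit(x))
            + beta * np.log(expit(-x)) - betaln(alpha, beta))

def fit_type4_ls(y):
    def neg_loglik(theta):
        a, b, s = np.exp(theta[0]), np.exp(theta[1]), np.exp(theta[3])
        return -type4_ls_log_pdf(y, a, b, theta[2], s).mean()

    theta0 = np.array([0.0, 0.0, y.mean(), np.log(y.std())])   # initialize mu, s from the data
    res = minimize(neg_loglik, theta0, method="BFGS")
    return np.exp(res.x[0]), np.exp(res.x[1]), res.x[2], np.exp(res.x[3])

rng = np.random.default_rng(6)
x = np.log(rng.gamma(3.0, 1.0, 20_000)) - np.log(rng.gamma(1.2, 1.0, 20_000))
y = 0.5 + 2.0 * (x - x.mean()) / x.std()    # shifted and scaled data for the demo
print(fit_type4_ls(y))                      # alpha, beta, mu (~0.5), s (~2.0)
```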


See also

* Champernowne distribution, another generalization of the logistic distribution.

