Yule–Simon Distribution
   HOME

TheInfoList



OR:

In
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
and
statistics Statistics (from German language, German: ''wikt:Statistik#German, Statistik'', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of ...
, the Yule–Simon distribution is a
discrete probability distribution In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon i ...
named after
Udny Yule George Udny Yule FRS (18 February 1871 – 26 June 1951), usually known as Udny Yule, was a British statistician, particularly known for the Yule distribution. Personal life Yule was born at Beech Hill, a house in Morham near Haddington, ...
and
Herbert A. Simon Herbert Alexander Simon (June 15, 1916 – February 9, 2001) was an American political scientist, with a Ph.D. in political science, whose work also influenced the fields of computer science, economics, and cognitive psychology. His primary ...
. Simon originally called it the ''Yule distribution''. The
probability mass function In probability and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete density function. The probability mass ...
(pmf) of the Yule–Simon (''ρ'') distribution is :f(k;\rho) = \rho\operatorname(k, \rho+1), for
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
k \geq 1 and
real Real may refer to: Currencies * Brazilian real (R$) * Central American Republic real * Mexican real * Portuguese real * Spanish real * Spanish colonial real Music Albums * ''Real'' (L'Arc-en-Ciel album) (2000) * ''Real'' (Bright album) (2010) ...
\rho > 0, where \operatorname is the
beta function In mathematics, the beta function, also called the Euler integral of the first kind, is a special function that is closely related to the gamma function and to binomial coefficients. It is defined by the integral : \Beta(z_1,z_2) = \int_0^1 t^(1 ...
. Equivalently the pmf can be written in terms of the
rising factorial In mathematics, the falling factorial (sometimes called the descending factorial, falling sequential product, or lower factorial) is defined as the polynomial :\begin (x)_n = x^\underline &= \overbrace^ \\ &= \prod_^n(x-k+1) = \prod_^(x-k) \,. \e ...
as : f(k;\rho) = \frac, where \Gamma is the
gamma function In mathematics, the gamma function (represented by , the capital letter gamma from the Greek alphabet) is one commonly used extension of the factorial function to complex numbers. The gamma function is defined for all complex numbers except ...
. Thus, if \rho is an integer, : f(k;\rho) = \frac. The parameter \rho can be estimated using a fixed point algorithm. The probability mass function ''f'' has the property that for sufficiently large ''k'' we have : f(k;\rho) \approx \frac \propto \frac 1 . This means that the tail of the Yule–Simon distribution is a realization of Zipf's law: f(k;\rho) can be used to model, for example, the relative frequency of the kth most frequent word in a large collection of text, which according to Zipf's law is
inversely proportional In mathematics, two sequences of numbers, often experimental data, are proportional or directly proportional if their corresponding elements have a constant ratio, which is called the coefficient of proportionality or proportionality constan ...
to a (typically small) power of k.


Occurrence

The Yule–Simon distribution arose originally as the limiting distribution of a particular model studied by Udny Yule in 1925 to analyze the growth in the number of species per genus in some higher taxon of biotic organisms. The Yule model makes use of two related Yule processes, where a Yule process is defined as a continuous time
birth process In probability theory, a birth process or a pure birth process is a special case of a continuous-time Markov process and a generalisation of a Poisson process. It defines a continuous process which takes values in the natural numbers and can only ...
which starts with one or more individuals. Yule proved that when time goes to infinity, the limit distribution of the number of species in a genus selected uniformly at random has a specific form and exhibits a power-law behavior in its tail. Thirty years later, the Nobel laureate Herbert A. Simon proposed a time-discrete preferential attachment model to describe the appearance of new words in a large piece of a text. Interestingly enough, the limit distribution of the number of occurrences of each word, when the number of words diverges, coincides with that of the number of species belonging to the randomly chosen genus in the Yule model, for a specific choice of the parameters. This fact explains the designation Yule–Simon distribution that is commonly assigned to that limit distribution. In the context of random graphs, the
Barabási–Albert model The Barabási–Albert (BA) model is an algorithm for generating random scale-free network, scale-free complex network, networks using a preferential attachment mechanism. Several natural and human-made systems, including the Internet, the World ...
also exhibits an asymptotic degree distribution that equals the Yule–Simon distribution in correspondence of a specific choice of the parameters and still presents power-law characteristics for more general choices of the parameters. The same happens also for other
preferential attachment A preferential attachment process is any of a class of processes in which some quantity, typically some form of wealth or credit, is distributed among a number of individuals or objects according to how much they already have, so that those who ...
random graph models. The preferential attachment process can also be studied as an urn process in which balls are added to a growing number of urns, each ball being allocated to an urn with probability linear in the number (of balls) the urn already contains. The distribution also arises as a
compound distribution In probability and statistics, a compound probability distribution (also known as a mixture distribution or contagious distribution) is the probability distribution that results from assuming that a random variable is distributed according to som ...
, in which the parameter of a
geometric distribution In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions: * The probability distribution of the number ''X'' of Bernoulli trials needed to get one success, supported on the set \; * ...
is treated as a function of random variable having an
exponential distribution In probability theory and statistics, the exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average ...
. Specifically, assume that W follows an exponential distribution with scale 1/\rho or rate \rho: :W \sim \operatorname(\rho), with density :h(w;\rho) = \rho \exp(-\rho w). Then a Yule–Simon distributed variable ''K'' has the following geometric distribution conditional on ''W'': : K \sim \operatorname(\exp(-W)). The pmf of a geometric distribution is :g(k; p) = p (1-p)^ for k\in\. The Yule–Simon pmf is then the following exponential-geometric compound distribution: :f(k;\rho) = \int_0^\infty g(k;\exp(-w)) h(w;\rho)\,dw. The
maximum likelihood estimator In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statist ...
for the parameter \rho given the observations k_1,k_2,k_3,\dots,k_N is the solution to the fixed point equation : \rho^ = \frac, where b=0, a=1 are the rate and shape parameters of the
gamma distribution In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. The exponential distribution, Erlang distribution, and chi-square distribution are special cases of the gamma distri ...
prior on \rho . This algorithm is derived by Garcia by directly optimizing the likelihood. Roberts and Roberts generalize the algorithm to
Bayesian Thomas Bayes (/beɪz/; c. 1701 – 1761) was an English statistician, philosopher, and Presbyterian minister. Bayesian () refers either to a range of concepts and approaches that relate to statistical methods based on Bayes' theorem, or a followe ...
settings with the compound geometric formulation described above. Additionally, Roberts and Roberts are able to use the
Expectation Maximisation Expectation or Expectations may refer to: Science * Expectation (epistemic) * Expected value, in mathematical probability theory * Expectation value (quantum mechanics) * Expectation–maximization algorithm, in statistics Music * ''Expectation ...
(EM) framework to show convergence of the fixed point algorithm. Moreover, Roberts and Roberts derive the sub-linearity of the convergence rate for the fixed point algorithm. Additionally, they use the EM formulation to give 2 alternate derivations of the standard error of the estimator from the fixed point equation. The variance of the \lambda estimator is : \operatorname(\hat) = \frac, the
standard error The standard error (SE) of a statistic (usually an estimate of a parameter) is the standard deviation of its sampling distribution or an estimate of that standard deviation. If the statistic is the sample mean, it is called the standard error ...
is the square root of the quantity of this estimate divided by N.


Generalizations

The two-parameter generalization of the original Yule distribution replaces the beta function with an
incomplete beta function In mathematics, the beta function, also called the Euler integral of the first kind, is a special function that is closely related to the gamma function and to binomial coefficients. It is defined by the integral : \Beta(z_1,z_2) = \int_0^1 t^ ...
. The probability mass function of the generalized Yule–Simon(''ρ'', ''α'') distribution is defined as : f(k;\rho,\alpha) = \frac \rho \; \mathrm_(k, \rho+1), \, with 0 \leq \alpha < 1. For \alpha = 0 the ordinary Yule–Simon(''ρ'') distribution is obtained as a special case. The use of the incomplete beta function has the effect of introducing an exponential cutoff in the upper tail.


See also

*
Zeta distribution In probability theory and statistics, the zeta distribution is a discrete probability distribution. If ''X'' is a zeta-distributed random variable with parameter ''s'', then the probability that ''X'' takes the integer value ''k'' is given by ...
*
Scale-free network A scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction ''P''(''k'') of nodes in the network having ''k'' connections to other nodes goes for large values of ''k'' as : P(k) ...
*
Beta negative binomial distribution In probability theory, a beta negative binomial distribution is the probability distribution of a discrete random variable X equal to the number of failures needed to get r successes in a sequence of independent Bernoulli trials. The probabi ...


Bibliography

* Colin Rose and Murray D. Smith, ''Mathematical Statistics with Mathematica''. New York: Springer, 2002, . (''See page 107, where it is called the "Yule distribution".'')


References

{{DEFAULTSORT:Yule-Simon Distribution Discrete distributions Compound probability distributions