In probability theory and statistics, the beta-binomial distribution is a family of discrete probability distributions on a finite support of non-negative integers arising when the probability of success in each of a fixed or known number of Bernoulli trials is either unknown or random. The beta-binomial distribution is the binomial distribution in which the probability of success at each of ''n'' trials is not fixed but randomly drawn from a beta distribution. It is frequently used in Bayesian statistics, empirical Bayes methods and classical statistics to capture overdispersion in binomial type distributed data.
The beta-binomial is a one-dimensional version of the Dirichlet-multinomial distribution, as the binomial and beta distributions are univariate versions of the multinomial and Dirichlet distributions respectively. The special case where ''α'' and ''β'' are integers is also known as the negative hypergeometric distribution.
Motivation and derivation
As a compound distribution
The beta distribution is a conjugate distribution of the binomial distribution. This fact leads to an analytically tractable compound distribution where one can think of the $p$ parameter in the binomial distribution as being randomly drawn from a beta distribution.
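Suppose we were interested in predicting the number of heads, $x$, in $n$ future trials. This is given by
: $\begin{aligned} f(x\mid n,\alpha,\beta) &= \int_0^1 \mathrm{Bin}(x\mid n,p)\,\mathrm{Beta}(p\mid \alpha,\beta)\,dp \\ &= \binom{n}{x}\frac{1}{\mathrm{B}(\alpha,\beta)}\int_0^1 p^{x+\alpha-1}(1-p)^{n-x+\beta-1}\,dp \\ &= \binom{n}{x}\frac{\mathrm{B}(x+\alpha,\,n-x+\beta)}{\mathrm{B}(\alpha,\beta)}. \end{aligned}$
Using the properties of the beta function, this can alternatively be written
: $f(x\mid n,\alpha,\beta) = \frac{\Gamma(n+1)\,\Gamma(x+\alpha)\,\Gamma(n-x+\beta)}{\Gamma(n+\alpha+\beta)\,\Gamma(x+1)\,\Gamma(n-x+1)}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}.$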
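As a rough illustration, the probability mass function can be evaluated numerically. The sketch below assumes the standard closed form $\binom{n}{x}\,\mathrm{B}(x+\alpha, n-x+\beta)/\mathrm{B}(\alpha,\beta)$ and works on the log scale through the log-gamma function to avoid overflow. The function names `log_beta` and `betabinom_pmf` are illustrative choices, not part of any particular library.

```python
from math import comb, exp, lgamma


def log_beta(a: float, b: float) -> float:
    """Natural log of the beta function B(a, b), computed via log-gamma."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(x: int, n: int, alpha: float, beta: float) -> float:
    """P(X = x) for X ~ BetaBin(n, alpha, beta)."""
    if not 0 <= x <= n:
        return 0.0
    log_ratio = log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta)
    return comb(n, x) * exp(log_ratio)


# The probabilities over the support 0..n should sum to 1 (up to rounding).
total = sum(betabinom_pmf(x, 10, 2.0, 3.0) for x in range(11))
print(total)
```

Working with differences of log-beta values keeps the intermediate quantities moderate even when ''α'', ''β'' or ''n'' are large.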
Beta-binomial as an urn model
The beta-binomial distribution can also be motivated via an urn model for positive integer values of ''α'' and ''β'', known as the Pólya urn model. Specifically, imagine an urn containing ''α'' red balls and ''β'' black balls, where random draws are made. If a red ball is observed, then two red balls are returned to the urn. Likewise, if a black ball is drawn, then two black balls are returned to the urn. If this is repeated ''n'' times, then the probability of observing ''x'' red balls follows a beta-binomial distribution with parameters ''n'', ''α'' and ''β''.
If the random draws are made with simple replacement (no balls over and above the observed ball are added to the urn), then the distribution follows a binomial distribution; if the random draws are made without replacement, the distribution follows a hypergeometric distribution.
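The urn description can be checked on a small case. The following sketch computes the exact distribution of the number of red draws from a Pólya urn by stepping through the draws with rational arithmetic, then compares it with the beta-binomial probabilities, assuming the closed form $\binom{n}{x}\,\mathrm{B}(x+\alpha, n-x+\beta)/\mathrm{B}(\alpha,\beta)$. The function names and the urn sizes (2 red, 3 black, 5 draws) are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb, exp, lgamma


def polya_urn_distribution(n, red, black):
    """Exact distribution of the number of red draws in n draws from a Polya urn.

    Each drawn ball is returned together with one extra ball of the same colour.
    """
    dist = {0: Fraction(1)}  # dist[k] = P(k red balls drawn so far)
    for t in range(n):  # after t draws the urn holds red + black + t balls
        new = {}
        for k, prob in dist.items():
            p_red = Fraction(red + k, red + black + t)
            new[k + 1] = new.get(k + 1, 0) + prob * p_red
            new[k] = new.get(k, 0) + prob * (1 - p_red)
        dist = new
    return [dist.get(k, Fraction(0)) for k in range(n + 1)]


def betabinom_pmf(x, n, alpha, beta):
    """Beta-binomial probability via the log-gamma function."""
    def log_beta(a, b):
        return lgamma(a) + lgamma(b) - lgamma(a + b)
    return comb(n, x) * exp(log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta))


urn = polya_urn_distribution(5, red=2, black=3)
for x, prob in enumerate(urn):
    print(x, float(prob), betabinom_pmf(x, 5, 2, 3))
```

The two printed columns should agree up to floating-point rounding.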
Moments and properties
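The first three raw moments are
:: $\begin{aligned} \mu_1 &= \frac{n\alpha}{\alpha+\beta} \\ \mu_2 &= \frac{n\alpha[n(1+\alpha)+\beta]}{(\alpha+\beta)(1+\alpha+\beta)} \\ \mu_3 &= \frac{n\alpha[n^{2}(1+\alpha)(2+\alpha)+3n(1+\alpha)\beta+\beta(\beta-\alpha)]}{(\alpha+\beta)(1+\alpha+\beta)(2+\alpha+\beta)} \end{aligned}$
and the kurtosis is
:: $\beta_2 = \frac{(\alpha+\beta)^2(1+\alpha+\beta)}{n\alpha\beta(\alpha+\beta+2)(\alpha+\beta+3)(\alpha+\beta+n)}\left[(\alpha+\beta)(\alpha+\beta-1+6n)+3\alpha\beta(n-2)+6n^2-\frac{3\alpha\beta n(6-n)}{\alpha+\beta}-\frac{18\alpha\beta n^{2}}{(\alpha+\beta)^2}\right].$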
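Letting $\pi = \frac{\alpha}{\alpha+\beta}$ we note, suggestively, that the mean can be written as
:: $\mu = \frac{n\alpha}{\alpha+\beta} = n\pi$
and the variance as
:: $\sigma^2 = \frac{n\alpha\beta(\alpha+\beta+n)}{(\alpha+\beta)^2(\alpha+\beta+1)} = n\pi(1-\pi)\frac{\alpha+\beta+n}{\alpha+\beta+1} = n\pi(1-\pi)[1+(n-1)\rho]$
where $\rho = \frac{1}{\alpha+\beta+1}$. The parameter $\rho$ is known as the "intra class" or "intra cluster" correlation. It is this positive correlation which gives rise to overdispersion. Note that when $n=1$, no information is available to distinguish between the beta and binomial variation, and the two models have equal variances.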
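A quick numerical check of the mean and variance may help here. The sketch assumes the forms $\mu = n\pi$ and $\sigma^2 = n\pi(1-\pi)[1+(n-1)\rho]$ with $\pi = \alpha/(\alpha+\beta)$ and $\rho = 1/(\alpha+\beta+1)$, and compares them with moments summed directly from the probability mass function. The parameter values and function names are illustrative only.

```python
from math import comb, exp, lgamma


def betabinom_pmf(x, n, alpha, beta):
    """Beta-binomial probability via the log-gamma function."""
    def log_beta(a, b):
        return lgamma(a) + lgamma(b) - lgamma(a + b)
    return comb(n, x) * exp(log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta))


def moments_closed_form(n, alpha, beta):
    """Mean and variance written in terms of pi and the intra-class correlation rho."""
    pi = alpha / (alpha + beta)
    rho = 1.0 / (alpha + beta + 1.0)
    return n * pi, n * pi * (1.0 - pi) * (1.0 + (n - 1) * rho)


def moments_from_pmf(n, alpha, beta):
    """Mean and variance obtained by summing over the support 0..n."""
    probs = [betabinom_pmf(x, n, alpha, beta) for x in range(n + 1)]
    mean = sum(x * p for x, p in enumerate(probs))
    var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))
    return mean, var


n, alpha, beta = 12, 2.0, 3.0
closed = moments_closed_form(n, alpha, beta)
summed = moments_from_pmf(n, alpha, beta)
# Variance of a plain binomial with the same mean, for comparison.
binomial_var = closed[0] * (1.0 - alpha / (alpha + beta))
print(closed, summed, binomial_var)
```

For any $n > 1$ the beta-binomial variance exceeds the binomial variance with the same mean, which is the overdispersion described above.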
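Factorial moments
The $r$-th factorial moment of a beta-binomial random variable $X$ is
: $\operatorname{E}\bigl[(X)_r\bigr] = \frac{n!}{(n-r)!}\,\frac{\mathrm{B}(\alpha+r,\beta)}{\mathrm{B}(\alpha,\beta)} = (n)_r\,\frac{\mathrm{B}(\alpha+r,\beta)}{\mathrm{B}(\alpha,\beta)}.$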
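The factorial moment can be checked in the same way. The sketch below assumes the form $(n)_r\,\mathrm{B}(\alpha+r,\beta)/\mathrm{B}(\alpha,\beta)$, where $(n)_r$ is the falling factorial, and compares it with a direct sum of $(x)_r$ over the probability mass function. The function names are illustrative.

```python
from math import comb, exp, lgamma, perm


def log_beta(a, b):
    """Natural log of the beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(x, n, alpha, beta):
    """Beta-binomial probability via the log-gamma function."""
    return comb(n, x) * exp(log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta))


def factorial_moment_closed_form(r, n, alpha, beta):
    """E[(X)_r] = (n)_r * B(alpha + r, beta) / B(alpha, beta)."""
    return perm(n, r) * exp(log_beta(alpha + r, beta) - log_beta(alpha, beta))


def factorial_moment_from_pmf(r, n, alpha, beta):
    """Sum of x (x - 1) ... (x - r + 1) P(X = x); perm(x, r) is 0 when r > x."""
    return sum(perm(x, r) * betabinom_pmf(x, n, alpha, beta) for x in range(n + 1))


for r in (1, 2, 3):
    print(r,
          factorial_moment_closed_form(r, 8, 1.5, 2.5),
          factorial_moment_from_pmf(r, 8, 1.5, 2.5))
```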
Point estimates
Method of moments
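The method of moments estimates can be obtained by noting the first and second moments of the beta-binomial and setting those equal to the sample moments $m_1$ and $m_2$. We find
:: $\begin{aligned} \widehat{\alpha} &= \frac{n m_1 - m_2}{n\left(\frac{m_2}{m_1}-m_1-1\right)+m_1} \\ \widehat{\beta} &= \frac{(n-m_1)\left(n-\frac{m_2}{m_1}\right)}{n\left(\frac{m_2}{m_1}-m_1-1\right)+m_1} \end{aligned}$
These estimates can be nonsensically negative, which is evidence that the data are either undispersed or underdispersed relative to the binomial distribution. In this case, the binomial distribution and the hypergeometric distribution are alternative candidates, respectively.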
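A minimal sketch of the method of moments calculation follows. It assumes the estimators $\widehat{\alpha} = (n m_1 - m_2)/D$ and $\widehat{\beta} = (n-m_1)(n-m_2/m_1)/D$ with $D = n(m_2/m_1 - m_1 - 1) + m_1$, obtained by solving the first two raw moments for ''α'' and ''β''. The helper names `sample_moments` and `method_of_moments` are hypothetical, and the check uses exact moments rather than real data.

```python
def sample_moments(counts):
    """First and second raw sample moments, where counts[x] observations equal x."""
    total = sum(counts)
    m1 = sum(x * c for x, c in enumerate(counts)) / total
    m2 = sum(x * x * c for x, c in enumerate(counts)) / total
    return m1, m2


def method_of_moments(n, m1, m2):
    """Method of moments estimates of (alpha, beta) for a fixed number of trials n.

    Negative results signal data that are not overdispersed relative to the binomial.
    """
    denom = n * (m2 / m1 - m1 - 1) + m1
    alpha_hat = (n * m1 - m2) / denom
    beta_hat = (n - m1) * (n - m2 / m1) / denom
    return alpha_hat, beta_hat


# Round trip: n = 10, alpha = 2, beta = 3 has exact raw moments m1 = 4 and m2 = 22.
alpha_hat, beta_hat = method_of_moments(10, 4.0, 22.0)
print(alpha_hat, beta_hat)  # 2.0 3.0
```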
Maximum likelihood estimation
While closed-form maximum likelihood estimates are impractical, the probability mass function consists of common functions (gamma and/or beta functions), so the estimates can be easily found via direct numerical optimization. Maximum likelihood estimates from empirical data can be computed using general methods for fitting multinomial Pólya distributions, which are described in (Minka 2003).
The R package VGAM, through the function vglm, facilitates the fitting by maximum likelihood of glm type models with responses distributed according to the beta-binomial distribution. There is no requirement that ''n'' is fixed throughout the observations.
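Direct numerical optimization can be sketched with the standard library alone. The example below is not the method of Minka (2003) nor what VGAM does; it substitutes a naive compass (pattern) search over the reparameterization $u = \operatorname{logit}(\alpha/(\alpha+\beta))$, $v = \log(\alpha+\beta)$, which keeps both parameters positive. The names `neg_log_likelihood` and `fit_mle` are hypothetical, and the input is synthetic: expected counts generated from a known BetaBin(12, 3, 5), not real observations.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def neg_log_likelihood(counts, n, alpha, beta):
    """Negative log-likelihood, where counts[x] observations equal x (x = 0..n).

    The binomial coefficient is dropped because it does not depend on (alpha, beta).
    """
    base = log_beta(alpha, beta)
    return -sum(c * (log_beta(x + alpha, n - x + beta) - base)
                for x, c in enumerate(counts) if c)


def fit_mle(counts, n, step=1.0, tol=1e-6, max_iter=10_000):
    """Compass search over u = logit(mean proportion) and v = log(alpha + beta)."""
    def unpack(u, v):
        mean = 1.0 / (1.0 + exp(-u))
        size = exp(v)
        return mean * size, (1.0 - mean) * size

    u, v = 0.0, 0.0
    best = neg_log_likelihood(counts, n, *unpack(u, v))
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for du, dv in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            if abs(u + du) > 30.0 or abs(v + dv) > 30.0:
                continue  # stay in a numerically safe box
            value = neg_log_likelihood(counts, n, *unpack(u + du, v + dv))
            if value < best:
                u, v, best, improved = u + du, v + dv, value, True
                break
        if not improved:
            step /= 2.0  # no direction helps: refine the step size
    return unpack(u, v)


# Synthetic expected counts from BetaBin(12, 3, 5); the fit should land near (3, 5).
n = 12
counts = [1e4 * comb(n, x) * exp(log_beta(x + 3.0, n - x + 5.0) - log_beta(3.0, 5.0))
          for x in range(n + 1)]
alpha_hat, beta_hat = fit_mle(counts, n)
print(alpha_hat, beta_hat)
```

In practice a quasi-Newton optimizer or a dedicated routine would be faster; the compass search is used here only because it needs no derivatives and no external packages.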
Example
The following data give the number of male children among the first 12 children in 6115 families of size 13, taken from hospital records in 19th century Saxony (Sokal and Rohlf, p. 59 from Lindsey). The 13th child is ignored to blunt the effect of families non-randomly stopping when a desired gender is reached.
The first two sample moments are
::
and therefore the method of moments estimates are
::
The maximum likelihood estimates can be found numerically
::
and the maximized log-likelihood is
::
from which we find the AIC
::
The AIC for the competing binomial model is AIC = 25070.34, and thus we see that the beta-binomial model provides a superior fit to the data, i.e. there is evidence for overdispersion. Trivers and Willard postulate a theoretical justification for heterogeneity in gender-proneness among mammalian offspring.
The superior fit is especially evident in the tails.
Beta-binomial in Bayesian statistics
The beta-binomial distribution plays a prominent role in the Bayesian estimation of a Bernoulli success probability $p$. Let $\mathbf{X} = \{X_1, X_2, \dots, X_{n_1}\}$ be a sample of independent and identically distributed Bernoulli random variables $X_i \sim \mathrm{Bernoulli}(p)$. Suppose our knowledge of $p$, in Bayesian fashion, is uncertain and is modeled by the prior distribution $p \sim \mathrm{Beta}(\alpha,\beta)$. If $Y_1 = \sum_{i=1}^{n_1} X_i$ then, through compounding, the prior predictive distribution of $Y_1$ is
: $Y_1 \sim \mathrm{BetaBin}(n_1, \alpha, \beta)$.
After observing $Y_1 = y_1$ we note that the posterior distribution for $p$ is
: $\begin{aligned} f(p \mid \mathbf{X}, \alpha, \beta) &= C\left(\prod_{i=1}^{n_1} p^{x_i}(1-p)^{1-x_i}\right) p^{\alpha-1}(1-p)^{\beta-1} \\ &= C\,p^{\sum x_i+\alpha-1}(1-p)^{n_1-\sum x_i+\beta-1} \\ &= C\,p^{y_1+\alpha-1}(1-p)^{n_1-y_1+\beta-1} \end{aligned}$
where $C$ is a normalizing constant. We recognize the posterior distribution as a $\mathrm{Beta}(y_1+\alpha,\, n_1-y_1+\beta)$.
Thus, again through compounding, we find that the posterior predictive distribution of a sum $Y_2$ of a future sample of size $n_2$ of $\mathrm{Bernoulli}(p)$ random variables is
: $Y_2 \sim \mathrm{BetaBin}(n_2,\, y_1+\alpha,\, n_1-y_1+\beta)$.
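The conjugate update can be illustrated with a small sketch. It assumes a Beta(''α'', ''β'') prior, so that after observing $y_1$ successes in $n_1$ trials the posterior is Beta($\alpha + y_1$, $\beta + n_1 - y_1$) and the number of successes in $n_2$ future trials is beta-binomial with those updated parameters. The function names and the numbers (a uniform prior, 7 successes in 10 trials) are illustrative assumptions.

```python
from math import comb, exp, lgamma


def betabinom_pmf(x, n, alpha, beta):
    """Beta-binomial probability via the log-gamma function."""
    def log_beta(a, b):
        return lgamma(a) + lgamma(b) - lgamma(a + b)
    return comb(n, x) * exp(log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta))


def posterior_params(alpha, beta, y1, n1):
    """Beta posterior parameters after y1 successes in n1 Bernoulli trials."""
    return alpha + y1, beta + n1 - y1


def posterior_predictive_pmf(y2, n2, alpha, beta, y1, n1):
    """P(Y2 = y2) for the number of successes in n2 future trials."""
    a_post, b_post = posterior_params(alpha, beta, y1, n1)
    return betabinom_pmf(y2, n2, a_post, b_post)


# Uniform prior Beta(1, 1), then 7 successes observed in 10 trials.
p_next_success = posterior_predictive_pmf(1, 1, 1.0, 1.0, 7, 10)
predictive = [posterior_predictive_pmf(y2, 5, 1.0, 1.0, 7, 10) for y2 in range(6)]
print(p_next_success, predictive)
```

With a single future trial the predictive probability of success reduces to the posterior mean, here (1 + 7)/(2 + 10) = 2/3.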
Generating beta binomial-distributed random variables
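To draw a beta-binomial random variate $X \sim \mathrm{BetaBin}(n,\alpha,\beta)$, simply draw $p \sim \mathrm{Beta}(\alpha,\beta)$ and then draw $X \sim \mathrm{B}(n,p)$.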
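The two-stage recipe translates directly into code. This sketch uses the standard library's `random.Random.betavariate` for the beta draw and a sum of Bernoulli indicators for the binomial draw; the function name `sample_betabinom`, the seed and the parameter values are illustrative assumptions.

```python
import random


def sample_betabinom(n, alpha, beta, rng):
    """Draw one BetaBin(n, alpha, beta) variate by compounding.

    First draw p ~ Beta(alpha, beta), then count successes in n Bernoulli(p) trials.
    """
    p = rng.betavariate(alpha, beta)
    return sum(rng.random() < p for _ in range(n))


rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [sample_betabinom(10, 2.0, 3.0, rng) for _ in range(20_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean)  # should be close to n * alpha / (alpha + beta) = 4
```

Summing indicators costs time proportional to ''n'' per draw; for large ''n'' a dedicated binomial sampler would be preferable.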
Related distributions
* $\mathrm{BetaBin}(1,\alpha,\beta) \sim \mathrm{Bernoulli}(p)$ where $p = \frac{\alpha}{\alpha+\beta}$.
* $\mathrm{BetaBin}(n,1,1) \sim U(0,n)$ where $U(a,b)$ is the discrete uniform distribution.
* $\lim_{s\to\infty} \mathrm{BetaBin}(n, ps, (1-p)s) \sim \mathrm{B}(n,p)$ where $p = \frac{\alpha}{\alpha+\beta}$ and $s = \alpha+\beta$ and $\mathrm{B}(n,p)$ is the binomial distribution.
* $\lim_{n\to\infty} \mathrm{BetaBin}\left(n,\alpha,\frac{np}{1-p}\right) \sim \mathrm{NB}(\alpha,p)$ where $\mathrm{NB}(\alpha,p)$ is the negative binomial distribution.
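Some of these relationships can be checked numerically. The sketch below assumes the closed form $\binom{n}{x}\,\mathrm{B}(x+\alpha, n-x+\beta)/\mathrm{B}(\alpha,\beta)$ for the probability mass function and looks at three cases: the Bernoulli case ($n = 1$), the discrete uniform case (''α'' = ''β'' = 1), and the binomial limit approximated with a large but finite $s = \alpha + \beta$. The parameter values are arbitrary illustrative choices.

```python
from math import comb, exp, lgamma


def betabinom_pmf(x, n, alpha, beta):
    """Beta-binomial probability via the log-gamma function."""
    def log_beta(a, b):
        return lgamma(a) + lgamma(b) - lgamma(a + b)
    return comb(n, x) * exp(log_beta(x + alpha, n - x + beta) - log_beta(alpha, beta))


def binom_pmf(x, n, p):
    """Binomial probability of x successes in n trials."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)


# n = 1 reduces to a Bernoulli trial with success probability alpha / (alpha + beta).
bernoulli_p = betabinom_pmf(1, 1, 2.0, 3.0)

# alpha = beta = 1 puts equal mass 1 / (n + 1) on each of 0..n.
uniform_probs = [betabinom_pmf(x, 6, 1.0, 1.0) for x in range(7)]

# Large s = alpha + beta with fixed p = alpha / s approaches Binomial(n, p).
n, p, s = 10, 0.3, 1e6
max_gap = max(abs(betabinom_pmf(x, n, p * s, (1.0 - p) * s) - binom_pmf(x, n, p))
              for x in range(n + 1))
print(bernoulli_p, uniform_probs, max_gap)
```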
See also
* Dirichlet-multinomial distribution
References
* Minka, Thomas P. (2003). Estimating a Dirichlet distribution. Microsoft Technical Report.
External links
* http://research.microsoft.com/~minka/software/fastfit/ Fastfit contains Matlab code for fitting beta-binomial distributions (in the form of two-dimensional Pólya distributions) to data.
* Interactive graphic: Univariate Distribution Relationships