In probability theory and statistics, the geometric distribution is either one of two discrete probability distributions:
* The probability distribution of the number <math>X</math> of Bernoulli trials needed to get one success, supported on <math>\{1, 2, 3, \dots\}</math>;
* The probability distribution of the number <math>Y = X - 1</math> of failures before the first success, supported on <math>\{0, 1, 2, \dots\}</math>.
These two different geometric distributions should not be confused with each other. Often, the name ''shifted'' geometric distribution is adopted for the former one (distribution of <math>X</math>); however, to avoid ambiguity, it is considered wise to indicate which is intended, by mentioning the support explicitly.
The geometric distribution gives the probability that the first occurrence of success requires <math>k</math> independent trials, each with success probability <math>p</math>. If the probability of success on each trial is <math>p</math>, then the probability that the <math>k</math>-th trial is the first success is
:<math>\Pr(X = k) = (1 - p)^{k-1} p</math>
for <math>k = 1, 2, 3, 4, \dots</math>
The above form of the geometric distribution is used for modeling the number of trials up to and including the first success. By contrast, the following form of the geometric distribution is used for modeling the number of failures until the first success:
:<math>\Pr(Y = k) = \Pr(X = k + 1) = (1 - p)^{k} p</math>
for <math>k = 0, 1, 2, 3, \dots</math>
The geometric distribution gets its name because its probabilities follow a
geometric sequence. It is sometimes called the Furry distribution after
Wendell H. Furry.
Definition
The geometric distribution is the discrete probability distribution that describes when the first success in an infinite sequence of independent and identically distributed Bernoulli trials occurs. Its probability mass function depends on its parameterization and support. When supported on <math>\{1, 2, 3, \dots\}</math>, the probability mass function is
:<math>\Pr(X = k) = (1 - p)^{k-1} p,</math>
where <math>k</math> is the number of trials and <math>p</math> is the probability of success in each trial.
The support may also be <math>\{0, 1, 2, \dots\}</math>, defining <math>Y = X - 1</math>. This alters the probability mass function into
:<math>\Pr(Y = k) = (1 - p)^{k} p,</math>
where <math>k</math> is the number of failures before the first success.
An alternative parameterization of the distribution gives the probability mass function
:<math>\Pr(Y = k) = \left(\frac{\mu}{1 + \mu}\right)^{k} \frac{1}{1 + \mu},</math>
where <math>\mu = \frac{1 - p}{p}</math> is the mean and <math>k \in \{0, 1, 2, \dots\}</math>.
An example of a geometric distribution arises from rolling a six-sided die until a "1" appears. Each roll is independent with a <math>\frac{1}{6}</math> chance of success. The number of rolls needed follows a geometric distribution with <math>p = \frac{1}{6}</math>.
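As an illustration of the definition above, the following Python sketch (an illustrative addition, not a canonical implementation) simulates the die-rolling experiment and compares the empirical frequencies of the number of rolls with the probability mass function <math>(1 - p)^{k-1} p</math> for <math>p = 1/6</math>.
<syntaxhighlight lang="python">
import random
from collections import Counter

def rolls_until_one(rng):
    """Number of rolls of a fair six-sided die up to and including the first "1"."""
    k = 1
    while rng.randint(1, 6) != 1:
        k += 1
    return k

rng = random.Random(0)
n_samples = 100_000
counts = Counter(rolls_until_one(rng) for _ in range(n_samples))
p = 1 / 6
for k in range(1, 7):
    # empirical frequency vs. theoretical (1 - p)^(k - 1) * p
    print(k, counts[k] / n_samples, (1 - p) ** (k - 1) * p)
</syntaxhighlight>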
Properties
Memorylessness
The geometric distribution is the only memoryless discrete probability distribution. It is the discrete version of the same property found in the exponential distribution.
The property asserts that the number of previously failed trials does not affect the number of future trials needed for a success.
Because there are two definitions of the geometric distribution, there are also two definitions of memorylessness for discrete random variables. Expressed in terms of conditional probability, the two definitions are
:<math>\Pr(X > m + n \mid X > n) = \Pr(X > m)</math>
and
:<math>\Pr(Y > m + n \mid Y \ge n) = \Pr(Y > m),</math>
where <math>m</math> and <math>n</math> are natural numbers, <math>X</math> is a geometrically distributed random variable defined over <math>\{1, 2, 3, \dots\}</math>, and <math>Y</math> is a geometrically distributed random variable defined over <math>\{0, 1, 2, \dots\}</math>. Note that these definitions are not equivalent for discrete random variables; <math>Y</math> does not satisfy the first equation and <math>X</math> does not satisfy the second.
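Memorylessness can be checked numerically from the tail probabilities. The sketch below (a minimal illustration, assuming the support <math>\{1, 2, 3, \dots\}</math> and using <math>\Pr(X > k) = (1 - p)^{k}</math>) evaluates both sides of the first identity.
<syntaxhighlight lang="python">
p = 0.3

def tail(k):
    """Pr(X > k) for a geometric random variable supported on {1, 2, 3, ...}."""
    return (1 - p) ** k

m, n = 4, 7
lhs = tail(m + n) / tail(n)  # Pr(X > m + n | X > n)
rhs = tail(m)                # Pr(X > m)
print(lhs, rhs)              # both equal (1 - p)**m
</syntaxhighlight>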
Moments and cumulants
The expected value and variance of a geometrically distributed random variable <math>X</math> defined over <math>\{1, 2, 3, \dots\}</math> are
:<math>\operatorname{E}[X] = \frac{1}{p}, \qquad \operatorname{Var}[X] = \frac{1 - p}{p^{2}}.</math>
With a geometrically distributed random variable <math>Y</math> defined over <math>\{0, 1, 2, \dots\}</math>, the expected value changes into
:<math>\operatorname{E}[Y] = \frac{1 - p}{p},</math>
while the variance stays the same.
For example, when rolling a six-sided die until landing on a "1", the average number of rolls needed is <math>\frac{1}{1/6} = 6</math> and the average number of failures is <math>\frac{1 - 1/6}{1/6} = 5</math>.
The moment generating function of the geometric distribution when defined over <math>\{1, 2, 3, \dots\}</math> and <math>\{0, 1, 2, \dots\}</math> respectively is
:<math>M_X(t) = \frac{p e^{t}}{1 - (1 - p) e^{t}}, \qquad M_Y(t) = \frac{p}{1 - (1 - p) e^{t}}, \qquad t < -\ln(1 - p).</math>
The moments for the number of failures before the first success are given by
:<math>\operatorname{E}[Y^{n}] = \sum_{k=0}^{\infty} k^{n} (1 - p)^{k} p = p \operatorname{Li}_{-n}(1 - p),</math>
where <math>\operatorname{Li}_{-n}(1 - p)</math> is the polylogarithm function.
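The polylogarithm expression can be evaluated numerically. The sketch below (an illustrative addition using the third-party mpmath library as one possible choice) compares <math>p \operatorname{Li}_{-n}(1 - p)</math> with a direct, truncated summation of <math>k^{n}(1 - p)^{k} p</math>.
<syntaxhighlight lang="python">
import mpmath

p, n = 0.4, 3
# E[Y^n] via the polylogarithm identity: p * Li_{-n}(1 - p)
moment_polylog = p * mpmath.polylog(-n, 1 - p)
# Direct (truncated) summation of k^n * (1 - p)^k * p
moment_sum = sum(k ** n * (1 - p) ** k * p for k in range(2000))
print(moment_polylog, moment_sum)  # both approximate E[Y^3]
</syntaxhighlight>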
The cumulant generating function of the geometric distribution defined over <math>\{0, 1, 2, \dots\}</math> is
:<math>K(t) = \ln p - \ln\left(1 - (1 - p) e^{t}\right).</math>
The cumulants <math>\kappa_r</math> satisfy the recursion
:<math>\kappa_{r+1} = q \frac{d\kappa_r}{dq}, \qquad r = 1, 2, \dots,</math>
where <math>q = 1 - p</math>, when defined over <math>\{0, 1, 2, \dots\}</math>.
Proof of expected value
Consider the expected value <math>\operatorname{E}[X]</math> of ''X'' as above, i.e. the average number of trials until a success. The first trial either succeeds with probability <math>p</math>, or fails with probability <math>1 - p</math>. If it fails, the remaining mean number of trials until a success is identical to the original mean; this follows from the fact that all trials are independent. From this we get the formula:
:<math>\operatorname{E}[X] = p \cdot 1 + (1 - p)\left(1 + \operatorname{E}[X]\right),</math>
which, when solved for <math>\operatorname{E}[X]</math>, gives:
:<math>\operatorname{E}[X] = \frac{1}{p}.</math>
The expected number of failures <math>\operatorname{E}[Y]</math> can be found from the linearity of expectation, <math>\operatorname{E}[Y] = \operatorname{E}[X - 1] = \operatorname{E}[X] - 1 = \frac{1}{p} - 1 = \frac{1 - p}{p}</math>. It can also be shown in the following way:
:<math>\operatorname{E}[Y] = \sum_{k=0}^{\infty} k (1 - p)^{k} p = p (1 - p) \sum_{k=1}^{\infty} k (1 - p)^{k-1} = p (1 - p) \left.\frac{d}{dq}\left(\sum_{k=0}^{\infty} q^{k}\right)\right|_{q = 1 - p} = p (1 - p) \left.\frac{d}{dq}\left(\frac{1}{1 - q}\right)\right|_{q = 1 - p} = \frac{1 - p}{p}.</math>
The interchange of summation and differentiation is justified by the fact that convergent power series converge uniformly on compact subsets of the set of points where they converge.
Summary statistics
The mean of the geometric distribution is its expected value which is, as previously discussed in § Moments and cumulants, <math>\frac{1}{p}</math> or <math>\frac{1 - p}{p}</math> when defined over <math>\{1, 2, 3, \dots\}</math> or <math>\{0, 1, 2, \dots\}</math> respectively.
The median of the geometric distribution is <math>\left\lceil \frac{-1}{\log_2(1 - p)} \right\rceil</math> when defined over <math>\{1, 2, 3, \dots\}</math> and <math>\left\lceil \frac{-1}{\log_2(1 - p)} \right\rceil - 1</math> when defined over <math>\{0, 1, 2, \dots\}</math>.
The mode of the geometric distribution is the first value in the support set. This is 1 when defined over <math>\{1, 2, 3, \dots\}</math> and 0 when defined over <math>\{0, 1, 2, \dots\}</math>.
The skewness of the geometric distribution is <math>\frac{2 - p}{\sqrt{1 - p}}</math>.
The kurtosis of the geometric distribution is <math>9 + \frac{p^{2}}{1 - p}</math>.
The excess kurtosis of a distribution is the difference between its kurtosis and the kurtosis of a normal distribution, <math>3</math>. Therefore, the excess kurtosis of the geometric distribution is <math>6 + \frac{p^{2}}{1 - p}</math>. Since <math>\frac{p^{2}}{1 - p} \ge 0</math>, the excess kurtosis is always positive, so the distribution is leptokurtic. In other words, the tail of a geometric distribution decays more slowly than that of a Gaussian.
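These closed forms can be cross-checked against a standard library. The sketch below (an illustrative addition using scipy.stats.geom, which implements the distribution over <math>\{1, 2, 3, \dots\}</math>) compares the formulas above with the library's reported moments.
<syntaxhighlight lang="python">
import math
from scipy.stats import geom

p = 0.2
mean, var, skew, excess_kurtosis = geom.stats(p, moments="mvsk")
print(mean, 1 / p)                            # mean 1/p
print(var, (1 - p) / p ** 2)                  # variance (1 - p)/p^2
print(skew, (2 - p) / math.sqrt(1 - p))       # skewness
print(excess_kurtosis, 6 + p ** 2 / (1 - p))  # excess kurtosis
</syntaxhighlight>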
Entropy and Fisher's Information
Entropy (Geometric Distribution, Failures Before Success)
Entropy is a measure of uncertainty in a probability distribution. For the geometric distribution that models the number of failures before the first success, the probability mass function is:
:<math>\Pr(Y = k) = (1 - p)^{k} p, \qquad k = 0, 1, 2, \dots</math>
The entropy <math>H(Y)</math> for this distribution is defined as:
:<math>H(Y) = -\sum_{k=0}^{\infty} (1 - p)^{k} p \log\left[(1 - p)^{k} p\right] = \frac{-(1 - p)\log(1 - p) - p \log p}{p}.</math>
The entropy increases as the probability <math>p</math> decreases, reflecting greater uncertainty as success becomes rarer.
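The closed form for the entropy can be checked against a direct, truncated summation; the following sketch (an illustrative addition, using natural logarithms so the entropy is in nats) does so for one value of <math>p</math>.
<syntaxhighlight lang="python">
import math

p = 0.25
# Closed form: H(Y) = (-(1 - p) log(1 - p) - p log p) / p
closed_form = (-(1 - p) * math.log(1 - p) - p * math.log(p)) / p
# Direct (truncated) summation of -P(Y = k) log P(Y = k)
pmf = lambda k: (1 - p) ** k * p
direct = -sum(pmf(k) * math.log(pmf(k)) for k in range(2000))
print(closed_form, direct)
</syntaxhighlight>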
Fisher's Information (Geometric Distribution, Failures Before Success)
Fisher information measures the amount of information that an observable random variable <math>Y</math> carries about an unknown parameter <math>p</math>. For the geometric distribution (failures before the first success), the Fisher information with respect to <math>p</math> is given by:
:<math>I(p) = \frac{1}{p^{2}(1 - p)}.</math>
Proof:
*The likelihood function for a geometric random variable <math>Y</math> is:
:<math>L(p; y) = (1 - p)^{y} p.</math>
*The log-likelihood function is:
:<math>\ell(p; y) = y \log(1 - p) + \log p.</math>
*The score function (first derivative of the log-likelihood w.r.t. <math>p</math>) is:
:<math>\frac{\partial \ell}{\partial p} = \frac{1}{p} - \frac{y}{1 - p}.</math>
*The second derivative of the log-likelihood function is:
:<math>\frac{\partial^{2} \ell}{\partial p^{2}} = -\frac{1}{p^{2}} - \frac{y}{(1 - p)^{2}}.</math>
*Fisher information is calculated as the negative expected value of the second derivative:
:<math>I(p) = -\operatorname{E}\left[\frac{\partial^{2} \ell}{\partial p^{2}}\right] = \frac{1}{p^{2}} + \frac{\operatorname{E}[Y]}{(1 - p)^{2}} = \frac{1}{p^{2}} + \frac{1}{p(1 - p)} = \frac{1}{p^{2}(1 - p)}.</math>
Fisher information increases as <math>p</math> decreases, indicating that rarer successes provide more information about the parameter <math>p</math>.
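Since the Fisher information also equals the variance of the score, the result can be checked by simulation. The sketch below (an illustrative addition) estimates that variance from simulated failure counts and compares it with <math>\frac{1}{p^{2}(1 - p)}</math>.
<syntaxhighlight lang="python">
import random

p = 0.3
rng = random.Random(1)

def sample_failures():
    """Draw the number of failures before the first success."""
    k = 0
    while rng.random() >= p:
        k += 1
    return k

# Score of one observation y: d/dp [y log(1 - p) + log p] = 1/p - y/(1 - p)
scores = [1 / p - sample_failures() / (1 - p) for _ in range(200_000)]
mean_score = sum(scores) / len(scores)                    # approximately 0
var_score = sum((s - mean_score) ** 2 for s in scores) / len(scores)
print(var_score, 1 / (p ** 2 * (1 - p)))                  # approximately equal
</syntaxhighlight>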
Entropy (Geometric Distribution, Trials Until Success)
For the geometric distribution modeling the number of trials until the first success, the probability mass function is:
:<math>\Pr(X = k) = (1 - p)^{k-1} p, \qquad k = 1, 2, 3, \dots</math>
The entropy <math>H(X)</math> for this distribution is given by:
:<math>H(X) = -\sum_{k=1}^{\infty} (1 - p)^{k-1} p \log\left[(1 - p)^{k-1} p\right] = \frac{-(1 - p)\log(1 - p) - p \log p}{p}.</math>
Entropy increases as <math>p</math> decreases, reflecting greater uncertainty as the probability of success in each trial becomes smaller.
Fisher's Information (Geometric Distribution, Trials Until Success)
Fisher information for the geometric distribution modeling the number of trials until the first success is given by:
:<math>I(p) = \frac{1}{p^{2}(1 - p)}.</math>
Proof:
*The likelihood function for a geometric random variable <math>X</math> is:
:<math>L(p; x) = (1 - p)^{x-1} p.</math>
*The log-likelihood function is:
:<math>\ell(p; x) = (x - 1)\log(1 - p) + \log p.</math>
*The score function (first derivative of the log-likelihood w.r.t. <math>p</math>) is:
:<math>\frac{\partial \ell}{\partial p} = \frac{1}{p} - \frac{x - 1}{1 - p}.</math>
*The second derivative of the log-likelihood function is:
:<math>\frac{\partial^{2} \ell}{\partial p^{2}} = -\frac{1}{p^{2}} - \frac{x - 1}{(1 - p)^{2}}.</math>
*Fisher information is calculated as the negative expected value of the second derivative:
:<math>I(p) = -\operatorname{E}\left[\frac{\partial^{2} \ell}{\partial p^{2}}\right] = \frac{1}{p^{2}} + \frac{\operatorname{E}[X - 1]}{(1 - p)^{2}} = \frac{1}{p^{2}} + \frac{1}{p(1 - p)} = \frac{1}{p^{2}(1 - p)}.</math>
General properties
* The probability generating functions of geometric random variables <math>X</math> and <math>Y</math> defined over <math>\{1, 2, 3, \dots\}</math> and <math>\{0, 1, 2, \dots\}</math> are, respectively,
::<math>G_X(s) = \frac{p s}{1 - (1 - p) s}, \qquad G_Y(s) = \frac{p}{1 - (1 - p) s}, \qquad |s| < \frac{1}{1 - p}.</math>
* The characteristic function is equal to <math>\varphi(t) = \operatorname{E}\left[e^{itX}\right]</math>, so the geometric distribution's characteristic function, when defined over <math>\{1, 2, 3, \dots\}</math> and <math>\{0, 1, 2, \dots\}</math> respectively, is
::<math>\varphi_X(t) = \frac{p e^{it}}{1 - (1 - p) e^{it}}, \qquad \varphi_Y(t) = \frac{p}{1 - (1 - p) e^{it}}.</math>
* The entropy of a geometric distribution with parameter <math>p</math> is
::<math>H(p) = \frac{-p \log p - (1 - p)\log(1 - p)}{p}.</math>
* Given a mean <math>\mu</math>, the geometric distribution is the maximum entropy probability distribution of all discrete probability distributions supported on the non-negative integers with that mean. The corresponding continuous distribution is the exponential distribution.
* The geometric distribution defined on <math>\{0, 1, 2, \dots\}</math> is infinitely divisible, that is, for any positive integer <math>n</math>, there exist <math>n</math> independent identically distributed random variables whose sum has the same geometric distribution. This is because the negative binomial distribution can be derived from a Poisson-stopped sum of logarithmic random variables.
* The decimal digits of the geometrically distributed random variable ''Y'' are a sequence of independent (and ''not'' identically distributed) random variables. For example, the hundreds digit ''D'' has this probability distribution:
::<math>\Pr(D = d) = \frac{q^{100d}}{1 + q^{100} + q^{200} + \cdots + q^{900}}, \qquad d = 0, 1, \dots, 9,</math>
:where ''q'' = 1 − ''p'', and similarly for the other digits, and, more generally, similarly for numeral systems with other bases than 10. When the base is 2, this shows that a geometrically distributed random variable can be written as a sum of independent random variables whose probability distributions are indecomposable.
* Golomb coding is the optimal prefix code for the geometric discrete distribution.
Related distributions
* The sum of <math>r</math> independent geometric random variables with parameter <math>p</math> is a negative binomial random variable with parameters <math>r</math> and <math>p</math>. The geometric distribution is a special case of the negative binomial distribution, with <math>r = 1</math>.
*The geometric distribution is a special case of the discrete compound Poisson distribution.
* The minimum of <math>n</math> geometric random variables with parameters <math>p_1, \dots, p_n</math> is also geometrically distributed with parameter <math>1 - \prod_{i=1}^{n}(1 - p_i)</math>.
* Suppose 0 < ''r'' < 1, and for ''k'' = 1, 2, 3, ... the random variable <math>X_k</math> has a Poisson distribution with expected value <math>r^{k}/k</math>. Then
::<math>\sum_{k=1}^{\infty} k\, X_k</math>
:has a geometric distribution taking values in <math>\{0, 1, 2, \dots\}</math>, with expected value ''r''/(1 − ''r'').
* The exponential distribution is the continuous analogue of the geometric distribution. Applying the floor function to the exponential distribution with parameter <math>\lambda</math> creates a geometric distribution with parameter <math>p = 1 - e^{-\lambda}</math> defined over <math>\{0, 1, 2, \dots\}</math>. This can be used to generate geometrically distributed random numbers as detailed in § Random variate generation.
* If ''p'' = 1/''n'' and ''X'' is geometrically distributed with parameter ''p'', then the distribution of ''X''/''n'' approaches an exponential distribution with expected value 1 as ''n'' → ∞, since
::<math>\Pr(X/n > a) = \Pr(X > na) = (1 - p)^{na} = \left(1 - \frac{1}{n}\right)^{na} \to e^{-a} \quad \text{as } n \to \infty.</math>
:More generally, if ''p'' = ''λ''/''n'', where ''λ'' is a parameter, then as ''n'' → ∞ the distribution of ''X''/''n'' approaches an exponential distribution with rate ''λ'':
::<math>\lim_{n \to \infty} \Pr(X/n > x) = \lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{nx} = e^{-\lambda x},</math>
:therefore the distribution function of ''X''/''n'' converges to <math>1 - e^{-\lambda x}</math>, which is that of an exponential random variable.
* The index of dispersion of the geometric distribution defined over <math>\{0, 1, 2, \dots\}</math> is <math>\frac{1}{p}</math> and its coefficient of variation is <math>\frac{1}{\sqrt{1 - p}}</math>. The distribution is overdispersed.
Statistical inference
The true parameter <math>p</math> of an unknown geometric distribution can be inferred through estimators and conjugate distributions.
Method of moments
Provided they exist, the first <math>l</math> moments of a probability distribution can be estimated from a sample <math>x_1, \dots, x_n</math> using the formula
:<math>m_i = \frac{1}{n} \sum_{j=1}^{n} x_j^{i},</math>
where <math>m_i</math> is the <math>i</math>th sample moment and <math>1 \le i \le l</math>. Estimating <math>\operatorname{E}[X]</math> with <math>m_1</math> gives the sample mean, denoted <math>\bar{x}</math>. Substituting this estimate in the formula for the expected value of a geometric distribution and solving for <math>p</math> gives the estimators <math>\hat{p} = \frac{1}{\bar{x}}</math> and <math>\hat{p} = \frac{1}{\bar{x} + 1}</math> when supported on <math>\{1, 2, 3, \dots\}</math> and <math>\{0, 1, 2, \dots\}</math> respectively. These estimators are biased since <math>\operatorname{E}\!\left[\frac{1}{\bar{x}}\right] > \frac{1}{\operatorname{E}[\bar{x}]} = p</math> as a result of Jensen's inequality.
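A minimal sketch of the method-of-moments estimate, assuming the observations are raw trial counts supported on <math>\{1, 2, 3, \dots\}</math>:
<syntaxhighlight lang="python">
def p_hat_moments(sample):
    """Method-of-moments estimate of p from trial counts in {1, 2, 3, ...}."""
    x_bar = sum(sample) / len(sample)  # first sample moment (sample mean)
    return 1 / x_bar

print(p_hat_moments([2, 1, 5, 3, 1, 4]))  # 6/16 = 0.375
</syntaxhighlight>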
Maximum likelihood estimation
The maximum likelihood estimator of <math>p</math> is the value that maximizes the likelihood function given a sample. By finding the zero of the derivative of the log-likelihood function when the distribution is defined over <math>\{1, 2, 3, \dots\}</math>, the maximum likelihood estimator can be found to be <math>\hat{p}_{\mathrm{mle}} = \frac{1}{\bar{x}}</math>, where <math>\bar{x}</math> is the sample mean. If the domain is <math>\{0, 1, 2, \dots\}</math>, then the estimator shifts to <math>\hat{p}_{\mathrm{mle}} = \frac{1}{\bar{x} + 1}</math>. As previously discussed in § Method of moments, these estimators are biased.

Regardless of the domain, the bias is equal to
:<math>b \equiv \operatorname{E}\left[\hat{p}_{\mathrm{mle}} - p\right] = \frac{p(1 - p)}{n},</math>
which yields the bias-corrected maximum likelihood estimator,
:<math>\hat{p}^{*}_{\mathrm{mle}} = \hat{p}_{\mathrm{mle}} - \hat{b}.</math>
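The estimators above can be written out directly. The sketch below (an illustrative addition) uses the plug-in bias estimate <math>\hat{b} = \hat{p}_{\mathrm{mle}}(1 - \hat{p}_{\mathrm{mle}})/n</math>, which is one natural way to apply the correction.
<syntaxhighlight lang="python">
def p_hat_mle(sample, support_starts_at_one=True):
    """Maximum likelihood estimate of p for a geometric sample."""
    x_bar = sum(sample) / len(sample)
    return 1 / x_bar if support_starts_at_one else 1 / (x_bar + 1)

def p_hat_bias_corrected(sample, support_starts_at_one=True):
    """Bias-corrected MLE using the plug-in bias estimate p(1 - p)/n."""
    n = len(sample)
    p = p_hat_mle(sample, support_starts_at_one)
    return p - p * (1 - p) / n

sample = [2, 1, 5, 3, 1, 4]  # trial counts, supported on {1, 2, 3, ...}
print(p_hat_mle(sample), p_hat_bias_corrected(sample))
</syntaxhighlight>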
Bayesian inference
In Bayesian inference, the parameter <math>p</math> is a random variable from a prior distribution with a posterior distribution calculated using Bayes' theorem after observing samples. If a beta distribution is chosen as the prior distribution, then the posterior will also be a beta distribution and it is called the conjugate distribution. In particular, if a <math>\mathrm{Beta}(\alpha, \beta)</math> prior is selected, then the posterior, after observing samples <math>k_1, \dots, k_n \in \{1, 2, 3, \dots\}</math>, is
:<math>p \sim \mathrm{Beta}\left(\alpha + n,\ \beta + \sum_{i=1}^{n} (k_i - 1)\right).</math>
Alternatively, if the samples are in <math>\{0, 1, 2, \dots\}</math>, the posterior distribution is
:<math>p \sim \mathrm{Beta}\left(\alpha + n,\ \beta + \sum_{i=1}^{n} k_i\right).</math>
Since the expected value of a <math>\mathrm{Beta}(\alpha, \beta)</math> distribution is <math>\frac{\alpha}{\alpha + \beta}</math>, as <math>\alpha</math> and <math>\beta</math> approach zero, the posterior mean approaches its maximum likelihood estimate.
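A sketch of the conjugate update for samples supported on <math>\{1, 2, 3, \dots\}</math>, assuming a <math>\mathrm{Beta}(\alpha, \beta)</math> prior as above:
<syntaxhighlight lang="python">
def beta_posterior(alpha, beta, sample):
    """Posterior Beta parameters after observing trial counts in {1, 2, 3, ...}."""
    n = len(sample)
    return alpha + n, beta + sum(k - 1 for k in sample)

alpha_post, beta_post = beta_posterior(1.0, 1.0, [2, 1, 5, 3, 1, 4])
print(alpha_post, beta_post)                  # Beta(7, 11)
print(alpha_post / (alpha_post + beta_post))  # posterior mean of p, about 0.389
</syntaxhighlight>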
Random variate generation
The geometric distribution can be generated experimentally from i.i.d. standard uniform random variables by finding the first such random variable to be less than or equal to <math>p</math>. However, the number of random variables needed is also geometrically distributed and the algorithm slows as <math>p</math> decreases.

Random generation can be done in constant time by truncating exponential random numbers. An exponential random variable <math>E</math> can become geometrically distributed with parameter <math>p</math> through <math>\lfloor E \rfloor</math>, where <math>E</math> has rate <math>\lambda = -\ln(1 - p)</math>. In turn, <math>E</math> can be generated from a standard uniform random variable <math>U</math>, altering the formula into <math>\left\lfloor \frac{\ln U}{\ln(1 - p)} \right\rfloor</math>.
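A sketch of the constant-time generator described above (producing the number of failures, supported on <math>\{0, 1, 2, \dots\}</math>), using the standard library's uniform generator:
<syntaxhighlight lang="python">
import math
import random

def geometric_failures(p, rng):
    """Constant-time draw of the number of failures before the first success."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return math.floor(math.log(u) / math.log(1 - p))

rng = random.Random(0)
draws = [geometric_failures(1 / 6, rng) for _ in range(100_000)]
print(sum(draws) / len(draws))  # close to (1 - p)/p = 5
</syntaxhighlight>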
Applications
The geometric distribution is used in many disciplines. In queueing theory, the M/M/1 queue has a steady state following a geometric distribution. In stochastic processes, the Yule–Furry process is geometrically distributed. The distribution also arises when modeling the lifetime of a device in discrete contexts. It has also been used to fit data including modeling patients spreading COVID-19.
See also
* Hypergeometric distribution
* Coupon collector's problem
* Compound Poisson distribution
* Negative binomial distribution