Poisson Distribution

In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event. It is named after French mathematician Siméon Denis Poisson. The Poisson distribution can also be used for the number of events in other specified interval types such as distance, area, or volume.

For instance, a call center receives an average of 180 calls per hour, 24 hours a day. The calls are independent; receiving one does not change the probability of when the next one will arrive. The number of calls received during any minute has a Poisson probability distribution with mean 3: the most likely numbers are 2 and 3, but 1 and 4 are also likely, and there is a small probability of it being as low as zero and a very small probability it could be 10. Another example is the number of decay events that occur from a radioactive source during a defined observation period.


History

The distribution was first introduced by Siméon Denis Poisson (1781–1840) and published together with his probability theory in his work ''Recherches sur la probabilité des jugements en matière criminelle et en matière civile'' (1837). The work theorized about the number of wrongful convictions in a given country by focusing on certain random variables that count, among other things, the number of discrete occurrences (sometimes called "events" or "arrivals") that take place during a time interval of given length. The result had already been given in 1711 by Abraham de Moivre in ''De Mensura Sortis seu; de Probabilitate Eventuum in Ludis a Casu Fortuito Pendentibus''. This makes it an example of Stigler's law, and it has prompted some authors to argue that the Poisson distribution should bear the name of de Moivre.

In 1860, Simon Newcomb fitted the Poisson distribution to the number of stars found in a unit of space. A further practical application of this distribution was made by Ladislaus Bortkiewicz in 1898 when he was given the task of investigating the number of soldiers in the Prussian army killed accidentally by horse kicks; this experiment introduced the Poisson distribution to the field of reliability engineering.


Definitions


Probability mass function

A discrete random variable X is said to have a Poisson distribution, with parameter \lambda>0, if it has a probability mass function given by:
:f(k; \lambda) = \Pr(X{=}k)= \frac{\lambda^k e^{-\lambda}}{k!},
where
* k is the number of occurrences (k = 0, 1, 2, \ldots)
* e is Euler's number (e = 2.71828\ldots)
* ! is the factorial function.
The positive real number \lambda is equal to the expected value of X and also to its variance.
:\lambda = \operatorname{E}(X) = \operatorname{Var}(X).
The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. The number of such events that occur during a fixed time interval is, under the right circumstances, a random number with a Poisson distribution.

The equation can be adapted if, instead of the average number of events \lambda, we are given the average rate r at which events occur. Then \lambda = r t, and:
: P(k \text{ events in interval } t) = \frac{(rt)^k e^{-rt}}{k!}.
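The probability mass function and its rate form can be evaluated directly. The following is a minimal Python sketch, not a reference implementation; the function names `poisson_pmf` and `poisson_pmf_rate` are illustrative choices, and the log-space evaluation is an assumption made here to avoid overflow for large k.

```python
import math


def poisson_pmf(k, lam):
    """Return P(X = k) for X ~ Poisson(lam), i.e. lam**k * exp(-lam) / k!."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if k < 0:
        return 0.0
    # Evaluate in log space: k*log(lam) - lam - log(k!), with log(k!) = lgamma(k + 1).
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def poisson_pmf_rate(k, rate, t):
    """Return P(k events in an interval of length t) for a constant event rate."""
    return poisson_pmf(k, rate * t)


if __name__ == "__main__":
    # Call-center example: 180 calls per hour observed for one minute (1/60 hour),
    # which gives a mean of 3 calls per minute.
    for k in range(11):
        print(k, round(poisson_pmf_rate(k, 180.0, 1 / 60), 4))
```

Running the script prints the probabilities of 0 to 10 calls in one minute for the call-center example; the values at k = 2 and k = 3 coincide because the mean is the integer 3.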


Example

The Poisson distribution may be useful to model events such as:
* the number of meteorites greater than 1 meter in diameter that strike Earth in a year;
* the number of laser photons hitting a detector in a particular time interval; and
* the number of students achieving a low and high mark in an exam.


Assumptions and validity

The Poisson distribution is an appropriate model if the following assumptions are true:
* k is the number of times an event occurs in an interval, and k can take values 0, 1, 2, ... .
* The occurrence of one event does not affect the probability that a second event will occur. That is, events occur independently.
* The average rate at which events occur is independent of any occurrences. For simplicity, this is usually assumed to be constant, but may in practice vary with time.
* Two events cannot occur at exactly the same instant; instead, at each very small sub-interval, either exactly one event occurs, or no event occurs.
If these conditions are true, then k is a Poisson random variable, and the distribution of k is a Poisson distribution.

The Poisson distribution is also the limit of a binomial distribution, for which the probability of success for each trial equals \lambda divided by the number of trials, as the number of trials approaches infinity (see Related distributions).


Examples of probability for Poisson distributions

On a particular river, overflow floods occur once every 100 years on average. Calculate the probability of k = 0, 1, 2, 3, 4, 5, or 6 overflow floods in a 100 year interval, assuming the Poisson model is appropriate. Because the average event rate is one overflow flood per 100 years, \lambda = 1.
: P(k \text{ overflow floods in 100 years}) = \frac{\lambda^k e^{-\lambda}}{k!} = \frac{1^k e^{-1}}{k!}
: P(k = 0 \text{ overflow floods in 100 years}) = \frac{1^0 e^{-1}}{0!} = \frac{e^{-1}}{1} \approx 0.368
: P(k = 1 \text{ overflow flood in 100 years}) = \frac{1^1 e^{-1}}{1!} = \frac{e^{-1}}{1} \approx 0.368
: P(k = 2 \text{ overflow floods in 100 years}) = \frac{1^2 e^{-1}}{2!} = \frac{e^{-1}}{2} \approx 0.184
The probabilities for 0 to 6 overflow floods in a 100 year period are approximately 0.368, 0.368, 0.184, 0.061, 0.015, 0.003, and 0.0005.

María Dolores Ugarte and colleagues report that the average number of goals in a World Cup soccer match is approximately 2.5 and the Poisson model is appropriate. Because the average event rate is 2.5 goals per match, \lambda = 2.5.
: P(k \text{ goals in a match}) = \frac{2.5^k e^{-2.5}}{k!}
: P(k = 0 \text{ goals in a match}) = \frac{2.5^0 e^{-2.5}}{0!} = \frac{e^{-2.5}}{1} \approx 0.082
: P(k = 1 \text{ goal in a match}) = \frac{2.5^1 e^{-2.5}}{1!} = \frac{2.5 e^{-2.5}}{1} \approx 0.205
: P(k = 2 \text{ goals in a match}) = \frac{2.5^2 e^{-2.5}}{2!} = \frac{6.25 e^{-2.5}}{2} \approx 0.257
The probabilities for 0 to 7 goals in a match are approximately 0.082, 0.205, 0.257, 0.214, 0.134, 0.067, 0.028, and 0.010.
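The two worked examples can be reproduced with a few lines of code. The sketch below is illustrative only; `poisson_table` is a hypothetical helper name, and the direct factorial formula is used because the counts involved are small.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), using the direct formula (fine for small k)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def poisson_table(lam, k_max):
    """Return the list [P(X = 0), ..., P(X = k_max)]."""
    return [poisson_pmf(k, lam) for k in range(k_max + 1)]


if __name__ == "__main__":
    # Overflow floods: lambda = 1 per 100 years, k = 0..6.
    for k, p in enumerate(poisson_table(1.0, 6)):
        print(f"floods k={k}: {p:.4f}")
    # World Cup goals: lambda = 2.5 per match, k = 0..7.
    for k, p in enumerate(poisson_table(2.5, 7)):
        print(f"goals  k={k}: {p:.3f}")
```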


Once in an interval events: The special case of \lambda = 1 and k = 0

Suppose that astronomers estimate that large meteorites (above a certain size) hit the earth on average once every 100 years (\lambda = 1 event per 100 years), and that the number of meteorite hits follows a Poisson distribution. What is the probability of k = 0 meteorite hits in the next 100 years?
: P(k = \text{0 meteorites hit in next 100 years}) = \frac{1^0 e^{-1}}{0!} = \frac{1}{e} \approx 0.37.
Under these assumptions, the probability that no large meteorites hit the earth in the next 100 years is roughly 0.37. The remaining 1 − 0.37 = 0.63 is the probability of 1, 2, 3, or more large meteorite hits in the next 100 years. In an example above, an overflow flood occurred once every 100 years (\lambda = 1). The probability of no overflow floods in 100 years was roughly 0.37, by the same calculation.

In general, if an event occurs on average once per interval (\lambda = 1), and the events follow a Poisson distribution, then P(0 events in next interval) ≈ 0.37. In addition, P(exactly one event in next interval) ≈ 0.37, as shown for overflow floods.


Examples that violate the Poisson assumptions

The number of students who arrive at the student union per minute will likely not follow a Poisson distribution, because the rate is not constant (low rate during class time, high rate between class times) and the arrivals of individual students are not independent (students tend to come in groups). The non-constant arrival rate may be modeled as a mixed Poisson distribution, and the arrival of groups rather than individual students as a compound Poisson process.

The number of magnitude 5 earthquakes per year in a country may not follow a Poisson distribution, if one large earthquake increases the probability of aftershocks of similar magnitude.

Examples in which at least one event is guaranteed are not Poisson distributed, but may be modeled using a zero-truncated Poisson distribution.

Count distributions in which the number of intervals with zero events is higher than predicted by a Poisson model may be modeled using a zero-inflated model.


Properties


Descriptive statistics

* The expected value and variance of a Poisson-distributed random variable are both equal to \lambda.
* The coefficient of variation is \lambda^{-1/2}, while the index of dispersion is 1.
* The mean absolute deviation about the mean is \operatorname{E}[\,|X-\lambda|\,] = \frac{2 \lambda^{\lfloor\lambda\rfloor + 1} e^{-\lambda}}{\lfloor\lambda\rfloor!}.
* The mode of a Poisson-distributed random variable with non-integer \lambda is equal to \lfloor \lambda \rfloor, which is the largest integer less than or equal to \lambda. This is also written as floor(\lambda). When \lambda is a positive integer, the modes are \lambda and \lambda − 1.
* All of the cumulants of the Poisson distribution are equal to the expected value \lambda. The n-th factorial moment of the Poisson distribution is \lambda^n.
* The expected value of a Poisson process is sometimes decomposed into the product of ''intensity'' and ''exposure'' (or more generally expressed as the integral of an "intensity function" over time or space, sometimes described as "exposure").


Median

Bounds for the median (\nu) of the distribution are known and are sharp:
: \lambda - \ln 2 \le \nu < \lambda + \frac{1}{3}.
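The median bounds can be checked numerically for particular values of \lambda. The sketch below is an illustration, not part of the cited result: `poisson_median` is a hypothetical helper that takes the median to be the smallest integer m with P(X ≤ m) ≥ 1/2, and it assumes \lambda is small enough that exp(−\lambda) does not underflow.

```python
import math


def poisson_median(lam):
    """Smallest integer m such that P(X <= m) >= 1/2 for X ~ Poisson(lam)."""
    term = math.exp(-lam)  # P(X = 0); underflows for lam above roughly 745
    cdf = term
    m = 0
    while cdf < 0.5:
        m += 1
        term *= lam / m    # P(X = m) from P(X = m - 1)
        cdf += term
    return m


def median_within_bounds(lam):
    """Check lam - ln 2 <= median < lam + 1/3."""
    nu = poisson_median(lam)
    return lam - math.log(2) <= nu < lam + 1 / 3


if __name__ == "__main__":
    for lam in (0.5, 1.0, 2.5, 3.0, 10.0, 25.7):
        print(lam, poisson_median(lam), median_within_bounds(lam))
```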


Higher moments

The higher non-centered moments, m_k, of the Poisson distribution are Touchard polynomials in \lambda:
: m_k = \sum_{i=0}^k \lambda^i \begin{Bmatrix} k \\ i \end{Bmatrix},
where the braces denote Stirling numbers of the second kind. The coefficients of the polynomials have a combinatorial meaning. In fact, when the expected value of the Poisson distribution is 1, then Dobinski's formula says that the n‑th moment equals the number of partitions of a set of size n.

A simple bound is
: m_k = E[X^k] \le \left(\frac{k}{\log(k/\lambda + 1)}\right)^k \le \lambda^k \exp\left(\frac{k^2}{2\lambda}\right).


Sums of Poisson-distributed random variables

If X_i \sim \operatorname{Pois}(\lambda_i) for i=1,\dotsc,n are independent, then \sum_{i=1}^n X_i \sim \operatorname{Pois}\left(\sum_{i=1}^n \lambda_i\right). A converse is Raikov's theorem, which says that if the sum of two independent random variables is Poisson-distributed, then so are each of those two independent random variables.
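The additivity property can be illustrated for two variables by convolving their probability mass functions and comparing the result with a single Poisson distribution whose parameter is the sum. This is a numerical sanity check under illustrative parameter values, not a proof; `pmf_of_sum` is a hypothetical helper name.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def pmf_of_sum(k, lam1, lam2):
    """P(X1 + X2 = k) for independent X1 ~ Pois(lam1), X2 ~ Pois(lam2), by convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))


if __name__ == "__main__":
    lam1, lam2 = 1.5, 2.0  # illustrative values
    for k in range(8):
        print(k, pmf_of_sum(k, lam1, lam2), poisson_pmf(k, lam1 + lam2))
```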


Other properties

* The Poisson distributions are infinitely divisible probability distributions.
* The directed Kullback–Leibler divergence of \operatorname{Pois}(\lambda_0) from \operatorname{Pois}(\lambda) is given by \operatorname{D}_{\text{KL}}(\lambda\mid\lambda_0) = \lambda_0 - \lambda + \lambda \log \frac{\lambda}{\lambda_0}.
* If \lambda \geq 1 is an integer, then Y\sim \operatorname{Pois}(\lambda) satisfies \Pr(Y \geq E[Y]) \geq \frac{1}{2} and \Pr(Y \leq E[Y]) \geq \frac{1}{2}.
* Bounds for the tail probabilities of a Poisson random variable X \sim \operatorname{Pois}(\lambda) can be derived using a Chernoff bound argument.
*: P(X \geq x) \leq \frac{(e \lambda)^x e^{-\lambda}}{x^x}, \text{ for } x > \lambda,
*: P(X \leq x) \leq \frac{(e \lambda)^x e^{-\lambda}}{x^x}, \text{ for } x < \lambda.
* The upper tail probability can be tightened (by a factor of at least two) as follows:
*: P(X \geq x) \leq \frac{e^{-\operatorname{D}_{\text{KL}}(x\mid\lambda)}}{\max\left(2, \sqrt{4\pi\operatorname{D}_{\text{KL}}(x\mid\lambda)}\right)}, \text{ for } x > \lambda,
*: where \operatorname{D}_{\text{KL}}(x\mid\lambda) is the directed Kullback–Leibler divergence, as described above.
* Inequalities that relate the distribution function of a Poisson random variable X \sim \operatorname{Pois}(\lambda) to the standard normal distribution function \Phi(x) are as follows:
*: \Phi\left(\operatorname{sign}(k-\lambda)\sqrt{2\operatorname{D}_{\text{KL}}(k\mid\lambda)}\right) < P(X \leq k) < \Phi\left(\operatorname{sign}(k-\lambda+1)\sqrt{2\operatorname{D}_{\text{KL}}(k+1\mid\lambda)}\right), \text{ for } k > 0,
*: where \operatorname{D}_{\text{KL}}(k\mid\lambda) is again the directed Kullback–Leibler divergence.
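The two Chernoff-type tail bounds in the list above can be compared with exact tail probabilities. The sketch below covers only those two bounds; it uses the identity (e\lambda)^x e^{-\lambda} / x^x = exp(−D_KL(x | \lambda)), and the helper names and test values are assumptions made for illustration.

```python
import math


def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), by summing the pmf."""
    term = math.exp(-lam)
    total = 0.0
    for j in range(k + 1):
        total += term
        term *= lam / (j + 1)
    return total


def kl_divergence(x, lam):
    """Directed KL divergence D_KL(x | lam) = lam - x + x * log(x / lam), for x > 0."""
    return lam - x + x * math.log(x / lam)


def chernoff_bound(x, lam):
    """(e*lam)**x * exp(-lam) / x**x, written as exp(-D_KL(x | lam))."""
    return math.exp(-kl_divergence(x, lam))


if __name__ == "__main__":
    lam = 5.0
    upper_exact = 1.0 - poisson_cdf(9, lam)  # P(X >= 10)
    lower_exact = poisson_cdf(2, lam)        # P(X <= 2)
    print("P(X >= 10):", upper_exact, "bound:", chernoff_bound(10, lam))
    print("P(X <= 2): ", lower_exact, "bound:", chernoff_bound(2, lam))
```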


Poisson races

Let X \sim \operatorname{Pois}(\lambda) and Y \sim \operatorname{Pois}(\mu) be independent random variables, with \lambda < \mu. Then we have that
: \frac{e^{-(\sqrt{\mu} -\sqrt{\lambda})^2 }}{(\lambda + \mu)^2} - \frac{e^{-(\lambda + \mu)}}{2\sqrt{\lambda \mu}} - \frac{e^{-(\lambda + \mu)}}{4\lambda \mu} \leq P(X - Y \geq 0) \leq e^{- (\sqrt{\mu} -\sqrt{\lambda})^2}
The upper bound is proved using a standard Chernoff bound. The lower bound can be proved by noting that P(X-Y\geq0\mid X+Y=i) is the probability that Z \geq \frac{i}{2}, where Z \sim \operatorname{Bin}\left(i, \frac{\lambda}{\lambda+\mu}\right), which is bounded below by \frac{1}{(i+1)^2} e^{-iD\left(0.5 \| \frac{\lambda}{\lambda+\mu}\right)}, where D is relative entropy (see the entry on bounds on tails of binomial distributions for details). Further noting that X+Y \sim \operatorname{Pois}(\lambda+\mu), and computing a lower bound on the unconditional probability gives the result. More details can be found in the appendix of Kamath ''et al.''.


Related distributions


As a Binomial distribution with infinitesimal time-steps

The Poisson distribution can be derived as a limiting case to the binomial distribution as the number of trials goes to infinity and the expected number of successes remains fixed; see law of rare events below. Therefore, it can be used as an approximation of the binomial distribution if ''n'' is sufficiently large and ''p'' is sufficiently small. The Poisson distribution is a good approximation of the binomial distribution if ''n'' is at least 20 and ''p'' is smaller than or equal to 0.05, and an excellent approximation if ''n'' ≥ 100 and ''np'' ≤ 10.
: F_\mathrm{Binomial}(k;n, p) \approx F_\mathrm{Poisson}(k;\lambda=np)
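The quality of the approximation can be gauged by the largest gap between the two cumulative distribution functions. The sketch below is one way to measure it; `max_cdf_gap` is a hypothetical helper, and the two parameter pairs simply sit at the "good" and "excellent" thresholds quoted above with the same mean np = 1.

```python
import math


def binomial_pmf(k, n, p):
    """P(B = k) for B ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), evaluated in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def max_cdf_gap(n, p):
    """Largest |F_Binomial(k; n, p) - F_Poisson(k; n*p)| over k = 0..n."""
    lam = n * p
    b_cdf = p_cdf = gap = 0.0
    for k in range(n + 1):
        b_cdf += binomial_pmf(k, n, p)
        p_cdf += poisson_pmf(k, lam)
        gap = max(gap, abs(b_cdf - p_cdf))
    return gap


if __name__ == "__main__":
    print("n=20,  p=0.05:", max_cdf_gap(20, 0.05))
    print("n=100, p=0.01:", max_cdf_gap(100, 0.01))
```

The gap shrinks as ''n'' grows with ''np'' held fixed, which is the behaviour the limiting argument predicts.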


General

* If X_1 \sim \mathrm{Pois}(\lambda_1)\, and X_2 \sim \mathrm{Pois}(\lambda_2)\, are independent, then the difference Y = X_1 - X_2 follows a Skellam distribution.
* If X_1 \sim \mathrm{Pois}(\lambda_1)\, and X_2 \sim \mathrm{Pois}(\lambda_2)\, are independent, then the distribution of X_1 conditional on X_1+X_2 is a binomial distribution. Specifically, if X_1+X_2=k, then X_1 \mid X_1+X_2=k \sim \mathrm{Binom}(k, \lambda_1/(\lambda_1+\lambda_2)). More generally, if ''X''1, ''X''2, ..., ''X''''n'' are independent Poisson random variables with parameters ''λ''1, ''λ''2, ..., ''λ''''n'', then
*: given \sum_{j=1}^n X_j=k, it follows that X_i \Big| \sum_{j=1}^n X_j=k \sim \mathrm{Binom}\left(k, \frac{\lambda_i}{\sum_{j=1}^n \lambda_j}\right). In fact, \{X_i\} \sim \mathrm{Multinom}\left(k, \left\{\frac{\lambda_i}{\sum_{j=1}^n \lambda_j}\right\}\right).
* If X \sim \mathrm{Pois}(\lambda)\, and the distribution of Y conditional on ''X'' = ''k'' is a binomial distribution, Y \mid (X = k) \sim \mathrm{Binom}(k, p), then the distribution of Y follows a Poisson distribution Y \sim \mathrm{Pois}(\lambda \cdot p). In fact, if, conditional on \{X = k\}, \{Y_i\} follows a multinomial distribution, \{Y_i\} \mid (X = k) \sim \mathrm{Multinom}\left(k, p_i\right), then each Y_i follows an independent Poisson distribution Y_i \sim \mathrm{Pois}(\lambda \cdot p_i), \rho(Y_i, Y_j) = 0.
* The Poisson distribution is a special case of the discrete compound Poisson distribution (or stuttering Poisson distribution) with only one parameter. The discrete compound Poisson distribution can be deduced from the limiting distribution of the univariate multinomial distribution. It is also a special case of a compound Poisson distribution.
* For sufficiently large values of \lambda (say \lambda > 1000), the normal distribution with mean \lambda and variance \lambda (standard deviation \sqrt{\lambda}) is an excellent approximation to the Poisson distribution. If \lambda is greater than about 10, then the normal distribution is a good approximation if an appropriate continuity correction is performed, i.e., if P(X \le x), where ''x'' is a non-negative integer, is replaced by P(X \le x + 0.5).
*: F_\mathrm{Poisson}(x;\lambda) \approx F_\mathrm{normal}(x;\mu=\lambda,\sigma^2=\lambda)
* Variance-stabilizing transformation: If X \sim \mathrm{Pois}(\lambda), then Y = 2 \sqrt{X} \approx \mathcal{N}(2\sqrt{\lambda};1), and Y = \sqrt{X} \approx \mathcal{N}(\sqrt{\lambda};1/4). Under this transformation, the convergence to normality (as \lambda increases) is far faster than for the untransformed variable. Other, slightly more complicated, variance-stabilizing transformations are available, one of which is the Anscombe transform. See Data transformation (statistics) for more general uses of transformations.
* If for every ''t'' > 0 the number of arrivals in the time interval [0, ''t''] follows the Poisson distribution with mean ''λt'', then the sequence of inter-arrival times are independent and identically distributed exponential random variables having mean 1/''λ''.
* The cumulative distribution functions of the Poisson and chi-squared distributions are related in the following ways:
*: F_\text{Poisson}(k;\lambda) = 1-F_{\chi^2}(2\lambda;2(k+1)) \quad\quad \text{ integer } k,
*: and
*: P(X=k)=F_{\chi^2}(2\lambda;2(k+1)) -F_{\chi^2}(2\lambda;2k).
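The normal approximation with continuity correction mentioned in the list above can be sketched as follows. This is an illustrative comparison only: the function names are hypothetical, the normal CDF is written with `math.erf`, and \lambda = 20 is an arbitrary choice above the "about 10" threshold.

```python
import math


def poisson_cdf(k, lam):
    """Exact P(X <= k) for X ~ Poisson(lam), by summing the pmf."""
    term = math.exp(-lam)
    total = 0.0
    for j in range(k + 1):
        total += term
        term *= lam / (j + 1)
    return total


def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_approx_cdf(k, lam, continuity=True):
    """Approximate P(X <= k) by a normal with mean lam and variance lam."""
    x = k + 0.5 if continuity else float(k)
    return normal_cdf(x, lam, math.sqrt(lam))


if __name__ == "__main__":
    lam = 20.0
    for k in (15, 20, 25):
        print(k, poisson_cdf(k, lam),
              normal_approx_cdf(k, lam, continuity=True),
              normal_approx_cdf(k, lam, continuity=False))
```

For these values the continuity-corrected column tracks the exact CDF more closely than the uncorrected one.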


Poisson approximation

Assume X_1\sim\operatorname{Pois}(\lambda_1), X_2\sim\operatorname{Pois}(\lambda_2), \dots, X_n\sim\operatorname{Pois}(\lambda_n) where \lambda_1 + \lambda_2 + \dots + \lambda_n=1, then (X_1, X_2, \dots, X_n) is multinomially distributed
: (X_1, X_2, \dots, X_n) \sim \operatorname{Mult}(N, \lambda_1, \lambda_2, \dots, \lambda_n)
conditioned on N = X_1 + X_2 + \dots + X_n.

This means, among other things, that for any nonnegative function f(x_1, x_2, \dots, x_n), if (Y_1, Y_2, \dots, Y_n)\sim\operatorname{Mult}(m, \mathbf{p}) is multinomially distributed, then
: \operatorname{E}[f(Y_1, Y_2, \dots, Y_n)] \le e\sqrt{m}\operatorname{E}[f(X_1, X_2, \dots, X_n)]
where (X_1, X_2, \dots, X_n)\sim\operatorname{Pois}(\mathbf{p}).

The factor of e\sqrt{m} can be replaced by 2 if f is further assumed to be monotonically increasing or decreasing.


Bivariate Poisson distribution

This distribution has been extended to the bivariate case. The generating function for this distribution is g( u, v ) = \exp[ ( \theta_1 - \theta_{12} )( u - 1 ) + ( \theta_2 - \theta_{12} )(v - 1) + \theta_{12} ( uv - 1 ) ] with \theta_1, \theta_2 > \theta_{12} > 0. The marginal distributions are Poisson(''θ''1) and Poisson(''θ''2) and the correlation coefficient is limited to the range 0 \le \rho \le \min\left\{ \sqrt{\frac{\theta_1}{\theta_2}}, \sqrt{\frac{\theta_2}{\theta_1}} \right\}. A simple way to generate a bivariate Poisson pair X_1,X_2 is to take three independent Poisson random variables Y_1,Y_2,Y_3 with means \lambda_1,\lambda_2,\lambda_3 and then set X_1 = Y_1 + Y_3, X_2 = Y_2 + Y_3. In terms of the generating function above, \theta_1 = \lambda_1 + \lambda_3, \theta_2 = \lambda_2 + \lambda_3 and \theta_{12} = \lambda_3. The probability function of the bivariate Poisson distribution is \Pr(X_1=k_1,X_2=k_2) = \exp\left(-\lambda_1-\lambda_2-\lambda_3\right) \frac{\lambda_1^{k_1}}{k_1!} \frac{\lambda_2^{k_2}}{k_2!} \sum_{k=0}^{\min(k_1,k_2)} \binom{k_1}{k} \binom{k_2}{k} k! \left( \frac{\lambda_3}{\lambda_1\lambda_2}\right)^k.
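As a sanity check on the probability function above, the sketch below evaluates it directly and confirms that the marginal of X_1 is Poisson with mean \lambda_1 + \lambda_3 and that the covariance equals \lambda_3. The function names and the parameter values are illustrative assumptions, and the infinite sums are truncated at a point where the tails are negligible for these small means.

```python
import math


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def bivariate_poisson_pmf(k1, k2, lam1, lam2, lam3):
    """Pr(X1 = k1, X2 = k2) for X1 = Y1 + Y3, X2 = Y2 + Y3."""
    prefactor = (math.exp(-(lam1 + lam2 + lam3))
                 * lam1 ** k1 / math.factorial(k1)
                 * lam2 ** k2 / math.factorial(k2))
    ratio = lam3 / (lam1 * lam2)
    series = sum(
        math.comb(k1, k) * math.comb(k2, k) * math.factorial(k) * ratio ** k
        for k in range(min(k1, k2) + 1)
    )
    return prefactor * series


# Hypothetical parameter values chosen only for illustration.
lam1, lam2, lam3 = 1.0, 2.0, 0.5

# Marginal of X1 at k1 = 2, summing k2 over a truncated range.
marginal = sum(bivariate_poisson_pmf(2, k2, lam1, lam2, lam3) for k2 in range(60))
print(marginal, poisson_pmf(2, lam1 + lam3))

# Covariance over a truncated grid; it should be close to lam3.
grid = range(40)
e_x1x2 = sum(a * b * bivariate_poisson_pmf(a, b, lam1, lam2, lam3)
             for a in grid for b in grid)
print(e_x1x2 - (lam1 + lam3) * (lam2 + lam3))
```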


Free Poisson distribution

The free Poisson distribution with jump size \alpha and rate \lambda arises in free probability theory as the limit of repeated free convolution \left( \left(1-\frac{\lambda}{N}\right)\delta_0 + \frac{\lambda}{N}\delta_\alpha\right)^{\boxplus N} as N \to \infty. In other words, let X_N be random variables so that X_N has value \alpha with probability \frac{\lambda}{N} and value 0 with the remaining probability. Assume also that the family X_1, X_2, \ldots are freely independent. Then the limit as N \to \infty of the law of X_1 + \cdots +X_N is given by the free Poisson law with parameters \lambda,\alpha. This definition is analogous to one of the ways in which the classical Poisson distribution is obtained from a (classical) Poisson process. The measure associated to the free Poisson law is given by \mu=\begin{cases} (1-\lambda) \delta_0 + \lambda \nu,& \text{if } 0\leq \lambda \leq 1 \\ \nu, & \text{if }\lambda >1, \end{cases} where \nu = \frac{1}{2\pi\alpha t}\sqrt{4\lambda \alpha^2 - ( t - \alpha (1+\lambda))^2} \, dt and has support [\alpha (1-\sqrt{\lambda})^2,\alpha (1+\sqrt{\lambda})^2]. This law also arises in random matrix theory as the Marchenko–Pastur law. Its free cumulants are equal to \kappa_n=\lambda\alpha^n.


Some transforms of this law

We give values of some important transforms of the free Poisson law; the computation can be found, for example, in the book ''Lectures on the Combinatorics of Free Probability'' by A. Nica and R. Speicher. The R-transform of the free Poisson law is given by R(z)=\frac{\lambda \alpha}{1-\alpha z}. The Cauchy transform (which is the negative of the Stieltjes transformation) is given by G(z) = \frac{ z + \alpha - \lambda \alpha - \sqrt{ (z-\alpha (1+\lambda))^2 - 4 \lambda \alpha^2}}{2\alpha z}. The S-transform is given by S(z) = \frac{1}{z+\lambda} in the case that \alpha = 1.


Weibull and Stable count

Poisson's probability mass function f(k; \lambda) can be expressed in a form similar to the product distribution of a Weibull distribution and a variant form of the stable count distribution. The variable (k+1) can be regarded as inverse of Lévy's stability parameter in the stable count distribution: f(k; \lambda) = \displaystyle\int_0^\infty \frac{1}{u} \, W_{k+1}\left(\frac{\lambda}{u}\right) \left[ \left(k+1\right) u^k \, \mathfrak{N}_{\frac{1}{k+1}}\left(u^{k+1}\right) \right] \, du , where \mathfrak{N}_{\alpha}(\nu) is a standard stable count distribution of shape \alpha = 1/\left(k+1\right), and W_{k+1}(x) is a standard Weibull distribution of shape k+1.


Statistical inference


Parameter estimation

Given a sample of ''n'' measured values k_i \in \{0, 1, \dots\}, for ''i'' = 1, ..., ''n'', we wish to estimate the value of the parameter ''λ'' of the Poisson population from which the sample was drawn. The maximum likelihood estimate is
:\widehat{\lambda}_\mathrm{MLE}=\frac{1}{n}\sum_{i=1}^n k_i .
Since each observation has expectation ''λ'', so does the sample mean. Therefore, the maximum likelihood estimate is an unbiased estimator of ''λ''. It is also an efficient estimator since its variance achieves the Cramér–Rao lower bound (CRLB). Hence it is minimum-variance unbiased. It can also be proven that the sum (and hence the sample mean, as it is a one-to-one function of the sum) is a complete and sufficient statistic for ''λ''.

To prove sufficiency we may use the factorization theorem. Consider partitioning the probability mass function of the joint Poisson distribution for the sample into two parts: one that depends solely on the sample \mathbf{x} (called h(\mathbf{x})) and one that depends on the parameter \lambda and the sample \mathbf{x} only through the function T(\mathbf{x}). Then T(\mathbf{x}) is a sufficient statistic for \lambda.
: P(\mathbf{x})=\prod_{i=1}^n\frac{\lambda^{x_i} e^{-\lambda}}{x_i!}=\frac{1}{\prod_{i=1}^n x_i!} \times \lambda^{\sum_{i=1}^n x_i}e^{-n\lambda}
The first term, h(\mathbf{x}), depends only on \mathbf{x}. The second term, g(T(\mathbf{x}), \lambda), depends on the sample only through T(\mathbf{x})=\sum_{i=1}^n x_i. Thus, T(\mathbf{x}) is sufficient.

To find the parameter ''λ'' that maximizes the probability function for the Poisson population, we can use the logarithm of the likelihood function:
: \begin{align} \ell(\lambda) & = \ln \prod_{i=1}^n f(k_i \mid \lambda) \\ & = \sum_{i=1}^n \ln\!\left(\frac{e^{-\lambda}\lambda^{k_i}}{k_i!}\right) \\ & = -n\lambda + \left(\sum_{i=1}^n k_i\right) \ln(\lambda) - \sum_{i=1}^n \ln(k_i!). \end{align}
We take the derivative of \ell with respect to ''λ'' and set it equal to zero:
: \frac{\mathrm{d}}{\mathrm{d}\lambda} \ell(\lambda) = 0 \iff -n + \left(\sum_{i=1}^n k_i\right) \frac{1}{\lambda} = 0.
Solving for ''λ'' gives a stationary point:
: \lambda = \frac{\sum_{i=1}^n k_i}{n}
So ''λ'' is the average of the ''k''''i'' values. The sign of the second derivative of \ell at the stationary point determines what kind of extreme value ''λ'' is:
: \frac{\partial^2 \ell}{\partial \lambda^2} = -\lambda^{-2}\sum_{i=1}^n k_i
Evaluating the second derivative ''at the stationary point'' gives
: \frac{\partial^2 \ell}{\partial \lambda^2} = - \frac{n^2}{\sum_{i=1}^n k_i}
which is the negative of ''n'' times the reciprocal of the average of the ''k''''i''. This expression is negative when the average is positive. If this is satisfied, then the stationary point maximizes the probability function.

For completeness, a family of distributions is said to be complete if and only if E(g(T)) = 0 for all \lambda implies that P_\lambda(g(T) = 0) = 1 for all \lambda. If the individual X_i are iid \mathrm{Po}(\lambda), then T(\mathbf{x})=\sum_{i=1}^n X_i\sim \mathrm{Po}(n\lambda). Knowing the distribution we want to investigate, it is easy to see that the statistic is complete.
:E(g(T))=\sum_{t=0}^\infty g(t)\frac{(n\lambda)^te^{-n\lambda}}{t!} = 0
For this equality to hold for all \lambda, g(t) must be 0 for every ''t'': after dividing by e^{-n\lambda}, the sum is a power series in \lambda, which vanishes identically only if every coefficient g(t)n^t/t! is zero, and the factors n^t/t! are never zero. Hence, E(g(T)) = 0 for all \lambda implies that P_\lambda(g(T) = 0) = 1, and the statistic has been shown to be complete.
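The derivation above can be illustrated on a small sample. This is a minimal sketch: the sample values are hypothetical, and the function names are chosen here for illustration only.

```python
import math


def poisson_log_likelihood(lam, counts):
    """l(lam) = -n*lam + (sum k_i) * ln(lam) - sum ln(k_i!)."""
    n = len(counts)
    return (-n * lam
            + sum(counts) * math.log(lam)
            - sum(math.lgamma(k + 1) for k in counts))  # lgamma(k+1) = ln(k!)


def poisson_mle(counts):
    """Stationary point of the log-likelihood: the sample mean."""
    return sum(counts) / len(counts)


def second_derivative(lam, counts):
    """d^2 l / d lam^2 = -(sum k_i) / lam^2."""
    return -sum(counts) / lam ** 2


counts = [2, 3, 1, 4, 3, 0, 5, 2]  # hypothetical sample, n = 8, sum = 20
lam_hat = poisson_mle(counts)
print(lam_hat)                              # sample mean
print(second_derivative(lam_hat, counts))   # equals -n^2 / sum(k_i)
print(poisson_log_likelihood(lam_hat, counts)
      > poisson_log_likelihood(lam_hat + 0.1, counts))
```

The negative second derivative at the sample mean is consistent with the stationary point being a maximum.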


Confidence interval

The confidence interval for the mean of a Poisson distribution can be expressed using the relationship between the cumulative distribution functions of the Poisson and chi-squared distributions. The chi-squared distribution is itself closely related to the gamma distribution, and this leads to an alternative expression. Given an observation ''k'' from a Poisson distribution with mean ''μ'', a confidence interval for ''μ'' with confidence level 1 − ''α'' is
:\tfrac{1}{2} \chi^{2}(\alpha/2; 2k) \le \mu \le \tfrac{1}{2} \chi^{2}(1-\alpha/2; 2k+2),
or equivalently,
:F^{-1}(\alpha/2; k,1) \le \mu \le F^{-1}(1-\alpha/2; k+1,1),
where \chi^{2}(p;n) is the quantile function (corresponding to a lower tail area ''p'') of the chi-squared distribution with ''n'' degrees of freedom and F^{-1}(p;n,1) is the quantile function of a gamma distribution with shape parameter ''n'' and scale parameter 1. This interval is 'exact' in the sense that its coverage probability is never less than the nominal 1 − ''α''. When quantiles of the gamma distribution are not available, an accurate approximation to this exact interval has been proposed (based on the Wilson–Hilferty transformation):
:k \left( 1 - \frac{1}{9k} - \frac{z_{\alpha/2}}{3\sqrt{k}}\right)^3 \le \mu \le (k+1) \left( 1 - \frac{1}{9(k+1)} + \frac{z_{\alpha/2}}{3\sqrt{k+1}}\right)^3,
where z_{\alpha/2} denotes the standard normal deviate with upper tail area ''α''/2. For application of these formulae in the same context as above (given a sample of ''n'' measured values ''k''''i'' each drawn from a Poisson distribution with mean ''λ''), one would set
:k=\sum_{i=1}^n k_i ,
calculate an interval for ''μ'' = ''nλ'', and then derive the interval for ''λ''.
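The Wilson–Hilferty approximation above needs only a standard normal quantile, which the Python standard library provides. The sketch below is illustrative: the function names are assumptions, and the treatment of k = 0 (lower limit set to 0, matching the exact interval) is a choice made here because the approximate lower formula is undefined at k = 0.

```python
import math
from statistics import NormalDist


def poisson_ci_wilson_hilferty(k, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for a Poisson mean,
    given a single observed count k."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper tail area alpha/2
    if k == 0:
        lower = 0.0  # assumption: use the exact lower limit when k = 0
    else:
        lower = k * (1.0 - 1.0 / (9.0 * k) - z / (3.0 * math.sqrt(k))) ** 3
    m = k + 1
    upper = m * (1.0 - 1.0 / (9.0 * m) + z / (3.0 * math.sqrt(m))) ** 3
    return lower, upper


def rate_interval(counts, alpha=0.05):
    """Interval for lambda from n counts: interval for n*lambda, divided by n."""
    n = len(counts)
    lower, upper = poisson_ci_wilson_hilferty(sum(counts), alpha)
    return lower / n, upper / n


print(poisson_ci_wilson_hilferty(10))
print(rate_interval([2, 3, 1, 4, 3, 0, 5, 2]))  # hypothetical sample
```

For very small alpha and small k the cubed lower term can become negative, so this sketch is best treated as valid for conventional confidence levels.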


Bayesian inference

In Bayesian inference, the conjugate prior for the rate parameter ''λ'' of the Poisson distribution is the gamma distribution. Let
:\lambda \sim \mathrm{Gamma}(\alpha, \beta)
denote that ''λ'' is distributed according to the gamma density ''g'' parameterized in terms of a shape parameter ''α'' and an inverse scale parameter ''β'':
: g(\lambda \mid \alpha,\beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \; \lambda^{\alpha-1} \; e^{-\beta\,\lambda} \qquad \text{ for } \lambda>0 .
Then, given the same sample of ''n'' measured values ''k''''i'' as before, and a prior of Gamma(''α'', ''β''), the posterior distribution is
:\lambda \sim \mathrm{Gamma}\left(\alpha + \sum_{i=1}^n k_i, \beta + n\right).
Note that the posterior mean is linear and is given by
: E[ \lambda \mid k_1, \ldots, k_n ] = \frac{\alpha + \sum_{i=1}^n k_i}{\beta + n}.
It can be shown that the gamma distribution is the only prior that induces linearity of the conditional mean. Moreover, a converse result exists which states that if the conditional mean is close to a linear function in the L_2 distance, then the prior distribution of ''λ'' must be close to the gamma distribution in Lévy distance. The posterior mean E[''λ''] approaches the maximum likelihood estimate \widehat{\lambda}_\mathrm{MLE} in the limit as \alpha\to 0, \beta \to 0, which follows immediately from the general expression of the mean of the gamma distribution. The posterior predictive distribution for a single additional observation is a negative binomial distribution, sometimes called a gamma–Poisson distribution.
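The conjugate update is simple enough to sketch in a few lines. The prior values and sample below are hypothetical, the function names are illustrative, and the predictive mass function uses the standard gamma mixture of Poissons, which is a negative binomial with size equal to the posterior shape and success probability β'/(β' + 1).

```python
import math


def gamma_poisson_update(alpha, beta, counts):
    """Posterior Gamma(shape, rate) parameters after observing counts."""
    return alpha + sum(counts), beta + len(counts)


def posterior_mean(alpha, beta, counts):
    shape, rate = gamma_poisson_update(alpha, beta, counts)
    return shape / rate


def posterior_predictive_pmf(x, shape, rate):
    """Negative binomial pmf for one further observation x = 0, 1, 2, ..."""
    log_p = (math.lgamma(shape + x) - math.lgamma(shape) - math.lgamma(x + 1)
             + shape * math.log(rate / (rate + 1.0))
             - x * math.log(rate + 1.0))
    return math.exp(log_p)


counts = [2, 3, 1, 4, 3, 0, 5, 2]           # hypothetical sample
shape, rate = gamma_poisson_update(2.0, 1.0, counts)  # hypothetical prior
print(shape, rate, shape / rate)
print(posterior_mean(1e-9, 1e-9, counts))    # close to the MLE, 2.5
print(sum(posterior_predictive_pmf(x, shape, rate) for x in range(200)))
```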


Simultaneous estimation of multiple Poisson means

Suppose X_1, X_2, \dots, X_p is a set of independent random variables from a set of p Poisson distributions, each with a parameter \lambda_i, i=1,\dots, p, and we would like to estimate these parameters. Then, Clevenson and Zidek show that under the normalized squared error loss L(\lambda,{\hat \lambda})=\sum_{i=1}^p \lambda_i^{-1} ({\hat \lambda}_i-\lambda_i)^2, when p>1, then, similarly to Stein's example for the Normal means, the maximum likelihood estimator {\hat \lambda}_i = X_i is inadmissible. In this case, a family of minimax estimators is given for any 0 < c \leq 2(p-1) and b \geq (p-2+p^{-1}) as
:{\hat \lambda}_i = \left(1 - \frac{c}{b + \sum_{i=1}^p X_i}\right) X_i, \qquad i=1,\dots,p.
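The shrinkage family above can be sketched as follows. The function name and the default choice c = b = p − 1 are assumptions made for illustration; that default lies inside the stated ranges for every p > 1, and the observed counts are hypothetical.

```python
def shrinkage_estimates(xs, c=None, b=None):
    """Estimates (1 - c / (b + sum(xs))) * x_i for independent Poisson counts.

    Defaults c = b = p - 1 are an illustrative choice within the
    admissible ranges 0 < c <= 2(p - 1) and b >= p - 2 + 1/p.
    """
    p = len(xs)
    if p < 2:
        raise ValueError("the family is stated for p > 1")
    if c is None:
        c = p - 1
    if b is None:
        b = p - 1
    if not (0 < c <= 2 * (p - 1)):
        raise ValueError("c must satisfy 0 < c <= 2(p - 1)")
    if b < p - 2 + 1.0 / p:
        raise ValueError("b must satisfy b >= p - 2 + 1/p")
    shrink = 1.0 - c / (b + sum(xs))
    return [shrink * x for x in xs]


observed = [3, 0, 5, 2]  # hypothetical counts, p = 4
print(shrinkage_estimates(observed))  # each count scaled by 1 - 3/13
```

Every estimate is pulled toward zero by the same factor, which depends on the data only through the total count.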


Occurrence and applications

Applications of the Poisson distribution can be found in many fields including:
* Count data in general.
* Telecommunication example: telephone calls arriving in a system.
* Astronomy example: photons arriving at a telescope.
* Chemistry example: the molar mass distribution of a living polymerization.
* Biology example: the number of mutations on a strand of DNA per unit length.
* Management example: customers arriving at a counter or call centre.
* Finance and insurance example: number of losses or claims occurring in a given period of time.
* Earthquake seismology example: an asymptotic Poisson model of seismic risk for large earthquakes.
* Radioactivity example: number of decays in a given time interval in a radioactive sample.
* Optics example: the number of photons emitted in a single laser pulse. This is a major vulnerability of most quantum key distribution protocols, known as photon number splitting (PNS).

The Poisson distribution arises in connection with Poisson processes. It applies to various phenomena of discrete properties (that is, those that may happen 0, 1, 2, 3, … times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space. Examples of events that may be modelled as a Poisson distribution include:
* The number of soldiers killed by horse-kicks each year in each corps in the Prussian cavalry. This example was used in a book by Ladislaus Bortkiewicz (1868–1931).
* The number of yeast cells used when brewing Guinness beer. This example was used by William Sealy Gosset (1876–1937).
* The number of phone calls arriving at a call centre within a minute. This example was described by A.K. Erlang (1878–1929).
* Internet traffic.
* The number of goals in sports involving two competing teams.
* The number of deaths per year in a given age group.
* The number of jumps in a stock price in a given time interval.
* Under an assumption of homogeneity, the number of times a web server is accessed per minute.
* The number of mutations in a given stretch of DNA after a certain amount of radiation.
* The proportion of cells that will be infected at a given multiplicity of infection.
* The number of bacteria in a certain amount of liquid.
* The arrival of photons on a pixel circuit at a given illumination and over a given time period.
* The targeting of V-1 flying bombs on London during World War II, investigated by R. D. Clarke in 1946.

Gallagher showed in 1976 that the counts of prime numbers in short intervals obey a Poisson distribution provided a certain version of the unproved prime r-tuple conjecture of Hardy–Littlewood is true.


Law of rare events

The rate of an event is related to the probability of an event occurring in some small subinterval (of time, space or otherwise). In the case of the Poisson distribution, one assumes that there exists a small enough subinterval for which the probability of an event occurring twice is "negligible". With this assumption one can derive the Poisson distribution from the binomial one, given only the expected number of total events in the whole interval. Let the expected total number of events in the whole interval be denoted by \lambda. Divide the whole interval into n subintervals I_1,\dots,I_n of equal size, such that n > \lambda (since we are interested in only very small portions of the interval this assumption is meaningful). This means that the expected number of events in each of the subintervals is equal to \lambda/n. Now we assume that the occurrence of an event in the whole interval can be seen as a sequence of n Bernoulli trials, where the i-th Bernoulli trial corresponds to looking whether an event happens in the subinterval I_i with probability \lambda/n. The expected number of total events in n such trials would be \lambda, the expected number of total events in the whole interval. Hence for each subdivision of the interval we have approximated the occurrence of the event as a Bernoulli process of the form \textrm{B}(n,\lambda/n). As we have noted before, we want to consider only very small subintervals. Therefore, we take the limit as n goes to infinity. In this case the binomial distribution converges to what is known as the Poisson distribution by the Poisson limit theorem.

In several of the above examples (such as the number of mutations in a given sequence of DNA) the events being counted are actually the outcomes of discrete trials, and would more precisely be modelled using the binomial distribution, that is X \sim \textrm{B}(n,p). In such cases ''n'' is very large and ''p'' is very small (and so the expectation ''np'' is of intermediate magnitude). Then the distribution may be approximated by the less cumbersome Poisson distribution X \sim \textrm{Pois}(np). This approximation is sometimes known as the ''law of rare events'', since each of the ''n'' individual Bernoulli events rarely occurs. The name "law of rare events" may be misleading because the total count of success events in a Poisson process need not be rare if the parameter ''np'' is not small. For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution with the events appearing frequent to the operator, but they are rare from the point of view of the average member of the population, who is very unlikely to make a call to that switchboard in that hour. The variance of the binomial distribution is 1 − ''p'' times that of the Poisson distribution, so the two are almost equal when ''p'' is very small.

The word ''law'' is sometimes used as a synonym of probability distribution, and ''convergence in law'' means ''convergence in distribution''. Accordingly, the Poisson distribution is sometimes called the "law of small numbers" because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen. ''The Law of Small Numbers'' is a book by Ladislaus Bortkiewicz about the Poisson distribution, published in 1898.
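The convergence of B(n, λ/n) to the Poisson distribution can be observed numerically. The sketch below computes the total variation distance between the two mass functions for increasing n; the function names, the choice λ = 3, and the truncation point k_max are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)


def total_variation(n, lam, k_max=40):
    """Half the sum of |B(n, lam/n)(k) - Pois(lam)(k)| for k = 0..k_max.

    The truncation at k_max is an approximation that is adequate when
    lam is small compared with k_max.
    """
    p = lam / n
    total = 0.0
    for k in range(k_max + 1):
        b = binomial_pmf(k, n, p) if k <= n else 0.0
        total += abs(b - poisson_pmf(k, lam))
    return 0.5 * total


for n in (10, 100, 1000):
    print(n, total_variation(n, 3.0))
```

The printed distances shrink as n grows, in line with the Poisson limit theorem; the variance ratio 1 − p mentioned above tends to 1 at the same time.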


Poisson point process

The Poisson distribution arises as the number of points of a Poisson point process located in some finite region. More specifically, if ''D'' is some region of space, for example of Euclidean space R''d'', for which |''D''|, the area, volume or, more generally, the Lebesgue measure of the region, is finite, and if ''N''(''D'') denotes the number of points in ''D'' of a homogeneous process with intensity ''λ'', then
: P(N(D)=k)=\frac{(\lambda|D|)^k e^{-\lambda|D|}}{k!} .
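A homogeneous Poisson point process on a rectangle can be simulated by first drawing the number of points from a Poisson distribution with mean λ|D| and then placing that many points uniformly. The following is a rough sketch under those assumptions; the function names are illustrative, and the Poisson sampler uses Knuth's multiplication method, which is only practical for small means.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for small means."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_points(intensity, width, height, rng):
    """Points of a homogeneous process on [0, width] x [0, height]."""
    n_points = sample_poisson(intensity * width * height, rng)
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
            for _ in range(n_points)]


def count_pmf(k, intensity, area):
    """P(N(D) = k) for a region of measure `area`."""
    mean = intensity * area
    return mean ** k * math.exp(-mean) / math.factorial(k)


rng = random.Random(12345)  # fixed seed so runs are repeatable
counts = [len(sample_points(2.0, 1.5, 2.0, rng)) for _ in range(4000)]
print(sum(counts) / len(counts))  # should be near intensity * area = 6
print(count_pmf(6, 2.0, 3.0))
```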


Poisson regression and negative binomial regression

Poisson regression and negative binomial regression are useful for analyses where the dependent (response) variable is the count of the number of events or occurrences in an interval.


Other applications in science

In a Poisson process, the number of observed occurrences fluctuates about its mean \lambda with a standard deviation \sigma_k = \sqrt{\lambda}. These fluctuations are denoted as ''Poisson noise'' or (particularly in electronics) as ''shot noise''.

The correlation of the mean and standard deviation in counting independent discrete occurrences is useful scientifically. By monitoring how the fluctuations vary with the mean signal, one can estimate the contribution of a single occurrence, ''even if that contribution is too small to be detected directly''. For example, the charge ''e'' on an electron can be estimated by correlating the magnitude of an electric current with its shot noise. If ''N'' electrons pass a point in a given time ''t'' on the average, the mean current is I=eN/t; since the current fluctuations should be of the order \sigma_I = e\sqrt{N}/t (i.e., the standard deviation of the Poisson process), the charge e can be estimated from the ratio t\sigma_I^2/I.

An everyday example is the graininess that appears as photographs are enlarged; the graininess is due to Poisson fluctuations in the number of reduced silver grains, not to the individual grains themselves. By correlating the graininess with the degree of enlargement, one can estimate the contribution of an individual grain (which is otherwise too small to be seen unaided). Many other molecular applications of Poisson noise have been developed, e.g., estimating the number density of receptor molecules in a cell membrane.

In each of these cases, the number of events N_t counted in an interval of length ''t'' from a process with rate \lambda satisfies
: \Pr(N_t=k) = f(k;\lambda t) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}.

In causal set theory the discrete elements of spacetime follow a Poisson distribution in the volume.
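The shot-noise estimate of the electron charge described in this section follows in one step from the two expressions for the mean current and its fluctuation. The derivation below is a worked check that uses only the symbols already introduced (''e'', ''N'', ''t'', I, \sigma_I):

```latex
\begin{aligned}
I &= \frac{eN}{t}, \qquad \sigma_I = \frac{e\sqrt{N}}{t}, \\
\frac{t\,\sigma_I^{2}}{I}
  &= t \cdot \frac{e^{2} N}{t^{2}} \cdot \frac{t}{eN} = e .
\end{aligned}
```

The unknown count ''N'' cancels, which is why the charge can be recovered from macroscopic measurements of the current and its noise alone.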


Computational methods

The Poisson distribution poses two different tasks for dedicated software libraries: ''evaluating'' the distribution P(k;\lambda), and ''drawing random numbers'' according to that distribution.


Evaluating the Poisson distribution

Computing P(k;\lambda) for given k and \lambda is in principle a trivial task that can be accomplished by using the standard definition of P(k;\lambda) in terms of exponential, power, and factorial functions. However, the conventional definition of the Poisson distribution contains two terms that can easily overflow on computers: \lambda^k and k!. The fraction of \lambda^k to k! can also produce a rounding error that is very large compared to ''e''^(−λ), and therefore give an erroneous result. For numerical stability the Poisson probability mass function should therefore be evaluated as
: f(k; \lambda)= \exp \left[ k\ln \lambda - \lambda - \ln \Gamma (k+1) \right],
which is mathematically equivalent but numerically stable. The natural logarithm of the Gamma function can be obtained using the lgamma function in the C standard library (C99 version) or R, the gammaln function in MATLAB or SciPy, or the log_gamma function in Fortran 2008 and later.

Some computing languages provide built-in functions to evaluate the Poisson distribution, namely
* R: function dpois(x, lambda);
* Excel: function POISSON(x, mean, cumulative), with a flag to specify the cumulative distribution;
* Mathematica: univariate Poisson distribution as PoissonDistribution[\lambda], bivariate Poisson distribution as MultivariatePoissonDistribution.
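The log-space evaluation described in this section can be sketched in a few lines of Python using math.lgamma from the standard library. The function names are illustrative, and λ > 0 is assumed so that the logarithm is defined. The naive version is included only to show where the direct definition fails.

```python
import math


def poisson_pmf_naive(k, lam):
    """Direct definition; lam**k and k! overflow for large arguments."""
    return lam**k * math.exp(-lam) / math.factorial(k)


def poisson_log_pmf(k, lam):
    """Logarithm of the Poisson pmf, computed without large intermediates."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def poisson_pmf(k, lam):
    """Numerically stable Poisson pmf via the log-gamma function."""
    return math.exp(poisson_log_pmf(k, lam))


if __name__ == "__main__":
    print(poisson_pmf(3, 2.0), poisson_pmf_naive(3, 2.0))  # agree closely
    print(poisson_pmf(1000, 1000.0))  # stable form still works
    try:
        poisson_pmf_naive(1000, 1000.0)
    except OverflowError as exc:
        print("naive form failed:", exc)
```

Returning the log-probability separately is a deliberate choice here: likelihood computations usually sum logarithms, so exponentiating only at the end avoids underflow in the tails.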


Random variate generation

The less trivial task is to draw integer random variates from the Poisson distribution with given \lambda. Solutions are provided by:
* R: function rpois(n, lambda);
* GNU Scientific Library (GSL): function gsl_ran_poisson.

A simple algorithm to generate random Poisson-distributed numbers (pseudo-random number sampling) has been given by Knuth:

 algorithm ''poisson random number (Knuth)'':
     init:
         Let L ← ''e''^(−λ), k ← 0 and p ← 1.
     do:
         k ← k + 1.
         Generate uniform random number u in [0,1] and let p ← p × u.
     while p > L.
     return k − 1.

The complexity is linear in the returned value ''k'', which is λ on average. There are many other algorithms to improve this. Some are given in Ahrens & Dieter.

For large values of λ, the value of L = ''e''^(−λ) may be so small that it is hard to represent. This can be solved by a change to the algorithm which uses an additional parameter STEP such that ''e''^(−STEP) does not underflow:

 algorithm ''poisson random number (Junhao, based on Knuth)'':
     init:
         Let Left ← λ, k ← 0 and p ← 1.
     do:
         k ← k + 1.
         Generate uniform random number u in (0,1) and let p ← p × u.
         while p < 1 and Left > 0:
             if Left > STEP:
                 p ← p × ''e''^STEP
                 Left ← Left − STEP
             else:
                 p ← p × ''e''^Left
                 Left ← 0
     while p > 1.
     return k − 1.

The choice of STEP depends on the threshold of overflow. For double precision floating point format the threshold is near ''e''^700, so 500 should be a safe ''STEP''.

Other solutions for large values of λ include rejection sampling and using Gaussian approximation.

Inverse transform sampling is simple and efficient for small values of λ, and requires only one uniform random number ''u'' per sample. Cumulative probabilities are examined in turn until one exceeds ''u''.

 algorithm ''Poisson generator based upon the inversion by sequential search'':
     init:
         Let x ← 0, p ← ''e''^(−λ), s ← p.
         Generate uniform random number u in [0,1].
     while u > s do:
         x ← x + 1.
         p ← p × λ / x.
         s ← s + p.
     return x.
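The Knuth and sequential-search procedures above translate almost line for line into Python. The following is a sketch for small to moderate λ only. It assumes the standard library generator random.Random as the source of uniform numbers, the function names are illustrative, and the extra `p > 0.0` guard in the inversion loop is an addition that does not appear in the pseudocode.

```python
import math
import random


def poisson_knuth(lam, rng):
    """Knuth's method: multiply uniforms until the product drops to e^-lam."""
    limit = math.exp(-lam)  # underflows to 0.0 for very large lam
    k = 0
    p = 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def poisson_inversion(lam, rng):
    """Inversion by sequential search using a single uniform number."""
    x = 0
    p = math.exp(-lam)  # P(X = 0)
    s = p               # running cumulative probability
    u = rng.random()
    # Guard (not in the pseudocode): stop if p underflows, so that rounding
    # in s cannot cause an endless loop when u is extremely close to 1.
    while u > s and p > 0.0:
        x += 1
        p *= lam / x
        s += p
    return x


def sample_mean(sampler, lam, n, seed=0):
    """Average of n draws; should be close to lam for either sampler."""
    rng = random.Random(seed)
    return sum(sampler(lam, rng) for _ in range(n)) / n


if __name__ == "__main__":
    for sampler in (poisson_knuth, poisson_inversion):
        print(sampler.__name__, sample_mean(sampler, 4.0, 20000))
```

Knuth's method consumes about λ + 1 uniform numbers per draw, while the inversion method consumes exactly one, which is part of why the latter is attractive when λ is small.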


See also

* Binomial distribution
* Compound Poisson distribution
* Conway–Maxwell–Poisson distribution
* Erlang distribution
* Hermite distribution
* Index of dispersion
* Negative binomial distribution
* Poisson clumping
* Poisson point process
* Poisson regression
* Poisson sampling
* Poisson wavelet
* Queueing theory
* Renewal theory
* Robbins lemma
* Skellam distribution
* Tweedie distribution
* Zero-inflated model
* Zero-truncated Poisson distribution


References


Citations


Sources
