In probability theory and statistics, the binomial distribution with parameters n and p is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question, and each with its own Boolean-valued outcome: ''success'' (with probability p) or ''failure'' (with probability q = 1 - p). A single success/failure experiment is also called a Bernoulli trial or Bernoulli experiment, and a sequence of outcomes is called a Bernoulli process; for a single trial, i.e., n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the binomial test of statistical significance.
The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent and so the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution remains a good approximation, and is widely used.
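The approximation above can be checked numerically. The sketch below (plain Python; the helper names are illustrative) compares the binomial pmf against the exact without-replacement (hypergeometric) probabilities for a population much larger than the sample:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def hypergeom_pmf(k, N, K, n):
    """P(k successes) when drawing n items without replacement
    from a population of N items, K of which are successes."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Draw n = 10 items from a population of N = 100_000 with 30% successes.
# With N much larger than n, the hypergeometric probabilities are
# close to the binomial ones.
N, n, p = 100_000, 10, 0.3
K = int(N * p)
for k in range(n + 1):
    assert abs(binom_pmf(k, n, p) - hypergeom_pmf(k, N, K, n)) < 1e-3
```

The discrepancy shrinks on the order of n/N, which is why sampling without replacement from a large population is routinely modeled as binomial.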
Definitions
Probability mass function
If the random variable X follows the binomial distribution with parameters n \in \mathbb{N} and p \in [0, 1], we write X \sim B(n, p). The probability of getting exactly k successes in n independent Bernoulli trials (with the same rate p) is given by the probability mass function:
:f(k, n, p) = \Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}
for k = 0, 1, 2, \ldots, n, where
:\binom{n}{k} = \frac{n!}{k!(n-k)!}
is the binomial coefficient. The formula can be understood as follows: p^k (1-p)^{n-k} is the probability of obtaining one particular sequence of n independent Bernoulli trials in which k trials are "successes" and the remaining n - k trials result in "failure". Since the trials are independent with probabilities remaining constant between them, any sequence of n trials with k successes (and n - k failures) has the same probability of being achieved (regardless of positions of successes within the sequence). There are
:\binom{n}{k}
such sequences, since the binomial coefficient \binom{n}{k} counts the number of ways to choose the positions of the k successes among the n trials. The binomial distribution is concerned with the probability of obtaining ''any'' of these sequences, meaning the probability of obtaining one of them (p^k (1-p)^{n-k}) must be added \binom{n}{k} times, hence \Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}.
In creating reference tables for binomial distribution probability, usually, the table is filled in up to n/2 values. This is because for k > n/2, the probability can be calculated by its complement as
:f(k, n, p) = f(n - k, n, 1 - p).
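This complement identity, f(k, n, p) = f(n - k, n, 1 - p), can be verified directly; a minimal sketch using the Python standard library:

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Swapping successes for failures: k successes with rate p has the
# same probability as n - k successes with rate 1 - p, so tables only
# need entries up to k = n/2.
n, p = 9, 0.3
for k in range(n + 1):
    assert abs(binom_pmf(k, n, p) - binom_pmf(n - k, n, 1 - p)) < 1e-12
```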
Looking at the expression f(k, n, p) as a function of k, there is a k value that maximizes it. This k value can be found by calculating
:\frac{f(k+1, n, p)}{f(k, n, p)} = \frac{(n-k)p}{(k+1)(1-p)}
and comparing it to 1. There is always an integer M that satisfies
:(n+1)p - 1 \le M < (n+1)p.
f(k, n, p) is monotone increasing for k < M and monotone decreasing for k > M, with the exception of the case where (n+1)p is an integer. In this case, there are two values for which f is maximal: (n+1)p and (n+1)p - 1. M is the ''most probable'' outcome (that is, the most likely, although this can still be unlikely overall) of the Bernoulli trials and is called the
mode.
Equivalently, M = \lceil (n+1)p \rceil - 1. Taking the floor function, we obtain M = \lfloor (n+1)p \rfloor whenever (n+1)p is not an integer.
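The mode characterization can be checked by brute force; the sketch below compares an argmax over the pmf against ⌊(n+1)p⌋ for a case where (n+1)p is not an integer:

```python
from math import comb, floor

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Brute-force argmax of the pmf matches floor((n+1)p) when (n+1)p is
# not an integer (when it is, both (n+1)p and (n+1)p - 1 are modes).
n, p = 20, 0.37
mode = max(range(n + 1), key=lambda k: binom_pmf(k, n, p))
assert mode == floor((n + 1) * p)
```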
Example
Suppose a biased coin comes up heads with probability 0.3 when tossed. The probability of seeing exactly 4 heads in 6 tosses is
:f(4, 6, 0.3) = \binom{6}{4} 0.3^4 (1 - 0.3)^{6-4} = 0.059535.
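The biased-coin example can be reproduced in a few lines of Python using the standard library's binomial coefficient:

```python
from math import comb

# Probability of exactly 4 heads in 6 tosses of a coin with P(heads) = 0.3.
n, k, p = 6, 4, 0.3
prob = comb(n, k) * p**k * (1 - p)**(n - k)
print(round(prob, 6))  # 0.059535
```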
Cumulative distribution function
The cumulative distribution function can be expressed as:
:F(k; n, p) = \Pr(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} \binom{n}{i} p^i (1-p)^{n-i},
where \lfloor k \rfloor is the "floor" under k, i.e. the greatest integer less than or equal to k.
It can also be represented in terms of the regularized incomplete beta function, as follows:
:F(k; n, p) = \Pr(X \le k) = I_{1-p}(n-k, k+1) = (n-k) \binom{n}{k} \int_0^{1-p} t^{n-k-1} (1-t)^k \, dt,
which is equivalent to the cumulative distribution functions of the beta distribution and of the F-distribution:
:F(k; n, p) = F_{\text{beta}}\left(x = 1-p; \alpha = n-k, \beta = k+1\right)
:F(k; n, p) = F_{F\text{-distribution}}\left(x = \tfrac{1-p}{p} \tfrac{k+1}{n-k}; d_1 = 2(n-k), d_2 = 2(k+1)\right).
Some closed-form bounds for the cumulative distribution function are given
below.
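The equivalence between the summed CDF and the incomplete-beta integral form can be checked numerically. The sketch below (stdlib only; the midpoint-rule integrator is an illustrative stand-in for a proper special-function routine) compares the two for all k < n:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) by direct summation of the pmf."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def beta_form(k, n, p, steps=50_000):
    """(n-k) * C(n,k) * integral_0^{1-p} t^(n-k-1) (1-t)^k dt,
    approximated with the composite midpoint rule."""
    h = (1 - p) / steps
    total = sum(((i + 0.5) * h) ** (n - k - 1) * (1 - (i + 0.5) * h) ** k
                for i in range(steps))
    return (n - k) * comb(n, k) * h * total

n, p = 10, 0.3
for k in range(n):  # k = n is excluded: the integral form degenerates there
    assert abs(binom_cdf(k, n, p) - beta_form(k, n, p)) < 1e-6
```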
Properties
Expected value and variance
If X \sim B(n, p), that is, X is a binomially distributed random variable, n being the total number of experiments and ''p'' the probability of each experiment yielding a successful result, then the expected value of X is:
:\operatorname{E}[X] = np.
This follows from the linearity of the expected value along with the fact that X is the sum of n identical Bernoulli random variables, each with expected value p. In other words, if X_1, \ldots, X_n are identical (and independent) Bernoulli random variables with parameter p, then X = X_1 + \cdots + X_n and
:\operatorname{E}[X] = \operatorname{E}[X_1 + \cdots + X_n] = \operatorname{E}[X_1] + \cdots + \operatorname{E}[X_n] = np.
The variance is:
:\operatorname{Var}(X) = np(1 - p).
This similarly follows from the fact that the variance of a sum of independent random variables is the sum of the variances.
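Both identities can be confirmed by computing the mean and variance directly from the pmf; a minimal stdlib sketch:

```python
from math import comb

n, p = 12, 0.25
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Mean and variance computed from first principles over the full pmf.
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

assert abs(mean - n * p) < 1e-12            # E[X] = np
assert abs(var - n * p * (1 - p)) < 1e-12   # Var(X) = np(1-p)
```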
Higher moments
The first 6 central moments, defined as \mu_c = \operatorname{E}\left[(X - \operatorname{E}[X])^c\right], are given by
:\mu_1 = 0,
:\mu_2 = np(1-p),
:\mu_3 = np(1-p)(1-2p),
:\mu_4 = np(1-p)\left(1 + (3n - 6)p(1-p)\right),
:\mu_5 = np(1-p)(1-2p)\left(1 + (10n - 12)p(1-p)\right),
:\mu_6 = np(1-p)\left(1 - 30p(1-p)\left(1 - 4p(1-p)\right) + 5np(1-p)\left(5 - 26p(1-p)\right) + 15n^2 p^2 (1-p)^2\right).
The non-central moments satisfy
:\operatorname{E}[X] = np,
:\operatorname{E}[X^2] = np(1-p) + n^2 p^2,
and in general
:\operatorname{E}[X^c] = \sum_{k=0}^{c} \left\{ {c \atop k} \right\} n^{\underline{k}} p^k,
where \left\{ {c \atop k} \right\} are the Stirling numbers of the second kind, and n^{\underline{k}} = n(n-1) \cdots (n-k+1) is the kth falling power of n.
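The general moment formula can be sanity-checked numerically. The sketch below (plain Python; the `stirling2` and `falling` helpers are illustrative implementations of the standard recurrence and the falling power) compares it against direct summation over the pmf:

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def stirling2(c, k):
    """Stirling numbers of the second kind via S(c,k) = k*S(c-1,k) + S(c-1,k-1)."""
    if c == k:
        return 1
    if k == 0 or k > c:
        return 0
    return k * stirling2(c - 1, k) + stirling2(c - 1, k - 1)

def falling(n, k):
    """k-th falling power: n (n-1) ... (n-k+1)."""
    out = 1
    for i in range(k):
        out *= n - i
    return out

def moment_direct(c, n, p):
    """E[X^c] by direct summation over the pmf."""
    return sum(k**c * comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1))

def moment_stirling(c, n, p):
    """E[X^c] = sum_k S(c,k) * n^(falling k) * p^k."""
    return sum(stirling2(c, k) * falling(n, k) * p**k for k in range(c + 1))

n, p = 15, 0.4
for c in range(1, 7):
    assert abs(moment_direct(c, n, p) - moment_stirling(c, n, p)) < 1e-8
```

For c = 2 the formula reduces to np + n(n-1)p², i.e. np(1-p) + n²p², matching the second non-central moment above.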
A simple bound follows by bounding the binomial moments via the higher Poisson moments:
:\operatorname{E}[X^c] \le \left(\frac{c}{\ln\left(\frac{c}{np} + 1\right)}\right)^c \le (np)^c \exp\left(\frac{c^2}{2np}\right).
This shows that if c = O\left(\sqrt{np}\right), then