
In probability and statistics, an urn problem is an idealized mental exercise in which some objects of real interest (such as atoms, people, cars, etc.) are represented as colored balls in an urn or other container. One pretends to remove one or more balls from the urn; the goal is to determine the probability of drawing one color or another, or some other properties. A number of important variations are described below.
An urn model is either a set of probabilities that describe events within an urn problem, or it is a probability distribution, or a family of such distributions, of random variables associated with urn problems.
History
In ''Ars Conjectandi'' (1713), Jacob Bernoulli considered the problem of determining, given a number of pebbles drawn from an urn, the proportions of different colored pebbles within the urn. This problem was known as the ''inverse probability'' problem, and was a topic of research in the eighteenth century, attracting the attention of Abraham de Moivre and Thomas Bayes.
Bernoulli used the Latin word ''urna'', which primarily means a clay vessel, but is also the term used in ancient Rome for a vessel of any kind for collecting ballots or lots; the present-day Italian or Spanish word for ballot box is still ''urna''. Bernoulli's inspiration may have been lotteries, elections, or games of chance which involved drawing balls from a container, and it has been asserted that elections in medieval and renaissance Venice, including that of the doge, often included the choice of electors by lot, using balls of different colors drawn from an urn.
Basic urn model
In this basic urn model in probability theory, the urn contains ''x'' white and ''y'' black balls, well-mixed together. One ball is drawn randomly from the urn and has its color observed; it is then placed back in the urn (or not), and the selection process is repeated.
(Source: "Urn Model: Simple Definition, Examples and Applications", "The basic urn model".)
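The drawing procedure described above can be simulated directly. The following is a minimal sketch, not a reference implementation: the function name `draw_from_urn`, the `'W'`/`'B'` labels, and the `replace` flag are illustrative assumptions that do not come from the source.

```python
import random


def draw_from_urn(x, y, n_draws, replace=True, rng=None):
    """Simulate n_draws from an urn holding x white and y black balls.

    Returns the observed colors in draw order as a list of 'W' and 'B'.
    The names and labels here are hypothetical, chosen for illustration.
    """
    rng = rng or random.Random()
    urn = ["W"] * x + ["B"] * y
    if not urn:
        raise ValueError("the urn must contain at least one ball")
    if replace:
        # Each ball is put back, so every draw sees the same urn.
        return [rng.choice(urn) for _ in range(n_draws)]
    if n_draws > len(urn):
        raise ValueError("cannot draw more balls than the urn contains")
    # Drawn balls are set aside; sample() returns them in selection order.
    return rng.sample(urn, n_draws)


if __name__ == "__main__":
    print(draw_from_urn(3, 2, 5, replace=True))
    print(draw_from_urn(3, 2, 5, replace=False))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing simulated frequencies with exact probabilities.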
Possible questions that can be answered in this model are:
* Can I infer the proportion of white and black balls from ''n'' observations? With what degree of confidence?
* Knowing ''x'' and ''y'', what is the probability of drawing a specific sequence (e.g. one white followed by one black)?
* If I only observe ''n'' balls, how sure can I be that there are no black balls? (A variation both on the first and the second question)
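The second question, the probability of a specific sequence for known ''x'' and ''y'', has a simple exact answer: multiply the probability of each draw given the current contents of the urn. The sketch below assumes a hypothetical helper `sequence_probability` and a string encoding of sequences (`'W'` for white, `'B'` for black); neither is taken from the source.

```python
from fractions import Fraction


def sequence_probability(x, y, sequence, replace=True):
    """Exact probability of observing `sequence` (a string of 'W'/'B')
    when drawing from an urn with x white and y black balls."""
    counts = {"W": x, "B": y}
    prob = Fraction(1)
    for color in sequence:
        if color not in counts:
            raise ValueError(f"unknown color {color!r}")
        if counts[color] == 0:
            # No ball of this color is left, so the sequence is impossible.
            return Fraction(0)
        total = counts["W"] + counts["B"]
        prob *= Fraction(counts[color], total)
        if not replace:
            counts[color] -= 1  # the drawn ball is set aside
    return prob


if __name__ == "__main__":
    # One white followed by one black, urn with 3 white and 2 black balls.
    print(sequence_probability(3, 2, "WB"))                 # 6/25
    print(sequence_probability(3, 2, "WB", replace=False))  # 3/10
```

Using `Fraction` keeps the results exact, so the with-replacement value (3/5 times 2/5) and the without-replacement value (3/5 times 2/4) can be compared without rounding error.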
Examples of urn problems
* binomial distribution: the distribution of the number of successful draws (trials), i.e. extraction of white balls, given ''n'' draws with replacement in an urn with black and white balls.
* multinomial distribution: there are balls of more than two colors. Each time a ball is extracted, it is returned before drawing another ball. This is also known as 'Balls into bins'.
* Occupancy problem: the distribution of the number of occupied urns after the random assignment of ''k'' balls into ''n'' urns, related to the coupon collector's problem and birthday problem.
* negative binomial distribution: number of draws before a certain number of failures (incorrectly colored draws) occurs.
* geometric distribution: number of draws before the first successful (correctly colored) draw.
* hypergeometric distribution: the balls are not returned to the urn once extracted. Hence, the total number of balls in the urn decreases. This is referred to as "drawing without replacement", as opposed to "drawing with replacement".
* multivariate hypergeometric distribution: as in the hypergeometric case, the balls are not returned to the urn once extracted, but there are balls of more than two colors.
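* Mixed replacement/non-replacement: the urn contains ''x'' white and ''y'' black balls. While black balls are set aside after a draw (non-replacement), white balls are returned to the urn after a draw (replacement). The probability ''P(m,k)'' that ''k'' black balls will be drawn after ''m'' draws can be calculated recursively using the formula <math>P(m,k) = \frac{x}{x+y-k} P(m-1,k) + \frac{y-k+1}{x+y-k+1} P(m-1,k-1)</math>, with ''P(0,0)'' = 1. The first term covers a white ball on the last draw, when ''k'' black balls have already been set aside; the second covers a black ball on the last draw, when ''k'' − 1 have been set aside. (Source: Matheplanet, "Ein Urnenproblem - reloaded".)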
* Pólya urn/beta-binomial distribution: each time a ball is drawn, it is replaced along with an additional ball of the same colour. Hence, the total number of balls in the urn grows.
* Hoppe urn: a Pólya urn with an additional ball called the mutator. When the mutator is drawn it is replaced along with an additional ball of an entirely new colour.
* Statistical physics: derivation of energy and velocity distributions.
* The Ellsberg paradox.
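Several of the entries above differ only in what happens to a ball after it is drawn. The sketch below contrasts three of them for an urn with ''x'' white and ''y'' black balls: the binomial case (with replacement), the hypergeometric case (without replacement), and a simulated Pólya urn (the ball is returned with one extra ball of the same colour). The function names and argument order are assumptions made for illustration, and the formulas used are the standard probability mass functions rather than anything quoted from the source.

```python
import random
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, x, y):
    """P(exactly k white balls in n draws WITH replacement)."""
    p = Fraction(x, x + y)  # chance of white on every draw
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, n, x, y):
    """P(exactly k white balls in n draws WITHOUT replacement)."""
    if n > x + y:
        raise ValueError("cannot draw more balls than the urn contains")
    # Ways to pick k white and n-k black, over all ways to pick n balls.
    return Fraction(comb(x, k) * comb(y, n - k), comb(x + y, n))


def polya_urn(x, y, n_draws, rng=None):
    """Simulate a Polya urn; returns the final (white, black) counts."""
    rng = rng or random.Random()
    white, black = x, y
    for _ in range(n_draws):
        # Draw a ball, then return it with one more of the same colour.
        if rng.random() < white / (white + black):
            white += 1
        else:
            black += 1
    return white, black


if __name__ == "__main__":
    # Urn with 3 white and 2 black balls, 2 draws, exactly 1 white.
    print(binomial_pmf(1, 2, 3, 2))        # 12/25
    print(hypergeometric_pmf(1, 2, 3, 2))  # 3/5
    print(polya_urn(3, 2, 100))
```

In the Pólya urn the urn always ends with `x + y + n_draws` balls, whereas the hypergeometric case cannot draw more than `x + y` times at all.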
See also
* Balls into bins
* Coin-tossing problems
* Coupon collector's problem
* Dirichlet-multinomial distribution
* Noncentral hypergeometric distributions
* Pólya urn model
Further reading
* Johnson, Norman L.; and Kotz, Samuel (1977); ''Urn Models and Their Application: An Approach to Modern Discrete Probability Theory'', Wiley
* Mahmoud, Hosam M. (2008); ''Pólya Urn Models'', Chapman & Hall/CRC. ISBN 1-4200-5983-1