
In probability and statistics, an urn problem is an idealized mental exercise in which some objects of real interest (such as atoms, people, cars, etc.) are represented as colored balls in an urn or other container. One pretends to remove one or more balls from the urn; the goal is to determine the probability of drawing one color or another, or some other properties. A number of important variations are described below.

An urn model is either a set of probabilities that describe events within an urn problem, or it is a probability distribution, or a family of such distributions, of random variables associated with urn problems (Dodge, Yadolah (2003), ''Oxford Dictionary of Statistical Terms'', OUP).


History

In ''Ars Conjectandi'' (1713), Jacob Bernoulli considered the problem of determining, given a number of pebbles drawn from an urn, the proportions of different colored pebbles within the urn. This problem was known as the ''inverse probability'' problem, and was a topic of research in the eighteenth century, attracting the attention of Abraham de Moivre and Thomas Bayes.

Bernoulli used the Latin word ''urna'', which primarily means a clay vessel, but is also the term used in ancient Rome for a vessel of any kind for collecting ballots or lots; the present-day Italian word for ballot box is still ''urna''. Bernoulli's inspiration may have been lotteries, elections, or games of chance which involved drawing balls from a container, and it has been asserted that elections in medieval and renaissance Venice, including that of the doge, often included the choice of electors by lot, using balls of different colors drawn from an urn.


Basic urn model

In this basic urn model in probability theory, the urn contains ''x'' white and ''y'' black balls, well-mixed together. One ball is drawn randomly from the urn and its color observed; it is then placed back in the urn (or not), and the selection process is repeated (''Urn Model: Simple Definition, Examples and Applications'', "The basic urn model"). Possible questions that can be answered in this model are:
* Can I infer the proportion of white and black balls from ''n'' observations? With what degree of confidence?
* Knowing ''x'' and ''y'', what is the probability of drawing a specific sequence (e.g. one white followed by one black)?
* If I only observe ''n'' balls, how sure can I be that there are no black balls? (A variation on both the first and the second question)
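The second question, the probability of a specific sequence when ''x'' and ''y'' are known, can be worked out by multiplying the per-draw probabilities and updating the urn contents after each draw. The following is a minimal sketch, not a standard library routine: the function name `sequence_probability` and the encoding of a sequence as a string of `"W"` (white) and `"B"` (black) are assumptions made for illustration.

```python
from fractions import Fraction


def sequence_probability(x, y, sequence, replace=True):
    """Exact probability of drawing `sequence` from an urn that starts
    with x white and y black balls.

    `sequence` is a string of "W" and "B" (an illustrative encoding).
    If `replace` is True each ball is put back after it is observed;
    otherwise it is set aside and the urn shrinks.
    """
    white, black = x, y
    prob = Fraction(1)
    for color in sequence:
        total = white + black
        count = white if color == "W" else black
        if total == 0 or count == 0:
            # The requested ball cannot be drawn, so the sequence is impossible.
            return Fraction(0)
        prob *= Fraction(count, total)
        if not replace:
            # Without replacement the drawn ball leaves the urn.
            if color == "W":
                white -= 1
            else:
                black -= 1
    return prob


with_replacement = sequence_probability(3, 2, "WB", replace=True)
without_replacement = sequence_probability(3, 2, "WB", replace=False)
```

For an urn with 3 white and 2 black balls, "one white followed by one black" has probability 3/5 × 2/5 = 6/25 with replacement and 3/5 × 2/4 = 3/10 without replacement.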


Examples of urn problems

* beta-binomial distribution: as in the basic model, except that every time a ball is observed, an additional ball of the same color is added to the urn. Hence, the total number of balls in the urn grows. See Pólya urn model.
* binomial distribution: the distribution of the number of successful draws (trials), i.e. extraction of white balls, given ''n'' draws with replacement in an urn with black and white balls.
* Hoppe urn: a Pólya urn with an additional ball called the mutator. When the mutator is drawn it is replaced along with an additional ball of an entirely new color.
* hypergeometric distribution: the balls are not returned to the urn once extracted. Hence, the total number of balls in the urn decreases. This is referred to as "drawing without replacement", as opposed to "drawing with replacement".
* multivariate hypergeometric distribution: as for the hypergeometric distribution, the balls are not returned to the urn once extracted, but there are balls of more than two colors.
* geometric distribution: number of draws before the first successful (correctly colored) draw.
* Mixed replacement/non-replacement: the urn contains black and white balls. While black balls are set aside after a draw (non-replacement), white balls are returned to the urn after a draw (replacement). What is the distribution of the number of black balls drawn after ''m'' draws?
* multinomial distribution: there are balls of more than two colors. Each time a ball is extracted, it is returned before drawing another ball. This is also known as "balls into bins".
* negative binomial distribution: number of draws before a certain number of failures (incorrectly colored draws) occurs.
* Occupancy problem: the distribution of the number of occupied urns after the random assignment of ''k'' balls into ''n'' urns, related to the coupon collector's problem and the birthday problem.
* Pólya urn: each time a ball of a particular color is drawn, it is replaced along with an additional ball of the same color.
* Statistical physics: derivation of energy and velocity distributions.
* The Ellsberg paradox.
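Three of the schemes in this list differ only in what happens to a ball after it is drawn, which makes them easy to compare side by side. The sketch below computes the exact probability of ''k'' white balls in ''n'' draws from an urn with ''x'' white and ''y'' black balls under replacement (binomial), non-replacement (hypergeometric), and the Pólya scheme with one extra ball of the drawn color added per draw (beta-binomial). The function names are illustrative assumptions, and the code assumes 0 ≤ ''k'' ≤ ''n'', with ''n'' ≤ ''x'' + ''y'' in the hypergeometric case.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, x, y):
    """P(k white in n draws) when each ball is returned after the draw."""
    p = Fraction(x, x + y)
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def hypergeometric_pmf(k, n, x, y):
    """P(k white in n draws) when drawn balls are not returned."""
    return Fraction(comb(x, k) * comb(y, n - k), comb(x + y, n))


def polya_pmf(k, n, x, y):
    """P(k white in n draws) when each drawn ball is returned together
    with one additional ball of the same color (Polya urn)."""
    white = 1
    for i in range(k):          # white count grows x, x+1, ..., x+k-1
        white *= x + i
    black = 1
    for j in range(n - k):      # black count grows y, y+1, ..., y+n-k-1
        black *= y + j
    total = 1
    for m in range(n):          # urn size grows x+y, x+y+1, ...
        total *= x + y + m
    return Fraction(comb(n, k) * white * black, total)


binomial_example = binomial_pmf(1, 2, 3, 2)
hypergeometric_example = hypergeometric_pmf(1, 2, 3, 2)
polya_example = [polya_pmf(k, 2, 1, 1) for k in range(3)]
```

With 3 white and 2 black balls, exactly one white in two draws has probability 12/25 with replacement and 3/5 without. A Pólya urn started with one ball of each color gives 0, 1, or 2 whites in two draws with probability 1/3 each.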


See also

* Balls into bins
* Coin-tossing problems
* Coupon collector's problem
* Dirichlet-multinomial distribution
* Noncentral hypergeometric distributions


Further reading

* Johnson, Norman L.; and Kotz, Samuel (1977); ''Urn Models and Their Application: An Approach to Modern Discrete Probability Theory'', Wiley.
* Mahmoud, Hosam M. (2008); ''Pólya Urn Models'', Chapman & Hall/CRC. ISBN 1-4200-5983-1.