HOME

TheInfoList



OR:

In
computing Computing is any goal-oriented activity requiring, benefiting from, or creating computing machinery. It includes the study and experimentation of algorithmic processes, and development of both hardware and software. Computing has scientific, e ...
, the alias method is a family of efficient
algorithm In mathematics and computer science, an algorithm () is a finite sequence of rigorous instructions, typically used to solve a class of specific Computational problem, problems or to perform a computation. Algorithms are used as specificat ...
s for sampling from a discrete probability distribution, published in 1974 by A. J. Walker. That is, it returns integer values according to some arbitrary
probability distribution In probability theory and statistics, a probability distribution is the mathematical function that gives the probabilities of occurrence of different possible outcomes for an experiment. It is a mathematical description of a random phenomenon i ...
. The algorithms typically use or preprocessing time, after which random values can be drawn from the distribution in time.


Operation

Internally, the algorithm consults two tables, a ''
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
table Table may refer to: * Table (furniture), a piece of furniture with a flat surface and one or more legs * Table (landform), a flat area of land * Table (information), a data arrangement with rows and columns * Table (database), how the table data ...
'' and an ''alias table'' (for ). To generate a random outcome, a fair
dice Dice (singular die or dice) are small, throwable objects with marked sides that can rest in multiple positions. They are used for generating random values, commonly as part of tabletop games, including dice games, board games, role-playing g ...
is rolled to determine an index into the two tables. Based on the probability stored at that index, a
biased coin In probability theory and statistics, a sequence of independent Bernoulli trials with probability 1/2 of success on each trial is metaphorically called a fair coin. One for which the probability is not 1/2 is called a biased or unfair coin. In the ...
is then flipped, and the outcome of the flip is used to choose between a result of and . More concretely, the algorithm operates as follows: # Generate a
uniform A uniform is a variety of clothing worn by members of an organization while participating in that organization's activity. Modern uniforms are most often worn by armed forces and paramilitary organizations such as police, emergency services, se ...
random variate . # Let and . (This makes uniformly distributed on and uniformly distributed on .) # If , return . This is the biased coin flip. # Otherwise, return . An alternative formulation of the probability table, proposed by Marsaglia et al. as the "square histogram" method, uses the condition in the third step (where ) instead of computing .


Table generation

The distribution may be padded with additional probabilities to increase to a convenient value, such as a power of two. To generate the table, first initialize . While doing this, divide the table entries into three categories: * The "overfull" group, where , * The "underfull" group, where and has not been initialized, and * The "exactly full" group, where or ''has'' been initialized. If , the corresponding value will never be consulted and is unimportant, but a value of is sensible. As long as not all table entries are exactly full, repeat the following steps: # Arbitrarily choose an overfull entry and an underfull entry . (If one of these exists, the other must, as well.) # Allocate the unused space in entry to outcome , by setting . # Remove the allocated space from entry by changing . # Entry is now exactly full. # Assign entry to the appropriate category based on the new value of . Each iteration moves at least one entry to the "exactly full" category (and the last moves two), so the procedure is guaranteed to terminate after at most iterations. Each iteration can be done in time, so the table can be set up in time. Vose points out that floating-point rounding errors may cause the guarantee referred to in step 1 to be violated. If one category empties before the other, the remaining entries may have set to 1 with negligible error. The solution accounting for floating point is sometimes called the Walker-Vose method or the Vose alias method. The Alias structure is not unique. As the lookup procedure is slightly faster if (because does not need to be consulted), one goal during table generation is to maximize the sum of the . Doing this optimally turns out to be
NP hard In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard pr ...
, but a
greedy algorithm A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally ...
comes reasonably close: rob from the richest and give to the poorest. That is, at each step choose the largest and the smallest . Because this requires sorting the , it requires time.


Efficiency

Although the alias method is very efficient if generating a uniform deviate is itself fast, there are cases where it is far from optimal in terms of random bit usage. This is because it uses a full-precision random variate each time, even when only a few random bits are needed. One case arises when the probabilities are particularly well balanced, so many and is not needed. Generating is a waste of time. For example if , then a 32-bit random variate could be used to make 32 choices, but the alias method will only generate one. Another case arises when the probabilities are strongly unbalanced, so many . For example if and , then the great majority of the time, only a few random bits are required to determine that case 1 applies. In such cases, the table method described by Marsaglia et al. is more efficient. If we make many choices with the same probability we can on average require much less than one unbiased random bit. Using
arithmetic coding Arithmetic coding (AC) is a form of entropy encoding used in lossless data compression. Normally, a string of characters is represented using a fixed number of bits per character, as in the ASCII code. When a string is converted to arithmetic e ...
techniques arithmetic we can approach the limit given by the
binary entropy function In information theory, the binary entropy function, denoted \operatorname H(p) or \operatorname H_\text(p), is defined as the entropy of a Bernoulli process with probability p of one of two values. It is a special case of \Eta(X), the entropy fun ...
.


Literature

*
Donald Knuth Donald Ervin Knuth ( ; born January 10, 1938) is an American computer scientist, mathematician, and professor emeritus at Stanford University. He is the 1974 recipient of the ACM Turing Award, informally considered the Nobel Prize of computer sc ...
, ''
The Art of Computer Programming ''The Art of Computer Programming'' (''TAOCP'') is a comprehensive monograph written by the computer scientist Donald Knuth presenting programming algorithms and their analysis. Volumes 1–5 are intended to represent the central core of compu ...
'', Vol 2: Seminumerical Algorithms, section 3.4.1.


Implementations

* http://www.keithschwarz.com/darts-dice-coins/ Keith Schwarz: Detailed explanation, numerically stable version of Vose's algorithm, and link to Java implementation * https://jugit.fz-juelich.de/mlz/ransampl Joachim Wuttke: Implementation as a small C library. *https://gist.github.com/0b5786e9bfc73e75eb8180b5400cd1f8 Liam Huang's Implementation in C++ * https://github.com/joseftw/jos.weightedresult/blob/develop/src/JOS.WeightedResult/AliasMethodVose.cs C# implementation of Vose's algorithm.


References

{{reflist Pseudorandom number generators