Bertrand's Ballot Theorem
   HOME

TheInfoList



OR:

In
combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and an end in obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many appl ...
, Bertrand's ballot problem is the question: "In an
election An election is a formal group decision-making process by which a population chooses an individual or multiple individuals to hold public office. Elections have been the usual mechanism by which modern representative democracy has opera ...
where candidate A receives ''p'' votes and candidate B receives ''q'' votes with ''p'' > ''q'', what is the
probability Probability is the branch of mathematics concerning numerical descriptions of how likely an Event (probability theory), event is to occur, or how likely it is that a proposition is true. The probability of an event is a number between 0 and ...
that A will be strictly ahead of B throughout the count?" The answer is :\frac. The result was first published by W. A. Whitworth in 1878, but is named after
Joseph Louis François Bertrand Joseph Louis François Bertrand (; 11 March 1822 – 5 April 1900) was a French mathematician who worked in the fields of number theory, differential geometry, probability theory, economics and thermodynamics. Biography Joseph Bertrand was the ...
who rediscovered it in 1887. In Bertrand's original paper, he sketches a proof based on a general formula for the number of favourable sequences using a
recursion relation In mathematics, a recurrence relation is an equation according to which the nth term of a sequence of numbers is equal to some combination of the previous terms. Often, only k previous terms of the sequence appear in the equation, for a parameter ...
. He remarks that it seems probable that such a simple result could be proved by a more direct method. Such a proof was given by
Désiré André Désiré André (André Antoine Désiré) (March 29, 1840, Lyon – September 12, 1917, Paris) was a French mathematician, best known for his work on Catalan numbers and alternating permutations. Biography He is the son of Auguste Antoine Dési ...
, based on the observation that the unfavourable sequences can be divided into two equally probable cases, one of which (the case where B receives the first vote) is easily computed; he proves the equality by an explicit
bijection In mathematics, a bijection, also known as a bijective function, one-to-one correspondence, or invertible function, is a function between the elements of two sets, where each element of one set is paired with exactly one element of the other s ...
. A variation of his method is popularly known as André's reflection method, although André did not use any reflections. The Bertrand's ballot theorem is equivalent to the Cycle lemma.


Example

Suppose there are 5 voters, of whom 3 vote for candidate ''A'' and 2 vote for candidate ''B'' (so ''p'' = 3 and ''q'' = 2). There are ten possibilities for the order of the votes cast: *''AAABB'' *''AABAB'' *''ABAAB'' *''BAAAB'' *''AABBA'' *''ABABA'' *''BAABA'' *''ABBAA'' *''BABAA'' *''BBAAA'' For the order ''AABAB'', the tally of the votes as the election progresses is: For each column the tally for ''A'' is always larger than the tally for ''B'' so the ''A'' is always strictly ahead of ''B''. For the order ''AABBA'' the tally of the votes as the election progresses is: For this order, ''B'' is tied with ''A'' after the fourth vote, so ''A'' is not always strictly ahead of ''B''. Of the 10 possible orders, ''A'' is always ahead of ''B'' only for ''AAABB'' and ''AABAB''. So the probability that ''A'' will always be strictly ahead is :\frac=\frac, and this is indeed equal to \frac as the theorem predicts.


Equivalent problems

Rather than computing the probability that a random vote counting order has the desired property, one can instead compute the number of favourable counting orders, then divide by the total number of ways in which the votes could have been counted. (This is the method used by Bertrand.) The total number of ways is the
binomial coefficient In mathematics, the binomial coefficients are the positive integers that occur as coefficients in the binomial theorem. Commonly, a binomial coefficient is indexed by a pair of integers and is written \tbinom. It is the coefficient of the t ...
\tbinom; Bertrand's proof shows that the number of favourable orders in which to count the votes is \tbinom-\tbinom (though he does not give this number explicitly). And indeed after division this gives \tfrac-\tfrac=\tfrac. Another equivalent problem is to calculate the number of
random walk In mathematics, a random walk is a random process that describes a path that consists of a succession of random steps on some mathematical space. An elementary example of a random walk is the random walk on the integer number line \mathbb Z ...
s on the
integer An integer is the number zero (), a positive natural number (, , , etc.) or a negative integer with a minus sign (−1, −2, −3, etc.). The negative numbers are the additive inverses of the corresponding positive numbers. In the language ...
s that consist of ''n'' steps of unit length, beginning at the origin and ending at the point ''m'', that never become negative. Assuming ''n'' and ''m'' have the same parity and n\ge m\ge 0, this number is :\binom-\binom = \frac\binom. When m=0 and n is even, this gives the
Catalan number In combinatorial mathematics, the Catalan numbers are a sequence of natural numbers that occur in various counting problems, often involving recursively defined objects. They are named after the French-Belgian mathematician Eugène Charles Cata ...
\frac1\binom. [Note that if m=0, then n must be even. Otherwise, the random walk cannot end at 0. This can be seen as follows: let P be the number of "positive" moves, i.e., to the right, and let N be the number of "negative" moves, i.e., to the left. Since P+N=n and P-N=m, we have P=\frac and N=\frac. Since P and N are integers, if m=0, then \frac must be an integer. In other words, n must be even.]


Proof by reflection

For A to be strictly ahead of B throughout the counting of the votes, there can be no ties. Separate the counting sequences according to the first vote. Any sequence that begins with a vote for B must reach a tie at some point, because A eventually wins. For any sequence that begins with A and reaches a tie, reflect the votes up to the point of the first tie (so any A becomes a B, and vice versa) to obtain a sequence that begins with B. Hence every sequence that begins with A and reaches a tie is in one-to-one correspondence with a sequence that begins with B, and the probability that a sequence begins with B is q/(p+q), so the probability that A always leads the vote is : = 1- the probability of sequences that tie at some point : = 1- the probability of sequences that tie at some point and begin with A or B : = 1- 2 \times ( the probability of sequences that tie at some point and begin with B ) : = 1- 2 \times ( the probability that a sequence begins with B ) : = 1-2\frac=\frac


Proof by induction

Another method of proof is by
mathematical induction Mathematical induction is a method for proving that a statement ''P''(''n'') is true for every natural number ''n'', that is, that the infinitely many cases ''P''(0), ''P''(1), ''P''(2), ''P''(3), ...  all hold. Informal metaphors help ...
: *We loosen the condition p > q to p \geq q. Clearly, the theorem is correct when p = q, since in this case the first candidate will not be ''strictly'' ahead after all the votes have been counted (so the probability is 0). *Clearly the theorem is true if ''p'' > 0 and ''q'' = 0 when the probability is 1, given that the first candidate receives all the votes; it is also true when ''p'' = ''q'' > 0 as we have just seen. *Assume it is true both when ''p'' = ''a'' − 1 and ''q'' = ''b'', and when ''p'' = ''a'' and ''q'' = ''b'' − 1, with ''a'' > ''b'' > 0. (We don't need to consider the case a = b here, since we have already disposed of it before.) Then considering the case with ''p'' = ''a'' and ''q'' = ''b'', the last vote counted is either for the first candidate with probability ''a''/(''a'' + ''b''), or for the second with probability ''b''/(''a'' + ''b''). So the probability of the first being ahead throughout the count to the penultimate vote counted (and also after the final vote) is: ::+=. *And so it is true for all ''p'' and ''q'' with ''p'' > ''q'' > 0.


Proof via the Cycle lemma

A simple proof is based on the beautiful Cycle Lemma of Dvoretzky and Motzkin. Call a ballot sequence ''dominating'' if A is strictly ahead of B throughout the counting of the votes. The Cycle Lemma asserts that any sequence of p A's and q B's, where p> q, has precisely p-q dominating cyclic permutations. To see this, just arrange the given sequence of p+q A's and B's in a circle and repeatedly remove adjacent pairs AB until only p-q A's remain. Each of these A's was the start of a dominating cyclic permutation before anything was removed. So p-q out of the p+q cyclic permutations of any arrangement of p A votes and q B votes are dominating.


Proof by martingales

Let n= p+q. Define the "backwards counting" stochastic process X_k = \frac; \quad k = 0, 1, ..., n-1 where S_ is the lead of candidate A over B, after n-k votes have come in. Claim: X_k is a martingale process.
Given X_k, we know that S_ = (n-k)X_k, so of the first n-k votes, \frac2 (n-k) were for candidate A, and \frac2 (n-k) were for candidate B.
So, with probability \frac2, we have S_ = S_ -1, and X_ = \fracX_k - \frac. Similarly for the other one. Then compute to find E X_k= X_k.
Define the stopping time T as either the minimum k such that X_k =0, or n-1 if there's no such k. Then the probability that candidate A leads all the time is just E _T/math>, which by the optional stopping theorem isE _T= E _0= \frac


Bertrand's and André's proofs

Bertrand expressed the solution as :\frac where \mu=p+q is the total number of voters and m=p is the number of voters for the first candidate. He states that the result follows from the formula :P_=P_+P_, where P_ is the number of favourable sequences, but "it seems probable that such a simple result could be shown in a more direct way". Indeed, a more direct proof was soon produced by Désiré André. His approach is often mistakenly labelled "the reflection principle" by modern authors but in fact uses a permutation. He shows that the "unfavourable" sequences (those that reach an intermediate tie) consist of an equal number of sequences that begin with A as those that begin with B. Every sequence that begins with B is unfavourable, and there are \tbinom such sequences with a B followed by an arbitrary sequence of (''q''-1) B's and ''p'' A's. Each unfavourable sequence that begins with A can be transformed to an arbitrary sequence of (''q''-1) B's and ''p'' A's by finding the first B that violates the rule (by causing the vote counts to tie) and deleting it, and interchanging the order of the remaining parts. To reverse the process, take any sequence of (''q''-1) B's and ''p'' A's and search from the end to find where the number of A's first exceeds the number of B's, and then interchange the order of the parts and place a B in between. For example, the unfavourable sequence AABBABAA corresponds uniquely to the arbitrary sequence ABAAAAB. From this, it follows that the number of favourable sequences of ''p'' A's and ''q'' B's is :\binom-2\binom=\binom\frac and thus the required probability is :\frac as expected.


Variant: ties allowed

The original problem is to find the probability that the first candidate is always strictly ahead in the vote count. One may instead consider the problem of finding the probability that the second candidate is never ahead (that is, with ties are allowed). In this case, the answer is :\frac. The variant problem can be solved by the reflection method in a similar way to the original problem. The number of possible vote sequences is \tbinom. Call a sequence "bad" if the second candidate is ever ahead, and if the number of bad sequences can be enumerated then the number of "good" sequences can be found by subtraction and the probability can be computed. Represent a voting sequence as a
lattice path In combinatorics, a lattice path in the -dimensional integer lattice of length with steps in the set , is a sequence of vectors such that each consecutive difference v_i - v_ lies in . A lattice path may lie in any lattice in , but the int ...
on the Cartesian plane as follows: * Start the path at (0, 0) * Each time a vote for the first candidate is received move right 1 unit. * Each time a vote for the second candidate is received move up 1 unit. Each such path corresponds to a unique sequence of votes and will end at (''p'', ''q''). A sequence is 'good' exactly when the corresponding path never goes above the diagonal line ''y'' = ''x''; equivalently, a sequence is 'bad' exactly when the corresponding path touches the line ''y'' = ''x'' + 1. For each 'bad' path ''P'', define a new path ''P''′ by reflecting the part of ''P'' up to the first point it touches the line across it. ''P''′ is a path from (−1, 1) to (''p'', ''q''). The same operation applied again restores the original ''P''. This produces a one-to-one correspondence between the 'bad' paths and the paths from (−1, 1) to (''p'', ''q''). The number of these paths is \tbinom and so that is the number of 'bad' sequences. This leaves the number of 'good' sequences as :\binom - \binom = \binom\frac. Since there are \tbinom altogether, the probability of a sequence being good is \tfrac. In fact, the solutions to the original problem and the variant problem are easily related. For candidate A to be strictly ahead throughout the vote count, they must receive the first vote and for the remaining votes (ignoring the first) they must be either strictly ahead or tied throughout the count. Hence the solution to the original problem is :\frac\frac=\frac as required. Conversely, the tie case can be derived from the non-tie case. Note that the ''number'' of non-tie sequences with p+1 votes for A is equal to the number of tie sequences with p votes for A. The number of non-tie votes with p + 1 votes for A votes is \tfrac \tbinom , which by algebraic manipulation is \tfrac \tbinom , so the ''fraction'' of sequences with p votes for A votes is \tfrac.


Notes


References


Ballot theorems, old and new
L. Addario-Berry, B.A. Reed, 2007, i
Horizons of combinatorics
Editors Ervin Győri, G. Katona, Gyula O. H. Katona,
László Lovász László Lovász (; born March 9, 1948) is a Hungarian mathematician and professor emeritus at Eötvös Loránd University, best known for his work in combinatorics, for which he was awarded the 2021 Abel Prize jointly with Avi Wigderson. He ...
, Springer, 2008,


External links


The Ballot Problem
(includes scans of the original French articles and English translations) * Bernard Bru
Les leçons de calcul des probabilités de Joseph Bertrand
history of the problem (in French) * {{MathWorld , title=Ballot Problem , urlname=BallotProblem Probability problems Enumerative combinatorics Theorems in combinatorics Probability theorems Articles containing proofs Voting theory