A jury theorem is a

mathematical theorem

proving that, under certain assumptions, a decision attained using

majority voting

in a large group is more likely to be correct than a decision attained by a single expert. It serves as a formal argument for the idea of

wisdom of the crowd

, for deciding questions of fact by

jury trial

, and for

democracy

in general. The first and most famous jury theorem is

Condorcet's jury theorem

. It assumes that all voters have independent probabilities to vote for the correct alternative, these probabilities are larger than 1/2, and are the same for all voters. Under these assumptions, the probability that the majority decision is correct is strictly larger when the group is larger; and when the group size tends to infinity, the probability that the majority decision is correct tends to 1. There are many other jury theorems, relaxing some or all of these assumptions.

Setting

The premise of all jury theorems is that there is an ''

objective truth

'', which is unknown to the voters. Most theorems focus on ''binary issues'' (issues with two possible states), for example, whether a certain

defendant

is guilty or innocent, whether a certain

stock

is going to rise or fall, etc. There are

n

voters (or jurors), and their goal is to reveal the truth. Each voter has an ''

opinion

'' about which of the two options is correct. The opinion of each voter is either correct (i.e., equals the true state), or wrong (i.e., differs from the true state). This is in contrast to other settings of

voting

, in which the opinion of each voter represents his/her subjective preferences and is thus always "correct" for this specific voter. The opinion of a voter can be considered a

random variable

: for each voter, there is a positive probability that his opinion equals the true state. The group decision is determined by the ''

majority rule

''. For example, if a majority of voters says "guilty" then the decision is "guilty", while if a majority says "innocent" then the decision is "innocent". To avoid ties, it is often assumed that the number of voters

n

is odd. Alternatively, if

n

is even, then ties are broken by tossing a

fair coin

. Jury theorems concern the ''probability of correctness'': the probability that the majority decision coincides with the objective truth. Typical jury theorems make two kinds of claims about this probability:
# ''Growing Reliability'': the probability of correctness is larger when the group is larger.
# ''Crowd Infallibility'': the probability of correctness goes to 1 when the group size goes to infinity.
Claim 1 is often called the ''non-asymptotic part'' and claim 2 the ''asymptotic part'' of the jury theorem. These claims are not always true, but they hold under certain assumptions on the voters. Different jury theorems make different assumptions.

Independence, competence, and uniformity

Condorcet's jury theorem makes the following three assumptions:
# ''Unconditional Independence'': the voters make up their minds independently. In other words, their opinions are independent random variables.
# ''Unconditional Competence'': the probability that the opinion of a single voter coincides with the objective truth is larger than 1/2 (i.e., the voter is smarter than a random coin-toss).
# ''Uniformity'': all voters have the same probability of being correct.
The jury theorem of Condorcet says that these three assumptions imply Growing Reliability and Crowd Infallibility.
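Under the three assumptions above, the probability of correctness is a binomial tail sum, so both claims can be checked numerically. The following is a minimal sketch, not taken from the source; the function name `majority_correct_probability` is an illustrative choice, and ties for even ''n'' are broken by a fair coin as described in the Setting section.

```python
from math import comb


def majority_correct_probability(n: int, p: float) -> float:
    """Probability that a simple majority of n independent voters,
    each correct with the same probability p, reaches the correct decision.

    For even n, a tie is resolved by a fair coin (counted with weight 1/2).
    """
    total = 0.0
    for k in range(n + 1):
        # Probability that exactly k of the n voters are correct.
        prob_k = comb(n, k) * p**k * (1 - p) ** (n - k)
        if 2 * k > n:
            total += prob_k          # strict majority is correct
        elif 2 * k == n:
            total += 0.5 * prob_k    # tie, decided by a fair coin
    return total


if __name__ == "__main__":
    for n in (1, 3, 11, 101):
        print(n, majority_correct_probability(n, 0.6))
```

With ''p'' = 0.6, three voters are correct with probability 0.648, already above the single-voter value of 0.6, and the value keeps rising for larger odd ''n''.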

Correlated votes: weakening the independence assumption

The opinions of different voters are often correlated, so Unconditional Independence may not hold. In this case, the Growing Reliability claim might fail.

Example

Let

p

be the probability of a juror voting for the correct alternative and

c

be the (second-order) ''

correlation coefficient

'' between any two correct votes. If all higher-order correlation coefficients in the Bahadur representation of the joint probability distribution of votes are equal to zero, and

(p,c)\in\mathcal{B}_n

is an admissible pair, then the probability of the jury collectively reaching the correct decision under simple majority is given by:

P(n,p,c)=I_p\left(\frac{n+1}{2},\frac{n+1}{2}\right)+0.5c(n-1)(0.5-p)\frac{\partial I_p\left(\frac{n+1}{2},\frac{n+1}{2}\right)}{\partial p},

where

I_p

is the ''regularized incomplete beta function''. ''Example:'' Take a jury of three jurors

(n=3)

, with individual competence

p=0.55

and second-order correlation

c=0.4

. Then

P(3,0.55,0.4)=0.54505

. The competence of the jury is lower than the competence of a single juror, which equals

0.55

. Moreover, enlarging the jury by two jurors

(n=5)

decreases the jury competence even further,

P(5,0.55,0.4)=0.5196194

. Note that

p=0.55

and

c=0.4

is an admissible pair of parameters. For

n=5

and

p=0.55

, the maximum admissible second-order correlation coefficient equals

\approx 0.43

. The above example shows that when the individual competence is low but the correlation is high:
* The collective competence under simple majority may fall below that of a single juror;
* Enlarging the jury may decrease its collective competence.
The above result is due to Kaniovski and Zaigraev. They also discuss optimal jury design for homogeneous juries with correlated votes. There are several jury theorems that weaken the Independence assumption in various ways.
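The two numbers in the example can be reproduced with a short computation. This is a sketch under stated assumptions: ''n'' is odd, the function name is illustrative, and the code does not check that (''p'', ''c'') is an admissible pair. For odd ''n'' and ''m'' = (''n''+1)/2, the regularized incomplete beta function I_p(''m'',''m'') equals the binomial probability of at least ''m'' correct votes, and its derivative with respect to ''p'' equals ''n''!/((''m''-1)!)² · (''p''(1-''p''))^(''m''-1).

```python
from math import comb, factorial


def correlated_majority_probability(n: int, p: float, c: float) -> float:
    """Probability of a correct simple-majority decision for an odd jury of
    size n, individual competence p and second-order correlation c, with all
    higher-order Bahadur correlation coefficients equal to zero.

    Assumption: the caller supplies an admissible pair (p, c); admissibility
    is NOT verified here.
    """
    if n % 2 == 0:
        raise ValueError("this sketch only handles odd n")
    m = (n + 1) // 2
    # I_p(m, m) for odd n: binomial tail P(at least m of n votes correct).
    independent_part = sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1)
    )
    # dI_p(m, m)/dp = p^(m-1) (1-p)^(m-1) / B(m, m), with B(m, m) = ((m-1)!)^2 / n!
    derivative = factorial(n) / factorial(m - 1) ** 2 * (p * (1 - p)) ** (m - 1)
    return independent_part + 0.5 * c * (n - 1) * (0.5 - p) * derivative


if __name__ == "__main__":
    print(correlated_majority_probability(3, 0.55, 0.4))
    print(correlated_majority_probability(5, 0.55, 0.4))
```

Setting ''c'' = 0 recovers the independent case, in which the correction term vanishes.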

Truth-sensitive independence and competence

In binary decision problems, there is often one option that is easier to detect than the other one. For example, it may be easier to detect that a defendant is guilty (as there is clear evidence for guilt) than to detect that he is innocent. In this case, the probability that the opinion of a single voter is correct is represented by two different numbers: probability given that option #1 is correct, and probability given that option #2 is correct. This also implies that opinions of different voters are

correlated

. This motivates the following relaxations of the above assumptions:
# ''Conditional Independence'': for each of the two options, the voters' opinions given that this option is the true one are independent random variables.
# ''Conditional Competence'': for each of the two options, the probability that a single voter's opinion is correct given that this option is true is larger than 1/2.
# ''Conditional Uniformity'': for each of the two options, all voters have the same probability of being correct given that this option is true.
Growing Reliability and Crowd Infallibility continue to hold under these weaker assumptions. One criticism of Conditional Competence is that it depends on the way the decision question is formulated. For example, instead of asking whether the defendant is guilty or innocent, one can ask whether the defendant is guilty of exactly 10 charges (option A), or guilty of another number of charges (0 to 9, or more than 10). This changes the conditions, and hence the conditional probability. Moreover, if the state is very specific, then the probability of voting correctly might be below 1/2, so Conditional Competence might not hold.

Effect of an opinion leader

Another cause of correlation between voters is the existence of an

opinion leader

. Suppose each voter makes an independent decision, but then each voter, with some fixed probability, changes his opinion to match that of the opinion leader. Jury theorems by Boland and by Boland, Proschan and Tong show that, if (and only if) the probability of following the opinion leader is less than 1 - 1/(2''p'') (where ''p'' is the competence level of all voters), then Crowd Infallibility holds.

Problem-sensitive independence and competence

In addition to the dependence on the true option, there are many other reasons for which voters' opinions may be correlated. For example:
* Deliberation among voters;
* Peer pressure;
* False evidence (e.g. a guilty defendant who excels at pretending to be innocent);
* External conditions (e.g. poor weather affecting their judgement);
* Any other common cause of votes.
It is possible to weaken the Conditional Independence assumption, and conditionalize on ''all'' common causes of the votes (rather than just the state). In other words, the votes are now independent ''conditioned on the specific decision problem''. However, in a specific problem, the Conditional Competence assumption may not be valid. For example, in a specific problem with false evidence, it is likely that most voters will have a wrong opinion. Thus, the two assumptions (conditional independence and conditional competence) are not justifiable simultaneously under the same conditionalization. A possible solution is to weaken Conditional Competence as follows. For each voter and each problem ''x'', there is a probability ''p''(''x'') that the voter's opinion is correct in this specific problem. Since ''x'' is a random variable, ''p''(''x'') is a random variable too. Conditional Competence requires that ''p''(''x'') > 1/2 with probability 1. The weakened assumption is:
* ''Tendency to Competence'': for each voter, and for each ''r''>0, the probability that ''p''(''x'') = 1/2+''r'' is at least as large as the probability that ''p''(''x'') = 1/2-''r''.
A jury theorem by Dietrich and Spiekermann says that Conditional Independence, Tendency to Competence, and Conditional Uniformity together imply Growing Reliability. Note that Crowd Infallibility is not implied. In fact, the probability of correctness tends to a value below 1 if and only if Conditional Competence does not hold.

Bounded correlation

A jury theorem by Pivato shows that, if the average covariance between voters becomes small as the population becomes large, then Crowd Infallibility holds (for some voting rule). There are other jury theorems that take into account the degree to which votes may be correlated.

Diverse capabilities: weakening the uniformity assumption

Different voters often have different competence levels, so the Uniformity assumption does not hold. In this case, both Growing Reliability and Crowd Infallibility may fail. This may happen if new voters have much lower competence than existing voters, so that adding new voters decreases the group's probability of correctness. In some cases, the probability of correctness might converge to 1/2 (that is, a random decision) rather than to 1.
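A small computation illustrates how adding weaker voters can lower the probability of correctness. The sketch below assumes independent voters with individually specified competence levels; the function name and the competence values 0.9 and 0.6 are illustrative choices, not taken from the source.

```python
def heterogeneous_majority_probability(competences):
    """Probability that a simple majority is correct when voter i is
    independently correct with probability competences[i].

    Ties (possible only for an even number of voters) are broken by a fair coin.
    """
    # dist[k] = probability that exactly k of the voters processed so far are correct.
    dist = [1.0]
    for p in competences:
        new_dist = [0.0] * (len(dist) + 1)
        for k, q in enumerate(dist):
            new_dist[k] += q * (1 - p)   # this voter is wrong
            new_dist[k + 1] += q * p     # this voter is correct
        dist = new_dist

    n = len(competences)
    total = 0.0
    for k, q in enumerate(dist):
        if 2 * k > n:
            total += q
        elif 2 * k == n:
            total += 0.5 * q
    return total


if __name__ == "__main__":
    print(heterogeneous_majority_probability([0.9]))            # one strong voter
    print(heterogeneous_majority_probability([0.9, 0.6, 0.6]))  # two weaker voters added
```

In this illustration a single voter with competence 0.9 is correct with probability 0.9, while adding two voters with competence 0.6 lowers the majority's probability of correctness to 0.792, even though every voter is individually better than a coin toss.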

Stronger competence requirements

Uniformity can be dropped if the Competence assumption is strengthened. There are several ways to strengthen it:
* Strong Competence: for each voter ''i'', the probability of correctness ''p_i'' is at least 1/2+''e'', where ''e''>0 is fixed for all voters. In other words, the competence is bounded away from a fair coin toss. A jury theorem by Paroush shows that Strong Competence and Conditional Independence together imply Crowd Infallibility (but not Growing Reliability).
* Average Competence: the ''average'' of the individual competence levels of the voters (i.e. the average of their individual probabilities of deciding correctly) is slightly greater than 1/2, or converges to a value above 1/2. Jury theorems by Grofman, Owen and Feld, and by Berend and Paroush, show that Average Competence and Conditional Independence together imply Crowd Infallibility (but not Growing Reliability).

Random voter selection

Instead of assuming that the voter identity is fixed, one can assume that there is a large pool of potential voters with different competence levels, and the actual voters are selected at random from this pool (as in

sortition

). A jury theorem by Ben-Yashar and Paroush shows that, under certain conditions, the correctness probability of a jury, or of a subset of it chosen at random, is larger than the correctness probability of a single juror selected at random. A more general jury theorem by Berend and Sapir proves that Growing Reliability holds in this setting: the correctness probability of a random committee increases with the committee size. The theorem holds, under certain conditions, even with correlated votes. A jury theorem by Owen, Grofman and Feld analyzes a setting where the competence level is random. They show what distribution of individual competence maximizes or minimizes the probability of correctness.

Weighted majority rule

When the competence levels of the voters are known, the simple majority rule may not be the best decision rule. There are various works on identifying the ''optimal decision rule'', that is, the rule maximizing the group correctness probability. Nitzan and Paroush show that, under Unconditional Independence, the optimal decision rule is a ''weighted'' majority rule, where the weight of each voter with correctness probability ''p_i'' is log(''p_i''/(1-''p_i'')), and an alternative is selected if and only if the sum of weights of its supporters is above some threshold. Grofman and Shapley analyze the effect of interdependencies between voters on the optimal decision rule. Ben-Yashar and Nitzan prove a more general result. Dietrich generalizes this result to a setting that does not require prior probabilities of the 'correctness' of the two alternatives. The only required assumption is Epistemic Monotonicity, which says that, if under a certain profile alternative ''x'' is selected, and the profile changes such that ''x'' becomes more probable, then ''x'' is still selected. Dietrich shows that Epistemic Monotonicity implies that the optimal decision rule is weighted majority with a threshold. In the same paper, he generalizes the optimal decision rule to a setting that does not require the input to be a vote for one of the alternatives. It can be, for example, a subjective degree of belief. Moreover, competence parameters do not need to be known. For example, if the inputs are subjective beliefs ''x_1'',...,''x_n'', then the optimal decision rule sums log(''x_i''/(1-''x_i'')) and checks whether the sum is above some threshold. Epistemic Monotonicity is not sufficient for computing the threshold itself; the threshold can be computed by assuming expected-utility maximization and prior probabilities.

A general problem with weighted majority rules is that they require knowing the competence levels of the different voters, which are usually hard to compute in an objective way. Baharad, Goldberger, Koppel and Nitzan present an algorithm that solves this problem using statistical machine learning. It requires as input only a list of past votes; it does not need to know whether these votes were correct or not. If the list is sufficiently large, then its probability of correctness converges to 1 even if the individual voters' competence levels are close to 1/2.
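The log-odds weighting described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' implementation: votes are encoded as +1 and -1 (a hypothetical encoding), competence levels are taken as known and strictly between 0 and 1, and the default threshold of 0 on the signed sum is an assumption that treats the two alternatives symmetrically. The function names are illustrative.

```python
from math import log


def log_odds_weights(competences):
    """Weight log(p_i / (1 - p_i)) for each voter with competence p_i in (0, 1)."""
    return [log(p / (1 - p)) for p in competences]


def weighted_majority(votes, competences, threshold=0.0):
    """Return +1 if the weighted sum of votes exceeds the threshold, else -1.

    votes: sequence of +1 / -1, one entry per voter (assumed encoding).
    threshold: 0.0 is an assumed default for the symmetric case; a score
    exactly equal to the threshold is resolved in favour of -1 in this sketch.
    """
    weights = log_odds_weights(competences)
    score = sum(w * v for w, v in zip(weights, votes))
    return 1 if score > threshold else -1


if __name__ == "__main__":
    competences = [0.9, 0.6, 0.6]
    # The single strong voter outweighs the two weaker ones.
    print(weighted_majority([1, -1, -1], competences))
```

With competences 0.9, 0.6 and 0.6, the weights are roughly 2.20, 0.41 and 0.41, so the rule follows the strongest voter even when the other two disagree. With equal competences the rule reduces to simple majority.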

More than two options

Often, decision problems involve three or more options. This critical limitation was in fact recognized by Condorcet (see

Condorcet's paradox

), and in general it is very difficult to reconcile individual decisions between three or more outcomes (see

Arrow's theorem

). This limitation may also be overcome by means of a sequence of votes on pairs of alternatives, as is commonly realized via the legislative amendment process. (However, as per Arrow's theorem, this creates a "path dependence" on the exact sequence of pairs of alternatives; e.g., which amendment is proposed first can make a difference in what amendment is ultimately passed, or whether the law, with or without amendments, is passed at all.) With three or more options, Conditional Competence can be generalized as follows:
* Multioption Conditional Competence: for any two options ''x'' and ''y'', if ''x'' is correct and ''y'' is not, then any voter is more likely to vote for ''x'' than for ''y''.
A jury theorem by List and Goodin shows that Multioption Conditional Competence and Conditional Independence together imply Crowd Infallibility. Dietrich and Spiekermann conjecture that they imply Growing Reliability too. Another related jury theorem is by Everaere, Konieczny and Marquis. When there are more than two options, there are various voting rules that can be used instead of simple majority. The statistical and utilitarian properties of such rules are analyzed e.g. by Pivato.

Indirect majority systems

Condorcet's theorem considers a ''direct majority system'', in which all votes are counted directly towards the final outcome. Many countries use an ''indirect majority system'', in which the voters are divided into groups. The voters in each group decide on an outcome by an internal majority vote; then, the groups decide on the final outcome by a majority vote among them. For example, suppose there are 15 voters. In a direct majority system, a decision is accepted whenever at least 8 votes support it. Suppose now that the voters are grouped into 3 groups of size 5 each. A decision is accepted whenever at least 2 groups support it, and in each group, a decision is accepted whenever at least 3 voters support it. Therefore, a decision may be accepted even if only 6 voters support it. Boland, Proschan and Tong prove that, when the voters are independent and ''p'' > 1/2, a direct majority system (as in Condorcet's theorem) always has a higher chance of accepting the correct decision than any indirect majority system. Berg and Paroush consider multi-tier voting hierarchies, which may have several levels with different decision-making rules in each level. They study the optimal voting structure, and compare the competence against the benefit of time-saving and other expenses. Goodin and Spiekermann compute the amount by which a small group of experts should be better than the average voters, in order for them to reach better decisions.
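The 15-voter example can be checked numerically. The sketch below assumes independent voters with a common competence ''p'', equal-sized groups, and odd sizes at both levels; the function names and the value ''p'' = 0.6 are illustrative choices, not taken from the source.

```python
from math import comb


def majority_correct_probability(n: int, p: float) -> float:
    """Probability that a simple majority of n independent voters with common
    competence p is correct (ties for even n broken by a fair coin)."""
    total = 0.0
    for k in range(n + 1):
        prob_k = comb(n, k) * p**k * (1 - p) ** (n - k)
        if 2 * k > n:
            total += prob_k
        elif 2 * k == n:
            total += 0.5 * prob_k
    return total


def indirect_majority_probability(groups: int, group_size: int, p: float) -> float:
    """Two-tier system: each group decides by internal majority, then the
    groups decide by majority among themselves."""
    group_p = majority_correct_probability(group_size, p)
    return majority_correct_probability(groups, group_p)


def min_supporters_indirect(groups: int, group_size: int) -> int:
    """Fewest individual supporters that can carry a two-tier decision:
    a bare majority of groups, each won by a bare internal majority."""
    return (groups // 2 + 1) * (group_size // 2 + 1)


if __name__ == "__main__":
    print(min_supporters_indirect(3, 5))               # 6, as in the example
    print(majority_correct_probability(15, 0.6))       # direct system
    print(indirect_majority_probability(3, 5, 0.6))    # 3 groups of 5
```

For these illustrative values the direct system has the higher probability of correctness, in line with the result of Boland, Proschan and Tong.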

Strategic voting

It is well-known that, when there are three or more alternatives, and voters have different preferences, they may engage in

strategic voting

, for example by voting for the second-best option in order to prevent the worst option from being elected. Surprisingly, strategic voting might occur even with two alternatives and when all voters have the same preference, which is to reveal the truth. For example, suppose the question is whether a defendant is guilty or innocent, and suppose a certain juror thinks the true answer is "guilty". However, he also knows that his vote is effective only if the other votes are tied. But, if other votes are tied, it means that the probability that the defendant is guilty is close to 1/2. Taking this into account, our juror might decide that this probability is not sufficient for deciding "guilty", and thus will vote "innocent". But if all other voters do the same, the wrong answer is derived. In game-theoretic terms, truthful voting might not be a

Nash equilibrium

. This problem has been termed ''the swing voter's curse'', as it is analogous to the winner's curse in auction theory. A jury theorem by Peleg and Zamir shows sufficient and necessary conditions for the existence of a Bayesian-Nash equilibrium that satisfies Condorcet's jury theorem. Bozbay, Dietrich and Peters show voting rules that lead to efficient aggregation of the voters' private information even with strategic voting. In practice, this problem may not be very severe, since most voters care not only about the final outcome, but also about voting correctly by their conscience. Moreover, most voters are not sophisticated enough to vote strategically.

Subjective opinions

The notion of "correctness" may not be meaningful when making policy decisions, which are based on values or preferences, rather than just on facts. Some defenders of the theorem hold that it is applicable when voting is aimed at determining which policy best promotes the public good, rather than at merely expressing individual preferences. On this reading, what the theorem says is that although each member of the electorate may only have a vague perception of which of two policies is better, majority voting has an amplifying effect. The "group competence level", as represented by the probability that the majority chooses the better alternative, increases towards 1 as the size of the electorate grows, assuming that each voter is more often right than wrong. Several papers show that, under reasonable conditions, large groups are better trackers of the majority preference.

External links

* Franz Dietrich and Kai Spiekermann, "Jury Theorems", Stanford Encyclopedia of Philosophy, November 17, 2021.