
The margin of error is a statistic expressing the amount of random sampling error in the results of a survey. The larger the margin of error, the less confidence one should have that a poll result would reflect the result of a census of the entire population. The margin of error will be positive whenever a population is incompletely sampled and the outcome measure has positive variance, which is to say, the measure ''varies''. The term ''margin of error'' is often used in non-survey contexts to indicate observational error in reporting measured quantities.


Concept

Consider a simple ''yes/no'' poll P as a sample of n respondents drawn from a population N (n \ll N), reporting the percentage p of ''yes'' responses. We would like to know how close p is to the true result of a survey of the entire population N, without having to conduct one. If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), we would expect those subsequent results p_1, p_2, \ldots to be normally distributed about \overline{p}, the true (but unknown) percentage for the whole population. The ''margin of error'' describes the distance within which a specified percentage of these results is expected to vary from \overline{p}.

According to the 68-95-99.7 rule, we would expect that 95% of the results p_1, p_2, \ldots will fall within ''about'' two standard deviations (\pm 2\sigma_{P}) either side of the true mean \overline{p}. This interval is called the confidence interval, and the ''radius'' (half the interval) is called the ''margin of error'', corresponding to a 95% ''confidence level''.

Generally, at a confidence level \gamma, a sample of size n from a population having expected standard deviation \sigma has a margin of error

: MOE_\gamma = z_\gamma \times \sqrt{\frac{\sigma^2}{n}}

where z_\gamma denotes the ''quantile'' (also, commonly, a ''z-score''), and \sqrt{\frac{\sigma^2}{n}} is the standard error.
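The general formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `z_score` and `margin_of_error` are invented here, the quantile is taken as the two-sided normal quantile (an assumption consistent with the 95% and "about two standard deviations" pairing in the text), and the values sigma = 0.5 and n = 1000 are purely illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_score(gamma: float) -> float:
    """Two-sided normal quantile z_gamma for a confidence level 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("confidence level must lie strictly between 0 and 1")
    # A central interval with coverage gamma leaves (1 - gamma) / 2 in each tail.
    return NormalDist().inv_cdf((1.0 + gamma) / 2.0)


def margin_of_error(sigma: float, n: int, gamma: float = 0.95) -> float:
    """MOE_gamma = z_gamma * sqrt(sigma^2 / n)."""
    return z_score(gamma) * sqrt(sigma ** 2 / n)


if __name__ == "__main__":
    # Illustrative inputs only (assumed, not taken from the article).
    print(f"z_0.95   = {z_score(0.95):.3f}")
    print(f"MOE_0.95 = {margin_of_error(0.5, 1000):.4f}")
```

Separating the quantile from the standard error mirrors the two factors in the formula, so either can be swapped out (for example, a different confidence level) without touching the other.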


Standard deviation and standard error

We would expect the normally distributed values p_1, p_2, \ldots to have a standard deviation that shrinks as n grows: the smaller n, the wider the margin. This standard deviation is called the standard error \sigma_{\overline{p}}. For the single result from our survey, we ''assume'' that p = \overline{p}, and that ''all'' subsequent results p_1, p_2, \ldots together would have a variance \sigma_{P}^2 = P(1-P).

: \text{Standard error} = \sigma_{\overline{p}} \approx \sqrt{\frac{\sigma_{P}^2}{n}} \approx \sqrt{\frac{p(1-p)}{n}}

Note that p(1-p) corresponds to the variance of a Bernoulli distribution.
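The hypothetical "repeat the poll many times" experiment can be imitated with a small simulation. The sketch below is an assumption-laden illustration: the true proportion, the number of repeats, the seed, and the helper names `simulate_polls` and `standard_error` are all invented for the example. It draws many independent polls and compares the spread of their results with \sqrt{p(1-p)/n}.

```python
import random
from math import sqrt
from statistics import pstdev


def simulate_polls(p_true, n, repeats, seed=0):
    """Return the 'yes' share from `repeats` independent polls of n respondents."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    shares = []
    for _ in range(repeats):
        # Each respondent answers 'yes' with probability p_true (a Bernoulli draw).
        yes = sum(rng.random() < p_true for _ in range(n))
        shares.append(yes / n)
    return shares


def standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    shares = simulate_polls(p_true=0.5, n=1013, repeats=2000)
    print(f"empirical spread of poll results: {pstdev(shares):.4f}")
    print(f"sqrt(p(1-p)/n)                  : {standard_error(0.5, 1013):.4f}")
```

The two printed numbers should be close but not identical, since the empirical spread is itself subject to sampling noise.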


Maximum margin of error at different confidence levels

For a confidence ''level'' \gamma, there is a corresponding confidence ''interval'' about the mean \mu \pm z_\gamma\sigma, that is, the interval [\mu - z_\gamma\sigma, \mu + z_\gamma\sigma] within which values of P should fall with probability \gamma. Precise values of z_\gamma are given by the quantile function of the normal distribution (which the 68-95-99.7 rule approximates). Note that z_\gamma is undefined for |\gamma| \ge 1, that is, z_{1.00} is undefined, as is z_{1.10}.

Since \max \sigma_P^2 = \max P(1-P) = 0.25 at p = 0.5, we can arbitrarily set p = \overline{p} = 0.5, calculate \sigma_{P}, \sigma_{\overline{p}}, and z_\gamma\sigma_{\overline{p}} to obtain the ''maximum'' margin of error for P at a given confidence level \gamma and sample size n, even before having actual results. With p = 0.5, n = 1013:

: MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx z_{0.95}\sqrt{\frac{\sigma_{P}^2}{n}} = 1.96\sqrt{\frac{0.25}{n}} = 0.98/\sqrt{n} = \pm 3.1\%

: MOE_{99}(0.5) = z_{0.99}\sigma_{\overline{p}} \approx z_{0.99}\sqrt{\frac{\sigma_{P}^2}{n}} = 2.58\sqrt{\frac{0.25}{n}} = 1.29/\sqrt{n} = \pm 4.1\%

Also, usefully, for any reported MOE_{95}:

: MOE_{99} = \frac{z_{0.99}}{z_{0.95}} MOE_{95} \approx 1.3 \times MOE_{95}
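The maximum margin of error can be tabulated for several confidence levels at once. The following is a rough sketch under stated assumptions: `z_score` and `max_margin_of_error` are hypothetical helper names, the 90% level is added only for comparison, and the quantiles are left unrounded, so the last printed digit may differ slightly from figures worked out with z rounded to 1.96 or 2.58.

```python
from math import sqrt
from statistics import NormalDist


def z_score(gamma):
    """Two-sided normal quantile for a confidence level 0 < gamma < 1."""
    return NormalDist().inv_cdf((1.0 + gamma) / 2.0)


def max_margin_of_error(n, gamma):
    """Worst-case margin: the variance p(1-p) peaks at 0.25 when p = 0.5."""
    return z_score(gamma) * sqrt(0.25 / n)


if __name__ == "__main__":
    n = 1013
    for gamma in (0.90, 0.95, 0.99):
        moe = max_margin_of_error(n, gamma)
        print(f"gamma = {gamma:.2f}  z = {z_score(gamma):.3f}  max MOE = {moe:.2%}")
    # Conversion factor between the 95% and 99% margins of the same poll.
    print(f"z_0.99 / z_0.95 = {z_score(0.99) / z_score(0.95):.2f}")
```

Because the sample size enters only through 1/\sqrt{n}, the ratio between margins at two confidence levels does not depend on n, which is why a single conversion factor works for any reported 95% margin.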


Specific margins of error

If a poll has multiple percentage results (for example, a poll measuring a single multiple-choice preference), the result closest to 50% will have the highest margin of error. Typically, it is this number that is reported as the margin of error for the entire poll. Imagine poll P reports p_{a}, p_{b}, p_{c} as 71%, 27%, 2%, n = 1013:

: MOE_{95}(P_{a}) = z_{0.95}\sigma_{\overline{p_{a}}} \approx 1.96\sqrt{\frac{p_{a}(1-p_{a})}{n}} = 0.89/\sqrt{n} = \pm 2.8\% (as in the figure above)

: MOE_{95}(P_{b}) = z_{0.95}\sigma_{\overline{p_{b}}} \approx 1.96\sqrt{\frac{p_{b}(1-p_{b})}{n}} = 0.87/\sqrt{n} = \pm 2.7\%

: MOE_{95}(P_{c}) = z_{0.95}\sigma_{\overline{p_{c}}} \approx 1.96\sqrt{\frac{p_{c}(1-p_{c})}{n}} = 0.27/\sqrt{n} = \pm 0.8\%

As a given percentage approaches the extremes of 0% or 100%, its margin of error approaches ±0%.
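The per-result margins for the 71% / 27% / 2% example can be computed directly. This is only a sketch: the name `margin_of_error` and the option labels are assumptions made for illustration, and the rounded quantile 1.96 is used to stay close to the worked figures. Since the code does not round intermediate values, its last digit may differ from hand calculations that do.

```python
from math import sqrt

Z_95 = 1.96  # rounded quantile for a 95% confidence level, as used in the text


def margin_of_error(p, n, z=Z_95):
    """Margin of error for a single reported proportion p from n respondents."""
    return z * sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    n = 1013
    results = {"a": 0.71, "b": 0.27, "c": 0.02}
    for option, p in results.items():
        print(f"option {option}: {p:.0%} +/- {margin_of_error(p, n):.2%}")
    # The figure usually quoted for the whole poll is the largest of these,
    # which belongs to the result closest to 50%.
    poll_wide = max(margin_of_error(p, n) for p in results.values())
    print(f"poll-wide margin: {poll_wide:.2%}")
```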


Comparing percentages

Imagine multiple-choice poll P reports p_{a}, p_{b}, p_{c} as 46%, 42%, 12%, n = 1013. As described above, the margin of error reported for the poll would typically be MOE_{95}(P_{a}), as p_{a} is closest to 50%. The popular notion of ''statistical tie'' or ''statistical dead heat,'' however, concerns itself not with the accuracy of the individual results, but with that of the ''ranking'' of the results. Which is in first?

If, hypothetically, we were to conduct poll P over subsequent samples of n respondents (newly drawn from N), and report the result p_{w} = p_{a} - p_{b}, we could use the ''standard error of difference'' to understand how p_{w_1}, p_{w_2}, p_{w_3}, \ldots is expected to fall about \overline{p_{w}}. For this, we need to apply the ''sum of variances'' to obtain a new variance, \sigma_{P_{w}}^2:

: \sigma_{P_{w}}^2 = \sigma_{P_{a}-P_{b}}^2 = \sigma_{P_{a}}^2 + \sigma_{P_{b}}^2 - 2\sigma_{P_{a},P_{b}} = p_{a}(1-p_{a}) + p_{b}(1-p_{b}) + 2p_{a}p_{b}

where \sigma_{P_{a},P_{b}} = -P_{a}P_{b} is the covariance of P_{a} and P_{b}. Thus (after simplifying), with P_{w} = P_{a} - P_{b}:

: \text{Standard error of difference} = \sigma_{\overline{w}} \approx \sqrt{\frac{\sigma_{P_{w}}^2}{n}} = \sqrt{\frac{p_{a} + p_{b} - (p_{a} - p_{b})^2}{n}} = 0.029

: MOE_{95}(P_{a}) = z_{0.95}\sigma_{\overline{p_{a}}} \approx \pm 3.1\%

: MOE_{95}(P_{w}) = z_{0.95}\sigma_{\overline{w}} \approx \pm 5.8\%

Note that this assumes that P_{c} is close to constant, that is, respondents choosing either A or B would almost never choose C (making P_{a} and P_{b} close to ''perfectly negatively correlated''). With three or more choices in closer contention, choosing a correct formula for \sigma_{P_{w}}^2 becomes more complicated.
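The standard error of the difference is easy to check numerically. In the sketch below, the function names are hypothetical, and the "tie" rule (a lead no larger than the margin of error of the difference) is an assumed reading of ''statistical tie'' rather than a definition given in the text. The variance formula is the one stated above, including the covariance term.

```python
from math import sqrt

Z_95 = 1.96  # rounded quantile for a 95% confidence level


def se_difference(p_a, p_b, n):
    """Standard error of p_a - p_b for two options measured in the same poll."""
    # Var(A - B) = Var(A) + Var(B) - 2 Cov(A, B), with Cov(A, B) = -p_a * p_b.
    variance = p_a * (1.0 - p_a) + p_b * (1.0 - p_b) + 2.0 * p_a * p_b
    return sqrt(variance / n)


def moe_difference(p_a, p_b, n, z=Z_95):
    """Margin of error of the lead p_a - p_b."""
    return z * se_difference(p_a, p_b, n)


def is_statistical_tie(p_a, p_b, n, z=Z_95):
    """Assumed criterion: the lead does not exceed its own margin of error."""
    return abs(p_a - p_b) <= moe_difference(p_a, p_b, n, z)


if __name__ == "__main__":
    p_a, p_b, n = 0.46, 0.42, 1013
    print(f"lead             : {p_a - p_b:.1%}")
    print(f"SE of difference : {se_difference(p_a, p_b, n):.3f}")
    print(f"MOE of difference: {moe_difference(p_a, p_b, n):.1%}")
    print(f"statistical tie? : {is_statistical_tie(p_a, p_b, n)}")
```

With these inputs the four-point lead is smaller than the margin of error of the difference, even though it exceeds the margin usually quoted for the poll as a whole.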


Effect of finite population size

The formulae above for the margin of error assume that there is an infinitely large population and thus do not depend on the size of population N, but only on the sample size n. According to sampling theory, this assumption is reasonable when the sampling fraction is small. The margin of error for a particular sampling method is essentially the same regardless of whether the population of interest is the size of a school, city, state, or country, as long as the sampling ''fraction'' is small.

In cases where the sampling fraction is larger (in practice, greater than 5%), analysts might adjust the margin of error using a finite population correction (FPC) to account for the added precision gained by sampling a much larger percentage of the population. FPC can be calculated using the formula (Equation 1)

: \operatorname{FPC} = \sqrt{\frac{N-n}{N-1}}

...and so, if poll P were conducted over 24% of, say, an electorate of 300,000 voters,

: MOE_{95}(0.5) = z_{0.95}\sigma_{\overline{p}} \approx \frac{0.98}{\sqrt{72,000}} = \pm 0.4\%

: MOE_{95_{FPC}}(0.5) = z_{0.95}\sigma_{\overline{p}}\sqrt{\frac{N-n}{N-1}} \approx \frac{0.98}{\sqrt{72,000}}\sqrt{\frac{300,000-72,000}{300,000-1}} = \pm 0.3\%

Intuitively, for appropriately large N,

: \lim_{n \to 0} \sqrt{\frac{N-n}{N-1}} \approx 1

: \lim_{n \to N} \sqrt{\frac{N-n}{N-1}} = 0

In the former case, n is so small as to require no correction. In the latter case, the poll effectively becomes a census and sampling error becomes moot.
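The effect of the finite population correction on the electorate example can be sketched as follows. The helper names `fpc` and `max_margin_of_error` are assumptions made for illustration, and the quantile is rounded to 1.96 to match the worked figures.

```python
from math import sqrt

Z_95 = 1.96  # rounded quantile for a 95% confidence level


def fpc(population, sample):
    """Finite population correction sqrt((N - n) / (N - 1))."""
    if population < 2 or not 0 < sample <= population:
        raise ValueError("need 0 < sample <= population and population >= 2")
    return sqrt((population - sample) / (population - 1))


def max_margin_of_error(sample, population=None, z=Z_95):
    """Worst-case (p = 0.5) margin, optionally corrected for a finite population."""
    moe = z * sqrt(0.25 / sample)
    if population is not None:
        moe *= fpc(population, sample)
    return moe


if __name__ == "__main__":
    N, n = 300_000, 72_000  # 24% of the electorate
    print(f"uncorrected: {max_margin_of_error(n):.2%}")
    print(f"with FPC   : {max_margin_of_error(n, N):.2%}")
    # The two limiting cases: a tiny sample needs no correction,
    # and a full census leaves no sampling error.
    print(f"FPC for n = 1: {fpc(N, 1):.4f}")
    print(f"FPC for n = N: {fpc(N, N):.4f}")
```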


See also

* Engineering tolerance
* Key relevance
* Measurement uncertainty
* Random error


Sources

* Sudman, Seymour and Bradburn, Norman (1982). ''Asking Questions: A Practical Guide to Questionnaire Design''. San Francisco: Jossey Bass.


External links

* "Margin of Error". MathWorld.