Sampling Variance

In statistics, sampling errors are incurred when the statistical characteristics of a population are estimated from a subset, or sample, of that population. Since the sample does not include all members of the population, statistics of the sample (often known as estimators), such as means and quartiles, generally differ from the statistics of the entire population (known as parameters). The difference between the sample statistic and the population parameter is considered the sampling error (Särndal, Swensson, and Wretman (1992), Model Assisted Survey Sampling, Springer-Verlag). For example, if one measures the height of a thousand individuals from a population of one million, the average height of the thousand is typically not the same as the average height of all one million people in the population. Since sampling is almost always done to estimate population parameters that are unknown, exact measurement of the sampling error is by definition not possible; however, it can often be estimated, either by general methods such as bootstrapping, or by specific methods incorporating some assumptions (or guesses) regarding the true population distribution and its parameters.
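
The height example can be sketched as a small simulation. This is only an illustration under stated assumptions: the population is synthetic (heights drawn from a normal distribution with a mean of 170 cm and a standard deviation of 10 cm, both invented for the example), and the function name `sampling_error_demo` is hypothetical. In a real survey the population mean would be unknown, so the error could not be computed directly like this.

```python
import random
import statistics


def sampling_error_demo(population_size=1_000_000, sample_size=1_000, seed=0):
    """Return (population_mean, sample_mean, sampling_error) for synthetic heights."""
    rng = random.Random(seed)
    # Assumption: a made-up population of heights in cm, normal(170, 10).
    population = [rng.gauss(170.0, 10.0) for _ in range(population_size)]
    # Simple random sample without replacement: every individual is equally likely.
    sample = rng.sample(population, sample_size)
    population_mean = statistics.fmean(population)
    sample_mean = statistics.fmean(sample)
    # Sampling error = sample statistic minus population parameter.
    return population_mean, sample_mean, sample_mean - population_mean


if __name__ == "__main__":
    pop_mean, samp_mean, error = sampling_error_demo()
    print(f"population mean: {pop_mean:.3f} cm")
    print(f"sample mean:     {samp_mean:.3f} cm")
    print(f"sampling error:  {error:+.3f} cm")
```

Changing `seed` draws a different sample, and the reported error changes with it, which is the sample-to-sample variation the paragraph above describes.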


Description


Sampling Error

The sampling error is the error caused by observing a sample instead of the whole population: the difference between a sample statistic used to estimate a population parameter and the actual but unknown value of that parameter.
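
In symbols, the definition can be written as follows. The notation is an assumption made for this illustration and does not come from the cited source: \theta denotes the population parameter, \hat{\theta} the sample statistic that estimates it, and the second expression specializes this to a sample mean \bar{x} of n observations x_1, ..., x_n estimating a population mean \mu.

```latex
\[
  \varepsilon_{\text{sampling}} = \hat{\theta} - \theta ,
  \qquad \text{for example} \qquad
  \bar{x} - \mu = \frac{1}{n}\sum_{i=1}^{n} x_i \; - \; \mu .
\]
```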


Effective Sampling

In statistics, a truly random sample means selecting individuals from a population with equal probability; in other words, picking individuals from a group without bias. Failing to do this correctly will result in a sampling bias, which can dramatically increase the sampling error in a systematic way. For example, attempting to measure the average height of the entire human population of the Earth, but measuring a sample from only one country, could result in a large over- or under-estimation. In reality, obtaining an unbiased sample can be difficult, as many parameters (in this example, country, age, gender, and so on) may strongly bias the estimator, and it must be ensured that none of these factors play a part in the selection process. Even in a perfectly unbiased sample, the sampling error will still exist due to the remaining statistical component; consider that measuring only two or three individuals and taking the average would produce a wildly varying result each time. The likely size of the sampling error can generally be reduced by taking a larger sample.
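
A rough simulation can make the two components visible. It is a sketch under invented assumptions: the population consists of two hypothetical groups of equal size with different mean heights (165 cm and 178 cm), and the "biased" scheme samples from the first group only. The function name `simulate_errors` and all numbers are illustrative.

```python
import random
import statistics


def simulate_errors(sample_size, trials=200, per_group=50_000, seed=1):
    """Compare errors of a fair random sample with a sample drawn from one group only."""
    rng = random.Random(seed)
    # Assumption: two made-up groups with different mean heights (cm).
    group_a = [rng.gauss(165.0, 8.0) for _ in range(per_group)]
    group_b = [rng.gauss(178.0, 8.0) for _ in range(per_group)]
    population = group_a + group_b
    true_mean = statistics.fmean(population)

    unbiased_errors = []
    biased_errors = []
    for _ in range(trials):
        fair = rng.sample(population, sample_size)    # equal selection probability
        one_group = rng.sample(group_a, sample_size)  # group B can never be selected
        unbiased_errors.append(statistics.fmean(fair) - true_mean)
        biased_errors.append(statistics.fmean(one_group) - true_mean)

    return {
        "unbiased_mean_error": statistics.fmean(unbiased_errors),
        "unbiased_spread": statistics.stdev(unbiased_errors),
        "biased_mean_error": statistics.fmean(biased_errors),
        "biased_spread": statistics.stdev(biased_errors),
    }


if __name__ == "__main__":
    for n in (3, 30, 300, 3000):
        result = simulate_errors(n)
        print(n, {key: round(value, 2) for key, value in result.items()})
```

In runs of this sketch, the spread of both estimators shrinks as the sample size grows, while the average error of the one-group sample stays well below zero: a larger sample reduces the random component but does not remove the systematic one.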


Sample Size Determination

The cost of increasing a sample size may be prohibitive in reality. Since the sampling error can often be estimated beforehand as a function of the sample size, various methods of sample size determination are used to weigh the predicted accuracy of an estimator against the predicted cost of taking a larger sample.
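
One common textbook approach, shown here only as an example of such a method, chooses the smallest n for which the margin of error of a sample mean stays below a target. The sketch assumes simple random sampling from a large population, a known (or guessed) population standard deviation `sigma`, and a normal approximation, so that n = ceil((z * sigma / margin)^2). The function name and the per-unit cost are hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin, under a normal approximation."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    cost_per_measurement = 2.50  # hypothetical cost in arbitrary currency units
    for margin in (2.0, 1.0, 0.5):
        n = required_sample_size(sigma=10.0, margin=margin)
        print(f"margin {margin} cm -> n = {n}, cost = {n * cost_per_measurement:.2f}")
```

Because n grows with the inverse square of the margin, halving the margin of error roughly quadruples the required sample size and therefore the cost, which is the trade-off the paragraph above refers to.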


Bootstrapping and Standard Error

As discussed, a sample statistic, such as an average or percentage, will generally be subject to sample-to-sample variation. By comparing many samples, or splitting a larger sample up into smaller ones (potentially with overlap), the spread of the resulting sample statistics can be used to estimate the standard error of the statistic.
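
The nonparametric bootstrap is one way of realizing this idea: rather than splitting the sample, it repeatedly resamples the observed data with replacement and takes the standard deviation of the recomputed statistic. The following is a minimal sketch, not a definitive implementation; the function name, the number of resamples, and the synthetic data are assumptions.

```python
import math
import random
import statistics


def bootstrap_standard_error(sample, statistic=statistics.fmean,
                             n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)  # same size as the data, with replacement
        estimates.append(statistic(resample))
    # The spread of the resampled statistics approximates the standard error.
    return statistics.stdev(estimates)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(170.0, 10.0) for _ in range(400)]  # synthetic heights in cm
    print("bootstrap SE of the mean:", bootstrap_standard_error(data))
    print("formula s / sqrt(n):     ", statistics.stdev(data) / math.sqrt(len(data)))
    print("bootstrap SE of median:  ",
          bootstrap_standard_error(data, statistic=statistics.median))
```

For the mean, the bootstrap estimate should land close to the familiar s / sqrt(n); the appeal of the method is that the same function also works for statistics such as the median, for which no equally simple formula is available.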


In Genetics

The term "sampling error" has also been used in a related but fundamentally different sense in the field of genetics; for example in the bottleneck effect or founder effect, when natural disasters or migrations dramatically reduce the size of a population, resulting in a smaller population that may or may not fairly represent the original one. This is a source of genetic drift (as certain alleles become more or less common), and has been referred to as "sampling error", despite not being an "error" in the statistical sense.
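
A toy simulation illustrates why a small surviving population may not represent the original one. It rests on strong simplifying assumptions that are not part of the text above: each survivor carries a single allele copy, and survivors are drawn independently from a very large original population in which the allele has frequency `original_frequency`. The function names are hypothetical.

```python
import random
import statistics


def surviving_allele_frequency(original_frequency, survivors, rng):
    """Frequency of an allele among the survivors of one simulated bottleneck."""
    carriers = sum(1 for _ in range(survivors) if rng.random() < original_frequency)
    return carriers / survivors


def frequency_spread(original_frequency, survivors, trials=500, seed=0):
    """Standard deviation of the surviving frequency over many simulated bottlenecks."""
    rng = random.Random(seed)
    frequencies = [
        surviving_allele_frequency(original_frequency, survivors, rng)
        for _ in range(trials)
    ]
    return statistics.stdev(frequencies)


if __name__ == "__main__":
    for survivors in (10, 100, 1000):
        spread = frequency_spread(0.3, survivors)
        print(f"{survivors} survivors: spread of allele frequency = {spread:.3f}")
```

The fewer the survivors, the more the allele frequency in the new population tends to deviate from the original value purely by chance.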


See also

* Margin of error
* Propagation of uncertainty
* Ratio estimator
* Sampling (statistics)

