
In statistics, the Horvitz–Thompson estimator, named after Daniel G. Horvitz and Donovan J. Thompson, is a method for estimating the total and mean of a pseudo-population in a stratified sample. Inverse probability weighting is applied to account for different proportions of observations within strata in a target population. The Horvitz–Thompson estimator is frequently applied in survey analyses and can be used to account for missing data, as well as many sources of unequal selection probabilities.


The method

Formally, let Y_i, i = 1, 2, \ldots, n be an independent sample from ''n'' of ''N ≥ n'' distinct strata with a common mean ''μ''. Suppose further that \pi_i is the inclusion probability that a randomly sampled individual in a superpopulation belongs to the ''i''th stratum. The Horvitz–Thompson estimator of the total is given by:
: \hat{Y}_{HT} = \sum_{i=1}^n \pi_i^{-1} Y_i,
and the Horvitz–Thompson estimate of the mean is given by:
: \hat{\mu}_{HT} = N^{-1}\hat{Y}_{HT} = N^{-1}\sum_{i=1}^n \pi_i^{-1} Y_i.

In a Bayesian probabilistic framework, \pi_i is considered the proportion of individuals in a target population belonging to the ''i''th stratum. Hence, \pi_i^{-1} Y_i could be thought of as an estimate of the complete sample of persons within the ''i''th stratum. The Horvitz–Thompson estimator can also be expressed as the limit of a weighted bootstrap resampling estimate of the mean. It can also be viewed as a special case of multiple imputation approaches.

For post-stratified study designs, estimation of \pi and \mu is done in distinct steps. In such cases, computing the variance of \hat{\mu}_{HT} is not straightforward. Resampling techniques such as the bootstrap or the jackknife can be applied to obtain consistent estimates of the variance of the Horvitz–Thompson estimator. The "survey" package for R conducts analyses for post-stratified data using the Horvitz–Thompson estimator.
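The estimators of the total and the mean described above can be illustrated with a minimal Python sketch. The function names (`ht_total`, `ht_mean`, `bootstrap_se`) and the data are hypothetical, chosen only for illustration. The bootstrap shown is a naive one: it resamples (value, inclusion probability) pairs with replacement and ignores finite-population corrections and any stratified design, so it should be read as a rough approximation rather than the procedure used by any particular survey package.

```python
import random
import statistics


def ht_total(values, inclusion_probs):
    """Horvitz-Thompson total: sum of sampled values weighted by 1 / pi_i."""
    if len(values) != len(inclusion_probs):
        raise ValueError("values and inclusion_probs must have equal length")
    if any(not 0 < p <= 1 for p in inclusion_probs):
        raise ValueError("inclusion probabilities must lie in (0, 1]")
    return sum(y / p for y, p in zip(values, inclusion_probs))


def ht_mean(values, inclusion_probs, population_size):
    """Horvitz-Thompson mean: the estimated total divided by N."""
    return ht_total(values, inclusion_probs) / population_size


def bootstrap_se(values, inclusion_probs, population_size, n_boot=2000, seed=0):
    """Naive bootstrap standard error of the HT mean.

    Assumption: (value, probability) pairs are resampled with replacement,
    ignoring finite-population corrections and any stratified design.
    """
    rng = random.Random(seed)
    pairs = list(zip(values, inclusion_probs))
    replicates = []
    for _ in range(n_boot):
        resample = [rng.choice(pairs) for _ in pairs]
        ys, ps = zip(*resample)
        replicates.append(ht_mean(ys, ps, population_size))
    return statistics.stdev(replicates)


# Hypothetical data: three sampled units from a population of N = 10.
y = [12.0, 7.0, 20.0]
pi = [0.5, 0.25, 0.2]
total = ht_total(y, pi)       # 12/0.5 + 7/0.25 + 20/0.2
mean = ht_mean(y, pi, 10)     # estimated total divided by N
se = bootstrap_se(y, pi, 10)  # depends on the seed and on n_boot
```

Units with small inclusion probabilities receive large weights, so a single such unit can dominate both the estimate and its resampling variability.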


Proof of Horvitz-Thompson Unbiased Estimation of the Mean

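The Horvitz–Thompson estimator can be shown to be unbiased by evaluating its expectation, \mathbf{E}\bar{X}_n^{HT}, as follows:
: \mathbf{E}\bar{X}_n^{HT} = \mathbf{E}\frac{1}{N}\sum_{i=1}^{n}\frac{X_{I_i}}{\pi_{I_i}}
: = \mathbf{E}\frac{1}{N}\sum_{i=1}^{N}\frac{X_i}{\pi_i}1_{i \in D_n}
: = \sum_{b=1}^{B}P(D_n^{(b)})\left[\frac{1}{N}\sum_{i=1}^{N}\frac{X_i}{\pi_i}1_{i \in D_n^{(b)}}\right]
: = \frac{1}{N}\sum_{i=1}^{N}\frac{X_i}{\pi_i}\sum_{b=1}^{B}1_{i \in D_n^{(b)}}P(D_n^{(b)})
: = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{X_i}{\pi_i}\right)\pi_i
: = \frac{1}{N}\sum_{i=1}^{N}X_i
: \text{where}~D_n = \{x_1, x_2, \ldots, x_n\}
Here I_i is the population index of the ''i''th sampled unit, and D_n^{(b)}, b = 1, \ldots, B, ranges over the possible samples. The fifth line follows because the inclusion probability \pi_i is the total probability of all samples that contain unit ''i''.

The Hansen–Hurwitz (1943) strategy is known to be inferior to the Horvitz–Thompson (1952) strategy, associated with a number of Inclusion Probabilities Proportional to Size (IPPS) sampling procedures (Prabhu-Ajgaonkar, S. G. "Comparison of the Horvitz-Thompson Strategy with the Hansen-Hurwitz Strategy." Survey Methodology (1987): 221).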
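The expectation in the unbiasedness proof can be checked numerically by enumerating every possible sample of a small population. The sketch below is illustrative only. It assumes simple random sampling without replacement, so all samples are equally likely and every inclusion probability is n / N. The population values and variable names are hypothetical. Exact rational arithmetic is used so that the equality holds without rounding error.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite population of N = 5 values.
population = [3, 8, 1, 10, 6]
N = len(population)
n = 2

# Assumed design: simple random sampling without replacement, so each of the
# B = C(N, n) samples is equally likely and pi_i = n / N for every unit.
samples = list(combinations(range(N), n))
B = len(samples)
sample_prob = Fraction(1, B)
pi = [Fraction(n, N)] * N

# Inclusion probability as the total probability of samples containing unit i.
inclusion = [sum(sample_prob for s in samples if i in s) for i in range(N)]


def ht_mean(sample):
    """Horvitz-Thompson mean for one sample of unit indices."""
    return sum(Fraction(population[i]) / pi[i] for i in sample) / N


# Expectation of the estimator over all possible samples.
expected = sum(sample_prob * ht_mean(s) for s in samples)
true_mean = Fraction(sum(population), N)
```

Under these assumptions `inclusion` equals `pi` and `expected` equals `true_mean`, which mirrors the step in the proof where the sum over samples collapses to \pi_i.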


References


External links


Survey Package Website for R