Jackknife resampling

In statistics, the jackknife (jackknife cross-validation) is a cross-validation technique and, therefore, a form of resampling. It is especially useful for bias and variance estimation. The jackknife pre-dates other common resampling methods such as the bootstrap. Given a sample of size $n$, a jackknife estimator can be built by aggregating the parameter estimates from each subsample of size $(n-1)$ obtained by omitting one observation. The jackknife technique was developed by Maurice Quenouille (1924–1973) from 1949 and refined in 1956. John Tukey expanded on the technique in 1958 and proposed the name "jackknife" because, like a physical jack-knife (a compact folding knife), it is a rough-and-ready tool that can improvise a solution for a variety of problems even though specific problems may be more efficiently solved with a purpose-designed tool. The jackknife is a linear approximation of the bootstrap.
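
As a concrete illustration of this construction, here is a minimal Python/NumPy sketch; the helper name `jackknife_replicates` and the sample data are illustrative assumptions, not part of the original presentation. It computes the leave-one-out estimates for an arbitrary estimator:

```python
import numpy as np

def jackknife_replicates(estimator, data):
    """Apply `estimator` to each leave-one-out subsample of `data`."""
    data = np.asarray(data)
    n = len(data)
    # Replicate i is the estimate computed with observation i removed.
    return np.array([estimator(np.delete(data, i)) for i in range(n)])

# Illustrative example: leave-one-out medians of a small made-up sample.
sample = np.array([2.0, 4.0, 7.0, 1.0, 9.0])
print(jackknife_replicates(np.median, sample))
```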

# A simple example: Mean estimation

The jackknife estimator of a parameter is found by systematically leaving out each observation from the dataset, calculating the parameter estimate over the remaining observations, and then aggregating these calculations. For example, if the parameter to be estimated is the population mean of a random variable $x$, then for a given set of i.i.d. observations $x_1, \ldots, x_n$ the natural estimator is the sample mean:

:$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \sum_{i \in [n]} x_i,$

where the last sum uses another way to indicate that the index $i$ runs over the set $[n] = \{1, \ldots, n\}$. Then we proceed as follows: for each $i \in [n]$ we compute the mean $\bar{x}_{(i)}$ of the jackknife subsample consisting of all data points except the $i$-th, called the $i$-th jackknife replicate:

:$\bar{x}_{(i)} = \frac{1}{n-1} \sum_{j \in [n],\, j \neq i} x_j, \qquad i = 1, \ldots, n.$

The jackknife estimate of the mean is then the average of these $n$ replicates,

:$\bar{x}_{\mathrm{jack}} = \frac{1}{n} \sum_{i=1}^{n} \bar{x}_{(i)},$

which for the sample mean coincides exactly with the ordinary estimate $\bar{x}$.
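
The following sketch carries out these steps for the sample mean; the data are made up for illustration. As noted above, averaging the replicates simply recovers $\bar{x}$:

```python
import numpy as np

# Illustrative (made-up) data; x_bar is the full-sample mean.
x = np.array([3.1, 2.7, 4.5, 3.9, 2.2, 5.0])
n = len(x)
x_bar = x.mean()

# i-th jackknife replicate: the mean with observation i omitted.
x_bar_i = np.array([np.delete(x, i).mean() for i in range(n)])

# The jackknife estimate is the average of the replicates; for the
# sample mean it coincides exactly with the full-sample mean.
x_bar_jack = x_bar_i.mean()
assert np.isclose(x_bar_jack, x_bar)
print(x_bar, x_bar_jack)
```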

# Estimating the bias of an estimator

The jackknife technique can be used to estimate (and correct) the bias of an estimator calculated over the entire sample. Suppose $\theta$ is the target parameter of interest, which is assumed to be some functional of the distribution of $x$. Based on a finite set of observations $x_1, \ldots, x_n$, which is assumed to consist of i.i.d. copies of $x$, the estimator $\hat{\theta}$ is constructed:

:$\hat{\theta} = f_n(x_1, \ldots, x_n).$

The value of $\hat{\theta}$ is sample-dependent, so it will change from one random sample to another. By definition, the bias of $\hat{\theta}$ is

:$\text{bias}(\hat{\theta}) = \operatorname{E}[\hat{\theta}] - \theta.$

One may wish to compute several values of $\hat{\theta}$ from several samples and average them to obtain an empirical approximation of $\operatorname{E}[\hat{\theta}]$, but this is impossible when there are no "other samples": the entire set of available observations was used to calculate $\hat{\theta}$. In this situation the jackknife can help. One constructs the leave-one-out replicates

:$\hat{\theta}_{(i)} = f_{n-1}(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n), \qquad i = 1, \ldots, n,$

each computed from the sample with the $i$-th observation removed, together with their average $\bar{\hat{\theta}}_{(\cdot)} = \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_{(i)}$. The jackknife estimate of the bias of $\hat{\theta}$ is

:$\widehat{\text{bias}}(\hat{\theta})_{\mathrm{jack}} = (n-1)\left(\bar{\hat{\theta}}_{(\cdot)} - \hat{\theta}\right),$

and the resulting bias-corrected jackknife estimate of $\theta$ is

:$\hat{\theta}_{\mathrm{jack}} = \hat{\theta} - \widehat{\text{bias}}(\hat{\theta})_{\mathrm{jack}} = n\hat{\theta} - (n-1)\bar{\hat{\theta}}_{(\cdot)}.$

This correction removes the bias term of order $1/n$, leaving a residual bias of order $1/n^2$.
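
As an illustration, the sketch below applies these formulas to the plug-in variance estimator $\frac{1}{n}\sum_j (x_j - \bar{x})^2$, whose downward bias is a classic example; the data are made up, and the final assertion reflects the known result that the jackknife correction of the plug-in variance reproduces the unbiased sample variance:

```python
import numpy as np

# Illustrative data; theta_hat is the biased plug-in (1/n) variance.
x = np.array([3.1, 2.7, 4.5, 3.9, 2.2, 5.0])
n = len(x)
theta_hat = np.var(x)          # divides by n, biased low by a factor (n-1)/n

# Leave-one-out replicates of the same estimator, and their average.
theta_i = np.array([np.var(np.delete(x, i)) for i in range(n)])
theta_dot = theta_i.mean()

# Jackknife bias estimate and bias-corrected estimator.
bias_jack = (n - 1) * (theta_dot - theta_hat)
theta_jack = theta_hat - bias_jack   # = n*theta_hat - (n-1)*theta_dot

# For the plug-in variance, the correction recovers the unbiased
# (1/(n-1)) sample variance exactly.
assert np.isclose(theta_jack, np.var(x, ddof=1))
print(theta_hat, bias_jack, theta_jack)
```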

# Estimating the variance of an estimator

The jackknife technique can also be used to estimate the variance of an estimator calculated over the entire sample. Using the leave-one-out replicates $\hat{\theta}_{(i)}$ and their average $\bar{\hat{\theta}}_{(\cdot)}$ defined above, the jackknife estimate of the variance of $\hat{\theta}$ is

:$\widehat{\operatorname{var}}(\hat{\theta})_{\mathrm{jack}} = \frac{n-1}{n} \sum_{i=1}^{n} \left(\hat{\theta}_{(i)} - \bar{\hat{\theta}}_{(\cdot)}\right)^2.$
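
Here is a minimal sketch of this formula in Python/NumPy; the function name and data are illustrative assumptions. As a sanity check, for the sample mean the jackknife variance estimate reduces exactly to the familiar $s^2/n$:

```python
import numpy as np

def jackknife_variance(estimator, data):
    """Jackknife estimate of the variance of `estimator` on `data`."""
    data = np.asarray(data)
    n = len(data)
    theta_i = np.array([estimator(np.delete(data, i)) for i in range(n)])
    theta_dot = theta_i.mean()
    # (n-1)/n times the sum of squared deviations of the replicates
    # from their average.
    return (n - 1) / n * np.sum((theta_i - theta_dot) ** 2)

# For the sample mean, the result equals s^2 / n exactly.
x = np.array([3.1, 2.7, 4.5, 3.9, 2.2, 5.0])
assert np.isclose(jackknife_variance(np.mean, x), np.var(x, ddof=1) / len(x))
print(jackknife_variance(np.mean, x))
```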