statistics Statistics (from German language, German: ', "description of a State (polity), state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a s ...

, Hodges' estimator (or the Hodges–Le Cam estimator), named for Joseph Hodges, is a famous

counterexample A counterexample is any exception to a generalization. In logic a counterexample disproves the generalization, and does so rigorously in the fields of mathematics and philosophy. For example, the fact that "student John Smith is not lazy" is a c ...

of an

estimator In statistics, an estimator is a rule for calculating an estimate of a given quantity based on Sample (statistics), observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguish ...

which is " superefficient", i.e. it attains smaller asymptotic variance than regular

efficient estimator In statistics, efficiency is a measure of quality of an estimator, of an experimental design, or of a hypothesis testing procedure. Essentially, a more efficient estimator needs fewer input data or observations than a less efficient one to achiev ...

s. The existence of such a counterexample is the reason for the introduction of the notion of

regular estimator Regular estimators are a class of statistical estimators that satisfy certain regularity conditions which make them amenable to asymptotic analysis. The convergence of a regular estimator's distribution is, in a sense, locally uniform. This is oft ...

s. Hodges' estimator improves upon a regular estimator at a single point. In general, any superefficient estimator may surpass a regular estimator at most on a set of

Lebesgue measure In measure theory, a branch of mathematics, the Lebesgue measure, named after French mathematician Henri Lebesgue, is the standard way of assigning a measure to subsets of higher dimensional Euclidean '-spaces. For lower dimensions or , it c ...

zero. Although Hodges discovered the estimator he never published it; the first publication was in the doctoral thesis of

Lucien Le Cam Lucien Marie Le Cam (November 18, 1924 – April 25, 2000) was a mathematician and statistician. Biography Le Cam was born November 18, 1924, in Croze, France. His parents were farmers, and unable to afford higher education for him; his father d ...

Construction

Suppose

\hat_n

is a "common" estimator for some parameter

\theta

: it is

consistent In deductive logic, a consistent theory is one that does not lead to a logical contradiction. A theory T is consistent if there is no formula \varphi such that both \varphi and its negation \lnot\varphi are elements of the set of consequences ...

, and converges to some

asymptotic distribution In mathematics and statistics, an asymptotic distribution is a probability distribution that is in a sense the limiting distribution of a sequence of distributions. One of the main uses of the idea of an asymptotic distribution is in providing appr ...

L_\theta

(usually this is a

normal distribution In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = \frac ...

with mean zero and variance which may depend on

\theta

) at the

\sqrt

-rate: :

\sqrt(\hat\theta_n - \theta)\ \xrightarrow\ L_\theta\ .

Then the Hodges' estimator

\hat_n^H

is defined as :

\hat\theta_n^H = \begin\hat\theta_n, & \text , \hat\theta_n,  \geq n^, \text \\ 0, & \text , \hat\theta_n,  < n^.\end

This estimator is equal to

\hat_n

everywhere except on the small interval

n^,n^/math>, where it is equal to zero. It is not difficult to see that this estimator is

for

\theta

, and its

is :

\begin
    & n^\alpha(\hat\theta_n^H - \theta) \ \xrightarrow\ 0, \qquad\text \theta = 0, \\
    &\sqrt(\hat\theta_n^H - \theta)\ \xrightarrow\ L_\theta, \quad \text \theta\neq 0,
  \end

for any

\alpha\in\mathbb

. Thus this estimator has the same asymptotic distribution as

\hat_n

for all

\theta\neq 0

, whereas for

\theta=0

the rate of convergence becomes arbitrarily fast. This estimator is ''superefficient'', as it surpasses the asymptotic behavior of the efficient estimator

\hat_n

at least at one point

\theta=0

. It is not true that the Hodges estimator is equivalent to the sample mean, but much better when the true mean is 0. The correct interpretation is that, for finite

n

, the truncation can lead to worse square error than the sample mean estimator for

E /math> close to 0, as is shown in the example in the following section.

Le Cam shows that this behaviour is typical: superefficiency at the point θ implies the existence of a sequence \theta_n \rightarrow \theta such that \lim \inf E \theta_n \ell (\sqrt n (\hat \theta_n - \theta_n )) is strictly larger than the Cramér-Rao bound . For the
extreme case where the asymptotic risk at θ is zero, the \liminf is even infinite for a sequence \theta_n \rightarrow \theta .

In general, superefficiency may only be attained on a subset of Lebesgue measure zero of the parameter space \Theta . Vaart AW van der. ''Asymptotic Statistics''. Cambridge University Press; 1998.

Example

Suppose ''x''₁, ..., ''x_n'' is an

independent and identically distributed Independent or Independents may refer to: Arts, entertainment, and media Artist groups * Independents (artist group), a group of modernist painters based in Pennsylvania, United States * Independentes (English: Independents), a Portuguese artist ...

(IID) random sample from normal distribution with unknown mean but known variance. Then the common estimator for the population mean ''θ'' is the arithmetic mean of all observations:

\scriptstyle\bar

. The corresponding Hodges' estimator will be

\scriptstyle\hat\theta^H_n \;=\; \bar\cdot\mathbf\

, where 1 denotes the

indicator function In mathematics, an indicator function or a characteristic function of a subset of a set is a function that maps elements of the subset to one, and all other elements to zero. That is, if is a subset of some set , then the indicator functio ...

. The

mean square error In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator (of a procedure for estimating an unobserved quantity) measures the average of the squares of the errors—that is, the average squared difference between ...

(scaled by ''n'') associated with the regular estimator ''x'' is constant and equal to 1 for all ''θ''s. At the same time the mean square error of the Hodges' estimator

\scriptstyle\hat\theta_n^H

behaves erratically in the vicinity of zero, and even becomes unbounded as . This demonstrates that the Hodges' estimator is not

regular Regular may refer to: Arts, entertainment, and media Music * "Regular" (Badfinger song) * Regular tunings of stringed instruments, tunings with equal intervals between the paired notes of successive open strings Other uses * Regular character, ...

, and its asymptotic properties are not adequately described by limits of the form (''θ'' fixed, ).

Notes

References

* * * * {{refend Estimator

Construction

Example

See also

Notes

References