Neyman construction, named after Jerzy Neyman, is a frequentist method to construct an interval at a confidence level <math>C</math>, such that if we repeat the experiment many times the interval will contain the true value of some parameter a fraction <math>C</math> of the time.
Theory
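Assume <math>X_1, X_2, \ldots, X_n</math> are random variables with joint pdf <math>f(x_1, x_2, \ldots, x_n \mid \theta_1, \theta_2, \ldots, \theta_k)</math>, which depends on k unknown parameters. For convenience, let <math>\Theta</math> be the sample space defined by the n random variables and subsequently define a sample point in the sample space as <math>X = (X_1, X_2, \ldots, X_n)</math>.
Neyman originally proposed defining two functions <math>L(X)</math> and <math>U(X)</math> such that for any sample point, <math>X</math>,
* <math>L(X) \le U(X)</math> for all <math>X \in \Theta</math>
* L and U are single valued and defined.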
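Given an observation, <math>X'</math>, the probability that <math>\theta_1</math> lies between <math>L(X')</math> and <math>U(X')</math> is defined as <math>P(L(X') \le \theta_1 \le U(X') \mid X')</math>, which takes the value <math>0</math> or <math>1</math>. These calculated probabilities fail to draw meaningful inference about <math>\theta_1</math> since the probability is simply zero or unity. Furthermore, under the frequentist construct the model parameters are unknown constants and not permitted to be random variables.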
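For example, if the observed limits are 2 and 10 and <math>\theta_1 = 5</math>, then <math>P(2 \le 5 \le 10) = 1</math>. Likewise, if <math>\theta_1 = 11</math>, then <math>P(2 \le 11 \le 10) = 0</math>.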
As Neyman describes in his 1937 paper, suppose that we consider all points in the sample space, that is, all <math>X \in \Theta</math>, which are a system of random variables defined by the joint pdf described above. Since <math>L</math> and <math>U</math> are functions of <math>X</math>, they too are random variables, and one can examine the meaning of the following probability statement:
:Under the frequentist construct the model parameters are unknown constants and not permitted to be random variables. Considering all the sample points in the sample space as random variables defined by the joint pdf above, that is, all <math>X \in \Theta</math>, it can be shown that <math>L</math> and <math>U</math> are functions of random variables and hence random variables. Therefore one can look at the probability statement about <math>L(X)</math> and <math>U(X)</math> for some value <math>\theta_1'</math>. If <math>\theta_1'</math> is the true value of <math>\theta_1</math>, we can define <math>L</math> and <math>U</math> such that the probability that <math>L(X) \le \theta_1'</math> and <math>\theta_1' \le U(X)</math> is equal to the pre-specified confidence level.
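That is,
:<math>P(L(X) \le \theta_1' \le U(X) \mid \theta_1') = C</math>
where <math>0 < C < 1</math> is the pre-specified confidence level, and <math>L(X)</math> and <math>U(X)</math> are the lower and upper confidence limits for <math>\theta_1</math>.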
Coverage probability
The coverage probability, <math>C</math>, for Neyman construction is the frequency of experiments in which the confidence interval contains the actual value of interest. Generally, the coverage probability is set to 95% confidence. For Neyman construction, the coverage probability is set to some value <math>C</math> where <math>0 < C < 1</math>. This value <math>C</math> tells how confident we are that the true value will be contained in the interval.
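The frequency interpretation of coverage can be checked numerically. The following is a minimal Python sketch rather than part of the construction itself: it assumes normally distributed data with a known standard deviation, and the function names `z_interval` and `estimate_coverage`, along with every numerical value, are hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def z_interval(sample, sigma, level):
    """Two-sided interval for a normal mean, assuming sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)      # two-sided critical value
    half_width = z * sigma / sqrt(len(sample))
    centre = fmean(sample)
    return centre - half_width, centre + half_width

def estimate_coverage(true_mean, sigma, n, level, trials, seed=0):
    """Fraction of simulated experiments whose interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma, level)
        hits += lower <= true_mean <= upper        # True counts as 1
    return hits / trials

# Illustrative values only: true mean 3, sigma 2, 25 observations per experiment.
coverage = estimate_coverage(true_mean=3.0, sigma=2.0, n=25, level=0.95, trials=20000)
print(f"estimated coverage: {coverage:.3f}")
```

With enough simulated experiments the printed fraction should lie close to the requested level; any single interval, by contrast, either contains the true value or does not.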
Implementation
A Neyman construction can be carried out by performing multiple experiments that construct data sets corresponding to a given value of the parameter. The experiments are fitted with conventional methods, and the space of fitted parameter values constitutes the band from which the confidence interval can be selected.
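The procedure described above can be sketched with simulated experiments. The Python example below is an illustration under stated assumptions, not a reference implementation: the data are assumed to be exponentially distributed with unknown mean, the "fit" is simply the sample mean, the acceptance region is taken to be central, and the names `acceptance_interval` and `neyman_interval` are hypothetical.

```python
import random
from statistics import fmean

def acceptance_interval(theta, n, level, n_experiments, rng):
    """Central range of fitted values over simulated experiments at fixed theta."""
    fits = sorted(
        fmean([rng.expovariate(1.0 / theta) for _ in range(n)])
        for _ in range(n_experiments)
    )
    k = round((1.0 - level) / 2.0 * n_experiments)   # points cut from each tail
    return fits[k], fits[n_experiments - 1 - k]

def neyman_interval(observed, thetas, n, level, n_experiments=1000, seed=0):
    """Parameter values whose acceptance interval contains the observed fit."""
    rng = random.Random(seed)
    accepted = []
    for theta in thetas:
        lo, hi = acceptance_interval(theta, n, level, n_experiments, rng)
        if lo <= observed <= hi:
            accepted.append(theta)
    if not accepted:
        raise ValueError("observed value lies outside the band on this grid")
    # Simulation noise can leave gaps near the edges; report the outer limits.
    return min(accepted), max(accepted)

# Illustrative values only: a grid of candidate means from 0.5 to 4.0,
# an observed sample mean of 2.0 from 20 measurements, and a 90% level.
thetas = [0.5 + 0.1 * i for i in range(36)]
lower, upper = neyman_interval(observed=2.0, thetas=thetas, n=20, level=0.9)
print(f"90% interval for theta: [{lower:.1f}, {upper:.1f}]")
```

The resolution of the result is limited by the grid spacing and by the number of simulated experiments per grid point, both of which are arbitrary choices here.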
Classic example
Suppose <math>X \sim N(\theta, \sigma^2)</math>, where <math>\theta</math> and <math>\sigma^2</math> are unknown constants and we wish to estimate <math>\theta</math>. We can define two single-valued functions, <math>L</math> and <math>U</math>, by the process above such that, given a pre-specified confidence level <math>C</math> and a random sample <math>X^* = (x_1, x_2, \ldots, x_n)</math>:
:<math>L(X^*) = \bar{x} - t \frac{s}{\sqrt{n}}</math>
:<math>U(X^*) = \bar{x} + t \frac{s}{\sqrt{n}}</math>
where <math>\frac{s}{\sqrt{n}}</math> is the standard error, and the sample mean and standard deviation are:
:<math>\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i</math>
:<math>s = \sqrt{\frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2}</math>
The factor <math>t</math> is the <math>1 - \alpha/2</math> quantile of a ''t'' distribution with <math>n - 1</math> degrees of freedom, where <math>\alpha = 1 - C</math>.
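As an illustration of this example, the limits can be computed with a short Python sketch. It relies only on the standard library, so the ''t'' quantile is obtained numerically (Simpson integration of the density followed by bisection), which is an assumption of convenience rather than the usual way to look up the value. The helper names `t_pdf`, `t_cdf`, `t_quantile` and `t_interval` are hypothetical, and the sample is made-up data.

```python
import math
from statistics import fmean, stdev

def t_pdf(x, dof):
    """Density of Student's t distribution with dof degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2 * math.log1p(x * x / dof))

def t_cdf(x, dof, steps=2000):
    """CDF for x >= 0: one half plus a Simpson integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, dof) + t_pdf(x, dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    return 0.5 + total * h / 3

def t_quantile(p, dof):
    """Quantile for 0.5 < p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(sample, level):
    """Limits x_bar -/+ t * s / sqrt(n) for a normal mean with unknown variance."""
    n = len(sample)
    x_bar = fmean(sample)
    std_err = stdev(sample) / math.sqrt(n)   # stdev uses the n - 1 divisor
    t = t_quantile(1 - (1 - level) / 2, n - 1)
    return x_bar - t * std_err, x_bar + t * std_err

sample = [4.9, 5.6, 5.1, 4.4, 5.3, 5.0, 4.7, 5.8, 5.2, 4.6]  # made-up data
lower, upper = t_interval(sample, level=0.95)
print(f"95% interval for theta: [{lower:.3f}, {upper:.3f}]")
```

Where a statistics library is available, its ''t'' quantile function would normally replace the numerical helper.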
Another example
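Suppose <math>X_1, X_2, \ldots, X_n</math> are iid random variables with <math>X_i \sim N(\mu, \sigma^2)</math>, where <math>\sigma^2</math> is known, and let <math>T = (X_1, X_2, \ldots, X_n)</math>. We construct a confidence interval for <math>\mu</math> with confidence level <math>C</math>, writing <math>\alpha = 1 - C</math> and <math>Z_{\alpha/2}</math> for the upper <math>\alpha/2</math> quantile of the standard normal distribution. We know <math>\bar{x}</math> is sufficient for <math>\mu</math>. So,
:<math>P\left(-Z_{\alpha/2} \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le Z_{\alpha/2}\right) = C</math>
:<math>P\left(-Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = C</math>
:<math>P\left(\bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = C</math>
This produces a <math>100C\%</math> confidence interval for <math>\mu</math>, where
:<math>L(T) = \bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}</math>
:<math>U(T) = \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}}</math>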
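A numerical instance of this interval can be sketched in Python as follows. The function name `z_interval`, the sample values and the known standard deviation of 1.2 are all hypothetical and chosen only to illustrate the formulas for the lower and upper limits.

```python
from math import sqrt
from statistics import NormalDist, fmean

def z_interval(sample, sigma, level):
    """Lower and upper limits for a normal mean when sigma is known."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # upper alpha/2 quantile
    half_width = z * sigma / sqrt(len(sample))
    x_bar = fmean(sample)
    return x_bar - half_width, x_bar + half_width

sample = [9.1, 10.4, 11.2, 9.8, 10.0, 10.9, 9.5, 10.7, 10.1]  # made-up data
lower, upper = z_interval(sample, sigma=1.2, level=0.95)
print(f"95% interval for mu: [{lower:.3f}, {upper:.3f}]")
```

The half-width depends only on the known standard deviation, the sample size and the chosen level, so it is the same for every sample of a given size.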
See also
* Probability interpretations