Scoring Algorithm

The scoring algorithm, also known as Fisher's scoring, is a form of Newton's method used in statistics to solve maximum likelihood equations numerically. It is named after Ronald Fisher.


Sketch of derivation

Let Y_1,\ldots,Y_n be random variables, independent and identically distributed with twice differentiable p.d.f. f(y; \theta), and suppose we wish to calculate the maximum likelihood estimator (M.L.E.) \theta^* of \theta. First, suppose we have a starting point for our algorithm \theta_0, and consider a Taylor expansion of the score function, V(\theta), about \theta_0:

: V(\theta) \approx V(\theta_0) - \mathcal{J}(\theta_0)(\theta - \theta_0),

where

: \mathcal{J}(\theta_0) = -\sum_{i=1}^n \left. \nabla \nabla^{\top} \right|_{\theta=\theta_0} \log f(Y_i ; \theta)

is the observed information matrix at \theta_0. Now, setting \theta = \theta^*, using that V(\theta^*) = 0 and rearranging gives us:

: \theta^* \approx \theta_0 + \mathcal{J}^{-1}(\theta_0) V(\theta_0).

We therefore use the algorithm

: \theta_{m+1} = \theta_m + \mathcal{J}^{-1}(\theta_m) V(\theta_m),

and under certain regularity conditions, it can be shown that \theta_m \rightarrow \theta^*.
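As a concrete illustration of the Newton–Raphson iteration above, here is a minimal sketch in Python for a scalar parameter. The model (a Cauchy location family with unit scale), the synthetic sample, and all function names are illustrative assumptions, not part of the article; this model is chosen because its score and observed information have simple closed forms and its likelihood equation has no closed-form solution.

```python
import numpy as np

# Illustrative model (an assumption): Y_i ~ Cauchy(theta, 1), so
# log f(y; theta) = -log(pi) - log(1 + (y - theta)^2).

def score(theta, y):
    # V(theta) = sum_i d/dtheta log f(Y_i; theta)
    u = y - theta
    return np.sum(2.0 * u / (1.0 + u**2))

def observed_information(theta, y):
    # J(theta) = -sum_i d^2/dtheta^2 log f(Y_i; theta)
    u = y - theta
    return np.sum((2.0 - 2.0 * u**2) / (1.0 + u**2) ** 2)

def newton_raphson_mle(y, theta0, tol=1e-10, max_iter=100):
    # Iterate theta_{m+1} = theta_m + J(theta_m)^{-1} V(theta_m)
    theta = theta0
    for _ in range(max_iter):
        step = score(theta, y) / observed_information(theta, y)
        theta += step
        if abs(step) < tol:
            break
    return theta

rng = np.random.default_rng(0)
y = rng.standard_cauchy(200) + 3.0                 # synthetic sample, true location 3.0
print(newton_raphson_mle(y, theta0=np.median(y)))  # median is a robust starting point
```

Starting from the sample median keeps the iteration in a region where the observed information is positive; with a poor starting point the Cauchy observed information can be negative, in which case the Newton step can move away from the maximum.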


Fisher scoring

In practice, \mathcal{J}(\theta) is usually replaced by \mathcal{I}(\theta) = \mathrm{E}[\mathcal{J}(\theta)], the Fisher information, thus giving us the Fisher scoring algorithm:

: \theta_{m+1} = \theta_m + \mathcal{I}^{-1}(\theta_m) V(\theta_m).

Under some regularity conditions, if \theta_m is a consistent estimator, then \theta_{m+1} (the correction after a single step) is 'optimal' in the sense that its error distribution is asymptotically identical to that of the true maximum likelihood estimate.
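Continuing the hypothetical Cauchy location example from the previous section, the expected Fisher information has the closed form \mathcal{I}(\theta) = n/2, a constant, so the Fisher scoring step is especially cheap. This is a sketch under the same illustrative assumptions as before.

```python
import numpy as np

def score(theta, y):
    # V(theta) for the Cauchy(theta, 1) location model, as before
    u = y - theta
    return np.sum(2.0 * u / (1.0 + u**2))

def fisher_scoring(y, theta0, tol=1e-10, max_iter=100):
    # Iterate theta_{m+1} = theta_m + I(theta_m)^{-1} V(theta_m),
    # where I(theta) = E[J(theta)] = n/2 for the Cauchy(theta, 1) model
    info = len(y) / 2.0
    theta = theta0
    for _ in range(max_iter):
        step = score(theta, y) / info
        theta += step
        if abs(step) < tol:
            break
    return theta

rng = np.random.default_rng(0)
y = rng.standard_cauchy(200) + 3.0          # same synthetic sample as above
print(fisher_scoring(y, theta0=np.median(y)))
```

Because the expected information averages over the data distribution, it is positive definite wherever it is defined, which is the usual practical reason for preferring Fisher scoring over the raw Newton–Raphson step.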


See also

* Score (statistics)
* Score test
* Fisher information

