Linear prediction

Linear prediction is a mathematical operation where future values of a discrete-time signal are estimated as a linear function of previous samples.

In digital signal processing, linear prediction is often called linear predictive coding (LPC) and can thus be viewed as a subset of filter theory. In system analysis, a subfield of mathematics, linear prediction can be viewed as a part of mathematical modelling or optimization.


The prediction model

The most common representation is

:\widehat{x}(n) = \sum_{i=1}^{p} a_i x(n-i),

where \widehat{x}(n) is the predicted signal value, x(n-i) the previous observed values, with p \leq n, and a_i the predictor coefficients. The error generated by this estimate is

:e(n) = x(n) - \widehat{x}(n),

where x(n) is the true signal value.

These equations are valid for all types of (one-dimensional) linear prediction. The differences are found in the way the predictor coefficients a_i are chosen.

For multi-dimensional signals the error metric is often defined as

:e(n) = \| \mathbf{x}(n) - \widehat{\mathbf{x}}(n) \|,

where \| \cdot \| is a suitably chosen vector norm. Predictions such as \widehat{x}(n) are routinely used within Kalman filters and smoothers to estimate current and past signal values, respectively, from noisy measurements.
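
As a concrete illustration, the following Python sketch computes a one-step prediction and its error for given coefficients. It assumes NumPy; the helper name predict and the coefficient values are hypothetical, chosen only for illustration.

<syntaxhighlight lang="python">
import numpy as np

def predict(x, a, n):
    """One-step linear prediction x_hat(n) = sum_{i=1}^{p} a_i * x(n-i).

    Requires n >= p, so that p previous samples exist.
    """
    p = len(a)
    past = [x[n - i] for i in range(1, p + 1)]   # x(n-1), ..., x(n-p)
    return float(np.dot(a, past))

a = [1.5, -0.7]                    # hypothetical 2nd-order coefficients
x = [1.0, 1.2, 1.5, 1.9, 2.2]      # observed samples

n = 4
x_hat = predict(x, a, n)
e = x[n] - x_hat                   # prediction error e(n) = x(n) - x_hat(n)
print(x_hat, e)                    # approximately 1.8 and 0.4
</syntaxhighlight>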


Estimating the parameters

The most common choice in optimization of parameters a_i is the root mean square criterion, which is also called the autocorrelation criterion. In this method we minimize the expected value of the squared error E[e^2(n)], which yields the equation

:\sum_{i=1}^{p} a_i R(j-i) = R(j),

for 1 \leq j \leq p, where R is the autocorrelation of the signal x(n), defined as

:R(i) = E\{x(n)\,x(n-i)\},

and E is the expected value. In the multi-dimensional case this corresponds to minimizing the L_2 norm.

The above equations are called the normal equations or Yule–Walker equations. In matrix form the equations can be equivalently written as

:\mathbf{R}\mathbf{a} = \mathbf{r},

where the autocorrelation matrix \mathbf{R} is a symmetric p \times p Toeplitz matrix with elements r_{ij} = R(i-j), 0 \leq i, j < p, the vector \mathbf{r} is the autocorrelation vector r_j = R(j), 0 < j \leq p, and \mathbf{a} = [a_1, a_2, \cdots, a_{p-1}, a_p] is the parameter vector.

Another, more general, approach is to minimize the sum of squares of the errors defined in the form

:e(n) = x(n) - \widehat{x}(n) = x(n) - \sum_{i=1}^{p} a_i x(n-i) = -\sum_{i=0}^{p} a_i x(n-i),

where the optimisation problem searching over all a_i must now be constrained with a_0 = -1.

On the other hand, if the mean square prediction error is constrained to be unity and the prediction error equation is included on top of the normal equations, the augmented set of equations is obtained as

:\mathbf{R}\mathbf{a} = [1, 0, \ldots, 0]^{\mathrm{T}},

where the index i ranges from 0 to p and \mathbf{R} is a (p+1) \times (p+1) matrix.

Specification of the parameters of the linear predictor is a wide topic and a large number of other approaches have been proposed. In fact, the autocorrelation method is the most common and it is used, for example, for speech coding in the GSM standard.
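A minimal sketch of the autocorrelation method in Python, assuming NumPy and using a biased sample autocorrelation in place of the true expectation E; the function name lpc_autocorrelation and the test signal are illustrative assumptions, not a reference implementation.

<syntaxhighlight lang="python">
import numpy as np

def lpc_autocorrelation(x, p):
    """Estimate a_1..a_p by solving the Yule-Walker normal equations
    R a = r, built from the (biased) sample autocorrelation of x."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    # sample autocorrelation R(0..p), standing in for E{x(n) x(n-i)}
    R = np.array([np.dot(x[:N - k], x[k:]) / N for k in range(p + 1)])
    # symmetric Toeplitz matrix with elements R(i-j), right-hand side R(1..p)
    R_mat = np.array([[R[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R_mat, R[1:p + 1])

# Fit a 2nd-order predictor to a noisy sinusoid (illustrative signal)
rng = np.random.default_rng(0)
n = np.arange(1000)
x = np.sin(0.1 * n) + 0.01 * rng.standard_normal(n.size)
print(lpc_autocorrelation(x, p=2))
</syntaxhighlight>

For a pure sinusoid x(n) = \sin(\omega n), the exact second-order predictor is \widehat{x}(n) = 2\cos(\omega)\,x(n-1) - x(n-2), so with \omega = 0.1 the estimated coefficients should approach [1.990, -1] as the noise vanishes.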

Solution of the matrix equation \mathbf{R}\mathbf{a} = \mathbf{r} is computationally a relatively expensive process. Gaussian elimination for matrix inversion is probably the oldest solution, but this approach does not efficiently use the symmetry of \mathbf{R}. A faster algorithm is the Levinson recursion, proposed by Norman Levinson in 1947, which recursively calculates the solution. In particular, the autocorrelation equations above may be more efficiently solved by the Durbin algorithm.

In 1986, Philippe Delsarte and Y. V. Genin proposed an improvement to this algorithm called the split Levinson recursion, which requires about half the number of multiplications and divisions (Delsarte and Genin, 1986). It uses a special symmetrical property of parameter vectors on subsequent recursion levels: calculations for the optimal predictor containing p terms make use of similar calculations for the optimal predictor containing p-1 terms.
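
The following Python sketch shows the standard textbook form of the Levinson–Durbin recursion (not the split variant), taking the autocorrelation values R(0), \ldots, R(p) as input; the function name and interface are assumptions for illustration.

<syntaxhighlight lang="python">
import numpy as np

def levinson_durbin(R, p):
    """Solve the Yule-Walker equations in O(p^2) operations, given
    autocorrelations R[0..p]; returns coefficients a_1..a_p."""
    a = np.zeros(p)        # predictor coefficients, grown order by order
    E = R[0]               # prediction error power at order 0
    for m in range(1, p + 1):
        # reflection coefficient for order m
        k = (R[m] - np.dot(a[:m - 1], R[m - 1:0:-1])) / E
        a_prev = a[:m - 1].copy()
        a[m - 1] = k
        # order update: reuse the order-(m-1) solution, as in the text
        a[:m - 1] = a_prev - k * a_prev[::-1]
        E *= 1.0 - k * k   # error power shrinks at every order
    return a
</syntaxhighlight>

Applied to the sample autocorrelations R computed in the previous sketch, levinson_durbin(R, 2) agrees with the np.linalg.solve result while exploiting the Toeplitz symmetry of \mathbf{R}.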
Another way of identifying model parameters is to iteratively calculate state estimates using Kalman filters and to obtain maximum likelihood estimates within expectation–maximization algorithms.

For equally spaced values, a polynomial interpolation is a linear combination of the known values. If the discrete-time signal is estimated to obey a polynomial of degree p-1, then the predictor coefficients a_i are given by the corresponding row of the triangle of binomial transform coefficients. This estimate might be suitable for a slowly varying signal with low noise. The predictions for the first few values of p are

:\begin{array}{lcl}
p=1 & : & \widehat{x}(n) = 1x(n-1) \\
p=2 & : & \widehat{x}(n) = 2x(n-1) - 1x(n-2) \\
p=3 & : & \widehat{x}(n) = 3x(n-1) - 3x(n-2) + 1x(n-3) \\
p=4 & : & \widehat{x}(n) = 4x(n-1) - 6x(n-2) + 4x(n-3) - 1x(n-4)
\end{array}
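
These coefficients are rows of Pascal's triangle with alternating signs, a_i = (-1)^{i-1}\binom{p}{i}. The short Python check below (a sketch; the helper name binomial_predictor is illustrative) confirms that the p = 3 predictor reproduces a noiseless quadratic signal exactly:

<syntaxhighlight lang="python">
from math import comb

def binomial_predictor(p):
    """Coefficients a_1..a_p for a degree-(p-1) polynomial signal,
    e.g. p = 4 -> [4, -6, 4, -1]."""
    return [(-1) ** (i - 1) * comb(p, i) for i in range(1, p + 1)]

x = [3 * n * n - 2 * n + 1 for n in range(10)]   # quadratic signal
a = binomial_predictor(3)                        # [3, -3, 1]
n = 5
x_hat = sum(a[i - 1] * x[n - i] for i in range(1, 4))
print(x_hat == x[n])   # True: exact for a degree-2 polynomial
</syntaxhighlight>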


See also

* Autoregressive model
* Linear predictive analysis
* Minimum mean square error
* Prediction interval
* Rasta filtering


References

* Delsarte, P. and Genin, Y. V. (1986), "The split Levinson algorithm", ''IEEE Transactions on Acoustics, Speech, and Signal Processing'', v. ASSP-34(3), pp. 470–478.




External links


* PLP and RASTA (and MFCC, and inversion) in Matlab