HOME

TheInfoList



OR:

Recursive least squares (RLS) is an
adaptive filter An adaptive filter is a system with a linear filter that has a transfer function controlled by variable parameters and a means to adjust those parameters according to an optimization algorithm. Because of the complexity of the optimization algorit ...
algorithm that recursively finds the coefficients that minimize a weighted linear least squares cost function relating to the input signals. This approach is in contrast to other algorithms such as the
least mean squares Least mean squares (LMS) algorithms are a class of adaptive filter used to mimic a desired filter by finding the filter coefficients that relate to producing the least mean square of the error signal (difference between the desired and the actual ...
(LMS) that aim to reduce the mean square error. In the derivation of the RLS, the input signals are considered
deterministic Determinism is a philosophical view, where all events are determined completely by previously existing causes. Deterministic theories throughout the history of philosophy have developed from diverse and sometimes overlapping motives and consi ...
, while for the LMS and similar algorithms they are considered
stochastic Stochastic (, ) refers to the property of being well described by a random probability distribution. Although stochasticity and randomness are distinct in that the former refers to a modeling approach and the latter refers to phenomena themselv ...
. Compared to most of its competitors, the RLS exhibits extremely fast convergence. However, this benefit comes at the cost of high computational complexity.


Motivation

RLS was discovered by
Gauss Johann Carl Friedrich Gauss (; german: Gauß ; la, Carolus Fridericus Gauss; 30 April 177723 February 1855) was a German mathematician and physicist who made significant contributions to many fields in mathematics and science. Sometimes refer ...
but lay unused or ignored until 1950 when Plackett rediscovered the original work of Gauss from 1821. In general, the RLS can be used to solve any problem that can be solved by
adaptive filter An adaptive filter is a system with a linear filter that has a transfer function controlled by variable parameters and a means to adjust those parameters according to an optimization algorithm. Because of the complexity of the optimization algorit ...
s. For example, suppose that a signal d(n) is transmitted over an echoey,
noisy channel In information theory, the noisy-channel coding theorem (sometimes Shannon's theorem or Shannon's limit), establishes that for any given degree of noise contamination of a communication channel, it is possible to communicate discrete data (dig ...
that causes it to be received as :x(n)=\sum_^q b_n(k) d(n-k)+v(n) where v(n) represents
additive noise Additive white Gaussian noise (AWGN) is a basic noise model used in information theory to mimic the effect of many random processes that occur in nature. The modifiers denote specific characteristics: * ''Additive'' because it is added to any nois ...
. The intent of the RLS filter is to recover the desired signal d(n) by use of a p+1-tap FIR filter, \mathbf: :d(n) \approx \sum_^ w(k)x(n-k)=\mathbf^\mathit \mathbf_n where \mathbf_n = (n)\quad x(n-1)\quad\ldots\quad x(n-p)T is the
column vector In linear algebra, a column vector with m elements is an m \times 1 matrix consisting of a single column of m entries, for example, \boldsymbol = \begin x_1 \\ x_2 \\ \vdots \\ x_m \end. Similarly, a row vector is a 1 \times n matrix for some n, c ...
containing the p+1 most recent samples of x(n). The estimate of the recovered desired signal is :\hat(n) = \sum_^ w_n(k)x(n-k)=\mathbf_n^\mathit \mathbf_n The goal is to estimate the parameters of the filter \mathbf, and at each time n we refer to the current estimate as \mathbf_n and the adapted least-squares estimate by \mathbf_. \mathbf_n is also a column vector, as shown below, and the
transpose In linear algebra, the transpose of a matrix is an operator which flips a matrix over its diagonal; that is, it switches the row and column indices of the matrix by producing another matrix, often denoted by (among other notations). The tr ...
, \mathbf_n^\mathit, is a
row vector In linear algebra, a column vector with m elements is an m \times 1 matrix consisting of a single column of m entries, for example, \boldsymbol = \begin x_1 \\ x_2 \\ \vdots \\ x_m \end. Similarly, a row vector is a 1 \times n matrix for some n, c ...
. The
matrix product In mathematics, particularly in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the s ...
\mathbf_n^\mathit \mathbf_n (which is the
dot product In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an algebra ...
of \mathbf_n and \mathbf_n) is \hat(n), a scalar. The estimate is ''"good"'' if \hat(n)-d(n) is small in magnitude in some
least squares The method of least squares is a standard approach in regression analysis to approximate the solution of overdetermined systems (sets of equations in which there are more equations than unknowns) by minimizing the sum of the squares of the res ...
sense. As time evolves, it is desired to avoid completely redoing the least squares algorithm to find the new estimate for \mathbf_, in terms of \mathbf_n. The benefit of the RLS algorithm is that there is no need to invert matrices, thereby saving computational cost. Another advantage is that it provides intuition behind such results as the
Kalman filter For statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimat ...
.


Discussion

The idea behind RLS filters is to minimize a cost function C by appropriately selecting the filter coefficients \mathbf_n, updating the filter as new data arrives. The error signal e(n) and desired signal d(n) are defined in the
negative feedback Negative feedback (or balancing feedback) occurs when some function (Mathematics), function of the output of a system, process, or mechanism is feedback, fed back in a manner that tends to reduce the fluctuations in the output, whether caused by ...
diagram below: The error implicitly depends on the filter coefficients through the estimate \hat(n): :e(n)=d(n)-\hat(n) The weighted least squares error function C—the cost function we desire to minimize—being a function of e(n) is therefore also dependent on the filter coefficients: :C(\mathbf_n)=\sum_^n\lambda^e^2(i) where 0<\lambda\le 1 is the "forgetting factor" which gives exponentially less weight to older error samples. The cost function is minimized by taking the partial derivatives for all entries k of the coefficient vector \mathbf_ and setting the results to zero :\frac=\sum_^n 2\lambda^e(i)\cdot \frac=-\sum_^n 2\lambda^e(i)\,x(i-k)=0 \qquad k=0,1,\ldots,p Next, replace e(n) with the definition of the error signal :\sum_^n\lambda^\left (i)-\sum_^p w_n(\ell)x(i-\ell)\right(i-k)= 0\qquad k=0,1,\ldots,p Rearranging the equation yields :\sum_^p w_n(\ell)\left sum_^n \lambda^\,x(i-l)x(i-k)\right \sum_^n \lambda^d(i)x(i-k) \qquad k=0,1,\ldots,p This form can be expressed in terms of matrices :\mathbf_(n)\,\mathbf_=\mathbf_(n) where \mathbf_(n) is the weighted sample covariance matrix for x(n), and \mathbf_(n) is the equivalent estimate for the cross-covariance between d(n) and x(n). Based on this expression we find the coefficients which minimize the cost function as :\mathbf_=\mathbf_^(n)\,\mathbf_(n) This is the main result of the discussion.


Choosing λ

The smaller \lambda is, the smaller is the contribution of previous samples to the covariance matrix. This makes the filter ''more'' sensitive to recent samples, which means more fluctuations in the filter co-efficients. The \lambda=1 case is referred to as the ''growing window RLS algorithm''. In practice, \lambda is usually chosen between 0.98 and 1. By using type-II maximum likelihood estimation the optimal \lambda can be estimated from a set of data.


Recursive algorithm

The discussion resulted in a single equation to determine a coefficient vector which minimizes the cost function. In this section we want to derive a recursive solution of the form :\mathbf_=\mathbf_+\Delta\mathbf_ where \Delta\mathbf_ is a correction factor at time . We start the derivation of the recursive algorithm by expressing the cross covariance \mathbf_(n) in terms of \mathbf_(n-1) : where \mathbf(i) is the dimensional data vector :\mathbf(i)= (i), x(i-1), \dots , x(i-p) Similarly we express \mathbf_(n) in terms of \mathbf_(n-1) by : In order to generate the coefficient vector we are interested in the inverse of the deterministic auto-covariance matrix. For that task the
Woodbury matrix identity In mathematics (specifically linear algebra), the Woodbury matrix identity, named after Max A. Woodbury, says that the inverse of a rank-''k'' correction of some matrix can be computed by doing a rank-''k'' correction to the inverse of the origina ...
comes in handy. With : The Woodbury matrix identity follows : To come in line with the standard literature, we define : where the ''gain vector'' g(n) is : Before we move on, it is necessary to bring \mathbf(n) into another form : Subtracting the second term on the left side yields : With the recursive definition of \mathbf(n) the desired form follows :\mathbf(n)=\mathbf(n)\mathbf(n) Now we are ready to complete the recursion. As discussed : The second step follows from the recursive definition of \mathbf_(n ). Next we incorporate the recursive definition of \mathbf(n) together with the alternate form of \mathbf(n) and get : With \mathbf_=\mathbf(n-1)\mathbf_(n-1) we arrive at the update equation : where \alpha(n)=d(n)-\mathbf^(n)\mathbf_ is the ''
a priori ("from the earlier") and ("from the later") are Latin phrases used in philosophy to distinguish types of knowledge, justification, or argument by their reliance on empirical evidence or experience. knowledge is independent from current ...
'' error. Compare this with the '' a posteriori'' error; the error calculated ''after'' the filter is updated: :e(n)=d(n)-\mathbf^(n)\mathbf_n That means we found the correction factor :\Delta\mathbf_=\mathbf(n)\alpha(n) This intuitively satisfying result indicates that the correction factor is directly proportional to both the error and the gain vector, which controls how much sensitivity is desired, through the weighting factor, \lambda.


RLS algorithm summary

The RLS algorithm for a ''p''-th order RLS filter can be summarized as The recursion for P follows an algebraic Riccati equation and thus draws parallels to the
Kalman filter For statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, including statistical noise and other inaccuracies, and produces estimat ...
.


Lattice recursive least squares filter (LRLS)

The lattice recursive least squares
adaptive filter An adaptive filter is a system with a linear filter that has a transfer function controlled by variable parameters and a means to adjust those parameters according to an optimization algorithm. Because of the complexity of the optimization algorit ...
is related to the standard RLS except that it requires fewer arithmetic operations (order ''N''). It offers additional advantages over conventional LMS algorithms such as faster convergence rates, modular structure, and insensitivity to variations in eigenvalue spread of the input correlation matrix. The LRLS algorithm described is based on ''a posteriori'' errors and includes the normalized form. The derivation is similar to the standard RLS algorithm and is based on the definition of d(k)\,\!. In the forward prediction case, we have d(k) = x(k)\,\! with the input signal x(k-1)\,\! as the most up to date sample. The backward prediction case is d(k) = x(k-i-1)\,\!, where i is the index of the sample in the past we want to predict, and the input signal x(k)\,\! is the most recent sample.Albu, Kadlec, Softley, Matousek, Hermanek, Coleman, Faga
"Implementation of (Normalised) RLS Lattice on Virtex"
Digital Signal Processing, 2001, accessed December 24, 2011.


Parameter summary

:\kappa_f(k,i)\,\! is the forward reflection coefficient :\kappa_b(k,i)\,\! is the backward reflection coefficient :e_f(k,i)\,\! represents the instantaneous ''a posteriori'' forward prediction error :e_b(k,i)\,\! represents the instantaneous ''a posteriori'' backward prediction error :\xi^d_(k,i)\,\! is the minimum least-squares backward prediction error :\xi^d_(k,i)\,\! is the minimum least-squares forward prediction error :\gamma(k,i)\,\! is a conversion factor between ''a priori'' and ''a posteriori'' errors :v_i(k)\,\! are the feedforward multiplier coefficients. :\varepsilon\,\! is a small positive constant that can be 0.01


LRLS algorithm summary

The algorithm for a LRLS filter can be summarized as


Normalized lattice recursive least squares filter (NLRLS)

The normalized form of the LRLS has fewer recursions and variables. It can be calculated by applying a normalization to the internal variables of the algorithm which will keep their magnitude bounded by one. This is generally not used in real-time applications because of the number of division and square-root operations which comes with a high computational load.


NLRLS algorithm summary

The algorithm for a NLRLS filter can be summarized as


See also

*
Adaptive filter An adaptive filter is a system with a linear filter that has a transfer function controlled by variable parameters and a means to adjust those parameters according to an optimization algorithm. Because of the complexity of the optimization algorit ...
*
Kernel adaptive filter In signal processing, a kernel adaptive filter is a type of nonlinear adaptive filter. An adaptive filter is a filter that adapts its transfer function to changes in signal properties over time by minimizing an error or loss function that characte ...
*
Least mean squares filter Least mean squares (LMS) algorithms are a class of adaptive filter used to mimic a desired filter by finding the filter coefficients that relate to producing the least mean square of the error signal (difference between the desired and the actual ...
*
Zero-forcing equalizer The zero-forcing equalizer is a form of linear equalization algorithm used in communication systems which applies the inverse of the frequency response of the channel. This form of equalizer was first proposed by Robert Lucky. The zero-forcing e ...


References

* * Simon Haykin, ''Adaptive Filter Theory'', Prentice Hall, 2002, * M.H.A Davis, R.B. Vinter, ''Stochastic Modelling and Control'', Springer, 1985, * Weifeng Liu, Jose Principe and Simon Haykin, ''Kernel Adaptive Filtering: A Comprehensive Introduction'', John Wiley, 2010, * R.L.Plackett, ''Some Theorems in Least Squares'', Biometrika, 1950, 37, 149–157, * C.F.Gauss, ''Theoria combinationis observationum erroribus minimis obnoxiae'', 1821, Werke, 4. Gottinge


Notes

{{DEFAULTSORT:Recursive Least Squares Filter Digital signal processing Filter theory Statistical signal processing