Savitzky–Golay Filter

A Savitzky–Golay filter is a digital filter that can be applied to a set of digital data points for the purpose of smoothing the data, that is, to increase the precision of the data without distorting the signal tendency. This is achieved, in a process known as convolution, by fitting successive sub-sets of adjacent data points with a low-degree polynomial by the method of linear least squares. When the data points are equally spaced, an analytical solution to the least-squares equations can be found, in the form of a single set of "convolution coefficients" that can be applied to all data sub-sets, to give estimates of the smoothed signal (or derivatives of the smoothed signal) at the central point of each sub-set. The method, based on established mathematical procedures ("Graduation Formulae obtained by fitting a Polynomial"), was popularized by Abraham Savitzky and Marcel J. E. Golay, who published tables of convolution coefficients for various polynomials and sub-set sizes in 1964. Some errors in the tables have since been corrected, and the method has been extended to the treatment of 2- and 3-dimensional data. Savitzky and Golay's paper is one of the most widely cited papers in the journal ''Analytical Chemistry'' and is classed by that journal as one of its "10 seminal papers", with the comment that "it can be argued that the dawn of the computer-controlled analytical instrument can be traced to this article".


Applications

The data consist of a set of points (x_j, y_j),\; j = 1, ..., n, where x_j is an independent variable and y_j is an observed value. They are treated with a set of m convolution coefficients, C_i, according to the expression :Y_j= \sum_{i=-(m-1)/2}^{(m-1)/2} C_i\, y_{j+i},\qquad \frac{m+1}{2} \le j \le n-\frac{m-1}{2} Selected convolution coefficients are shown in the tables, below. For example, for smoothing by a 5-point quadratic polynomial, m = 5, i = -2, -1, 0, 1, 2 and the j^\text{th} smoothed data point, Y_j, is given by :Y_j = \frac{1}{35} (-3 y_{j-2} + 12 y_{j-1} + 17 y_j + 12 y_{j+1} - 3 y_{j+2}), where C_{-2} = -3/35, C_{-1} = 12/35, etc. There are numerous applications of smoothing, such as avoiding the propagation of noise through an algorithm chain, or sometimes simply making the data appear to be less noisy than it really is. The following are applications of numerical differentiation of data. (Note: when calculating the ''n''th derivative, an additional scaling factor of \frac{n!}{h^n} may be applied to all calculated data points to obtain absolute values; see the expressions for \frac{d^nY}{dx^n}, below, for details.) #Location of
maxima and minima
in experimental data curves. This was the application that first motivated Savitzky. The first derivative of a function is zero at a maximum or minimum. The diagram shows data points belonging to a synthetic Lorentzian curve, with added noise (blue diamonds). Data are plotted on a scale of half width, relative to the peak maximum at zero. The smoothed curve (red line) and 1st derivative (green) were calculated with 7-point cubic Savitzky–Golay filters.
Linear interpolation
of the first derivative values at positions on either side of the zero-crossing gives the position of the peak maximum. 3rd derivatives can also be used for this purpose. #Location of an end-point in a
titration curve
. An end-point is an
inflection point
where the second derivative of the function is zero. The titration curve for
malonic acid
illustrates the power of the method. The first end-point at 4 ml is barely visible, but the second derivative allows its value to be easily determined by linear interpolation to find the zero crossing. #Baseline flattening. In
analytical chemistry
it is sometimes necessary to measure the height of an
absorption band
against a curved baseline. Because the curvature of the baseline is much less than the curvature of the absorption band, the second derivative effectively flattens the baseline. Three measures of the derivative height, which is proportional to the absorption band height, are the "peak-to-valley" distances h1 and h2, and the height from baseline, h3. #Resolution enhancement in spectroscopy. Bands in the second derivative of a spectroscopic curve are narrower than the bands in the spectrum: they have reduced half-width. This allows partially overlapping bands to be "resolved" into separate (negative) peaks. The diagram illustrates how this may be used also for
chemical analysis
, using measurement of "peak-to-valley" distances. In this case the valleys are a property of the 2nd derivative of a Lorentzian. (''x''-axis position is relative to the position of the peak maximum on a scale of half width at half height). #Resolution enhancement with 4th derivative (positive peaks). The minima are a property of the 4th derivative of a Lorentzian.
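The 5-point quadratic smoothing rule above is easy to try directly. The following is a minimal NumPy sketch (the test signal and noise level are illustrative choices, not from the source); since the coefficient set is symmetric, plain convolution applies it exactly as written:

```python
import numpy as np

# 5-point quadratic/cubic smoothing coefficients from the text: (-3, 12, 17, 12, -3)/35
C = np.array([-3, 12, 17, 12, -3]) / 35.0

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 61)
clean = 1.0 / (1.0 + x**2)                      # Lorentzian-like peak (illustrative)
noisy = clean + 0.02 * rng.standard_normal(x.size)

# Y_j = sum_i C_i * y_{j+i}; mode='valid' drops the (m-1)/2 points at each end
smoothed = np.convolve(noisy, C, mode="valid")

# The smoothing coefficients sum to 1, so a constant signal passes through unchanged
assert np.isclose(C.sum(), 1.0)
assert np.allclose(np.convolve(np.ones(20), C, mode="valid"), 1.0)
```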


Moving average

The "moving average filter" is a trivial example of a Savitzky–Golay filter that is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles. Each subset of the data set is fitted with a straight horizontal line, as opposed to a higher-order polynomial. An unweighted moving average filter is the simplest convolution filter. The moving average is often used for a quick technical analysis of financial data, like stock prices, returns or trading volumes. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series. It was not included in some tables of Savitzky–Golay convolution coefficients as all the coefficient values are identical, with the value \frac{1}{m}.
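Assuming SciPy is available, its `savgol_coeffs` routine makes the point concrete: polynomial degree 0 (and, by symmetry, degree 1) reproduces the unweighted moving average, every coefficient equal to 1/''m'':

```python
import numpy as np
from scipy.signal import savgol_coeffs

m = 7
# Degree 0 fits a horizontal line; degree 1 gives the same smoothing
# coefficients on a symmetric window: both are the moving average 1/m.
assert np.allclose(savgol_coeffs(m, 0), np.full(m, 1.0 / m))
assert np.allclose(savgol_coeffs(m, 1), np.full(m, 1.0 / m))
```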


Derivation of convolution coefficients

When the data points are equally spaced, an
analytical solution
to the least-squares equations can be found. This solution forms the basis of the
convolution
method of numerical smoothing and differentiation. Suppose that the data consists of a set of ''n'' points (''x''''j'', ''y''''j'') (''j'' = 1, ..., ''n''), where ''x''''j'' is an independent variable and ''y''''j'' is a datum value. A polynomial will be fitted by
linear least squares
to a set of ''m'' (an odd number) adjacent data points, each separated by an interval ''h''. Firstly, a change of variable is made :z = \frac{x - \bar x}{h} where \bar x is the value of the central point. ''z'' takes the values \tfrac{1-m}{2},\cdots,0,\cdots,\tfrac{m-1}{2} (e.g. ''m'' = 5 → ''z'' = −2, −1, 0, 1, 2). (With even values of ''m'', ''z'' would run from 1 − ''m'' to ''m'' − 1 in steps of 2.) The polynomial, of degree ''k'', is defined as :Y = a_0 + a_1 z + a_2 z^2 + \cdots + a_k z^k. The simple
moving average
is a special case with ''k'' = 0, ''Y'' = ''a''0. In this case all convolution coefficients are equal to 1/''m''.
The coefficients ''a''0, ''a''1 etc. are obtained by solving the normal equations (bold a represents a
vector
, bold J represents a
matrix
). :\mathbf{a} = \left( \mathbf{J}^\mathsf{T} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} \mathbf{y}, where \mathbf J is a
Vandermonde matrix
, that is, the ''i''-th row of \mathbf J has values 1, z_i, z_i^2, \dots. For example, for a cubic polynomial fitted to 5 points, ''z'' = −2, −1, 0, 1, 2, the normal equations are solved as follows. : \mathbf{J} = \begin{pmatrix} 1 & -2 & 4 & -8 \\ 1 & -1 & 1 & -1 \\ 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \end{pmatrix} : \mathbf{J}^\mathsf{T}\mathbf{J} = \begin{pmatrix} m & \sum z & \sum z^2 & \sum z^3 \\ \sum z & \sum z^2 & \sum z^3 & \sum z^4 \\ \sum z^2 & \sum z^3 & \sum z^4 & \sum z^5 \\ \sum z^3 & \sum z^4 & \sum z^5 & \sum z^6 \end{pmatrix} = \begin{pmatrix} m & 0 & \sum z^2 & 0 \\ 0 & \sum z^2 & 0 & \sum z^4 \\ \sum z^2 & 0 & \sum z^4 & 0 \\ 0 & \sum z^4 & 0 & \sum z^6 \end{pmatrix} = \begin{pmatrix} 5 & 0 & 10 & 0 \\ 0 & 10 & 0 & 34 \\ 10 & 0 & 34 & 0 \\ 0 & 34 & 0 & 130 \end{pmatrix} Now, the normal equations can be factored into two separate sets of equations, by rearranging rows and columns, with : \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)_\text{even}=\begin{pmatrix} 5 & 10 \\ 10 & 34 \end{pmatrix} \quad \text{and} \quad \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)_\text{odd}=\begin{pmatrix} 10 & 34 \\ 34 & 130 \end{pmatrix} Expressions for the inverse of each of these matrices can be obtained using
Cramer's rule In linear algebra, Cramer's rule is an explicit formula for the solution of a system of linear equations with as many equations as unknowns, valid whenever the system has a unique solution. It expresses the solution in terms of the determinants of ...
: \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1}_\text{even}= \frac{1}{70}\begin{pmatrix} 34 & -10\\ -10 & 5 \end{pmatrix} \quad \text{and} \quad \left(\mathbf{J}^\mathsf{T}\mathbf{J}\right)^{-1}_\text{odd}= \frac{1}{144}\begin{pmatrix} 130 & -34\\ -34 & 10 \end{pmatrix} The normal equations become : \begin{pmatrix} a_0 \\ a_2 \end{pmatrix}_j= \frac{1}{70}\begin{pmatrix} 34 & -10\\ -10 & 5 \end{pmatrix} \begin{pmatrix} 1&1&1&1&1\\ 4&1&0&1&4 \end{pmatrix} \begin{pmatrix} y_{j-2} \\ y_{j-1} \\ y_j \\ y_{j+1} \\ y_{j+2} \end{pmatrix} and : \begin{pmatrix} a_1 \\ a_3 \end{pmatrix}_j= \frac{1}{144}\begin{pmatrix} 130 & -34\\ -34 & 10 \end{pmatrix} \begin{pmatrix} -2&-1&0&1&2\\ -8&-1&0&1&8 \end{pmatrix} \begin{pmatrix} y_{j-2} \\ y_{j-1} \\ y_j \\ y_{j+1} \\ y_{j+2} \end{pmatrix} Multiplying out and removing common factors, :\begin{align} a_{0,j}&=\tfrac{1}{35}(-3y_{j-2}+12y_{j-1}+17y_j+12y_{j+1}-3y_{j+2})\\ a_{1,j}&=\tfrac{1}{12}(y_{j-2} - 8y_{j-1} +8y_{j+1} -y_{j+2})\\ a_{2,j}&=\tfrac{1}{14}(2y_{j-2} - y_{j-1} -2y_j -y_{j+1} +2y_{j+2})\\ a_{3,j}&=\tfrac{1}{12}(-y_{j-2} +2y_{j-1} -2y_{j+1} +y_{j+2}) \end{align} The coefficients of ''y'' in these expressions are known as
convolution
coefficients. They are elements of the matrix : \mathbf{C} = \left( \mathbf{J}^\mathsf{T} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} In general, :(C \otimes y)_j = Y_j= \sum_{i=-(m-1)/2}^{(m-1)/2} C_i\, y_{j+i},\qquad \frac{m+1}{2} \le j \le n-\frac{m-1}{2} In matrix notation this example is written as : \begin{pmatrix} Y_3\\ Y_4\\ Y_5\\ \vdots \end{pmatrix} = \frac{1}{35}\begin{pmatrix} -3 & 12 & 17 & 12 & -3 & 0 & 0 & \cdots \\ 0 & -3 & 12 & 17 & 12 & -3 & 0 & \cdots \\ 0 & 0 & -3 & 12 & 17 & 12 & -3 & \cdots\\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \begin{pmatrix} y_1\\ y_2\\ y_3\\ y_4\\ y_5\\ y_6\\ y_7\\ \vdots \end{pmatrix} Tables of convolution coefficients, calculated in the same way for ''m'' up to 25, were published for the Savitzky–Golay smoothing filter in 1964. The value of the central point, ''z'' = 0, is obtained from a single set of coefficients: ''a''0 for smoothing, ''a''1 for the 1st derivative, etc. The numerical derivatives are obtained by differentiating ''Y''. This means that the ''derivatives are calculated for the smoothed data curve''. For a cubic polynomial :\begin{align} Y &= a_0 + a_1 z + a_2 z^2 + a_3 z^3 = a_0 &&\text{at } z = 0, x=\bar x\\ \frac{dY}{dx} &= \frac{1}{h}\left( a_1 + 2a_2 z + 3a_3 z^2 \right) = \frac{1}{h}a_1 &&\text{at } z = 0, x=\bar x\\ \frac{d^2Y}{dx^2} &= \frac{1}{h^2}\left( 2a_2 + 6a_3 z \right) = \frac{2}{h^2}a_2 &&\text{at } z = 0, x=\bar x\\ \frac{d^3Y}{dx^3} &= \frac{6}{h^3}a_3 \end{align} In general, polynomials of degree (0 and 1), (2 and 3), (4 and 5), etc. give the same coefficients for smoothing and even derivatives. (Smoothing using the moving average is equivalent, with equally spaced points, to local fitting with a (sloping) straight line.) Polynomials of degree (1 and 2), (3 and 4), etc. give the same coefficients for odd derivatives.
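The worked example can be reproduced in a few lines of NumPy (a sketch, not part of the original tables): build the Vandermonde matrix for ''z'' = −2, ..., 2, form C = (JᵀJ)⁻¹Jᵀ, and check its rows against the coefficients derived above.

```python
import numpy as np

# Cubic polynomial fitted to 5 points, z = -2 ... 2
z = np.arange(-2, 3)
J = np.vander(z, 4, increasing=True)       # rows: 1, z, z^2, z^3

# C = (J^T J)^{-1} J^T ; row k gives the convolution coefficients for a_k
C = np.linalg.inv(J.T @ J) @ J.T

assert np.allclose(C[0], np.array([-3, 12, 17, 12, -3]) / 35)   # smoothing, a_0
assert np.allclose(C[1], np.array([1, -8, 0, 8, -1]) / 12)      # a_1
assert np.allclose(C[2], np.array([2, -1, -2, -1, 2]) / 14)     # a_2
assert np.allclose(C[3], np.array([-1, 2, 0, -2, 1]) / 12)      # a_3
```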


Algebraic expressions

It is not always necessary to use the Savitzky–Golay tables. The summations in the matrix JTJ can be evaluated in closed form, :\begin{align} \sum_{i=-(m-1)/2}^{(m-1)/2} z_i^2 &= \frac{m(m^2-1)}{12} \\ \sum z_i^4 &= \frac{m(m^2-1)(3m^2-7)}{240} \\ \sum z_i^6 &= \frac{m(m^2-1)(3m^4-18m^2+31)}{1344} \end{align} so that algebraic formulae can be derived for the convolution coefficients. (The expressions given here are different from those of Madden, which are given in terms of the variable m' = (m − 1)/2.) Functions that are suitable for use with a curve that has an inflection point are: :Smoothing, polynomial degree 2,3: C_{0i} = \frac{(3m^2-7-20i^2)/4}{m(m^2-4)/3}; \quad \frac{1-m}{2} \le i \le \frac{m-1}{2} (the range of values for ''i'' also applies to the expressions below) :1st derivative, polynomial degree 3,4: C_{1i} = \frac{5(3m^4-18m^2+31)i - 28(3m^2-7)i^3}{m(m^2-1)(3m^4-39m^2+108)/15} :2nd derivative, polynomial degree 2,3: C_{2i} = \frac{12i^2 - m^2 + 1}{m(m^2-1)(m^2-4)/30} :3rd derivative, polynomial degree 3,4: C_{3i} = \frac{i^3 - (3m^2-7)i/20}{m(m^2-1)(m^2-4)(m^2-9)/16800} Simpler expressions that can be used with curves that don't have an inflection point are: :Smoothing, polynomial degree 0,1 (moving average): C_{0i} = \frac{1}{m} :1st derivative, polynomial degree 1,2: C_{1i} = \frac{i}{m(m^2-1)/12} Higher derivatives can be obtained. For example, a fourth derivative can be obtained by performing two passes of a second derivative function.
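The closed-form expressions are easy to spot-check numerically. The sketch below (an illustration, not from the source) implements the smoothing and 1st-derivative formulae and confirms that, for ''m'' = 5, they reproduce the coefficient sets derived in the previous section:

```python
import numpy as np

def smooth_coeff(m, i):
    # Smoothing, polynomial degree 2 or 3 (closed form)
    return ((3 * m**2 - 7 - 20 * i**2) / 4.0) / (m * (m**2 - 4) / 3.0)

def deriv1_coeff(m, i):
    # 1st derivative, polynomial degree 3 or 4 (closed form)
    num = 5 * (3 * m**4 - 18 * m**2 + 31) * i - 28 * (3 * m**2 - 7) * i**3
    den = m * (m**2 - 1) * (3 * m**4 - 39 * m**2 + 108) / 15.0
    return num / den

m = 5
i = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)
assert np.allclose(smooth_coeff(m, i), np.array([-3, 12, 17, 12, -3]) / 35)
assert np.allclose(deriv1_coeff(m, i), np.array([1, -8, 0, 8, -1]) / 12)
```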


Use of orthogonal polynomials

An alternative to fitting ''m'' data points by a simple polynomial in the subsidiary variable, ''z'', is to use
orthogonal polynomials
. :Y = b_0 P^0(z) + b_1 P^1(z) + \cdots + b_k P^k(z), where ''P''0, ..., ''P''''k'' is a set of mutually orthogonal polynomials of degree 0, ..., ''k''. Full details on how to obtain expressions for the orthogonal polynomials and the relationship between the coefficients ''b'' and ''a'' are given by Guest. Expressions for the convolution coefficients are easily obtained because the normal equations matrix, ''J''T''J'', is a
diagonal matrix
since the off-diagonal elements are sums, over the data points, of products of two different orthogonal polynomials, which vanish by virtue of their mutual orthogonality. Therefore, each non-zero element of its inverse is simply the reciprocal of the corresponding element in the normal equations matrix. The calculation is further simplified by using
recursion
to build orthogonal Gram polynomials. The whole calculation can be coded in a few lines of PASCAL, a computer language well-adapted for calculations involving recursion.
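The benefit of an orthogonal basis can be sketched without writing out the Gram polynomials explicitly: QR factorization of the Vandermonde matrix (used here as a stand-in for the explicit recursion) produces orthonormal basis columns, so the normal-equations matrix becomes the identity and no inversion is needed. This is an illustrative NumPy sketch, not the PASCAL code mentioned in the text:

```python
import numpy as np

z = np.arange(-3, 4)                       # m = 7 points
J = np.vander(z, 4, increasing=True)       # monomial basis, degree 3
Q, R = np.linalg.qr(J)                     # columns of Q are orthonormal

# With an orthonormal basis the normal-equations matrix is the identity,
# so the least-squares fit is simply the projection Q (Q^T y).
assert np.allclose(Q.T @ Q, np.eye(4))

# The resulting fit is identical to the monomial-basis least-squares fit
y = np.random.default_rng(1).standard_normal(7)
fit_monomial = J @ np.linalg.lstsq(J, y, rcond=None)[0]
assert np.allclose(Q @ (Q.T @ y), fit_monomial)
```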


Treatment of first and last points

Savitzky–Golay filters are most commonly used to obtain the smoothed or derivative value at the central point, ''z'' = 0, using a single set of convolution coefficients. (''m'' − 1)/2 points at the start and end of the series cannot be calculated using this process. Various strategies can be employed to avoid this inconvenience. * The data could be artificially extended by adding, in reverse order, copies of the first (''m'' − 1)/2 points at the beginning and copies of the last (''m'' − 1)/2 points at the end. For instance, with ''m'' = 5, two points are added at the start and end of the data ''y''1, ..., ''y''''n''. :''y''3, ''y''2, ''y''1, ..., ''y''''n'', ''y''''n''−1, ''y''''n''−2. * Looking again at the fitting polynomial, it is obvious that data can be calculated for all values of ''z'' by using all sets of convolution coefficients for a single polynomial, ''a''0, ..., ''a''''k''. :For a cubic polynomial :\begin{align} Y&=a_0+a_1z + a_2 z^2 + a_3 z^3\\ \frac{dY}{dx} &= \frac{1}{h}(a_1 + 2a_2 z + 3a_3 z^2) \\ \frac{d^2Y}{dx^2} &= \frac{1}{h^2}(2a_2 + 6a_3 z) \\ \frac{d^3Y}{dx^3} &= \frac{6}{h^3}a_3 \end{align} * Convolution coefficients for the missing first and last points can also be easily obtained. This is equivalent to fitting the first (''m'' + 1)/2 points with the same polynomial, and similarly for the last points.
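Assuming SciPy is available, its `savgol_filter` exposes two of these strategies as `mode` options: `mode='mirror'` reflects the series at each end (the artificial-extension strategy), while the default `mode='interp'` fits a polynomial to the edge points (the last strategy). Away from the edges both reduce to the same central convolution:

```python
import numpy as np
from scipy.signal import savgol_filter

y = np.sin(np.linspace(0, 2 * np.pi, 50))   # illustrative test signal

s_mirror = savgol_filter(y, window_length=7, polyorder=3, mode="mirror")
s_interp = savgol_filter(y, window_length=7, polyorder=3, mode="interp")

# Both strategies return a full-length result ...
assert s_mirror.shape == y.shape
# ... and agree in the interior, where the ordinary central coefficients apply
assert np.allclose(s_mirror[10:40], s_interp[10:40])
```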


Weighting the data

It is implicit in the above treatment that the data points are all given equal weight. Technically, the
objective function
:U = \sum_i w_i(Y_i-y_i)^2 being minimized in the least-squares process has unit weights, ''w''''i'' = 1. When the weights are not all the same the normal equations become :\mathbf{a} = \left( \mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{y},\qquad W_{ii} \ne 1. If the same set of diagonal weights is used for all data subsets, \mathbf W = \text{diag}(w_1,w_2,...,w_m), an analytical solution to the normal equations can be written down. For example, with a quadratic polynomial, : \mathbf{J}^\mathsf{T}\mathbf{W}\mathbf{J} = \begin{pmatrix} \sum w_i & \sum w_iz_i & \sum w_iz_i^2 \\ \sum w_iz_i & \sum w_iz_i^2 & \sum w_iz_i^3 \\ \sum w_iz_i^2 & \sum w_iz_i^3 & \sum w_iz_i^4 \end{pmatrix} An explicit expression for the inverse of this matrix can be obtained using
Cramer's rule
. A set of convolution coefficients may then be derived as :\mathbf C = \left( \mathbf{J}^\mathsf{T} \mathbf{W} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} \mathbf{W}. Alternatively the coefficients, C, could be calculated in a spreadsheet, employing a built-in matrix inversion routine to obtain the inverse of the normal equations matrix. This set of coefficients, once calculated and stored, can be used with all calculations in which the same weighting scheme applies. A different set of coefficients is needed for each different weighting scheme. It has been shown that the Savitzky–Golay filter can be improved by introducing weights that decrease at the ends of the fitting interval.
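A minimal sketch of the weighted coefficients follows; the Gaussian weight profile is an illustrative choice (the source does not prescribe one), but any fixed diagonal W can be substituted. Note that the smoothing row still reproduces a constant signal exactly, whatever the weights:

```python
import numpy as np

# Weighted Savitzky-Golay coefficients, C = (J^T W J)^{-1} J^T W
m, degree = 7, 2
z = np.arange(-(m - 1) // 2, (m - 1) // 2 + 1)
J = np.vander(z, degree + 1, increasing=True)
W = np.diag(np.exp(-0.5 * (z / 2.0) ** 2))   # hypothetical weight choice

C = np.linalg.inv(J.T @ W @ J) @ J.T @ W

# The weighted fit still reproduces polynomials up to the fitted degree,
# so the smoothing row applied to a constant signal returns that constant.
assert np.isclose(C[0] @ np.ones(m), 1.0)

# Unit weights recover the ordinary (unweighted) coefficients
C_unit = np.linalg.inv(J.T @ J) @ J.T
assert np.allclose(np.linalg.inv(J.T @ np.eye(m) @ J) @ J.T @ np.eye(m), C_unit)
```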


Two-dimensional convolution coefficients

Two-dimensional smoothing and differentiation can also be applied to tables of data values, such as intensity values in a photographic image which is composed of a rectangular grid of pixels. Such a grid is referred to as a kernel, and the data points that constitute the kernel are referred to as nodes. The trick is to transform the rectangular kernel into a single row by a simple ordering of the indices of the nodes. Whereas the one-dimensional filter coefficients are found by fitting a polynomial in the subsidiary variable ''z'' to a set of ''m'' data points, the two-dimensional coefficients are found by fitting a polynomial in subsidiary variables ''v'' and ''w'' to a set of the values at the ''m'' × ''n'' kernel nodes. The following example, for a bivariate polynomial of total degree 3, ''m'' = 7, and ''n'' = 5, illustrates the process, which parallels the process for the one-dimensional case, above. :v = \frac{x - \bar x}{\Delta x}; \quad w = \frac{y - \bar y}{\Delta y} :Y = a_{00} + a_{10}v + a_{01}w + a_{20}v^2 + a_{11}vw + a_{02}w^2 + a_{30}v^3 + a_{21}v^2w + a_{12}vw^2 + a_{03}w^3 The rectangular kernel of 35 data values, ''d''1, ..., ''d''35, becomes a vector when the rows are placed one after another. :d = (''d''1 ... ''d''35)T The Jacobian has 10 columns, one for each of the parameters ''a''00 − ''a''03, and 35 rows, one for each pair of ''v'' and ''w'' values. Each row has the form :J_\text{row} = 1\quad v\quad w\quad v^2\quad vw\quad w^2\quad v^3\quad v^2w\quad vw^2\quad w^3 The convolution coefficients are calculated as :\mathbf{C} = \left( \mathbf{J}^\mathsf{T} \mathbf{J} \right)^{-1} \mathbf{J}^\mathsf{T} The first row of C contains 35 convolution coefficients, which can be multiplied with the 35 data values, respectively, to obtain the polynomial coefficient a_{00}, which is the smoothed value at the central node of the kernel (i.e. at the 18th node). Similarly, other rows of C can be multiplied with the 35 values to obtain other polynomial coefficients, which, in turn, can be used to obtain smoothed values and different smoothed partial derivatives at different nodes.
Nikitas and Pappa-Louisi showed that, depending on the format of the polynomial used, the quality of smoothing may vary significantly. They recommend using a polynomial of the form : Y = \sum_{i=0}^p \sum_{j=0}^q a_{ij} v^i w^j because such polynomials can achieve good smoothing both in the central and in the near-boundary regions of a kernel, and therefore they can be confidently used in smoothing both at the internal and at the near-boundary data points of a sampled domain. In order to avoid ill-conditioning when solving the least-squares problem, ''p'' < ''m'' and ''q'' < ''n''. For software that calculates the two-dimensional coefficients and for a database of such C's, see the section on multi-dimensional convolution coefficients, below.
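The 7 × 5 worked example can be reproduced with a short NumPy sketch (an illustration of the construction, not software from the source): flatten the kernel nodes into a single row, build one Jacobian column per polynomial term of total degree ≤ 3, and form C as before.

```python
import numpy as np

v = np.arange(-3, 4)                              # m = 7 nodes
w = np.arange(-2, 3)                              # n = 5 nodes
nodes = [(vi, wi) for vi in v for wi in w]        # rows placed one after another
assert nodes[17] == (0, 0)                        # the 18th node is the centre

# One Jacobian column per term a_00 ... a_03 (10 terms of total degree <= 3)
terms = [(p, q) for p in range(4) for q in range(4) if p + q <= 3]
J = np.array([[vi**p * wi**q for (p, q) in terms] for (vi, wi) in nodes])
assert J.shape == (35, 10)

C = np.linalg.inv(J.T @ J) @ J.T   # row 0: smoothing coefficients for a_00

# Smoothing coefficients sum to 1, so a constant surface is reproduced exactly
assert np.isclose(C[0].sum(), 1.0)
```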


Multi-dimensional convolution coefficients

The idea of two-dimensional convolution coefficients can be extended to higher spatial dimensions as well, in a straightforward manner, by arranging the multidimensional distribution of kernel nodes in a single row. Following the aforementioned finding of Nikitas and Pappa-Louisi in two-dimensional cases, usage of the following form of polynomial is recommended in multidimensional cases: : Y = \sum_{i_1=0}^{n_1} \sum_{i_2=0}^{n_2} \cdots \sum_{i_D=0}^{n_D} a_{i_1 i_2 \cdots i_D}\, u_1^{i_1} u_2^{i_2} \cdots u_D^{i_D} where ''D'' is the dimension of the space, the ''a'''s are the polynomial coefficients, and the ''u'''s are the coordinates in the different spatial directions. Algebraic expressions for partial derivatives of any order, be they mixed or otherwise, can be easily derived from the above expression. Note that C depends on the manner in which the kernel nodes are arranged in a row and on the manner in which the different terms of the expanded form of the above polynomial are arranged when preparing the Jacobian. Accurate computation of C in multidimensional cases becomes challenging, as the precision of the standard floating-point numbers available in computer programming languages is no longer sufficient. Insufficient precision causes the floating-point truncation errors to become comparable to the magnitudes of some elements of C, which, in turn, severely degrades its accuracy and renders it useless.
Chandra Shekhar
has brought forth two open-source software packages, Advanced Convolution Coefficient Calculator (ACCC) and Precise Convolution Coefficient Calculator (PCCC), which handle these accuracy issues adequately. ACCC performs the computation using floating-point numbers, in an iterative manner. The precision of the floating-point numbers is gradually increased in each iteration, by using
GNU MPFR
. Once the C's obtained in two consecutive iterations agree to a pre-specified number of significant digits, convergence is assumed to have been reached. If that number is sufficiently large, the computation yields a highly accurate C. PCCC employs rational-number calculations, using the
GNU Multiple Precision Arithmetic Library
, and yields a fully accurate C, in the
rational number
format. In the end, these rational numbers are converted into floating-point numbers with a pre-specified number of significant digits.
A database of C's
that are calculated by using ACCC, for symmetric kernels and both symmetric and asymmetric polynomials, on unity-spaced kernel nodes, in 1-, 2-, 3-, and 4-dimensional spaces, has been made available. Chandra Shekhar has also laid out a mathematical framework that describes the usage of C calculated on unity-spaced kernel nodes to perform filtering and partial differentiation (of various orders) on non-uniformly spaced kernel nodes, allowing usage of the C provided in the aforementioned database. Although this method yields approximate results only, they are acceptable in most engineering applications, provided that the non-uniformity of the kernel nodes is weak.


Some properties of convolution

#The sum of the convolution coefficients for smoothing is equal to one. The sum of coefficients for odd derivatives is zero.
#The sum of squared convolution coefficients for smoothing is equal to the value of the central coefficient.
#Smoothing of a function leaves the area under the function unchanged.
#Convolution of a symmetric function with even-derivative coefficients conserves the centre of symmetry.
#Properties of derivative filters.
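Assuming SciPy is available, the first two properties are easy to verify numerically with `savgol_coeffs`:

```python
import numpy as np
from scipy.signal import savgol_coeffs

m, k = 9, 3
c0 = savgol_coeffs(m, k, deriv=0)
c1 = savgol_coeffs(m, k, deriv=1)

# Property 1: smoothing coefficients sum to one; odd-derivative ones sum to zero
assert np.isclose(c0.sum(), 1.0)
assert np.isclose(c1.sum(), 0.0)

# Property 2: the sum of squared smoothing coefficients equals the central one
assert np.isclose((c0**2).sum(), c0[m // 2])
```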


Signal distortion and noise reduction

It is inevitable that the signal will be distorted in the convolution process. From property 3 above, when data which has a peak is smoothed the peak height will be reduced and the half-width will be increased. Both the extent of the distortion and S/N (
signal-to-noise ratio
) improvement: *decrease as the degree of the polynomial increases *increase as the width, ''m'', of the convolution function increases For example, if the noise in all data points is uncorrelated and has a constant
standard deviation
, ''σ'', the standard deviation of the noise will be decreased by convolution with an ''m''-point smoothing function to (the expressions under the square-root sign are the same as the expression for the convolution coefficient with ''z'' = 0): :polynomial degree 0 or 1: \sqrt{\frac{1}{m}}\, \sigma (
moving average
) :polynomial degree 2 or 3: \sqrt{\frac{3(3m^2-7)}{4m(m^2-4)}}\, \sigma. These functions are shown in the plot at the right. For example, with a 9-point linear function (moving average) about two thirds of the noise is removed, and with a 9-point quadratic/cubic smoothing function only about half the noise is removed. Most of the remaining noise is low-frequency noise (see ''Frequency characteristics of convolution filters'', below). Although the moving average function gives better noise reduction, it is unsuitable for smoothing data which has curvature over ''m'' points. A quadratic filter function is unsuitable for getting a derivative of a data curve with an
inflection point In differential calculus and differential geometry, an inflection point, point of inflection, flex, or inflection (rarely inflexion) is a point on a smooth plane curve at which the curvature changes sign. In particular, in the case of the graph ...
because a quadratic polynomial does not have one. The optimal choice of polynomial order and number of convolution coefficients will be a compromise between noise reduction and distortion.
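The noise-reduction factors above can be checked numerically: the factor is the root of the sum of squared convolution coefficients. A minimal sketch using SciPy (`scipy.signal.savgol_coeffs` generates the smoothing coefficients):

```python
import numpy as np
from scipy.signal import savgol_coeffs

m = 9  # window width
c_ma = np.full(m, 1.0 / m)             # moving average (degree 0 or 1)
c_sg = savgol_coeffs(m, polyorder=2)   # quadratic/cubic smoothing coefficients

# noise standard deviation is multiplied by sqrt(sum of squared coefficients)
f_ma = np.sqrt(np.sum(c_ma**2))
f_sg = np.sqrt(np.sum(c_sg**2))
print(f_ma)  # 1/3: two thirds of the noise removed
print(f_sg)  # ~0.505: about half the noise removed
```

Both values agree with the closed-form expressions given above for ''m'' = 9.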


Multipass filters

One way to mitigate distortion and improve noise removal is to use a filter of smaller width and perform more than one convolution with it. For two passes of the same filter this is equivalent to one pass of a filter obtained by convolution of the original filter with itself. For example, two passes of the filter with coefficients (1/3, 1/3, 1/3) are equivalent to one pass of the filter with coefficients (1/9, 2/9, 3/9, 2/9, 1/9). The disadvantage of multipassing is that the equivalent filter width for ''n'' passes of an ''m''-point function is ''n''(''m'' − 1) + 1, so multipassing is subject to greater end-effects. Nevertheless, multipassing has been used to great advantage: for instance, some 40–80 passes on data with a signal-to-noise ratio of only 5 gave useful results. The noise-reduction formulae given above do not apply because correlation between calculated data points increases with each pass.
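The two-pass equivalence can be verified directly, since one pass of the self-convolved filter gives the same result. A small sketch (NumPy's `convolve` performs the discrete convolution):

```python
import numpy as np

f3 = np.array([1.0, 1.0, 1.0]) / 3    # 3-point moving average
f5 = np.convolve(f3, f3)              # filter equivalent to two passes
print(f5 * 9)                         # [1. 2. 3. 2. 1.] i.e. (1/9, 2/9, 3/9, 2/9, 1/9)

# equivalent width for n passes of an m-point filter: n(m - 1) + 1
n, m = 2, 3
print(len(f5) == n * (m - 1) + 1)     # True
```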


Frequency characteristics of convolution filters

Convolution maps to multiplication in the Fourier co-domain. The discrete Fourier transform of a convolution filter is a real-valued function which can be represented as
:FT(\theta)= \sum_{j=-\frac{m-1}{2}}^{\frac{m-1}{2}} C_j \cos(j\theta)
''θ'' runs from 0 to 180 degrees, after which the function merely repeats itself. The plot for a 9-point quadratic/cubic smoothing function is typical: at very low angles the plot is almost flat, meaning that low-frequency components of the data will be virtually unchanged by the smoothing operation. As the angle increases the value decreases, so that higher-frequency components are more and more attenuated. This shows that the convolution filter can be described as a low-pass filter: the noise that is removed is primarily high-frequency noise, while low-frequency noise passes through the filter. Some high-frequency noise components are attenuated more than others, as shown by undulations in the Fourier transform at large angles. This can give rise to small oscillations in the smoothed data and to phase reversal, i.e., high-frequency oscillations in the data get inverted by Savitzky–Golay filtering.
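The transfer function above is straightforward to evaluate numerically. A sketch for the 9-point quadratic/cubic smoothing filter (the coefficients ''C''<sub>''j''</sub> are generated here with SciPy, an implementation choice rather than part of the formula):

```python
import numpy as np
from scipy.signal import savgol_coeffs

m = 9
c = savgol_coeffs(m, polyorder=2)       # symmetric coefficients C_{-4} ... C_4
j = np.arange(-(m // 2), m // 2 + 1)
theta = np.linspace(0.0, np.pi, 181)    # 0 to 180 degrees
FT = np.array([np.sum(c * np.cos(j * t)) for t in theta])

print(FT[0])          # 1.0: low frequencies pass through unchanged
print(FT[-1] < 0.3)   # True: high frequencies are strongly attenuated
```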


Convolution and correlation

Convolution affects the correlation between errors in the data. The effect of convolution can be expressed as a linear transformation.
:Y_j = \sum_{i=-\frac{m-1}{2}}^{\frac{m-1}{2}} C_i\, y_{j+i}
By the law of error propagation, the variance–covariance matrix of the data, A, will be transformed into B according to
:\mathbf B=\mathbf C \mathbf A\mathbf C^\mathrm{T}
To see how this applies in practice, consider the effect of a 3-point moving average on the first three calculated points, assuming that the data points have equal variance and that there is no correlation between them. A will be an identity matrix multiplied by a constant, ''σ''2, the variance at each point.
: \mathbf B = \frac{\sigma^2}{9}\begin{pmatrix} 1 & 1 & 1 & 0 & 0\\ 0 & 1 & 1 & 1 & 0\\ 0 & 0 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = \frac{\sigma^2}{9}\begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 2 \\ 1 & 2 & 3 \end{pmatrix}
In this case the correlation coefficients,
:\rho_{ij}=\frac{B_{ij}}{\sqrt{B_{ii}B_{jj}}},\ (i\ne j)
between calculated points ''i'' and ''j'' will be
:\rho_{12}= \frac{2}{3} \approx 0.66, \quad \rho_{13}= \frac{1}{3} \approx 0.33
In general, the calculated values are correlated even when the observed values are not correlated. The correlation extends over ''m'' − 1 calculated points at a time.
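The matrix calculation above can be reproduced directly. A minimal sketch with NumPy:

```python
import numpy as np

sigma2 = 1.0                                  # variance of each raw data point
C = np.array([[1, 1, 1, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 1, 1, 1]]) / 3.0         # 3-point moving average, 3 output points
A = sigma2 * np.eye(5)                        # uncorrelated data: A = sigma^2 * I
B = C @ A @ C.T                               # variance-covariance of smoothed points

rho_12 = B[0, 1] / np.sqrt(B[0, 0] * B[1, 1])
rho_13 = B[0, 2] / np.sqrt(B[0, 0] * B[2, 2])
print(rho_12, rho_13)                         # 2/3 and 1/3
```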


Multipass filters

To illustrate the effect of multipassing on the noise and correlation of a set of data, consider the effects of a second pass of a 3-point moving average filter. The second pass is equivalent to one pass of the 5-point filter with coefficients (1/9, 2/9, 3/9, 2/9, 1/9); applying that filter to uncorrelated data of variance ''σ''2 (an identity variance–covariance matrix multiplied by ''σ''2) gives the variance–covariance matrix of the smoothed points
: \mathbf B = \frac{\sigma^2}{81}\begin{pmatrix} 19 & 16 & 10 & 4 & 1 \\ 16 & 19 & 16 & 10 & 4 \\ 10 & 16 & 19 & 16 & 10 \\ 4 & 10 & 16 & 19 & 16 \\ 1 & 4 & 10 & 16 & 19 \end{pmatrix}
After two passes, the standard deviation of the central point has decreased to \sqrt{\frac{19}{81}}\,\sigma \approx 0.48\sigma, compared to 0.58''σ'' for one pass. The noise reduction is a little less than would be obtained with one pass of a 5-point moving average which, under the same conditions, would result in the smoothed points having the smaller standard deviation of 0.45''σ''. Correlation now extends over a span of 4 sequential points with correlation coefficients
:\rho_{j,j+1}= \frac{16}{19} \approx 0.84, \quad \rho_{j,j+2}= \frac{10}{19} \approx 0.53, \quad \rho_{j,j+3}= \frac{4}{19} \approx 0.21, \quad \rho_{j,j+4}= \frac{1}{19} \approx 0.05
The advantage obtained by performing two passes with the narrower smoothing function is that it introduces less distortion into the calculated data.
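Both the noise reduction and the induced correlations after two passes can be checked via the equivalent 5-point filter, whose autocorrelation gives the covariances. A sketch:

```python
import numpy as np

c = np.array([1, 2, 3, 2, 1]) / 9.0      # filter equivalent to two 3-point passes

# noise standard-deviation factor at the central point
print(np.sqrt(np.sum(c**2)))             # sqrt(19)/9 ~ 0.48

# correlations between smoothed points k steps apart (autocorrelation of c)
acf = np.correlate(c, c, mode='full')
rho = acf[len(c) - 1:] / acf[len(c) - 1]
print(np.round(rho, 2))                  # [1.   0.84 0.53 0.21 0.05]
```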


Comparison with other filters and alternatives

Compared with other smoothing filters, e.g. convolution with a Gaussian or multi-pass moving-average filtering, Savitzky–Golay filters have an initially flatter response and a sharper cutoff in the frequency domain, especially for high orders of the fit polynomial (see ''Frequency characteristics of convolution filters'', above). For data with limited signal bandwidth, this means that Savitzky–Golay filtering can provide a better signal-to-noise ratio than many other filters; e.g., peak heights of spectra are better preserved than with other filters of similar noise suppression. Disadvantages of Savitzky–Golay filters are comparably poor suppression of some high frequencies (poor stopband suppression) and artifacts when using polynomial fits for the first and last points. Alternative smoothing methods that share the advantages of Savitzky–Golay filters and mitigate at least some of their disadvantages are Savitzky–Golay filters with properly chosen alternative fitting weights, Whittaker–Henderson smoothing and the Hodrick–Prescott filter (equivalent methods closely related to smoothing splines), and convolution with a windowed sinc function.
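The flatter-passband claim can be illustrated numerically. A sketch comparing the frequency response of a 9-point degree-4 Savitzky–Golay filter with a 9-point moving average (`scipy.signal.freqz` evaluates the responses; window length, polynomial order, and the frequency bin examined are illustrative choices):

```python
import numpy as np
from scipy.signal import savgol_coeffs, freqz

m = 9
h_sg = savgol_coeffs(m, polyorder=4)    # Savitzky-Golay smoothing filter
h_ma = np.full(m, 1.0 / m)              # moving average of the same width

w, H_sg = freqz(h_sg, worN=512)
_, H_ma = freqz(h_ma, worN=512)

# at a moderate low frequency the SG response stays much closer to 1
k = 50
print(abs(H_sg[k]) > abs(H_ma[k]))      # True
```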


Implementations in programming languages


MATLAB

* sgolayfilt from the Signal Processing Toolbox. Available since before version R2006b.


Python

* flatten from module Lightkurve. Lightkurve is the official library for analysis of Kepler & TESS telescope data.
* savgol_filter from module SciPy. SciPy is a robust library widely used for scientific computing in the academic community.
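A minimal usage sketch of SciPy's savgol_filter (the test signal and the filter parameters here are illustrative):

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + 0.1 * rng.standard_normal(100)   # noisy test signal

# smoothing with a 9-point cubic fit
y_smooth = savgol_filter(y, window_length=9, polyorder=3)

# first derivative; delta is the sample spacing
dy = savgol_filter(y, 9, 3, deriv=1, delta=x[1] - x[0])
```

Smoothing reduces the residual against the noise-free signal while preserving its shape.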


See also

* Kernel smoother – different terminology for many of the same processes, used in statistics
* Local regression – the LOESS and LOWESS methods
* Numerical differentiation – application to differentiation of functions
* Smoothing spline
* Stencil (numerical analysis) – application to the solution of differential equations
* Hodrick–Prescott filter
* Kalman filter


Appendix


Tables of selected convolution coefficients

Consider a set of data points (''x''<sub>''j''</sub>, ''y''<sub>''j''</sub>). The Savitzky–Golay tables refer to the case in which the step ''x''<sub>''j''</sub> − ''x''<sub>''j''−1</sub> is constant, ''h''. Examples of the use of the so-called convolution coefficients, with a cubic polynomial and a window size, ''m'', of 5 points are as follows.
;Smoothing: Y_j = \frac{1}{35} (-3 \times y_{j-2} + 12 \times y_{j-1} + 17 \times y_j + 12 \times y_{j+1} - 3 \times y_{j+2})
;1st derivative: Y'_j = \frac{1}{12h} (1 \times y_{j-2} - 8 \times y_{j-1} + 0 \times y_j + 8 \times y_{j+1} - 1 \times y_{j+2})
;2nd derivative: Y''_j = \frac{1}{7h^2} (2 \times y_{j-2} - 1 \times y_{j-1} - 2 \times y_j - 1 \times y_{j+1} + 2 \times y_{j+2}).
Selected values of the convolution coefficients for polynomials of degree 1, 2, 3, 4 and 5 are given in the following tables. The values were calculated using the PASCAL code provided in Gorry.
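The 5-point cubic coefficients quoted above can be cross-checked against SciPy (`use='dot'` orders the coefficients from ''y''<sub>''j''−2</sub> to ''y''<sub>''j''+2</sub>, matching the formulas):

```python
import numpy as np
from scipy.signal import savgol_coeffs

# smoothing: (-3, 12, 17, 12, -3)/35
c0 = savgol_coeffs(5, polyorder=3, use='dot')
print(np.round(c0 * 35))   # [-3. 12. 17. 12. -3.]

# 1st derivative: (1, -8, 0, 8, -1)/12 (divide by h for grid spacing h)
c1 = savgol_coeffs(5, polyorder=3, deriv=1, use='dot')
print(np.round(c1 * 12))   # [ 1. -8.  0.  8. -1.]

# 2nd derivative: (2, -1, -2, -1, 2)/7 (divide by h^2)
c2 = savgol_coeffs(5, polyorder=3, deriv=2, use='dot')
print(np.round(c2 * 7))    # [ 2. -1. -2. -1.  2.]
```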




External links


Advanced Convolution Coefficient Calculator (ACCC) for multidimensional least-squares filters



