In statistics, signal processing, and time series analysis, a sinusoidal model is used to approximate a sequence ''Y_i'' by a sine function:
:Y_i = C + \alpha\sin(\omega T_i + \phi) + E_i
where ''C'' is a constant defining a mean level, α is the amplitude of the sine, ω is the angular frequency, ''T_i'' is a time variable, φ is the phase shift, and ''E_i'' is the error sequence. This sinusoidal model can be fit using nonlinear least squares; to obtain a good fit, routines may require good starting values for the unknown parameters. Fitting a model with a single sinusoid is a special case of spectral density estimation and least-squares spectral analysis.
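As an illustration of the nonlinear least squares fit mentioned above, the following is a minimal sketch using only the Python standard library. It assumes a damped Gauss-Newton iteration (in the style of Levenberg-Marquardt), which is one common choice and not necessarily what any particular fitting routine uses. The names `model`, `fit_sinusoid`, `solve`, and the synthetic data are hypothetical and chosen for illustration only.

```python
import math


def model(params, t):
    """Sinusoidal model C + alpha*sin(omega*t + phi), without the error term."""
    c, alpha, omega, phi = params
    return c + alpha * math.sin(omega * t + phi)


def sum_squared_error(params, t, y):
    return sum((yi - model(params, ti)) ** 2 for ti, yi in zip(t, y))


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_sinusoid(t, y, start, iterations=100):
    """Damped Gauss-Newton fit of (C, alpha, omega, phi) from starting values."""
    params = list(start)
    best = sum_squared_error(params, t, y)
    damping = 1e-3  # assumed initial damping; larger values give shorter steps
    for _ in range(iterations):
        c, alpha, omega, phi = params
        jtj = [[0.0] * 4 for _ in range(4)]
        jtr = [0.0] * 4
        for ti, yi in zip(t, y):
            arg = omega * ti + phi
            s, co = math.sin(arg), math.cos(arg)
            # Partial derivatives of the model w.r.t. C, alpha, omega, phi.
            grad = [1.0, s, alpha * ti * co, alpha * co]
            residual = yi - (c + alpha * s)
            for i in range(4):
                jtr[i] += grad[i] * residual
                for j in range(4):
                    jtj[i][j] += grad[i] * grad[j]
        for i in range(4):
            jtj[i][i] *= 1.0 + damping
        step = solve(jtj, jtr)
        trial = [p + d for p, d in zip(params, step)]
        trial_error = sum_squared_error(trial, t, y)
        if trial_error < best:
            # Accept the step and trust the linearization more next time.
            params, best = trial, trial_error
            damping = max(damping / 10.0, 1e-12)
        else:
            # Reject the step and shorten the next one.
            damping *= 10.0
    return params


# Noise-free synthetic data with known parameters (illustrative values only).
times = [0.1 * i for i in range(100)]
values = [model((2.0, 1.5, 2.0, 0.5), ti) for ti in times]
fitted = fit_sinusoid(times, values, start=(1.8, 1.3, 2.05, 0.4))
```

The starting values here are deliberately close to the true parameters. With a poor starting frequency, an iteration of this kind can settle in a local minimum, which is why the starting values matter. The sketch also assumes a nonzero starting amplitude, since a zero amplitude makes the linear system singular.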


Good starting values


Good starting value for the mean

A good starting value for ''C'' can be obtained by calculating the mean of the data. If the data show a trend, i.e., the assumption of constant location is violated, one can replace ''C'' with a linear or quadratic least squares fit. That is, the model becomes
:Y_i = (B_0 + B_1T_i) + \alpha\sin(\omega T_i + \phi) + E_i
or
:Y_i = (B_0 + B_1T_i + B_2T_i^2) + \alpha\sin(\omega T_i + \phi) + E_i
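The two options above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only: the function names `mean`, `linear_trend`, and `detrend` are hypothetical, and only the linear case is shown. Ordinary least squares on the raw data is assumed for the trend, which is slightly influenced by the sinusoid itself unless the data span many periods.

```python
def mean(values):
    """Arithmetic mean, usable as a starting value for C when there is no trend."""
    return sum(values) / len(values)


def linear_trend(t, y):
    """Ordinary least-squares fit y ~ b0 + b1*t; returns (b0, b1)."""
    t_bar, y_bar = mean(t), mean(y)
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b1 = sxy / sxx
    return y_bar - b1 * t_bar, b1


def detrend(t, y):
    """Subtract the fitted line, leaving the oscillating part plus error."""
    b0, b1 = linear_trend(t, y)
    return [yi - (b0 + b1 * ti) for ti, yi in zip(t, y)]
```

The pair returned by `linear_trend` can serve as starting values for ''B_0'' and ''B_1'', and the output of `detrend` can be passed to the frequency and amplitude estimates.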


Good starting value for frequency

The starting value for the frequency can be obtained from the dominant frequency in a periodogram. If the periodogram is expressed in cycles per unit time, the dominant frequency ''f'' corresponds to the angular frequency ω = 2π''f''. A complex demodulation phase plot can be used to refine this initial estimate for the frequency.
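A minimal sketch of the periodogram approach follows, assuming equally spaced samples with spacing `dt` and a direct discrete Fourier transform (adequate for short series; a fast Fourier transform would normally be used for long ones). The function names are hypothetical. The result is converted to an angular frequency so that it can be used directly in sin(ω''T_i'' + φ).

```python
import cmath
import math


def periodogram(y, dt=1.0):
    """Return (frequency, power) pairs for the positive Fourier frequencies."""
    n = len(y)
    y_bar = sum(y) / n
    centered = [v - y_bar for v in y]  # remove the mean level first
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(centered)
        )
        frequency = k / (n * dt)  # cycles per unit time
        result.append((frequency, abs(coeff) ** 2 / n))
    return result


def dominant_angular_frequency(y, dt=1.0):
    """Angular frequency (radians per unit time) of the largest periodogram peak."""
    frequency, _ = max(periodogram(y, dt), key=lambda pair: pair[1])
    return 2.0 * math.pi * frequency


# Illustrative data: 0.1 cycles per sample, so the peak falls on a Fourier frequency.
samples = [2.0 + 1.5 * math.sin(2.0 * math.pi * 0.1 * i + 0.5) for i in range(100)]
omega_start = dominant_angular_frequency(samples)
```

The resolution of this estimate is limited to the spacing of the Fourier frequencies, 1/(''n''·`dt`), which is one reason a refinement step can be useful.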


Good starting values for amplitude

The root mean square of the detrended data can be scaled by the square root of two to obtain an estimate of the sinusoid amplitude. A complex demodulation amplitude plot can also be used to find a good starting value for the amplitude. In addition, this plot can indicate whether the amplitude is constant over the entire range of the data or varies. If the plot is essentially flat, i.e., has zero slope, it is reasonable to assume a constant amplitude in the non-linear model. However, if the slope varies over the range of the plot, one may need to adjust the model to be
:Y_i = C + (B_0 + B_1 T_i)\sin(\omega T_i + \phi) + E_i
That is, one may replace α with a function of time. A linear fit is specified in the model above, but this can be replaced with a more elaborate function if needed.
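Both amplitude estimates can be sketched as follows. The function names are hypothetical, only the mean is removed (no trend is assumed), and the complex demodulation uses a simple moving average as its low-pass filter, which is an assumption made for brevity. The moving average works best when `window` covers a whole number of periods.

```python
import cmath
import math


def rms_amplitude(y):
    """Square root of two times the RMS of the mean-removed data."""
    y_bar = sum(y) / len(y)
    mean_square = sum((v - y_bar) ** 2 for v in y) / len(y)
    return math.sqrt(2.0 * mean_square)


def demodulated_amplitude(t, y, omega, window):
    """Local amplitude estimates from complex demodulation at angular frequency omega.

    Each value is shifted down by omega, smoothed with a moving average of
    `window` samples, and the magnitude is doubled to recover the amplitude.
    """
    y_bar = sum(y) / len(y)
    shifted = [(v - y_bar) * cmath.exp(-1j * omega * ti) for ti, v in zip(t, y)]
    amplitudes = []
    for start in range(len(shifted) - window + 1):
        smooth = sum(shifted[start:start + window]) / window
        amplitudes.append(2.0 * abs(smooth))
    return amplitudes


# Illustrative data: constant amplitude 1.5 and a period of exactly 10 samples.
omega = 2.0 * math.pi / 10.0
times = list(range(100))
values = [2.0 + 1.5 * math.sin(omega * ti + 0.5) for ti in times]
alpha_start = rms_amplitude(values)
local_amplitudes = demodulated_amplitude(times, values, omega, window=10)
```

Plotting `local_amplitudes` against time gives a rough complex demodulation amplitude plot: a flat sequence suggests a constant α, while a sloping one suggests a time-varying amplitude such as ''B_0'' + ''B_1T_i''.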


Model validation

As with any statistical model, the fit should be subjected to graphical and quantitative techniques of model validation. For example, a run sequence plot can be used to check for significant shifts in location or scale, start-up effects, and outliers. A lag plot can be used to verify that the residuals are independent; outliers also appear in the lag plot. A histogram and a normal probability plot can be used to check for skewness or other non-normality in the residuals.
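The graphical checks above can be complemented by simple numerical summaries of the residuals. The sketch below is illustrative and not part of any prescribed procedure; the function names are hypothetical. The lag-1 autocorrelation summarizes what a lag plot shows, and the sample skewness summarizes the asymmetry a histogram would reveal.

```python
def lag1_autocorrelation(residuals):
    """Sample autocorrelation at lag 1; values near zero are consistent with independence."""
    n = len(residuals)
    r_bar = sum(residuals) / n
    numerator = sum(
        (residuals[i] - r_bar) * (residuals[i - 1] - r_bar) for i in range(1, n)
    )
    denominator = sum((r - r_bar) ** 2 for r in residuals)
    return numerator / denominator


def skewness(residuals):
    """Sample skewness (third standardized moment); zero for symmetric residuals."""
    n = len(residuals)
    r_bar = sum(residuals) / n
    m2 = sum((r - r_bar) ** 2 for r in residuals) / n
    m3 = sum((r - r_bar) ** 3 for r in residuals) / n
    return m3 / m2 ** 1.5
```

A lag-1 autocorrelation far from zero suggests that the residuals still contain structure, for example a second sinusoid or a misestimated frequency.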


Extensions

A different method consists in transforming the non-linear regression into a linear regression by means of a convenient integral equation. There is then no need for an initial guess or for an iterative process: the fit is obtained directly. The method is explained in a chapter titled "Generalized sinusoidal regression" (pp. 54-63).


See also

* Pitch detection algorithm
* Spectral density estimation § Single tone


External links


Beam deflection case study