In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on ''kernels'' as weights. KDE answers a fundamental data smoothing problem where inferences about the population are made based on a finite data sample. In some fields such as signal processing and econometrics it is also termed the Parzen–Rosenblatt window method, after Emanuel Parzen and Murray Rosenblatt, who are usually credited with independently creating it in its current form. One of the famous applications of kernel density estimation is in estimating the class-conditional marginal densities of data when using a naive Bayes classifier, which can improve its prediction accuracy.


Definition

Let (x_1, x_2, \ldots, x_n) be independent and identically distributed samples drawn from some univariate distribution with an unknown density ''ƒ'' at any given point ''x''. We are interested in estimating the shape of this function ''ƒ''. Its ''kernel density estimator'' is
: \widehat{f}_h(x) = \frac{1}{n}\sum_{i=1}^n K_h (x - x_i) = \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x - x_i}{h}\Big),
where ''K'' is the kernel (a non-negative function) and ''h'' > 0 is a smoothing parameter called the ''bandwidth''. A kernel with subscript ''h'' is called the ''scaled kernel'' and is defined as K_h(x) = \frac{1}{h} K(x/h). Intuitively one wants to choose ''h'' as small as the data will allow; however, there is always a trade-off between the bias of the estimator and its variance. The choice of bandwidth is discussed in more detail below.

A range of kernel functions are commonly used: uniform, triangular, biweight, triweight, Epanechnikov, normal, and others. The Epanechnikov kernel is optimal in a mean square error sense, though the loss of efficiency is small for the kernels listed previously. Due to its convenient mathematical properties, the normal kernel is often used, which means K(x) = \phi(x), where ''ϕ'' is the standard normal density function.

The construction of a kernel density estimate finds interpretations in fields outside of density estimation. For example, in thermodynamics, this is equivalent to the amount of heat generated when heat kernels (the fundamental solution to the heat equation) are placed at each data point location x_i. Similar methods are used to construct discrete Laplace operators on point clouds for manifold learning (e.g. diffusion map).
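
The estimator defined above can be sketched directly in a few lines of Python with NumPy. This is only an illustrative transcription of the formula, not a reference implementation: the function names `gaussian_kernel` and `kde`, the sample values, and the bandwidth 0.8 are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, samples, h, kernel=gaussian_kernel):
    """Evaluate (1 / (n h)) * sum_i K((x - x_i) / h) at every point of x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    # u[k, i] = (x_k - x_i) / h for every evaluation point and every sample
    u = (x[:, None] - samples[None, :]) / h
    return kernel(u).sum(axis=1) / (samples.size * h)

# Hypothetical data and bandwidth, chosen only for illustration.
samples = np.array([-1.0, 0.0, 0.5, 2.0])
grid = np.linspace(-10.0, 10.0, 2001)
density = kde(grid, samples, h=0.8)

# A density estimate built from a proper kernel should integrate to about 1.
area = float(np.sum(density) * (grid[1] - grid[0]))
```

Because each scaled kernel integrates to one and the estimator averages ''n'' of them, the Riemann sum `area` should be very close to 1 for any positive bandwidth, provided the grid covers the bulk of the kernels.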


Example

Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel. The diagram below based on these 6 data points illustrates this relationship.

For the histogram, first, the horizontal axis is divided into sub-intervals or bins which cover the range of the data: in this case, six bins each of width 2. Whenever a data point falls inside this interval, a box of height 1/12 is placed there. If more than one data point falls inside the same bin, the boxes are stacked on top of each other.

For the kernel density estimate, normal kernels with variance 2.25 (indicated by the red dashed lines) are placed on each of the data points x_i. The kernels are summed to make the kernel density estimate (solid blue curve). The smoothness of the kernel density estimate (compared to the discreteness of the histogram) illustrates how kernel density estimates converge faster to the true underlying density for continuous random variables.
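
The comparison can be reproduced numerically. The sketch below assumes six illustrative data values (they are placeholders, not taken from the text) and uses the bin width 2 and kernel variance 2.25 (bandwidth 1.5) described above. With 6 points and bins of width 2, each point contributes a box of height 1/(6 · 2) = 1/12.

```python
import numpy as np

# Six illustrative data points (assumed values, used only for this sketch).
data = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2])

# Histogram: six bins of width 2 covering [-4, 8].
edges = np.linspace(-4.0, 8.0, 7)
heights, _ = np.histogram(data, bins=edges, density=True)
# density=True divides each count by (n * bin width) = 12.

# Kernel density estimate: normal kernels with variance 2.25, i.e. h = 1.5.
h = 1.5
grid = np.linspace(-10.0, 14.0, 2401)
u = (grid[:, None] - data[None, :]) / h
kde_values = np.exp(-0.5 * u**2).sum(axis=1) / (data.size * h * np.sqrt(2.0 * np.pi))

# Both estimates are densities, so both should integrate to about 1.
hist_area = float(np.sum(heights * np.diff(edges)))
kde_area = float(np.sum(kde_values) * (grid[1] - grid[0]))
```

The histogram heights are multiples of 1/12 and jump at the bin edges, while `kde_values` varies smoothly across the same range.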


Bandwidth selection

The bandwidth of the kernel is a free parameter which exhibits a strong influence on the resulting estimate. To illustrate its effect, we take a simulated random sample from the standard normal distribution (plotted at the blue spikes in the rug plot on the horizontal axis). The grey curve is the true density (a normal density with mean 0 and variance 1). In comparison, the red curve is ''undersmoothed'' since it contains too many spurious data artifacts arising from using a bandwidth ''h'' = 0.05, which is too small. The green curve is ''oversmoothed'' since using the bandwidth ''h'' = 2 obscures much of the underlying structure. The black curve with a bandwidth of ''h'' = 0.337 is considered to be optimally smoothed since its density estimate is close to the true density. An extreme situation is encountered in the limit h \to 0 (no smoothing), where the estimate is a sum of ''n'' delta functions centered at the coordinates of analyzed samples. In the other extreme limit h \to \infty the estimate retains the shape of the used kernel, centered on the mean of the samples (completely smooth).

The most common optimality criterion used to select this parameter is the expected ''L''2 risk function, also termed the mean integrated squared error:
: \operatorname{MISE}(h) = \operatorname{E}\!\left[ \int (\widehat{f}_h(x) - f(x))^2 \, dx \right]
Under weak assumptions on ''ƒ'' and ''K'' (''ƒ'' is the, generally unknown, real density function),
: \operatorname{MISE}(h) = \operatorname{AMISE}(h) + o((nh)^{-1} + h^4)
where ''o'' is the little o notation, and ''n'' the sample size (as above). The AMISE is the asymptotic MISE, i.e. the two leading terms,
: \operatorname{AMISE}(h) = \frac{R(K)}{nh} + \frac{1}{4} m_2(K)^2 h^4 R(f'')
where R(g) = \int g(x)^2 \, dx for a function ''g'', m_2(K) = \int x^2 K(x) \, dx, f'' is the second derivative of f, and K is the kernel. The minimum of this AMISE is found by setting its derivative with respect to ''h'' to zero:
: \frac{\partial}{\partial h} \operatorname{AMISE}(h) = -\frac{R(K)}{nh^2} + m_2(K)^2 h^3 R(f'') = 0
or
: h_{\operatorname{AMISE}} = \frac{R(K)^{1/5}}{m_2(K)^{2/5} R(f'')^{1/5}} \, n^{-1/5} = C n^{-1/5}
Neither the AMISE nor the h_{\operatorname{AMISE}} formulas can be used directly since they involve the unknown density function f or its second derivative f''. To overcome that difficulty a variety of automatic, data-based methods were developed to select the bandwidth. Many review studies were carried out to compare their efficacies, with the general consensus that the plug-in selectors and cross validation selectors are the most useful over a wide range of data sets.

Substituting any bandwidth ''h'' which has the same asymptotic order n^{-1/5} as h_{\operatorname{AMISE}} into the AMISE gives that \operatorname{AMISE}(h) = O(n^{-4/5}), where ''O'' is the big o notation. It can be shown that, under weak assumptions, there cannot exist a non-parametric estimator that converges at a faster rate than the kernel estimator. Note that the n^{-4/5} rate is slower than the typical n^{-1} convergence rate of parametric methods.

If the bandwidth is not held fixed, but is varied depending upon the location of either the estimate (balloon estimator) or the samples (pointwise estimator), this produces a particularly powerful method termed adaptive or variable bandwidth kernel density estimation. Bandwidth selection for kernel density estimation of heavy-tailed distributions is relatively difficult.
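
The effect of the three bandwidths discussed above can be checked numerically by computing the integrated squared error (the quantity inside the expectation in the MISE) for one simulated sample. The sketch below is only an illustration under stated assumptions: the sample size of 100, the random seed, the integration grid, and the helper names are choices made for the example, so the exact error values will differ from any published figure.

```python
import numpy as np

rng = np.random.default_rng(0)          # assumed seed, for reproducibility only
sample = rng.standard_normal(100)       # assumed sample size of 100

grid = np.linspace(-6.0, 6.0, 1201)
dx = grid[1] - grid[0]
true_density = np.exp(-0.5 * grid**2) / np.sqrt(2.0 * np.pi)

def gaussian_kde_on_grid(points, data, h):
    """Gaussian-kernel density estimate evaluated at each element of points."""
    u = (points[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (data.size * h * np.sqrt(2.0 * np.pi))

def integrated_squared_error(h):
    """Riemann-sum approximation of the integral of (estimate - truth)^2."""
    estimate = gaussian_kde_on_grid(grid, sample, h)
    return float(np.sum((estimate - true_density) ** 2) * dx)

# Undersmoothed, near-optimal, and oversmoothed bandwidths from the text.
ise = {h: integrated_squared_error(h) for h in (0.05, 0.337, 2.0)}
for h, err in ise.items():
    print(f"h = {h}: ISE = {err:.4f}")
```

For a standard normal sample of this size, the intermediate bandwidth should give a clearly smaller error than either extreme: the small bandwidth is penalised through the variance term R(K)/(nh), the large one through the bias term proportional to h^4.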


A rule-of-thumb bandwidth estimator

If Gaussian basis functions are used to approximate univariate data, and the underlying density being estimated is Gaussian, the optimal choice for ''h'' (that is, the bandwidth that minimises the mean integrated squared error) is:
: h = \left(\frac{4\hat{\sigma}^5}{3n}\right)^{1/5} \approx 1.06 \, \hat{\sigma} \, n^{-1/5},
An ''h'' value is considered more robust when it improves the fit for long-tailed and skewed distributions or for bimodal mixture distributions. This is often done empirically by replacing the standard deviation \hat{\sigma} by the parameter ''A'' below:
: A = \min\left(\hat{\sigma}, \frac{\mathrm{IQR}}{1.34}\right).
Another modification that will improve the model is to reduce the factor from 1.06 to 0.9. Then the final formula would be:
: h = 0.9 \, \min\left(\hat{\sigma}, \frac{\mathrm{IQR}}{1.34}\right) \, n^{-1/5}
where \hat{\sigma} is the standard deviation of the samples, ''n'' is the sample size, and IQR is the interquartile range. This approximation is termed the ''normal distribution approximation'', Gaussian approximation, or ''Silverman's rule of thumb''. While this rule of thumb is easy to compute, it should be used with caution as it can yield widely inaccurate estimates when the density is not close to being normal. For example, when estimating the bimodal
Gaussian mixture model
: \textstyle\frac{1}{2\sqrt{2\pi}}e^{-\frac{1}{2}(x-10)^2}+\frac{1}{2\sqrt{2\pi}}e^{-\frac{1}{2}(x+10)^2}
from a sample of 200 points, the figure on the right shows the true density and two kernel density estimates: one using the rule-of-thumb bandwidth, and the other using a solve-the-equation bandwidth. The estimate based on the rule-of-thumb bandwidth is significantly oversmoothed.
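
The rule of thumb is simple enough to compute by hand, as the following sketch shows. The function name `silverman_bandwidth` is hypothetical, and two details are assumptions rather than part of the rule as stated: the standard deviation uses the n − 1 denominator, and the quartiles use NumPy's default linear interpolation. The bimodal sample is simulated with two well-separated unit-variance clusters purely to illustrate the oversmoothing issue.

```python
import numpy as np

def silverman_bandwidth(samples):
    """h = 0.9 * min(sigma_hat, IQR / 1.34) * n^(-1/5)."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    sigma_hat = samples.std(ddof=1)                # assumed: sample std (n - 1)
    q75, q25 = np.percentile(samples, [75, 25])    # assumed: default interpolation
    a = min(sigma_hat, (q75 - q25) / 1.34)
    return 0.9 * a * n ** (-1.0 / 5.0)

# Small worked case: for 1..5 the quartiles are 2 and 4, so IQR = 2.
h_small = silverman_bandwidth([1.0, 2.0, 3.0, 4.0, 5.0])

# Simulated bimodal sample: 100 points around -10 and 100 around +10.
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(-10.0, 1.0, 100),
                          rng.normal(10.0, 1.0, 100)])
h_bimodal = silverman_bandwidth(bimodal)
```

For the bimodal sample both the standard deviation and the scaled interquartile range are driven by the distance between the two clusters rather than by the spread within each one, so `h_bimodal` comes out several times larger than the unit within-cluster standard deviation. This is the oversmoothing described above.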


Relation to the characteristic function density estimator

Given the sample (x_1, x_2, \ldots, x_n), it is natural to estimate the characteristic function as
: \widehat\varphi(t) = \frac{1}{n} \sum_{j=1}^n e^{itx_j}
Knowing the characteristic function, it is possible to find the corresponding probability density function through the Fourier transform formula. One difficulty with applying this inversion formula is that it leads to a diverging integral, since the estimate \widehat\varphi(t) is unreliable for large ''t''’s. To circumvent this problem, the estimator \widehat\varphi(t) is multiplied by a damping function \psi_h(t) = \psi(ht), which is equal to 1 at the origin and then falls to 0 at infinity. The “bandwidth parameter” ''h'' controls how fast we try to dampen the function \widehat\varphi(t). In particular when ''h'' is small, then \psi_h(t) will be approximately one for a large range of ''t''’s, which means that \widehat\varphi(t) remains practically unaltered in the most important region of ''t''’s.

The most common choice for function ''ψ'' is either the uniform function \psi(t) = \mathbf{1}\{-1 \le t \le 1\}, which effectively means truncating the interval of integration in the inversion formula to [-1/h, 1/h], or the Gaussian function \psi(t) = e^{-\pi t^2}. Once the function ''ψ'' has been chosen, the inversion formula may be applied, and the density estimator will be
: \begin{align} \widehat{f}(x) &= \frac{1}{2\pi} \int_{-\infty}^{+\infty} \widehat\varphi(t)\psi_h(t) e^{-itx} \, dt = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{1}{n} \sum_{j=1}^n e^{it(x_j - x)} \psi(ht) \, dt \\ &= \frac{1}{nh} \sum_{j=1}^n \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-i(ht)\frac{x - x_j}{h}} \psi(ht) \, d(ht) = \frac{1}{nh} \sum_{j=1}^n K\Big(\frac{x - x_j}{h}\Big), \end{align}
where ''K'' is the Fourier transform of the damping function ''ψ''. Thus the kernel density estimator coincides with the characteristic function density estimator.
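
The equivalence can be verified numerically. The sketch below assumes the damping function \psi(t) = e^{-t^2/2}, chosen because its transform \frac{1}{2\pi}\int e^{-isu}\psi(s)\,ds is exactly the standard normal density, so the matching kernel estimator is the familiar Gaussian KDE. The sample values, bandwidth, and integration grid are likewise assumptions for the example, and the inversion integral is approximated by a plain Riemann sum.

```python
import numpy as np

samples = np.array([-1.2, 0.3, 0.9, 2.5])    # hypothetical data
h = 0.7                                       # hypothetical bandwidth
x_grid = np.linspace(-4.0, 6.0, 11)

# Frequency grid; psi(h t) is negligible long before |t| = 40.
t = np.linspace(-40.0, 40.0, 8001)
dt = t[1] - t[0]

# Empirical characteristic function: mean over j of exp(i t x_j).
ecf = np.exp(1j * t[:, None] * samples[None, :]).mean(axis=1)

# Assumed damping function psi(t) = exp(-t^2 / 2), evaluated at h t.
psi_h = np.exp(-0.5 * (h * t) ** 2)

# Inversion formula: (1 / 2 pi) * integral of ecf(t) psi(h t) exp(-i t x) dt.
integrand = (ecf * psi_h)[None, :] * np.exp(-1j * x_grid[:, None] * t[None, :])
f_from_cf = (integrand.sum(axis=1) * dt).real / (2.0 * np.pi)

# Direct kernel density estimate with the standard normal kernel.
u = (x_grid[:, None] - samples[None, :]) / h
f_from_kde = np.exp(-0.5 * u**2).sum(axis=1) / (samples.size * h * np.sqrt(2.0 * np.pi))

max_difference = float(np.max(np.abs(f_from_cf - f_from_kde)))
```

Under these assumptions the two estimates should agree to within numerical integration error at every point of `x_grid`.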


Geometric and topological features

We can extend the definition of the (global) mode to a local sense and define the local modes:
: M = \{ x : g(x) = 0, \lambda_1(x) < 0 \}
Namely, M is the collection of points for which the density function is locally maximized (here g(x) is the gradient of the density and \lambda_1(x) is the largest eigenvalue of its Hessian matrix). A natural estimator of M is a plug-in from KDE,
: M_c = \{ x : \hat{g}(x) = 0, \hat{\lambda}_1(x) < 0 \}
where \hat{g}(x) and \hat{\lambda}_1(x) are the KDE versions of g(x) and \lambda_1(x). Under mild assumptions, M_c is a consistent estimator of M. Note that one can use the mean shift algorithm to compute the estimator M_c numerically.
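
A minimal one-dimensional sketch of the mean shift idea is given below, assuming a Gaussian kernel. Each starting point is repeatedly replaced by the kernel-weighted average of the samples, which moves it uphill on the kernel density estimate until it settles at a local mode. The function name, the convergence tolerance, the rule for merging converged points, and the simulated two-cluster data are all assumptions made for illustration.

```python
import numpy as np

def mean_shift_modes(samples, h, tol=1e-8, max_iter=1000):
    """Estimate local modes of a 1-D Gaussian KDE by mean shift iterations."""
    samples = np.asarray(samples, dtype=float)
    points = samples.copy()                     # start one trajectory per sample
    for _ in range(max_iter):
        # Gaussian weights of every sample as seen from every current point.
        w = np.exp(-0.5 * ((points[:, None] - samples[None, :]) / h) ** 2)
        new_points = (w * samples[None, :]).sum(axis=1) / w.sum(axis=1)
        shift = np.max(np.abs(new_points - points))
        points = new_points
        if shift < tol:
            break
    # Merge trajectories that ended at (numerically) the same location.
    modes = []
    for p in np.sort(points):
        if not modes or p - modes[-1] > 1e-2 * h:
            modes.append(float(p))
    return np.array(modes)

# Simulated data with two clusters, used only to exercise the sketch.
rng = np.random.default_rng(1)
data = np.concatenate([rng.normal(-3.0, 0.5, 100), rng.normal(3.0, 0.5, 100)])
modes = mean_shift_modes(data, h=0.8)
```

With two well-separated clusters and a bandwidth of this size, the procedure is expected to return two modes, one near each cluster centre. A much smaller bandwidth would typically produce additional spurious modes, in line with the undersmoothing behaviour discussed in the bandwidth section.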


Statistical implementation

A non-exhaustive list of software implementations of kernel density estimators includes:
* In Analytica release 4.4, the ''Smoothing'' option for PDF results uses KDE, and from expressions it is available via the built-in Pdf function.
* In C/C++, FIGTree is a library that can be used to compute kernel density estimates using normal kernels. A MATLAB interface is available.
* In C++, libagf is a library for variable kernel density estimation.
* In C++, mlpack is a library that can compute KDE using many different kernels. It allows setting an error tolerance for faster computation. Python and R interfaces are available.
* In C# and F#, Math.NET Numerics is an open-source library for numerical computation which includes kernel density estimation.
* In CrimeStat, kernel density estimation is implemented using five different kernel functions: normal, uniform, quartic, negative exponential, and triangular. Both single- and dual-kernel density estimate routines are available. Kernel density estimation is also used in interpolating a Head Bang routine, in estimating a two-dimensional Journey-to-crime density function, and in estimating a three-dimensional Bayesian Journey-to-crime estimate.
* In ELKI, kernel density functions can be found in the package de.lmu.ifi.dbs.elki.math.statistics.kernelfunctions.
* In ESRI products, kernel density mapping is managed out of the Spatial Analyst toolbox and uses the quartic (biweight) kernel.
* In Excel, the Royal Society of Chemistry has created an add-in to run kernel density estimation based on their Analytical Methods Committee Technical Brief 4.
* In gnuplot, kernel density estimation is implemented by the smooth kdensity option; the datafile can contain a weight and bandwidth for each point, or the bandwidth can be set automatically according to "Silverman's rule of thumb" (see above).
* In Haskell, kernel density is implemented in the statistics package.
* In IGOR Pro, kernel density estimation is implemented by the StatsKDE operation (added in Igor Pro 7.00). Bandwidth can be user specified or estimated by means of Silverman, Scott or Bowman and Azzalini. Kernel types are: Epanechnikov, Bi-weight, Tri-weight, Triangular, Gaussian and Rectangular.
* In Java, the Weka machine learning package provides weka.estimators.KernelEstimator, among others.
* In JavaScript, the visualization package D3.js offers a KDE package in its science.stats package.
* In JMP, the Graph Builder platform utilizes kernel density estimation to provide contour plots and high density regions (HDRs) for bivariate densities, and violin plots and HDRs for univariate densities. Sliders allow the user to vary the bandwidth. Bivariate and univariate kernel density estimates are also provided by the Fit Y by X and Distribution platforms, respectively.
* In Julia, kernel density estimation is implemented in the KernelDensity.jl package.
* In MATLAB, kernel density estimation is implemented through the ksdensity function (Statistics Toolbox). As of the 2018a release of MATLAB, both the bandwidth and kernel smoother can be specified, including other options such as specifying the range of the kernel density. Alternatively, a free MATLAB software package which implements an automatic bandwidth selection method is available from the MATLAB Central File Exchange for 1-dimensional data, 2-dimensional data, and n-dimensional data. A free MATLAB toolbox with implementations of kernel regression, kernel density estimation, kernel estimation of hazard function and many others is also available (this toolbox is part of a book).
* In Mathematica, numeric kernel density estimation is implemented by the function SmoothKernelDistribution and symbolic estimation is implemented using the function KernelMixtureDistribution, both of which provide data-driven bandwidths.
* In Minitab, the Royal Society of Chemistry has created a macro to run kernel density estimation based on their Analytical Methods Committee Technical Brief 4.
* In the NAG Library, kernel density estimation is implemented via the g10ba routine (available in both the Fortran and the C versions of the Library).
* In Nuklei, C++ kernel density methods focus on data from the Special Euclidean group SE(3).
* In Octave, kernel density estimation is implemented by the kernel_density option (econometrics package).
* In Origin, a 2D kernel density plot can be made from its user interface, and two functions, Ksdensity for 1D and Ks2density for 2D, can be used from its LabTalk, Python, or C code.
* In Perl, an implementation can be found in the Statistics-KernelEstimation module.
* In PHP, an implementation can be found in the MathPHP library.
* In
Python, many implementations exist: the pyqt_fit.kde module in the PyQt-Fit package, SciPy (scipy.stats.gaussian_kde), Statsmodels (KDEUnivariate and KDEMultivariate), and scikit-learn (KernelDensity) (see comparison). KDEpy supports weighted data, and its FFT-based implementation is reported to be orders of magnitude faster than the other implementations. The commonly used pandas library offers support for KDE plotting through the plot method (df.plot(kind='kde')). The getdist
package for weighted and correlated MCMC samples supports optimized bandwidth, boundary correction and higher-order methods for 1D and 2D distributions. Another package that supports kernel density estimation is seaborn (import seaborn as sns, then sns.kdeplot()). A GPU implementation of KDE also exists.
* In R, it is implemented through density in the base distribution; the default bandwidth is computed by the bw.nrd0 function in the stats package, which uses the optimized formula in Silverman's book. Other implementations include bkde in the KernSmooth library; ParetoDensityEstimation in the DataVisualizations library (for Pareto distribution density estimation); kde; dkden and dbckden (the latter for boundary-corrected kernel density estimation with bounded support); npudens (numeric and categorical data); and sm.density in the sm library. For an implementation of the kde.R function, which does not require installing any packages or libraries, see kde.R. A package dedicated to urban analysis implements kernel density estimation through kernel_smoothing.
* In SAS, proc kde can be used to estimate univariate and bivariate kernel densities.
* In Apache Spark, it is implemented by the KernelDensity() class.
* In Stata, it is implemented through kdensity; for example, histogram x, kdensity. Alternatively, a free Stata module, KDENS, is available, allowing a user to estimate 1D or 2D density functions.
* In
Swift, it is implemented through SwiftStats.KernelDensityEstimation in the open-source statistics library SwiftStats.
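
To make concrete what the packages listed above compute, the following is a minimal sketch of a one-dimensional Gaussian kernel density estimate in plain Python, using only the standard library. The bandwidth follows the rule-of-thumb formula from Silverman's book that R's bw.nrd0 is based on, namely 0.9 · min(sd, IQR/1.34) · n^(−1/5). The helper names silverman_bandwidth and gaussian_kde and the sample values are illustrative assumptions, not the API of any package named above, and the fallback for degenerate samples is simplified relative to bw.nrd0.

```python
import math
import statistics


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n ** (-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    # 'inclusive' quartiles interpolate linearly between order statistics.
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    spread = min(sd, (q3 - q1) / 1.34)
    if spread <= 0:
        # Degenerate sample (e.g. many ties): fall back to sd, then to 1.0.
        # This fallback is a simplification chosen for this sketch.
        spread = sd if sd > 0 else 1.0
    return 0.9 * spread * n ** -0.2


def gaussian_kde(data, bandwidth=None):
    """Return a function x -> estimated density at x (Gaussian kernel)."""
    h = silverman_bandwidth(data) if bandwidth is None else bandwidth
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, each scaled by h.
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


# Illustrative sample values (an assumption made for this sketch).
sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
h = silverman_bandwidth(sample)
f_hat = gaussian_kde(sample)

# Riemann-sum check that the estimate integrates to roughly 1.
step = 0.01
grid = [-20.0 + step * i for i in range(4601)]  # covers [-20, 26]
area = sum(f_hat(x) for x in grid) * step
print(f"bandwidth = {h:.3f}, area = {area:.4f}")
```

Evaluating the sum directly costs one kernel evaluation per data point for every query point, which is fine for small samples; binned or FFT-based approaches such as the one mentioned for KDEpy trade a little accuracy for much better scaling on large grids.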


See also

* Kernel (statistics)
* Kernel smoothing
* Kernel regression
* Density estimation (with presentation of other examples)
* Mean-shift
* Scale space: The triplets form a scale space representation of the data.
* Multivariate kernel density estimation
* Variable kernel density estimation
* Head/tail breaks


Further reading

* Härdle, Müller, Sperlich, Werwatz, ''Nonparametric and Semiparametric Models'', Springer-Verlag Berlin Heidelberg 2004, pp. 39–83


External links


* Introduction to kernel density estimation: a short tutorial which motivates kernel density estimators as an improvement over histograms.
* A free online tool that generates an optimized kernel density estimate.
* Free Online Software (Calculator): computes the kernel density estimate for a data series using the following kernels: Gaussian, Epanechnikov, Rectangular, Triangular, Biweight, Cosine, and Optcosine.
* An online interactive example of kernel density estimation. Requires .NET 3.0 or later.