HOME

TheInfoList



OR:

The term kernel is used in statistical analysis to refer to a
window function In signal processing and statistics, a window function (also known as an apodization function or tapering function) is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the int ...
. The term "kernel" has several distinct meanings in different branches of statistics.


Bayesian statistics

In statistics, especially in
Bayesian statistics Bayesian statistics is a theory in the field of statistics based on the Bayesian interpretation of probability where probability expresses a ''degree of belief'' in an event. The degree of belief may be based on prior knowledge about the event, ...
, the kernel of a
probability density function In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) ca ...
(pdf) or probability mass function (pmf) is the form of the pdf or pmf in which any factors that are not functions of any of the variables in the domain are omitted. Note that such factors may well be functions of the
parameter A parameter (), generally, is any characteristic that can help in defining or classifying a particular system (meaning an event, project, object, situation, etc.). That is, a parameter is an element of a system that is useful, or critical, when ...
s of the pdf or pmf. These factors form part of the
normalization factor The concept of a normalizing constant arises in probability theory and a variety of other areas of mathematics. The normalizing constant is used to reduce any probability function to a probability density function with total probability of one. ...
of the probability distribution, and are unnecessary in many situations. For example, in
pseudo-random number sampling Non-uniform random variate generation or pseudo-random number sampling is the numerical practice of generating pseudo-random numbers (PRN) that follow a given probability distribution. Methods are typically based on the availability of a unifo ...
, most sampling algorithms ignore the normalization factor. In addition, in
Bayesian analysis Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics, and e ...
of
conjugate prior In Bayesian probability theory, if the posterior distribution p(\theta \mid x) is in the same probability distribution family as the prior probability distribution p(\theta), the prior and posterior are then called conjugate distributions, and th ...
distributions, the normalization factors are generally ignored during the calculations, and only the kernel considered. At the end, the form of the kernel is examined, and if it matches a known distribution, the normalization factor can be reinstated. Otherwise, it may be unnecessary (for example, if the distribution only needs to be sampled from). For many distributions, the kernel can be written in closed form, but not the normalization constant. An example is the normal distribution. Its
probability density function In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) ca ...
is :p(x, \mu,\sigma^2) = \frac e^ and the associated kernel is :p(x, \mu,\sigma^2) \propto e^ Note that the factor in front of the exponential has been omitted, even though it contains the parameter \sigma^2 , because it is not a function of the domain variable x .


Pattern analysis

The kernel of a
reproducing kernel Hilbert space In functional analysis (a branch of mathematics), a reproducing kernel Hilbert space (RKHS) is a Hilbert space of functions in which point evaluation is a continuous linear functional. Roughly speaking, this means that if two functions f and g in ...
is used in the suite of techniques known as kernel methods to perform tasks such as
statistical classification In statistics, classification is the problem of identifying which of a set of categories (sub-populations) an observation (or observations) belongs to. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagn ...
,
regression analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one ...
, and cluster analysis on data in an implicit space. This usage is particularly common in
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
.


Nonparametric statistics

In nonparametric statistics, a kernel is a weighting function used in
non-parametric Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions (common examples of parameters are the mean and variance). Nonparametric statistics is based on either being distri ...
estimation techniques. Kernels are used in
kernel density estimation In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on '' kernels'' as ...
to estimate random variables' density functions, or in
kernel regression In statistics, kernel regression is a non-parametric technique to estimate the conditional expectation of a random variable. The objective is to find a non-linear relation between a pair of random variables ''X'' and ''Y''. In any nonparametric ...
to estimate the conditional expectation of a random variable. Kernels are also used in
time-series In mathematics, a time series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Ex ...
, in the use of the
periodogram In signal processing, a periodogram is an estimate of the spectral density of a signal. The term was coined by Arthur Schuster in 1898. Today, the periodogram is a component of more sophisticated methods (see spectral estimation). It is the most ...
to estimate the spectral density where they are known as window functions. An additional use is in the estimation of a time-varying intensity for a point process where window functions (kernels) are convolved with time-series data. Commonly, kernel widths must also be specified when running a non-parametric estimation.


Definition

A kernel is a
non-negative In mathematics, the sign of a real number is its property of being either positive, negative, or zero. Depending on local conventions, zero may be considered as being neither positive nor negative (having no sign or a unique third sign), or it ...
real-valued In mathematics, value may refer to several, strongly related notions. In general, a mathematical value may be any definite mathematical object. In elementary mathematics, this is most often a number – for example, a real number such as or an i ...
integrable In mathematics, integrability is a property of certain dynamical systems. While there are several distinct formal definitions, informally speaking, an integrable system is a dynamical system with sufficiently many conserved quantities, or first ...
function ''K.'' For most applications, it is desirable to define the function to satisfy two additional requirements: * Normalization: :\int_^K(u)\,du = 1\,; *Symmetry: :K(-u) = K(u) \mbox u\,. The first requirement ensures that the method of kernel density estimation results in a
probability density function In probability theory, a probability density function (PDF), or density of a continuous random variable, is a function whose value at any given sample (or point) in the sample space (the set of possible values taken by the random variable) ca ...
. The second requirement ensures that the average of the corresponding distribution is equal to that of the sample used. If ''K'' is a kernel, then so is the function ''K''* defined by ''K''*(''u'') = λ''K''(λ''u''), where λ > 0. This can be used to select a scale that is appropriate for the data.


Kernel functions in common use

Several types of kernel functions are commonly used: uniform, triangle, Epanechnikov, quartic (biweight), tricube, triweight, Gaussian, quadratic and cosine. In the table below, if K is given with a bounded support, then K(u) = 0 for values of ''u'' lying outside the support.


See also

*
Kernel density estimation In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on '' kernels'' as ...
*
Kernel smoother A kernel smoother is a statistical technique to estimate a real valued function f: \mathbb^p \to \mathbb as the weighted average of neighboring observed data. The weight is defined by the ''kernel'', such that closer points are given higher weights ...
*
Stochastic kernel In probability theory, a Markov kernel (also known as a stochastic kernel or probability kernel) is a map that in the general theory of Markov processes plays the role that the transition matrix does in the theory of Markov processes with a finite ...
*
Positive-definite kernel In operator theory, a branch of mathematics, a positive-definite kernel is a generalization of a positive-definite function or a positive-definite matrix. It was first introduced by James Mercer in the early 20th century, in the context of solving ...
*
Density estimation In statistics, probability density estimation or simply density estimation is the construction of an estimate, based on observed data, of an unobservable underlying probability density function. The unobservable density function is thought of ...
* Multivariate kernel density estimation


References

* * *{{cite journal, year=2002 , first1=D, last1=Comaniciu, first2= P, last2= Meer , title=Mean shift: A robust approach toward feature space analysis , journal=IEEE Transactions on Pattern Analysis and Machine Intelligence, volume= 24, issue= 5, pages= 603–619 , citeseerx = 10.1.1.76.8968 , doi=10.1109/34.1000236 Nonparametric statistics Time series Point processes Bayesian statistics