In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time series prediction, classification, and system control. They were first formulated in a 1988 paper by Broomhead and Lowe, both researchers at the Royal Signals and Radar Establishment.
Network architecture
Radial basis function (RBF) networks typically have three layers: an input layer, a hidden layer with a non-linear RBF activation function and a linear output layer. The input can be modeled as a vector of real numbers \mathbf{x} \in \mathbb{R}^n. The output of the network is then a scalar function of the input vector, \varphi : \mathbb{R}^n \to \mathbb{R}, and is given by

: \varphi(\mathbf{x}) = \sum_{i=1}^N a_i \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)

where N is the number of neurons in the hidden layer, \mathbf{c}_i is the center vector for neuron i, and a_i is the weight of neuron i in the linear output neuron. Functions that depend only on the distance from a center vector are radially symmetric about that vector, hence the name radial basis function. In the basic form, all inputs are connected to each hidden neuron. The norm is typically taken to be the Euclidean distance (although the Mahalanobis distance appears to perform better in pattern recognition) and the radial basis function is commonly taken to be Gaussian

: \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big) = \exp\left[-\beta_i \left\|\mathbf{x} - \mathbf{c}_i\right\|^2\right] .
The Gaussian basis functions are local to the center vector in the sense that

: \lim_{\left\|\mathbf{x}\right\| \to \infty} \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big) = 0 ,

i.e. changing parameters of one neuron has only a small effect for input values that are far away from the center of that neuron.
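As a concrete sketch, the unnormalized Gaussian network output \varphi(\mathbf{x}) = \sum_i a_i \exp[-\beta_i \|\mathbf{x} - \mathbf{c}_i\|^2] can be evaluated directly with NumPy; the centers, widths, and weights below are arbitrary illustrative values, not parameters from any particular trained model:

```python
import numpy as np

def rbf_output(x, centers, betas, weights):
    """Unnormalized Gaussian RBF network: sum_i a_i * exp(-beta_i * ||x - c_i||^2)."""
    sq_dists = np.sum((centers - x) ** 2, axis=1)   # ||x - c_i||^2 for each center
    activations = np.exp(-betas * sq_dists)         # Gaussian hidden-layer outputs
    return float(weights @ activations)             # linear output neuron

# Illustrative (made-up) parameters: 3 hidden neurons, 2-d input.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
betas = np.array([1.0, 1.0, 1.0])
weights = np.array([1.0, 2.0, 3.0])

y = rbf_output(np.array([0.0, 0.0]), centers, betas, weights)
```

Because the Gaussian basis functions are local, the output decays toward zero for inputs far from all centers.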
Given certain mild conditions on the shape of the activation function, RBF networks are universal approximators on a compact subset of \mathbb{R}^n. This means that an RBF network with enough hidden neurons can approximate any continuous function on a closed, bounded set with arbitrary precision.
The parameters a_i, \mathbf{c}_i, and \beta_i are determined in a manner that optimizes the fit between \varphi and the data.
Normalized architecture
In addition to the above ''unnormalized'' architecture, RBF networks can be ''normalized''. In this case the mapping is

: \varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ \frac{\sum_{i=1}^N a_i \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)}{\sum_{i=1}^N \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)} = \sum_{i=1}^N a_i u\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)

where

: u\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big) \ \stackrel{\mathrm{def}}{=}\ \frac{\rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)}{\sum_{j=1}^N \rho\big(\left\|\mathbf{x} - \mathbf{c}_j\right\|\big)}

is known as a ''normalized radial basis function''.
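A minimal sketch of the normalized variant, again with arbitrary illustrative parameters: dividing by the sum of the basis functions makes the hidden-layer outputs u_i sum to one, so the network output is a convex combination of the weights a_i.

```python
import numpy as np

def normalized_rbf_output(x, centers, betas, weights):
    """Normalized RBF network: sum_i a_i * u_i(x), with u_i = rho_i / sum_j rho_j."""
    sq_dists = np.sum((centers - x) ** 2, axis=1)
    rho = np.exp(-betas * sq_dists)
    u = rho / rho.sum()            # normalized basis functions sum to 1
    return float(weights @ u)

# Illustrative parameters: two 1-d centers with weights 0 and 10.
centers = np.array([[0.0], [1.0]])
betas = np.array([1.0, 1.0])
weights = np.array([0.0, 10.0])

# Far to the right of both centers, u -> (0, 1), so the output approaches a_2 = 10;
# midway between the centers, u = (0.5, 0.5) and the output is the average 5.
y_far = normalized_rbf_output(np.array([5.0]), centers, betas, weights)
y_mid = normalized_rbf_output(np.array([0.5]), centers, betas, weights)
```

Note the contrast with the unnormalized form, whose output decays to zero far from all centers.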
Theoretical motivation for normalization
There is theoretical justification for this architecture in the case of stochastic data flow. Assume a stochastic kernel approximation for the joint probability density

: P(\mathbf{x} \land y) = \frac{1}{N} \sum_{i=1}^N \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big) \, \sigma\big(\left| y - e_i \right|\big)

where the weights \mathbf{c}_i and e_i are exemplars from the data and we require the kernels to be normalized

: \int \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big) \, d^n\mathbf{x} = 1

and

: \int \sigma\big(\left| y - e_i \right|\big) \, dy = 1 .
The probability densities in the input and output spaces are

: P(\mathbf{x}) = \int P(\mathbf{x} \land y) \, dy = \frac{1}{N} \sum_{i=1}^N \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)

and

: P(y) = \int P(\mathbf{x} \land y) \, d^n\mathbf{x} = \frac{1}{N} \sum_{i=1}^N \sigma\big(\left| y - e_i \right|\big) .
The expectation of y given an input \mathbf{x} is

: \varphi(\mathbf{x}) \ \stackrel{\mathrm{def}}{=}\ E(y \mid \mathbf{x}) = \int y \, P(y \mid \mathbf{x}) \, dy

where

: P(y \mid \mathbf{x})

is the conditional probability of y given \mathbf{x}.
The conditional probability is related to the joint probability through Bayes' theorem

: P(y \mid \mathbf{x}) = \frac{P(\mathbf{x} \land y)}{P(\mathbf{x})}

which yields

: \varphi(\mathbf{x}) = \int y \, \frac{P(\mathbf{x} \land y)}{P(\mathbf{x})} \, dy .

This becomes

: \varphi(\mathbf{x}) = \sum_{i=1}^N e_i u\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)

when the integrations are performed.
Local linear models
It is sometimes convenient to expand the architecture to include local linear models. In that case the architectures become, to first order,

: \varphi(\mathbf{x}) = \sum_{i=1}^N \left( a_i + \mathbf{b}_i \cdot \left( \mathbf{x} - \mathbf{c}_i \right) \right) \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)

and

: \varphi(\mathbf{x}) = \sum_{i=1}^N \left( a_i + \mathbf{b}_i \cdot \left( \mathbf{x} - \mathbf{c}_i \right) \right) u\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big)

in the unnormalized and normalized cases, respectively. Here \mathbf{b}_i are weights to be determined. Higher order linear terms are also possible.
This result can be written

: \varphi(\mathbf{x}) = \sum_{i=1}^{2N} \sum_{j=1}^{n} e_{ij} v_{ij}\big(\mathbf{x} - \mathbf{c}_i\big)

where

: e_{ij} = \begin{cases} a_i, & \mbox{if } i \in [1, N] \\ b_{ij}, & \mbox{if } i \in [N+1, 2N] \end{cases}

and

: v_{ij}\big(\mathbf{x} - \mathbf{c}_i\big) \ \stackrel{\mathrm{def}}{=}\ \begin{cases} \delta_{ij} \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big), & \mbox{if } i \in [1, N] \\ \left( x_j - c_{ij} \right) \rho\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big), & \mbox{if } i \in [N+1, 2N] \end{cases}

in the unnormalized case and

: v_{ij}\big(\mathbf{x} - \mathbf{c}_i\big) \ \stackrel{\mathrm{def}}{=}\ \begin{cases} \delta_{ij} u\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big), & \mbox{if } i \in [1, N] \\ \left( x_j - c_{ij} \right) u\big(\left\|\mathbf{x} - \mathbf{c}_i\right\|\big), & \mbox{if } i \in [N+1, 2N] \end{cases}

in the normalized case.
Here \delta_{ij} is a Kronecker delta function defined as

: \delta_{ij} = \begin{cases} 1, & \mbox{if } i = j \\ 0, & \mbox{if } i \ne j \end{cases} .
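The first-order unnormalized local linear model can be sketched as follows; the parameters below are arbitrary illustrative values, with the slope vectors \mathbf{b}_i letting each hidden unit contribute a locally linear (rather than constant) term:

```python
import numpy as np

def local_linear_rbf(x, centers, betas, a, b):
    """Unnormalized local linear RBF model:
    phi(x) = sum_i (a_i + b_i . (x - c_i)) * exp(-beta_i * ||x - c_i||^2)."""
    diffs = x - centers                               # rows: x - c_i
    rho = np.exp(-betas * np.sum(diffs ** 2, axis=1)) # Gaussian basis values
    local = a + np.sum(b * diffs, axis=1)             # a_i + b_i . (x - c_i)
    return float(np.sum(local * rho))

# Illustrative parameters: two hidden units in 2-d.
centers = np.array([[0.0, 0.0], [2.0, 0.0]])
betas = np.array([1.0, 1.0])
a = np.array([1.0, -1.0])
b = np.array([[0.5, 0.0], [0.0, 0.5]])

y = local_linear_rbf(np.array([0.0, 0.0]), centers, betas, a, b)
```

Setting every \mathbf{b}_i to zero recovers the basic architecture from the previous section.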
Training
RBF networks are typically trained from pairs of input and target values \mathbf{x}(t), y(t), t = 1, \dots, T, by a two-step algorithm.
In the first step, the center vectors \mathbf{c}_i of the RBF functions in the hidden layer are chosen. This step can be performed in several ways; centers can be randomly sampled from some set of examples, or they can be determined using k-means clustering. Note that this step is unsupervised.
The second step simply fits a linear model with coefficients w_i to the hidden layer's outputs with respect to some objective function. A common objective function, at least for regression/function estimation, is the least squares function

: K(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^T K_t(\mathbf{w})

where

: K_t(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \big[ y(t) - \varphi\big(\mathbf{x}(t), \mathbf{w}\big) \big]^2 .

We have explicitly included the dependence on the weights. Minimization of the least squares objective function by optimal choice of weights optimizes accuracy of fit.
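The two-step algorithm can be sketched end to end on a toy regression problem; the specific choices below (a minimal Lloyd's-algorithm k-means, a fixed width \beta = 4, and sin(x) as target) are illustrative assumptions, not part of the general method:

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=50):
    """Minimal Lloyd's algorithm (step 1: unsupervised choice of centers)."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each sample to its nearest center, then move centers to cluster means.
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def design_matrix(X, centers, beta):
    """Hidden-layer activations rho(||x(t) - c_i||) for every training sample."""
    sq = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    return np.exp(-beta * sq)

# Toy regression problem: y = sin(x) on [0, 2*pi].
X = np.linspace(0.0, 2 * np.pi, 200)[:, None]
y = np.sin(X).ravel()

centers = kmeans(X, k=10)                    # step 1: unsupervised
Phi = design_matrix(X, centers, beta=4.0)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # step 2: least squares fit

mse = np.mean((Phi @ w - y) ** 2)            # training-set value of K(w) / T
```

Because step 2 is linear in w once the centers are fixed, it reduces to an ordinary least squares problem.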
There are occasions in which multiple objectives, such as smoothness as well as accuracy, must be optimized. In that case it is useful to optimize a regularized objective function such as

: H(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ K(\mathbf{w}) + \lambda S(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^T H_t(\mathbf{w})

where

: S(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ \sum_{t=1}^T S_t(\mathbf{w})

and

: H_t(\mathbf{w}) \ \stackrel{\mathrm{def}}{=}\ K_t(\mathbf{w}) + \lambda S_t(\mathbf{w})

where optimization of S maximizes smoothness and \lambda is known as a regularization parameter.
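The smoothness term S is left generic above. One common concrete choice (an assumption here, not mandated by the text) is to penalize the squared weight norm, giving a ridge-style objective K(\mathbf{w}) + \lambda \|\mathbf{w}\|^2 with the closed-form solution below:

```python
import numpy as np

def fit_ridge(Phi, y, lam):
    """Minimize K(w) + lam * ||w||^2 (ridge penalty as a stand-in for lam * S(w)).
    Normal equations: (Phi^T Phi + lam * I) w = Phi^T y."""
    n = Phi.shape[1]
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(n), Phi.T @ y)

# Tiny illustrative design matrix and targets.
Phi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

w0 = fit_ridge(Phi, y, lam=0.0)   # lam = 0 reduces to ordinary least squares
w1 = fit_ridge(Phi, y, lam=10.0)  # larger lam shrinks the weights
```

Increasing \lambda trades accuracy of fit for smaller (smoother) weights.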
A third optional backpropagation step can be performed to fine-tune all of the RBF net's parameters.
Interpolation
RBF networks can be used to interpolate a function y : \mathbb{R}^n \to \mathbb{R} when the values of that function are known on a finite number of points: y(\mathbf{x}_i) = b_i, i = 1, \dots, N. Taking the known points \mathbf{x}_i to be the centers of the radial basis functions and evaluating the values of the basis functions at the same points, g_{ij} = \rho\big(\left\|\mathbf{x}_j - \mathbf{x}_i\right\|\big), the weights can be solved from the equation

: \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1N} \\ g_{21} & g_{22} & \cdots & g_{2N} \\ \vdots & & \ddots & \vdots \\ g_{N1} & g_{N2} & \cdots & g_{NN} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix} .
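Solving this linear system can be sketched as follows; the points, values, and Gaussian kernel width below are illustrative assumptions:

```python
import numpy as np

def rbf_interpolate(points, values, beta=1.0):
    """Solve G w = b with g_ij = exp(-beta * ||x_j - x_i||^2) so that the
    resulting RBF network passes through every known point exactly."""
    sq = ((points[:, None, :] - points[None]) ** 2).sum(-1)
    G = np.exp(-beta * sq)        # interpolation matrix of basis-function values
    return np.linalg.solve(G, values)

def rbf_eval(x, points, w, beta=1.0):
    """Evaluate the interpolant sum_i w_i * exp(-beta * ||x - x_i||^2)."""
    sq = ((points - x) ** 2).sum(-1)
    return float(w @ np.exp(-beta * sq))

# Three known 1-d points and their function values.
points = np.array([[0.0], [1.0], [2.0]])
values = np.array([0.0, 1.0, 4.0])
w = rbf_interpolate(points, values)
```

For distinct points the Gaussian interpolation matrix is positive definite, so the system has a unique solution and the interpolant reproduces the known values at the centers.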