In machine learning, a Hyper basis function network, or HyperBF network, is a generalization of the radial basis function (RBF) network concept in which a Mahalanobis-like distance is used instead of the Euclidean distance measure. Hyper basis function networks were first introduced by Poggio and Girosi in the 1990 paper "Networks for Approximation and Learning".
[T. Poggio and F. Girosi (1990). "Networks for Approximation and Learning". ''Proc. IEEE'' 78(9):1481–1497.] [R.N. Mahdi, E.C. Rouchka (2011). "Reduced HyperBF Networks: Regularization by Explicit Complexity Reduction and Scaled Rprop-Based Training". ''IEEE Transactions on Neural Networks'' 22(5):673–686.]
Network Architecture
The typical HyperBF network structure consists of a real input vector $x \in \mathbb{R}^d$, a hidden layer of activation functions, and a linear output layer. The output of the network is a scalar function of the input vector, $\phi : \mathbb{R}^d \to \mathbb{R}$, given by

$$\phi(x) = \sum_{j=1}^{N} a_j \, \rho_j\!\left(\lVert x - \mu_j \rVert\right)$$
where $N$ is the number of neurons in the hidden layer, and $\mu_j$ and $a_j$ are the center and weight of neuron $j$, respectively. The activation function $\rho_j\!\left(\lVert x - \mu_j \rVert\right)$ of the HyperBF network takes the form

$$\rho_j\!\left(\lVert x - \mu_j \rVert\right) = e^{-(x - \mu_j)^T R_j (x - \mu_j)}$$
where $R_j$ is a positive definite $d \times d$ matrix. Depending on the application, the following types of matrices $R_j$ are usually considered; a sketch of the corresponding forward pass follows the list. [F. Schwenker, H.A. Kestler and G. Palm (2001). "Three Learning Phases for Radial-Basis-Function Networks". ''Neural Netw.'' 14:439–458.]
* $R_j = \frac{1}{2\sigma^2}\,\mathbb{I}_{d \times d}$, where $\sigma > 0$. This case corresponds to the regular RBF network.
* $R_j = \frac{1}{2\sigma_j^2}\,\mathbb{I}_{d \times d}$, where $\sigma_j > 0$. In this case, the basis functions are radially symmetric, but are scaled with different widths.
* $R_j = \operatorname{diag}\!\left(\frac{1}{2\sigma_{j1}^2}, \ldots, \frac{1}{2\sigma_{jd}^2}\right)$, where $\sigma_{ji} > 0$. Every neuron has an elliptic shape with a varying size.
* $R_j$ positive definite, but not diagonal.
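To make the forward pass concrete, here is a minimal sketch in JAX. It is not code from the cited papers; the name `hyperbf_forward` and the array shapes are illustrative assumptions, with one positive definite matrix $R_j$ stored per neuron so that all four cases above are covered.

```python
import jax.numpy as jnp

def hyperbf_forward(x, centers, weights, R):
    """phi(x) = sum_j a_j * exp(-(x - mu_j)^T R_j (x - mu_j)).

    x: (d,) input; centers: (N, d) mu_j; weights: (N,) a_j;
    R: (N, d, d) positive definite matrices R_j.
    """
    diff = x - centers                                # rows are x - mu_j, shape (N, d)
    quad = jnp.einsum('nd,nde,ne->n', diff, R, diff)  # Mahalanobis-like quadratic forms
    return jnp.dot(weights, jnp.exp(-quad))           # linear output layer
```

Setting every `R[j]` to `jnp.eye(d) / (2 * sigma**2)` recovers the regular RBF network of the first case, while a diagonal `R[j]` gives the elliptic neurons of the third.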
Training
Training HyperBF networks involves estimating the weights $a_j$ and the shapes and centers of the neurons, $R_j$ and $\mu_j$. Poggio and Girosi (1990) describe a training method with moving centers and adaptable neuron shapes. An outline of the method is given below.
Consider the quadratic loss of the network over a training set of $S$ examples $(x_i, y_i)$,

$$H[\phi^*] = \sum_{i=1}^{S} \left( y_i - \phi^*(x_i) \right)^2 .$$

The following conditions must be satisfied at the optimum:

$$\frac{\partial H}{\partial a_j} = 0, \qquad \frac{\partial H}{\partial \mu_j} = 0, \qquad \frac{\partial H}{\partial W} = 0,$$

where $R_j = W^T W$. Then, in the gradient descent method, the values of $a_j$, $\mu_j$, and $W$ that minimize $H[\phi^*]$ can be found as a stable fixed point of the following dynamical system:

$$\dot{a}_j = -\omega \frac{\partial H}{\partial a_j}, \qquad \dot{\mu}_j = -\omega \frac{\partial H}{\partial \mu_j}, \qquad \dot{W} = -\omega \frac{\partial H}{\partial W},$$

where $\omega$ is a fixed learning rate.
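This update rule can be sketched in JAX, with autodiff standing in for the hand-derived gradients of Poggio and Girosi (1990). The training data `(xs, ys)`, the discretized update in `step`, and the learning rate `omega` are assumptions for illustration; as in the outline above, a single matrix $W$ shared by all neurons defines $R = W^T W$.

```python
import jax
import jax.numpy as jnp

def loss(params, xs, ys):
    """Quadratic loss H = sum_i (y_i - phi(x_i))^2 with R = W^T W."""
    a, mu, W = params['a'], params['mu'], params['W']
    R = W.T @ W                              # positive (semi)definite by construction
    diff = xs[:, None, :] - mu[None, :, :]   # (S, N, d) differences x_i - mu_j
    quad = jnp.einsum('snd,de,sne->sn', diff, R, diff)
    phi = jnp.exp(-quad) @ a                 # (S,) network outputs
    return jnp.sum((ys - phi) ** 2)

@jax.jit
def step(params, xs, ys, omega=1e-3):
    # Discretized gradient flow: p <- p - omega * dH/dp for a_j, mu_j, and W.
    grads = jax.grad(loss)(params, xs, ys)
    return jax.tree_util.tree_map(lambda p, g: p - omega * g, params, grads)
```

Iterating `params = step(params, xs, ys)` moves the parameters along the negative gradient of $H$; a stable fixed point of this iteration satisfies the optimality conditions above.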