Learning Vector Quantization

In computer science, learning vector quantization (LVQ) is a prototype-based supervised classification algorithm. LVQ is the supervised counterpart of vector quantization systems.


Overview

LVQ can be understood as a special case of an artificial neural network; more precisely, it applies a winner-take-all Hebbian learning-based approach. It is a precursor to self-organizing maps (SOM) and is related to neural gas and to the k-nearest neighbor algorithm (k-NN). LVQ was invented by Teuvo Kohonen.

An LVQ system is represented by prototypes which are defined in the feature space of observed data. In winner-take-all training algorithms, one determines, for each data point, the prototype closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted: the winner is moved closer to the input if it classifies the data point correctly, and moved away if it classifies it incorrectly.

An advantage of LVQ is that it creates prototypes that are easy to interpret for experts in the respective application domain, and LVQ systems can be applied to multi-class classification problems in a natural way. A key issue in LVQ is the choice of an appropriate measure of distance or similarity for training and classification. Techniques have been developed which adapt a parameterized distance measure in the course of training the system; see, e.g., (Schneider, Biehl, and Hammer, 2009) and references therein. LVQ has also been applied, for example, to the classification of text documents.
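Once trained, an LVQ system classifies a new input by the label of its nearest prototype. The following is a minimal sketch of that decision rule; the function name, prototype positions, and data are illustrative, not from the original:

```python
import numpy as np

def lvq_classify(x, prototypes, labels):
    """Assign x the label of its nearest prototype (Euclidean distance)."""
    distances = np.linalg.norm(prototypes - x, axis=1)
    return labels[np.argmin(distances)]

# Two prototypes per class in a 2-D feature space (illustrative values).
prototypes = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
labels = np.array([0, 0, 1, 1])

print(lvq_classify(np.array([0.5, 0.2]), prototypes, labels))  # class 0
print(lvq_classify(np.array([5.5, 4.8]), prototypes, labels))  # class 1
```

Because the classifier is fully described by the prototype positions and their labels, a domain expert can inspect the prototypes directly, which is the interpretability advantage mentioned above.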


Algorithm

Below follows an informal description. The algorithm consists of three basic steps.

The algorithm's input is:
* the number of neurons M the system will have (in the simplest case, equal to the number of classes)
* the weight vector \vec{w_i} of each neuron, for i = 0, 1, ..., M - 1
* the label c_i corresponding to each neuron \vec{w_i}
* the learning rate \eta
* an input list L containing all the vectors whose labels are already known (the training set)

The algorithm's flow is:
# For the next input \vec{x} (with label y) in L, find the closest neuron \vec{w_m}, i.e. d(\vec{x}, \vec{w_m}) = \min\limits_i d(\vec{x}, \vec{w_i}), where d is the metric used (Euclidean, etc.).
# Update \vec{w_m}: move \vec{w_m} closer to the input \vec{x} if \vec{x} and \vec{w_m} have the same label, and move them further apart if they do not:
\vec{w_m} \gets \vec{w_m} + \eta \cdot \left( \vec{x} - \vec{w_m} \right) if c_m = y (closer together),
or \vec{w_m} \gets \vec{w_m} - \eta \cdot \left( \vec{x} - \vec{w_m} \right) if c_m \neq y (further apart).
# While there are vectors left in L, go to step 1; otherwise terminate.

Note: \vec{w_i} and \vec{x} are vectors in feature space.
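The three steps above can be sketched as a short training loop. This is a minimal illustration under assumed conventions (Euclidean metric, a fixed learning rate, epoch-wise passes over the training list); function names and toy data are hypothetical:

```python
import numpy as np

def lvq1_train(X, y, prototypes, proto_labels, eta=0.1, epochs=10):
    """Winner-take-all LVQ training: attract/repel the single closest prototype."""
    W = prototypes.astype(float).copy()
    for _ in range(epochs):
        for x, label in zip(X, y):
            # Step 1: find the winner, the prototype closest to the input.
            m = np.argmin(np.linalg.norm(W - x, axis=1))
            # Step 2: move the winner toward x if the labels match, away otherwise.
            if proto_labels[m] == label:
                W[m] += eta * (x - W[m])
            else:
                W[m] -= eta * (x - W[m])
    return W

# Toy training set: two well-separated classes (illustrative values).
X = np.array([[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 4.9]])
y = np.array([0, 0, 1, 1])
W0 = np.array([[1.0, 1.0], [4.0, 4.0]])      # initial prototype positions
W = lvq1_train(X, y, W0, np.array([0, 1]))   # prototypes drift toward their classes
```

After training, each prototype has moved toward the cluster of points that share its label; in practice the learning rate \eta is often decreased over time so that the prototypes settle.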


References

Schneider, P., Biehl, M., and Hammer, B. (2009). Adaptive Relevance Matrices in Learning Vector Quantization. Neural Computation, 21(12).


Further reading


Somervuo, P. and Kohonen, T. (2004). Self-Organizing Maps and Learning Vector Quantization for Feature Sequences.


External links


lvq_pak: official release (1996) by Kohonen and his team