HOME

TheInfoList



OR:

Generative topographic map (GTM) is a
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
method that is a probabilistic counterpart of the
self-organizing map A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the to ...
(SOM), is probably convergent and does not require a shrinking
neighborhood A neighbourhood (British English, Irish English, Australian English and Canadian English) or neighborhood (American English; see spelling differences) is a geographically localised community within a larger city, town, suburb or rural area, ...
or a decreasing step size. It is a
generative model In statistical classification, two main approaches are called the generative approach and the discriminative approach. These compute classifiers by different approaches, differing in the degree of statistical modelling. Terminology is inconsis ...
: the data is assumed to arise by first probabilistically picking a point in a low-dimensional space, mapping the point to the observed high-dimensional input space (via a smooth function), then adding noise in that space. The parameters of the low-dimensional probability distribution, the smooth map and the noise are all learned from the training data using the expectation-maximization (EM) algorithm. GTM was introduced in 1996 in a paper by
Christopher Bishop Christopher Michael Bishop (born 7 April 1959) is the Laboratory Director at Microsoft Research, Microsoft Research Cambridge, Honorary Professor of Computer Science at the University of Edinburgh and a Fellow of Darwin College, Cambridge. Bis ...
, Markus Svensen, and Christopher K. I. Williams.


Details of the algorithm

The approach is strongly related to
density networks Density (volumetric mass density or specific mass) is the substance's mass per unit of volume. The symbol most often used for density is ''ρ'' (the lower case Greek letter rho), although the Latin letter ''D'' can also be used. Mathematically ...
which use
importance sampling Importance sampling is a Monte Carlo method for evaluating properties of a particular distribution, while only having samples generated from a different distribution than the distribution of interest. Its introduction in statistics is generally att ...
and a multi-layer perceptron to form a non-linear
latent variable model A latent variable model is a statistical model that relates a set of observable variables (also called ''manifest variables'' or ''indicators'') to a set of latent variables. It is assumed that the responses on the indicators or manifest variabl ...
. In the GTM the latent space is a discrete grid of points which is assumed to be non-linearly projected into data space. A
Gaussian noise Gaussian noise, named after Carl Friedrich Gauss, is a term from signal processing theory denoting a kind of signal noise that has a probability density function (pdf) equal to that of the normal distribution (which is also known as the Gaussia ...
assumption is then made in data space so that the model becomes a constrained mixture of Gaussians. Then the model's likelihood can be maximized by EM. In theory, an arbitrary nonlinear parametric deformation could be used. The optimal parameters could be found by gradient descent, etc. The suggested approach to the nonlinear mapping is to use a
radial basis function network In the field of mathematical modeling, a radial basis function network is an artificial neural network that uses radial basis functions as activation functions. The output of the network is a linear combination of radial basis functions of the inp ...
(RBF) to create a nonlinear mapping between the latent space and the data space. The nodes of the RBF network then form a
feature space In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon. Choosing informative, discriminating and independent features is a crucial element of effective algorithms in pattern r ...
and the nonlinear mapping can then be taken as a
linear transform In mathematics, and more specifically in linear algebra, a linear map (also called a linear mapping, linear transformation, vector space homomorphism, or in some contexts linear function) is a mapping V \to W between two vector spaces that pre ...
of this feature space. This approach has the advantage over the suggested density network approach that it can be optimised analytically.


Uses

In data analysis, GTMs are like a nonlinear version of
principal components analysis Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and ...
, which allows high-dimensional data to be modelled as resulting from Gaussian noise added to sources in lower-dimensional latent space. For example, to locate stocks in plottable 2D space based on their hi-D time-series shapes. Other applications may want to have fewer sources than data points, for example mixture models. In generative deformational modelling, the latent and data spaces have the same dimensions, for example, 2D images or 1 audio sound waves. Extra 'empty' dimensions are added to the source (known as the 'template' in this form of modelling), for example locating the 1D sound wave in 2D space. Further nonlinear dimensions are then added, produced by combining the original dimensions. The enlarged latent space is then projected back into the 1D data space. The probability of a given projection is, as before, given by the product of the likelihood of the data under the Gaussian noise model with the prior on the deformation parameter. Unlike conventional spring-based deformation modelling, this has the advantage of being analytically optimizable. The disadvantage is that it is a 'data-mining' approach, i.e. the shape of the deformation prior is unlikely to be meaningful as an explanation of the possible deformations, as it is based on a very high, artificial- and arbitrarily constructed nonlinear latent space. For this reason the prior is learned from data rather than created by a human expert, as is possible for spring-based models.


Comparison with Kohonen's self-organizing maps

While nodes in the self-organizing map (SOM) can wander around at will, GTM nodes are constrained by the allowable transformations and their probabilities. If the deformations are well-behaved the topology of the latent space is preserved. The SOM was created as a biological model of neurons and is a heuristic algorithm. By contrast, the GTM has nothing to do with neuroscience or cognition and is a probabilistically principled model. Thus, it has a number of advantages over SOM, namely: * it explicitly formulates a density model over the data. * it uses a cost function that quantifies how well the map is trained. * it uses a sound optimization procedure ( EM algorithm). GTM was introduced by Bishop, Svensen and Williams in their Technical Report in 1997 (Technical Report NCRG/96/015, Aston University, UK) published later in Neural Computation. It was also described in the PhD thesis of Markus Svensen (Aston, 1998).


Applications

{{Empty section, date=May 2015


See also

* Self-organizing map (SOM) *
Artificial Neural Network Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected unit ...
*
Connectionism Connectionism refers to both an approach in the field of cognitive science that hopes to explain mental phenomena using artificial neural networks (ANN) and to a wide range of techniques and algorithms using ANNs in the context of artificial in ...
* Data mining *
Machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
*
Nonlinear dimensionality reduction Nonlinear dimensionality reduction, also known as manifold learning, refers to various related techniques that aim to project high-dimensional data onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-d ...
*
Neural network software Neural network software is used to simulate, research, develop, and apply artificial neural networks, software concepts adapted from biological neural networks, and in some cases, a wider array of adaptive systems such as artificial intelligence ...
*
Pattern recognition Pattern recognition is the automated recognition of patterns and regularities in data. It has applications in statistical data analysis, signal processing, image analysis, information retrieval, bioinformatics, data compression, computer graphi ...


External links


Bishop, Svensen and Williams Generative Topographic Mapping paper

Generative topographic mapping
developed at the Neural Computing Research Group os Aston University (UK). ( Matlab toolbox ) Artificial neural networks