Latent Manifold
   HOME

TheInfoList



OR:

A latent space, also known as a latent feature space or embedding space, is an
embedding In mathematics, an embedding (or imbedding) is one instance of some mathematical structure contained within another instance, such as a group that is a subgroup. When some object X is said to be embedded in another object Y, the embedding is g ...
of a set of items within a manifold in which items resembling each other are positioned closer to one another in the latent space. Position within the latent space can be viewed as being defined by a set of
latent variable In statistics, latent variables (from Latin: present participle of ''lateo'', “lie hidden”) are variables that can only be inferred indirectly through a mathematical model from other observable variables that can be directly observed or me ...
s that emerge from the resemblances from the objects. In most cases, the
dimensionality In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it. Thus, a line has a dimension of one (1D) because only one coordin ...
of the latent space is chosen to be lower than the dimensionality of the
feature space In machine learning and pattern recognition, a feature is an individual measurable property or characteristic of a phenomenon. Choosing informative, discriminating and independent features is a crucial element of effective algorithms in pattern r ...
from which the data points are drawn, making the construction of a latent space an example of
dimensionality reduction Dimensionality reduction, or dimension reduction, is the transformation of data from a high-dimensional space into a low-dimensional space so that the low-dimensional representation retains some meaningful properties of the original data, ideally ...
, which can also be viewed as a form of
data compression In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compressio ...
. Latent spaces are usually fit via
machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set of tasks. It is seen as a part of artificial intelligence. Machine ...
, and they can then be used as feature spaces in machine learning models, including classifiers and other supervised predictors. The interpretation of the latent spaces of machine learning models is an active field of study, but latent space interpretation is difficult to achieve. Due to the black-box nature of machine learning models, the latent space may be completely unintuitive. Additionally, the latent space may be high-dimensional, complex, and nonlinear, which may add to the difficulty of interpretation. Some visualization techniques have been developed to connect the latent space to the visual world, but there is often not a direct connection between the latent space interpretation and the model itself. Such techniques include
t-distributed stochastic neighbor embedding t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally d ...
(t-SNE), where the latent space is mapped to two dimensions for visualization. Latent space distances lack physical units, so the interpretation of these distances may depend on the application. A number of algorithms exist to create latent space embeddings given a set of data items and a
similarity function In statistics and related fields, a similarity measure or similarity function or similarity metric is a real-valued function that quantifies the similarity between two objects. Although no single definition of a similarity exists, usually such mea ...
.


See also

*
Induced topology In topology and related areas of mathematics, a subspace of a topological space ''X'' is a subset ''S'' of ''X'' which is equipped with a topology induced from that of ''X'' called the subspace topology (or the relative topology, or the induced to ...
*
Clustering algorithm Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters). It is a main task of ...
*
Intrinsic dimension The intrinsic dimension for a data set can be thought of as the number of variables needed in a minimal representation of the data. Similarly, in signal processing of multidimensional signals, the intrinsic dimension of the signal describes how many ...
*
Latent semantic analysis Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the do ...
*
Manifold hypothesis In theoretical computer science and the study of machine learning Machine learning (ML) is a field of inquiry devoted to understanding and building methods that 'learn', that is, methods that leverage data to improve performance on some set ...
* Nonlinear dimensionality reduction *
Self-organizing map A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher dimensional data set while preserving the t ...


References

Data mining Cluster analysis {{AI-stub