Projection pursuit (PP) is a type of statistical technique that involves finding the most "interesting" possible projections in multidimensional data. Often, projections that deviate more from a normal distribution are considered to be more interesting. As each projection is found, the data are reduced by removing the component along that projection, and the process is repeated to find new projections; this is the "pursuit" aspect that motivated the technique known as matching pursuit. The idea of projection pursuit is to locate the projection or projections from high-dimensional space to low-dimensional space that reveal the most details about the structure of the data set. Once an interesting set of projections has been found, existing structures (clusters, surfaces, etc.) can be extracted and analyzed separately. Projection pursuit has been widely used for blind source separation, so it is very important in independent component analysis. Projection pursuit seeks one projection at a time such that the extracted signal is as non-Gaussian as possible.
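The find-then-remove loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original Friedman and Tukey procedure: the projection index is taken to be absolute excess kurtosis (one common measure of non-Gaussianity), the data are whitened first, each direction is found with a FastICA-style fixed-point update, and "removing the component" is done by keeping later directions orthogonal to earlier ones. The function names (`whiten`, `kurtosis_index`, `projection_pursuit`) and their parameters are hypothetical.

```python
import numpy as np


def whiten(X):
    """Center X and rescale it so its sample covariance is the identity."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def kurtosis_index(y):
    """Assumed projection index: absolute excess kurtosis (about 0 for a Gaussian)."""
    y = (y - y.mean()) / y.std()
    return abs(np.mean(y ** 4) - 3.0)


def _deflate(w, found):
    """Remove from w its components along directions already found, then normalize."""
    for u in found:
        w = w - (w @ u) * u
    return w / np.linalg.norm(w)


def projection_pursuit(X, n_components, n_restarts=10, n_iter=200, seed=0):
    """Greedy pursuit: one direction at a time, each orthogonal to the previous ones.

    Returns (W, Z): the rows of W are unit directions in the whitened space,
    and Z is the whitened data, so Z @ W.T holds the extracted projections.
    """
    rng = np.random.default_rng(seed)
    Z = whiten(np.asarray(X, dtype=float))
    found = []
    for _ in range(n_components):
        best_w, best_score = None, -np.inf
        for _ in range(n_restarts):  # random restarts guard against poor local optima
            w = _deflate(rng.standard_normal(Z.shape[1]), found)
            for _ in range(n_iter):
                y = Z @ w
                # Fixed-point step for a kurtosis extremum on whitened data.
                w_new = (Z * (y ** 3)[:, None]).mean(axis=0) - 3.0 * w
                w_new = _deflate(w_new, found)
                converged = abs(abs(w_new @ w) - 1.0) < 1e-10
                w = w_new
                if converged:
                    break
            score = kurtosis_index(Z @ w)
            if score > best_score:
                best_w, best_score = w, score
        found.append(best_w)
    return np.array(found), Z
```

Called as `W, Z = projection_pursuit(X, n_components=2)` on an array `X` of shape `(n_samples, n_features)`, the projections are `Z @ W.T`. Other projection indices or optimizers could be substituted; kurtosis is used here only because it is simple to state and is sensitive to outliers in practice.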


History

Projection pursuit techniques were originally proposed and experimented with by Kruskal. Related ideas occur in Switzer (1970), "Numerical classification", pp. 31–43 in "Computer Applications in the Earth Sciences: Geostatistics", and Switzer and Wright (1971), "Numerical classification of Eocene nummulitids", Mathematical Geology, pp. 297–311. The first successful implementation is due to Jerome H. Friedman and John Tukey (1974), who gave the technique its name. The original purpose of projection pursuit was to machine-pick "interesting" low-dimensional projections of a high-dimensional point cloud by numerically maximizing a certain objective function, or projection index. Several years later, Friedman and Stuetzle extended the idea behind projection pursuit and added projection pursuit regression (PPR), projection pursuit classification (PPC), and projection pursuit density estimation (PPDE).


Feature

The most exciting feature of projection pursuit is that it is one of the very few multivariate methods able to bypass the "curse of dimensionality" caused by the fact that high-dimensional space is mostly empty. In addition, projection pursuit is able to ignore irrelevant (i.e. noisy and information-poor) variables. This is a distinct advantage over methods based on interpoint distances, such as minimal spanning trees, multidimensional scaling and most clustering techniques. Many of the methods of classical multivariate analysis turn out to be special cases of projection pursuit. Examples are principal component analysis and discriminant analysis, and the quartimax and oblimax methods in factor analysis. One serious drawback of projection pursuit methods is their high demand on computer time.
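The remark that principal component analysis is a special case can be made concrete with a short sketch: if the projection index is simply the variance of the projected data, the direction that maximizes it is the leading eigenvector of the covariance matrix, which is the first principal component. The code below assumes that choice of index and uses plain power iteration as the optimizer; the function names `variance_index` and `max_variance_direction` are hypothetical.

```python
import numpy as np


def variance_index(X, w):
    """Projection index assumed for this sketch: variance of the data projected onto w."""
    return np.var(X @ w)


def max_variance_direction(X, n_iter=500, seed=0):
    """Maximize the variance index over unit vectors w by power iteration."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / len(Xc)  # same normalization as np.var (divide by n)
    w = rng.standard_normal(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        # The gradient of w^T cov w is 2 * cov @ w; stepping along it and
        # renormalizing is exactly one power-iteration step.
        w = cov @ w
        w /= np.linalg.norm(w)
    return w
```

The returned direction can be compared, up to sign, with the eigenvector for the largest eigenvalue from `np.linalg.eigh` applied to the covariance matrix. Subsequent components would follow from the same search restricted to directions orthogonal to those already found.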


See also

* Projection pursuit regression
* Targeted projection pursuit

