Spectral Clustering
In multivariate statistics, spectral clustering techniques make use of the spectrum (eigenvalues) of the similarity matrix of the data to perform dimensionality reduction before clustering in fewer dimensions. The similarity matrix is provided as an input and consists of a quantitative assessment of the relative similarity of each pair of points in the dataset. In application to image segmentation, spectral clustering is known as segmentation-based object categorization. Definitions: Given an enumerated set of data points, the similarity matrix may be defined as a symmetric matrix A, where A_{ij} \geq 0 represents a measure of the similarity between data points with indices i and j. The general approach to spectral clustering is to use a standard clustering method (there are many such methods; ''k''-means is a common choice) on relevant eigenvectors of a Laplacian matrix of A. There are many different ways to define ...
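
A minimal sketch of the approach described above, assuming a dense symmetric similarity matrix A, the unnormalized Laplacian L = D - A, and scikit-learn's KMeans for the final clustering step (the function name and these specific choices are illustrative, not taken from the article):

# Spectral clustering sketch: embed with the k smallest Laplacian eigenvectors,
# then run k-means on the rows of the embedding.
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering(A, k):
    D = np.diag(A.sum(axis=1))          # degree matrix
    L = D - A                           # unnormalized graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, :k]                  # n x k spectral embedding
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)

Normalized variants of the Laplacian are common in practice; this sketch keeps the simplest form.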


Laplacian Matrix
In the mathematical field of graph theory, the Laplacian matrix, also called the graph Laplacian, admittance matrix, Kirchhoff matrix or discrete Laplacian, is a matrix representation of a graph. Named after Pierre-Simon Laplace, the graph Laplacian matrix can be viewed as a matrix form of the negative discrete Laplace operator on a graph approximating the negative continuous Laplacian obtained by the finite difference method. The Laplacian matrix relates to many useful properties of a graph. Together with Kirchhoff's theorem, it can be used to calculate the number of spanning trees for a given graph. The sparsest cut of a graph can be approximated through the Fiedler vector — the eigenvector corresponding to the second smallest eigenvalue of the graph Laplacian — as established by Cheeger's inequality. The spectral decomposition of the Laplacian matrix allows constructing low dimensional embeddings that appear in many machine learning applications and determines a spectr ...
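
As a small illustration of the definitions above, the following sketch builds L = D - A for a 4-vertex cycle graph and applies Kirchhoff's (matrix-tree) theorem, taking a cofactor of L to count spanning trees; the example graph is an assumption chosen for brevity:

import numpy as np

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)   # adjacency matrix of the cycle C4
D = np.diag(A.sum(axis=1))                  # degree matrix
L = D - A                                   # graph Laplacian

# Matrix-tree theorem: any cofactor of L equals the number of spanning trees.
spanning_trees = round(np.linalg.det(L[1:, 1:]))
print(spanning_trees)                       # prints 4 for the 4-cycle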


LOBPCG
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is a matrix-free method for finding the largest (or smallest) eigenvalues and the corresponding eigenvectors of a symmetric generalized eigenvalue problem A x = \lambda B x, for a given pair (A, B) of complex Hermitian or real symmetric matrices, where the matrix B is also assumed positive-definite. Background: Kantorovich in 1948 proposed calculating the smallest eigenvalue \lambda_1 of a symmetric matrix A by steepest descent using a direction r = Ax - \lambda(x) x of a scaled gradient of a Rayleigh quotient \lambda(x) = (x, Ax)/(x, x) in a scalar product (x, y) = x'y, with the step size computed by minimizing the Rayleigh quotient in the linear span of the current iterate x and the gradient direction r, i.e. in a locally optimal manner. Samokish proposed applying a preconditioner T to the residual vector r to generate the preconditioned direction w = T r and derived asymptotic convergence rate bounds as x approaches the eigenvector. D'yakonov ...
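
SciPy ships an implementation of this method as scipy.sparse.linalg.lobpcg; the sketch below applies it to a simple diagonal test matrix (the matrix, block size, and tolerances are illustrative assumptions):

import numpy as np
from scipy.sparse import spdiags
from scipy.sparse.linalg import lobpcg

n = 100
A = spdiags(np.arange(1, n + 1, dtype=float), 0, n, n)   # diagonal test matrix
X = np.random.default_rng(0).standard_normal((n, 3))     # initial block of 3 vectors
eigvals, eigvecs = lobpcg(A, X, largest=False, tol=1e-8, maxiter=200)
print(eigvals)   # approximates the three smallest eigenvalues: 1, 2, 3

A preconditioner can be supplied via the M argument, and passing a second matrix B turns the call into the generalized problem A x = \lambda B x.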



Preconditioner
In mathematics, preconditioning is the application of a transformation, called the preconditioner, that conditions a given problem into a form that is more suitable for numerical solving methods. Preconditioning is typically related to reducing a condition number of the problem. The preconditioned problem is then usually solved by an iterative method. Preconditioning for linear systems: In linear algebra and numerical analysis, a preconditioner P of a matrix A is a matrix such that P^{-1}A has a smaller condition number than A. It is also common to call T = P^{-1} the preconditioner, rather than P, since P itself is rarely explicitly available. In modern preconditioning, the application of T = P^{-1}, i.e., multiplication of a column vector, or a block of column vectors, by T = P^{-1}, is commonly performed in a matrix-free fashion, i.e., where neither P, nor T = P^{-1} (and often not even A) are explicitly available in a matrix form. Preconditioners are useful in iterative methods to solve a line ...
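
A hedged sketch of the matrix-free style described above: a Jacobi (diagonal) preconditioner is wrapped as a SciPy LinearOperator, so T = P^{-1} is applied through a function rather than stored, and then passed to the conjugate gradient solver (the tridiagonal test system is an illustrative assumption):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import LinearOperator, cg

n = 200
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")  # SPD test matrix
b = np.ones(n)

d = A.diagonal()                                     # diag(A), i.e. P = diag(A)
M = LinearOperator((n, n), matvec=lambda v: v / d)   # applies T = P^{-1} matrix-free

x, info = cg(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))               # info == 0 signals convergence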


Ill-conditioned
In numerical analysis, the condition number of a function measures how much the output value of the function can change for a small change in the input argument. This is used to measure how sensitive a function is to changes or errors in the input, and how much error in the output results from an error in the input. Very frequently, one is solving the inverse problem: given f(x) = y, one is solving for ''x,'' and thus the condition number of the (local) inverse must be used. In linear regression the condition number of the moment matrix can be used as a diagnostic for multicollinearity. The condition number is an application of the derivative, and is formally defined as the value of the asymptotic worst-case relative change in output for a relative change in input. The "function" is the solution of a problem and the "arguments" are the data in the problem. The condition number is frequently applied to questions in linear algebra, in which case the derivative is straightforward but ...
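
A short numerical illustration of this sensitivity, using NumPy's np.linalg.cond on a nearly singular 2 x 2 system (the matrix and perturbation are illustrative assumptions):

import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0001]])
print(np.linalg.cond(A))            # large condition number: ill-conditioned

b = np.array([2.0, 2.0001])
db = np.array([0.0, 1e-4])          # tiny perturbation of the right-hand side
x = np.linalg.solve(A, b)
x_pert = np.linalg.solve(A, b + db)
print(x, x_pert)                    # the solution moves far more than b did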




Lanczos Algorithm
The Lanczos algorithm is an iterative method devised by Cornelius Lanczos that is an adaptation of power methods to find the m "most useful" (tending towards extreme highest/lowest) eigenvalues and eigenvectors of an n \times n Hermitian matrix, where m is often but not necessarily much smaller than n. Although computationally efficient in principle, the method as initially formulated was not useful, due to its numerical instability. In 1970, Ojalvo and Newman showed how to make the method numerically stable and applied it to the solution of very large engineering structures subjected to dynamic loading. This was achieved using a method for purifying the Lanczos vectors (i.e. by repeatedly reorthogonalizing each newly generated vector with all previously generated ones) to any degree of accuracy; when not performed, this purification step produced a series of vectors that were highly contaminated by those associated with the lowest natural frequencies. I ...
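
A compact sketch of the iteration with the full reorthogonalization mentioned above (the dense test matrix, step count, and random start vector are illustrative assumptions, and breakdown when the residual vanishes is not handled):

import numpy as np

def lanczos(A, m, seed=0):
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, m))                       # Lanczos vectors
    alpha = np.zeros(m)
    beta = np.zeros(m - 1)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    for j in range(m):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w
        # full reorthogonalization against all previously generated vectors
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            q = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T

A = np.diag(np.arange(1.0, 101.0))             # symmetric test matrix
Q, T = lanczos(A, 20)
print(np.linalg.eigvalsh(T)[-3:])              # Ritz values near the largest eigenvalues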


Matrix-free Methods
In computational mathematics, a matrix-free method is an algorithm for solving a linear system of equations or an eigenvalue problem that does not store the coefficient matrix explicitly, but accesses the matrix by evaluating matrix-vector products. Such methods can be preferable when the matrix is so big that storing and manipulating it would cost a lot of memory and computing time, even with the use of methods for sparse matrices. Many iterative methods allow for a matrix-free implementation, including:
* the power method,
* the Lanczos algorithm,
* the Locally Optimal Block Preconditioned Conjugate Gradient method (LOBPCG),
* Wiedemann's coordinate recurrence algorithm, and
* the conjugate gradient method.
Distributed solutions have also been explored using coarse-grain parallel software systems to achieve homogeneous solutions of linear systems. Matrix-free methods are also used in solving non-linear equations like Euler's equations in computational fluid dyna ...
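
The following sketch shows the idea for an eigenvalue problem: the operator below applies the tridiagonal matrix tridiag(-1, 2, -1) through a function, so the matrix is never stored, and SciPy's eigsh works only through these matrix-vector products (the operator and problem size are illustrative assumptions):

import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

n = 500

def matvec(v):
    # y = T v for T = tridiag(-1, 2, -1), computed without forming T
    y = 2.0 * v
    y[:-1] -= v[1:]
    y[1:] -= v[:-1]
    return y

T = LinearOperator((n, n), matvec=matvec, dtype=float)
vals = eigsh(T, k=3, which="LM", return_eigenvectors=False)
print(np.sort(vals))   # three largest eigenvalues, approaching 4 as n grows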



K-means Clustering
''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster with the nearest mean (cluster centers or cluster centroid), serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. ''k''-means clustering minimizes within-cluster variances (squared Euclidean distances), but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances. For instance, better Euclidean solutions can be found using k-medians and k-medoids. The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions via an iterative refine ...
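
A hedged sketch of the standard heuristic (Lloyd's algorithm) on synthetic 2-D data; the data, k = 3, and the fixed iteration count are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in ((0, 0), (3, 0), (0, 3))])
k = 3
centers = X[rng.choice(len(X), k, replace=False)]    # random initial centroids

for _ in range(20):
    # assignment step: each point goes to the nearest centroid (squared distance)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    # update step: each centroid becomes the mean of its assigned points
    centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])

print(centers)   # a local optimum; roughly the three blob centers for most seeds

Library implementations such as sklearn.cluster.KMeans add smarter initialization (k-means++) and convergence checks.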



DBSCAN
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander and Xiaowei Xu in 1996. It is a non-parametric, density-based clustering algorithm: given a set of points in some space, it groups together points that are closely packed together (points with many nearby neighbors), marking as outliers points that lie alone in low-density regions (whose nearest neighbors are too far away). DBSCAN is one of the most common clustering algorithms and also one of the most cited in the scientific literature. In 2014, the algorithm was awarded the test of time award (an award given to algorithms which have received substantial attention in theory and practice) at the leading data mining conference, ACM SIGKDD. The follow-up paper "DBSCAN Revisited, Revisited: Why and How You Should (Still) Use DBSCAN" appears in the list of the 8 most downloaded articles of the prestigious ACM Transactions on Database System ...
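
A hedged usage sketch with scikit-learn's DBSCAN on two noisy rings, a shape where density-based grouping separates the rings and labels sparse points as noise; the dataset, eps, and min_samples values are illustrative assumptions:

import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_circles

X, _ = make_circles(n_samples=400, factor=0.4, noise=0.05, random_state=0)
labels = DBSCAN(eps=0.15, min_samples=5).fit_predict(X)
print(set(labels))   # cluster ids; the label -1 marks points treated as noise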



Hierarchical Clustering
In data mining and statistics, hierarchical clustering (also called hierarchical cluster analysis or HCA) is a method of cluster analysis that seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two categories:
* Agglomerative: a "bottom-up" approach in which each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
* Divisive: a "top-down" approach in which all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.
In general, the merges and splits are determined in a greedy manner. The results of hierarchical clustering are usually presented in a dendrogram. The standard algorithm for hierarchical agglomerative clustering (HAC) has a time complexity of \mathcal{O}(n^3) and requires \Omega(n^2) memory, which makes it too slow for even medium data sets. However, for some special cases, optimal efficient agglomerative methods (of c ...
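
A brief agglomerative example with SciPy: linkage performs the bottom-up merges and fcluster cuts the resulting hierarchy into a fixed number of clusters (the random 2-D data and Ward linkage are illustrative assumptions):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal((0, 0), 0.3, (30, 2)),
               rng.normal((4, 4), 0.3, (30, 2))])

Z = linkage(X, method="ward")                     # one row per merge, bottom-up
labels = fcluster(Z, t=2, criterion="maxclust")   # cut the hierarchy into 2 clusters
print(np.bincount(labels))                        # cluster sizes (labels start at 1)

The same Z can be passed to scipy.cluster.hierarchy.dendrogram to draw the tree of merges.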


Fiedler Vector
The algebraic connectivity (also known as Fiedler value or Fiedler eigenvalue after Miroslav Fiedler) of a graph ''G'' is the second-smallest eigenvalue (counting multiple eigenvalues separately) of the Laplacian matrix of ''G''. This eigenvalue is greater than 0 if and only if ''G'' is a connected graph. This is a corollary to the fact that the number of times 0 appears as an eigenvalue in the Laplacian is the number of connected components in the graph. The magnitude of this value reflects how well connected the overall graph is. It has been used in analyzing the robustness and synchronizability of networks. Properties: The truncated icosahedron or Buckminsterfullerene graph has a traditional connectivity of 3, and an algebraic connectivity of 0.243. The algebraic connectivity of an undirected graph with nonnegative weights satisfies a(G) \geq 0, with the inequality being strict if and only if G is connected. However, the algebraic connectivity can be negativ ...
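
A small numerical check of these statements, computing the Fiedler value and vector of a path graph from its Laplacian spectrum (the example graph is an assumption chosen for brevity):

import numpy as np

A = np.diag(np.ones(4), 1)                # adjacency matrix of a path on 5 vertices
A = A + A.T
L = np.diag(A.sum(axis=1)) - A            # graph Laplacian

eigvals, eigvecs = np.linalg.eigh(L)      # eigenvalues in ascending order
fiedler_value = eigvals[1]                # algebraic connectivity a(G)
fiedler_vector = eigvecs[:, 1]
print(fiedler_value > 1e-12)              # True, since the path graph is connected
print(np.sign(fiedler_vector))            # sign pattern suggests a two-way partition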



Eigenvalue
In linear algebra, an eigenvector or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted by \lambda, is the factor by which the eigenvector is scaled. Geometrically, an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation, and the eigenvalue is the factor by which it is stretched. If the eigenvalue is negative, the direction is reversed. Loosely speaking, in a multidimensional vector space, the eigenvector is not rotated. Formal definition: If T is a linear transformation from a vector space V over a field F into itself and \mathbf{v} is a nonzero vector in V, then \mathbf{v} is an eigenvector of T if T(\mathbf{v}) is a scalar multiple of \mathbf{v}. This can be written as T(\mathbf{v}) = \lambda \mathbf{v}, where \lambda is a scalar in F, known as the eigenvalue, characteristic value, or characteristic root ass ...
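
A quick numerical check of the defining relation above, verifying that A v = \lambda v holds for each computed eigenpair of a small symmetric matrix (the 2 x 2 matrix is an illustrative assumption):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eigh(A)     # symmetric matrix, so eigh is appropriate

for lam, v in zip(eigvals, eigvecs.T):
    # A v should equal lambda * v up to floating-point rounding
    print(lam, np.allclose(A @ v, lam * v))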